National Action Council for Minorities in Engineering (NACME) Artificial Intelligence - Machine Learning (AIML) Intensive Summer Bootcamp at the University of Southern California
Apple | NACME | University of Southern California
---|---|---
Developed by:
- Darryn Dunn, Computer Science (Concentration in Video Game Development), University of the Pacific (UOP)
- Eliska Peacock, Symbolic Systems (Concentration in Media & Communication), Stanford University
- Johnny Williams, Computer Science (Concentration in Software Engineering), Louisiana State University (LSU)
This project, led by Professor Chukwuebuka Nweke, focuses on estimating and predicting ground-shaking intensity for earthquake engineering. The core of this research is the analysis of microtremor data collected from a site in Utah, with the goal of better understanding the complex interactions between seismic waves and subsurface structures. The project identifies both observed and latent features from horizontal-to-vertical spectral ratio (HVSR) curves. By mapping these features against one another, the research seeks to characterize the relationship between subsurface geologic structures and seismic waves. Autoencoders are used to extract latent features that may contribute to the understanding of seismic responses. The insights gained from this research could lead to more accurate seismic hazard assessments and better-informed earthquake engineering practices. By leveraging advanced data analysis techniques, it aims to improve our ability to predict ground-shaking intensity and enhance the safety and resilience of structures in earthquake-prone areas.
We are using a dataset provided by PhD students as part of their thesis research. The data, collected in Utah, is organized into 3-hour segments, each containing an embedded 3-minute segment for frequency comparison.
To visualize the data initially, we plotted each 3-minute interval against frequency for one 3-hour segment. This gives a clear visual baseline, allowing us to check that future processed data still resembles the original raw data.
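As a rough sketch of this first visualization step, the snippet below overlays the per-window HVSR curves of one segment and their mean. The data layout (60 three-minute windows by 200 frequency bins) and the random curves are stand-ins for the real Utah dataset, whose format is not shown here.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")                 # render off-screen for scripted runs
import matplotlib.pyplot as plt

# Hypothetical stand-in for one 3-hour segment: each row is the HVSR
# curve of one 3-minute window, sampled on a shared frequency grid.
rng = np.random.default_rng(0)
freqs = np.logspace(-1, 1, 200)       # 0.1 to 10 Hz, log-spaced
curves = 1.0 + rng.random((60, freqs.size))

fig, ax = plt.subplots()
for curve in curves:
    ax.plot(freqs, curve, color="steelblue", alpha=0.2)
ax.plot(freqs, curves.mean(axis=0), color="black", label="segment mean")
ax.set_xscale("log")                  # HVSR curves are usually read on a log axis
ax.set_xlabel("Frequency (Hz)")
ax.set_ylabel("HVSR amplitude")
ax.legend()
fig.savefig("hvsr_segment.png")
```

With the real data, the random array would be replaced by the loaded segment, and the mean curve gives the reference shape that later outputs can be compared against.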
An autoencoder consists of two main parts: an encoder and a decoder. The encoder compresses the input into a latent space representation, while the decoder reconstructs the original input from this compact representation.
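As a minimal sketch of that encoder/decoder split in PyTorch (the layer sizes are illustrative, not the project's actual values: here a 200-bin HVSR curve is compressed to 3 latent features):

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_features=200, n_latent=3):
        super().__init__()
        # Encoder: compress the input curve into a small latent vector
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, n_latent),
        )
        # Decoder: reconstruct the original curve from the latent vector
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 64), nn.ReLU(),
            nn.Linear(64, n_features),
        )

    def forward(self, x):
        z = self.encoder(x)           # latent space representation
        return self.decoder(z), z     # reconstruction plus latents

model = Autoencoder()
x = torch.randn(8, 200)               # batch of 8 dummy curves
recon, latent = model(x)
```

Training would minimize a reconstruction loss (e.g. mean squared error between `recon` and `x`); the `latent` tensor is what gets plotted as the unseen features later on.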
Before plotting the output of the autoencoder, we needed to visualize all the observed features. These features included the area under the curve, peak amplitudes, peak frequencies, and other relevant metrics. By plotting these features, we could gain a comprehensive understanding of the data's characteristics. The following graph illustrates the peak amplitude, area under the curve, and peak frequencies, providing a detailed view of the key features we analyzed.
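One straightforward way to compute the observed features named above for a single curve is shown below; the frequency grid and the synthetic curve are illustrative dummies, not the project's data.

```python
import numpy as np
from scipy.integrate import trapezoid

# Synthetic HVSR-like curve peaking near 2 Hz
freqs = np.linspace(0.1, 10.0, 200)
curve = np.exp(-((freqs - 2.0) ** 2)) + 0.1

area = trapezoid(curve, freqs)                # area under the curve
peak_idx = int(np.argmax(curve))
peak_amplitude = float(curve[peak_idx])       # height of the dominant peak
peak_frequency = float(freqs[peak_idx])       # frequency of the dominant peak
```

Applied per 3-minute window, these three scalars give one observed-feature row per window, which is what the combined feature plot visualizes.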
Examining the area under the curve made it challenging to pinpoint the exact times, as the details were not immediately clear. To address this, we created a separate, more focused graph specifically to highlight the area under the curve. This additional visualization enhances our understanding by providing a clearer, more detailed view of the time intervals, ensuring we can accurately interpret the data.
After plotting the area under the curve for a single three-hour interval, we realized that this approach might not capture the overall trends effectively. Therefore, we decided it would be more informative to calculate and represent the average area under the curve across all eight three-hour intervals we had. This approach allows us to gain a clearer understanding of the data's overall patterns and trends, facilitating better analysis and comparison.
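The averaging step can be sketched as below; the random curves stand in for the real eight intervals, and the window count per interval is assumed, not taken from the dataset.

```python
import numpy as np
from scipy.integrate import trapezoid

rng = np.random.default_rng(1)
freqs = np.linspace(0.1, 10.0, 200)
# 8 intervals x 60 three-minute windows x 200 frequency bins (dummy data)
intervals = 1.0 + rng.random((8, 60, freqs.size))

auc = trapezoid(intervals, freqs, axis=-1)    # AUC per window, shape (8, 60)
mean_auc = auc.mean(axis=0)                   # average across the 8 intervals
```

The resulting `mean_auc` is one value per window position, smoothing out interval-to-interval variation so overall trends stand out.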
This graph shows the unseen features generated by the autoencoder. In this visualization, we plot one three-hour interval against three different latent features, providing insight into the output produced by the model. By analyzing these relationships, we can better understand how the autoencoder captures and represents the underlying patterns in the data. This comparative analysis allows us to assess the model's performance and explore how effectively it retains important information from the original dataset.
Now that we have visualized the latent features plotted against one another, we can take the analysis a step further by plotting these unseen features against various observed features, including the area under the curve, peak amplitude, and peak frequency. This approach allows us to explore the relationships between the latent representations generated by the autoencoder and the known features in our dataset. By examining these correlations, we can gain deeper insights into how the autoencoder captures essential patterns in the data, facilitating a better understanding of the underlying dynamics and helping us validate the model's effectiveness.
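A latent-versus-observed comparison like this can be laid out with a seaborn pair grid; the values and column names below are placeholders, not the project's actual outputs.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")                 # render off-screen for scripted runs
import seaborn as sns

# Dummy table: one row per 3-minute window, mixing latent and observed features
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "latent_1": rng.normal(size=60),
    "latent_2": rng.normal(size=60),
    "area_under_curve": rng.uniform(10, 20, size=60),
    "peak_amplitude": rng.uniform(1, 3, size=60),
})

# Scatter each latent feature (x) against each observed feature (y)
grid = sns.pairplot(
    df,
    x_vars=["latent_1", "latent_2"],
    y_vars=["area_under_curve", "peak_amplitude"],
)
grid.savefig("latent_vs_observed.png")
```

With the real data, `latent_1` and `latent_2` would come from the autoencoder's encoder output and the observed columns from the feature extraction step, so any visible structure in the scatter plots hints at what the latents encode.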
- Fork this repo
- Change directories into your project
- On the command line, install the dependencies:
pip install numpy pandas torch seaborn matplotlib scipy scikit-learn
- Locate and open the file you downloaded from this repo
Please feel free to contact us:
Darryn Dunn: [email protected]
Eliska Peacock: [email protected]
Johnny Williams: [email protected]