Earthquake Prediction
https://athena.ecs.csus.edu/~kokkiras/index.html
Team - 18
Mentor: Prof. Meiliu Lu
Done By: Akhil Madineni, Suraj Krishna Kokkirala

Overview
• Problem Statement
• Goal
• Data Overview
• Data Preprocessing
• Model Implementation
• Demo
• Results
• Learnings
• References

Why Predict Earthquakes?
Forecasting earthquakes is one of the most important problems in science because of their catastrophic consequences. Current scientific studies of earthquake forecasting focus on three key questions:
• When will the event occur?
• Where will it occur?
• How large will it be?

Goal The goal is to predict the timing of laboratory earthquakes from seismic signals. The data come from an experimental set-up used to study earthquake physics. The acoustic_data input signal is used to predict the time remaining before the next laboratory earthquake (time_to_failure).
Data Overview
File descriptions
• train.csv - A single, continuous training segment of experimental data.
• test - A collection of many small segments of acoustic data signals.
• sample_submission.csv - A sample submission file. It shows that a time to failure must be predicted for each segment of test data.

Data Pre-Processing - I
Raw Data Fields
• acoustic_data - the seismic signal [int16]
• time_to_failure - the time (in seconds) until the next laboratory earthquake [float64]

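As a rough illustration of the two raw fields, the sketch below reads them into compact typed arrays. It is written in Python with only the standard library, which is a choice made for this example and not the tooling used in the project. The helper name `load_columns` and the three sample rows are hypothetical and made up for illustration.

```python
import csv
import io
from array import array


def load_columns(text_stream):
    """Read the two raw fields from CSV text into typed arrays."""
    acoustic = array("h")  # signed short: 16-bit integers on common platforms
    ttf = array("d")       # double: 64-bit floats
    for row in csv.DictReader(text_stream):
        acoustic.append(int(row["acoustic_data"]))
        ttf.append(float(row["time_to_failure"]))
    return acoustic, ttf


# Made-up sample rows, for illustration only (not real measurements).
sample = io.StringIO(
    "acoustic_data,time_to_failure\n"
    "12,1.4691\n"
    "6,1.4690\n"
    "-3,1.4689\n"
)
acoustic, ttf = load_columns(sample)
print(list(acoustic), list(ttf))
```

Storing the signal as 16-bit integers instead of generic numbers keeps the memory footprint small, which matters for a long continuous training segment.
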
Data Pre-Processing - II Analyze Acoustic Data and “time to failure” Data.
Data Pre-Processing - III
Steps for converting the data into X and y for our prediction model:
1. Read all rows between earthquakes and create folds.
2. For chunks of size 150000, 75000, 50000, and 30000, extract a couple of features and store them in a row of a matrix X. The response, which is the value at the last time step of each chunk, is stored in a vector y.
3. Move by "stride" positions and repeat step 2 until the earthquake happens.

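The chunk-and-stride procedure above can be sketched as follows, for a single stretch of data between two earthquakes. This is a minimal Python sketch and not the project's code. The slides do not name the extracted features, so the four used here (mean, standard deviation, minimum, maximum) are assumptions, as are the function name `make_features` and the toy input values.

```python
import statistics


def make_features(signal, target, chunk_size, stride):
    """Slide a window over one between-earthquake segment.

    Each window yields one row of X (summary features, assumed here)
    and one entry of y (the target at the window's last time step).
    """
    X, y = [], []
    start = 0
    while start + chunk_size <= len(signal):
        chunk = signal[start:start + chunk_size]
        X.append([
            statistics.fmean(chunk),   # mean amplitude
            statistics.pstdev(chunk),  # population standard deviation
            min(chunk),
            max(chunk),
        ])
        y.append(target[start + chunk_size - 1])
        start += stride  # move by "stride" positions
    return X, y


# Toy values for illustration; real chunks hold tens of thousands of samples.
signal = [1, 3, 2, 6, 4, 8, 5, 7]
target = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
X, y = make_features(signal, target, chunk_size=4, stride=2)
print(len(X), y)
```

A stride smaller than the chunk size produces overlapping windows, so the same segment yields more training rows at the cost of correlated samples.
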
Model Implementation Random Forest - An ensemble learning method for classification, regression, and other tasks. It constructs a multitude of decision trees at training time and outputs the mode of the individual trees' classes (classification) or the mean of their predictions (regression).
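The aggregation step of a random forest can be illustrated in isolation. The Python sketch below assumes the individual trees have already produced their outputs and only shows how those outputs are combined. It does not train any trees, and the function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import fmean


def forest_predict_regression(tree_predictions):
    """Regression: the forest output is the mean of the tree predictions."""
    return fmean(tree_predictions)


def forest_predict_classification(tree_votes):
    """Classification: the forest output is the most common tree vote."""
    return Counter(tree_votes).most_common(1)[0][0]


# Made-up per-tree outputs, for illustration only.
tree_outputs = [5.2, 4.8, 6.1, 5.5]  # predicted time_to_failure in seconds
prediction = forest_predict_regression(tree_outputs)
label = forest_predict_classification(["quake", "quiet", "quake"])
print(prediction, label)
```

Predicting time_to_failure is a regression task, so the mean is the aggregation that applies here.
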
#MODEL     Trees      Strides   Mean Absolute Error (MAE)
Model-1    500        150000    2.267461
Model-2    2000       150000    2.265719
Model-3    5000       150000    2.264752
Model-4    500        75000     2.282808
Model-5    2000       75000     2.282194
Model-6    5000       75000     2.282426
Model-7    (missing)  50000     2.284351
Model-8    2000       50000     2.282518
Model-9    (missing)  50000     2.282003
Model-10   500        30000     2.291800
Model-11   2000       30000     2.289706
Model-12   5000       30000     2.289344

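For reference, the mean absolute error reported above is the average of the absolute differences between predicted and actual values, so here it is measured in seconds of time_to_failure. A minimal Python sketch with made-up numbers (not values from the experiments):

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values, for illustration only.
actual = [4.0, 2.5, 0.5]
predicted = [3.0, 3.0, 1.0]
mae = mean_absolute_error(actual, predicted)
print(mae)
```

An MAE near 2.27 therefore means the predictions are off by a little over two seconds on average.
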
Learnings
• Data pre-processing is vital to the accuracy of the models.
• Appropriate machine learning techniques and algorithms must be chosen to model the system.
• Plotting data provides useful insights and can lead to better models.
• Learnings related to R.
• To solve the error "cannot allocate a vector of size xxx MB", use the memory.limit function to assign more RAM to the R kernel.

References
• https://en.wikipedia.org/wiki/Earthquake
• https://paperswithcode.com/