Anomaly Detection in Problematic GPS Time Series Data and Modeling

Dafna Avraham, Yehuda Bock
Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, University of California, San Diego, La Jolla, CA

Introduction

Anomalous event detection in Global Positioning System (GPS) time series is an important problem in geodetic research. The Scripps Orbit and Permanent Array Center (SOPAC) generates continuous, daily time series in three dimensions for over 1400 global GPS stations. These series are analyzed using a computerized modeling program, which is limited to fitting slopes (velocities), offsets, periodic (annual and semiannual) terms, and postseismic decays. Currently, anomalous events are not adequately recognized or considered.

We have developed anomaly detection algorithms that are capable of detecting signals, outliers, trends in the data, and modeling problems. The algorithms contain modified versions of noise analysis, correlation statistics, and threshold utility. They run on the complete set of global GPS time series, successfully uncovering a majority of the previously undetected anomalies. We spatially cluster the types of anomalies in order to reveal the geophysical factors that contribute to their occurrence.

We are developing a new interactive environment that will allow users to analyze temporal and spatial subsets of GPS time series on the fly in various ways, and to detect anomalous events using these newly developed methods. We are incorporating this into the GPS Explorer data portal, a joint project of SOPAC and JPL to provide user-friendly GPS data products and on-line modeling applications (http://geoapp.ucsd.edu).

Anomalies in GPS Time Series

Modeling Problems: These are seen when the model does not represent the data well. Often this happens either because the model lacks one or more important terms, or because the data have gaps and jumps that mislead the model.
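As a rough sketch of the kind of model listed in the Introduction (slope, offsets, annual and semiannual terms, postseismic decays), one common parameterization for a single position component is written out here. The exact functional form used by the modeling program is not stated in this poster, so the symbols and the exponential form of the decay term are assumptions made for illustration.

```latex
% Assumed illustrative model for one component (north, east, or up).
% t_i is the epoch in years; H is the Heaviside step function.
y(t_i) = a + b\,t_i
       + c\sin(2\pi t_i) + d\cos(2\pi t_i)      % annual term
       + e\sin(4\pi t_i) + f\cos(4\pi t_i)      % semiannual term
       + \sum_{j} g_j\,H(t_i - T_{g_j})         % offsets at epochs T_{g_j}
       + \sum_{j} k_j\,e^{-(t_i - T_{k_j})/\tau_j}\,H(t_i - T_{k_j})  % postseismic decays
       + r_i                                    % residual
```

Under this reading, b is the site velocity and the residuals r_i are what remains after the fitted terms are subtracted from the data. The detection methods work on these residual and detrended series.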
Anomaly Detection Algorithms for GPS Time Series

[Figure: example daily GPS time series.] The daily GPS time series data are displayed in sets of three plots per GPS site, representing the north, east, and up components from top to bottom. Left: signal and/or modeling problems. Middle: outliers at the beginning of the series. Right: trends in detrended data.

Signal and Modeling Problem Detection Algorithm

Problem: A model that does not consistently fit the data constitutes a modeling problem. Similarly, data that deviate from the model in a particular pattern represent a signal.

Method: Search each GPS site for eight-month windows during which the residual series does not change sign, and therefore does not resemble white noise. Such a window signifies a lack of important, but unaccounted for, model terms.

Outlier Detection Algorithm

Problem: Outliers are problematic because they skew the data and, in turn, can bias the model. Extreme outliers must be removed.

Method: Create a threshold for each residual series that is equal to 5 times the interquartile range (IQR). The IQR is a very robust estimator of the spread of the series since it is more resistant to outliers than the standard deviation. Thus, residuals that cross this threshold correspond to outliers.

Trend Detection Algorithm

Problem: A detrended series that still exhibits significant trend indicates that the data contain unaccounted-for information and/or that further modeling is needed.

Method: Using the correlation coefficient, r, we can measure the strength of the linear association between time (X) and distance (Y) in the GPS data. Since -1 ≤ r ≤ 1, with a value of 0 representing no linear association and a value close to 1 or -1 representing a strong linear association, we determined that a value greater than 0.7 or less than -0.7 signifies trend.
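The three detection methods described under Anomaly Detection Algorithms can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SOPAC implementation: the function names are hypothetical, eight months is approximated as 240 days, residuals are assumed to be centered on zero so the 5 × IQR threshold is applied to their absolute value, and epochs are assumed to be given as day numbers.

```python
import statistics


def sign_persistent_windows(days, residuals, window_days=240):
    """Return (first_day, last_day) for each run of same-sign residuals
    spanning at least window_days (assumed ~eight months).
    A residual of exactly zero is grouped with the negative values."""
    runs = []
    start = 0
    for i in range(1, len(residuals) + 1):
        # A run ends at the end of the series or when the sign flips.
        ended = i == len(residuals) or (
            (residuals[i] > 0) != (residuals[start] > 0)
        )
        if ended:
            if days[i - 1] - days[start] >= window_days:
                runs.append((days[start], days[i - 1]))
            start = i
    return runs


def iqr_outliers(residuals, k=5.0):
    """Return indices whose absolute residual exceeds k times the IQR.
    Measuring the excursion from zero is an assumption of this sketch."""
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    threshold = k * (q3 - q1)
    return [i for i, r in enumerate(residuals) if abs(r) > threshold]


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def has_trend(days, detrended, r_limit=0.7):
    """Flag a detrended series whose |r| against time exceeds r_limit."""
    return abs(pearson_r(days, detrended)) > r_limit
```

Each function takes one component (north, east, or up) of one site, so a site would be flagged if any of its three components is flagged. Because the window test uses the span in days rather than a sample count, a long data gap between two same-sign residuals also counts toward the eight months; whether that is desirable depends on how gaps are handled upstream.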
Anomalies in GPS Time Series (continued)

Signals: Many signals, such as postseismic decays, anthropogenic effects, and volcanic signals, are recognized as data that deviate from the model in particular patterns.

Outliers: Outliers are caused by many different sources. If the outliers are extreme, they can distort the model, so it is very important to detect and remove them from the data.

Trend: Due to geophysical forces, GPS time series inherently contain a linear velocity (trend). Thus, the series are detrended before further analysis is performed. Nevertheless, some series still contain a significant trend, especially when two or more trends are estimated. The existence of a trend in a detrended series signifies the need for further modeling of the data, and so trend detection is critical.

Spatial Clustering of Anomalies

[Figure: maps of the Western United States showing detected sites for Signals and Modeling Problems, Outliers, and Trend, with labels for Yellowstone, Mount St. Helens, Long Valley Caldera, the Parkfield, San Simeon, and Hector Mine & Landers earthquakes, and the San Gabriel, Los Angeles, and Santa Ana basins.]

Above: Spatial diagrams displaying, in orange, the anomalous GPS time series that our algorithms detected in the Western United States. It is important to consider the spatial component of problematic sites because spatial clusters (seen here as dense orange areas) often indicate underlying geophysical signals that may have gone unnoticed or unaccounted for in the model. These diagrams were created using GPS Explorer, an on-line data and modeling application created by SOPAC and JPL (http://geoapp.ucsd.edu).

Geophysical Anomalies and Conclusions

Volcanic Signals: Volcanoes affect ground motion in patterns that the anomaly detection algorithms consistently recognize, seen above as concentrations of detected sites in volcanic regions (Mt. St. Helens, Long Valley Caldera, and Yellowstone).

Postseismic Deformation: The algorithms effectively detect postseismic deformation, which is the anomalous trademark of medium to large earthquakes (1992 Mw 7.3 Landers, 1999 Mw 7.1 Hector Mine, 2003 Mw 6.5 San Simeon, and 2004 Mw 6.0 Parkfield). The epicenter of each earthquake is circled in red on the map above.

Anthropogenic Effects: The algorithms detect anomalous sites (in orange) in the Los Angeles basin, Santa Ana basin, and San Gabriel basin, which are regions where anthropogenic effects occur.

The algorithms we developed successfully detect many GPS time series that exhibit geophysical anomalies, which often occur in the form of anthropogenic effects (such as groundwater removal and oil extraction), volcanic signals, or postseismic deformation.

Acknowledgments

Dafna Avraham is a 2009 SCEC intern under the ACCESS-U project. Support is also provided by the NASA MEaSUREs project “Solid Earth Science ESDR System” with JPL. Help in this research was provided by Brendan Crowell, Peng Fang, Paul Jamason, and Mindy Squibb at SOPAC.