Spatio-Temporal Cluster Detection Using AMOEBA
Jimmy Kroon, Pennsylvania State University
Advisor: Dr. Frank Hardisty
This is a parody – Original Art: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html
Outline
• Introduction – Clustering and Project Direction
• The Spatial Scan Statistic and SaTScan
• AMOEBA
• Proposed Spatio-Temporal AMOEBA Method
• Software, Data, and Progress
Cluster Detection
Cluster: “a geographically and/or temporally bounded group of occurrences of sufficient size and concentration to be unlikely to have occurred by chance” (Knox, 1989)
Two Typical Uses
• Disease Surveillance – Week of 2/7/2010 (Data: Google Flu Trends; Analysis: GeoDa)
• Epidemiological Studies – Brain Cancer in NM (Kulldorff et al. 1998)
Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis
Time Matters:
• Many geographic phenomena are dynamic.
• Spatial patterns we see probably change over time.
• The American Association of Geographers describes temporal geography as a ‘frontier’ of GIScience.
Spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters:
• Growth
• Movement
• Splits / Joins
Research Problem
Primary: No method exists for determining the true extent of irregularly shaped clusters in spatio-temporal datasets.
Secondary: Spatial AMOEBA has not been implemented in R.
Project Goals
• A demonstration of spatio-temporal cluster detection based on the AMOEBA procedure.
• R scripts for running spatial and spatio-temporal AMOEBA, to be contributed to the R community.
The Spatial Scan Statistic
• Scan the data with a moving ‘window’, calculating a likelihood ratio that compares the case rate among the spatial units inside the window with the rate outside it.
• Select the window(s) with the highest likelihood ratio as possible cluster(s).
• The spatial scan statistic is by far the most popular cluster detection technique, largely due to the availability of the SaTScan software by Martin Kulldorff.
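The moving-window idea can be sketched in a few lines of Python. This is an illustration only, not SaTScan's implementation: it assumes circular windows centred on unit centroids, a fixed list of candidate radii, and a Poisson log likelihood ratio as the window score. The function names (`poisson_llr`, `circular_scan`) and the input layout are hypothetical.

```python
import math

def poisson_llr(c, e, total):
    """Poisson log likelihood ratio for a window with c observed and
    e expected cases out of `total` cases; 0 when there is no excess."""
    if e <= 0 or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case falls inside the window.
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside

def circular_scan(points, cases, population, radii):
    """Try a circular window of every radius at every centroid.
    Returns (best score, centre index, radius, member indices)."""
    total_cases = sum(cases)
    total_pop = sum(population)
    best = (0.0, None, None, [])
    for i, (cx, cy) in enumerate(points):
        for r in radii:
            members = [j for j, (x, y) in enumerate(points)
                       if math.hypot(x - cx, y - cy) <= r]
            c = sum(cases[j] for j in members)
            # Expected cases if risk were uniform across the population.
            e = total_cases * sum(population[j] for j in members) / total_pop
            score = poisson_llr(c, e, total_cases)
            if score > best[0]:
                best = (score, i, r, members)
    return best
```

Because every candidate window has the same shape, any unit the circle sweeps up is included, which is the source of the shape-related errors discussed next.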
Drawbacks of the Spatial Scan Statistic
Clusters that are not similar in shape to the scanning window can produce errors:
• False inclusions
• False exclusions
• Thin clusters identified as multiple small clusters
• Holes in clusters cannot be detected
The Elliptical Spatial Scan Statistic
• Must choose shapes a priori to avoid pre-selection bias
See Kulldorff et al. 2006
AMOEBA
• Ecotope-Based – Regions of contiguous spatial units that are related in terms of z-value.
• Multidirectional – Search in all directions.
• Optimum – Procedure takes place at the finest spatial scale possible and is capable of revealing all spatial association present in the dataset (Aldstadt and Getis, 2006).
AMOEBA: Defining an Ecotope
• Add a seed location (one polygon) to the ecotope.
• Calculate Gi* (Getis-Ord local autocorrelation statistic).
• Search in all directions for contiguous polygons.
• Those that increase Gi* are added to the growing ecotope for that seed location.
• Keep searching for more neighbors, growing the ecotope until Gi* no longer increases.
Repeat, creating ecotopes for each polygon in the dataset.
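The ecotope-growing steps above can be sketched as follows. This is a simplified illustration, not the published AMOEBA code: it assumes binary weights, searches only for high-value clusters, and at each step ranks the contiguous candidates by value and keeps the leading group that raises Gi* the most. The names `gi_star` and `grow_ecotope`, and the dictionary-based neighbor list, are hypothetical.

```python
import math

def gi_star(values, members):
    """Getis-Ord Gi* with binary weights for the unit set `members`."""
    n, k = len(values), len(members)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0 or k == n:
        return 0.0  # statistic is undefined here; treat as no association
    num = sum(values[j] for j in members) - k * mean
    den = sd * math.sqrt((n * k - k * k) / (n - 1))
    return num / den

def grow_ecotope(seed, values, neighbors):
    """Grow an ecotope from `seed` until Gi* stops increasing.
    `neighbors` maps each unit index to its contiguous unit indices."""
    ecotope = {seed}
    best = gi_star(values, ecotope)
    while True:
        # Contiguous units not yet in the ecotope, highest values first.
        frontier = {j for i in ecotope for j in neighbors[i]} - ecotope
        ranked = sorted(frontier, key=lambda j: values[j], reverse=True)
        best_trial = None
        trial = set(ecotope)
        for j in ranked:
            trial.add(j)
            score = gi_star(values, trial)
            if score > best:
                best, best_trial = score, set(trial)
        if best_trial is None:
            return ecotope, best  # no addition increases Gi*
        ecotope = best_trial
```

Calling `grow_ecotope` once per polygon yields one ecotope per seed, which is the input to the cluster-selection step.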
The R Neighbor Object
Finding an Ecotope with AMOEBA
AMOEBA: From Ecotopes to Clusters
• Rank ecotopes by final Gi*.
• Select the ecotope with the highest Gi* as a cluster.
• Eliminate intersecting ecotopes.
• Select the ecotope with the next highest Gi* as a second cluster.
• Repeat.
• The probability that a cluster arose by chance can be tested using Monte Carlo simulation.
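A minimal sketch of the ranking and elimination steps, plus a generic permutation test, might look like this. It assumes each ecotope is supplied as a `(Gi*, set of unit ids)` pair, and the `statistic` argument of `monte_carlo_p` is a hypothetical callable that recomputes the score of interest (for example, the highest ecotope Gi*) on a shuffled copy of the data. The rank-based p-value is a common convention, not necessarily the one used in the AMOEBA software.

```python
import random

def select_clusters(ecotopes):
    """Pick non-intersecting ecotopes in descending order of Gi*.
    `ecotopes` is a list of (score, set_of_unit_ids) pairs."""
    clusters, claimed = [], set()
    for score, members in sorted(ecotopes, key=lambda e: e[0], reverse=True):
        # An ecotope that intersects an already selected cluster is eliminated.
        if claimed.isdisjoint(members):
            clusters.append((score, set(members)))
            claimed |= set(members)
    return clusters

def monte_carlo_p(observed, values, statistic, n_sim=99, seed=0):
    """Permutation p-value: the share of random relabelings whose
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    shuffled = list(values)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)  # reassign values to spatial units at random
        if statistic(shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)
```

With 99 simulations the smallest attainable p-value is 0.01, so the number of simulations sets the resolution of the test.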
Incorporating Time into AMOEBA
Remember: spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters.
• Growth
• Movement
• Splits / Joins
Visualize temporal data as layers of data with time extending vertically through the layers.
• Each spatio-temporal unit has spatial neighbors and temporal neighbors.
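The layered view can be sketched as a neighbor list keyed by `(polygon, time step)`. This is an illustration under stated assumptions: a unit's spatial neighbors are the adjacent polygons in the same time layer, its temporal neighbors are the same polygon one step earlier and one step later, and diagonal links (an adjacent polygon at an adjacent time) are left out. The function name `st_neighbors` is hypothetical.

```python
def st_neighbors(spatial_neighbors, n_periods):
    """Build a spatio-temporal neighbor list.
    `spatial_neighbors` maps polygon id -> list of adjacent polygon ids.
    Returns a dict mapping (polygon, t) -> list of neighboring (polygon, t)."""
    st = {}
    for p, adjacent in spatial_neighbors.items():
        for t in range(n_periods):
            # Spatial neighbors: adjacent polygons in the same time layer.
            linked = [(q, t) for q in adjacent]
            # Temporal neighbors: the same polygon in the layers below and above.
            if t > 0:
                linked.append((p, t - 1))
            if t < n_periods - 1:
                linked.append((p, t + 1))
            st[(p, t)] = linked
    return st
```

An ecotope grown over such a list can expand through time as well as space, which is what allows growth, movement, and splits or joins to show up in the result.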
The Spatio-Temporal Scan Statistic
See Kulldorff et al. 1998
Spatio-Temporal AMOEBA
Software Environment and Test Data
The R Project
• Free, open source statistical software
• Extendable with user-contributed packages
• www.r-project.org
Google Flu Trends
• Estimates flu incidence levels using aggregated data about user searches for certain keywords
• 90% accurate compared to CDC data
• State-level data, updated daily
• www.google.org/googleflu
SEER (Surveillance, Epidemiology, and End Results)
• National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS
Python scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010)
Google Flu Trends – Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
Hmmm…
R Programming Progress
Complete …
• Geoprocessing tasks
• Create spatio-temporal neighbor list
• Delineate ecotopes
• Sort and eliminate intersecting ecotopes
• Returns primary cluster PolyIDs that match the Python results
To Do …
• Monte Carlo simulation
• Process results and add to the output shapefile
• Test, test
References
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools. http://www.acsu.buffalo.edu/~geojared/tools.htm.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of Cancer.
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico. American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic. Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) and community health: Some statistical issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html