Building dashboards for a COVID-19 Data Ecosystem

Natalie Gable, Alexandra Stavrianidi, and Yvette Wei
Department of Electrical Engineering, Stanford University
Department of Mathematics, Stanford University
Department of Statistics, Stanford University

Introduction

This project is a collaboration of Stanford Medicine (Department of Epidemiology and Population Health) with the Center of Population Health Sciences. It has received the generous support of Google Cloud to build a data platform that aims to identify the social and demographic factors that influence the risk of SARS-CoV-2 infection and to predict geographic hotspots of infection. When deployed to California counties, this platform will enable public health officials to monitor the progression of COVID-19 at sub-county levels and better inform public health strategy. Our team contributed to this platform by building dashboards on COVID-SEDoH Correlation, Health Disparities, and Hotspot Mapping.

Hotspot Detection and Mapping

Fig 1. SaTScan clusters geographic output example.

Fig 2. Cylindrical windows scanning through space and time.

For hotspot detection we run SaTScan on Google Cloud. SaTScan is software that analyzes space-time data using space-time scan statistics to detect disease clusters. Its space-time scan statistic can simultaneously test for outbreaks of any size at any location by using cylindrical windows with variable radius and height. We run SaTScan's space-time permutation model on our COVID-19 case data by date and zip code, which are updated daily in a BigQuery database.

Fig 3. Hotspot mapping dashboard on Looker. The county selected is Monterey and the time period is the last 2 weeks.

We store our SaTScan analysis outputs in a Hotspot database on BigQuery and visualize them on Looker.
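To make the space-time permutation model mentioned above more concrete, here is a minimal Python sketch of how the expected case count for a single cylinder (a set of zip codes over a set of days) can be computed from case counts alone, using the standard formulation in which the expected count for a zip code and day is the product of that zip code's total and that day's total, divided by the overall total. This is an illustration only and not the SaTScan implementation: the function names, the `counts` layout, and the toy data are all hypothetical, and the sketch omits the scan over many cylinders and the Monte Carlo significance test.

```python
from collections import defaultdict


def expected_cases(counts, zips, days):
    """Expected cases inside a cylinder under the space-time permutation model.

    counts: dict mapping (zipcode, day) -> observed case count (hypothetical layout).
    zips, days: the zip codes and days that make up the cylinder.
    """
    total = sum(counts.values())
    by_zip = defaultdict(int)
    by_day = defaultdict(int)
    for (z, d), c in counts.items():
        by_zip[z] += c
        by_day[d] += c
    # mu_zd = (cases in zip z over all days) * (cases on day d over all zips) / total
    return sum(by_zip[z] * by_day[d] for z in zips for d in days) / total


def observed_to_expected(counts, zips, days):
    """Ratio of observed to expected cases for one cylinder."""
    observed = sum(counts.get((z, d), 0) for z in zips for d in days)
    expected = expected_cases(counts, zips, days)
    if expected == 0:
        raise ValueError("expected count is zero for this cylinder")
    return observed / expected


# Toy data, invented for illustration: two zip codes observed over two days.
toy_counts = {
    ("zip_a", 1): 2, ("zip_a", 2): 8,
    ("zip_b", 1): 5, ("zip_b", 2): 5,
}
ratio = observed_to_expected(toy_counts, zips={"zip_a"}, days={2})
print(round(ratio, 3))
```

In the toy data, `zip_a` on day 2 has 8 observed cases against 6.5 expected, so its ratio exceeds 1. A ratio above 1 means the cylinder holds more cases than its zip codes and days would suggest if space and time were independent.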
As shown on the left heatmap, one can filter by county and by date to see the hotspots that were active during that period in the selected county at the zip code level. Significant hotspots have a p-value less than 0.5, and active hotspots are those that remain active on the day of the query. Hotspots are ranked by their observed-to-expected case ratio, where observed is the number of cases in the data and expected is the number of cases that would occur if there were no spatio-temporal correlation (thus all hotspots have an observed-to-expected ratio larger than 1). Hotspots are colour scaled: those with higher observed-to-expected ratios fall on the red side of the scale, and those with lower ratios on the yellow side.

COVID-19 + SEDoH Correlation

On the left is an index selector that chooses among the 18 variables of most interest out of all SEDoH variables, including community resilience, the COVID-19 Community Vulnerability Index, the Healthy Places Index, and the Social Vulnerability Index (SVI). On the right is a heatmap of confirmed COVID-19 cases per 100k.

Health Disparities and COVID-19

● Using the CA Healthy Places Index (HPI) and COVID-19 data to explore relationships between socioeconomic opportunities and health outcomes by zip code. We created interactive maps to explore the relationship between these two.

Conclusion

● Exploring the ways that data science frameworks can be applied to public health and epidemiology problems.
● Working on a diverse team of public health experts and data specialists.
● Setting up a framework for future public health projects to integrate and leverage data tools and data-driven frameworks.

Acknowledgements

● Lorene Nelson
● Hoda Abdel Magid
● Barbara Topol
● Alistair Lindawson
● Sam Jaros
● Mike Hittle
● Kari Hanson