GIScience and the Big Data Age Yihong Yuan

GIScience and the Big Data Age
Yihong Yuan
Department of Geography, Texas State University

About me
• Yihong Yuan, Assistant Professor
  yuan@txstate.edu, ELA 366, 512-245-3208
• Research Interests
  – Spatio-temporal data mining
  – Human mobility and activity patterns
  – Big data analytics

Geography and Big Data
• GIS
  – Not only about mapping functions
• Big Geo-data
  – Information and communication technologies (ICTs)
• Greater mobility flexibility
• A wide range of spatio-temporal data sources
• Align marketing campaigns to spatial patterns

• “Geography is one of the most natural, logical and intuitive ways to discover, visualize, overlay, compare, slice, sort and apply big data to a problem”
• “GIS used to be about the analysis of relatively static institutional data, but new data streams mean that today’s GIS problems look very much the same as today’s big data problems: extract meaningful information from a fire hose of inputs”

• Traditional geographic knowledge discovery
  – e.g., high-resolution trajectories
• Incomplete spatio-temporal datasets
  – Low resolution
  – Few individual attributes
  – Uncertainty?

Past Research
• Georeferenced mobile phone data analytics
  – Individual-oriented research
    – Activity space
      » Measurements: radius, eccentricity, entropy
      » Correlation between phone usage and activity space
    – Trajectory and sequence patterns
      » Time series analysis
  – Urban-oriented studies
    • Spatial clusters
    • Spatial rhythms
    • Dynamic clustering
    • Functional time points

UML model of georeferenced mobile phone data

Example Mobile Phone Dataset
• Mobile phone connections in 10 cities in northeast China
  – Time, duration, and locations of mobile phone connections over 9 days
  – Age and gender attributes of the users
  – Possibility of simulated data

Analysis of activity space
• Three measurements
  – Radius -> Scale
    • Eigenvectors of trajectories
  – Eccentricity -> Shape
    • Range [0, 1]
    • Closer to a straight line or a circle
  – Entropy -> Regularity
    • How random the visiting patterns are
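The three measurements can be sketched in code. The slides do not give exact formulas, so this is a minimal sketch under assumed, commonly used definitions: radius of gyration for scale, an eccentricity derived from the eigenvalues of the location covariance matrix for shape (0 = circle, 1 = straight line), and Shannon entropy of visit frequencies for regularity. The function name `activity_space` and the input format are illustrative assumptions.

```python
import math
from collections import Counter


def activity_space(points):
    """Return (radius, eccentricity, entropy) for a list of visited locations.

    points: list of (x, y) tuples in a projected coordinate system
    (assumption: planar coordinates, one tuple per recorded visit).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Scale: radius of gyration (RMS distance from the centroid).
    radius = math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

    # Shape: eigenvalues of the 2x2 covariance matrix, in closed form.
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    mean = (sxx + syy) / 2
    diff = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major = mean + diff
    lam_minor = max(mean - diff, 0.0)  # guard against tiny negative rounding
    eccentricity = math.sqrt(1 - lam_minor / lam_major) if lam_major > 0 else 0.0

    # Regularity: Shannon entropy (bits) of how often each location is visited.
    counts = Counter(points)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return radius, eccentricity, entropy
```

Under these assumptions, visits spread along a single street give an eccentricity near 1, while visits spread evenly in all directions give a value near 0.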

Correlation between individual activity space and phone usage

Results
• For people with higher mobile phone usage:
  – Larger activity space
  – Trajectories are closer to a circle
  – Movement is more random, less predictable
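One simple way to quantify a relationship of this kind is a correlation coefficient. The slides do not say which statistic was used, so the sketch below assumes Pearson's r. The input values are made-up toy numbers for illustration only, not results from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical per-user values: phone usage vs. activity-space radius.
calls_per_day = [2, 5, 9, 14, 20]
radius_km = [1.0, 1.8, 2.9, 4.1, 6.0]
r = pearson(calls_per_day, radius_km)
```

The same function could be applied to eccentricity or entropy in place of the radius.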

Activity space vs. trajectory

Analysis of trajectory patterns
• Compare trajectories from phone records
  – Sequences of cell IDs
• Edit distance method
  – String matching and auto-correction
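As a minimal sketch of the edit distance idea, the function below computes the standard Levenshtein distance between two sequences of cell IDs. It assumes unit cost for insertion, deletion, and substitution; the slides do not state the cost scheme, and the cell IDs in the usage note are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (e.g. lists of cell IDs)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]
```

For example, `edit_distance(["c1", "c2", "c3"], ["c1", "c3"])` counts one deletion, so two users whose cell sequences differ by a single skipped tower are treated as very similar.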

Analysis of trajectory patterns (Cont.)
• Applications
  – Identify similar users
    • Clustering analysis
  – Identify outlier users
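Both applications can be built on pairwise trajectory distances. The sketch below is one possible approach, not necessarily the one used in the study: it computes each user's mean edit distance to all other users and treats the user with the largest mean as an outlier candidate. The user IDs, cell IDs, and the helper name `mean_distances` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of cell IDs (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def mean_distances(trajs):
    """Map each user to the mean edit distance from their trajectory to all others."""
    result = {}
    for u in trajs:
        others = [v for v in trajs if v != u]
        total = sum(edit_distance(trajs[u], trajs[v]) for v in others)
        result[u] = total / len(others)
    return result


# Hypothetical trajectories: three similar users and one very different one.
trajs = {
    "u1": ["c1", "c2", "c3", "c2", "c1"],
    "u2": ["c1", "c2", "c3", "c1"],
    "u3": ["c1", "c2", "c2", "c1"],
    "u4": ["c7", "c8", "c9", "c8", "c7"],
}
md = mean_distances(trajs)
outlier = max(md, key=md.get)  # user farthest, on average, from everyone else
```

The full pairwise distance matrix could equally be passed to a clustering routine to group similar users; that step is omitted here.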


Urban hotspots and clusters
• The changing clustering of urban areas
[Figure panels: Weekdays vs. Weekends at T1: 8 am-9 am, T2: 2 pm-3 pm, T3: 7 pm-8 pm]

Urban clusters (Cont.)
• Mobility patterns of different population groups
  – Weekday 2 pm-3 pm
[Figure panels: Age 12-17, Age > 60]

Urban clusters (Cont.)
• Provide input for urban infrastructure planning
  – Are public facilities where people are?
[Figure: a park, Age > 60]

Dynamic Clustering
• Focus on “rhythms” instead of just “clusters”
• Various mobility patterns in urban areas
  – How to explore?
  – Time series analysis
[Figure panels: CBD, Beijing; Suburb, Beijing]

Dynamic Clustering (Cont.)
• Methods
  – Divide study area
    • Voronoi polygons (based on towers)
• What to compare: 24-hour series for each polygon based on mobility count
• Outlier detection, e.g., traffic congestion
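The comparison of 24-hour series can be sketched as follows. The slides do not specify the outlier rule, so this minimal sketch makes several assumptions: counts are already aggregated per Voronoi polygon, each series is normalised to shares of its daily total so that only its rhythm matters, and a polygon is flagged when its distance from the median profile exceeds the median distance plus `k` median absolute deviations. The polygon IDs and the function name `outlier_polygons` are hypothetical.

```python
import math
from statistics import median


def normalise(series):
    """Convert hourly counts to shares of the daily total (the daily rhythm)."""
    total = sum(series)
    return [v / total for v in series] if total else list(series)


def outlier_polygons(counts, k=3.0):
    """Flag polygons whose 24-hour rhythm deviates from the typical profile.

    counts: dict mapping polygon ID -> list of 24 hourly mobility counts.
    """
    profiles = {pid: normalise(s) for pid, s in counts.items()}
    hours = len(next(iter(profiles.values())))
    # Typical rhythm: the hour-by-hour median across all polygons.
    typical = [median(p[h] for p in profiles.values()) for h in range(hours)]
    dist = {pid: math.dist(p, typical) for pid, p in profiles.items()}
    med = median(dist.values())
    mad = median(abs(d - med) for d in dist.values())
    return sorted(pid for pid, d in dist.items() if d > med + k * mad)


# Hypothetical data: five polygons with a daytime peak, one with a night peak.
day = [1] * 8 + [5] * 10 + [1] * 6
night = [5] * 6 + [1] * 12 + [5] * 6
counts = {pid: [v * scale for v in day]
          for pid, scale in zip("ABCDE", (1, 2, 3, 4, 5))}
counts["F"] = night
flagged = outlier_polygons(counts)
```

Normalising first means a busy polygon and a quiet one with the same daily rhythm are treated alike, which matches the emphasis on rhythms rather than raw volume.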

Outlier polygons
• 15 outliers for weekdays and 18 for weekends
[Figure panels: Weekdays, Weekends]

Mobility patterns in outlier areas
• Outlier Polygon 238
  – Night clubs and other leisure facilities
  – International trading center
• Outlier Polygon 125
  – Several community colleges
  – Not many night clubs, bars, etc.
[Figure panels: Polygon 238, Polygon 125]

Current and future research

Setting up functional time in cities
• Standardization of time
  – Determination of the beginning/end of a day
• The development of ICT
  – Real-time activity patterns
  – More flexibility in time management and activity scheduling
    • e.g., a fixed parking-hour policy may not be applicable in central business districts

Setting up functional time in cities

Cross-country comparison for social media websites
• Flickr data, 100 million records and geotagged photos
• Similarity and dissimilarity of human mobility in various cities
  – “A tale of many cities”

Current and future research
• Mobility patterns in developing and developed countries
  – China as a focus
• Weibo and Twitter check-in data
  – Comparison study for special time periods
  – Holiday patterns

Current and future research
• Mass media and social media
  – GDELT dataset
    • Geo-tagged news events from the 1970s
  – Public relations and interaction between countries


Big data and GIS jobs…
• Traditional GIS jobs:
  – GIS technician/analyst/consultant
  – GIS manager/researcher
  – ……
• Where are the positions?
  – Public sector… NGA, USGS, state and local government, DOT, planning departments
  – Private companies… oil & gas, mapping companies, land management, utilities…
  – Non-profit agencies… Nature Conservancy, International Crane Foundation
  – Consulting firms… surveying, remote sensing…

Example: Private Sector Jobs
• Mapping Companies
• Software Developers
• Utilities
• Land Development
• Non-Profits
• Others

Job Skills
• Project Management
• Technical Support
• Report Writing
• Public Speaking
• Research/Literature Review
• Programming

Software Skills (Cont.)
• GIS software packages
  • ArcGIS, ENVI, GDAL
• Mobile & Web Technology
  – Silverlight / Flex / HTML / ASP
  – Android Dev
• Python / C#...
• Database: Access, SQL Server, PostgreSQL

Job Postings
• Company Website
  – ESRI summer internship program
• Relevant Employment Websites
  – General sites: Monster.com / Indeed.com
  – Linkedin.com
  – Glassdoor.com
  – GIS Jobs Clearinghouse (gjc.org)
  – GISjobs.com & Geojobs.org
  – GeoCommunity
  – GIS Café
  – WI State Cartographer's Office
    • http://www.sco.wisc.edu/jobs.php

Job Postings
• Internal Company Postings
• Company Website
• Relevant Employment Websites
  – GIS Jobs Clearinghouse (gjc.org)
  – GISjobs.com & Geojobs.org
  – GeoCommunity
  – GIS Café
  – Monster.com


Big data jobs…
• Spatial data are inherently big data…
• For GIS majors…
  – Data Scientist
    • This is a more “general” term
    • Focus on big (geo)data analytics
    • Highly competitive salary
    • Graduate degree (MA possible, Ph.D. preferred)
    • Many opportunities…
• Skill set:
  • Strong statistical background
  • Strong programming skills: Python, R, etc.

Example positions
• Data Scientist @ ESRI
  – http://www.simplyhired.com/job/data-scientist-agriculturejob/esri/5jjxyxjt4b?cid=ntvzgigizsvnqhofbuscopqozjkxqugd
• Research Data Scientist
  – http://www.americasjobexchange.com/job-detail/job-opening-AJE569661132?source=indeed&utm_source=Indeed&utm_medium=cpc&utm_campaign=Indeed
• Other potential groups: Apple geo-group, Twitter geo-group, Facebook data science group
