Big Data Era in Sky and Earth Observation
Impact of BIGSKYEARTH
• We want to set the ground for long-term networking across the earth-space domain
• We want to help European researchers use the best available tools to deliver cutting-edge science in the Big Data era
The era of Big Data has arrived!
Example: images + time = surface movements
Sentinel-1A
• Launch date: 3 April 2014
• Up to 2.4 TB/day of imaging radar data for 7 years (fully open and free data access policy)
• Applications: oceans and ice, changing lands, emergency response
• First in a constellation: 1A/B/C, 2A/B/C, 3A/B/C, 4A/B, 5P/A/B, ...
• Part of the European Earth Observation Programme Copernicus
Copernicus, the most ambitious Earth observation programme to date: 30 satellites; petabytes now, zettabytes in a decade
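As a rough check on the Sentinel-1A figures, the quoted peak rate can be turned into a cumulative volume. The sketch below is only an upper bound under stated assumptions that are not on the slide: decimal units (1 PB = 1000 TB), 365.25 days per year, and the peak rate of 2.4 TB/day sustained for all 7 years. The function name is illustrative.

```python
# Assumptions (not from the slides): decimal units, 365.25 days per year,
# and the quoted peak rate sustained for the whole mission lifetime.
TB_PER_PB = 1000.0
DAYS_PER_YEAR = 365.25


def cumulative_volume_pb(rate_tb_per_day: float, years: float) -> float:
    """Return the total data volume in PB for a constant daily rate."""
    if rate_tb_per_day < 0 or years < 0:
        raise ValueError("rate and duration must be non-negative")
    return rate_tb_per_day * DAYS_PER_YEAR * years / TB_PER_PB


if __name__ == "__main__":
    total_pb = cumulative_volume_pb(rate_tb_per_day=2.4, years=7)
    print(f"Sentinel-1A upper bound: {total_pb:.1f} PB")
```

Under these assumptions a single satellite already accumulates roughly 6 PB, which is consistent with the "petabytes now" statement for the full constellation.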
The era of Big Data has arrived!
Gaia space telescope
• Launch date: 19 December 2013
• 1 PB of data after 5 years (40 million observations a day!)
• Free data catalogue access policy
• Applications: 3D catalogue of ~1 billion astronomical objects
Pan-STARRS (NEO defence)
LOFAR (radio telescope): 1 petabyte per year
>100 TB of data
Large Synoptic Survey Telescope: 30 TB of imaging data each night
Square Kilometre Array: 1.5 exabytes per year
The Virtual Observatory (VO): provides standards describing all astronomical resources worldwide and supports standardized discovery and access to these collections
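The volumes on this slide are quoted over different periods (per night, per year, after 5 years), which makes them hard to compare directly. A minimal sketch that normalizes them to a mean rate in TB/day follows. The assumptions are mine, not the slide's: decimal units, 365.25 days per year, volumes spread evenly over the quoted period, and one observing night counted as one day for the Large Synoptic Survey Telescope.

```python
# Assumptions (not from the slides): decimal units, 365.25 days per year,
# evenly spread volumes, and one observing night counted as one day.
TB_PER_PB = 1_000.0
TB_PER_EB = 1_000_000.0
DAYS_PER_YEAR = 365.25


def mean_rate_tb_per_day(volume_tb: float, period_days: float) -> float:
    """Average data rate in TB/day for a volume collected over a period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return volume_tb / period_days


def survey_rates() -> dict:
    """Mean daily rates for the instruments quoted on the slide."""
    return {
        "Gaia": mean_rate_tb_per_day(1 * TB_PER_PB, 5 * DAYS_PER_YEAR),
        "LOFAR": mean_rate_tb_per_day(1 * TB_PER_PB, DAYS_PER_YEAR),
        "LSST": mean_rate_tb_per_day(30.0, 1.0),
        "SKA": mean_rate_tb_per_day(1.5 * TB_PER_EB, DAYS_PER_YEAR),
    }


if __name__ == "__main__":
    # Print the instruments from the lowest to the highest mean rate.
    for name, rate in sorted(survey_rates().items(), key=lambda item: item[1]):
        print(f"{name:<6} {rate:>10.2f} TB/day")
```

On these assumptions the rates span almost four orders of magnitude, from about half a terabyte per day to several petabytes per day.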
Shared challenges: data tsunami
Digital curation and data access (example: ESA's G-POD)
• store, maintain & preserve huge amounts of data
• large multidimensional & highly interrelated datasets = paradigm change: push the computing to the data
Visualization (example: Google Earth Engine)
• visualizing large quantities of data with low signal-to-noise ratio, high dynamic range, multidimensional parameter space, multi-layered time dependence, ...
Adaptation to new high performance computing (HPC) technologies (example: GPUs as numerical co-processors)
• heterogeneous supercomputing environments
• new programming techniques for GPUs and cloud computing
Training of a new generation of scientists (example: new books and online courses)
• astroinformatics, geoinformatics, bioinformatics
• natural sciences + IT/CS = exploration with statistics
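The "push the computing to the data" idea can be illustrated with a toy example: instead of downloading every value from each archive, a small reduction runs next to each dataset and only a few numbers travel back to the user. The sketch below is an illustration under assumptions, not a description of G-POD or any real service; the names `PartialStats`, `summarize_locally` and `combine` are hypothetical, and plain Python lists stand in for remote archives.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class PartialStats:
    """Small summary that is cheap to send over the network."""
    count: int
    total: float
    total_sq: float


def summarize_locally(values: Iterable[float]) -> PartialStats:
    """Runs where the data lives: reduce a large chunk to three numbers."""
    count, total, total_sq = 0, 0.0, 0.0
    for value in values:
        count += 1
        total += value
        total_sq += value * value
    return PartialStats(count, total, total_sq)


def combine(parts: List[PartialStats]) -> Tuple[float, float]:
    """Runs at the client: merge summaries into a global mean and
    population standard deviation without seeing the raw values."""
    n = sum(p.count for p in parts)
    if n == 0:
        raise ValueError("no data to combine")
    mean = sum(p.total for p in parts) / n
    variance = max(sum(p.total_sq for p in parts) / n - mean * mean, 0.0)
    return mean, sqrt(variance)


if __name__ == "__main__":
    archive_a = [1.0, 2.0, 3.0]   # stands in for one remote archive
    archive_b = [4.0, 5.0]        # stands in for another
    mean, std = combine([summarize_locally(archive_a),
                         summarize_locally(archive_b)])
    print(f"mean={mean:.3f} std={std:.3f}")
```

The sum-of-squares merge keeps the example short; for real data with large values a numerically more stable merge would be preferable.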
Why COST Action?
BIG-SKY-EARTH: Astronomy and Earth Observation
• detectors, telescopes, satellites
• computing resources, digital curation, numerical methods, knowledge discovery in big data sets, machine learning
The challenges are inherently global and transdisciplinary!
• boost communication within and between disciplines
• identify and cluster relevant common solutions
• go beyond individual or national projects
• diversify the pool of experts
• address and document various issues in Big Data science
• spread good practices between the astro- and geo-communities
• build a joint agenda and training/education resources
Set the ground for long-term networking!
Action objectives
Objective 1: Framing the Joint Long-term Agenda
• identify, compare and assess the common narrative, methods, techniques and tools used in astro-, geo- and computer sciences
Objective 2: Incubation of New Knowledge
• development of solutions to the challenges, e.g. implementation of DBMS; standardization of data communication across disciplines; joint visualization tool or joint education materials
Objective 3: Defragmentation of Existing Knowledge
• use international collaboration to defragment and systematize the Big Data knowledge, e.g. underlying numerical methods, algorithms and backend IT/CS solutions
Objective 4: Dissemination
• e.g. online tools and training materials, workshops, different target groups
Management structure
Working Groups
• WG 1: Optimisation of database tools in astro- and geophysics contexts (focused more on the back-end tools)
• WG 2: Data mining and machine learning in the petabyte era as frontiers in astronomy and Earth observation (focused more on the front-end tools)
• WG 3: Education of a new generation of experts in knowledge extraction from massive datasets
• WG 4: Visualization of high dimensional data
Management and supervision
• Management Committee (MC): national delegates
• MC Chair & Vice-Chair
• Technical Manager
• Training, Dissemination and Liaison Manager (scientific & outreach)
• WG 1, 2, 3, 4 Scientific Leaders & co-Leaders
• Inclusion Manager
• STSM Committee
• Core Group
STSM = Short Term Scientific Mission; JSS = Joint Student Supervision; WG = Working Group; CG = Core Group; MC = Management Committee; TD = Technical deliverable
Timeline: 2015 to 2017

YEAR 1, SEMESTER 1 (amounts in EUR)
| Activity | Location | Date | Meeting days | Expected participants | Participants reimbursed | Average reimbursement per participant | Total reimbursement from COST | Local Organiser Support | Total cost of the meeting |
|---|---|---|---|---|---|---|---|---|---|
| MC 2nd meeting | Belgrade | March 30/31 | 1 | 40 | 40 | 475 | 19000 | 600 | 19600 |
| WG 1, 2, 3, 4 meeting | Belgrade | March 30/31 | 2 | 15 | 15 | 300 | 4500 | 1200 | 5700 |
| At least 2 STSMs | | | | 2 | 2 | 1000 | 2000 | | |
| Website setup | | | | | | | | | 2500 |
| Website operation | | | | | | | | | 1000 |
Activities without a budget line: Define rules on STSM and JSS; Define rules on WG operation; First JSS introduced

YEAR 1, SEMESTER 2 (amounts in EUR)
| Activity | Location | Date | Meeting days | Expected participants | Participants reimbursed | Average reimbursement per participant | Total reimbursement from COST | Local Organiser Support | Total cost of the meeting |
|---|---|---|---|---|---|---|---|---|---|
| WG 1+WG 2 meeting | Lyon, International Symposium on Methodologies for Intelligent Systems | October 21-23, 2015 | 2 | 50 | 30 | 300 | 9000 | 2000 | 11000 |
| CG 1st meeting | | | 1 | 12 | 12 | 300 | 3600 | 240 | 3840 |
| WG 3+WG 4 meeting | | | 2 | 50 | 30 | 300 | 9000 | 2000 | 11000 |
| Training school | | | 6 | 36 | 24 | 700 | 16800 | 4320 | 21120 |
| Workshop | Brno University | | 3 | 60 | 30 | 308 | 9240 | 3160 | 12400 |
| CG 2nd meeting | Brno University | | 1 | 12 | 12 | 150 | 1800 | 240 | 2040 |
| MC 3rd meeting | Brno University | | 1 | 40 | 40 | 480 | 19200 | 600 | 19800 |
| At least 4 STSMs | | | | 4 | 4 | 1000 | 4000 | 0 | 4000 |
Further activities: JSSs; Dissemination meeting; Website operation; Joint publication (figures listed: 2, 1000, 2000; 1000)
Further venues and events listed:
• Machine Learning and Principles and Practice of Knowledge Discovery in Databases; Porto, September 7-11, 2015 (www.ecmlpkdd2015.org)
• Dubrovnik (Croatia), October 5-9, + Astroinformatics Conference 2015
• DLR site at Oberpfaffenhofen
• International Geoscience and Remote Sensing Symposium 2015 (IGARSS 2015), July 26-31, 2015, Milan, Italy: http://www.igarss2015.org/default.asp
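The budget figures on this slide appear to follow simple arithmetic: total reimbursement = participants reimbursed × average reimbursement, and total cost = total reimbursement + Local Organiser Support. The sketch below checks that reading against three lines whose numbers are unambiguous in the slide. The rule itself is inferred from the figures rather than stated on the slide, and the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetLine:
    """One meeting budget line; amounts in EUR (field names are illustrative)."""
    activity: str
    participants_reimbursed: int
    average_reimbursement: int
    local_organiser_support: int

    @property
    def total_reimbursement(self) -> int:
        # Inferred rule: reimbursed participants times the average amount.
        return self.participants_reimbursed * self.average_reimbursement

    @property
    def total_cost(self) -> int:
        # Inferred rule: reimbursement plus Local Organiser Support.
        return self.total_reimbursement + self.local_organiser_support


# Input figures taken from the Year 1 slide.
YEAR_1_LINES = [
    BudgetLine("MC 2nd meeting", 40, 475, 600),
    BudgetLine("WG 1, 2, 3, 4 meeting", 15, 300, 1200),
    BudgetLine("Training school", 24, 700, 4320),
]

if __name__ == "__main__":
    for line in YEAR_1_LINES:
        print(f"{line.activity}: reimbursement {line.total_reimbursement} EUR, "
              f"total {line.total_cost} EUR")
```

The computed totals (19600, 5700 and 21120 EUR) match the totals listed on the slide for these three lines.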
YEAR 2, SEMESTER 1 (amounts in EUR)
| Activity | Location | Meeting days | Expected participants | Participants reimbursed | Average reimbursement per participant | Total reimbursement from COST | Local Organiser Support | Total cost of the meeting |
|---|---|---|---|---|---|---|---|---|
| CG 3rd meeting | | 1 | 12 | 12 | 150 | 1800 | 240 | 2040 |
| Training school | Cambridge; Cambridge Big Data, http://www.bigdata.cam.ac.uk | 6 | 36 | 24 | 700 | 16800 | 4320 | 21120 |
| At least 4 STSMs | | | 4 | 4 | 1000 | 4000 | 0 | 4000 |
| Website operation | | | | | | | | 750 |
WG 2+WG 4 meeting, WG 1+WG 3 meeting and Workshop:
• Venues listed: Lithuania; Macedonia; Research Center for Spatial Information, University Politehnica Bucharest, http://ceospacetech.pub.ro
• Budget lines listed (meeting days, expected participants, participants reimbursed, average reimbursement, total reimbursement, Local Organiser Support, total cost): 2, 50, 30, 300, 9000, 2000, 11000; and 3, 60, 30, 360, 10800, 3000, 13800
Further activities: JSSs

YEAR 2, SEMESTER 2 (amounts in EUR)
| Activity | Location | Meeting days | Expected participants | Participants reimbursed | Average reimbursement per participant | Total reimbursement from COST | Local Organiser Support | Total cost of the meeting |
|---|---|---|---|---|---|---|---|---|
| WG 1, 2, 3, 4 meeting | | 1 | 50 | 30 | 300 | 9000 | 2000 | 11000 |
| Conference in conjunction with Astro-Info 16 (IAU) | Sorrento, Italy | 2 | 80 | 40 | 300 | 12000 | | 14000 |
| CG 4th meeting | Lisbon | 1 | 12 | 12 | 150 | 1800 | 240 | 2040 |
| MC 4th meeting | Lisbon | 1 | 40 | 40 | 475 | 19000 | 600 | 19600 |
| At least 4 STSMs | | | 5 | 5 | 1000 | 5000 | 0 | 5000 |
Further activities: JSSs; Dissemination meeting; Website operation; Joint publication (figures listed: 2, 1000, 2000; 750; 1000)
Further event listed: European Week for Astrophysics and Space Science (EWASS) 2016 in Athens, 4-8 July 2016
Impact
Big Data is not just bigger, it is different! Success in research will depend on the ability to mine knowledge from that data. And some of the most interesting science probably hasn't even been imagined!
Across the earth-space domains of Big Data:
• recognizing and documenting well-defined common problems
• cultivating a community of practitioners who work together on sharing common informatics techniques
• leveraging the best practices, methodologies, conceptual approaches and tools
• improving knowledge extraction from Big Data (e.g. finding outliers, finding trends in multidimensional sets, ...)
• providing new, up-to-date educational tools and materials
Big Data is the 4th paradigm, in addition to theory, measurements, and modeling
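"Finding outliers" is named above as one example of knowledge extraction. A minimal one-dimensional sketch of that task follows, using the modified z-score based on the median and the median absolute deviation (MAD). This is one common robust technique chosen for illustration, not a method prescribed by the Action; the function name, the 3.5 threshold and the sample values are assumptions.

```python
from statistics import median
from typing import List, Sequence


def mad_outliers(values: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Return indices of values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation; 0.6745 makes the score comparable to a
    standard z-score for normally distributed data.
    """
    if not values:
        return []
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # At least half of the values are identical: no usable robust scale.
        return []
    return [
        index
        for index, value in enumerate(values)
        if 0.6745 * abs(value - center) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical brightness measurements with one anomalous reading.
    measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 25.0]
    print("outlier indices:", mad_outliers(measurements))
```

Because the median and MAD are barely affected by the anomalous reading itself, the last value is flagged while the ordinary scatter of the others is not.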