- Slides: 14
Themes in Intelligent and Information Systems

Outline
- Significant ongoing IIS-GEO interactions
- Themes
- Major new directions

Significant Ongoing IIS-GEO Interactions
- CCC’s Spatial Computing 2020: http://www.cra.org/ccc/visioning-activities/spatial-computing
- Climate Informatics workshop series: http://www.climateinformatics.org
- Sustainability tracks at the AAAI and IJCAI conferences
- Discovery Informatics symposium series: http://www.discoveryinformaticsinitiative.org
- NSF CISE Expeditions awards
  - Climate (Kumar)
  - Sustainability (Gomes)
- NSF GEO/CISE EarthCube awards
  - Machine reading (Peters)
  - Semantic Web and linked data (Arko, Cheatham)
  - Software metadata (Gil, Peckham)
- GEON, HIS, …
- MIT’s env science conference
(SHEKHAR, KNOBLOCK, LIU, GOMES, GIL, KUMAR, PETERS, CHEATHAM)

Outline
- Significant ongoing IIS-GEO interactions
- Themes
  - Collecting data
  - Integrating data
  - Analyzing data
  - Visualizing data
  - Cross-cutting themes

Collecting Data
1. Sensors
   - In-sensor compression, pre-processing, noise reduction, signal filtering, trend analysis, etc.
     - E.g., mass spectrometry to identify chemicals
   - Humans as sensors
     - E.g., http://mahali.mit.edu
2. Autonomous vehicles (underwater, gliders)
   - Survey an area, eliminate measurement bias, re-survey for monitoring and tracking
   - Object recognition from images
     - Convolutional neural networks (CNNs) to learn feature representations directly from the data
3. Social knowledge collection
   - Collecting structured metadata
   - Involving citizen scientists to curate legacy data catalogs
(PANKRATIUS, SHEKHAR, SKINNER & JOHNSON-ROBERSON, NORTH, GIL)
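
As one illustration of the in-sensor noise reduction and signal filtering mentioned above, the sketch below applies exponential smoothing to a stream of readings. It is a minimal example, not a method named in the slides: the function name `exponential_smoothing`, the parameter `alpha`, and the sample readings are all assumptions made for illustration.

```python
def exponential_smoothing(samples, alpha=0.3):
    """Return an exponentially weighted moving average of `samples`.

    `alpha` (hypothetical parameter) controls how quickly the filter
    follows new readings: 1.0 means no smoothing at all.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    estimate = None
    for x in samples:
        # The first sample initialises the estimate; later samples are
        # blended with the running estimate.
        estimate = x if estimate is None else alpha * x + (1 - alpha) * estimate
        smoothed.append(estimate)
    return smoothed


# Made-up readings with a single spike that the filter dampens.
readings = [10.0, 10.4, 9.7, 30.0, 10.1, 9.9]
print(exponential_smoothing(readings, alpha=0.5))
```

A filter of this kind needs only one stored value per channel, which is why it suits constrained sensor hardware; a real deployment would pick the filter to match the signal and noise characteristics.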

Integrating Data
- Semantic Web technologies
  - Semantics to integrate across sources: http://www.isi.edu/integration/karma
  - Sharing as “linked data”
  - Reuse of ontologies for new uses/applications
- Supporting collaborative science
  - “Ontology Design Patterns” to efficiently align diverse datasets
- Entity resolution/linking across data sources
  - Millions of records
- Provenance standards
  - Describe meaning of data and its quality
(KNOBLOCK, McGUINNESS, CHEATHAM, GIL, PLALE)
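
Entity resolution across data sources can be sketched with simple string similarity. The example below is only an illustration using Python's standard `difflib`; the helper names (`normalize`, `similarity`, `link_records`), the threshold value, and the sample place names are assumptions, not taken from the slides or from any of the tools they mention.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and strip simple punctuation and extra spaces."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def similarity(a, b):
    """Similarity in [0, 1] between two names after normalisation."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(source_a, source_b, threshold=0.85):
    """Pair each name in source_a with its best match in source_b.

    A pair is kept only if its similarity reaches `threshold`
    (a hypothetical cut-off chosen for illustration).
    """
    links = []
    for a in source_a:
        best = max(source_b, key=lambda b: similarity(a, b), default=None)
        if best is not None and similarity(a, best) >= threshold:
            links.append((a, best))
    return links


# Made-up catalog entries from two sources.
catalog_a = ["Lake Tahoe", "Crater Lake"]
catalog_b = ["lake tahoe.", "Mono Lake"]
print(link_records(catalog_a, catalog_b))
```

This all-pairs comparison is quadratic in the number of records, so at the "millions of records" scale noted above it would have to be combined with some form of blocking or indexing that limits which pairs are compared.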

Analyzing Data: Machine Learning
- High-dimensional structured estimation, multivariate statistical dependency learning
  - Small data samples and large numbers of features
- Predictive modeling: nonlinear models for small samples
  - Semi-parametric: parametric linear combination of monotone nonlinear transformations
  - Non-parametric: multi-task learning where models change based on “phase” of input
- Robust representations
  - Deep learning: neural networks trained through perturbations of the input so the representation is invariant
- Network theory to identify patterns (e.g., teleconnections in climate)
- Probabilistic graphical models: incorporating knowledge representation + structure learning
  - Directed (Bayes nets) or undirected (Markov random fields)
  - Latent unobserved variables
  - Dependencies (causal discovery) over space/time
- Interactive modeling
- Online learning
  - Adaptive modeling as new data becomes available
- Causal inference
  - Nonlinearity, additive noise models, transportability, causal priors (e.g., temporal)
- Hybrid physical-statistical models
- Lack of representative ground truth
- Multi-scale, multi-resolution data
(BANERJEE, EBERT-UPHOFF, SMYTH, KUMAR, LIU)
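
The "adaptive modeling as new data becomes available" item can be made concrete with a small online learner. The sketch below updates a linear model one observation at a time using a least-mean-squares (stochastic gradient) step. The class name `OnlineLinearModel`, the learning rate, and the synthetic stream are assumptions for illustration; the slides do not prescribe a particular algorithm.

```python
class OnlineLinearModel:
    """Linear model updated one observation at a time (LMS / SGD step)."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # hypothetical step size

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        """Adjust the model using a single (x, y) pair; return the error
        measured before the adjustment."""
        error = y - self.predict(x)
        self.weights = [
            w + self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias += self.learning_rate * error
        return error


# Synthetic noise-free stream generated from y = 2x + 1.
model = OnlineLinearModel(n_features=1)
for _ in range(2000):
    for x in (0.0, 0.5, 1.0, 1.5, 2.0):
        model.update([x], 2.0 * x + 1.0)
print(model.weights, model.bias)
```

Because each update touches only the current observation, the model never needs the full history in memory, which is the property that makes online learning attractive for continuously arriving sensor data.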

Processing Data
- Scale: parallelism
  - Multi-core, cloud computing
  - Efficient algorithms
- Workflows
  - Sharing and reuse of modules and workflows
  - Collaborative analytics
  - Intelligent workflow systems
- Mining distributed data sources
  - Parallel processing
  - Distributed data mining
(PANKRATIUS, ZHANG, GIL, PLALE, BORNE)
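
One common pattern behind parallel and distributed data mining is to summarise each data partition independently and then merge the summaries. The sketch below computes a global mean and variance that way, using the standard pairwise merge formula for counts, means, and sums of squared deviations. The function names and the sample partitions are assumptions for illustration. A thread pool is used only to keep the example self-contained; for CPU-bound work on multiple cores or machines, a process pool or a cluster framework would take its place.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition_stats(values):
    """Summarise one partition as (count, mean, M2), where M2 is the
    sum of squared deviations from the partition mean."""
    n = len(values)
    if n == 0:
        return (0, 0.0, 0.0)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return (n, mean, m2)


def merge_stats(a, b):
    """Combine two partition summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return (0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)


def distributed_mean_variance(partitions, max_workers=4):
    """Return (mean, population variance) over all partitions."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(partition_stats, partitions))
    n, mean, m2 = reduce(merge_stats, summaries, (0, 0.0, 0.0))
    variance = m2 / n if n else 0.0
    return mean, variance


# Made-up partitions standing in for data held at different sites.
print(distributed_mean_variance([[1.0, 2.0], [3.0, 4.0, 5.0]]))
```

Only the three-number summaries cross partition boundaries, so the raw data can stay where it is stored.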

Visualizing Data
- Large-scale data visualization
  - Steerable online massive processing and rendering (e.g., with GPU computing)
- “Semantic interaction” for non-programmers
  - Direct manipulation of model output is transformed into operations on the inputs
  - Autonomous steering based on expert observation
- Low-cost virtual reality technologies
  - Navigating
  - Manipulating
  - Collaborating
  - Visualizing
- Multi-touch interaction
- Interactive visual analytics so experts can inject intuition/guidance
  - Human-in-the-loop analysis, human-system cooperation
- “Virtual humans” as guides or tutors to understand data or analytic tools
(SAMET, NORTH, KRUM)

Outline
- Significant ongoing IIS-GEO interactions
- Themes
  - Collecting data
  - Integrating data
  - Analyzing data
  - Visualizing data
  - Cross-cutting themes
    - Spatiotemporal science environments
    - Computer-aided discovery
    - Intelligent data systems
    - Machine learning for geosciences

Cross-Cutting Themes: 1) Spatiotemporal Science Environments
- Capture: implicit data capture (e.g., touch)
- Integration: efficient integration of geospatial data
- Processing:
  - Efficient location-based queries
  - Approximate results
- Analysis:
  - Learning Gaussian nonparametric processes
  - Multivariate parametric models (e.g., autoregressive models)
  - Probabilistic graphical models
  - Granger graphical models, causality
  - Frequent temporal pattern mining and dependency discovery, trajectory mining, anomaly detection, state-space models
  - Predictive modeling
  - Dimension reduction and latent-variable models
- Visualization:
  - Geo-visualization
  - High-speed geographical visualizations (Google Maps, Google Earth)
  - Geo-referenced data (e.g., Microsoft’s Photosynth)
  - Presentation on small devices
(SAMET, SHEKHAR, BANERJEE, EBERT-UPHOFF, LIU, BORNE, SMYTH, NORTH)
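
Efficient location-based queries are often built on a spatial index. The sketch below uses a uniform grid so that a radius query inspects only nearby cells instead of every point. It is a minimal illustration, not a structure named in the slides: the class `GridIndex`, its methods, and the sample points are hypothetical, and coordinates are treated as planar (no great-circle distances).

```python
from collections import defaultdict
from math import floor, hypot


class GridIndex:
    """Uniform grid over planar coordinates for radius queries."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        # Map a coordinate to the integer grid cell that contains it.
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def insert(self, x, y, item):
        self.cells[self._cell(x, y)].append((x, y, item))

    def within(self, x, y, radius):
        """Return items whose distance to (x, y) is at most `radius`."""
        cx_min, cy_min = self._cell(x - radius, y - radius)
        cx_max, cy_max = self._cell(x + radius, y + radius)
        hits = []
        # Only cells overlapping the query's bounding box are scanned.
        for cx in range(cx_min, cx_max + 1):
            for cy in range(cy_min, cy_max + 1):
                for px, py, item in self.cells.get((cx, cy), []):
                    if hypot(px - x, py - y) <= radius:
                        hits.append(item)
        return hits


# Made-up sensor positions.
index = GridIndex(cell_size=1.0)
index.insert(0.5, 0.5, "a")
index.insert(3.2, 3.9, "b")
index.insert(0.9, 1.1, "c")
print(sorted(index.within(1.0, 1.0, 0.8)))
```

Hierarchical structures such as quadtrees or R-trees serve the same purpose when point density varies widely across the region.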

Cross-Cutting Themes: 2) Computer-Aided Discovery
- Intelligent user interfaces
  - Assistance in formulating questions
  - Intelligent presentation in target (small) devices
- Automated analysis with human steering
- Guiding and tutoring of a dataset
- Autonomous sensors for surveillance and object recognition
(PANKRATIUS, KRUM, GIL, SKINNER, NORTH)

Cross-Cutting Themes: 3) Intelligent (meta)data systems
- Linking across data sources with semantics
- Automatic generation of metadata
- Recommendations of data/analytics to use
- Trusted data, provenance
(PIERCE, BORNE, KNOBLOCK, CHEATHAM, PLALE)

Cross-Cutting Themes: 4) Machine Learning for Geosciences
- Scale (e.g., dimensionality reduction)
- Knowledge-rich models
- Adaptivity (e.g., online learning)
- Interactivity
- Robustness
- Causality
(BANERJEE, EBERT-UPHOFF, SMYTH, LIU)