PHYSICS-GUIDED DATA LEARNING MODELS: Introduction to Artificial Intelligence

PHYSICS-GUIDED DATA LEARNING MODELS
Introduction to Artificial Intelligence and Machine Learning: “AI 101”
Noel P. Greis, Monica L. Nogueira, Sambit Bhattacharya
26 October 2018
• “Project 1” … a review of project goals
• “AI 101” … an overview of artificial intelligence and machine learning relevant to ROI (Nogueira)
• “Physics-Guided (Recurrent) Neural Nets” … an illustrative example of a theory-guided data learning model (Bhattacharya)
• “Looking Ahead” … some thoughts about theory-guided data learning models (Greis)
University of North Carolina at Charlotte - North Carolina State University - Fayetteville State University
Confidential Information. For Limited Distribution as Appropriate.

Project 1 Goals
Can we develop a new class of hybrid models in which physics-based models are combined with machine learning to enable new theory-guided data learning tools?
[Figure labels: Statistics, Pattern Recognition, Neurocomputing, Deep Learning, Data Mining, Databases, Machine Learning, Artificial Intelligence, Knowledge Discovery]
• Establish a toolbox of available data-driven modeling approaches
• Develop a framework for combining physics-based and data-driven models
• Demonstrate improved physical consistency and reduced error for hybrid approaches
PROJECT 1: Physics-Guided Data Learning Models

Project 1 Goals: The Need for Theory-Guided Data Science
The Physics-Based View: We have a good understanding of the physical system. We are able to describe it mathematically and can produce “theoretical” predictions of system behavior.
The Data Science View: We have no direct theoretical knowledge about the system, but we have a lot of experimental data on how it behaves and can “learn” the physics and make predictions.
[Figure labels: Numerical Models, Analytical Models, Hybrid Models, Empirical Models, AI-Based Models]
Why do we need machine learning when we already have physics-based models, and vice versa?
• Theory and practice often diverge in real-world environments.
• Solving physics-based models in practice can be complicated and time-consuming.
• The computational cost of gathering data for machine learning models can be prohibitive.

“AI 101” … an overview of artificial intelligence and machine learning relevant to ROI
ARTIFICIAL INTELLIGENCE
• Logic reasoning
• Natural Language Processing
• Computer Vision
• Robotics
• Computer systems simulate human intelligence processes, including learning, reasoning, and self-correction.
• “Cognitive computing” finds solutions in complex situations where answers may be ambiguous and uncertain.
• Strong vs. Weak AI
MACHINE LEARNING
• Virtual Assistants
• Email Spam & Malware Filtering
• Product Recommendation
• Self-driving cars
• Automatic Machine Translation
• Object Classification in Photos
• Image Caption Generation
• Automatic Game Playing
• Computer systems that learn by generalizing from data examples without relying on rules-based programming.
• Supervised, Unsupervised, and Reinforcement Learning
DEEP LEARNING
• Large neural networks and huge amounts of data create a hierarchy of models, which allows the computer to learn complicated concepts by building them out of simpler ones.

Machine Learning Algorithms By Learning Style: Supervised Learning
• Learn a model from labeled training data.
• Classification predicts discrete, unordered class labels.
• Regression analysis predicts a continuous outcome based on the relationship between predictor (explanatory) variables and a continuous response variable (outcome).
[Figure: TRAINING (labels, training data, machine learning algorithm, predictive model) and TESTING (new data, predictive model, prediction)]
[Figure: classification panel with axes x1 and x2 and a decision boundary; regression panel with axes x and y and a fit line that minimizes distance]

Machine Learning Algorithms By Learning Style: Reinforcement Learning
• Employ a system (agent) that improves its performance based on interactions with the environment.
• Through interaction with the environment, the agent learns a series of actions (inputs) that maximizes the reward signal, measured by a reward function.
• Example: chess engine
[Figure labels: Environment, reward, state, Agent, action]

Machine Learning Algorithms By Learning Style: Unsupervised Learning
• Explore the structure of data to extract meaningful information without the guidance of a known outcome variable or reward function.
• Clustering organizes data into meaningful subgroups (clusters): “unsupervised classification”.
• Dimensionality reduction compresses data onto a lower-dimensional subspace while retaining most of the relevant information.
[Figure: clustering panel with axes x1 and x2; dimensionality reduction panel with axes x1, x2, x3 mapped to z1, z2]

Roadmap for Building Machine Learning Systems
• Preprocessing: feature extraction and scaling, feature selection, dimensionality reduction, sampling
• Learning: model selection, cross-validation, performance metrics, hyperparameter optimization
• Evaluation
• Prediction
[Figure labels: Labels, Raw Data, Training Dataset, Validation Dataset, Test Dataset, Learning Algorithm, Final Model, New Data, Labels]
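The preprocessing stage of the roadmap can be illustrated with a minimal sketch that cuts a labeled dataset into training, validation, and test subsets. The function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are assumptions made for illustration; they are not taken from the slides.

```python
import random


def split_dataset(samples, labels, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then cut into training, validation, and test subsets.

    The fractions and the seed are illustrative defaults, not values
    prescribed by the roadmap.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = int(len(indices) * test_frac)
    n_val = int(len(indices) * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]

    def pick(idx):
        return [samples[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(val_idx), pick(test_idx)


# Toy data: ten samples whose label is the parity of the sample value.
data = list(range(10))
targets = [value % 2 for value in data]
train, val, test = split_dataset(data, targets)
```

The model is fitted on `train`, hyperparameters are compared on `val`, and `test` is held back for the final evaluation.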

Source: http://www.asimovinstitute.org/neural-network-zoo/

Perceptron: The MCP Neuron
• The algorithm automatically learns the optimal weights to be multiplied by the input features, which decide whether the neuron “fires” or not, i.e. determine the output ŷ.
• The activation function Φ of the perceptron converts the net input $z = w^T x$ into binary outputs (-1 or 1) to discriminate between two linearly separable classes.
[Figure: axes x1 and x2, with the decision boundary $\Phi(w^T x) = 0$ separating the regions $\Phi(w^T x) < 0$ and $\Phi(w^T x) > 0$]

Perceptron Learning Rule
Perceptron algorithm:
1. Initialize the weights to zero or small random numbers.
2. For each training sample $x_i$:
   a. Compute the output value $\hat{y}_i$ (class label).
   b. Simultaneously update all weights: $w_j := w_j + \Delta w_j$, with $\Delta w_j = \eta\,(y_i - \hat{y}_i)\,x_{ji}$
η is the learning rate (a constant between 0.0 and 1.0), $y_i$ is the true class label of the i-th training sample, and $\hat{y}_i$ is the predicted class label.
Convergence requires two linearly separable classes and a sufficiently small learning rate. Otherwise, set a maximum number of epochs and/or a threshold for the number of tolerated misclassifications, or the perceptron will never stop updating the weights.
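The learning rule above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the presenters' implementation: the separate bias term (equivalent to a weight $w_0$ on a constant input of 1), the function names, the toy data, and the default values for `eta` and `max_epochs` are all assumptions.

```python
def predict(weights, bias, x):
    """Step activation: return 1 if the net input z is non-negative, else -1."""
    z = bias + sum(w * value for w, value in zip(weights, x))
    return 1 if z >= 0.0 else -1


def train_perceptron(X, y, eta=0.1, max_epochs=50):
    """Apply delta_w_j = eta * (y_i - y_hat_i) * x_ji sample by sample."""
    weights = [0.0] * len(X[0])  # step 1: initialize the weights to zero
    bias = 0.0                   # assumed bias unit (w_0 with x_0 = 1)
    errors_per_epoch = []
    for _ in range(max_epochs):  # cap on epochs in case of no convergence
        errors = 0
        for x, target in zip(X, y):
            update = eta * (target - predict(weights, bias, x))
            if update != 0.0:
                # step 2b: all weights are updated from the same prediction
                weights = [w + update * value for w, value in zip(weights, x)]
                bias += update
                errors += 1
        errors_per_epoch.append(errors)
        if errors == 0:          # every sample classified correctly
            break
    return weights, bias, errors_per_epoch


# Toy, linearly separable data (illustrative only).
X = [[2.0, 1.0], [3.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
weights, bias, errors_per_epoch = train_perceptron(X, y)
```

Because the toy classes are linearly separable, the loop stops as soon as an epoch produces no misclassifications; on non-separable data only `max_epochs` ends the training.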

ADAptive LInear NEurons (Adaline)
Key differences from the perceptron:
1) Uses the continuous-valued output of a linear activation function, rather than the binary class labels, to compute the model error and update the weights.
2) A quantizer, similar to a step function, can be used to predict class labels.

Minimizing Cost Functions with Gradient Descent
• Supervised ML requires defining an objective function, often a cost function, to be optimized during the learning process.
• For Adaline, the cost function used to learn the weights can be the Sum of Squared Errors (SSE) between calculated outcomes and true labels, which is differentiable and convex: $J(w) = \frac{1}{2} \sum_i \left(y_i - \Phi(z_i)\right)^2$
• The gradient descent optimization algorithm can be used to find the weights that minimize the cost function to classify the input samples.
Adaline learning rule: $w := w + \Delta w$, with $\Delta w = -\eta \nabla J(w)$
Key differences from the perceptron:
1) $\Phi(z_i)$ with $z_i = w^T x_i$ is a real number, not an integer class label.
2) The weight update is calculated from all samples, i.e. “batch” gradient descent.
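A minimal sketch of batch gradient descent for Adaline follows, assuming the linear activation $\Phi(z) = z$. Under that assumption $\partial J / \partial w_j = -\sum_i (y_i - \Phi(z_i))\, x_{ji}$, so the rule $\Delta w = -\eta \nabla J(w)$ becomes $\Delta w_j = \eta \sum_i (y_i - \Phi(z_i))\, x_{ji}$. The bias term, the function name, the toy data, and the values of `eta` and `epochs` are illustrative assumptions.

```python
def train_adaline(X, y, eta=0.01, epochs=20):
    """Batch gradient descent on J(w) = 1/2 * sum_i (y_i - phi(z_i))^2."""
    weights = [0.0] * len(X[0])
    bias = 0.0  # assumed bias unit (w_0 with x_0 = 1)
    costs = []
    for _ in range(epochs):
        # Linear activation: phi(z) = z, a real number rather than a label.
        outputs = [bias + sum(w * v for w, v in zip(weights, x)) for x in X]
        errors = [target - out for target, out in zip(y, outputs)]
        costs.append(0.5 * sum(e * e for e in errors))  # SSE before the update
        # One update per epoch, computed from all samples ("batch").
        gradient_steps = [
            eta * sum(e * x[j] for e, x in zip(errors, X))
            for j in range(len(weights))
        ]
        weights = [w + step for w, step in zip(weights, gradient_steps)]
        bias += eta * sum(errors)
    return weights, bias, costs


def quantize(weights, bias, x):
    """Quantizer: turn the continuous output into a class label (-1 or 1)."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if z >= 0.0 else -1


# Toy data (illustrative only).
X = [[-1.0, -1.0], [-0.5, -1.5], [1.0, 1.0], [1.5, 0.5]]
y = [-1, -1, 1, 1]
weights, bias, costs = train_adaline(X, y)
```

Plotting `costs` against the epoch index gives the cost curve that the following slides use to compare learning rates.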

Hyperparameter Selection Risks
• Finding a good learning rate η for optimal convergence often requires experimentation.
• Select different learning rates (η = 0.1 and η = 0.0001) and, after training, plot the cost function versus the number of epochs to see which model performs best.
Problem 1: A learning rate that is too large causes gradient descent to overshoot the global minimum.
Problem 2: A learning rate that is too small requires a very large number of epochs to converge.

Hyperparameter Selection Risks
• Finding a good learning rate η for optimal convergence often requires experimentation.
• Select different learning rates (η = 0.01 and η = 0.0001) and, after training, plot the cost function versus the number of epochs to see which implementation performs best.
Problem 1: A learning rate that is too large causes gradient descent to overshoot the global minimum.
Problem 2: A learning rate that is too small requires a very large number of epochs to converge.
[Figure: Adaline - Learning rate 0.01 (standardized)]
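Both problems can be reproduced on a toy one-weight model. The sketch below is illustrative only: the data, the function name `cost_history`, and the learning rates are assumptions chosen so that each regime is visible on this particular problem, and η = 0.2 in particular is not a value from the slides.

```python
def cost_history(eta, epochs=30):
    """Batch gradient descent on J(w) = 1/2 * sum_i (y_i - w * x_i)^2.

    Returns the cost recorded at the start of each epoch.
    """
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]  # generated by w = 2, so the minimum cost is 0
    w = 0.0
    history = []
    for _ in range(epochs):
        errors = [y - w * x for x, y in zip(xs, ys)]
        history.append(0.5 * sum(e * e for e in errors))
        w += eta * sum(e * x for e, x in zip(errors, xs))  # w := w - eta * dJ/dw
    return history


too_large = cost_history(eta=0.2)     # overshoots: the cost grows every epoch
moderate = cost_history(eta=0.01)     # converges within a few dozen epochs
too_small = cost_history(eta=0.0001)  # decreases, but very slowly

for name, history in [("too large", too_large),
                      ("moderate", moderate),
                      ("too small", too_small)]:
    print(f"{name:>9}: first cost {history[0]:.3g}, last cost {history[-1]:.3g}")
```

Which rates count as too large or too small depends on the data and its scaling, which is why the slide recommends plotting cost against epochs rather than fixing η in advance.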

References for AI 101
• Raschka, S. (2016). Python Machine Learning: Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics. Packt Publishing Ltd. 2nd Edition. ISBN 978-1-78355-513-0
• Shalev-Shwartz, S. & Ben-David, S. (2017). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press. 7th Edition. ISBN 978-1-107-05713-5
• Poole, D. L. & Mackworth, A. K. (2017). Artificial Intelligence: Foundations of Computational Agents. Cambridge University Press. 2nd Edition. ISBN 978-1-107-19539-4

“Hybrid Models” … an illustrative example of a physics-guided data learning model
Theory versus Data in Models
[Figure: relationship between Theory and Data Science Models, with the labels “can get parameter values from” and “constraints output of”. Source: Karpatne, 2017]

Characteristics of Theory-guided Models
[Figure: space of PGNNs. A more curved line means more variance; closer to the truth means less bias. Source: Karpatne, 2017]

Where to Insert the Physics Theory in these Models?
[Figure. Source: Karpatne, 2017]

Example: Lake Temperature Modeling
[Figure. Source: Karpatne, 2017]

References for Theory-Guided Models
• Karpatne, A., Watkins, W., Read, J., & Kumar, V. How Can Physics Inform Deep Learning Methods in Scientific Problems?: Recent Progress and Future Prospects.
• Karpatne, A., et al. (2017). Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling. arXiv preprint arXiv:1710.11431.
• Long, Y., She, X., & Mukhopadhyay, S. (2018). HybridNet: Integrating Model-based and Data-driven Learning to Predict Evolution of Dynamical Systems. arXiv preprint arXiv:1806.07439.

Project 1 Goals
Can we develop a new class of hybrid models in which physics-based models are combined with machine learning (or other data-driven models) to enable new theory-guided data learning models?
[Figure labels: Statistics, Pattern Recognition, Neurocomputing, Deep Learning, Data Mining, Databases, Machine Learning, Artificial Intelligence, Knowledge Discovery]
TASKS
• Establish a toolbox of available data-driven modeling approaches … “review paper”
• Develop a framework for combining physics-based and data-driven models
• Demonstrate improved physical consistency and reduced error for hybrid approaches
PROJECT 1: Theory-Guided Data Learning Models

“Looking Ahead” … towards a framework for thinking about Theory-Guided Data Learning
Theory versus Data in Predictive Models
Theory-guided data learning builds on the foundation of data science while taking advantage of domain theory and knowledge.
• THEORY-BASED MODELS ARE LIMITED BY CURRENT SCIENTIFIC UNDERSTANDING: knowledge gaps, theory not completely understood, incomplete model specification, simplifying assumptions.
• DATA SCIENCE MODELS SHOW LIMITED PERFORMANCE WHEN DATA IS UNDERREPRESENTED: black box, requires large data samples, non-stationary patterns, spurious relationships.
What is our ultimate goal? To “learn” a model that shows the best generalization performance over any unseen instances, that does not violate any physical constraints informed by theory, and that can be “explained”: generalizable and scientifically interpretable.
PERFORMANCE: ACCURACY + SIMPLICITY + CONSISTENCY + INTERPRETABILITY
Source: Karpatne et al., 2017

Applications (Found to Date) for Theory-Guided Data Learning Models
• CLIMATE PATTERNS (A graph-based approach to find teleconnections in climate data, Kawale et al., 2013)
• MODELING TURBULENCE (Machine-learning augmented predictive modeling of turbulent separated flows over airfoils, Singh et al., 2016; Physics-informed machine learning for predictive turbulence modeling, Wang et al., 2016)
• MATERIALS DISCOVERY (Predicting crystal structure by merging data mining with quantum mechanics, Fischer et al., 2006)
• QUANTUM CHEMISTRY (Understanding machine-learned density functionals, Li et al., 2015)
• BIOMEDICAL IMAGING (Robust transmural electrophysiological imaging, Xu et al., 2015)
• DISCOVERY OF GENETIC BIO-MARKERS (Accounting for linkage disequilibrium in genome-wide association studies, Liu et al., 2013)
• SURFACE WATER DYNAMICS AT GLOBAL SCALE (Approach for global monitoring of surface water extent variations using MODIS data, Khandelwal et al., 2017)
• LAKE HYDROLOGY (Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling, Karpatne et al., 2018)
• PROTEIN STRUCTURE (End-to-end differentiable learning of protein structure, AlQuraishi, 2018)

Applications (Found to Date) for Theory-Guided Data Learning Models
REVIEW PAPERS
• A Big Data Guide to Understanding Theory-Guided Data Science, Faghmous and Kumar, Big Data, Vol. 3, 2014
• Theory-Guided Machine Learning in Materials Science, Frontiers in Materials, Vol. 3, p. 28, 2016
• Theory-Guided Data Science for Climate Change, Computer, Vol. 47, No. 11, 74-78, 2014
• Towards Enhanced Understanding and Projections of Climate Extremes Using Physics-Guided Data Mining Techniques, Ganguly et al., Open Source, Vol. 21, No. 4, 777-795, 2014
• Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, Open Source, Karpatne et al., 2017
CONFERENCES
• Physics-Informed Machine Learning Conference, Santa Fe, New Mexico, 2016

GOAL
Develop theory-guided data learning models that will enable better predictions of (fill in the blank…) than either physical models or data models, and a method for determining the parameters that generate the improved predictions.
TONY’S “PARADIGM”
Get an initial guess using theory … collect some measurement data … integrate into a TGDL model … get better performance.
• Phase 1: Baseline machine learning model for predicting Y (surface finish or …)
• Phase 2: Theory-guided machine learning model for improved predictions of Y (surface finish or …)
• Phase 3: Optimize the parameters that achieve improved performance (Genetic Algorithms, Electromagnetic Optimization, etc.)
[Figure labels: Locate optimum; Determine parameters]

A Framework For Thinking About Theory-Guided Data Learning Models
TYPE 1: Sequential models … one model feeds into the other
• Support feature selection. Li et al., “Understanding machine-learned density functionals,” International Journal of Quantum Chemistry, 2015
• Generate missing data or intermediate outputs. Xiao et al., “A Physics-Informed Machine Learning Framework for Predictive Turbulence Modeling”, 2016
[Figure: two chains, Input → Physical Model → Data Model → Output and Input → Data Model → Physical Model → Output]
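One of the sequential arrangements (physical model first, data model second) can be sketched as follows. Everything here is a toy assumption for illustration: the `physical_model` function, the synthetic "observations", and the choice of a one-variable least-squares line as the data model. It is not the method of either cited paper.

```python
def physical_model(x):
    """Hypothetical simplified theory: captures the trend but is biased."""
    return 2.0 * x


def fit_line(u, v):
    """Ordinary least squares fit of v = a * u + b for a single input."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    covariance = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    variance = sum((ui - mean_u) ** 2 for ui in u)
    a = covariance / variance
    return a, mean_v - a * mean_u


def mse(predicted, observed):
    """Mean squared error between two equally long sequences."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Synthetic observations (illustrative only): the "true" system is 2.5 * x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [2.5 * x + 1.0 for x in inputs]

# Stage 1: the physical model turns the raw inputs into theory-based predictions.
physics_output = [physical_model(x) for x in inputs]

# Stage 2: the data model takes the physics output as its input feature
# and learns a correction from the observations.
a, b = fit_line(physics_output, observed)
hybrid_output = [a * p + b for p in physics_output]

mse_physics = mse(physics_output, observed)
mse_hybrid = mse(hybrid_output, observed)
```

In this toy setting the data model removes the systematic bias of the theory; with real measurements the correction would only be as good as the data and the chosen model class allow.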

TYPE 2: Theory-guided data learning models … physical models inform data models
• Retraining of ANN models using theory-based simulations
• Use theory-based equations as constraints
[Figure labels: Physical Model, Input, Data Model, Output]
J. Liu, K. Wang, S. Ma, and J. Huang, “Accounting for linkage disequilibrium in genome-wide association studies: a penalized regression method,” Statistics and Its Interface, vol. 6, no. 1, p. 99, 2013.

TYPE 3: Conformance between physical model and data model outputs
• Typically used for calibrating theory-based models using observational data
• Helpful in including domain constraints that might not be known explicitly
• Minimize errors with a loss function to assure compliance between theory and data science
[Figure labels: Input, Physical Model, Output, Input, Data Model]
Curtarolo et al., “The high-throughput highway to computational materials design,” Nature Materials, 2013
Karpatne, A., et al. “Physics-guided Neural Networks (PGNN): An Application in Lake Temperature Modeling.” arXiv preprint arXiv:1710.11431, 2017
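The idea of a loss function that enforces compliance with theory can be sketched as a data-fit term plus a penalty for physically inconsistent predictions. The sketch assumes a hypothetical constraint (predictions must not decrease along an ordered axis such as depth) and a hypothetical weight `lam`; it is not the loss used in the cited papers.

```python
def physics_guided_loss(predicted, observed, lam=1.0):
    """Mean squared error plus a penalty for violating an assumed constraint.

    Assumed constraint (illustrative): values ordered along one axis, such
    as depth, must be non-decreasing. Only violations are penalized.
    """
    n = len(predicted)
    data_loss = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    # A violation is positive only where a value drops below its predecessor.
    violations = [max(0.0, predicted[i] - predicted[i + 1]) for i in range(n - 1)]
    physics_loss = sum(violations) / len(violations)
    return data_loss + lam * physics_loss


observations = [1.0, 2.0, 3.0]
consistent = physics_guided_loss([1.0, 2.5, 3.0], observations)
inconsistent = physics_guided_loss([1.0, 2.5, 2.0], observations)
```

Because the penalty depends only on the predictions, it can also be evaluated on inputs that have no observations, which is one reason this style of constraint is attractive when labeled data is scarce.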

Issues in Specifying Theory-Guided Data Learning Models
• CHOOSING A MODEL ARCHITECTURE
✓ Scientific theory can inform the choice of loss function
✓ Scientific theory can be important in choosing the model architecture
✓ Example: learning view-invariant features of human faces (View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation, Leibo, 2016)

Issues in Specifying Theory-Guided Data Learning Models
• ASSURING CONSISTENCY
✓ Theory-guided initialization of parameters at early stages to avoid getting “stuck” (can use pre-training strategies, which need a lot of data)
✓ Theory-guided priors in the model space … example: predicting heart signals … use probability to determine spatial distribution
✓ Theory-guided constrained optimization to restrict the space of the model parameters … learning physically consistent classification boundaries of water
[Figure: Lake Cross Section. Domain information is used to learn classification boundaries that are physically viable: locations at lower elevation are water if locations at higher elevation are water.]

Issues in Specifying Theory-Guided Data Learning Models
• REFINING MODEL OUTPUTS
✓ To make sure they are in compliance with physical phenomena
✓ Leverage physical knowledge at the final stage of model building
✓ Post-processing … example from mapping dynamics of surface water bodies: “Post classification label refinement using implicit ordering constraint among data instances”, Khandelwal et al., 2013
✓ Pruning … example from materials discovery: “The high-throughput highway to computational materials design”, Curtarolo et al., 2013
Goal for materials discovery: find novel materials that have desirable properties. The problem is that it is too expensive to estimate the structure and properties of every material.
• DATA MODELS FOR GENERATING TARGET SET OF COMPOUNDS: constraints to generate a smaller set than DB approaches
• DFT FOR POSTPROCESSING FINAL SET: DFT is a computationally expensive tool for checking material properties, used to obtain an even smaller set

Issues in Specifying Theory-Guided Data Learning Models
• CREATING “HYBRID” MODELS OF THEORY AND DATA SCIENCE
✓ One approach: outputs of theory-based components are used as inputs to data science components and vice versa. Example in climate science: use theory models for coarse spatial resolutions, and data science to create finer spatial resolutions.
✓ Another approach: introduce data science outputs into theory-based models, called “field inversion and machine learning”.
✓ Yet another approach: outputs of theory-based simulations are used as inputs to the data science model.
✓ Or: use data science methods to predict intermediate quantities of theory-based models that are missing or inaccurate (turbulence modeling example).
✓ Key issues in designing the data learning model:
✓ Number of hidden layers
✓ Design of linkages along layers
✓ Decomposing the data learning model (ANN) into modular components representing sub-processes of the physical system
✓ … and so on

Issues in Specifying Theory-Guided Data Learning Models
MODELING TURBULENCE OF AIRFOILS
• The goal is to predict flow characteristics such as lift and drag, which are difficult to compute for complex turbulent flows.
• The computation is very intensive, so inexact approximations (RANS models) are used.
• Hybrid approach: use an ANN to predict intermediate quantities of RANS models.
[Figure: TGDL Model. Source: Duraisamy, 2016]

Summary Thoughts
• Think BROADLY about how to bring scientific knowledge into the data learning models for our specific engineering domain … many possible approaches and methods.
• There seems to have been a pulse of interest in this area a couple of years back. Applications are scattered across the literature of various fields.
• Most work is led by two distinct communities of researchers:
  • A growing community for data-driven turbulence modeling, funded by NASA and NSF, at the University of Michigan
  • An NSF-funded group at the University of Minnesota under the NSF Expeditions in Computing project “Understanding Climate Change Using Data Driven Approaches”
• So, there is an opportunity for us … can we demonstrate value?
• Identified a faculty member at NCSU who was affiliated with the Minnesota group: Nagiza Samatova in Computer Science (HPC and data analytics).