Gaussian Processes for Machine Learning
Neil Lawrence, University of Sheffield
29th January 2016, OxWaSP Symposium
[Figure: the physical world approached through empirical models (data driven, fast to compute; statistics and machine learning) and mechanistic models (knowledge driven, slow to compute; physics). Remaining figure labels are not recoverable.]
The Data are Not Enough • Four pillars: • Deterministic/Stochastic • Mechanistic/Empirical • Goal: model complex phenomena over time • Problem: • Mechanistic models are often inaccurate • Data are often not rich enough for an empirical approach • Question 1: How do we combine an inaccurate physical model with machine learning?
Central Dogma • DNA → (transcription) → mRNA → (translation) → protein
Decision: Transcription Factors • mRNA: measured using microarrays since 1998 • Translation • TF protein: difficult to measure • Transcription • Other mRNAs: measured using microarrays since 1998
Mechanistic Model • Translation • Transcription
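The equations on this slide did not survive extraction. As a minimal sketch only, assume the linear first-order kinetics often used for transcription: mRNA concentration m(t) is produced at a basal rate b plus a sensitivity s times the transcription factor activity f(t), and decays at rate d, so dm/dt = b + s f(t) - d m(t). The symbols, parameter values, the pulse-shaped input and the function names below are all hypothetical and are not taken from the talk.

```python
import math

def simulate_mrna(f, b=0.1, s=1.0, d=0.5, m0=0.0, t_end=10.0, dt=0.001):
    """Forward-Euler integration of dm/dt = b + s*f(t) - d*m(t).

    f is the (assumed known) transcription factor activity as a function of
    time; b, s, d are hypothetical basal rate, sensitivity and decay rate.
    """
    n_steps = int(round(t_end / dt))
    t, m = 0.0, m0
    times, values = [t], [m]
    for _ in range(n_steps):
        m += dt * (b + s * f(t) - d * m)
        t += dt
        times.append(t)
        values.append(m)
    return times, values

def tf_pulse(t):
    """Illustrative transcription factor activity: a smooth pulse at t = 3."""
    return math.exp(-(t - 3.0) ** 2)

# mRNA response to the pulse.
times, mrna = simulate_mrna(tf_pulse)

# Sanity check with a constant input f(t) = 1: the solution should settle
# at the fixed point (b + s) / d.
_, steady = simulate_mrna(lambda t: 1.0, t_end=40.0)
```

In practice the transcription factor activity is the quantity that is difficult to measure, which is why a forward simulation like this one is only half of the problem.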
Zero Mean Gaussian Sample
[Figure: samples from Gaussian, plotted against index (0 to 25), with an accompanying image panel over indices 5 to 25 and a colour scale from 0.1 to 1]
Zero Mean Gaussian Process Sample
[Figure: samples from Gaussian process, plotted against t (0 to 25), with an accompanying image panel over indices 5 to 25 and a colour scale from 0.1 to 1]
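A sample like the ones plotted on these two slides can be drawn by building a covariance matrix over the inputs and multiplying its Cholesky factor by standard normal noise. The sketch below assumes an exponentiated quadratic covariance with hypothetical variance and lengthscale values; the slides do not state which covariance function was used. It relies only on the Python standard library so that it runs without extra packages; a numerical library would be the usual choice for larger problems.

```python
import math
import random

def eq_cov(t1, t2, variance=1.0, lengthscale=2.0):
    """Exponentiated quadratic covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)

def cholesky(K):
    """Lower-triangular L with L L^T = K, for symmetric positive definite K."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = K[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    return L

def sample_zero_mean_gaussian(K, rng):
    """Draw f = L z with z ~ N(0, I), which gives f ~ N(0, K)."""
    L = cholesky(K)
    z = [rng.gauss(0.0, 1.0) for _ in K]
    return [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(K))]

rng = random.Random(0)            # fixed seed for reproducibility
t = [float(i) for i in range(1, 26)]  # 25 inputs, as on the slide axes
jitter = 1e-6                     # small diagonal term for numerical stability
K = [[eq_cov(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(t)]
     for i, a in enumerate(t)]
f = sample_zero_mean_gaussian(K, rng)
```

Each call to `sample_zero_mean_gaussian` returns one function sampled at the 25 inputs. A shorter lengthscale gives rougher samples, a longer one gives smoother samples.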
Gaussian Processes
Results TPAMI, 2 PNAS papers, 2 Comp Bio
MATLAB Demo • demo_2016_01_29_OxWaSP.m
Further Challenge • This model inter-relates different functions with mechanistic understanding. • What if you need to inter-relate across different modalities of data at different scales? • E.g. biopsy images + genetic test + mammogram for breast cancer diagnostics.
The Data are Not Enough • Four pillars: • Deterministic/Stochastic • Mechanistic/Empirical • Goal: model complex phenomena over time • Problem: • Mechanistic models are often inaccurate • Data are often not rich enough for an empirical approach • Question 2: How do we formulate the right representations to integrate different data modalities?
Classical Latent Variables
Classical Treatment
Render Gaussian Non-Gaussian
Stochastic Process Composition
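The bodies of the last two slides were lost in extraction. One way to read their titles is that a Gaussian variable stays Gaussian under an affine map but becomes non-Gaussian once it is passed through a nonlinear function, and that composing stochastic processes applies the same idea to whole functions. The sketch below illustrates only the first point for scalar variables. The choice of `exp` as the nonlinearity and of sample skewness as the diagnostic are assumptions made for illustration, not the construction used in the talk.

```python
import math
import random

def skewness(xs):
    """Sample skewness: close to zero for Gaussian data."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return sum((x - mean) ** 3 for x in xs) / (n * var ** 1.5)

rng = random.Random(1)  # fixed seed for reproducibility
z = [rng.gauss(0.0, 1.0) for _ in range(20000)]

# An affine map of a Gaussian variable is still Gaussian.
linear = [2.0 * v + 1.0 for v in z]

# A nonlinear map (here exp, chosen only for illustration) is not:
# the result is log-normal and strongly right-skewed.
nonlinear = [math.exp(v) for v in z]

skew_z = skewness(z)
skew_linear = skewness(linear)
skew_nonlinear = skewness(nonlinear)
```

Comparing `skew_linear` with `skew_nonlinear` shows the effect: the first stays near zero while the second is clearly positive.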
Use Abstraction for Complex Systems • High-level ideas • Stratification of concepts • Low-level mechanisms
Biology and Health • ??? • Molecular biology
Neuroscience • Behaviour • ??? • Neuron firing
Example: Motion Capture Modelling
MATLAB Demo • demo_2016_01_29_OxWaSP.m
Modelling Digits
MATLAB Demo • demo_2016_01_29_OxWaSP.m
Health • Complex system • Scarce data • Different modalities • Poor understanding of mechanism • Large scale • PLoS Comp Bio, Nature Methods
[Figure labels: state of health, organ states, cell states; genotype, epigenotype, environment, clinical tests, gene expression, clinical notes, treatment, survival analysis, biopsy, X-ray]
To Find Out More • Gaussian Process Summer School • 12th-15th September 2016 in Sheffield • This year in parallel with/themed as a UQ-orientated school (co-organised with Rich Wilkinson) • Occurring alongside the ENBIS Meeting • http://gpss.cc/
Future • Methodology: • Deep GPs (also current) • Latent Force Models (current but dormant) • Latent Action Models and Stochastic Optimal Control (new) • Probabilistic Geometries (starting) • Exemplar Applications: • Health and Biology (existing) • Developing world (existing) • Robotics at different scales (starting) • Perception: vision (dormant), haptic (new)
Summary • Complex systems: • ‘Big data’ is too ‘small’. • The data are not enough. • Need data-efficient methods • http://www.theguardian.com/media-network/2016/jan/28/google-ai-go-grandmaster-real-winner-deepmind • Solutions: • Hybrid mechanistic-empirical models • Structured models for automated data assimilation
The Digital Oligarchy • Response to concentration of power with data • CitizenMe • London-based start-up • User-centric data modelling • New challenges in ML • Integration of ML, systems, cryptography
Open Data Science and Africa Challenge • “Whole pipeline challenge” • Make software available • Teach summer schools • Support local meetings • Publicity in the Guardian • Opportunities to deploy pipeline solution
Disease Incidence for Malaria
Uganda • Spatial models of disease
Deployed with UN Global Pulse Lab • http://pulselabkampala.ug/hmis/