Introduction to concepts in deep learning and cross-validation

  • Slides: 45
Rick Klein, Université Grenoble Alpes

Preamble
○ We are not very efficient
○ We are remarkably inefficient
  □ Developing theory when 50/50 finding replicates
  □ Convinced false positives contribute
○ And yet despite this, we’ve learned a ton about human behavior and the brain
○ Imagine if we fix it?

Preamble
○ 2012 -> Kate Ratliff (UF) new lab
  □ First/only PhD student
  □ Lab protocol
○ 2014 -> Many Labs 1 published
  □ Totally revise lab protocol
○ 2018 -> Hans IJzerman (UGA) new lab
  □ First/only postdoc
  □ Establish lab workflow (corelab.io)
Final.docx?

Preamble
○ Leave workflow talk to Frederik
○ Talk about exciting possibilities
  □ At least one solid one (guilt-free p-hacking!)
○ Suspend disbelief
  □ Don’t necessarily try to integrate this with your current research paradigm
  □ Exposure + different way of thinking
○ Tools have revolutionized other fields

Preamble
○ …Will show immediately useful tools also
  □ E.g., how to do exploratory/secondary data analysis with confidence
  □ (half this talk is “things I accidentally learned from data scientists that are actually super helpful”)
○ Perhaps different/complementary approaches

Workshop Goals
○ Machine learning/deep learning intro
○ Lessons from computer science
  □ Cross-validation + prediction
  □ R/RStudio + reproducible code
  □ GitHub + collaboration/exposure
But I’m not a computer scientist!
○ Make machine learning accessible
  □ Incentive to learn R
  □ Fun hands-on examples

Caveats
○ Don’t forget:
  □ Replication
  □ Pre-registration
  □ Increasing statistical power (+collaborations)
  □ Open data/materials/code

What is Machine Learning?
○ Machine learning
  □ Broad field
  □ Humans define inputs/structure/outputs
  □ Computer “learns” parameters
○ Three broad categories:
  □ Supervised learning
  □ Unsupervised learning
  □ Reinforcement learning

What is Machine Learning?
○ Supervised learning
  □ Train and test on known (labelled) examples
[Figure: cat and dog images with labels, shown in three steps: Train, Predict + test, Use]
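
As a rough illustration of "train and test on known (labelled) examples", here is a minimal sketch in plain Python. It uses a nearest-centroid classifier, which is just one simple choice and not something the slides prescribe. The features (body weight, ear length) and all numbers are made up for the example.

```python
# Toy supervised learning: train on labelled examples, then predict and test
# on labelled examples the model has not seen. All data here is hypothetical.

def train(examples):
    """Compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


# Hypothetical features: (body weight in kg, ear length in cm)
train_set = [((4.0, 6.5), "cat"), ((30.0, 11.0), "dog"),
             ((5.0, 7.0), "cat"), ((22.0, 10.0), "dog")]
test_set = [((3.5, 6.0), "cat"), ((28.0, 12.0), "dog"), ((6.0, 7.5), "cat")]

centroids = train(train_set)                                   # Train
predictions = [predict(centroids, f) for f, _ in test_set]     # Predict
accuracy = sum(p == label for p, (_, label) in zip(predictions, test_set)) / len(test_set)  # Test
print(predictions, accuracy)
```

Once the test accuracy is acceptable, the same `predict` call would be applied to new, unlabelled examples (the "Use" step).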

What is Machine Learning?
[Figure: numeric values laid out in a column, labelled Column 1: 255, 231, ..., 255, 134, ..., 142]

What is Machine Learning?
○ Unsupervised learning
  □ Unlabelled data -> like factor analysis
From An Introduction to Statistical Learning: http://www-bcf.usc.edu/~gareth/ISL/
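
To make "unlabelled data" concrete, here is a small sketch using k-means clustering. This is a stand-in chosen for brevity, not factor analysis itself and not necessarily the method the slide illustrates. The scores and starting centers are invented for the example.

```python
# Toy unsupervised learning: no labels are given; the algorithm groups the
# values on its own. k-means is used here purely as a simple illustration.

def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data.

    Repeatedly assign each value to its nearest center, then move each
    center to the mean of the values assigned to it.
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


scores = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]          # hypothetical, unlabelled
centers, clusters = kmeans_1d(scores, centers=[0.0, 6.0])
print(centers, clusters)
```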

What is Machine Learning?
○ Reinforcement learning
  □ Learning through repetition/experience
Mario: https://www.youtube.com/watch?v=qv6UVOQ0F44&t=249s
Dota 2: https://blog.openai.com/openai-five/
Go: https://deepmind.com/research/alphago/
Chess: https://github.com/Zeta36/chess-alpha-zero

What is Machine Learning?
○ Deep learning = Artificial Neural Network
  □ Specific sub-type of machine learning (kind of)
  □ Can also have “shallow” neural network
  □ Today, focus on supervised learning
○ Is this “AI”?
  □ Not “general” intelligence

Deep Learning is Amazing
○ Self-driving cars
  □ https://selfdrivingcars.mit.edu/
○ Facial recognition
○ Siri/Cortana
○ Google Translate
  □ As of Sept 2016
○ Which ads to show users

Deep Learning is Amazing
○ Shen et al., 2017 https://doi.org/10.1101/240317
○ Reconstruct viewed images from neural activity

Neural Network Architecture
[Figure: network diagram with inputs Var 1, Var 2, Var 3 and an output labelled Prediction; source: github.com/cs231n/]
Called hidden layers for a reason

Neural Network Architecture
[Figure: inputs Var 1, Var 2, Var 3 -> Prediction; source: github.com/cs231n/]
Common example, each neuron:
1) Multiplies input by a weight and adds a constant (z = w * x + b)
2) Applies non-linear function (e.g., ReLU function)
3) Passes on output to next layer
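
The three steps each neuron performs can be sketched in a few lines of plain Python. The weights, biases, and layer sizes below are arbitrary values picked for illustration, and applying ReLU at the output neuron is an assumption of this sketch, not a claim about the network in the figure.

```python
# One forward pass through a tiny network: 3 inputs -> 2 hidden neurons -> 1 output.
# All weights and biases are made-up numbers.

def relu(z):
    """ReLU non-linearity: negative values become 0, positive values pass through."""
    return max(0.0, z)


def neuron(inputs, weights, bias):
    # 1) Multiply each input by its weight and add a constant: z = w * x + b
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # 2) Apply the non-linear function
    return relu(z)


def layer(inputs, weight_rows, biases):
    # 3) The list of neuron outputs is what gets passed on to the next layer
    return [neuron(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]


inputs = [1.0, 2.0, 3.0]  # Var 1, Var 2, Var 3
hidden = layer(inputs, [[0.5, 0.5, 0.5], [1.0, -1.0, -1.0]], [-1.0, 0.5])
prediction = neuron(hidden, [0.5, 2.0], 0.25)
print(hidden, prediction)
```

In this example the second hidden neuron computes a negative z, so ReLU outputs 0 for it and only the first hidden neuron contributes to the prediction.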

Neural Network Architecture
□ ReLU function: ReLU(z) = max(0, z)
□ TanH function: tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z))

How does it learn?
1. Define cost function
   ○ E.g., mean squared error
2. Randomize initial weights/parameters
3. Run model and evaluate (forward propagation)
4. Iterate and try again
   ○ Could choose random new values
   ○ In reality: derivatives and gradient descent
github.com/cs231n/

Gradient Descent
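
The learning loop (define a cost, start from random parameters, evaluate, then adjust using derivatives) can be sketched for the simplest possible model, a single line y = w * x + b. The data, learning rate, and number of iterations are assumptions chosen so the example converges; real networks follow the same idea with many more parameters.

```python
import random

# Gradient descent on a one-neuron linear model with a mean squared error cost.
# Data, learning rate, and iteration count are illustrative choices.

def mse(w, b, xs, ys):
    """Cost function: mean squared error of the predictions w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, xs, ys, learning_rate):
    """Move w and b a small step against the gradient of the MSE."""
    n = len(xs)
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n   # d(cost)/dw
    grad_b = 2 * sum(errors) / n                              # d(cost)/db
    return w - learning_rate * grad_w, b - learning_rate * grad_b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]          # the "true" relationship to recover

random.seed(0)                         # fixed seed so the run is repeatable
w, b = random.uniform(-1, 1), random.uniform(-1, 1)   # randomized initial parameters
initial_cost = mse(w, b, xs, ys)

for _ in range(2000):                  # iterate and try again
    w, b = gradient_step(w, b, xs, ys, learning_rate=0.05)

final_cost = mse(w, b, xs, ys)
print(round(w, 4), round(b, 4), initial_cost, final_cost)
```

If the learning rate is set too high the updates overshoot and the cost grows instead of shrinking, which is why it is treated as a tuning choice.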

Neural Network Architecture
□ Many different kinds of hidden layers
  ○ Pooling layers (managing shape/dimensions of data)
  ○ Convolutional layers (used in computer vision, edge detection, etc.)
□ Number of nodes? Layers?
  ○ Mix-and-match

Cross-Validation
□ WAIT! Isn’t this basically p-hacking?
  ○ Definitely overfitting
  ○ Bias vs variance trade-off
Andrew Ng, Deeplearning.ai

Cross-Validation
□ Solution: Make predictions and test them
□ Immediately applicable to social psychology
  ○ Yarkoni and Westfall, 2017
□ Our statistical models are really not intended for predicting future outcomes
  ○ Machine learning or not

Cross-Validation
□ Train/Test datasets (“hold-out” cross-validation)
□ Pros:
  ○ P-hack all you want
  ○ Great for being confident in secondary data analyses
□ Cons:
  ○ Careful not to “leak” between train/test sets
    □ Ideally use novel test data each time
  ○ Reduces power -> how to split?
    □ Depends -> as much in training as possible
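
A hold-out split can be as simple as shuffling the rows once and setting a fraction aside. The sketch below assumes an 80/20 split and a fixed seed; both are illustrative choices, not recommendations from the slides, and `holdout_split` is a hypothetical helper name.

```python
import random

# Hold-out validation: explore freely on the training rows, then evaluate
# once on test rows that played no part in the exploration.

def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]                       # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded so the split is reproducible
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))                      # stand-in for 100 participants
train_rows, test_rows = holdout_split(rows, test_fraction=0.2)
print(len(train_rows), len(test_rows))
```

To avoid leakage, any step that looks at the data (variable selection, outlier rules, scaling) should be decided on `train_rows` only and then applied unchanged to `test_rows`.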

Cross-Validation
□ K-fold validation:
  ○ Like repeated hold-out validation
  ○ Uses whole dataset
□ Pros:
  ○ No loss in power
  ○ Check robustness to outliers
□ Cons:
  ○ Computationally expensive
  ○ Theoretically not as convincing as hold-out -> but allegedly performs better
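
K-fold validation repeats the hold-out idea k times so that every observation is used for testing exactly once. The sketch below is a minimal version under stated assumptions: the "model" simply predicts the training mean, the outcome values are invented, and `k_fold_indices` and `cross_validate` are hypothetical helper names.

```python
import random

# K-fold cross-validation: split the data into k folds, and for each fold
# train on the other k - 1 folds and test on the one left out.

def k_fold_indices(n, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k folds over n rows."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(ys, k=5):
    """Return the held-out mean squared error for each fold.

    The 'model' just predicts the mean of the training outcomes; swap in
    any real model that is fitted on train and scored on test.
    """
    scores = []
    for train, test in k_fold_indices(len(ys), k):
        prediction = sum(ys[i] for i in train) / len(train)
        scores.append(sum((ys[i] - prediction) ** 2 for i in test) / len(test))
    return scores


ys = [float(i % 7) for i in range(20)]            # hypothetical outcome values
scores = cross_validate(ys, k=5)
print(scores, sum(scores) / len(scores))
```

The spread of the per-fold scores gives a rough sense of how sensitive the result is to which observations were held out.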

Deep Learning/Neural Nets
○ Extremely good at prediction
  □ Complex patterns in data
  □ Unstructured data or structured data
○ NOT very good at explanation
  □ Prediction or explanation?
  □ Use NN to see if something is there, other approaches to confirm mechanism
  □ May be tools coming

Code session
□ Hands-on tutorials here: osf.io/ya2n8/
  ○ rkts-tutorials.html – download and open in web browser
  ○ Scripts should run start-to-finish without intervention (except possibly to install packages)
□ Alternative options: Codeocean.com, GitHub, Build website in R, Kaggle.com, other workshop stuff – happy to help with any.
□ R is the one skill I wish I learned better sooner