Supervised Learning Regression Ayal Gussow 1 Supervised Learning
- Slides: 40
Supervised Learning: Regression Ayal Gussow 1
Supervised Learning Module 1) Supervised learning 1: Regression problems 2) Supervised learning 2: Overfitting, regularization, hyperparameter optimization, and cross-validation 3) Supervised learning 3: Classification problems
Machine-Learning Basic Process Training data: Features (Outcome) Model Trained Model Data: Features Predict
Supervised Learning Module Training samples are provided with the desired output. Example: Predicting the price of a house based on its attributes.
Supervised Learning Module Square feet Number of bedrooms Size of backyard etc $800, 000
Supervised Learning: Regression: The predicted variable takes continuous values (IE house prices). Classification: The predicted variable takes labels (IE color).
Ames Dataset of houses in Ames, Iowa with 79 features per house, including: - Square footage - Number of bathrooms - Number of fireplaces - etc Price
Ames Dataset of houses in Ames, Iowa. Features: - Square footage of first floor - Square footage of second floor Predict: Price
Sample Problem Given the first and second floor square footage of a house in Ames, Iowa, can you predict the house’s price?
Machine-Learning Basic Process Training data Features: Sqr feet 1, Sqr feet 2 Outcome: Housing price Model Linear regression Trained Model Sqr feet 1, Sqr feet 2 Predict House Price 10
Evaluation Sum of squared residuals. Sum of the difference between the model’s predictions and the true values. 11
Today’s Outline 1) Linear regression 2) Decision Trees 3) Ensemble learning (Random Forest) 4) Homework 12
Ames Dataset Training data is 1500 houses with the following features: - Square footage for the first floor We want to predict: House Price
Plot of first floor square footage 14
Predict by first floor square footage We want to predict: Sale Price y – sale price x – square footage of 1 st floor. 15
Predict by first floor square footage y – sale price x – square footage of 1 st floor. 16
Plot of first floor square footage 17
Linear Regression 18
Linear Regression 19
Linear Regression Extra reading: Least Squares. 20
Plot of second floor square footage 21
Plot of second floor square footage (no 0 s) 22
Notebook: Ames Dataset Training data is 1500 houses with the following features: - Square footage for the first floor - Square footage for the second floor We want to predict: House Price
Decision Trees Conceptually simple method that assigns values based on a series of splits in the data. Can model non-linear relationships and interactions between features.
Plot of first floor square footage 25
Plot of first floor square footage 26
Plot of first floor square footage 27
Plot of first floor square footage 28
Plot of first floor square footage 1 2 3 4 29
1 Sample Decision Tree X > 1500? X < 1000? Mean Region 1 X > 2000? Mean Region 2 Mean Region 3 Mean Region 4 2 3 4
Sample Decision Tree X 1 > 1500? X 2 < 1000?
Decision Trees • Can model non-linear relationships and interactions between features. • No need for scaling • Internal feature selection Prone to overfitting.
Random Forest A collection of decision trees. Each tree votes and the average is taken.
Random Forest A collection of decision trees. Each tree votes and the average is taken. X 1 > 1500? X 2 < 1000? Mean Region 1 Mean Region 2 X 3 > 2000? Mean Region 3 Mean Region 4
Random Forest A collection of decision trees. Each tree votes and the average is taken.
Random Forest A collection of decision trees. Each tree votes and the average is taken.
Random Forest A collection of decision trees. Each tree votes and the average is taken.
Random Forest A collection of decision trees. Each tree votes and the average is taken.
Random Forest A collection of decision trees. Each tree votes and the average is taken. Data is subset for each tree. Features to split by are selected randomly.
Random Forest A collection of decision trees. Each tree votes and the average is taken. This helps reduce overfitting, but we lose interpretability. Easiest method to use OOTB. Parameters: n_trees, criterion for splits, max_depth
- Ayal grill
- "deep reinforcement learning"
- Perbedaan supervised dan unsupervised learning
- Linear regression vs multiple regression
- Regression linear model
- Logistic regression vs linear regression
- Logistic regression vs linear regression
- Alexandru niculescu-mizil
- Supervised learning pipeline
- Partially supervised learning
- Supervised vs unsupervised learning
- Supervised and unsupervised learning
- Supervised learning
- Linear regression andrew ng
- Linear regression with multiple features
- Locally weighted regression in machine learning
- Linear regression with multiple variables machine learning
- Cuadro comparativo e-learning b-learning m-learning
- Normalized cut loss for weakly-supervised cnn segmentation
- Normalized cut loss for weakly-supervised cnn segmentation
- Supervised data mining
- Joanne supervised 36 professionals in 6 city libraries
- Supervised diversionary program
- Interactive supervised classification
- Supervised visitation center dc
- Supervised agricultural experience ideas
- Partially supervised classification of text documents
- Supervised agricultural experience definition
- Supervised classification
- Remote-sensing
- On training targets for supervised speech separation
- Supervised vs unsupervised data mining
- Endogeneity problem in regression
- Regression discontinuity
- Charley's regression
- Microsoft regression suite automation tool
- Regression analysis excel 2007
- How to control for confounding
- Regressionskonstante
- Disadvantages of regression analysis
- Regression discontinuity