Classification Logistic Regression Hungyi Lee Step 1 Function
- Slides: 38
Classification: Logistic Regression Hung-yi Lee 李宏毅
Step 1: Function Set Function set: Including all different w and b z 0 class 1 z 0 class 2
Step 1: Function Set … … Sigmoid Function
Step 2: Goodness of a Function Training Data Given a set of w and b, what is its probability of generating the data?
Step 2: Goodness of a Function Cross entropy between two Bernoulli distribution Distribution p: Distribution q: cross entropy
Step 2: Goodness of a Function Cross entropy between two Bernoulli distribution 0. 0 1. 0 minimize cross entropy
Step 3: Find the best function
Step 3: Find the best function
Step 3: Find the best function Larger difference, larger update
Logistic Regression + Square Error Step 1: Step 2: Step 3: (close to target) (far from target)
Logistic Regression + Square Error Step 1: Step 2: Step 3: (far from target) (close to target)
Cross Entropy v. s. Square Error Cross Entropy Total Loss Square Error http: //jmlr. org/proceedi ngs/papers/v 9/glorot 10 a. pdf w 1 w 2
Logistic Regression Linear Regression Step 1: Output: between 0 and 1 Step 2: Step 3: Output: any value
Logistic Regression Linear Regression Step 1: Output: between 0 and 1 Step 2: Cross entropy: Output: any value
Logistic Regression Linear Regression Step 1: Output: between 0 and 1 Step 2: Step 3: Logistic regression: Linear regression: Output: any value
Discriminative v. s. Generative directly find w and b Will we obtain the same set of w and b? The same model (function set), but different function may be selected by the same training data.
Generative v. s. Discriminative Generative Discriminative All: hp, att, sp att, de, speed 73% accuracy 79% accuracy
Generative v. s. Discriminative • Example Training Data Testing Data 1 1 X 4 1 0 Class 1 Class 2 1 1 Class 1? Class 2? 0 1 Class 2 X 4 0 0 X 4 Class 2 How about Naïve Bayes?
Generative v. s. Discriminative • Example Training Data 1 1 X 4 1 0 Class 1 Class 2 0 1 Class 2 X 4 0 0 X 4 Class 2
Training Data 1 1 1 0 Class 1 Class 2 <0. 5 Testing Data 1 1 X 4 0 1 Class 2 X 4 0 0 X 4 Class 2
Generative v. s. Discriminative • Usually people believe discriminative model is better • Benefit of generative model • With the assumption of probability distribution • less training data is needed • more robust to the noise • Priors and class-dependent probabilities can be estimated from different sources.
Multi-class Classification (3 classes as example) C 1 : C 2 : C 3 : Softmax 3 1 -3 0. 88 20 0. 12 2. 7 0. 05 ≈0
Multi-class Classification (3 classes as example) [Bishop, P 209 -210] Cross Entropy Softmax target
Limitation of Logistic Regression Input Feature x 1 x 2 0 0 0 1 1 1 0 1 Can we? Label Class 2 Class 1 Class 2 z≥ 0 z<0 z≥ 0
Limitation of Logistic Regression • Feature transformation Not always easy …. . domain knowledge can be helpful
Limitation of Logistic Regression • Cascading logistic regression models Feature Transformation Classification (ignore bias in this figure)
-1 -2 2 2 -2 -1
(0. 73, 0. 05) (0. 27, 0. 27) (0. 05, 0. 73)
Deep Learning! All the parameters of the logistic regressions are jointly learned. “Neuron” Feature Transformation Classification Neural Network
Reference • Bishop: Chapter 4. 3
Appendix
Three Steps • Step 1. Function Set (Model) feature Otherwise, output: y = class 2 • Step 2. Goodness of a function • Step 3. Find the best function: gradient descent class
Step 2: Loss function Ideal loss: 0 class +1 1 z 0 class -1 2 Approximation: 0 or 1 Ideal loss z
Step 2: Loss function cross entropy Ground 1. 0 Truth
Step 2: Loss function Ideal loss Divided by ln 2 here
- Logistic regression vs linear regression
- Logistic regression vs linear regression
- Step 1 step 2 step 3 step 4
- Hungyi lee
- Hungyi lee
- Hungyi
- Hungyi lee
- Hungyi
- Cost function logistic regression
- Null deviance vs residual deviance
- Sequential logistic regression
- Logistic regression vs random forest
- Perceptron logistic regression
- Multinomial logistic regression
- Andy field regression
- Logistic function
- Multinomial logistic regression
- Logistic regression epidemiology
- Binary
- Logistic regression spss output
- Regresi logistik berganda adalah
- Logistic regression stata
- Logistic regression stata
- Multiple linear regression
- Logistic regression slide
- Ratio test
- Logistic regression outlier
- Advantage of logistic regression
- Disadvantages of linear regression
- Logistic regression stochastic gradient descent
- Mllib logistic regression
- Logistic regression in data mining
- Hessian matrix for 3 variables
- Backpropagation logistic regression
- Logistic regression assignment
- Logistic regression and discriminant analysis
- Pca logistic regression
- Backpropagation logistic regression
- Logistic regression backpropagation