
Logistic Regression 1: The basics
Michael Claudius, Associate Professor, Roskilde
31.03.2020, revised 18.10.2020

What is logistic regression?

Evaluation of logistic regression?
• Advantages
  • Also good for small data sets!
  • White box; you know in detail how it works
  • Easy
• Disadvantages
  • Not good for big data, too slow
  • Wrong estimates for messy data, outliers
  • Does not tolerate missing data
  • Variables (features) must be independent

Prediction

Estimation elements
• It is all math; it looks complicated, so just keep it simple!
• p: estimated probability
• h: hypothesis function based on θ: h_θ
• X: feature vector, or just the feature values X_1, X_2, …, X_n
• θ: parameter vector, the weights on the features (θ_0, θ_1, θ_2, …, θ_n)
• X^T: transposed vector (columns changed to rows)
• X^T θ: matrix multiplication (like linear regression): θ_0 + X_1 θ_1 + X_2 θ_2 + … + X_n θ_n
• σ: the famous sigmoid function!
• A link to Wikipedia
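The elements listed above combine into one prediction equation. A sketch in LaTeX using the slide's symbols; the 0.5 decision threshold and the symbol \hat{y} for the predicted class are assumptions, not taken from the slides:

```latex
p = h_\theta(X) = \sigma(X^T \theta), \qquad
\sigma(t) = \frac{1}{1 + e^{-t}}, \qquad
\hat{y} =
\begin{cases}
0 & \text{if } p < 0.5 \\
1 & \text{if } p \ge 0.5
\end{cases}
```

Here X is taken to include a constant 1 as its first entry, so that θ_0 acts as the intercept.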

Sigmoid function
• σ(t): values between 0 and 1!
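A minimal Python sketch of the sigmoid function, assuming NumPy is available. The function name `sigmoid` and the sample inputs are chosen for illustration only:

```python
import numpy as np

def sigmoid(t):
    """Logistic (sigmoid) function: sigma(t) = 1 / (1 + e^(-t))."""
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + np.exp(-t))

# Sample inputs: large negative values map close to 0,
# 0 maps to exactly 0.5, large positive values map close to 1.
samples = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])
values = sigmoid(samples)
print(values)
```

The output always stays strictly between 0 and 1, which is why σ can be read as a probability.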

Training

Cost function
• This function for a single training instance fulfills the requirements
• c: cost function
• θ: parameter vector, the weights on the features (θ_0, θ_1, θ_2, …, θ_n)
• p: estimated probability
• But of course there are many instances, so we need an average over the whole training set…
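The formula itself did not survive on this slide. Assuming it is the usual log loss for logistic regression, it can be sketched in LaTeX as follows; the symbol y for the true class label (0 or 1) is an assumed notation:

```latex
c(\theta) =
\begin{cases}
-\log(p) & \text{if } y = 1 \\
-\log(1 - p) & \text{if } y = 0
\end{cases}
```

The cost is close to 0 when p agrees with the true label and grows without bound when the model is confidently wrong.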

Average cost function
• But of course there are many instances, so we need the average of the cost over the whole training set
• J(θ): average cost function
• θ: parameter vector, the weights on the features (θ_0, θ_1, θ_2, …, θ_n)
• How do we find the best set of parameters?
• No Normal Equation! (there is no known closed-form solution)
• BUT again we are lucky…
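A minimal Python sketch of the average cost J(θ), assuming the per-instance cost is the usual log loss and that NumPy is available. The names `average_cost`, `X`, `y` and the tiny data set are hypothetical, chosen only for illustration:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def average_cost(theta, X, y, eps=1e-12):
    """Average log loss J(theta) over all m training instances.

    X: array of shape (m, n + 1), first column all ones (for theta_0).
    y: array of shape (m,), true labels 0 or 1.
    """
    p = sigmoid(X @ theta)            # estimated probabilities
    p = np.clip(p, eps, 1.0 - eps)    # avoid log(0)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Hypothetical toy data: one feature plus the constant column.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])

# With all weights at zero every p is 0.5, so J(theta) = log(2).
cost_at_zero = average_cost(np.zeros(2), X, y)
print(cost_at_zero)
```

A better θ gives a lower J(θ); training means searching for the θ that minimizes it.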

Partial derivative of average cost function
• Why lucky? Because J(θ) is convex and differentiable
• That is, it has a global minimum
• So we can find the parameters (θ_0, θ_1, θ_2, …, θ_n) using the Batch Gradient Descent algorithm! (BAM)
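A minimal Python sketch of batch gradient descent for logistic regression, assuming the log loss as cost and NumPy as the library. The function name, learning rate, iteration count and toy data are all hypothetical choices for illustration, not taken from the slides:

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def batch_gradient_descent(X, y, learning_rate=0.1, n_iterations=5000):
    """Minimize the average log loss J(theta) with batch gradient descent.

    X: array of shape (m, n + 1), first column all ones (for theta_0).
    y: array of shape (m,), true labels 0 or 1.
    """
    m, n_params = X.shape
    theta = np.zeros(n_params)
    for _ in range(n_iterations):
        p = sigmoid(X @ theta)
        # Partial derivatives of J(theta): (1/m) * X^T (p - y),
        # computed over the whole training set (hence "batch").
        gradient = X.T @ (p - y) / m
        theta -= learning_rate * gradient
    return theta

# Hypothetical toy data: one feature, classes overlap slightly.
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])

theta = batch_gradient_descent(X, y)
probabilities = sigmoid(X @ theta)
print(theta, probabilities)
```

Because J(θ) is convex, a small enough learning rate moves θ toward the global minimum rather than getting stuck in a local one.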

Assignments
• It is time for discussion and solving a few assignments in groups
• Logistic Regression Questions