CS 189 — Brian Chu — brian.c@berkeley.edu
Office Hours: Cory 246 (hackerspace lounge), 6-7 pm Mon.
twitter: @brrrianchu — brianchu.com
Agenda
• Email me for slides
• Questions? Random / HW
• Why logistic regression
• Worksheet
Questions
• Any grad students?
  – Thoughts on final project?
• Who would be able to make my 12-1 pm section?
  – Lecture / worksheet split section
• Questions? Concerns?
• Lecture pace / content / coverage?
Features
• HOG (scikit-image), TF-IDF (sklearn), bag of words, etc.
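A minimal sketch of the text featurizers mentioned above, using scikit-learn's `CountVectorizer` and `TfidfVectorizer` (the example documents are made up; HOG, from `skimage.feature.hog`, is the analogous extractor for images):

```python
# Bag of words vs. TF-IDF on a toy corpus (illustrative only).
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat", "the dog sat", "the cat ran"]

# Bag of words: raw token counts per document.
bow = CountVectorizer().fit_transform(docs)

# TF-IDF: counts down-weighted for words common across documents.
tfidf = TfidfVectorizer().fit_transform(docs)

# Both produce one row per document, one column per vocabulary word.
print(bow.shape, tfidf.shape)
```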
Terminology
• Shrinkage = regularization
• Variable with a hat (ŷ) = estimated/predicted
• P(Y | X) ∝ P(X | Y) P(Y)  — posterior ∝ likelihood × prior
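A tiny numeric illustration of posterior ∝ likelihood × prior (the two classes and probabilities here are hypothetical):

```python
# posterior ∝ likelihood * prior; normalize by P(X) to get probabilities.
priors = {"spam": 0.3, "ham": 0.7}        # P(Y), made-up values
likelihoods = {"spam": 0.8, "ham": 0.1}   # P(X | Y) for some observed X

unnorm = {y: likelihoods[y] * priors[y] for y in priors}  # ∝ P(Y | X)
Z = sum(unnorm.values())                                  # P(X), the normalizer
posterior = {y: v / Z for y, v in unnorm.items()}

print(posterior)  # spam: 0.24 / 0.31 ≈ 0.774
```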
Why logistic regression
• Odds: a measure of relative confidence
  – P = .9998 → odds 4999:1
  – P = .9999 → odds 9999:1
  – Doubled confidence!
  – P = .5001 → odds 1.0004:1; P = .5002 → odds 1.0008:1
  – (basically no change in confidence)
• "The relative increase or decrease of a factor by one unit becomes more pronounced as the factor's absolute difference increases."
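The numbers above can be reproduced directly from the definition odds = p / (1 − p):

```python
# Odds as relative confidence: tiny probability changes near 1 double
# the odds, while the same change near 0.5 barely moves them.
def odds(p):
    return p / (1 - p)

print(odds(0.9998))  # ~4999:1
print(odds(0.9999))  # ~9999:1 -> odds doubled
print(odds(0.5001))  # ~1.0004:1
print(odds(0.5002))  # ~1.0008:1 -> essentially unchanged
```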
Log-odds (calculations in base 10)
• Maps (0, 1) → (-∞, ∞)
• Symmetric: .99 ≈ +2, .01 ≈ -2
• X units of log-odds = same Y% change in confidence
  – 0.5 → 0.91 is ≈ 0 → 1
  – 0.999 → 0.9999 is ≈ 3 → 4
• "Log-odds make it clear that increasing from 99.9% to 99.99% is just as hard as increasing from 50% to 91%"
Credit: https://dl.dropboxusercontent.com/u/34547557/log-probability.pdf
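Checking the slide's base-10 log-odds values, log10(p / (1 − p)):

```python
import math

# Base-10 log-odds, as on the slide.
def log_odds(p):
    return math.log10(p / (1 - p))

print(log_odds(0.5))     # 0
print(log_odds(0.91))    # ~1
print(log_odds(0.999))   # ~3
print(log_odds(0.9999))  # ~4
print(log_odds(0.99), log_odds(0.01))  # symmetric: ~+2, ~-2
```

Each extra unit of log-odds is the same multiplicative jump in odds (×10 here), which is why 99.9% → 99.99% is "as hard as" 50% → 91%.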
Logistic Regression
• w · x = log[ P(Y=1|x) / (1 − P(Y=1|x)) ]
• Intuition: some linear combination of the features tells us the log-odds that Y = 1
• Intuition: some linear combination of the features tells us the "confidence" that Y = 1
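A sketch of that link function: w · x is the log-odds of Y = 1, and the sigmoid inverts it back to a probability. The weights and features below are made up for illustration, and the natural log is used (a different base only rescales w):

```python
import math

# Sigmoid: inverse of the (natural-log) log-odds transform.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w = [1.5, -2.0, 0.5]   # hypothetical learned weights
x = [1.0, 0.3, 2.0]    # hypothetical feature vector

z = sum(wi * xi for wi, xi in zip(w, x))  # w . x = log-odds of Y = 1
p = sigmoid(z)                            # P(Y = 1 | x)

# Check the identity on the slide: w . x = log[ p / (1 - p) ]
assert abs(math.log(p / (1 - p)) - z) < 1e-9
```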