Logistic Regression


Logistic Regression
• The logistic function: $f(z) = \frac{1}{1 + e^{-z}}$
• The logistic function is useful because it accepts any input from negative infinity to positive infinity, while its output is confined to values between 0 and 1.
• The variable z represents the exposure to some set of independent variables, while f(z) represents the probability of a particular outcome given that set of explanatory variables.
• The variable z measures the total contribution of all the independent variables used in the model and is known as the logit.
IS 240 – Spring 2010, 2010.02.22, Slide 1
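The behaviour described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function names `logistic` and `logit`, and the intercept-plus-weighted-sum form of z, are assumptions based on the standard definition of logistic regression.

```python
import math


def logistic(z: float) -> float:
    """Map any real-valued z to a value between 0 and 1."""
    # Branch on the sign of z so that math.exp never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(x, coefficients, intercept=0.0):
    """Hypothetical helper: z as an intercept plus a weighted sum of the
    independent variables (the usual linear form, assumed here)."""
    return intercept + sum(c * xi for c, xi in zip(coefficients, x))


if __name__ == "__main__":
    for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(f"z = {z:6.1f}  f(z) = {logistic(z):.6f}")
```

Large negative inputs give values near 0, large positive inputs give values near 1, and z = 0 gives exactly 0.5.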


Probabilistic Models: Logistic Regression
• In information retrieval, relevance estimates are based on a log-linear model with various statistical measures of document content as independent variables.
• The log odds of relevance is a linear function of the attributes.
• Term contributions are summed.
• The probability of relevance is obtained by inverting the log odds.
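The slide's equations did not survive in this transcript. As a hedged reconstruction from the standard form of logistic regression, the first and last statements can be written as follows. The symbols are assumptions: R for relevance, Q for the query, D for the document, X_k for the attribute measures, c_k for the coefficients, and K for the number of attributes. The exact form of the term-level summation is not recoverable from the text and is not reproduced here.

```latex
% Log odds of relevance as a linear function of the attributes (assumed notation)
\log O(R \mid Q, D) = c_0 + \sum_{k=1}^{K} c_k X_k

% Probability of relevance recovered by inverting the log odds
P(R \mid Q, D) = \frac{1}{1 + e^{-\log O(R \mid Q, D)}}
```

The second line is the logistic function applied to the log odds, so any real-valued log odds maps back to a probability between 0 and 1.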


[Figure: Logistic Regression. Y-axis: Probability of Relevance (0 to 100); X-axis: Term Frequency in Document (0 to 60).]


Probabilistic Models: Logistic Regression
• Estimation of the probability of relevance is based on logistic regression over a sample set of documents, which determines the values of the coefficients.
• At retrieval, the probability estimate is obtained from the fitted coefficients and the 6 X attribute measures shown on the next slide.
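The two stages on this slide, fitting coefficients from a judged sample and then estimating probabilities at retrieval time, can be sketched as below. This is an illustrative sketch under stated assumptions, not the procedure used in the course: the fitting method (plain batch gradient ascent on the log-likelihood), the function names, and the learning-rate and epoch defaults are all hypothetical, and the code works for any number of attributes rather than exactly six.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(samples, labels, lr=0.1, epochs=2000):
    """Fit [c0, c1, ..., cn] by batch gradient ascent on the log-likelihood.

    samples: list of attribute vectors, one per judged query-document pair.
    labels:  1 for relevant, 0 for not relevant.
    (Assumed training setup; the slides do not say how the fit was done.)
    """
    n = len(samples[0])
    m = len(samples)
    coefs = [0.0] * (n + 1)  # coefs[0] is the intercept c0
    for _ in range(epochs):
        grad = [0.0] * (n + 1)
        for x, y in zip(samples, labels):
            p = sigmoid(coefs[0] + sum(c * xi for c, xi in zip(coefs[1:], x)))
            err = y - p  # gradient of the log-likelihood w.r.t. z
            grad[0] += err
            for i, xi in enumerate(x):
                grad[i + 1] += err * xi
        coefs = [c + lr * g / m for c, g in zip(coefs, grad)]
    return coefs


def probability_of_relevance(coefs, x):
    """Retrieval-time estimate: logistic function of c0 + sum(c_i * X_i)."""
    return sigmoid(coefs[0] + sum(c * xi for c, xi in zip(coefs[1:], x)))


if __name__ == "__main__":
    # Toy one-attribute sample, invented purely for illustration.
    toy_samples = [[0.0], [1.0], [2.0], [3.0]]
    toy_labels = [0, 0, 1, 1]
    fitted = fit_logistic(toy_samples, toy_labels)
    for x in toy_samples:
        print(x, round(probability_of_relevance(fitted, x), 3))
```

At retrieval, documents would be ranked by the value `probability_of_relevance` returns for each query-document attribute vector.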


Probabilistic Models: Logistic Regression attributes (“TREC 3”)
• Average Absolute Query Frequency
• Query Length
• Average Absolute Document Frequency
• Document Length
• Average Inverse Document Frequency
• Number of Terms in common between query and document (logged)
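One way the six attributes might be computed for a single query-document pair is sketched below. The slide only names the attributes, so every definition here is an assumption: frequencies are averaged over the terms the query and document share, "absolute frequency" is read as a raw term count, inverse document frequency is taken as log(N / n_t), and only the last attribute is logged. The function name `trec3_style_attributes` and its parameters are hypothetical.

```python
import math
from collections import Counter


def trec3_style_attributes(query_terms, doc_terms, doc_freq, num_docs):
    """Return the six attribute values for one query-document pair,
    or None when the query and document share no terms.

    doc_freq: mapping term -> number of collection documents containing it.
    num_docs: total number of documents in the collection.
    """
    q_counts = Counter(query_terms)
    d_counts = Counter(doc_terms)
    common = [t for t in q_counts if t in d_counts]
    m = len(common)
    if m == 0:
        return None  # log(0) is undefined, so there is nothing to score

    return {
        # Mean raw count of the shared terms within the query (assumed reading).
        "avg_abs_query_freq": sum(q_counts[t] for t in common) / m,
        "query_length": len(query_terms),
        # Mean raw count of the shared terms within the document (assumed reading).
        "avg_abs_doc_freq": sum(d_counts[t] for t in common) / m,
        "doc_length": len(doc_terms),
        # Mean of log(N / n_t) over the shared terms (assumed IDF form).
        "avg_idf": sum(math.log(num_docs / doc_freq[t]) for t in common) / m,
        # Number of shared terms, logged as the slide indicates.
        "log_common_terms": math.log(m),
    }


if __name__ == "__main__":
    # Invented toy values, for illustration only.
    attrs = trec3_style_attributes(
        ["logistic", "regression", "model"],
        ["logistic", "regression", "regression", "curve"],
        {"logistic": 10, "regression": 100, "curve": 5, "model": 50},
        num_docs=1000,
    )
    for name, value in attrs.items():
        print(f"{name}: {value:.4f}")
```

The resulting values would serve as the X attribute measures in the retrieval-time probability estimate.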