Goodness of Fit Tests
• The goal of goodness-of-fit tests is to test whether the data come from a certain distribution.
• There are various situations to which these tests apply.
• The first situation we will explore is when we observe count data in k different categories, in which case we use the χ² test.
• The aim is to test the null hypothesis that the probabilities of the k categories are p1, p2, …, pk.
• We distinguish between two cases.
STA 286 week 11

Chi-Squared Test - Case 1
• The null hypothesis completely specifies the probabilities of each of the k categories.
• For each category we calculate the expected count Ei = n·pi, where n is the total number of observations.
• The test statistic and its distribution are…
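
The formula on this slide did not survive. For reference, the standard Pearson statistic for this case, which is presumably what the slide showed, can be written as follows. The symbol O_i for the observed count in category i is an assumption; it does not appear in the slides.

\[
\chi^2 \;=\; \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = n\,p_i .
\]

Under the null hypothesis, and for large n, this statistic has approximately a χ² distribution with k − 1 degrees of freedom. Large values of the statistic are evidence against the null hypothesis.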

Example
• The statistics department at U of T offers introductory courses for students from other disciplines. The department believes that 40% of the students are math majors, 30% are computer science, 20% biology and 10% chemistry. A random sample of 120 students revealed 52, 38, 21, and 9 students from the four majors above. Do these data support the department's claim?
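
One way to work this example is sketched below in Python, using the standard Pearson χ² statistic with k − 1 = 3 degrees of freedom. The 5% significance level is an assumption (the slide does not state one), and 7.815 is the tabulated upper 5% critical value of the χ² distribution with 3 degrees of freedom.

```python
# Observed counts and claimed proportions from the example
# (math, computer science, biology, chemistry).
observed = [52, 38, 21, 9]
claimed = [0.40, 0.30, 0.20, 0.10]

n = sum(observed)                      # 120 students
expected = [n * p for p in claimed]    # 48, 36, 24, 12

# Pearson chi-squared statistic: sum of (O - E)^2 / E over categories.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                 # k - 1 = 3

# Assumption: test at the 5% level. Tabulated critical value for df = 3.
CRITICAL_5PCT_DF3 = 7.815
reject = chi_sq > CRITICAL_5PCT_DF3

print(f"chi-squared = {chi_sq:.3f}, df = {df}, reject H0: {reject}")
```

The statistic works out to about 1.57, well below 7.815, so at the assumed 5% level the sample gives no evidence against the department's claim.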

Chi-Squared Test - Case 2
• The null hypothesis does not fully specify the probabilities.
• In this case the probabilities of the different categories may be functions of other parameters.
• First use the sample data to estimate the r unknown parameters.
• Then use the estimated parameters to estimate the k probabilities.
• For each category, calculate the estimated expected count.
• The test statistic is…
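
The formula on this slide did not survive either. The standard form of the statistic when parameters are estimated, presumably what was shown, is given below. The symbols O_i (observed count), \hat{p}_i (estimated probability) and \hat{E}_i (estimated expected count) are assumptions that do not appear in the slides.

\[
\chi^2 \;=\; \sum_{i=1}^{k} \frac{(O_i - \hat{E}_i)^2}{\hat{E}_i}, \qquad \hat{E}_i = n\,\hat{p}_i .
\]

When the r parameters are estimated appropriately (for example by maximum likelihood from the grouped counts), the statistic has approximately a χ² distribution with k − 1 − r degrees of freedom under the null hypothesis: one degree of freedom is lost for each estimated parameter.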

Example
• A farmer believes that the number of eggs a chicken lays per day has a Poisson(λ) distribution. He observed the following data….
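
The farmer's data are not reproduced here, so the following is only a sketch of how the Case 2 procedure could be applied once the counts are available. The function name `poisson_gof` and its input layout are hypothetical. It assumes that `counts[j]` is the number of chickens that laid j eggs, that the last cell stands for "that many eggs or more", and that λ is estimated by the sample mean with the last cell treated as exactly its lower bound (an approximation).

```python
import math

def poisson_gof(counts):
    """Chi-squared goodness-of-fit sketch for a Poisson(lambda) model.

    counts[j] = number of chickens observed laying j eggs (hypothetical layout);
    the last cell is treated as 'j or more' when computing probabilities.
    Returns (lambda_hat, expected_counts, chi_squared, degrees_of_freedom).
    """
    n = sum(counts)
    k = len(counts)

    # Step 1: estimate the single unknown parameter (r = 1) by the sample mean.
    lam_hat = sum(j * c for j, c in enumerate(counts)) / n

    # Step 2: estimated cell probabilities; the last cell takes the upper tail.
    probs = [math.exp(-lam_hat) * lam_hat ** j / math.factorial(j)
             for j in range(k - 1)]
    probs.append(1.0 - sum(probs))

    # Step 3: estimated expected counts and the chi-squared statistic.
    expected = [n * p for p in probs]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(counts, expected))

    # k - 1 - r degrees of freedom, with r = 1 estimated parameter.
    df = k - 1 - 1
    return lam_hat, expected, chi_sq, df
```

The returned statistic would then be compared with a χ² critical value for the returned degrees of freedom, after checking that no estimated expected count is too small.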

Remark
• In many cases we observe data that are not categorized, and we want to test whether the data come from a certain distribution.
• If the distribution we are testing is discrete, the values of the variable are the actual categories.
• However, if the variable can take infinitely many values, the values should be grouped so that the expected frequency in each category is at least 5.
• If the distribution we are testing is continuous, we need to group the measurements of the random variable of interest into k intervals. Very often the choice of cells is arbitrary.
• χ² tests have low power when they are applied to continuous data, in which case we can use other tests.
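
The grouping rule above can be sketched as a small helper that pools adjacent cells until each pooled cell has an expected count of at least 5. The function name `merge_small_cells` and the left-to-right pooling order are assumptions for illustration; other groupings are equally valid.

```python
def merge_small_cells(observed, expected, min_expected=5.0):
    """Pool adjacent cells so every pooled expected count is >= min_expected.

    Cells are scanned left to right; any leftover tail that is still too
    small is folded into the last pooled cell.
    """
    merged_obs, merged_exp = [], []
    acc_o, acc_e = 0, 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            merged_obs.append(acc_o)
            merged_exp.append(acc_e)
            acc_o, acc_e = 0, 0.0

    # Fold a too-small remainder into the previous pooled cell, if any.
    if acc_e > 0:
        if merged_exp:
            merged_obs[-1] += acc_o
            merged_exp[-1] += acc_e
        else:
            merged_obs.append(acc_o)
            merged_exp.append(acc_e)
    return merged_obs, merged_exp
```

The number of categories k used for the degrees of freedom is the number of cells after pooling, not before.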

Kolmogorov-Smirnov Goodness-of-Fit Test
• The K-S test is also called the Kolmogorov-Smirnov D test.
• The K-S goodness-of-fit test checks whether the observed distribution differs significantly from a hypothesized one.
• For continuous data it is a more powerful alternative to the chi-square goodness-of-fit test.
• The test statistic in the K-S test is the largest absolute difference between the cumulative observed proportion and the cumulative proportion expected under the hypothesized distribution.
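
The statistic described above, often written D, can be computed as in the following sketch. The function name `ks_statistic` and its `cdf` argument are hypothetical, and the hypothesized distribution is assumed to be continuous and fully specified. Because the empirical cumulative proportion jumps at each data point, the difference is checked just before and just after every sorted observation.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov D: largest absolute gap between the empirical
    cumulative proportion and the hypothesized CDF `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Empirical proportion is (i - 1)/n just below x and i/n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Usage with a Uniform(0, 1) hypothesis, whose CDF is F(x) = x on [0, 1].
def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)

d = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
```

The value of D is then compared with a critical value from K-S tables for the given sample size; large values of D are evidence against the hypothesized distribution.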

Example