Multiple Regression Analysis: Inference

Assumptions of the Classical Linear Model (CLM)
Given the Gauss-Markov assumptions, OLS is BLUE. Beyond the Gauss-Markov assumptions, we need one more assumption to conduct hypothesis tests (inference): assume that $u$ is independent of $x_1, \dots, x_k$ and that $u$ is normally distributed with zero mean and variance $\sigma^2$: $u \sim N(0, \sigma^2)$.

CLM Assumptions (continued...)
Under CLM, OLS is not only BLUE; it is the minimum variance unbiased estimator. The assumptions can be summarized as $y \mid x \sim N(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k, \sigma^2)$.

Normal Sampling Distributions
Under the CLM assumptions, conditional on the sample values of the explanatory variables, $\hat\beta_j \sim N(\beta_j, \mathrm{Var}(\hat\beta_j))$, so that $(\hat\beta_j - \beta_j)/\mathrm{sd}(\hat\beta_j) \sim N(0, 1)$. $\hat\beta_j$ is distributed normally because it is a linear combination of the errors, which are themselves normal.
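The sampling distribution described on this slide can be checked with a small Monte Carlo sketch. It assumes a simple regression with one regressor, and the design values, true coefficients, and error standard deviation are all hypothetical. The x values are held fixed, normal errors are redrawn many times, and the simulated slope estimates are compared with the textbook standard deviation $\sigma / \sqrt{SST_x}$.

```python
import random
import statistics

def ols_slope(x, y):
    """OLS slope for a simple regression of y on x (with an intercept)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def simulate_slopes(x, beta0, beta1, sigma, reps, seed=0):
    """Redraw normal errors `reps` times, holding x fixed, and re-estimate the slope."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

# Hypothetical design: 30 fixed x values, true slope 2.0, error sd 1.5.
x = [0.5 * i for i in range(30)]
slopes = simulate_slopes(x, beta0=1.0, beta1=2.0, sigma=1.5, reps=5000)

x_bar = statistics.fmean(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)
theoretical_sd = 1.5 / sxx ** 0.5   # sd(slope estimate) = sigma / sqrt(SST_x)

print(statistics.fmean(slopes), statistics.stdev(slopes), theoretical_sd)
```

The mean of the simulated slopes should sit close to the true slope, and their standard deviation close to the theoretical value.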

The t Test
Under the CLM assumptions, $(\hat\beta_j - \beta_j)/\mathrm{se}(\hat\beta_j) \sim t_{n-k-1}$. The expression follows a t distribution (rather than a standard normal distribution) because we have to estimate $\sigma^2$ by $\hat\sigma^2$. Note the degrees of freedom: n – k – 1.

[Figure: t Distribution]

The t Test
- Knowing the sampling distribution allows us to carry out hypothesis tests.
- Start with a null hypothesis.
- Example: $H_0: \beta_j = 0$
If the null hypothesis holds, then $x_j$ has no effect on y, controlling for the other x's.

Steps of the t Test
1. Form the relevant hypothesis.
   - one-sided hypothesis
   - two-sided hypothesis
2. Calculate the t statistic.
3. Find the critical value, c.
   - Given a significance level, α, we look up the corresponding percentile in a t distribution with n – k – 1 degrees of freedom and call it c, the critical value.
4. Apply the rejection rule to determine whether or not to reject the null hypothesis.

Types of Hypotheses and Significance Levels
Hypothesis: null vs. alternative
- one-sided: $H_0: \beta_j = 0$ and $H_1: \beta_j < 0$ or $H_1: \beta_j > 0$
- two-sided: $H_0: \beta_j = 0$ and $H_1: \beta_j \neq 0$
Significance level (α)
- If we want to have only a 5% probability of rejecting $H_0$ when it is really true, then we say our significance level is 5%.
- α values are generally 0.01, 0.05, or 0.10.
- The choice of α is often guided by sample size.

Critical Value c
What do you need to find c?
1. t-distribution table (Appendix Table B.3, p. 723, Hirschey)
2. Significance level
3. Degrees of freedom: n – k – 1, where n is the number of observations, k is the number of RHS variables, and 1 is for the constant.

One-Sided Alternatives
$y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$
$H_0: \beta_j = 0$, $H_1: \beta_j > 0$
[Figure: t distribution with the fail-to-reject region (area 1 – α) to the left of c and the rejection region (area α) to the right of c]
Critical value c: the 100(1 – α)th percentile in a t distribution with n – k – 1 DF.
t-statistic: $t = \hat\beta_j / \mathrm{se}(\hat\beta_j)$
Result: Reject $H_0$ if t-statistic > c; fail to reject $H_0$ if t-statistic < c.

One-Sided Alternatives
$y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$
$H_0: \beta_j = 0$, $H_1: \beta_j < 0$
[Figure: t distribution with the rejection region (area α) to the left of –c and the fail-to-reject region (area 1 – α) to the right of –c]
Critical value c: the 100(1 – α)th percentile in a t distribution with n – k – 1 DF.
t-statistic: $t = \hat\beta_j / \mathrm{se}(\hat\beta_j)$
Result: Reject $H_0$ if t-statistic < –c; fail to reject $H_0$ if t-statistic > –c.

Two-Sided Alternatives
$y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + u_i$
$H_0: \beta_j = 0$, $H_1: \beta_j \neq 0$
[Figure: t distribution with rejection regions (area α/2 each) below –c and above c, and the fail-to-reject region (area 1 – α) between –c and c]
Critical value c: the 100(1 – α/2)th percentile in a t distribution with n – k – 1 DF.
t-statistic: $t = \hat\beta_j / \mathrm{se}(\hat\beta_j)$
Result: Reject $H_0$ if |t-statistic| > c; fail to reject $H_0$ if |t-statistic| < c.
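The rejection rules for the three alternatives can be collected into one small function. This is a minimal sketch: the function name and the labels "greater", "less", and "two-sided" are illustrative choices, and the critical value c is assumed to have been looked up in a t table for the matching alternative (the 1 – α percentile for a one-sided test, the 1 – α/2 percentile for a two-sided test).

```python
def reject_null(t_stat, c, alternative="two-sided"):
    """Return True if H0: beta_j = 0 is rejected, given a positive critical value c."""
    if c <= 0:
        raise ValueError("critical value c must be positive")
    if alternative == "greater":      # H1: beta_j > 0
        return t_stat > c
    if alternative == "less":         # H1: beta_j < 0
        return t_stat < -c
    if alternative == "two-sided":    # H1: beta_j != 0
        return abs(t_stat) > c
    raise ValueError(f"unknown alternative: {alternative!r}")

# Hypothetical numbers: t = -2.10 with 30 degrees of freedom and alpha = 0.05.
# Table values: c = 1.697 for a one-sided test, c = 2.042 for a two-sided test.
print(reject_null(-2.10, 1.697, "less"))       # True
print(reject_null(-2.10, 1.697, "greater"))    # False
print(reject_null(-2.10, 2.042, "two-sided"))  # True
```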

Summary for $H_0: \beta_j = 0$
- Unless otherwise stated, the alternative is assumed to be two-sided.
- If we reject the null hypothesis, we typically say “$x_j$ is statistically significant at the 100α% level.”
- If we fail to reject the null hypothesis, we typically say “$x_j$ is statistically insignificant at the 100α% level.”

Testing Other Hypotheses
- A more general form of the t-statistic recognizes that we may want to test $H_0: \beta_j = a_j$.
- In this case, the appropriate t-statistic is $t = (\hat\beta_j - a_j)/\mathrm{se}(\hat\beta_j)$, where $a_j = 0$ for the conventional t-test.

t-Test: Example
Tile Example
Q = 17.513 – 0.296 P + 0.066 I + 0.036 A
    (-0.35)  (-2.91)   (2.56)    (4.61)
- t-statistics are in parentheses
Questions:
(a) How do we calculate the standard errors?
(b) Which coefficients are statistically different from zero?

Confidence Intervals
Another way to use classical statistical testing is to construct a confidence interval using the same critical value as was used for a two-sided test. A 100(1 – α)% confidence interval is defined as $\hat\beta_j \pm c \cdot \mathrm{se}(\hat\beta_j)$, where c is the 100(1 – α/2)th percentile in a $t_{n-k-1}$ distribution.
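As a small illustration, a two-sided interval of the usual form (estimate plus or minus c times the standard error) can be computed as follows. The estimate, standard error, and degrees of freedom are hypothetical; the critical value is taken from a standard t table rather than computed.

```python
def confidence_interval(estimate, std_error, c):
    """Two-sided interval: estimate +/- c * std_error, for a critical value c > 0."""
    half_width = c * std_error
    return estimate - half_width, estimate + half_width

# Hypothetical inputs: a slope estimate of 0.50 with standard error 0.10,
# and c = 2.042, the table value for a 95% interval with 30 degrees of freedom.
lower, upper = confidence_interval(0.50, 0.10, 2.042)
print(round(lower, 4), round(upper, 4))

# Zero lies outside the interval exactly when |estimate / std_error| > c,
# which is the two-sided rejection rule for H0: beta_j = 0.
excludes_zero = not (lower <= 0.0 <= upper)
print(excludes_zero)
```

Reading the interval this way ties it back to the two-sided t test: the same c drives both.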

Confidence Interval (continued...)

Computing p-values for t Tests
An alternative to the classical approach is to ask, “What is the smallest significance level at which the null hypothesis would be rejected?” Compute the t-statistic, and then obtain the probability, under the null hypothesis, of getting a value more extreme than the calculated one (larger in absolute value for a two-sided test). The p-value is this probability.
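A two-sided p-value can be sketched without a statistics library by integrating the t density numerically. This is one possible approach, not the method used in the slides: the helper names are hypothetical, and Simpson's rule stands in for the table lookup or the library routine one would normally use.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| > |t_stat|) for T ~ t(df), via Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total     # P(-|t| < T < |t|), by symmetry
    return max(0.0, 1.0 - central_mass)

# Sanity check against a t table: 2.228 is the 5% two-sided critical value for 10 DF.
print(round(two_sided_p_value(2.228, 10), 3))
```

If the p-value falls below the chosen α, the null hypothesis is rejected at that level.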

Example: Regression Relation Between Units Sold and Personal Selling Expenditures for Electronic Data Processing (EDP), Inc.
Units sold = -1292.3 + 0.09289 PSE
             (396.5)   (0.01097)
- standard errors are in parentheses
(a) What are the associated t-statistics for the intercept and slope parameter estimates?
(b) t-stat for the intercept = -3.26, p-value 0.009
    t-stat for the slope = 8.47, p-value 0.000
    If p-value < α, then reject $H_0: \beta_i = 0$.
    If p-value > α, then fail to reject $H_0: \beta_i = 0$.
(c) What conclusion about the statistical significance of the estimated parameters do you reach, given these p-values?
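Part (a) can be checked directly: for the null hypothesis that a coefficient is zero, the t-statistic is the estimate divided by its standard error. The short sketch below uses the numbers reported in the EDP example; the dictionary keys are just labels chosen for illustration.

```python
# Estimates and standard errors from the EDP example.
estimates = {"intercept": -1292.3, "PSE": 0.09289}
std_errors = {"intercept": 396.5, "PSE": 0.01097}

# t-statistic for H0: beta = 0 is the estimate divided by its standard error.
t_stats = {name: estimates[name] / std_errors[name] for name in estimates}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.2f}")
# intercept: t = -3.26
# PSE: t = 8.47
```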

Testing a Linear Combination of Parameter Estimates
Suppose that, instead of testing whether $\beta_1$ is equal to a constant, you want to test whether it is equal to another parameter, that is, $H_0: \beta_1 = \beta_2$. Use the same basic procedure for forming a t-statistic: $t = (\hat\beta_1 - \hat\beta_2)/\mathrm{se}(\hat\beta_1 - \hat\beta_2)$.
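A sketch of this test, under the assumption that the estimated variances and covariance of the two coefficients are available (for example, from the estimated covariance matrix that most regression packages report). The standard error of the difference uses Var(b1 – b2) = Var(b1) + Var(b2) – 2 Cov(b1, b2). The function name and all numbers are hypothetical.

```python
import math

def t_stat_equal_coefficients(b1, b2, var_b1, var_b2, cov_b1_b2):
    """t-statistic for H0: beta_1 = beta_2.

    Uses Var(b1 - b2) = Var(b1) + Var(b2) - 2 Cov(b1, b2).
    """
    var_diff = var_b1 + var_b2 - 2.0 * cov_b1_b2
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    return (b1 - b2) / math.sqrt(var_diff)

# Hypothetical estimates and (co)variances, for illustration only.
t = t_stat_equal_coefficients(b1=0.80, b2=0.50,
                              var_b1=0.010, var_b2=0.020, cov_b1_b2=0.003)
print(round(t, 3))
```

The resulting statistic is compared with a t critical value with n – k – 1 degrees of freedom, exactly as for a single coefficient.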

Overall Significance
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
Use of F-statistic: $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$

F Distribution with 4 and 30 degrees of freedom (for a regression model with four X variables based on 35 observations).

The F Statistic
Reject $H_0$ at significance level α if F > c, where c is the critical value from Appendix Tables B.2, pp. 720-722, Hirschey.
[Figure: F distribution with the fail-to-reject region (area 1 – α) between 0 and c and the rejection region (area α) to the right of c]

Example:
UNITS = -117.513 – 0.296 P + 0.036 AD + 0.006 PSE
        (-0.35)    (-2.91)   (2.56)     (4.61)
- t-statistics are in parentheses
P = Price, AD = Advertising, PSE = Selling Expenses, UNITS = # of units sold
Standard error of the regression = 123.9, $R^2 = 0.97$, $\bar{R}^2 = 0.958$, n = 32
(a) Calculate the F-statistic.
(b) What are the degrees of freedom associated with the F-statistic?
(c) What is the cutoff value of this F-statistic when α = 0.05? When α = 0.01?
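Parts (a) and (b) can be worked through with the standard R²-based form of the overall F-statistic, F = (R²/k) / ((1 – R²)/(n – k – 1)). The sketch below assumes this is the intended formula and takes k = 3 from the three regressors P, AD, and PSE; the function name is illustrative.

```python
def overall_f_statistic(r_squared, n, k):
    """F-statistic for H0: all k slope coefficients are zero, from R-squared."""
    df_num = k              # number of restrictions (slopes set to zero)
    df_den = n - k - 1      # residual degrees of freedom
    f_stat = (r_squared / df_num) / ((1.0 - r_squared) / df_den)
    return f_stat, df_num, df_den

# Values from the UNITS example: R^2 = 0.97, n = 32, k = 3 (P, AD, PSE).
f_stat, df_num, df_den = overall_f_statistic(0.97, 32, 3)
print(round(f_stat, 1), df_num, df_den)
```

For part (c), standard F tables list critical values of roughly 2.95 (α = 0.05) and 4.57 (α = 0.01) for 3 and 28 degrees of freedom, so an F-statistic of this size rejects the null at either level.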

General Linear Restrictions
The basic form of the F-statistic will work for any set of linear restrictions. First estimate the unrestricted (UR) model and then estimate the restricted (R) model. In each case, make note of the SSE.

Test of General Linear Restrictions
$F = \dfrac{(SSE_R - SSE_{UR})/q}{SSE_{UR}/(n - k - 1)}$
- This F-statistic measures the relative increase in SSE when moving from the unrestricted (UR) model to the restricted (R) model.
- q = number of restrictions
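The relative increase in SSE can be computed with the standard restricted-versus-unrestricted form, F = ((SSE_R – SSE_UR)/q) / (SSE_UR/(n – k – 1)), where k counts the slope coefficients in the unrestricted model. The sketch below assumes that form; the function name and all input values are hypothetical.

```python
def restriction_f_statistic(sse_r, sse_ur, q, n, k):
    """F-statistic for q linear restrictions.

    sse_r, sse_ur: sums of squared residuals of the restricted and unrestricted models.
    n: observations; k: slope coefficients in the unrestricted model.
    """
    if sse_r < sse_ur:
        raise ValueError("restricted SSE cannot be below unrestricted SSE")
    return ((sse_r - sse_ur) / q) / (sse_ur / (n - k - 1))

# Hypothetical values, for illustration only.
f_stat = restriction_f_statistic(sse_r=120.0, sse_ur=100.0, q=2, n=35, k=4)
print(round(f_stat, 3))
```

The statistic is compared with an F critical value with q and n – k – 1 degrees of freedom.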

Example:
Unrestricted Model
Restricted Model (under $H_0$); note q = 1

F-Statistic Summary
- Just as with t-statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution.
- If q = 1, then F = t², and the p-values will be the same.

Summary: Inferences
- t-Test
  (a) one-sided vs. two-sided hypotheses
  (b) tests associated with a constant value
  (c) tests associated with linear combinations of parameters
  (d) p-values of t-tests
- Confidence intervals for estimated coefficients
- F-test
- p-values of F-tests

Structure of Applied Research