Multiple Linear Regression: Matrix Formulation


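Let $\mathbf{x} = (x_1, x_2, \ldots, x_n)'$ be an $n \times 1$ column vector and let $g(\mathbf{x})$ be a scalar function of $\mathbf{x}$. Then, by definition,

$$\frac{\partial g(\mathbf{x})}{\partial \mathbf{x}} = \left( \frac{\partial g(\mathbf{x})}{\partial x_1}, \frac{\partial g(\mathbf{x})}{\partial x_2}, \ldots, \frac{\partial g(\mathbf{x})}{\partial x_n} \right)'.$$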

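For example, let $g(\mathbf{x}) = \mathbf{a}'\mathbf{x}$, where $\mathbf{a} = (a_1, a_2, \ldots, a_n)'$ is an $n \times 1$ column vector of constants. It is easy to verify that

$$\frac{\partial (\mathbf{a}'\mathbf{x})}{\partial \mathbf{x}} = \mathbf{a}$$

and that, for a symmetric $n \times n$ matrix $A$,

$$\frac{\partial (\mathbf{x}'A\mathbf{x})}{\partial \mathbf{x}} = 2A\mathbf{x}.$$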
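As a quick check, the two standard identities $\partial(\mathbf{a}'\mathbf{x})/\partial\mathbf{x} = \mathbf{a}$ and $\partial(\mathbf{x}'A\mathbf{x})/\partial\mathbf{x} = 2A\mathbf{x}$ can be written out in full for $n = 2$. This is only a sketch; $a_{ij}$ denotes the elements of the symmetric matrix $A$ (so $a_{12} = a_{21}$), a notation introduced here for illustration.

```latex
\begin{align*}
\mathbf{a}'\mathbf{x} &= a_1 x_1 + a_2 x_2
  &&\Rightarrow\quad
  \frac{\partial(\mathbf{a}'\mathbf{x})}{\partial\mathbf{x}}
  = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \mathbf{a}, \\
\mathbf{x}'A\mathbf{x} &= a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2
  &&\Rightarrow\quad
  \frac{\partial(\mathbf{x}'A\mathbf{x})}{\partial\mathbf{x}}
  = \begin{pmatrix} 2a_{11}x_1 + 2a_{12}x_2 \\ 2a_{12}x_1 + 2a_{22}x_2 \end{pmatrix}
  = 2A\mathbf{x}.
\end{align*}
```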

Theory of Multiple Regression

Suppose we have response variables $Y_i$, $i = 1, 2, \ldots, n$, and $k$ explanatory variables/predictors $X_1, X_2, \ldots, X_k$. The model is

$$Y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + \cdots + b_k x_{ik} + e_i, \qquad i = 1, 2, \ldots, n,$$

where the errors $e_i$ have mean zero and variance $\sigma^2$. There are $k+2$ parameters: $b_0, b_1, b_2, \ldots, b_k$ and $\sigma^2$.

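In matrix form the model is $\mathbf{Y} = X\mathbf{b} + \mathbf{e}$, where $\mathbf{Y} = (Y_1, \ldots, Y_n)'$, $\mathbf{b} = (b_0, b_1, \ldots, b_k)'$, $\mathbf{e} = (e_1, \ldots, e_n)'$ and $X$ is the $n \times (k+1)$ matrix whose $i$th row is $(1, x_{i1}, \ldots, x_{ik})$. $X$ is called the design matrix.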

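OLS (ordinary least squares) estimation

The OLS estimate $\hat{\mathbf{b}}$ minimises the sum of squared errors $(\mathbf{Y} - X\mathbf{b})'(\mathbf{Y} - X\mathbf{b})$, which gives

$$\hat{\mathbf{b}} = (X'X)^{-1}X'\mathbf{Y}.$$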
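The OLS estimator can be derived using the vector-derivative identities $\partial(\mathbf{a}'\mathbf{x})/\partial\mathbf{x} = \mathbf{a}$ and $\partial(\mathbf{x}'A\mathbf{x})/\partial\mathbf{x} = 2A\mathbf{x}$. The following is a sketch, assuming $X'X$ is invertible; the name $S(\mathbf{b})$ for the sum of squared errors is introduced here for illustration.

```latex
\begin{align*}
S(\mathbf{b}) &= (\mathbf{Y} - X\mathbf{b})'(\mathbf{Y} - X\mathbf{b})
  = \mathbf{Y}'\mathbf{Y} - 2\mathbf{b}'X'\mathbf{Y} + \mathbf{b}'X'X\mathbf{b}, \\
\frac{\partial S}{\partial \mathbf{b}} &= -2X'\mathbf{Y} + 2X'X\mathbf{b} = \mathbf{0}
  \quad\Rightarrow\quad X'X\hat{\mathbf{b}} = X'\mathbf{Y}
  \quad\Rightarrow\quad \hat{\mathbf{b}} = (X'X)^{-1}X'\mathbf{Y}.
\end{align*}
```

The linear term uses the first identity with $\mathbf{a} = X'\mathbf{Y}$, and the quadratic term uses the second with the symmetric matrix $A = X'X$.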

Fitted values are given by

$$\hat{\mathbf{Y}} = X\hat{\mathbf{b}} = X(X'X)^{-1}X'\mathbf{Y} = H\mathbf{Y}, \qquad H = X(X'X)^{-1}X'.$$

$H$ is called the “hat matrix” (it puts the hats on the Y’s).

The error sum of squares, $SS_{RES}$, is

$$SS_{RES} = (\mathbf{Y} - \hat{\mathbf{Y}})'(\mathbf{Y} - \hat{\mathbf{Y}}) = \mathbf{Y}'(I - H)\mathbf{Y}.$$

The estimate of $\sigma^2$ is based on this: $s^2 = SS_{RES}/(n - k - 1)$.

Example: Find a model of the form $y = b_0 + b_1 x_1 + b_2 x_2 + e$ for the data below.

y     x1    x2
3.5   3.1   30
3.2   3.4   25
3.0   3.0   20
2.9   3.2   30
4.0   3.9   40
2.5   2.8   25
2.3   2.2   30
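For reference, this model can be fitted directly with R's built-in `lm` function. The sketch below assumes the third data row reads y = 3.0, x1 = 3.0, x2 = 20 (one value appears to have been lost from the table); with that reading the fit reproduces the slope estimates 0.8984 and 0.01745 quoted in the slides. The variable names are illustrative.

```r
# Example data (row 3 taken as y = 3.0, x1 = 3.0: an assumption)
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Fit y = b0 + b1*x1 + b2*x2 + e by ordinary least squares
fit <- lm(y ~ x1 + x2)
print(coef(fit))   # estimates of b0, b1 and b2
```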

X is called the design matrix


The model in matrix form is given by $\mathbf{Y} = X\mathbf{b} + \mathbf{e}$. We have already seen that $\hat{\mathbf{b}} = (X'X)^{-1}X'\mathbf{Y}$. Now calculate this for our example.

R can be used to calculate $X'X$ and the answer is:

$$X'X = \begin{pmatrix} 7 & 21.6 & 200 \\ 21.6 & 68.30 & 626 \\ 200 & 626 & 5950 \end{pmatrix}$$

To input the matrix in R use

X=matrix(c(1,1,1,1,1,1,1,3.1,3.4,3.0,3.2,3.9,2.8,2.2,30,25,20,30,40,25,30),7,3)

The last two arguments are the number of rows (7) and the number of columns (3). R fills the matrix column by column, so the values are listed as the column of ones, then x1, then x2.

Notice the command for matrix multiplication, %*%: $X'X$ is computed as t(X) %*% X.
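A minimal sketch of this step in R, assuming the example data with the third row read as x1 = 3.0, x2 = 20. The names `x1`, `x2` and `XtX` are illustrative.

```r
# Predictor values from the example (row 3 of x1 taken as 3.0: an assumption)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Design matrix: a column of ones, then x1, then x2 (7 rows, 3 columns)
X <- matrix(c(rep(1, 7), x1, x2), 7, 3)

# t() transposes; %*% is the matrix product (a plain * multiplies element-wise)
XtX <- t(X) %*% X
print(XtX)
```

The result is a symmetric 3 x 3 matrix whose top-left entry is the number of observations, 7.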


The inverse of X’X can also be obtained by using R
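In R, `solve` applied to a single square matrix returns its inverse. The sketch below assumes the example data (third row read as x1 = 3.0, x2 = 20) and checks the result by multiplying back; the name `XtX_inv` is illustrative.

```r
# Design matrix for the example (row 3 of x1 taken as 3.0: an assumption)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)
X  <- matrix(c(rep(1, 7), x1, x2), 7, 3)

# solve(A) with one argument returns the inverse of A
XtX_inv <- solve(t(X) %*% X)
print(round(XtX_inv, 4))

# Sanity check: the inverse times X'X should be the 3 x 3 identity
print(round(XtX_inv %*% (t(X) %*% X), 8))
```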

We also need to calculate $X'\mathbf{Y}$. Now $\hat{\mathbf{b}} = (X'X)^{-1}X'\mathbf{Y}$ can be evaluated.

Notice that this is the same result as obtained previously using lm in R.

So $y = -0.2138 + 0.8984\,x_1 + 0.01745\,x_2 + e$
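The matrix calculation of the coefficients can be sketched in R as follows, again assuming the third data row is y = 3.0, x1 = 3.0, x2 = 20. Here `solve(A, B)` solves the normal equations $(X'X)\mathbf{b} = X'\mathbf{Y}$ directly, which gives the same answer as forming the inverse first; the names `XtY` and `b_hat` are illustrative.

```r
# Example data (row 3 taken as y = 3.0, x1 = 3.0: an assumption)
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)
X  <- matrix(c(rep(1, 7), x1, x2), 7, 3)

XtY   <- t(X) %*% y                  # X'Y, a 3 x 1 matrix
b_hat <- solve(t(X) %*% X, XtY)      # solves (X'X) b = X'Y
print(round(b_hat, 5))

# Compare with the built-in least squares fit
print(coef(lm(y ~ x1 + x2)))
```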

The “hat matrix” is given by $H = X(X'X)^{-1}X'$.

The fitted Y values are obtained by $\hat{\mathbf{Y}} = H\mathbf{Y}$.
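A short R sketch of the hat matrix and fitted values for the example, under the assumption that the third data row is y = 3.0, x1 = 3.0, x2 = 20. It also checks two standard properties of a hat matrix: it is idempotent (HH = H) and its trace equals the number of coefficients, here 3.

```r
# Example data (row 3 taken as y = 3.0, x1 = 3.0: an assumption)
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)
X  <- matrix(c(rep(1, 7), x1, x2), 7, 3)

# Hat matrix H = X (X'X)^{-1} X', a 7 x 7 matrix
H <- X %*% solve(t(X) %*% X) %*% t(X)

# Fitted values: H "puts the hat" on y
y_hat <- H %*% y
print(round(cbind(y, y_hat), 3))     # observed next to fitted

# Properties of H
print(isTRUE(all.equal(H %*% H, H))) # idempotent
print(sum(diag(H)))                  # trace = k + 1
```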

Recall once more we are looking at the model


Compare with


Error Terms and Inference

A useful result is:

$$\frac{(n - k - 1)\,s^2}{\sigma^2} \sim \chi^2_{n-k-1}$$

where $n$ is the number of points and $k$ is the number of explanatory variables.

In addition we can show that:

$$\frac{\hat{b}_i - b_i}{\text{s.e.}(\hat{b}_i)} \sim t_{n-k-1}$$

where $\text{s.e.}(\hat{b}_i) = s\sqrt{c_{(i+1)(i+1)}}$ and $c_{(i+1)(i+1)}$ is the $(i+1)$th diagonal element of $(X'X)^{-1}$.

For our example, $(X'X)^{-1}$ was calculated earlier using R.

The diagonal elements of $(X'X)^{-1}$ are $c_{11} = 6.683$, $c_{22} = 0.7600$ and $c_{33} = 0.0053$. Note that $c_{11}$ is associated with $b_0$, $c_{22}$ with $b_1$ and $c_{33}$ with $b_2$.

We will calculate the standard error for $\hat{b}_1$. With $s = 0.2902$, this is $\sqrt{0.7600} \times 0.2902 = 0.2530$.
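The standard errors of all three coefficients can be computed in one step. This sketch assumes the example data with the third row read as y = 3.0, x1 = 3.0, x2 = 20, and uses s times the square root of each diagonal element of the inverse of X'X; the names `c_diag` and `se` are illustrative.

```r
# Example data (row 3 taken as y = 3.0, x1 = 3.0: an assumption)
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)
X  <- matrix(c(rep(1, 7), x1, x2), 7, 3)
n  <- 7   # number of points
k  <- 2   # number of explanatory variables

b_hat <- solve(t(X) %*% X, t(X) %*% y)
res   <- y - X %*% b_hat                 # residuals
s     <- sqrt(sum(res^2) / (n - k - 1))  # estimate of sigma

c_diag <- diag(solve(t(X) %*% X))        # c11, c22, c33
se     <- s * sqrt(c_diag)               # s.e. of b0, b1, b2 estimates
print(round(s, 4))
print(round(se, 4))
```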

The value of $\hat{b}_1$ is 0.8984. Now carry out a hypothesis test.

H0: $b_1 = 0$
H1: $b_1 \neq 0$

The standard error of $\hat{b}_1$ is 0.2530.

The test statistic is

$$t = \frac{\hat{b}_1 - b_1}{\text{s.e.}(\hat{b}_1)}.$$

This calculates as (0.8984 − 0)/0.2530 = 3.55.

t tables using 4 degrees of freedom give a cut-off point of 2.776 for 2.5%.
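The test statistic and the tabulated cut-off can be reproduced in R with `qt` and `pt`. This sketch uses the rounded estimate and standard error quoted in the slides; the variable names are illustrative.

```r
# Values quoted in the slides
b1_hat <- 0.8984   # estimate of b1
se_b1  <- 0.2530   # its standard error
df     <- 4        # n - k - 1 = 7 - 2 - 1

t_stat <- (b1_hat - 0) / se_b1          # test statistic under H0: b1 = 0
t_crit <- qt(0.975, df)                 # upper 2.5% point of t with 4 df
p_val  <- 2 * pt(-abs(t_stat), df)      # two-sided p-value

print(round(c(t_stat = t_stat, t_crit = t_crit), 3))
print(t_stat > t_crit)                  # TRUE means reject H0 at the 5% level
print(p_val)
```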

Since 3.55 > 2.776, we reject H0 and accept H1: there is evidence at the 5% level that $b_1$ is not zero.

The process can be repeated for the other $b$ values, and confidence intervals can be calculated in the usual way.

The 95% CI for $\sigma^2$ is based on the $\chi^2_4$ distribution of $4s^2/\sigma^2$:

((4 × 0.08422)/11.14, (4 × 0.08422)/0.4844), i.e. (0.030, 0.695)
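The chi-squared percentage points and the resulting interval can be checked in R with `qchisq`. This sketch uses the value of s squared quoted in the slides and assumes a 95% interval, which matches the percentage points 11.14 and 0.4844; the variable names are illustrative.

```r
s2 <- 0.08422   # estimate of sigma^2 quoted in the slides
df <- 4         # n - k - 1

upper_pt <- qchisq(0.975, df)   # upper 2.5% point of chi-squared with 4 df
lower_pt <- qchisq(0.025, df)   # lower 2.5% point

# 95% confidence interval for sigma^2
ci <- c(lower = df * s2 / upper_pt, upper = df * s2 / lower_pt)
print(round(c(upper_pt, lower_pt), 4))
print(round(ci, 3))
```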


The sum of squares of the residuals can also be calculated.
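One way to obtain the residual sum of squares in R is from a fitted `lm` object. The sketch below assumes the example data with the third row read as y = 3.0, x1 = 3.0, x2 = 20; the names `ss_res` and `s2` are illustrative.

```r
# Example data (row 3 taken as y = 3.0, x1 = 3.0: an assumption)
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

fit <- lm(y ~ x1 + x2)

ss_res <- sum(residuals(fit)^2)      # residual sum of squares
s2     <- ss_res / df.residual(fit)  # divide by n - k - 1 = 4
print(round(c(ss_res = ss_res, s2 = s2), 5))
```

For a linear model, `deviance(fit)` returns the same residual sum of squares.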