
CHAPTER 1 THE LINEAR REGRESSION MODEL: AN OVERVIEW Damodar Gujarati Econometrics by Example, second edition

THE LINEAR REGRESSION MODEL (LRM)
- The general form of the LRM is: $Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + \dots + B_k X_{ki} + u_i$
- Or, in short form: $Y_i = BX + u_i$
- Y is the regressand, X is a vector of regressors, and u is an error term.

POPULATION (TRUE) MODEL
$Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + \dots + B_k X_{ki} + u_i$
- This equation is known as the population or true model.
- It consists of two components:
  (1) A deterministic component, BX (the conditional mean of Y, or E(Y|X)).
  (2) A nonsystematic, or random, component, $u_i$.
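The split into a deterministic and a random component can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single regressor, normally distributed errors, and made-up values for the coefficients and the error standard deviation. The function name `simulate_population` is hypothetical.

```python
import random

def simulate_population(n, b1, b2, sigma, seed=0):
    """Draw n observations from Y_i = b1 + b2 * X_i + u_i (illustrative values only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)      # regressor value
        mean_y = b1 + b2 * x        # deterministic component, E(Y|X)
        u = rng.gauss(0, sigma)     # random (nonsystematic) component
        rows.append((x, mean_y, u, mean_y + u))
    return rows

rows = simulate_population(n=5, b1=2.0, b2=0.5, sigma=1.0)
for x, mean_y, u, y in rows:
    print(f"X={x:6.3f}  E(Y|X)={mean_y:6.3f}  u={u:7.3f}  Y={y:6.3f}")
```

Each printed Y is the sum of its conditional mean and its error draw, which is all the population model asserts.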

REGRESSION COEFFICIENTS
- $B_1$ is the intercept.
- $B_2$ to $B_k$ are the slope coefficients.
- Collectively, they are the regression coefficients or regression parameters.
- Each slope coefficient measures the (partial) rate of change in the mean value of Y for a unit change in the value of a regressor, ceteris paribus.
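A short numerical check may make the ceteris paribus reading concrete. The coefficient values below are invented for illustration, and `conditional_mean` is a hypothetical helper, not something from the text.

```python
def conditional_mean(x2, x3, b1=1.0, b2=0.8, b3=-0.3):
    """E(Y | X2, X3) for a hypothetical model with an intercept and two slopes."""
    return b1 + b2 * x2 + b3 * x3

# Raise X2 by one unit while holding X3 fixed (ceteris paribus).
change = conditional_mean(5.0, 2.0) - conditional_mean(4.0, 2.0)
print(round(change, 10))  # the change in the mean of Y equals b2, up to rounding
```

Whatever value X3 is held at, the difference is the same, which is why the slope is called a partial rate of change.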

SAMPLE REGRESSION FUNCTION
- The sample counterpart is: $Y_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + \dots + b_k X_{ki} + e_i$
- Or, in short form: $Y_i = bX + e_i$, where e is a residual.
- The deterministic component is written as: $\hat{Y}_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + \dots + b_k X_{ki} = bX$

THE NATURE OF THE Y VARIABLE
- Ratio scale: the ratio of two variables, the distance between two variables, and the ordering of variables are all meaningful.
- Interval scale: the distance and ordering between two variables are meaningful, but not the ratio.
- Ordinal scale: the ordering of two variables is meaningful (not the ratio or distance).
- Nominal scale: categorical or dummy variables, qualitative in nature.

THE NATURE OF DATA
- Time series data: a set of observations that a variable takes at different times, such as daily (e.g., stock prices), weekly (e.g., money supply), monthly (e.g., the unemployment rate), quarterly (e.g., GDP), annually (e.g., government budgets), quinquennially or every five years (e.g., the census of manufactures), or decennially or every ten years (e.g., the census of population).

THE NATURE OF DATA
- Cross-section data: data on one or more variables collected at the same point in time.
- Examples are the census of population conducted by the Census Bureau every 10 years, opinion polls conducted by various polling organizations, and temperature at a given time in several places.

THE NATURE OF DATA
- Panel, longitudinal, or micro-panel data: combines features of both cross-section and time series data.
- The same cross-sectional units are followed over time.
- Panel data are a special type of pooled data. Pooled data simply combine time series and cross-sectional observations, and the same cross-sectional units are not necessarily followed over time.

METHOD OF ORDINARY LEAST SQUARES
- The method of ordinary least squares (OLS) does not minimize the sum of the error terms; it minimizes the error sum of squares (ESS): $\sum u_i^2 = \sum (Y_i - B_1 - B_2 X_{2i} - B_3 X_{3i} - \dots - B_k X_{ki})^2$
- To obtain the values of the regression coefficients, the derivatives of ESS with respect to each regression coefficient are taken and set equal to zero.
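For the special case of one regressor plus an intercept, setting the two derivatives to zero gives closed-form solutions, which the sketch below implements. The data are made up for illustration, and `ols_two_variable` is a hypothetical name. With more regressors the same first-order conditions are usually solved as a linear system instead.

```python
def ols_two_variable(x, y):
    """Return (b1, b2) minimizing sum((y_i - b1 - b2 * x_i) ** 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx              # slope from the first-order conditions
    b1 = y_bar - b2 * x_bar     # intercept: the line passes through the sample means
    return b1, b2

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative data
y = [2.2, 4.1, 6.2, 7.9, 10.1]
b1, b2 = ols_two_variable(x, y)
residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
print(b1, b2)
# The first-order conditions imply that the residuals sum to zero and are
# orthogonal to x (both printed values are zero up to rounding error).
print(sum(residuals), sum(e * xi for e, xi in zip(residuals, x)))
```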

CLASSICAL LINEAR REGRESSION MODEL
- Assumptions of the Classical Linear Regression Model (CLRM):
- A-1: The model is linear in the parameters.
- A-2: The regressors are fixed or nonstochastic.
- A-3: Given X, the expected value of the error term is zero, or $E(u_i|X) = 0$.

CLASSICAL LINEAR REGRESSION MODEL
- Assumptions of the Classical Linear Regression Model (CLRM), continued:
- A-4: Homoscedastic, or constant, variance of $u_i$, or $var(u_i|X) = \sigma^2$.
- A-5: No autocorrelation, or $cov(u_i, u_j|X) = 0$, $i \neq j$.
- A-6: No multicollinearity, or no perfect linear relationships among the X variables.
- A-7: No specification bias.

GAUSS-MARKOV THEOREM
- On the basis of assumptions A-1 to A-7, the OLS method gives best linear unbiased estimators (BLUE):
  (1) The estimators are linear functions of the dependent variable Y.
  (2) The estimators are unbiased; in repeated applications of the method, the estimators on average equal their true values.
  (3) In the class of linear unbiased estimators, OLS estimators have minimum variance; i.e., they are efficient, or the "best" estimators.
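Unbiasedness in repeated sampling can be checked informally with a Monte Carlo experiment. The sketch below assumes a single regressor held fixed across samples, normal errors, and invented true coefficients. The function names are hypothetical, and the experiment illustrates property (2) only; it does not prove the theorem.

```python
import random

def ols_slope(x, y):
    """OLS slope estimate for a regression of y on x with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def mean_slope_estimate(true_b1=2.0, true_b2=0.5, sigma=1.0, n=30, reps=2000, seed=0):
    """Average the OLS slope over many samples drawn from the same true model."""
    rng = random.Random(seed)
    x = [float(i) for i in range(n)]    # regressors fixed across samples (A-2)
    slopes = []
    for _ in range(reps):
        y = [true_b1 + true_b2 * xi + rng.gauss(0, sigma) for xi in x]
        slopes.append(ols_slope(x, y))
    return sum(slopes) / reps

print(mean_slope_estimate())  # should be close to the true slope of 0.5
```

Any single sample's slope will miss 0.5 by some amount; it is the average over repeated samples that centers on the true value.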

HYPOTHESIS TESTING: t TEST
- To test the following hypothesis:
  $H_0: B_k = 0$
  $H_1: B_k \neq 0$
  we calculate $t = b_k / se(b_k)$ and use the t table to obtain the critical t value with n-k degrees of freedom for a given level of significance ($\alpha$, equal to 10%, 5%, or 1%).
- If the absolute value of the computed t exceeds the critical t value, we can reject $H_0$.
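The test can be sketched for the two-variable case as follows. The data are made up, `slope_t_statistic` is a hypothetical name, and the critical value is read from a t table (5% two-sided, 3 degrees of freedom) rather than computed.

```python
import math

def slope_t_statistic(x, y):
    """Return (b2, se_b2, t) for the two-variable regression of y on x."""
    n, k = len(x), 2                        # k counts the intercept and one slope
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - k)              # estimated error variance
    se_b2 = math.sqrt(sigma2_hat / sxx)     # standard error of the slope
    return b2, se_b2, b2 / se_b2

x = [1.0, 2.0, 3.0, 4.0, 5.0]               # made-up illustrative data
y = [2.2, 4.1, 6.2, 7.9, 10.1]
b2, se_b2, t = slope_t_statistic(x, y)
T_CRITICAL = 3.182                          # t table: 5% two-sided, n - k = 3 df
print(f"b2={b2:.3f} se={se_b2:.4f} t={t:.2f} reject H0: {abs(t) > T_CRITICAL}")
```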

HYPOTHESIS TESTING: t TEST
- An alternative method is to see whether zero lies within the confidence interval: $b_k \pm t_{\alpha/2}\, se(b_k)$
- If zero lies in this interval, we cannot reject $H_0$.
- The p-value gives the exact level of significance, or the lowest level of significance at which we can reject $H_0$.
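The interval check takes only a few lines. In this sketch the estimate, its standard error, and the critical value are hypothetical inputs (the critical value would come from a t table), and `confidence_interval` is an illustrative name.

```python
def confidence_interval(b_k, se_b_k, t_critical):
    """Return the (lower, upper) bounds of b_k +/- t_critical * se(b_k)."""
    half_width = t_critical * se_b_k
    return b_k - half_width, b_k + half_width

# Hypothetical estimate and standard error; 3.182 is the 5% two-sided
# critical t value for 3 degrees of freedom.
lower, upper = confidence_interval(b_k=1.96, se_b_k=0.0383, t_critical=3.182)
contains_zero = lower <= 0.0 <= upper
print(f"[{lower:.4f}, {upper:.4f}] contains zero: {contains_zero}")
```

Because the interval excludes zero here, the conclusion matches the one a t test at the same significance level would give.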

GOODNESS OF FIT, $R^2$
- $R^2$, the coefficient of determination, is an overall measure of goodness of fit of the estimated regression line.
- It gives the proportion (or percentage) of the total variation in the dependent variable that is explained by the regressors.
- It is a value between 0 (no fit) and 1 (perfect fit).
- Let: $TSS = \sum (Y_i - \bar{Y})^2$ be the total sum of squares and $RSS = \sum e_i^2$ the residual sum of squares.
- Then: $R^2 = 1 - RSS/TSS$
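As a rough illustration, the coefficient of determination can be computed from the observed and fitted values as one minus the ratio of the residual sum of squares to the total sum of squares. The numbers below are invented, with the fitted values taken to come from an OLS line that includes an intercept; `r_squared` is a hypothetical helper.

```python
def r_squared(y, fitted):
    """R^2 = 1 - RSS/TSS for observed values y and OLS fitted values."""
    y_bar = sum(y) / len(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)                  # total variation in Y
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # unexplained variation
    return 1.0 - rss / tss

y = [2.2, 4.1, 6.2, 7.9, 10.1]               # made-up observations
fitted = [2.18, 4.14, 6.10, 8.06, 10.02]     # hypothetical OLS fitted values
print(round(r_squared(y, fitted), 4))
```

The interpretation as a share of explained variation relies on the fitted values coming from OLS with an intercept; for arbitrary predictions the same formula can fall below zero.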

HYPOTHESIS TESTING: F TEST
- Testing the following hypothesis is equivalent to testing the hypothesis that all the slope coefficients are 0:
  $H_0: R^2 = 0$
  $H_1: R^2 \neq 0$
- Calculate $F = \dfrac{R^2/(k-1)}{(1-R^2)/(n-k)}$ and use the F table to obtain the critical F value with k-1 degrees of freedom in the numerator and n-k degrees of freedom in the denominator for a given level of significance.
- If the computed F is greater than the critical F value, reject $H_0$.
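The F statistic follows directly from $R^2$, the sample size n, and the number of estimated coefficients k. The inputs below are hypothetical, `f_statistic` is an illustrative name, and the critical value is read from an F table (5% level, 2 and 20 degrees of freedom).

```python
def f_statistic(r2, n, k):
    """F = (R^2 / (k - 1)) / ((1 - R^2) / (n - k)); k includes the intercept."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))

# Hypothetical regression: R^2 = 0.5 from n = 23 observations and k = 3 coefficients.
f_value = f_statistic(r2=0.5, n=23, k=3)
F_CRITICAL = 3.49    # F table: 5% level, 2 numerator and 20 denominator df
print(f"F={f_value:.2f} reject H0: {f_value > F_CRITICAL}")
```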