Topic 1.3 – Linear Panel Data Regression

1/62: Topic 1.3 – Linear Panel Data Regression
Microeconometric Modeling
William Greene, Stern School of Business, New York University, New York NY USA
1.3 Linear Panel Data Regression Models

2/62: Concepts
• Unbalanced Panel
• Cluster Estimator
• Block Bootstrap
• Difference in Differences
• Incidental Parameters Problem
• Endogeneity
• Instrumental Variable
• Control Function Estimator
• Mundlak Form
• Correlated Random Effects
• Hausman Test
• Lagrange Multiplier (LM) Test
• Variable Addition (Wu) Test
Models
• Linear Regression
• Fixed Effects LR Model
• Random Effects LR Model

5/62: German Socioeconomic Panel

6/62: Balanced and Unbalanced Panels
• Distinction: balanced vs. unbalanced panels
• A notation to help with mechanics: z_i,t, i = 1, …, N; t = 1, …, T_i
• The role of the assumption
  - Mathematical and notational convenience: balanced, n = NT; unbalanced, n = Σ_i T_i
  - The fixed T_i assumption is almost never necessary.
  - If unbalancedness is due to nonrandom attrition from an otherwise balanced panel, then this will require special considerations.
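
The bookkeeping behind this notation can be sketched in a few lines. This is only an illustration: the long-format layout (i, t, z_it), the data values, and the helper names `group_sizes` and `is_balanced` are assumptions, not part of the slides.

```python
from collections import Counter

# Hypothetical long-format panel: one record (i, t, z_it) per observation.
records = [
    (1, 1, 0.5), (1, 2, 0.7), (1, 3, 0.6),
    (2, 1, 1.1), (2, 2, 0.9),
    (3, 1, 0.3), (3, 2, 0.4), (3, 3, 0.2),
]

def group_sizes(rows):
    """Return {i: T_i}, the number of observed periods for each group."""
    return dict(Counter(i for i, _, _ in rows))

def is_balanced(rows):
    """A panel is balanced when every group has the same T_i."""
    return len(set(group_sizes(rows).values())) == 1

sizes = group_sizes(records)
n = sum(sizes.values())  # n = sum of T_i; this equals N*T only when balanced
```

Here group 2 is observed for two periods only, so the panel is unbalanced and n = 8 rather than N × T = 9.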

7/62: An Unbalanced Panel: RWM’s GSOEP Data on Health Care
N = 7,293 households
Some households exited then returned

8/62: Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years (Extracted from NLSY.)
Variables in the file are
EXP   = work experience
WKS   = weeks worked
OCC   = occupation, 1 if blue collar
IND   = 1 if manufacturing industry
SOUTH = 1 if resides in south
SMSA  = 1 if resides in a city (SMSA)
MS    = 1 if married
FEM   = 1 if female
UNION = 1 if wage set by union contract
ED    = years of education
LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

10/62: Common Effects Models
• Unobserved individual effects in regression: E[y_it | x_it, c_i]
• Notation: y_it = x_it'β + c_i + ε_it
• Linear specification:
  - Fixed Effects: E[c_i | X_i] = g(X_i). Cov[x_it, c_i] ≠ 0; effects are correlated with included variables.
  - Random Effects: E[c_i | X_i] = μ; effects are uncorrelated with included variables. If X_i contains a constant term, μ = 0 WLOG. Common: Cov[x_it, c_i] = 0, but E[c_i | X_i] = μ is needed for the full model.

11/62: Convenient Notation
• Fixed Effects: the ‘dummy variable model’. Individual specific constant terms.
• Random Effects: the ‘error components model’. Compound (“composed”) disturbance.

12/62: Estimating β
• β is the partial effect of interest
• Can it be estimated (consistently) in the presence of (unmeasured) c_i?
  - Does pooled least squares “work?”
  - Strategies for “controlling for c_i” using the sample data.

13/62: 1. The Pooled Regression
• Presence of omitted effects
• Potential bias/inconsistency of OLS: depends on ‘fixed’ or ‘random’
  - If FE, X is endogenous: omitted variables bias.
  - If RE, OLS is OK but standard errors are incorrect.

14/62: OLS with Individual Effects
The omitted variable(s) are the group means

15/62: Ordinary Least Squares
• Standard results for OLS in a generalized regression model
  - Consistent if RE, inconsistent if FE.
  - Inefficient in all cases.
• True variance

16/62: Estimating the Sampling Variance of b
• b may or may not be consistent for β. We estimate its variance regardless.
• s²(X'X)⁻¹ is not the correct matrix
  - Correlation across observations: Yes
  - Heteroscedasticity: Maybe
• Is there a “robust” covariance matrix?
  - Robust estimation (in general)
  - The White estimator for heteroscedasticity
  - A robust estimator for OLS

17/62: A Cluster Estimator

18/62: Clustering of Observations: Repeat Sales of Monet Paintings

19/62: Alternative OLS Variance Estimators
Cluster correction usually increases SEs
+----------+-------------+----------------+----------+----------+
| Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] |
+----------+-------------+----------------+----------+----------+
| Constant |  5.40159723 |      .04838934 |  111.628 |    .0000 |
| EXP      |   .04084968 |      .00218534 |   18.693 |    .0000 |
| EXPSQ    |  -.00068788 |    .480428D-04 |  -14.318 |    .0000 |
| OCC      |  -.13830480 |      .01480107 |   -9.344 |    .0000 |
| SMSA     |   .14856267 |      .01206772 |   12.311 |    .0000 |
| MS       |   .06798358 |      .02074599 |    3.277 |    .0010 |
| FEM      |  -.40020215 |      .02526118 |  -15.843 |    .0000 |
| UNION    |   .09409925 |      .01253203 |    7.509 |    .0000 |
| ED       |   .05812166 |      .00260039 |   22.351 |    .0000 |
+----------+-------------+----------------+----------+----------+
Robust
+----------+-------------+----------------+----------+----------+
| Constant |  5.40159723 |      .10156038 |   53.186 |    .0000 |
| EXP      |   .04084968 |      .00432272 |    9.450 |    .0000 |
| EXPSQ    |  -.00068788 |    .983981D-04 |   -6.991 |    .0000 |
| OCC      |  -.13830480 |      .02772631 |   -4.988 |    .0000 |
| SMSA     |   .14856267 |      .02423668 |    6.130 |    .0000 |
| MS       |   .06798358 |      .04382220 |    1.551 |    .1208 |
| FEM      |  -.40020215 |      .04961926 |   -8.065 |    .0000 |
| UNION    |   .09409925 |      .02422669 |    3.884 |    .0001 |
| ED       |   .05812166 |      .00555697 |   10.459 |    .0000 |
+----------+-------------+----------------+----------+----------+
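
Why the cluster correction tends to raise standard errors can be seen in a toy case. The sketch below is not the routine that produced the table: it assumes a single regressor plus a constant, made-up data in which both x and the disturbance are constant within each cluster, and no finite-sample correction. The function name `ols_slope_with_ses` is hypothetical.

```python
import math

def ols_slope_with_ses(x, y, cluster):
    """Slope of y on (1, x) with conventional and cluster-robust standard errors."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]                  # x with the constant partialled out
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    a = ybar - b * xbar
    e = [yi - a - b * xi for xi, yi in zip(x, y)]  # OLS residuals
    # Conventional: s^2 / Sxx, which treats the residuals as uncorrelated.
    se_conventional = math.sqrt(sum(r * r for r in e) / (n - 2) / sxx)
    # Cluster: sum x*e within each cluster first, then square and add up.
    score = {}
    for d, r, g in zip(dx, e, cluster):
        score[g] = score.get(g, 0.0) + d * r
    se_cluster = math.sqrt(sum(s * s for s in score.values())) / sxx
    return b, se_conventional, se_cluster

# Made-up data: 4 clusters of 3 observations, common shock u_g within each cluster.
x = [1.0] * 3 + [2.0] * 3 + [3.0] * 3 + [4.0] * 3
u = [0.5] * 3 + [-0.5] * 3 + [-0.5] * 3 + [0.5] * 3
y = [xi + ui for xi, ui in zip(x, u)]
cluster = [0] * 3 + [1] * 3 + [2] * 3 + [3] * 3

b, se_conventional, se_cluster = ols_slope_with_ses(x, y, cluster)
```

With these numbers the slope is 1, the conventional variance is 0.02 and the cluster variance is 0.05. Summing within clusters before squaring is what lets the within-group correlation show up in the estimate.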

20/62: Results of Simple Bootstrap Estimation

21/62: The bootstrap replication must account for the panel data nature of the data set.
Bootstrap variance for a panel data estimator
• Panel Bootstrap = Block Bootstrap
• Data set is N groups of size T_i
• Bootstrap sample is N groups of size T_i drawn with replacement.
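
A minimal sketch of the block bootstrap described above, assuming the panel is held as a dictionary from group id to that group's observations. The estimator (a pooled mean), the data, and the names `draw_blocks` and `block_bootstrap_se` are illustrative choices, not from the slides.

```python
import random
import statistics

def draw_blocks(group_ids, rng):
    """Draw N group ids with replacement; each id stands for a whole block."""
    return [rng.choice(group_ids) for _ in group_ids]

def block_bootstrap_se(groups, estimator, reps=200, seed=0):
    """Bootstrap standard error of `estimator`, resampling whole groups."""
    rng = random.Random(seed)
    ids = list(groups)
    replicates = []
    for _ in range(reps):
        sample = []
        for g in draw_blocks(ids, rng):
            sample.extend(groups[g])  # all T_i observations of group g stay together
        replicates.append(estimator(sample))
    return statistics.stdev(replicates)

# Hypothetical unbalanced panel: group id -> observations on y.
groups = {"a": [1.0, 1.2, 0.9], "b": [2.1, 2.0], "c": [0.4, 0.5, 0.6, 0.5]}

def pooled_mean(obs):
    return sum(obs) / len(obs)

se = block_bootstrap_se(groups, pooled_mean)
```

Resampling individual observations instead would break up the groups and ignore the within-group correlation, which is the point of the slide.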

23/62: Difference-in-Differences Model
With two periods and strict exogeneity of D and T, [equation on slide image]
This is a linear regression model. If there are no regressors, [equation on slide image]
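
For the no-regressors case, the standard result is that the interaction coefficient in a regression of y on 1, D, T and D×T equals the change in the treated group's mean minus the change in the control group's mean. The sketch below computes that quantity directly. The data and the function name `did_estimate` are made up for illustration, and D and T are assumed to be 0/1 indicators for treatment group and post period.

```python
def did_estimate(rows):
    """rows: iterable of (y, D, T) with D, T in {0, 1}. Returns the DiD estimate."""
    cells = {(d, t): [] for d in (0, 1) for t in (0, 1)}
    for y, d, t in rows:
        cells[(d, t)].append(y)
    mean = {k: sum(v) / len(v) for k, v in cells.items()}
    change_treated = mean[(1, 1)] - mean[(1, 0)]
    change_control = mean[(0, 1)] - mean[(0, 0)]
    return change_treated - change_control

# Hypothetical outcomes: (y, D, T)
rows = [
    (10.0, 0, 0), (12.0, 0, 0),   # control, before
    (13.0, 0, 1), (15.0, 0, 1),   # control, after
    (20.0, 1, 0), (22.0, 1, 0),   # treated, before
    (27.0, 1, 1), (29.0, 1, 1),   # treated, after
]
effect = did_estimate(rows)  # (28 - 21) - (14 - 11) = 4
```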

24/62: Difference in Differences

25/62: UK Office of Fair Trading, May 2012
http://dera.ioe.ac.uk/14610/1/oft1416.pdf

26/62: Outcome is the fees charged. Activity is collusion on fees.

27/62: Treatment schools: treatment is an intervention by the Office of Fair Trading.
Control schools were not involved in the conspiracy.
Treatment is not voluntary.

28/62: Apparent Impact of the Intervention

30/62: Treatment (Intervention) Effect = 1 + 2 if SS school

31/62: In order to test robustness, two versions of the fixed effects model were run. The first is Ordinary Least Squares, and the second is heteroscedasticity and auto-correlation robust (HAC) standard errors in order to check for heteroscedasticity and autocorrelation.

33/62: The cumulative impact of the intervention is the area between the two paths from intervention to time T.

35/62: 2. Estimation with Fixed Effects
• The fixed effects model
• c_i is arbitrarily correlated with x_it but E[ε_it | X_i, c_i] = 0
• Dummy variable representation

36/62: The Fixed Effects Model
y_i = X_i β + d_i α_i + ε_i, for each individual
E[c_i | X_i] = g(X_i); effects are correlated with included variables.
Cov[x_it, c_i] ≠ 0

37/62: Estimating the Fixed Effects Model
• The FEM is a plain vanilla regression model but with many independent variables.
• Least squares is unbiased, consistent, efficient, but inconvenient if N is large.

38/62: The Within Transformation Removes the Effects
Wooldridge notation for data in deviations from group means
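
A small sketch of the within transformation, assuming one regressor and noise-free made-up data so that the result is exact. The function name `within_transform` and the effect values in `c` are hypothetical.

```python
def within_transform(ids, values):
    """Return each value as a deviation from its own group mean."""
    total, count = {}, {}
    for i, v in zip(ids, values):
        total[i] = total.get(i, 0.0) + v
        count[i] = count.get(i, 0) + 1
    return [v - total[i] / count[i] for i, v in zip(ids, values)]

# Hypothetical panel: y_it = c_i + 2 * x_it, with very different c_i per group.
ids = [1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 6.0]
c = {1: 5.0, 2: -3.0}
y = [c[i] + 2.0 * xi for i, xi in zip(ids, x)]

x_dev = within_transform(ids, x)
y_dev = within_transform(ids, y)

# Least squares on the deviations: the c_i have dropped out entirely.
b_within = sum(xd * yd for xd, yd in zip(x_dev, y_dev)) / sum(xd * xd for xd in x_dev)
```

Because c_i is constant within a group, it equals its own group mean and vanishes from the deviations, so the slope of 2 is recovered whatever values the c_i take.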

39/62: Least Squares Dummy Variable Estimator
• b is obtained by ‘within’ groups least squares (group mean deviations)
• Normal equations for a are D'Xb + D'Da = D'y, so a = (D'D)⁻¹ D'(y − Xb)
• Notes:
  - This is simple algebra: the estimator is just OLS.
  - Least squares is an estimator, not a model. (Repeat twice.)
  - Note what a_i is when T_i = 1. Follow this with y_it − a_i − x_it'b = 0 if T_i = 1.
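
Since D'D is diagonal with the T_i on its diagonal, the normal equations give a_i = ȳ_i − x̄_i'b. The sketch below, for one regressor and made-up data, computes b and the a_i this way and includes a group with T_i = 1 to illustrate the last note. The function name `lsdv` is hypothetical.

```python
def lsdv(ids, x, y):
    """Within slope b and group constants a_i = ybar_i - b * xbar_i (one regressor)."""
    obs = {}
    for i, xi, yi in zip(ids, x, y):
        obs.setdefault(i, []).append((xi, yi))
    means = {i: (sum(p[0] for p in v) / len(v), sum(p[1] for p in v) / len(v))
             for i, v in obs.items()}
    sxx = sum((xi - means[i][0]) ** 2 for i, xi in zip(ids, x))
    sxy = sum((xi - means[i][0]) * (yi - means[i][1]) for i, xi, yi in zip(ids, x, y))
    b = sxy / sxx
    a = {i: ybar - b * xbar for i, (xbar, ybar) in means.items()}
    return b, a

# Hypothetical data; group 3 has a single observation (T_i = 1).
ids = [1, 1, 1, 2, 2, 3]
x = [1.0, 2.0, 3.0, 2.0, 4.0, 5.0]
y = [1.4, 2.1, 2.5, 0.0, 1.0, 7.0]

b, a = lsdv(ids, x, y)
residual_singleton = y[5] - a[3] - b * x[5]
```

The singleton contributes nothing to b (its deviations from its own mean are zero), and its constant a_3 absorbs the observation completely, so its residual is zero.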

40/62: Inference About OLS
• Assume strict exogeneity: Cov[ε_it, (x_js, c_j)] = 0. Every disturbance in every period for each person is uncorrelated with variables and effects for every person and across periods.
• Now, it’s just least squares in a classical linear regression model.
• Asy.Var[b] = …

41/62: Application: Cornwell and Rupert

42/62: LSDV Results
Note huge changes in the coefficients. SMSA and MS change signs. Significance changes completely.
Pooled OLS

43/62: Estimated Fixed Effects

44/62: The Effect of the Effects
R² rises from .26510 to .90542

45/62: Robust Covariance Matrix for LSDV
Cluster estimator for the within estimator
Effect is less pronounced than for OLS

46/62: Endogeneity in the FEM
y_i = X_i β + d_i α_i + ε_i for each individual
E[w_i | X_i] = g(X_i); effects are correlated with included variables.
Cov[x_it, w_i] ≠ 0
X is endogenous because of the correlation between x_it and w_i

47/62: The within (LSDV) estimator is an instrumental variable (IV) estimator

48/62: LSDV is a Control Function Estimator

49/62: LSDV is a Control Function Estimator

50/62: The problem here is the estimator of the disturbance variance. The matrix is OK. Note, for example, .01374007/.01950085 (top panel) = .16510/.23432 (bottom panel); both ratios are approximately .7046.

52/62: 3. The Random Effects Model
The random effects model: c_i is uncorrelated with x_it for all t;
E[c_i | X_i] = 0
E[ε_it | X_i, c_i] = 0

53/62: Random vs. Fixed Effects
• Random Effects
  - Small number of parameters
  - Efficient estimation
  - Objectionable orthogonality assumption (c_i ⊥ X_i)
• Fixed Effects
  - Robust: generally consistent
  - Large number of parameters
  - More reasonable assumption
  - Precludes time invariant regressors
• Which is the more reasonable model?

54/62: Mundlak’s Estimator
Mundlak, Y., “On the Pooling of Time Series and Cross Section Data,” Econometrica, 46, 1978, pp. 69-85.

55/62: Mundlak Form of FE Model
+----------+-------------+----------------+----------+----------+------------+
| Variable | Coefficient | Standard Error | b/St.Er. | P[|Z|>z] | Mean of X  |
+----------+-------------+----------------+----------+----------+------------+
x(i,t)
| OCC      |  -.02021384 |      .01375165 |   -1.470 |    .1416 |  .51116447 |
| SMSA     |  -.04250645 |      .01951727 |   -2.178 |    .0294 |  .65378151 |
| MS       |  -.02946444 |      .01915264 |   -1.538 |    .1240 |  .81440576 |
| EXP      |   .09665711 |      .00119262 |   81.046 |    .0000 | 19.8537815 |
z(i)
| FEM      |  -.34322129 |      .05725632 |   -5.994 |    .0000 |  .11260504 |
| ED       |   .05099781 |      .00575551 |    8.861 |    .0000 | 12.8453782 |
Means of x(i,t) and constant
| Constant |  5.72655261 |      .10300460 |   55.595 |    .0000 |            |
| OCCB     |  -.10850252 |      .03635921 |   -2.984 |    .0028 |  .51116447 |
| SMSAB    |   .22934020 |      .03282197 |    6.987 |    .0000 |  .65378151 |
| MSB      |   .20453332 |      .05329948 |    3.837 |    .0001 |  .81440576 |
| EXPB     |  -.08988632 |      .00165025 |  -54.468 |    .0000 | 19.8537815 |
+----------+-------------+----------------+----------+----------+------------+
Estimates: Var[e] = .0235632, Var[u] = .0773825

56/62: Correlated Random Effects

57/62: LM Tests
+----------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i)       |
| Estimates:  Var[e]              = .216794D+02      |
|             Var[u]              = .958560D+01      |
|             Corr[v(i,t),v(i,s)] = .306592          |
| Lagrange Multiplier Test vs. Model (3) = 4419.33   |
| ( 1 df, prob value = .000000)                      |
| (High values of LM favor FEM/REM over CR model.)   |
| Baltagi-Li form of LM Statistic        = 1618.75   |
+----------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i)       |
| Estimates:  Var[e]              = .210257D+02      |
|             Var[u]              = .860646D+01      |
|             Corr[v(i,t),v(i,s)] = .290444          |
| Lagrange Multiplier Test vs. Model (3) = 1561.57   |
| ( 1 df, prob value = .000000)                      |
| (High values of LM favor FEM/REM over CR model.)   |
| Baltagi-Li form of LM Statistic        = 1561.57   |
+----------------------------------------------------+
Unbalanced Panel: #(T=1) = 1525, #(T=2) = 1079, #(T=3) = 825, #(T=4) = 926, #(T=5) = 1051, #(T=6) = 1200, #(T=7) = 887
Balanced Panel: T = 7
REGRESS ; Lhs=docvis ; Rhs=one, hhninc, age, female, educ ; panel $
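
The reported Corr[v(i,t),v(i,s)] can be checked from the two variance estimates. With v(i,t) = e(i,t) + u(i) and uncorrelated components, the correlation between two different periods of the same group is Var[u] / (Var[e] + Var[u]). The snippet below only redoes that arithmetic on the printed values; the function name `re_correlation` is illustrative.

```python
def re_correlation(var_e, var_u):
    """Corr[v_it, v_is] for t != s in the one-way error components model."""
    return var_u / (var_e + var_u)

# Printed estimates from the two output boxes (D+02 and D+01 are powers of ten).
rho_first = re_correlation(21.6794, 9.58560)
rho_second = re_correlation(21.0257, 8.60646)
```

Both values agree with the printed .306592 and .290444 to within the rounding of the six-digit variance estimates.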

58/62: A One Way REM

59/62: A Variable Addition Test
• Asymptotically equivalent to Hausman
• Also equivalent to Mundlak formulation
• In the random effects model, using FGLS
  - Only applies to time varying variables.
  - Add expanded group means to the regression (i.e., observation i,t gets the same group means for all t).
  - Use a standard F or Wald test to test whether the coefficients on the means equal 0. A large F or chi-squared weighs against the random effects specification.
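
The "expanded group means" step can be sketched as follows. Only the construction of the added column is shown; the random effects (FGLS) regression and the Wald test are not implemented here. The data and the function name `expanded_group_means` are made up for illustration.

```python
def expanded_group_means(ids, values):
    """Return a column of the same length where row (i, t) holds the mean of group i."""
    total, count = {}, {}
    for i, v in zip(ids, values):
        total[i] = total.get(i, 0.0) + v
        count[i] = count.get(i, 0) + 1
    return [total[i] / count[i] for i in ids]

# Hypothetical time-varying regressor in an unbalanced panel.
ids = [1, 1, 2, 2, 2]
exp_ = [1.0, 3.0, 2.0, 4.0, 6.0]

expb = expanded_group_means(ids, exp_)  # [2.0, 2.0, 4.0, 4.0, 4.0]
```

One such column would be built for each time-varying regressor and added to the right-hand side. A time-invariant variable equals its own group mean, which is why the test applies only to time-varying variables.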

60/62: Variable Addition

61/62: Application: Wu Test
NAMELIST ; XV = exp, expsq, wks, occ, ind, south, smsa, ms, union $
NAMELIST ; (new) xmeans = expb, expsqb, wksb, occb, indb, southb, smsab, msb, unionb $
CREATE ; xmeans = Group Mean(xv, pds=ti) $
REGRESS ; Lhs = lwage ; Rhs = xmeans, Xv, ed, fem, one ; panel ; random ; Test: xmeans $

62/62: Means Added

63/62: Use OLS instead with robust standard errors. Identical results.