The Power of Regression
Slides: 61
The Power of Regression
• Previous research literature claim
  • Foreign-owned manufacturing plants have greater levels of strike activity than domestic plants
  • In Canada, strike rates of 25.5% versus 20.3%
• Budd's claim
  • Foreign-owned plants are larger and located in strike-prone industries
  • Need multivariate regression analysis!
The Power of Regression
Dependent Variable: Strike Incidence

                                  (1)        (2)        (3)
U.S. Corporate Parent           0.230**    0.201*     0.065
  (Canadian Parent omitted)    (0.117)    (0.119)    (0.132)
Number of Employees (1000s)      ---       0.177**    0.094**
                                          (0.019)    (0.020)
Industry Effects?                No         No         Yes
Sample Size                     2,170

* Statistically significant at the 0.10 level; ** at the 0.05 level (two-tailed tests).
Important Regression Topics
• Prediction
  • Various confidence and prediction intervals
• Diagnostics
  • Are assumptions for estimation & testing fulfilled?
• Specifications
  • Quadratic terms? Logarithmic dep. vars.?
• Additional hypothesis tests
  • Partial F tests
• Dummy dependent variables
  • Probit and logit models
Confidence Intervals
• The true population [whatever] is within the following interval (1 − α)% of the time:
  Estimate ± t(α/2) × Standard Error of the Estimate
• Just need
  • Estimate
  • Standard Error
  • Shape / distribution (including degrees of freedom)
Prediction Interval for a New Observation at xp
1. Point estimate: ŷ = b0 + b1·xp
2. Standard error: Se · sqrt(1 + 1/n + (xp − x̄)² / Sxx)
3. Shape
  • t distribution with n−k−1 d.f.
4. So the prediction interval for a new observation is ŷ ± t(α/2) × standard error
Siegel, p. 481
Prediction Interval for the Mean of Observations at xp
1. Point estimate: ŷ = b0 + b1·xp
2. Standard error: Se · sqrt(1/n + (xp − x̄)² / Sxx)
3. Shape
  • t distribution with n−k−1 d.f.
4. So the prediction interval for the mean of observations at xp is ŷ ± t(α/2) × standard error
Siegel, p. 483
Earlier Example: Hours of Study (x) and Exam Score (y)
1. Find 95% CI for Joe's exam score (studies for 20 hours)
2. Find 95% CI for mean score for those who studied for 20 hours

Regression Statistics
Multiple R        0.770
R Squared         0.594
Adj. R Squared    0.543
Standard Error   10.710
Obs.                 10

ANOVA
             df        SS        MS        F    Significance
Regression    1  1340.452  1340.452   11.686     0.009
Residual      8   917.648   114.706
Total         9  2258.100

            Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept   39.401     12.153      3.242    0.012      11.375      67.426
hours        2.122      0.621      3.418    0.009       0.691       3.554

x̄ = 18.80
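The two intervals above can be computed from the printed output. A minimal sketch in Python, assuming t(0.025, 8) ≈ 2.306 from a t table and recovering Sxx from the identity SSregression = b1²·Sxx (an algebraic fact for simple regression, not printed by Excel):

```python
import math

# Values from the hours-of-study regression output above (n = 10, k = 1)
b0, b1 = 39.401, 2.122      # intercept and slope
s = 10.710                  # standard error of the regression
n, xbar = 10, 18.80         # sample size and mean of x
ss_reg = 1340.452           # regression sum of squares
t_crit = 2.306              # t(0.025, 8) from a t table

# Recover Sxx from SS_regression = b1^2 * Sxx
sxx = ss_reg / b1 ** 2

xp = 20.0
y_hat = b0 + b1 * xp        # point estimate at 20 hours of study

# 1. Prediction interval for one new observation (Joe's score)
se_new = s * math.sqrt(1 + 1 / n + (xp - xbar) ** 2 / sxx)
pi = (y_hat - t_crit * se_new, y_hat + t_crit * se_new)

# 2. Confidence interval for the MEAN score at 20 hours (no "1 +", so narrower)
se_mean = s * math.sqrt(1 / n + (xp - xbar) ** 2 / sxx)
ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)

print(round(y_hat, 1))                 # about 81.8
print([round(v, 1) for v in pi])       # wide interval for one student
print([round(v, 1) for v in ci])       # narrow interval for the mean
```

Note how the interval for Joe alone is much wider than the interval for the average of all 20-hour studiers, because a single observation carries the full error variance.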
Diagnostics / Misspecification
• For estimation & testing to be valid…
  • y = b0 + b1x1 + b2x2 + … + bkxk + e makes sense
  • Errors (ei) are independent
    • of each other
    • of the independent variables
  • Homoskedasticity
    • Error variance is independent of the independent variables
    • σe² is a constant: Var(ei) does not vary with xi (i.e., no heteroskedasticity)
• Violations render our inferences invalid and misleading!
Common Problems
• Misspecification
  • Omitted variable bias
  • Nonlinear rather than linear relationship
  • Levels, logs, or percent changes?
• Data problems
  • Skewed variables and outliers
  • Multicollinearity
  • Sample selection (non-random data)
  • Missing data
• Problems with residuals (error terms)
  • Non-independent errors
  • Heteroskedasticity
Omitted Variable Bias
• Question 3 from Sample Exam B:
  wage = 9.05 + 1.39 union
        (1.65)  (0.66)
  wage = 9.56 + 1.42 union + 3.87 ability
        (1.49)  (0.56)       (1.56)
  wage = -3.03 + 0.60 union + 0.25 revenue
        (0.70)  (0.45)        (0.08)
• H. Farber thinks the average union wage differs from the average nonunion wage because unionized employers are more selective and hire individuals with higher ability.
• M. Friedman thinks the average union wage differs from the average nonunion wage because unionized employers have different levels of revenue per employee.
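Farber's story is exactly the omitted-variable mechanism: ability raises both union status and wages, so leaving it out inflates the union coefficient. A small simulation sketch with purely synthetic data (the numbers and the true effect of 1.0 are invented for illustration, not Farber's estimates):

```python
import random

random.seed(1)

def slope(x, y):
    # OLS slope from a one-variable regression of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Synthetic data: ability raises both the chance of union membership and the wage
n = 5000
ability = [random.gauss(0, 1) for _ in range(n)]
union = [1 if a + random.gauss(0, 1) > 0 else 0 for a in ability]
wage = [9 + 1.0 * u + 2.0 * a + random.gauss(0, 1) for u, a in zip(union, ability)]

# Short regression that omits ability: the union coefficient absorbs
# part of ability's effect and is biased upward
b_short = slope(union, wage)
print(round(b_short, 2))  # well above the true union effect of 1.0
```

Controlling for ability (as in the second equation on the slide) is what pulls the union coefficient back toward its true value.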
Checking the Assumptions
• How to check the validity of the assumptions?
• Cynicism, realism, and theory
• Robustness checks
  • Check different specifications
  • But don't just choose the best one!
• Automated variable selection methods
  • e.g., stepwise regression (Siegel, p. 547)
• Misspecification and other tests
• Examine diagnostic plots
Diagnostic Plots
Increasing spread might indicate heteroskedasticity. Try transformations or weighted least squares.
Diagnostic Plots
“Tilt” from outliers might indicate skewness. Try a log transformation.
Problematic Outliers
Stock Performance and CEO Golf Handicaps (New York Times, 5-31-98)

Without 7 “outliers”:
Number of obs = 44, R-squared = 0.1718
stockrating | Coef.    Std. Err.    t      P>|t|
handicap    | -1.711     .580     -2.95    0.005
_cons       | 73.234    8.992      8.14    0.000

With the 7 “outliers”:
Number of obs = 51, R-squared = 0.0017
stockrating | Coef.    Std. Err.    t      P>|t|
handicap    |  -.173     .593     -0.29    0.771
_cons       | 55.137    9.790      5.63    0.000
Are They Really Outliers?
• The diagnostic plot is OK
• BE CAREFUL!
Stock Performance and CEO Golf Handicaps (New York Times, 5-31-98)
Diagnostic Plots
Curvature might indicate nonlinearity. Try a quadratic specification.
Diagnostic Plots
Good diagnostic plot: lacks obvious indications of other problems.
Adding a Squared (Quadratic) Term
Job performance regressed on salary (in $1,000s) (Egg Data)

Number of obs = 576, F(2, 573) = 122.42, Prob > F = 0.0000
R-squared = 0.2994, Adj R-squared = 0.2969, Root MSE = 1.0218

Source   |    SS     df     MS
Model    | 255.61     2   127.80
Residual | 598.22   573    1.044
Total    | 853.83   575    1.485

job performance | Coef.       Std. Err.    t      P>|t|
salary          |  .0980844   .0260215    3.77    0.000
salary squared  | -.000337    .0001905   -1.77    0.077
_cons           | -1.720966   .8720358   -1.97    0.049

Salary Squared = Salary² (=salary^2 in Excel)
Quadratic Regression
Job perf = -1.72 + 0.098 salary − 0.00034 salary squared
Quadratic regression (nonlinear)
Quadratic Regression
Job perf = -1.72 + 0.098 salary − 0.00034 salary squared
• The effect of salary will eventually turn negative
• But where? The maximum is at salary = −(linear coeff.) / (2 × quadratic coeff.)
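The turning point follows from setting the derivative of the fitted quadratic to zero. A quick check using the full-precision coefficients from the regression output (the rounded slide coefficients would give roughly 144 instead):

```python
# Turning point of the quadratic fit: job performance is maximized where
# d(perf)/d(salary) = b1 + 2*b2*salary = 0, i.e. salary = -b1 / (2*b2)
b1 = 0.0980844      # salary coefficient
b2 = -0.000337      # salary-squared coefficient

peak = -b1 / (2 * b2)
print(round(peak, 1))  # about 145.5, i.e. roughly $145,500 (salary is in $1,000s)
```

Beyond that salary, the fitted effect of additional salary on job performance turns negative.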
Another Specification Possibility
• If data are very skewed, can try a log specification
• Can use logs instead of levels for independent and/or dependent variables
• Note that the interpretation of the coefficients will change
• Re-familiarize yourself with Siegel, pp. 68-69
Quick Note on Logs
• a is the natural logarithm of x if 2.71828^a = x, or e^a = x
• The natural logarithm is abbreviated “ln”: ln(x) = a
• In Excel, use the ln function
• We call this the “log” but don't use Excel's “log” function (that one is base 10)!
• Usefulness: spreads out small values and narrows large values, which can reduce skewness
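Both points can be checked directly, using Python's math.log as a stand-in for Excel's ln (the earnings numbers below are invented to illustrate right skew):

```python
import math

# ln and e^a are inverses
a = math.log(20)
print(round(math.e ** a, 6))  # recovers 20.0

# Logs narrow large values and spread small ones, taming right skew:
earnings = [200, 250, 300, 400, 600, 1000, 5000]  # right-skewed, sorted
logs = [math.log(v) for v in earnings]

def mean(values):
    return sum(values) / len(values)

# A mean far above the median signals right skew; in logs the gap shrinks
print(mean(earnings) - earnings[3])          # large gap in levels
print(round(mean(logs) - logs[3], 2))        # small gap after taking logs
```

This is why the earnings regressions later in the deck switch from weekly earnings to log weekly earnings.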
Earnings Distribution
Skewed to the right.
Weekly Earnings from the March 2002 CPS, n=15,000
Residuals from Levels Regression
Skewed to the right, so use of the t distribution is suspect.
Residuals from a regression of Weekly Earnings on demographic characteristics
Log Earnings Distribution
Not perfectly symmetrical, but better.
Natural Logarithm of Weekly Earnings from the March 2002 CPS, i.e., =ln(weekly earnings)
Residuals from Log Regression
Almost symmetrical, so use of the t distribution is probably OK.
Residuals from a regression of Log Weekly Earnings on demographic characteristics
Hypothesis Tests
• We've been doing hypothesis tests for single coefficients
  • H0: βi = 0; reject if |t| > t(α/2, n−k−1)
  • HA: βi ≠ 0
• What about testing more than one coefficient at the same time?
  • e.g., want to see if an entire group of 10 dummy variables for 10 industries should be in the model
• Joint tests can be conducted using partial F tests
Partial F Tests
H0: β1 = β2 = β3 = … = βC = 0
HA: at least one βi ≠ 0
• How to test this? Consider two regressions
  • One as if H0 is true, i.e., β1 = β2 = β3 = … = βC = 0
    • This is a “restricted” (or constrained) model
  • Plus a “full” (or unconstrained) model in which the computer can estimate what it wants for each coefficient
Partial F Tests
• Statistically, need to distinguish between
  • Full regression “no better” than the restricted regression
  • versus
  • Full regression “significantly better” than the restricted regression
• To do this, look at the variance of prediction errors
  • If this declines significantly, then reject H0
• From ANOVA, we know the ratio of two variances has an F distribution
  • So use an F test
Partial F Tests
• The partial F statistic is
  F = [(SSresidual,restricted − SSresidual,full) / C] / [SSresidual,full / (n−k−1)]
• SSresidual = sum of squares residual; C = number of constraints
• The partial F statistic has C, n−k−1 degrees of freedom
• Reject H0 if F > F(α, C, n−k−1)
Coal Mining Example (Again)
Regression Statistics
R Squared         0.955
Adj. R Squared    0.949
Standard Error  108.052
Obs.                 47

ANOVA
             df            SS             MS          F    Significance
Regression    6    9975694.933    1662615.822   142.406      0.000
Residual     40     467007.875      11675.197
Total        46   10442702.809

            Coeff.    Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept  -168.510    258.819     -0.651    0.519    -691.603     354.583
hours         1.244      0.186      6.565    0.000       0.001       0.002
tons          0.048      0.403      0.119    0.906      -0.001
unemp        19.618      5.660      3.466    0.001       8.178      31.058
WWII        159.851     78.218      2.044    0.048       1.766     317.935
Act 1952     -9.839    100.045     -0.098    0.922    -212.038     192.360
Act 1969   -203.010    111.535     -1.820    0.076    -428.431      22.411
Minitab Output
Predictor     Coef      St.Dev      T       P
Constant   -168.5      258.8     -0.65    0.519
hours         1.2235     0.186    6.56    0.000
tons          0.0478     0.403    0.12    0.906
unemp        19.618      5.660    3.47    0.001
WWII        159.85      78.22     2.04    0.048
Act 1952     -9.8      100.0     -0.10    0.922
Act 1969   -203.0      111.5     -1.82    0.076

S = 108.1   R-Sq = 95.5%   R-Sq(adj) = 94.9%

Analysis of Variance
Source       DF         SS         MS        F        P
Regression    6    9975695    1662616   142.41    0.000
Error        40     467008      11675
Total        46   10442703
Is the Overall Model Significant?
H0: β1 = β2 = β3 = … = β6 = 0
HA: at least one βi ≠ 0
• Note: for testing the overall model, C = k, i.e., testing all coefficients together
• From the previous slides, we have SSresidual for the “full” (or unconstrained) model
  • SSresidual = 467,007.875
• But what about for the restricted (H0 true) regression?
  • Estimate a constant-only regression
Constant-Only Model
Regression Statistics
R Squared             0
Adj. R Squared        0
Standard Error  476.461
Obs.                 47

ANOVA
             df            SS            MS       F    Significance
Regression    0             0             0       .         .
Residual     46   10442702.809    227015.278
Total        46   10442702.809

            Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept  671.937     69.499      9.668    0.0000    532.042     811.830
Partial F Tests
H0: β1 = β2 = β3 = … = β6 = 0
HA: at least one βi ≠ 0
F = [(10,442,702.809 − 467,007.875) / 6] / [467,007.875 / 40] = 142.406
• Reject H0 if F > F(α, C, n−k−1) = F(0.05, 6, 40) = 2.34
• 142.406 > 2.34, so reject H0. Yes, the overall model is significant
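The arithmetic of this overall-model partial F test is just the formula from two slides back with the two SSresidual values plugged in:

```python
# Overall-model partial F from the two SS_residual values above
ss_res_restricted = 10442702.809   # constant-only model (equals SS_total)
ss_res_full = 467007.875           # full coal mining model
C, n, k = 6, 47, 6                 # constraints, observations, regressors

F = ((ss_res_restricted - ss_res_full) / C) / (ss_res_full / (n - k - 1))
print(round(F, 3))  # matches the 142.406 reported in the ANOVA table
```

Since 142.406 far exceeds the 5% critical value of 2.34, the model is jointly significant.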
Select F Distribution 5% Critical Values
[Table of 5% F critical values, with numerator degrees of freedom (1, 2, 3, 4, 5, 6, …) across the top and denominator degrees of freedom (1, 2, 3, 8, 10, 11, 12, 18, 40, 1000, …) down the side. The extraction scrambled the cell values; the entry used above is F(0.05, 6, 40) = 2.34.]
A Small Shortcut
For the constant-only model, SSresidual = SStotal = 10,442,702.809, which is already reported in the full model's ANOVA table. So to test the overall model, you don't need to run a constant-only model.
[Same coal mining regression output as above.]
An Even Better Shortcut
In fact, the ANOVA table's F test (F = 142.406, Significance 0.000) is exactly the test of the overall model being significant (recall the earlier unit on ANOVA).
[Same coal mining regression output as above.]
Testing Any Subset
The partial F test can be used to test any subset of variables. For example:
H0: βWWII = βAct1952 = βAct1969 = 0
HA: at least one βi ≠ 0
[Same coal mining regression output as above.]
Restricted Model
Restricted regression with βWWII = βAct1952 = βAct1969 = 0 (Obs. 47)

ANOVA
             df            SS             MS          F    Significance
Regression    3    9837344.760    3279114.920   232.923      0.000
Residual     43     605358.049      14078.094
Total        46   10442702.809

            Coeff.   Std. Error   t stat   p value
Intercept  147.821    166.406      0.888    0.379
hours       0.0015     0.0001     20.522    0.000
tons       -0.0008     0.0003     -2.536    0.015
unemp       7.298      4.386       1.664    0.103
Partial F Tests
H0: βWWII = βAct1952 = βAct1969 = 0
HA: at least one βi ≠ 0
F = [(605,358.049 − 467,007.875) / 3] / [467,007.875 / 40] = 3.950
• Reject H0 if F > F(α, C, n−k−1) = F(0.05, 3, 40) = 2.84
• 3.95 > 2.84, so reject H0. Yes, the subset of three coefficients is jointly significant
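The same formula handles the subset test, swapping in the restricted model's SSresidual and C = 3 constraints:

```python
# Subset partial F: drop WWII, Act 1952, and Act 1969 from the full model
ss_res_restricted = 605358.049   # restricted model (three dummies removed)
ss_res_full = 467007.875         # full model
C, n, k = 3, 47, 6

F = ((ss_res_restricted - ss_res_full) / C) / (ss_res_full / (n - k - 1))
print(round(F, 2))  # 3.95, versus the 5% critical value F(3, 40) = 2.84
```

Note the contrast with the individual t-tests: Act 1952 and Act 1969 look weak one at a time, yet the three war/legislation dummies are jointly significant.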
Blocks Regression and Two-Way ANOVA
         Treatments
Block    A    B    C
1       10    9    8
2       12    6    5
3       18   15   14
4       20   18   18
5        8    7    8

“Stack” the data using dummy variables: one column per treatment (A, B, C) plus block dummies B2–B5, e.g.:
A  B  C  B2  B3  B4  B5  Value
1  0  0   0   0   0   0    10
1  0  0   1   0   0   0    12
1  0  0   0   1   0   0    18
0  1  0   0   0   0   0     9
…
Recall Two-Way Results
ANOVA: Two-Factor Without Replication
Source of Variation      SS      df     MS        F      P-value   F crit
Blocks                312.267     4   78.067   38.711     0.000     3.84
Treatment              26.533     2   13.267    6.579     0.020     4.46
Error                  16.133     8    2.017
Total                 354.933    14
Regression and Two-Way ANOVA
Number of obs = 15, F(6, 8) = 28.00, Prob > F = 0.0001
R-squared = 0.9545, Adj R-squared = 0.9205, Root MSE = 1.4201

Source   |    SS      df     MS
Model    | 338.800     6   56.467
Residual |  16.133     8    2.017
Total    | 354.933    14   25.352

treatment | Coef.    Std. Err.    t      P>|t|   [95% Conf. Int]
b         | -2.600     .898     -2.89    0.020   -4.671    -.529
c         | -3.000     .898     -3.34    0.010   -5.071    -.929
b2        | -1.333    1.160     -1.15    0.283   -4.007    1.340
b3        |  6.667    1.160      5.75    0.000    3.993    9.340
b4        |  9.667    1.160      8.34    0.000    6.993   12.340
b5        | -1.333    1.160     -1.15    0.283   -4.007    1.340
_cons     | 10.867     .970     11.20    0.000    8.630   13.104
Regression and Two-Way ANOVA
Use these SSresidual values to do partial F tests and you will get exactly the same answers as the two-way ANOVA tests.

Regression excerpt for the full model:
Source   |    SS      df     MS
Model    | 338.800     6   56.467
Residual |  16.133     8    2.017
Total    | 354.933    14   25.352

Regression excerpt for b2 = b3 = … = 0:
Source   |    SS      df     MS
Model    |  26.533     2   13.267
Residual | 328.400    12   27.367
Total    | 354.933    14   25.352

Regression excerpt for b = c = 0:
Source   |    SS      df     MS
Model    | 312.267     4   78.067
Residual |  42.667    10    4.267
Total    | 354.933    14   25.352
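A quick check that the partial F tests built from these three excerpts reproduce the two-way ANOVA F statistics (numbers hardcoded from the output above):

```python
# Full dummy-variable model: SS_residual = 16.133 on 8 degrees of freedom
ss_res_full = 16.133
ms_error_full = ss_res_full / 8

# Blocks test: restrict b2 = b3 = b4 = b5 = 0 (C = 4 constraints)
F_blocks = ((328.400 - ss_res_full) / 4) / ms_error_full

# Treatments test: restrict b = c = 0 (C = 2 constraints)
F_treat = ((42.667 - ss_res_full) / 2) / ms_error_full

print(round(F_blocks, 2), round(F_treat, 2))  # 38.71 and 6.58, as in the ANOVA table
```

The match confirms that two-way ANOVA is just a special case of regression with dummy variables.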
Select F Distribution 5% Critical Values
[Same 5% F critical value table as before, extended with a numerator d.f. = 9 column. The relevant entries for the two-way ANOVA tests are F(0.05, 4, 8) = 3.84 and F(0.05, 2, 8) = 4.46.]
3 Seconds of Calculus
Regression Coefficients
• y = b0 + b1x (linear form)
  • A 1-unit change in x changes y by b1
• log(y) = b0 + b1x (semi-log form)
  • A 1-unit change in x changes y by b1 × 100 percent
• log(y) = b0 + b1 log(x) (double-log form)
  • A 1-percent change in x changes y by b1 percent
Log Regression Coefficients
• wage = 9.05 + 1.39 union
  • Predicted wage is $1.39 higher for unionized workers (on average)
• log(wage) = 2.20 + 0.15 union
  • Semi-elasticity: predicted wage is approximately 15% higher for unionized workers (on average)
• log(wage) = 1.61 + 0.30 log(profits)
  • Elasticity: a one percent increase in profits increases predicted wages by approximately 0.3 percent
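The "approximately 15%" reading is the small-coefficient approximation; the exact implied difference from a semi-log coefficient b is e^b − 1, which can be checked directly:

```python
import math

b = 0.15  # union coefficient from the semi-log wage regression above

# Exact implied proportional wage difference for union vs. nonunion
exact = math.exp(b) - 1
print(round(100 * exact, 1))  # 16.2 percent, close to the 15% shorthand
```

For small coefficients (roughly |b| < 0.1) the shorthand is fine; for larger ones the e^b − 1 conversion is worth doing.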
Multicollinearity
Auto repair records, weight, and engine size
Number of obs = 69, F(2, 66) = 6.84, Prob > F = 0.0020
R-squared = 0.1718, Adj R-squared = 0.1467, Root MSE = .91445

repair  | Coef.      Std. Err.    t      P>|t|
weight  | -.00017     .00038    -0.41    0.685
engine  | -.00313     .00328    -0.96    0.342
_cons   | 4.50161     .61987     7.26    0.000
Multicollinearity
• Two (or more) independent variables are so highly correlated that a multiple regression can't disentangle the unique contributions of each
  • Large standard errors and lack of statistical significance for individual coefficients
  • But joint significance
• Identifying multicollinearity
  • Some say “rule of thumb |r| > 0.70” (or 0.80)
  • But better to look at results
• OK for prediction
• Bad for assessing theory
Prediction With Multicollinearity
Prediction at the mean (weight = 3019 and engine = 197)

Model for prediction     Predicted Repair (Mean)   Lower 95% Limit   Upper 95% Limit
Multiple Regression             3.411                   3.191             3.631
Weight Only                     3.412                   3.193             3.632
Engine Only                     3.410                   3.192             3.629
Dummy Dependent Variables
• Dummy dependent variables
  • y = b0 + b1x1 + … + bkxk + e, where y is a {0, 1} indicator variable
• Examples
  • Do you intend to quit? yes/no
  • Did the worker receive training? yes/no
  • Do you think the President is doing a good job? yes/no
  • Was there a strike? yes/no
  • Did the company go bankrupt? yes/no
Linear Probability Model
• Mathematically/computationally, we can estimate the regression as usual (the monkeys won't know the difference)
• This is called a “linear probability model”
  • The right-hand side is linear
  • And is estimating probabilities: P(y=1) = b0 + b1x1 + … + bkxk
• b1 = 0.15 (for example) means that a one-unit change in x1 increases the probability that y = 1 by 0.15 (fifteen percentage points)
Linear Probability Model
• Excel won't know the difference, but perhaps it should
• Linear probability model problems
  • σe² = P(y=1) × [1 − P(y=1)]
  • But P(y=1) = b0 + b1x1 + … + bkxk
  • So σe² is not constant (heteroskedasticity)
  • Predicted probabilities are not bounded by 0, 1
  • R² is not an accurate measure of predictive ability
    • Can use a pseudo-R² measure, such as percent correctly predicted
Logit Model & Probit Model
• Solution to these problems is to use nonlinear functional forms that bound P(y=1) between 0 and 1
• Logit model (logistic regression):
  P(y=1) = e^(b0 + b1x1 + … + bkxk) / (1 + e^(b0 + b1x1 + … + bkxk))
  • Recall, ln(x) = a when e^a = x
• Probit model:
  P(y=1) = Φ(b0 + b1x1 + … + bkxk)
  • where Φ is the normal cumulative distribution function
Logit Model & Probit Model
• Nonlinear, so need a statistical package to do the calculations
• Can do individual (z-tests, not t-tests) and joint statistical testing as with other regressions
  • Also confidence intervals
• Need to convert coefficients to marginal effects for interpretation
• Should be aware of these models
  • Though in many cases, a linear probability model works just fine
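The two link functions are easy to sketch with the standard library, which makes the bounding property concrete (statistics.NormalDist supplies the normal CDF Φ):

```python
import math
from statistics import NormalDist

def logit_prob(xb):
    # Logistic CDF: P(y=1) = e^xb / (1 + e^xb), always strictly in (0, 1)
    return 1 / (1 + math.exp(-xb))

def probit_prob(xb):
    # Probit: P(y=1) = Phi(xb), the standard normal CDF
    return NormalDist().cdf(xb)

# Unlike the linear probability model, extreme index values stay bounded
for xb in (-3.0, 0.0, 3.0):
    p_logit, p_probit = logit_prob(xb), probit_prob(xb)
    assert 0 < p_logit < 1 and 0 < p_probit < 1

print(round(logit_prob(0.0), 2), round(probit_prob(0.0), 2))  # both 0.5 at xb = 0
```

Both curves are S-shaped and agree closely near the middle; they differ mainly in the tails, which is why logit and probit results are usually similar in practice.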
Example
Dep. var.: 1 if you know of the FMLA, 0 otherwise
Probit estimates: Number of obs = 1189, LR chi2(14) = 232.39, Prob > chi2 = 0.0000
Log likelihood = -707.94377, Pseudo R2 = 0.1410

FMLAknow  | Coef.    Std. Err.    z      P>|z|   [95% Conf. Int]
union     |  .238      .101      2.35    0.019     .039     .436
age       | -.002      .018     -0.13    0.897    -.038     .033
agesq     |  .135      .219      0.62    0.536    -.293     .564
nonwhite  | -.571      .098     -5.80    0.000    -.764    -.378
income    | 1.465      .393      3.73    0.000     .696    2.235
incomesq  | -5.854    2.853     -2.05    0.040   -11.45    -.262
[other controls omitted]
_cons     | -1.188     .328     -3.62    0.000   -1.831    -.545
Marginal Effects
• For numerical interpretation/prediction, need to convert coefficients to marginal effects
• Example: logit model
  • ln[P(y=1) / (1 − P(y=1))] = b0 + b1x1 + … + bkxk
  • So b1 gives the effect on ln[P/(1−P)], the log odds, not on P(y=1)
• Probit is similar
• Can re-arrange to find the effect on P(y=1)
  • Usually do this at the sample means
Marginal Effects
For numerical interpretation/prediction, need to convert coefficients to marginal effects.
Probit estimates: Number of obs = 1189, LR chi2(14) = 232.39, Prob > chi2 = 0.0000
Log likelihood = -707.94377, Pseudo R2 = 0.1410

FMLAknow  | dF/dx    Std. Err.    z      P>|z|   [95% Conf. Int]
union     |  .095      .040      2.35    0.019     .017     .173
age       | -.001      .007     -0.13    0.897    -.015     .013
agesq     |  .054      .087      0.62    0.536    -.117     .225
nonwhite  | -.222      .036     -5.80    0.000    -.293    -.151
income    |  .585      .157      3.73    0.000     .278     .891
incomesq  | -2.335    1.138     -2.05    0.040   -4.566    -.105
[other controls omitted]
But a Linear Probability Model Is OK, Too
                  Probit Coeff.   Probit Marginal   Regression
Union                 0.238           0.095            0.084
                     (0.101)         (0.040)          (0.035)
Nonwhite             -0.571          -0.222           -0.192
                     (0.098)         (0.037)          (0.033)
Income                1.465           0.585            0.442
                     (0.393)         (0.157)          (0.091)
Income Squared       -5.854          -2.335           -1.354
                     (2.853)         (1.138)          (0.316)

So regression is usually OK, but you should still be familiar with logit and probit methods.
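The coefficient-to-marginal-effect conversion behind the middle column follows the standard formulas, evaluated at some index value xb (usually the sample means). A sketch with placeholder values, not the actual FMLA model:

```python
import math
from statistics import NormalDist

def logit_marginal(b, xb):
    # Logit: dP/dx = b * P * (1 - P), with P the logistic CDF at xb
    p = 1 / (1 + math.exp(-xb))
    return b * p * (1 - p)

def probit_marginal(b, xb):
    # Probit: dP/dx = b * phi(xb), with phi the standard normal density
    return b * NormalDist().pdf(xb)

# Illustrative coefficient b = 0.5 evaluated at xb = 0 (i.e., P = 0.5)
print(round(logit_marginal(0.5, 0.0), 3))   # 0.125: the familiar "divide by 4" rule
print(round(probit_marginal(0.5, 0.0), 3))  # 0.199: b times phi(0) = 0.3989
```

Because P(1−P) and φ(xb) are largest near P = 0.5, marginal effects shrink toward zero in the tails, which the constant-slope linear probability model cannot capture.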