Lesson 11 Topics Statistical procedures PROC LOGIST REG


Lesson 11 - Topics
  • Statistical procedures: PROC LOGIST, REG
  • Multiple logistic and linear regression
  • Introduction to SAS macros
Programs 21-22

Logistic Regression
Model a binary factor (yes/no) as a function of one or more independent variables.
TOMHS example: smoking as a function of age, gender, race, and education.
log(p/(1-p)) = b0 + b1*x1 + b2*x2 + ... + bk*xk
(p is the probability of the outcome; x1 to xk are the independent variables.)

DATA stat;
  INFILE '~/SAS_Files/tomhsfull.data';
  INPUT @1   ptid    $10.
        @27  age     2.
        @30  sex     1.
        @32  race    1.
        @49  educ    1.
        @51  eversmk 1.
        @53  nowsmk  1.
        @180 energy  5. ;
  if race = 2 then aa = 1; else aa = 0;
  if sex = 2 then women = 1; else women = 0;
  if educ in(1, 2, 3, 4, 5, 6) then collgrad = 0;
  else if educ in(7, 8, 9) then collgrad = 1;
  if eversmk = 2 then currsmk = 2; else currsmk = nowsmk;

if eversmk = 2 then currsmk = 2; else currsmk = nowsmk;

Did you ever smoke cigarettes? 1 = yes, 2 = no (variable: eversmk)
Do you now smoke cigarettes? 1 = yes, 2 = no (variable: nowsmk)
Note: The second question is answered only if the first question is answered yes. Participants who never smoked therefore have no value for nowsmk, and the statement above assigns them currsmk = 2 (not a current smoker).

PROC MEANS;
  VAR age women collgrad aa;
  CLASS currsmk;
RUN;

currsmk   N Obs   Variable     N     Mean
-----------------------------------------
1            98   age         98    52.31
                  women       98     0.44
                  collgrad    98     0.23
                  aa          98     0.45
2           801   age        801    55.08
                  women      801     0.38
                  collgrad   799     0.38
                  aa         801     0.17
-----------------------------------------

ODS SELECT ParameterEstimates OddsRatios;
PROC LOGIST;
  MODEL currsmk = age women collgrad aa;
RUN;

Because currsmk is coded 1 = yes and 2 = no, the procedure by default models the probability of the lower value, currsmk = 1 (current smoker).

Analysis of Maximum Likelihood Estimates
                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1     1.7422     1.0235       2.8976       0.0887
age          1    -0.0732     0.0189      15.0704       0.0001
women        1    -0.2367     0.2407       0.9672       0.3254
collgrad     1    -0.6866     0.2618       6.8805       0.0087
aa           1     1.3394     0.2416      30.7354       <.0001

Odds Ratio Estimates
              Point        95% Wald
Effect     Estimate    Confidence Limits
age           0.929     0.896     0.964
women         0.789     0.492     1.265
collgrad      0.503     0.301     0.841
aa            3.817     2.377     6.128

OR = exp(estimate)
OR (age) = exp(-0.07) = 0.93
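As a check on the output above, the odds ratio and 95% Wald limits reported for aa can be reproduced by hand from its estimate and standard error. This is a sketch that assumes the usual Wald interval, estimate ± 1.96 × SE on the log-odds scale, which the slide itself does not spell out:

```latex
\begin{aligned}
\mathrm{OR}_{aa} &= \exp(1.3394) \approx 3.817 \\
\text{lower limit} &= \exp(1.3394 - 1.96 \times 0.2416) = \exp(0.8659) \approx 2.377 \\
\text{upper limit} &= \exp(1.3394 + 1.96 \times 0.2416) = \exp(1.8129) \approx 6.128
\end{aligned}
```

The interval excludes 1, which agrees with the small p-value for aa in the parameter estimates.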

Comparison of univariate versus multivariate results

Multivariate
                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1     1.7422     1.0235       2.8976       0.0887
age          1    -0.0732     0.0189      15.0704       0.0001
women        1    -0.2367     0.2407       0.9672       0.3254
collgrad     1    -0.6866     0.2618       6.8805       0.0087
aa           1     1.3394     0.2416      30.7354       <.0001

Univariate (separate regression runs)
                            Standard         Wald
Parameter   DF   Estimate      Error   Chi-Square   Pr > ChiSq
Intercept    1     1.7422     1.0235       2.8976       0.0887
age          1    -0.0736     0.0185      15.8221       <.0001
women        1     0.2561     0.2162       1.4026       0.2363
collgrad     1    -0.6945     0.2492       7.7635       0.0053
aa           1     1.4091     0.2242      39.5071       <.0001

Note: In TOMHS, women are more likely than men to be AA, and AA participants are more likely to be smokers. This is why the estimate for women is positive in the univariate run (0.2561) but negative after adjusting for the other variables (-0.2367).

Linear Regression
Model a continuous factor as a function of one or more independent variables.
TOMHS example: energy (calories) intake as a function of age, gender, race, and education.

ODS SELECT ParameterEstimates;
PROC REG;
  MODEL energy = age women collgrad aa;
RUN;

The REG Procedure
Model: MODEL1
Dependent Variable: energy

Parameter Estimates
                   Parameter     Standard
Variable    DF      Estimate        Error   t Value   Pr > |t|
Intercept    1    3574.78842    184.91689     19.33     <.0001
age          1     -20.67969      3.25993     -6.34     <.0001
women        1    -570.45804     44.34733    -12.86     <.0001
collgrad     1    -109.19062     44.01230     -2.48     0.0133
aa           1    -253.62159     54.07279     -4.69     <.0001

Energy = 3575 - 21*age - 570*women - 109*collgrad - 254*aa
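To see how the fitted equation is used, here is a worked prediction for a hypothetical participant, chosen for illustration and not taken from the data: a 50-year-old woman who is a college graduate and not AA, so age = 50, women = 1, collgrad = 1, aa = 0. Coefficients are rounded to two decimals from the Parameter Estimates table.

```latex
\begin{aligned}
\widehat{\text{energy}} &= 3574.79 - 20.68(50) - 570.46(1) - 109.19(1) - 253.62(0) \\
&= 3574.79 - 1034.00 - 570.46 - 109.19 \\
&\approx 1861 \text{ calories}
\end{aligned}
```

Each coefficient is the expected difference in energy intake for a one-unit change in that variable with the others held fixed, so the indicator variables simply add or drop their term.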

Multivariate Analysis
                   Parameter     Standard
Variable    DF      Estimate        Error   t Value   Pr > |t|
age          1     -20.67969      3.25993     -6.34     <.0001
women        1    -570.45804     44.34733    -12.86     <.0001
collgrad     1    -109.19062     44.01230     -2.48     0.0133
aa           1    -253.62159     54.07279     -4.69     <.0001

Univariate Analysis (separate regression runs)
                   Parameter     Standard
Variable    DF      Estimate        Error   t Value   Pr > |t|
age          1     -17.1154       3.60184     -4.75     <.0001
women        1    -595.40078     43.74189    -13.61     <.0001
collgrad     1      41.21749     48.61549      0.85     0.3968
aa           1    -388.19448     57.32940     -6.77     <.0001

Note: Women are less likely to be college graduates and also have lower caloric intake. This is why the collgrad estimate is positive (and not significant) in the univariate run but negative once gender is in the model.

PROC MEANS;
  VAR energy;
  CLASS women aa collgrad;
RUN;

Analysis Variable : energy

women   aa   collgrad   N Obs     N       Mean
----------------------------------------------
0       0    0            277   276   2445.043
             1            213         2338.319
        1    0             42    42   2141.714
             1             23    23   1992.261
1       0    0            162         1795.938
             1             71    71   1853.366
        1    0             92    92   1694.196
             1             20    20   1532.300
----------------------------------------------

Macro Variables and Use

LIBNAME t 'C:\SAS_Files';
%let nut = kcalbl dcholbl calcbl sodbl;
%let cat = clinic;

DATA temp;
  SET t.tomhs (KEEP=ptid &nut &cat);
RUN;

PROC MEANS DATA=temp;
  VAR &nut;
  CLASS &cat;
  TITLE "PROC Means results for variables &nut by &cat";
RUN;

* Makes it easy to modify code;

Macro Variables
%let macrovarname = characters ;
  • Defined using the %LET statement
  • Referenced by using &macrovarname
  • SAS substitutes the value of macrovarname when it encounters &macrovarname
  • Useful for making a program easy to modify
  • Usually put near the top of the program
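The substitution rule in the list above can be tried with a few lines of code. This is a minimal sketch; the macro variable name cutoff and the title text are made up for illustration and do not come from the course programs. It also shows one detail worth knowing: macro variables are resolved inside double quotes but not inside single quotes.

```sas
/* Hypothetical macro variable, defined near the top of the program */
%let cutoff = 90;

/* Double quotes: the reference is replaced by the value 90 */
title "Participants with DBP above &cutoff";

/* Single quotes: the text is left unchanged, so no substitution happens */
title2 'Participants with DBP above &cutoff';

/* Write the resolved value to the log as a quick check */
%put NOTE: cutoff resolves to &cutoff;
```

Changing the value on the %let line is then enough to update every place the macro variable is referenced.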

Simple Macro to Shorten Code
Suppose I want to compute the change from baseline in 4 variables at 3 time points. A macro can help.

Variables:
  dbp12, dbp24, dbp36 and dbpbl
  sbp12, sbp24, sbp36 and sbpbl
  chol12, chol24, chol36 and cholbl
  gluc12, gluc24, gluc36 and glucbl

%macro change(v);
  dbpdif&v  = dbp&v  - dbpbl;
  sbpdif&v  = sbp&v  - sbpbl;
  choldif&v = chol&v - cholbl;
  glucdif&v = gluc&v - glucbl;
%mend change;

option mprint; * Shows code generated in the log;
data temp;
  set temp;
  %change(12);
  %change(24);
  %change(36);
run;

Simple Macro to Shorten Code
Log output with the MPRINT option on:

%change(12);
MPRINT(CHANGE):   dbpdif12 = dbp12 - dbpbl;
MPRINT(CHANGE):   sbpdif12 = sbp12 - sbpbl;
MPRINT(CHANGE):   choldif12 = chol12 - cholbl;
MPRINT(CHANGE):   glucdif12 = gluc12 - glucbl;
%change(24);
MPRINT(CHANGE):   dbpdif24 = dbp24 - dbpbl;
MPRINT(CHANGE):   sbpdif24 = sbp24 - sbpbl;
MPRINT(CHANGE):   choldif24 = chol24 - cholbl;
MPRINT(CHANGE):   glucdif24 = gluc24 - glucbl;
%change(36);
MPRINT(CHANGE):   dbpdif36 = dbp36 - dbpbl;
MPRINT(CHANGE):   sbpdif36 = sbp36 - sbpbl;
MPRINT(CHANGE):   choldif36 = chol36 - cholbl;
MPRINT(CHANGE):   glucdif36 = gluc36 - glucbl;
run;

SAS substitutes the value of v everywhere there is an &v.

Another Macro Example
Goal of the macro named summary: for a given dataset, give summary statistics using PROC CONTENTS, MEANS, and FREQ and (optionally) display the data using PROC PRINT.
Instead of having to write the code each time, write a macro.

%macro summary (dataset=, mvar=_numeric_, fvar=_character_, print=N, pvar=_all_);

Here summary is the name of the macro, and the list in parentheses gives the parameters to the macro with their defaults:
  dataset: Name of the dataset used
  mvar: List of variables to run for PROC MEANS (default is all numeric variables)
  fvar: List of variables to run for PROC FREQ (default is all character variables)
  print: If set to Y, then run PROC PRINT (default is N)
  pvar: List of variables to run for PROC PRINT (default is all variables)

Remember: a SAS macro generates SAS code when you call it.

%macro summary (dataset=, mvar=_numeric_, fvar=_character_, print=N, pvar=_all_);
  proc contents data=&dataset varnum; run;
  proc means data=&dataset; var &mvar; run;
  proc freq data=&dataset; tables &fvar; run;
  %if &print = Y %then %do;
    proc print data=&dataset; var &pvar; run;
  %end;
%mend summary;

The %if block generates the PROC PRINT code only if the macro variable print equals Y.

CALL TO MACRO:
libname t 'C:/PH6420/data/';
data tomhs; set t.tomhs; run;
option mprint;
* Call with only dataset given;
%summary (dataset=tomhs);

Code generated (shown in the log on MPRINT(SUMMARY) lines):
proc contents data=tomhs varnum; run;
proc means data=tomhs; var _numeric_; run;
proc freq data=tomhs; tables _character_; run;

CALL TO MACRO:
libname t 'C:/PH6420/data/';
data tomhs; set t.tomhs; run;
option mprint;
%summary (dataset=tomhs, print=Y);

Code generated (shown in the log on MPRINT(SUMMARY) lines):
proc contents data=tomhs varnum; run;
proc means data=tomhs; var _numeric_; run;
proc freq data=tomhs; tables _character_; run;
proc print data=tomhs; var _all_; run;

CALL TO MACRO:
libname t 'C:/PH6420/data/';
data tomhs; set t.tomhs; run;
option mprint;
%summary (dataset=tomhs, fvar=clinic sex);

Code generated (shown in the log on MPRINT(SUMMARY) lines):
proc contents data=tomhs varnum; run;
proc means data=tomhs; var _numeric_; run;
proc freq data=tomhs; tables clinic sex; run;

----------------------------------------------------------------------
|                            |       Diastolic BP at 12-Months       |
|                            |---------------------------------------|
|                            |   N   | Mean  |  Std  |  Min  |  Max  |
|----------------------------+-------+-------+-------+-------+-------|
|Study Group (1-6)           |       |       |       |       |       |
|1                           |     15|   77.8|    9.3|   68.0|   94.0|
|2                           |     17|   81.7|    7.1|   72.0|  100.0|
|3                           |     14|   78.1|    7.6|   67.0|   90.0|
|4                           |     14|   77.7|    6.0|   66.0|   89.0|
|5                           |     13|   79.6|    8.5|   66.0|   99.0|
|6                           |     19|   79.6|    7.3|   64.0|   95.0|
|All                         |     92|   79.2|    7.6|   64.0|  100.0|
----------------------------------------------------------------------

proc tabulate data=_last_ noseps;
  class group;
  var dbp12;
  table (group all),
        (dbp12)*(n*f=7.0 mean*f=7.1 std*f=7.1 min*f=7.1 max*f=7.1) / rts=30;
run;

MACRO BRKSPSS: Creates a PROC TABULATE table for each variable in dlist, by group

%macro brkspss (grp, dlist, data=_last_, dec=3, all=all);
  %do i = 1 %to 100;
    /* Pick the i-th word of the variable list */
    %let depvar = %scan(&dlist, &i);
    /* An empty word means the list is exhausted, so leave the loop */
    %if %length(&depvar) = 0 %then %goto done;
    proc tabulate data=&data noseps;
      class &grp;
      var &depvar;
      table (&grp &all),
            (&depvar)*(n*f=7.0 mean*f=7.&dec std*f=7.&dec min*f=7.&dec max*f=7.&dec) / rts=30;
    run;
  %end;
  %done:
%mend brkspss;

%brkspss(group, dbp12 sbp12 chol12);
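The word-by-word loop in BRKSPSS can be tried on its own before any PROC TABULATE code is added. The sketch below is a hypothetical helper (listvars is not part of the course code). It uses %DO %WHILE in place of the fixed %DO loop with %GOTO, and it only writes each word to the log with %PUT:

```sas
%macro listvars(dlist);
  /* Keep the loop variables local to this macro */
  %local i depvar;
  %let i = 1;
  %let depvar = %scan(&dlist, &i);
  /* Continue until %scan returns an empty word */
  %do %while (%length(&depvar) > 0);
    %put Variable &i is &depvar;
    %let i = %eval(&i + 1);
    %let depvar = %scan(&dlist, &i);
  %end;
%mend listvars;

%listvars(dbp12 sbp12 chol12);
```

The call writes one line to the log for each of the three variable names. Replacing the %PUT line with a procedure step gives the same pattern BRKSPSS follows, without the upper limit of 100 variables.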

MACRO BRKSPSS: Creates a PROC TABULATE table for each variable, by group

LIBNAME t '~/PH6420/2017/Data/';
DATA stat;
  set t.tomhs;
RUN;

* Example calls;
%brkspss(group, dbp12 sbp12 chol12);
%brkspss(group, dbp12 sbp12 chol12, dec=1); * Just 1 decimal;
%brkspss(group, dbp12 sbp12 chol12, all=);  * No totals;

Output from last call: first 2 variables.

----------------------------------------------------------------------
|                            |       Diastolic BP at 12-Months       |
|                            |---------------------------------------|
|                            |   N   | Mean  |  Std  |  Min  |  Max  |
|----------------------------+-------+-------+-------+-------+-------|
|Study Group (1-6)           |       |       |       |       |       |
|1                           |     15| 77.800|  9.314| 68.000| 94.000|
|2                           |     17| 81.706|  7.078| 72.000|100.000|
|3                           |     14| 78.071|  7.580| 67.000| 90.000|
|4                           |     14| 77.714|  5.954| 66.000| 89.000|
|5                           |     13| 79.615|  8.540| 66.000| 99.000|
|6                           |     19| 79.579|  7.313| 64.000| 95.000|
----------------------------------------------------------------------

----------------------------------------------------------------------
|                            |       Systolic BP at 12-Months        |
|                            |---------------------------------------|
|                            |   N   | Mean  |  Std  |  Min  |  Max  |
|----------------------------+-------+-------+-------+-------+-------|
|Study Group (1-6)           |       |       |       |       |       |
|1                           |     15|120.200| 12.537| 93.000|141.000|
|2                           |     17|124.118| 11.280|108.000|142.000|
|3                           |     14|117.429|  9.436|104.000|135.000|
|4                           |     14|127.571| 11.876|112.000|149.000|
|5                           |     13|123.154| 18.348| 94.000|158.000|
|6                           |     19|129.895| 12.987|105.000|154.000|
----------------------------------------------------------------------

Where to put the macro?
  § At the beginning of the program, before you call it:
      %macro brkspss(parameters);
        … macro code
      %mend brkspss;
      %brkspss(group, dbp12 sbp12, data=tomhs);
  § Save it as a separate SAS file and %include that file at the top of the program:
      %include '/folderpath/brkspss.sas';
      %brkspss(group, dbp12 sbp12, data=tomhs);