Taking the pain out of looping and storing

  • Slides: 19
Download presentation
Taking the pain out of looping and storing Patrick Royston Nordic and Baltic Stata

Taking the pain out of looping and storing Patrick Royston Nordic and Baltic Stata Users’ meeting, Stockholm, 11 November 2011

Overview • I often find myself running a command repeatedly in a loop •

Overview • I often find myself running a command repeatedly in a loop • I want to save some results and store them in new variable(s) • A new command, looprun, is described that automates the process in a convenient way • It can handle a single loop, or two nested loops • I shall illustrate looprun using profile likelihood functions and surfaces 1

Example 1: Single loop • A non-standard regression in which a nonlinear parameter is

Example 1: Single loop • A non-standard regression in which a nonlinear parameter is to be estimated by the profile likelihood method • Vary the parameter over an interval, fit the model • Store the parameter and the resulting deviance (-2 * log likelihood) in new variables • Plot the deviance against the parameter and draw inferences 2

Example 1 • Fitting a Cox regression to a variable haem (haemoglobin) in a

Example 1 • Fitting a Cox regression to a variable haem (haemoglobin) in a kidney cancer dataset • Wish to find the best-fitting power transformation, haemp • Draw inferences about p 3

Conventional code to solve the problem. . . . capture drop deviance capture drop

Conventional code to solve the problem. . . . capture drop deviance capture drop p capture drop order gen deviance =. gen p =. gen int order = _n local i 0 quietly foreach p of numlist -3 (0. 1) 0. 7 { fracgen haem `p', replace stcox haem_1 sort order local ++i replace deviance = -2 * e(ll) in `i‘ replace p = `p' in `i' } line deviance p, sort 4

Solution using looprun "p=-3(0. 1)0. 7", generate(deviance) store(-2*e(ll)) : /// fracgen haem @, replace

Solution using looprun "p=-3(0. 1)0. 7", generate(deviance) store(-2*e(ll)) : /// fracgen haem @, replace # /// stcox haem_1. line deviance p, sort 5

Resulting plot 6

Resulting plot 6

Example 2: double loop • A non-standard regression in which two nonlinear parameters are

Example 2: double loop • A non-standard regression in which two nonlinear parameters are to be estimated by inspecting the profile likelihood surface • Vary both parameters over a grid, fit the model and store the resulting deviance (-2 * log likelihood) • Plot the deviance against one parameter by the values of the other parameter • Contour plot of the deviance surface • Requires Stata 12 twoway contour 7

Example 2 • Model is a Gaussian growth curve • predictor = b 1+b

Example 2 • Model is a Gaussian growth curve • predictor = b 1+b 2*normal(s*(haem ‒ 12. 2) + m/10) 8

Solution using looprun "m=7 (2) 35" "s=0. 2 (0. 05) 2. 5", /// generate(deviance,

Solution using looprun "m=7 (2) 35" "s=0. 2 (0. 05) 2. 5", /// generate(deviance, replace) store(-2*e(ll)) : /// capture drop z # /// gen z = normal(@2 * (haem - 12. 2) + @1/10) # /// stcox z 9

Graphs of results Plot deviance against s, by m. sum deviance. gen deviance 2

Graphs of results Plot deviance against s, by m. sum deviance. gen deviance 2 = deviance - r(min). line deviance 2 s, sort by(m) 10

Resulting “casement” plot 11

Resulting “casement” plot 11

Contour plot . replace deviance 2 = min(deviance 2, 20). twoway contour deviance 2

Contour plot . replace deviance 2 = min(deviance 2, 20). twoway contour deviance 2 m s, ccuts(0(1)20) /// > yscale(r(7 35)) ylabel(10(5)35) xscale(r(. 2 2. 5)) /// > xlabel(. 25)2. 5) 12

Contour plot 13

Contour plot 13

What can we learn from the contour plot? • Parameter estimates of m and

What can we learn from the contour plot? • Parameter estimates of m and s are highly correlated • Re-parameterisation might help • The MLE is located along a narrow, long channel • Hence the model may not be well identified in this dataset • The likelihood surface has some peculiarities for low s, high m 14

Syntax of looprun "[name 1=]numlist 1" ["[name 2=]numlist 2"] , required [ options ]

Syntax of looprun "[name 1=]numlist 1" ["[name 2=]numlist 2"] , required [ options ] : command 1 [ # command 2. . . ] required description ------------------------------------------store(results_list) results to be stored generate(newvarlist [, replace]) names of new variable(s) to store results in options description ------------------------------------------nodots suppresses progress dots nosort do not sort data before storing results separator(string) character separating commands (default #) placeholder(string) placeholder character(s) (default @) ------------------------------------------15

Main limitation: Handling macros • Cannot assign a local or global macro within a

Main limitation: Handling macros • Cannot assign a local or global macro within a looprun subcommand retrieve it for storage • Easiest way around this is to use scalars, which are global • Need care to avoid clash of scalar names with similarly named variables 16

Conclusion • looprun should take most of the effort out of many simple programming

Conclusion • looprun should take most of the effort out of many simple programming tasks in Stata • looprun can be installed via my UCL webpage: net from http: //www. homepages. ucl. ac. uk/~ucakjpr/stata/ 17

Thank you. 18

Thank you. 18