Working sideways in Stata

Jakob Hjort, Data Manager, MPH
Department of Cardiology, Aarhus University Hospital, DK-8200 Aarhus, Denmark
2014 Nordic and Baltic Stata Users Group Meeting

The rectangular dataset

Statistics: results. "It is not the data we want, it's the essence of data."

Data management, statistics: transpose?

The rectangular dataset: subset in a matrix using Mata?

    use "family.dta", clear
    * Dataset with: fam_name, inc_mother & inc_father
    mata
    st_view(x=0, ., ("inc_mother", "inc_father"))
    income = colsum(x')'
    st_addvar("long", "inc_household")
    st_store(., "inc_household", income)
    end
    list fam_name inc_mother inc_father inc_household
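The Mata route above can be tried without family.dta. The following is a minimal sketch, assuming a toy dataset typed in with input (the family names and amounts are invented for illustration). It uses rowsum() instead of colsum(x')'; both add across the columns within each row.

```stata
clear
input str10 fam_name inc_mother inc_father
"Hansen"  300000 250000
"Jensen"  410000 0
"Nielsen" 0      380000
end

mata
// view onto the two income columns (a view, not a copy of the data)
st_view(x = ., ., ("inc_mother", "inc_father"))
// add across columns within each row; same result as colsum(x')'
income = rowsum(x)
// create the target variable and fill it from the Mata vector
idx = st_addvar("long", "inc_household")
st_store(., idx, income)
end

list fam_name inc_mother inc_father inc_household
```

Note that rowsum(), like colsum(), treats missing values as zero, so a household with one missing income still gets a total.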
The direct approach (data management): generate

    generate [type] newvar=exp [if] [in]

Ex.: from Weight and Height, compute BMI:

    generate BMI=Weight/Height^2

The direct approach (data management): egen

    egen [type] newvar=fcn(arguments) [if] [in] [, options]

Functions: rowtotal, rowmin, rowmax, rowfirst, rowlast, rowmean, rowmedian, rowmiss, rownonmiss, rowpctile, rowsd, concat, anycount, anymatch, anyvalue, count, diff, fill, group, iqr, kurt, max, mdev, mean, median, min, mode, mtr, pctile, rank, sd, seq, skew, std, tag, total

Ex.: from incJan, incFeb, incMar, incApr, incMay, incJun, incJul, … compute income:

    egen income=rowtotal(inc*)
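To see how the row functions treat missing values, here is a small sketch on an invented three-month dataset (the variable names id, total_inc, min_inc and n_valid and all values are assumptions for illustration). The result variables are deliberately given names that do not start with inc: a new variable called income would itself match inc* in any later command.

```stata
clear
input id incJan incFeb incMar
1 100 200 300
2 150 .   50
3 .   .   .
end

egen total_inc = rowtotal(inc*)    // missing counts as 0, so row 3 gives 0
egen min_inc   = rowmin(inc*)      // missing is ignored, so row 3 gives missing
egen n_valid   = rownonmiss(inc*)  // number of non-missing values in the row

list
```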

Looking under the hood: just for inspiration

viewsource _growmin.ado shows the rowmin() function of egen:

    program define _growmin
        version 6, missing
        gettoken type 0 : 0
        gettoken g 0 : 0
        gettoken eqs 0 : 0
        syntax varlist [if] [in] [, BY(string)]
        if `"`by'"' != "" {
            _egennoby rowmin() `"`by'"'
        }
        tempvar touse
        mark `touse' `if' `in'
        quietly {
            gen `type' `g' = .
            tokenize `varlist'
            while "`1'"!="" {
                replace `g' = cond(`1' < `g', `1', `g')
                mac shift
            }
        }
    end

The steps to borrow:
1. Initialize target variable (gen `type' `g' = .)
2. Prepare the variable-list (tokenize `varlist')
3. Looping (while "`1'"!="" ... mac shift)
4. In-the-loop-commands (replace `g' = cond(...))

Prepare the variable-list

    local vars incJan incFeb incMar incApr incMay incJun ///
               incJul incAug incSep incOct incNov incDec

Full specification of each and every variable: OK with 12, but what in case of hundreds? The list is stored in `vars'.

    unab vars: inc*
    unab vars: incJan-incDec

Variables can be specified with wildcards or ranges. The expanded list is stored in `vars'. (unab means unabbreviate; however, the command itself can't be un-abbreviated.)

    ds inc*
    ds incJan-incDec

Either ds command displays:

    incJan incFeb incMar incApr incMay incJun incJul incAug incSep incOct incNov incDec

Variables can be specified with wildcards or ranges. The list is stored in `r(varlist)'. Nice feature: the expanded list is shown for inspection.
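A quick way to convince yourself that unab and ds produce the same list is to run both on a throwaway dataset. This is a minimal sketch; the three empty variables and the local name fromds are assumptions made up for the illustration.

```stata
clear
set obs 1
foreach m in Jan Feb Mar {
    generate inc`m' = .        // creates incJan, incFeb, incMar
}

unab vars : inc*               // expanded list goes into local vars
display "`vars'"

ds inc*                        // prints the list and leaves it in r(varlist)
local fromds `r(varlist)'      // copy it before another command overwrites r()
display "`fromds'"
```

Copying `r(varlist)' into a local right away is a cautious habit, since the next r-class command replaces the stored results.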

Looping

"foreach" is the quickest and the most transparent loop command.

    foreach lvar in incJan incFeb {
        // do stuff with "`lvar'"
    }

    unab lvar: inc*
    foreach lvar in `lvar' {
        // do stuff with "`lvar'"
    }

    ds inc*
    foreach lvar in `r(varlist)' {
        // do stuff with "`lvar'"
    }

Typing the macro quotes:
Left single-quote ` = hold alt + press 0 9 6 on numeric keypad
Right single-quote ' = hold alt + press 0 3 9 on numeric keypad
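A variant not shown on the slides: foreach can expand the variable pattern itself with "of varlist", which saves the separate unab or ds step. A minimal sketch on invented data (the counter n is only there to show that the loop ran once per variable):

```stata
clear
set obs 3
generate incJan = 100
generate incFeb = 200
generate incMar = 300

local n = 0
foreach lvar of varlist inc* {
    local n = `n' + 1
    display "variable `n': `lvar'"
}
```

The trade-off is that the expanded list is not displayed or stored for inspection, which is the feature the ds approach offers.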

In the loop

Variant 1, using cond():

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = cond(`lvar' < minimum, `lvar', minimum)
    }

Variant 2, using the if qualifier:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        replace minimum = `lvar' if `lvar'<minimum
    }

Variant 3 (!), using the if command:

    generate minimum=.
    unab vars: inc*
    foreach lvar in `vars' {
        if `lvar'<minimum {
            replace minimum = `lvar'
        }
    }

Beware of variant 3: the if command, unlike the if qualifier, evaluates its expression only once, using the first observation, so the replace is applied to all rows or to none instead of row by row.
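The four steps can be put together and checked against egen's own rowmin(). This is a sketch on an invented dataset (id, the three income variables, the values and the name check are all assumptions for illustration); it uses the if-qualifier form of the loop.

```stata
clear
input id incJan incFeb incMar
1 100 200 300
2 150 .   50
3 .   .   .
end

generate minimum = .                 // 1. initialize target variable
unab vars : inc*                     // 2. prepare the variable list
foreach lvar in `vars' {             // 3. loop over the variables
    // 4. row by row: keep the smaller value; missing never wins,
    //    because missing is larger than any number in Stata
    replace minimum = `lvar' if `lvar' < minimum
}

egen check = rowmin(inc*)            // reference result from egen
assert minimum == check
list
```

The target is called minimum rather than anything starting with inc, so it is not picked up by inc* in the unab or egen calls.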

Some of the Danish participants who know "the DREAM database" will probably be able to see how these approaches can be useful when working with this fantastic but difficult construction.

Thank you very much