Computing for Research I Spring 2012 Stata Graphics
February 16
Primary Instructor: Elizabeth Garrett-Mayer
Basic syntax for commands
• prefix: command varlist, options
• Examples:
– regress y x, level(90)
– by race: sum y x, detail
– ttest y, by(x) unequal
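A minimal sketch of the same three patterns, runnable without the course data. It assumes the auto dataset that ships with Stata; the variables mpg, weight, and foreign come from that dataset, not from the slides. Note that the by prefix needs data sorted on the grouping variable, so the sketch uses bysort.

```stata
* Assumption: use Stata's built-in auto dataset instead of the course data
sysuse auto, clear

* command varlist, options
regress mpg weight, level(90)

* prefix: command varlist, options
* (bysort sorts on foreign first, which plain "by" would require)
bysort foreign: summarize mpg weight, detail

* two-sample t test allowing unequal variances
ttest mpg, by(foreign) unequal
```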
Stata Graphics
• Maybe we can just end class now!
• Check out these links:
– http://www.ats.ucla.edu/stata/library/GraphExamples/default.htm
– http://www.ats.ucla.edu/stata/topics/graphics.htm
– http://data.princeton.edu/stata/graphics.html
– http://www.stata.com/capabilities/graphics.html

Basic univariate displays
• Boxplots
• Stem and leaf
• Histograms
• Density plots

Ceramide Data
• Let's look at the ceramide markers
• What are their distributions?
• Are there outliers?
• Should we consider taking logs, or using % change?
Results of a phase II trial of gemcitabine plus doxorubicin in patients with recurrent head and neck cancers: serum C₁₈-ceramide as a novel biomarker for monitoring response. Saddoughi SA, Garrett-Mayer E, Chaudhary U, O'Brien PE, Afrin LB, Day TA, Gillespie MB, Sharma AK, Wilhoit CS, Bostick R, Senkal CE, Hannun YA, Bielawski J, Simon GR, Shirai K, Ogretmen B. Clin Cancer Res. 2011 Sep 15;17(18):6097-105. Epub 2011 Jul 26.

Histogram
• hist c18

Let's make it prettier
* prettier histograms
hist c18, freq xaxis(1 2) ylabel(0(2)24) xlabel(20 "Twenty" 40 "Forty")
hist c18, title("Histogram of C18 Ceramide") subtitle("PI: K. Shirai")
hist c18, ytitle("number of patients") freq yline(0(10)20)
hist c18, xaxis(1 2) xlabel(19.6 "mean" 11.9 "median", axis(2) grid)
Finding help on these can sometimes be tricky! e.g., help axis_choice_options
Boxplots
• graph box c18

Boxplots
graph box c18, by(cycle)
graph box c18, over(cycle)
tab cycle
graph box c18 if cycle<7, over(cycle)
sort patient cycle
merge m:1 patient using "PtdataGemDox.dta"
graph box c18 if cycle<7, over(cycle) over(gender)
graph hbox c18, over(initial) capsize(5)

graph hbox c18, over(initial) capsize(5)
graph hbox c18, over(initial) medtype(marker) medmarker(msymbol(+) msize(large))
graph hbox c18, over(initial) ytitle("C18")
Labels
• Sometimes xlabels cannot be applied (e.g., boxplots)
• Instead, you need to label the values of the grouping variable
• Example: cycle for boxplots
– label define cycle 1 "cycle 1" 3 "cycle 3" 5 "cycle 5" 7 "cycle 7"
– label values cycle cycle
– graph box c18 if cycle<7, over(cycle)
• (Hint: use this on the homework!)
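A short sketch of the define-then-attach pattern, assuming Stata's built-in auto dataset rather than the ceramide data. The label name repair and its text values are made up for illustration. The key point is that label values takes both the variable name and the name of the value label.

```stata
* Assumption: built-in auto dataset; rep78 is a numeric 1-5 repair score
sysuse auto, clear

* Step 1: define a value label (the name "repair" is hypothetical)
label define repair 1 "poor" 2 "fair" 3 "average" 4 "good" 5 "excellent"

* Step 2: attach it -- syntax is: label values varname labelname
label values rep78 repair

* The boxplot categories now show the text labels instead of 1-5
graph box mpg, over(rep78)
```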
Stem and Leaf
. stem c18
Stem-and-leaf plot for c18 (C18 ceramide)
c18 rounded to nearest multiple of .1
plot in units of .1
[Stem-and-leaf display, stems 0** through 6**]
Dotplot
• Excellent way to show data across groups when you have a relatively small dataset
• dotplot y, over(group)
dotplot c18, over(cycle)
dotplot c18, over(gender) nogroup
Further options shown on the slide: jitter(3); over(gender) nogroup median center
Dotplot, by gender
Scatterplots
• Two way graph
• Syntax:
– graph twoway scatter y x1 x2
– graph twoway scatter y x1
• Example:
– graph twoway scatter c18 totalceramide

Regression example
• Scatterplot
• Residual plots
• Leverage
• Fitted line with raw data

Code
graph twoway scatter c18 totalcer
regress c18 totalcer
* residual plot
* (residual vs. fitted)
rvfplot
* the long way
* 1. generate a new variable from the regression, residuals
predict resid, res
* 2. generate a new variable from the regression, fitted values
predict fit
scatter res fit, yline(0)
* leverage vs. residual plot
lvr2plot
* take transform of C18?
gladder c18
boxcox c18
* generate new variable
gen logc18=log(c18)
scatter logc18 totalcer, mlabel(gender) s(i)
scatter logc18 totalcer, s(Oh)
* redo regression
regress logc18 totalcer
rvfplot, yline(0)
lvr2plot
predict logfit
* make plot of fitted model and raw data
scatter logfit logc18 totalcer, s(i o) c(l .)
graph twoway scatter logfit totalcer, s(i) c(l) || scatter logc18 totalcer, s(o) c(.)
The next graph to create
Fancier way to put regression lines
infile str14 country setting effort change ///
    using http://data.princeton.edu/wws509/datasets/effort.raw
graph twoway scatter change setting
graph twoway (scatter change setting) (lfit change setting)
graph twoway (scatter change setting) (qfit change setting)
graph twoway (scatter change setting) (lfitci change setting)
• scatter makes a scatterplot of the two variables
• lfit plots the regression line of y on x
• qfit plots a fitted quadratic model of y on x
• lfitci plots the line AND a confidence interval!
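For trying the overlays without downloading anything, here is a sketch that assumes Stata's built-in auto dataset (mpg and weight stand in for change and setting). The graph names linear, quadratic, and fitci are arbitrary. In the last command lfitci is listed first so that the shaded interval is drawn underneath the points.

```stata
* Assumption: built-in auto dataset in place of the fertility data
sysuse auto, clear

* scatterplot plus fitted straight line
twoway (scatter mpg weight) (lfit mpg weight), name(linear, replace)

* scatterplot plus fitted quadratic
twoway (scatter mpg weight) (qfit mpg weight), name(quadratic, replace)

* confidence band and line first, points drawn on top
twoway (lfitci mpg weight) (scatter mpg weight), name(fitci, replace)
```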
Fancier way to put regression lines
[Figure: plot using qfit]
[Figure: plot using lfitci]

graph twoway (lfitci change setting) ///
    (scatter change setting, mlabel(country) )
One slight problem with the labels is the overlap of Costa Rica and Trinidad Tobago (and to a lesser extent Panama and Nicaragua). We can solve this problem by specifying the position of the label relative to the marker using a 12-hour clock (so 12 is above, 3 is to the right, 6 is below and 9 is to the left) and the mlabv() option. We create a variable to hold the position, set by default to 3 o'clock, and then move Costa Rica to 9 o'clock and Trinidad Tobago to just a bit above that at 11 o'clock (we can also move Nicaragua and Panama up a bit, say to 2 o'clock).

gen pos=3
replace pos = 11 if country == "TrinidadTobago"
replace pos = 9 if country == "CostaRica"
replace pos = 2 if country == "Panama" | country == "Nicaragua"
graph twoway (lfitci change setting) ///
    (scatter change setting, mlabel(country) mlabv(pos) )
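The same clock-position idea, sketched on the first ten cars of Stata's built-in auto dataset (an assumption, chosen only so the example runs anywhere). The option is spelled out in full here as mlabvposition(); which car gets moved is arbitrary.

```stata
* Assumption: built-in auto dataset, first 10 observations only
sysuse auto, clear
keep in 1/10

* default label position: 3 o'clock (to the right of the marker)
gen pos = 3

* move one label to 9 o'clock (to the left); the choice of car is arbitrary
replace pos = 9 if make == "Buick Century"

* mlabvposition() reads the clock position from a variable
twoway scatter mpg weight, mlabel(make) mlabvposition(pos)
```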
Legends
graph twoway (lfitci change setting) ///
    (scatter change setting, mlabel(country) mlabv(pos) ) ///
    , title("Fertility Decline by Social Setting") ///
    ytitle("Fertility Decline") ///
    legend(ring(0) pos(5) order(2 "linear fit" 1 "95% CI"))

graph twoway (lfitci change setting) ///
    (scatter change setting, mlabel(country) mlabv(pos) ) ///
    , title("Fertility Decline by Social Setting") ///
    ytitle("Fertility Decline") ///
    legend(off)
Spaghetti plots
Command available from UCLA: spagplot
* spaghetti plots
clear
insheet using "I:\MUSC Oncology\Shirai, Keisuke\October 2010\ceramide.csv"
findit spagplot
spagplot c18 cycle, id(patient) nofit
* remove patients who only have cycle=1
sort patient cycle
by patient: gen visit=_n
egen maxvis=max(visit), by(patient)
spagplot c18 cycle if maxvis>1, id(patient) nofit
* or, use c(L)
graph twoway scatter c18 cycle if maxvis>1, c(L)
help connectstyle
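A self-contained sketch of the c(L) approach, which needs no user-written command. The six rows typed in below are invented values for illustration, not the trial data. connect(L) joins consecutive points only while the x variable is increasing, so with the data sorted by patient and cycle each patient gets a separate line.

```stata
* Hypothetical toy data: values are made up, not from the ceramide study
clear
input patient cycle c18
1 1 12.0
1 3 15.5
1 5 14.1
2 1 20.3
3 1  9.8
3 3 11.2
end

* count visits per patient and flag patients seen more than once
sort patient cycle
by patient: gen visit = _n
egen maxvis = max(visit), by(patient)

* connect(L): draw a line only while cycle increases, i.e. within a patient
twoway scatter c18 cycle if maxvis > 1, connect(L)
```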
Other neat stuff
• graph matrix
• saving graphs: click and save as desired format
• saving and combining (see Princeton site, section 3.3)
– http://data.princeton.edu/stata/graphics.html
• See GraphExamples on UCLA site:
– http://www.ats.ucla.edu/stata/library/GraphExamples/
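Saving and combining can also be done from a do-file rather than by clicking. A minimal sketch, assuming Stata's built-in auto dataset; the graph names h, b, and both and the file name both.gph are arbitrary. graph export can write other formats, but which formats are available depends on the platform, so only graph save is used here.

```stata
* Assumption: built-in auto dataset; graph and file names are arbitrary
sysuse auto, clear

* keep two graphs in memory under their own names
histogram mpg, name(h, replace)
graph box mpg, over(foreign) name(b, replace)

* put them side by side in one figure
graph combine h b, name(both, replace)

* save the combined figure in Stata's own .gph format
graph save both "both.gph", replace
```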