Introduction to SAS Essentials
Mastering SAS for Data Analytics
Alan Elliott and Wayne Woodward

Chapter 5: Preparing to Use SAS Procedures

LEARNING OBJECTIVES
• To be able to use SAS® Support Statements
• To be able to use TITLE and FOOTNOTE
• To be able to include comments in your code
• To be able to use RUN and QUIT correctly
• To understand SAS PROC statement syntax
• To be able to use VAR statements
• To be able to use BY statements
• To be able to use ID statements
• To be able to use LABEL statements in a SAS procedure
• To be able to use WHERE statements
• To be able to use PROC PRINT
• Going Deeper: To be able to use common System Options
• Going Deeper: To be able to split column titles

5.1 UNDERSTANDING SAS SUPPORT STATEMENTS
Using TITLE and FOOTNOTE Statements
• Specify up to 10 titles or footnotes.
   TITLE 'title text';
   FOOTNOTE 'footnote text';
      (defines the first line of the title or footnote)
   TITLEn 'title text';
   FOOTNOTEn 'footnote text';
      (defines lines 2 to 10 of the titles or footnotes)

TITLE and FOOTNOTE Examples
   TITLE 'The first line of the title';
   TITLE2 'The second line of the title';
   TITLE5 'Several lines skipped, then this title on the fifth line';
   FOOTNOTE 'This is a footnote';
   FOOTNOTE3 'This is a footnote, line 3';
• Cancel all TITLE and FOOTNOTE lines with the statements
   TITLE; FOOTNOTE;
• Do the Hands-On Exercise on p. 114.

HANDS-ON EXAMPLE P. 114 (DTITLE1.SAS)
• First, the program creates titles on lines 1, 2, and 4, and a footnote on line 1 of the footnotes.
• Next, the title for line 1 is retained, TITLE4 is erased, and FOOTNOTE1 remains.
• Finally, all titles and footnotes are erased.
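The program itself did not survive in this transcript. The sketch below is a hypothetical reconstruction that follows the three steps described above; it is not the actual DTITLE1.SAS file, and SASHELP.CLASS (a sample data set shipped with SAS) is used only as a stand-in.

```sas
/* Hypothetical reconstruction, not the actual DTITLE1.SAS file */
TITLE  'Title on line 1';
TITLE2 'Title on line 2';
TITLE4 'Title on line 4';
FOOTNOTE 'Footnote on line 1';
PROC MEANS DATA=SASHELP.CLASS;   /* stand-in data set */
RUN;

/* Redefining TITLE2 keeps line 1 and clears every
   higher-numbered title (here TITLE4); the footnote is untouched */
TITLE2 'A new title on line 2';
PROC MEANS DATA=SASHELP.CLASS;
RUN;

/* Null statements clear all titles and all footnotes */
TITLE;
FOOTNOTE;
PROC MEANS DATA=SASHELP.CLASS;
RUN;
```

The second step relies on the rule that a TITLEn statement cancels all titles with a higher number than n.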

CUSTOMIZING TITLES AND FOOTNOTES
• There are a number of options that can be used with the TITLE or FOOTNOTE statements to customize the look of your title.
• Some of these options include specifying color with a C= or COLOR= option. For example:
   TITLE C=BLUE H=5 "This is a title";
• The title appears in blue (C= or COLOR=) with a height (H= or HEIGHT=) larger than normal. (H=1 is the default.)
• See Appendix A for colors, fonts, and other options.

Customizing Titles and Footnotes (continued)
• Do the Hands-On Example on p. 116.

Including Comments in Your SAS Code
• It is good programming practice to include explanatory comments in your code. There are two ways to put comments in your code.
• Method 1: Begin with an asterisk (*) and end with a semicolon (;).
   *This is a message
    It can be several lines long
    But it always ends with an ;
   *******************************
   * Boxed messages stand out more, still end in a semicolon *
   *******************************;
   DATA MYDATA; * You can put a comment on a line of code;

• Comments Method 2: Begin with /* and end with */
   /*This is a SAS comment*/
• The code from /* to */ is ignored by SAS.
   /* Use this comment technique to comment out lines of code
   PROC PRINT;
   PROC MEANS;
   End of comment - the PROCS were ignored*/
• The semicolons inside the comment are ignored.

Using RUN and QUIT Statements
• The RUN statement causes previously entered SAS statements to be executed. It is called a boundary statement. For example:
   PROC PRINT;
   PROC MEANS;
   RUN;
• Another boundary statement is the QUIT statement. It is sometimes used in conjunction with a RUN statement to end an active procedure. For example:
   PROC REG;
   RUN;
   QUIT;
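To make the difference between RUN and QUIT concrete, here is a minimal sketch using PROC REG, which stays active after RUN. The data set SASHELP.CLASS and the two models are assumptions chosen only for illustration.

```sas
PROC REG DATA=SASHELP.CLASS;
   MODEL WEIGHT = HEIGHT;
RUN;    /* runs the first model, PROC REG stays active */
   MODEL WEIGHT = HEIGHT AGE;
RUN;    /* runs a second model in the same PROC REG session */
QUIT;   /* ends the procedure */
```

For procedures that end at RUN (such as PROC PRINT or PROC MEANS), QUIT is not needed.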

5.2 UNDERSTANDING PROC STATEMENT SYNTAX
• Although there are scores of SAS PROCs (procedures), the syntax is consistent across all of them. The general syntax of the SAS PROC statement is:
   PROC name options;
      Statements/statementoptions;
      . . . etc. . .
      Statements/statementoptions;
   RUN;

Four parts of a PROC Statement
   PROC name options;
      statements/statementoptions;
• NAME is the name of the SAS procedure, such as MEANS or PRINT.
• OPTIONS appear BEFORE the semicolon. Typical options are DATA=, NOPRINT, and others; they typically deal with the data set or output.
• STATEMENTS appear as a separate "phrase" (with its own semicolon). These usually specify options within the procedure. There may be multiple statements.
• STATEMENT OPTIONS: statements may have their own options following a slash (/).

Example PROC Syntax (Options)
• The most commonly used option within the PROC statement is the DATA= option. For example:
   PROC PRINT DATA=MYDATA;
   RUN;
• Note that options appear in the PROC statement BEFORE the semicolon.
• The DATA= option tells SAS which data set to use in the analysis.

Example PROC Syntax (Statements)
• Procedure statements are often required to indicate information about how an analysis is to be performed. For example:
   PROC PRINT DATA=MYDATA;
      VAR ID GROUP TIME1 TIME2;
   RUN;
• STATEMENTS appear AFTER the first PROC semicolon.

Example PROC Syntax (Statement Options)
• Statements can themselves have options. For example:
   PROC FREQ DATA=MYDATA;
      TABLES GROUP*SOCIO/CHISQ;
   RUN;
• CHISQ is a statement option. Note that it follows a slash (/) within the statement.

Summary of typical PROC Syntax
   PROC FREQ DATA=MYDATA;
      TABLES GROUP*SOCIO/CHISQ;
   RUN;
• PROC name: FREQ
• PROC option: DATA=MYDATA
• PROC statement: TABLES GROUP*SOCIO
• PROC statement option: CHISQ

Common PROC options
• Some typical options that are COMMON to most PROCs include:
   DATA=     Specify the data set to use in the analysis
   NOPRINT   Do not display certain output
   OUT=      Send results to an output data set
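As an illustration of these three options working together, here is a minimal sketch with PROC MEANS. The data set SASHELP.CLASS and the output names CLASS_MEANS, MEAN_HEIGHT, and MEAN_WEIGHT are assumptions chosen for the example. Note that in PROC MEANS the OUT= option belongs to the OUTPUT statement, whereas in PROC SORT it goes in the PROC statement itself.

```sas
/* NOPRINT suppresses the printed report;
   OUTPUT OUT= saves the means to a data set instead */
PROC MEANS DATA=SASHELP.CLASS NOPRINT;
   VAR HEIGHT WEIGHT;
   OUTPUT OUT=WORK.CLASS_MEANS MEAN=MEAN_HEIGHT MEAN_WEIGHT;
RUN;

/* Look at the saved results */
PROC PRINT DATA=WORK.CLASS_MEANS;
RUN;
```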

Common PROC Statements
• VAR variable(s);
   Instructs SAS to use only the variables in the list for the analysis.
• BY variable(s);
   Repeats the procedure for each different value of the named variable(s). (The data set must first be sorted by the variables listed in the BY statement.)
• ID variable(s);
   Instructs SAS to use the specified variable as an observation identifier in a listing of the data.
• LABEL var='label';
   Assigns a descriptive label to a variable.
• WHERE (expression);
   Instructs SAS to select only those observations for which the expression is true.
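The five statements above can be combined in a single procedure. The following is a minimal sketch, assuming the sample data set SASHELP.CLASS (variables NAME, SEX, AGE, HEIGHT, WEIGHT); the label text and the age cutoff are made up for illustration.

```sas
/* BY processing requires sorted data */
PROC SORT DATA=SASHELP.CLASS OUT=WORK.CLASS_SORTED;
   BY SEX;
RUN;

PROC PRINT DATA=WORK.CLASS_SORTED LABEL;
   VAR HEIGHT WEIGHT;                /* only these columns are listed */
   BY SEX;                           /* one listing per value of SEX  */
   ID NAME;                          /* NAME replaces the Obs column  */
   LABEL HEIGHT='Student height'
         WEIGHT='Student weight';    /* labels used as column titles  */
   WHERE AGE GE 13;                  /* keep only rows with AGE >= 13 */
RUN;
```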

OPTIONS Specific to a PROC
• In PROC MEANS, for example, the MAXDEC= option specifies how many decimal places to report.
• The options specific to each PROC will be covered as the PROCs are introduced.
   PROC MEANS DATA="C:\SASDATA\SOMEDATA" MAXDEC=2;
   RUN;
• Notice that BOTH the DATA= and the MAXDEC= options appear before the first semicolon.

Using the VAR Statement in a SAS Procedure
• The VAR statement is often used to specify a list of variables to use in an analysis:
   VAR varlist;
• An example is as follows:
   PROC MEANS;
      VAR HEIGHT WEIGHT AGE;
   RUN;

Listing a range of Variables
• List a range of variables with consecutive numeric suffixes, such as Q1, Q2, Q3, and so on up to Q50, using a single dash between the first and last:
   VAR Q1-Q50;
• List a range of variables without consecutive suffixes using two dashes. Example:
   VAR ID--TIME4;
• Use two dashes to indicate all variables positioned between the two named variables in the data set.
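A short sketch of both list styles follows. The data set WORK.SURVEY and its values are hypothetical and exist only to show which variables each list selects.

```sas
/* Hypothetical data: variables are created in the order
   ID, GROUP, Q1-Q5, TIME1, TIME2 */
DATA WORK.SURVEY;
   INPUT ID GROUP $ Q1-Q5 TIME1 TIME2;
   DATALINES;
1 A 3 4 5 2 1 10.5 11.2
2 B 2 2 4 5 3 9.8 10.1
3 A 5 1 3 3 4 12.0 12.4
;
RUN;

PROC MEANS DATA=WORK.SURVEY;
   VAR Q1-Q5;        /* numbered range: Q1 Q2 Q3 Q4 Q5 */
RUN;

PROC PRINT DATA=WORK.SURVEY;
   VAR ID--Q2;       /* positional range: ID GROUP Q1 Q2 */
RUN;
```

The double-dash list depends on the order in which the variables are stored in the data set, so the same list can select different variables if the data set is rebuilt with a different variable order.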

Using the BY Statement in a SAS Procedure
• The BY statement allows you to quickly analyze subsets of your data. It repeats an analysis for each value of the BY variable (the data must be sorted by the BY variable). Example:
   PROC SORT DATA="C:\SASDATA\SOMEDATA" OUT=SORTED;
      BY GP;
   RUN;
   PROC MEANS DATA=SORTED MAXDEC=2;
      BY GP;
   RUN;
• Note the OUT= option in the PROC SORT statement.
• BY is used first to sort, then to request analysis by group.

HANDS-ON EXERCISE P. 121
• Open the file DSORTMEANS.SAS. Sort the data set by GP, then use the BY statement in a PROC to run the analysis for each value of GP.

RESULTS
• Multiple results are displayed. GP is the BY variable, so there are multiple analyses, one for each value of GP.
• EXERCISE: Change the BY variable to STATUS instead of GP (sort first). Rerun the analysis.
• PAUSE. Continue once you have completed this exercise.

RESULTS
• Now the output contains analyses by the values of STATUS.

5.3 USING THE ID STATEMENT IN A SAS PROCEDURE
• The ID statement provides you with a way to increase the readability of your output. It instructs SAS to use the specified variable as an observation identifier in a listing of the data (instead of the Obs column).
   * FIRST VERSION;
   PROC PRINT DATA=MYSASLIB.SOMEDATA;
   RUN;
   * SECOND VERSION;
   PROC PRINT DATA=MYSASLIB.SOMEDATA;
      ID RAT_ID;
   RUN;
• RAT_ID is the ID variable.

EXAMPLE OF ID STATEMENT
• Observe the results, first without the ID statement.
• The second time, with the ID statement, notice how the Obs column was replaced with the RAT_ID column.

5.4 USING THE LABEL STATEMENT IN A SAS PROCEDURE
• A version of the LABEL statement allows you to create labels for variables within a procedure.
   LABEL var='label';
• Assigns a descriptive label to a variable. Example:
   PROC PRINT LABEL;
      ID RAT_ID;
      LABEL TRT='Treatment';
   RUN;
• NOTE: This assignment of a LABEL only works during this PROC, unlike the LABEL statement in a DATA step, which is saved within the data set.
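The difference between a label saved in a DATA step and a label assigned inside a PROC can be checked with a small sketch like the one below. The data set names and label text are assumptions for illustration, with SASHELP.CLASS as a stand-in source.

```sas
/* Label assigned in a DATA step: stored with the data set */
DATA WORK.CLASS_LABELED;
   SET SASHELP.CLASS;
   LABEL HEIGHT='Height label stored in the data set';
RUN;

/* Label assigned in a PROC: applies to this PROC PRINT only */
PROC PRINT DATA=WORK.CLASS_LABELED LABEL;
   LABEL WEIGHT='Weight label for this PROC only';
RUN;

/* The variable listing shows which labels were actually saved */
PROC CONTENTS DATA=WORK.CLASS_LABELED;
RUN;
```

In the PROC CONTENTS listing, the HEIGHT label should appear, while the WEIGHT label assigned in PROC PRINT should not have been stored.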

Hands-On Exercise p. 124 (D_ID.SAS)
• Example of the LABEL statement in a PROC
• This code:
   PROC MEANS DATA=WEIGHT;
      LABEL WT_GRAMS="Treatment" MDATE="MEDOBS Date";
   RUN;
• produces output in which the variables are shown with these labels.

5.5 USING THE WHERE STATEMENT IN A SAS PROCEDURE
• The WHERE statement allows you to specify a condition that determines which observations are included in an analysis. Example:
   WHERE TRT="A";
• Placed within a PROC, this statement causes the procedure to use only the records in the data set for which TRT="A".
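WHERE expressions can combine several conditions. The sketch below shows two common forms, assuming the sample data set SASHELP.CLASS; the cutoff and the names in the list are chosen only for illustration.

```sas
/* Compound condition with AND */
PROC MEANS DATA=SASHELP.CLASS MAXDEC=1;
   VAR HEIGHT WEIGHT;
   WHERE SEX = 'F' AND AGE GE 13;
RUN;

/* Membership test with IN */
PROC PRINT DATA=SASHELP.CLASS;
   WHERE NAME IN ('Alice', 'Barbara', 'Carol');
RUN;
```

Character comparisons are case sensitive, so 'F' and 'f' are treated as different values.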

HANDS-ON EXERCISE P. 125
• Using the code from the previous Hands-On Example (D_ID.SAS), modify the PROC MEANS step:
   PROC MEANS DATA=SORTED;
      VAR TIME1 TIME2;
      BY STATUS;
   RUN;
• EXERCISE: Add the following statement after the BY statement and before the RUN statement:
   WHERE STATUS LT 4;
• Run the edited program and observe the results.
• PAUSE. Continue once you have completed this exercise.

RESULTS
• Only the analyses for the first 3 STATUS values appear in the output.

Do Hands-On Example p. 125
• Example of the WHERE statement:
   PROC PRINT LABEL DATA=WEIGHT;
      ID RAT_ID;
      LABEL TRT='Treatment ';
      WHERE TRT="A";
   RUN;
• This code produces output for only the records where TRT="A".

5.6 USING PROC PRINT
• Although several previous examples have used a simple version of PROC PRINT, a number of options for this procedure have not been discussed. Here are common options:

Common Statements for PROC PRINT
• For example, the SUM statement specifies that a sum of the values for the variables listed is to be reported.
   SUM COST;
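A minimal, self-contained sketch of the SUM statement follows. The data set WORK.ORDERS and its values are made up for the example.

```sas
/* Hypothetical data for illustration */
DATA WORK.ORDERS;
   INPUT ITEM $ COST;
   DATALINES;
Paper 12.50
Toner 64.00
Pens 8.25
;
RUN;

/* Lists the rows and adds a total line for COST */
PROC PRINT DATA=WORK.ORDERS;
   SUM COST;
RUN;
```

The listing ends with a total of 84.75 under the COST column.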

Do Hands-On Example p. 127
• Using APRINT1.SAS:
   PROC PRINT DATA="C:\SASDATA\SOMEDATA"
         N='Number of Subjects is: '
         OBS='Subjects';
      SUM TIME1 TIME2 TIME3 TIME4;
      TITLE 'PROC PRINT Example';
   RUN;

Output from example showing PROC PRINT options and statement results

5.7 GOING DEEPER: SPLITTING COLUMN TITLES IN PROC PRINT
• Normally, SAS splits column titles at blanks when needed to conserve space in a report.
• If you want a different look, you can tell SAS where you want the labels to be split using the SPLIT= option. For example:
   PROC PRINT DATA=SOMEDATASET SPLIT='*';
      LABEL INC_KEY='Subject*ID*======'
            AGE='Age in*2014*======'
            GENDER='Gender* *======';
   RUN;
• In this code, SPLIT= is an option in the PROC PRINT statement (before the first semicolon), and SAS splits the labels wherever it sees an asterisk.
• Do the Hands-On Example on p. 129 (APRINT3.SAS).
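To try the SPLIT= option without the book's data, here is a small runnable sketch. The data set WORK.PATIENTS and its two records are hypothetical; the labels are the ones from the slide.

```sas
/* Hypothetical data with the variables named in the labels */
DATA WORK.PATIENTS;
   INPUT INC_KEY AGE GENDER $;
   DATALINES;
1001 34 F
1002 51 M
;
RUN;

/* Each asterisk in a label starts a new line of the column title */
PROC PRINT DATA=WORK.PATIENTS SPLIT='*';
   LABEL INC_KEY='Subject*ID*======'
         AGE='Age in*2014*======'
         GENDER='Gender* *======';
RUN;
```

Specifying SPLIT= also turns on the use of labels as column titles, so a separate LABEL option in the PROC PRINT statement is not required here.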

Results of using the Split Option
• Note how the labels are split according to where the asterisks were in the code.

5.8 GOING DEEPER: COMMON SYSTEM OPTIONS
• Although not a part of a PROC statement, System Options can be used to customize the way output is displayed or how data in a data set is used.
• This section introduces some commonly used options. System Options are specified using the OPTIONS statement. The syntax for the OPTIONS statement is:
   OPTIONS option1 option2 ...;
• For example:
   OPTIONS ORIENTATION=LANDSCAPE;

Common System Options (See Table 5.14)
System Option: Meaning
• FIRSTOBS=n and OBS=n
   Specifies the first observation to be used in a data set (FIRSTOBS=) and the last observation to be used (OBS=). For example, OPTIONS FIRSTOBS=2 OBS=21; causes SAS to use data records 2 through 21 in any subsequent analysis. When this option is set, it is usually a good idea to reset the values with OPTIONS FIRSTOBS=1 OBS=MAX; at the end of the program so subsequent analyses are not limited by the same options.
• YEARCUTOFF=year
   Specifies the cutoff year for two-digit years, which are placed in a 100-year span starting with the specified year. For example, if YEARCUTOFF=1920, then the date 01/15/19 would be considered 2019 while 01/15/21 would be seen as 1921. The default YEARCUTOFF is 1926. (For SAS versions 9 through 9.3, the cutoff year was 1920.)
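The YEARCUTOFF= example can be checked with a short DATA step that writes to the log. This is a sketch only; the variable names D1 and D2 are arbitrary, and the final statement restores the default value stated above, which may differ in your SAS version.

```sas
OPTIONS YEARCUTOFF=1920;   /* two-digit years map to 1920-2019 */

DATA _NULL_;
   D1 = INPUT('01/15/19', MMDDYY8.);
   D2 = INPUT('01/15/21', MMDDYY8.);
   PUT D1= DATE9. D2= DATE9.;   /* writes both dates to the log */
RUN;

OPTIONS YEARCUTOFF=1926;   /* restore the default described above */
```

With the cutoff at 1920, the log should show the first date in 2019 and the second in 1921.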

Common System Options (continued)
System Option: Meaning
• PROBSIG=n
   Specifies the number of decimals used when reporting p-values. For example, PROBSIG=3 would cause p-values to be reported to three decimal places.
• LINESIZE=n and PAGESIZE=n
   Controls the number of characters in an output line (LINESIZE) or the number of lines on a page (PAGESIZE) for RTF and PDF output.
• NONUMBER
   Specifies that no page numbers will be included in RTF or PDF output.
• NODATE
   Specifies that no date will be included in RTF or PDF output.
• ORIENTATION=option
   Specifies paper orientation. Options are PORTRAIT or LANDSCAPE for RTF or PDF output.
• NOCENTER
   Left justifies output (default is centered).

Do Hands-On Exercise p. 132 (SYSOBS.SAS)
• Sets system options so that only records 11 to 20 are used from any data set read afterward.
   OPTIONS FIRSTOBS=11 OBS=20;
   PROC PRINT LABEL DATA="C:\SASDATA\SOMEDATA";
   RUN;
   OPTIONS FIRSTOBS=1 OBS=MAX;
• It is important to reset the options to the defaults so that later steps are not unintentionally limited to the same records.

5.9 SUMMARY
• This chapter introduced you to the syntax of SAS procedures in preparation for using specific PROCs discussed in the remainder of the book. It also introduced PROC PRINT and illustrated some of the common options used for this procedure.
• Continue to Chapter 6: SAS® ADVANCED PROGRAMMING TOPICS PART 1

These slides are based on the book:
Introduction to SAS Essentials Mastering SAS for Data Analytics, 2nd Edition
By Alan C. Elliott and Wayne A. Woodward
Paperback: 512 pages
Publisher: Wiley; 2nd edition (August 3, 2015)
Language: English
ISBN-10: 111904216X
ISBN-13: 978-1119042167
These slides are provided for you to use to teach SAS using this book. Feel free to modify them for your own needs. Please send comments about errors in the slides (or suggestions for improvements) to acelliott@smu.edu. Thanks.