Introduction to SAS Essentials Mastering SAS for Data

  • Slides: 50
Download presentation
Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward

Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward 1 SAS Essentials - Elliott & Woodward

Intro to SAS Chapter 3, Part 1 2 SAS Essentials - Elliott & Woodward

Intro to SAS Chapter 3, Part 1 2 SAS Essentials - Elliott & Woodward

Chapter 3 LEARNING OBJECTIVES � To be able to work with SAS libraries and

Chapter 3 LEARNING OBJECTIVES � To be able to work with SAS libraries and permanent data sets � To be able to read and write permanent SAS data sets � To be able to interactively import data from another program � To be able to define SAS libraries using program code � To be able to import data using code � To be able to discover the contents of an SAS data set � To be able to understand how the Data Step Reads and Stores Data. 3 SAS Essentials - Elliott & Woodward

3. 1 WORKING WITH SAS LIBRARIES �Permanent Data Sets �All of the data sets

3. 1 WORKING WITH SAS LIBRARIES �Permanent Data Sets �All of the data sets we’ve created have been “Work” data sets. �Work datasets vanish when you end a SAS session �However, we can create permanent data sets �First, we need a location (library or folder) �Then we store data sets in that library (or folder) 4 SAS Essentials - Elliott & Woodward

Data Set vs Code � The SAS data set is a separate file –

Data Set vs Code � The SAS data set is a separate file – like a Word. doc file or an Excel. xls file � The data set is not the SAS code that we’ve used up to this point – the code creates the data set. � SAS data sets have the extension. sas 7 bdat 5 SAS Essentials - Elliott & Woodward

SAS Data Sets � are (usually) created by a DATA statement. � are an

SAS Data Sets � are (usually) created by a DATA statement. � are an internal representation of the data created by the DATA statement � contain more than the data values – they can contain variable names, labels, the results of codings, calculations and variable formats. � are referred to by a name that indicates whether the data set is temporary or permanent 6 SAS Essentials - Elliott & Woodward

Temporary vs Permanent � A temporary SAS data set is named with a single

Temporary vs Permanent � A temporary SAS data set is named with a single level name such as MEASLES, or MAR 2000. (technically the names are WORK. MEASLES and WORK. MAR 2000) � A permanent SAS data set is a file saved on your hard disk. � There are two ways to refer to a permanent data set � We can refer to a permanent SAS data set using a Windows filename such as “C: SASDATASOMEDATA” or “C: RESEARCHMEASLES 2009”. � Or by using a SAS library prefix: MYSASDATA. PEOPLE or RESEARCH. MEASLES 2009 where the � MYSASDATA library name = “C: SASDATA” � RESEARCH library name = “C: RESEARCH” 7 SAS Essentials - Elliott & Woodward

Creating a SAS Data Set 8 SAS Essentials - Elliott & Woodward

Creating a SAS Data Set 8 SAS Essentials - Elliott & Woodward

Two ways to refer to a permanent location LIBRARY NAME FOLDER LOCATION MYSASDATA “C:

Two ways to refer to a permanent location LIBRARY NAME FOLDER LOCATION MYSASDATA “C: SASDATA” RESEARCH “C: RESEARCH” Library Name & data set name Complete name MYSASDATA and REPORT MYSASDATA. REPORT “C: SASDATA” and REPORT “C: SASDATAREPORT” Thus… MYSASDATA and C: SASDATA refer to the same location C: SASDATAREPORT and MYSASDATA. REPORT Both refer to the SAS permanent data set named REPORT. SAS 7 BDAT located in the “C: SASDATA” folder 9 SAS Essentials - Elliott & Woodward The physical name of the file on the hard disk is REPORT. SAS 7 BDAT

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to C: RESEARCHTAB 1 � LIB 2 refers to N: NETWORK 1. 2. 10 Where is the SAS dataset LIB 1. MYDATA stored on the hard disk? How would you refer to the data set “N: NETWORKJUNE 2013. SAS 7 BDAT” using a SAS library name? SAS Essentials - Elliott & Woodward

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to C: RESEARCHTAB 1 � LIB 2 refers to N: NETWORK 1. 2. 11 Where is the SAS dataset LIB 1. MYDATA stored on the hard disk? C: RESEARCHTAB 1MYDATA. SASB 7 DAT How would you refer to the data set “N: NETWORKJUNE 2013. SAS 7 BDAT”using a SAS library name? SAS Essentials - Elliott & Woodward

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to

Quick Quiz… � You have two SAS libraries defined: � LIB 1 refers to C: RESEARCHTAB 1 � LIB 2 refers to N: NETWORK 1. 2. 12 Where is the SAS dataset LIB 1. MYDATA stored on the hard disk? C: RESEARCHTAB 1MYDATA. SASB 7 DAT How would you refer to the data set “N: NETWORKJUNE 2013. SAS 7 BDAT”using a SAS library name? LIB 2. JUNE 2013 SAS Essentials - Elliott & Woodward

3. 2 CREATING PERMANENT SAS DATA SETS USING THE WINDOWS FILE NAME TECHNIQUE �

3. 2 CREATING PERMANENT SAS DATA SETS USING THE WINDOWS FILE NAME TECHNIQUE � Open the file WRITE. SAS DATA "C: SASDATAPEOPLE"; INPUT ID $ 1 SBP 2 -4 DBP 5 -7 GENDER $ 8 AGE 910 WT 11 -13; DATALINES; Remember in the DCOLUMN. SAS 1120 80 M 15115 program this line was 2130 70 F 25180 DATA MYDATA; 3140100 M 89170 4120 80 F 30150 5125 80 F 20110 ; RUN; PROC MEANS; RUN; 13 SAS Essentials - Elliott & Woodward

WHERE IS IT STORED? � Look at the log file NOTE: There were 5

WHERE IS IT STORED? � Look at the log file NOTE: There were 5 observations read from the data set WC 000001. PEOPLE. NOTE: PROCEDURE MEANS used (Total process time): real time 0. 03 seconds cpu time 0. 01 seconds SAS created a LIBRARY named WC 000001 and put the data set there. � Look at your SAS Explorer (may have to go back a folder) � Yours may have a slightly different name � Double click on the file and verify that it contains the data, then exit the viewer. � 14 SAS Essentials - Elliott & Woodward

However…. � While in this session, you can refer to the data file as

However…. � While in this session, you can refer to the data file as either "C: SASDATAPEOPLE” or WC 000001. PEOPLE �However… WC 000001 is not a permanent library… It goes away when you end the SAS Session. �However… the C: SASDATAPEOPLE. SASB 7 DAT file REMAINS ON YOUR HARD DISK – because you have created a permanent SAS data file. �When you begin SAS again, you can still refer to the data file as "C: SASDATAPEOPLE” 15 SAS Essentials - Elliott & Woodward

Let’s use that data � Using data in a permanent data file (named SOMEDATA.

Let’s use that data � Using data in a permanent data file (named SOMEDATA. SAS 7 BDAT, located at C: SASDATA. � Open the SAS program READFILE. SAS PROC MEANS DATA='c: sasdatasomedata'; RUN; � Note that there is no INPUT or INFILE statement. � Once you have a SAS data file, you do not need to create it again using a DATA step – just use it. � 16 Run this program. SAS Essentials - Elliott & Woodward

Results Variable Label ID ID Number AGE 17 N 50 Mean Std Dev Minimum

Results Variable Label ID ID Number AGE 17 N 50 Mean Std Dev Minimum Maximum 374. 22 167. 4983 101 604 50 10. 46 2. 426133 4 15 TIME 1 TIME 2 Age on Jan 1, 2000 Baseline 6 Months 50 50 21. 268 27. 44 1. 716955 2. 659062 17 21. 3 24. 2 32. 3 TIME 3 12 Months 50 30. 492 3. 025594 22. 7 35. 9 TIME 4 24 Months 50 30. 838 3. 530733 21. 2 36. 1 STATUS Socioecono mic Status SEX 50 3. 94 1. 331104 1 5 50 0. 494872 0 1 SAS Essentials - Elliott & Woodward

3. 3 CREATING PERMANENT SAS DATA SETS USING A SAS LIBRARY � Another technique

3. 3 CREATING PERMANENT SAS DATA SETS USING A SAS LIBRARY � Another technique for reading and writing SAS data sets is using the SAS Library technique. � Before using an SAS Library, you have to "create" one. � The SAS Library technique involves creating a nickname that refers to a drive location called (in SAS terminology) a library name. � For example, you could defina a library name such as MYWORK that refers to C: LOTSOFFILES � Or, you could create the MYWORK library name that refers to a complicated networkfolder such as N: MYNETACCOUNTINGSTATMENTSMYFOLDERRESEARCH 18 SAS Essentials - Elliott & Woodward

Using a Permanent Library Nickname � Once your library "nickname" is set up, you

Using a Permanent Library Nickname � Once your library "nickname" is set up, you can use the short name to refer to your disk location rather than having to use some complex Windows path name every time. With a library name, you could use the file designation: MYWORK. JAN 2016 �to refer to an SAS data file named JAN 2016. SAS 7 BDAT rather than the complicated name: "N: MYNETACCOUNTINGSTATMENTSMYFOLDERRESEARCHJAN 2016" 19 SAS Essentials - Elliott & Woodward

Work vs Permanent Data Files – How they are named � Within the SAS

Work vs Permanent Data Files – How they are named � Within the SAS program, every SAS data set has two parts to its name. For example, the SAS data set referred to as MYSASLIB. MAY 2000 consists of the following: � the library named MYSASLIB; � the data set named MAY 2000. � Even temporary files can be referred to with a temporary) library name, WORK. Therefore, a file named WORK. LOTSADATA consists of: � the library named WORK (refers to the temporary library); � the data set name is LOTSADATA. 20 SAS Essentials - Elliott & Woodward

3. 4 CREATING A SAS LIBRARY USING A DIALOG BOX � To use the

3. 4 CREATING A SAS LIBRARY USING A DIALOG BOX � To use the SAS Library file reference, you first have to “create” a library. � An easy way to create an SAS library in the Windows version of SAS (with a custom name of your own choosing) is to use the New Library dialog box. . 21 SAS Essentials - Elliott & Woodward

Display the SAS Library Dialog Box � To display the SAS Library dialog box,

Display the SAS Library Dialog Box � To display the SAS Library dialog box, make the SAS Explorer window active (click the Explorer tab at the bottom of the left window in the main SAS screen). You should see a window something like this… 22 SAS Essentials - Elliott & Woodward

� From the Windows Menu, choose File/New. This dialog dox should appear. Click Library,

� From the Windows Menu, choose File/New. This dialog dox should appear. Click Library, and Ok. The Library Dialog appears (next slide) 23 SAS Essentials - Elliott & Woodward

Figure 3. 2 New Library Dialog Box (p. 55) � Enter the information shown

Figure 3. 2 New Library Dialog Box (p. 55) � Enter the information shown on this slide. 24 SAS Essentials - Elliott & Woodward

Check to see what files are in the library… � On the left side

Check to see what files are in the library… � On the left side of your SAS screen, click the Explorer tab. You should see a Libraries icon that resembles a filing cabinet. � Double-click the Libraries icon. A new window called Active Libraries appears. You should see a library named MYSASLIB. � Double-click the MYSASLIB icon to display the contents of the MYSASLIB window. 25 SAS Essentials - Elliott & Woodward

 Contents of the MYSASLIB window � Yours may differ slightly… These are the

Contents of the MYSASLIB window � Yours may differ slightly… These are the names of the SAS data sets that should be in your C: SASDATA folder (now also referred to as the library MYSASLIB) when you installed the data sets for the book. 26 SAS Essentials - Elliott & Woodward

SOMEDATA Data Set � One data sets is named SOMEDATA. Double-click the SOMEDATA icon

SOMEDATA Data Set � One data sets is named SOMEDATA. Double-click the SOMEDATA icon to display its contents. (Partially shown here. ) 27 SAS Essentials - Elliott & Woodward

Use a Permanent Data Set in Code � Close the MYSASLIB. SOMEDATA data set

Use a Permanent Data Set in Code � Close the MYSASLIB. SOMEDATA data set by selecting File - Close or by clicking the X at the top right of the window. � Return to the SAS Editor window and enter this code: PROC MEANS DATA=MYSASLIB. SOMEDATA; RUN; � Run this program. Observe how you used the Library name MYSASLIB to refer to the SOMESDATA data set. 28 SAS Essentials - Elliott & Woodward

Another Example. � Using the WRITE. SAS code, change the DATA statement: DATA MYSASLIB.

Another Example. � Using the WRITE. SAS code, change the DATA statement: DATA MYSASLIB. PEOPLE 2; INPUT ID $ 1 SBP 2 -4 DBP 5 -7 GENDER $ 8 AGE 910 WT 11 -13; DATALINES; Change the DATA set 1120 80 M 15115 name to PEOPLE 2 using the library name 2130 70 F 25180 MYSASLIB. 3140100 M 89170 Run the program. This 4120 80 F 30150 creates a data set 5125 80 F 20110 named PEOPLE 2 in your ; MYSASLIB Library. Check it out… RUN; PROC MEANS; RUN; 29 SAS Essentials - Elliott & Woodward

MYSASLIB. People 2 Created � Notice that the SAS data set named PEOPLE 2

MYSASLIB. People 2 Created � Notice that the SAS data set named PEOPLE 2 has appeared in the MYSASLIB library. � Double click on it to verify its contents, then close the viewer. 30 SAS Essentials - Elliott & Woodward

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA;

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA; RUN; 1. 2. 31 Where on your hard drive is the data set SOMEDATA? What is its file name? SAS Essentials - Elliott & Woodward

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA;

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA; RUN; 1. 2. 32 Where on your hard drive is the data set SOMEDATA? C: SASDATA What is its file name? SAS Essentials - Elliott & Woodward

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA;

Quick Quiz You run this code: LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA; RUN; Where on your hard drive is the data set SOMEDATA? C: SASDATA 2. What is its file name? SOMEDATA. SAS 7 BDAT Or C: SASDATASOMEDATA. SAS 7 BDAT 1. 33 SAS Essentials - Elliott & Woodward

3. 5 CREATING A SAS LIBRARY USING CODE � You can also create a

3. 5 CREATING A SAS LIBRARY USING CODE � You can also create a TEMPORARY SAS library reference in code using the following technique The LIBNAME statement creates a SAS Library. LIBNAME NEWLIB "C: SASDATA"; PROC MEANS DATA=NEWLIB. SOMEDATA; RUN; � This creates the library named NEWLIB (with the same location as MYSASLIB). However, this library reference is lost when you end the SAS session. � Run this program to verify that it works. Notice the NEWLIB icon in the Explorer. (If you look at the files in NEWLIB, they are the same as the ones in MYSASLIB. ) 34 SAS Essentials - Elliott & Woodward

DO HANDS-ON EXERCISE P 58 � Now do these follow-up exercises � Create the

DO HANDS-ON EXERCISE P 58 � Now do these follow-up exercises � Create the following SAS libraries using the wizard: MYLIB 1, MYLIB 2, MYLIB 3, MYLIB 4, MYLIB 5 � Create the following SAS libraries using the LIBNAME command in code: MYLIBA, MYLIBB, MYLIBC, MYLIBD, MYLIBE 35 SAS Essentials - Elliott & Woodward

3. 6 USING DATA IN PERMANENT SAS DATA SETS � After you have created

3. 6 USING DATA IN PERMANENT SAS DATA SETS � After you have created an SAS library (either a permanent or a temporary one), you can access data in that library within SAS procedures or as input into other DATA steps. For example, the following are three different statements that all access the same data set: PROC MEANS DATA='C: SASDATASOMEDATA'; RUN; � Or PROC MEANS DATA=MYSASLIB. SOMEDATA; RUN; � Or PROC MEANS DATA=MYLIB 2. SOMEDATA; RUN; 36 SAS Essentials - Elliott & Woodward

Do Hands On Exercise p 59… PROC MEANS DATA='c: sasdatasomedata'; RUN; Changed to Notice

Do Hands On Exercise p 59… PROC MEANS DATA='c: sasdatasomedata'; RUN; Changed to Notice quotes on 1 st but not the 2 nd (Library) version. PROC MEANS DATA=MYSASLIB. SOMEDATA RUN; � � 37 Note that there are quotes around the filename when given using a Windows path, but not when you use the library name. Run this program. (Same results as previous. ) SAS Essentials - Elliott & Woodward

3. 7 IMPORTING DATA FROM ANOTHER PROGRAM � Before you import data from another

3. 7 IMPORTING DATA FROM ANOTHER PROGRAM � Before you import data from another file format (such as Excel) you should make sure the file is ready to import. Most files to be imported should follow these criteria: 1. The first row should be SAS friendly variable names 2. Each column should contain data that are consistent to the variable type (character, number, date) desired. 38 SAS Essentials - Elliott & Woodward

Importing a CSV File � The first row contains the names of the SAS

Importing a CSV File � The first row contains the names of the SAS variables, each separated by a comma, and each adhering to SAS naming conventions. (These were discussed in Chapter 2. ) � Beginning on the second line of the file, each line contains the data values for one subject, where each value is separated by a comma. � Two commas in a row indicate a missing value. You could also designate numeric data values that are unknown or missing with a dot (. ) or a missing value code (such as -99). Missing character values can be represented by a double quote such as “”. 39 SAS Essentials - Elliott & Woodward

CSV file ready for import � Note that line 1 contains the 13 names

CSV file ready for import � Note that line 1 contains the 13 names of the variables separated by commas and each following line contains 13 values each separated by a comma, consistent with the data type of the variable. � (Note also that in this case, it is okay that some values contain blanks, such as Civic Hybrid because it is the comma that marks a new value, not a blank. ) 40 SAS Essentials - Elliott & Woodward

Importing using the Import Wizard � Select File/Import data (Hands On Example p 61)

Importing using the Import Wizard � Select File/Import data (Hands On Example p 61) 41 SAS Essentials - Elliott & Woodward

Examine imported file � PROC 42 PRINT DATA=MPG_FOR_CARS; RUN; SAS Essentials - Elliott &

Examine imported file � PROC 42 PRINT DATA=MPG_FOR_CARS; RUN; SAS Essentials - Elliott & Woodward

Importing Data Using the SAS Code � When you used the Wizard, the last

Importing Data Using the SAS Code � When you used the Wizard, the last question was if you wanted to save the code. This is the code SAS creates and uses to do the import. � If you save the code, you can use it again to import the same file, or change it for other imports. � The following code was saved from the Wizard for an import using the Excel. xls import option 43 SAS Essentials - Elliott & Woodward

SAS code to Import CSV � Open the file IMPORTEXAMPLE. SAS Name of resulting

SAS code to Import CSV � Open the file IMPORTEXAMPLE. SAS Name of resulting file PROC IMPORT OUT= WORK. MPG_FOR_CARS DATAFILE= "C: SASDATACARSMPG. CSV" DBMS=CSV REPLACE; GETNAMES=YES; DATAROW=2; Name of file to import RUN; Variables names are on the first line 44 SAS Essentials - Elliott & Woodward Type of Import… in this case CSV

Import Excel File (page 66) �Open IMPORTEXAMPLE 2. SAS PROC IMPORT OUT= WORK. FROMXL

Import Excel File (page 66) �Open IMPORTEXAMPLE 2. SAS PROC IMPORT OUT= WORK. FROMXL DATAFILE= "C: SASDATAEXAMPLE. XLS" DBMS=XLS REPLACE; SHEET="Database"; GETNAMES=YES; RUN; Note XLS file type specified For an Excel Import, it is important to specify the sheet name. 45 SAS Essentials - Elliott & Woodward

Identifiers for Importing Files 46 SAS Essentials - Elliott & Woodward

Identifiers for Importing Files 46 SAS Essentials - Elliott & Woodward

Exporting Data Using the SAS Code PROC EXPORT DATA=dataset OUTFILE='fileneme' or OUTTABLE='tablename' DBMS=dbmsidentifier <LABEL><REPLACE>;

Exporting Data Using the SAS Code PROC EXPORT DATA=dataset OUTFILE='fileneme' or OUTTABLE='tablename' DBMS=dbmsidentifier <LABEL><REPLACE>; � Do Hands on Exercise p 66 (EXPORT. SAS) 47 SAS Essentials - Elliott & Woodward

3. 8 DISCOVERING CONTENTS OF A DATA FILE � Many “canned” data sets come

3. 8 DISCOVERING CONTENTS OF A DATA FILE � Many “canned” data sets come in the SAS format (SASB 7 DAT). � One way to examine the contents is (See Hands On Example p 68) PROC DATASETS; CONTENTS DATA = MYSASLIB. SOMEDATA; or PROC DATASETS; CONTENTS DATA = “C: /SASDATA/SOMEDATA”; 48 SAS Essentials - Elliott & Woodward

Continue to Chapter 3, Part 2 � 3. 9 GOING DEEPER: UNDERSTANDING HOW THE

Continue to Chapter 3, Part 2 � 3. 9 GOING DEEPER: UNDERSTANDING HOW THE DATA STEP READS AND STORES DATA 49 SAS Essentials - Elliott & Woodward

These slides are based on the book: Introduction to SAS Essentials Mastering SAS for

These slides are based on the book: Introduction to SAS Essentials Mastering SAS for Data Analytics, 2 nd Edition By Alan C, Elliott and Wayne A. Woodward Paperback: 512 pages Publisher: Wiley; 2 edition (August 3, 2015) Language: English ISBN-10: 111904216 X ISBN-13: 978 -1119042167 These slides are provided for you to use to teach SAS using this book. Feel free to modify them for your own needs. Please send comments about errors in the slides (or suggestions for improvements) to acelliott@smu. edu. Thanks. 50 SAS ESSENTIALS -- Elliott & Woodward