Introduction to SAS ISYS 650


What Is SAS?
• SAS is a collection of modules that are used to process and analyze data.
• It began in the late 1960s and early 1970s as a statistical package (Statistical Analysis System).
• SAS is also an extremely powerful, general-purpose programming language.
• In recent years, it has been enhanced to provide state-of-the-art data mining tools and programs for Web development and analysis.

Data-Driven Tasks
• The functionality of the SAS System is built around the four data-driven tasks common to virtually any application:
  1. data access:
     – addresses the data required by the application
  2. data management:
     – shapes data into a form required by the application
  3. data analysis:
     – summarizes, reduces, or otherwise transforms raw data into meaningful and useful information
  4. data presentation:
     – communicates information in ways that clearly demonstrate its significance

An Overview of SAS Data Processing
DATA steps are used to create SAS data sets.
PROC steps are used to process SAS data sets.
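As a minimal sketch of how the two kinds of steps fit together, the program below reads sashelp.class, a sample table that ships with SAS. The data set name class_copy and the variable HeightCm are illustrative assumptions and are not part of the course material.

```sas
/* DATA step: create a temporary data set (stored in the Work library). */
DATA class_copy;
    SET sashelp.class;            /* read the sample table shipped with SAS */
    HeightCm = Height * 2.54;     /* Height is recorded in inches           */
RUN;

/* PROC step: process (here, list) the data set created above. */
PROC PRINT DATA=class_copy;
RUN;
```

Each step ends at its RUN statement, so the DATA step finishes building class_copy before PROC PRINT reads it.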

Explore the SAS workspace
• When you first start SAS, the five main SAS windows open:
  – Explorer
  – Results
  – Program Editor (or Editor)
  – Log
  – Output
• Menu:
  – Tools: New Library

Demo
• Creating a new library:
  – Tools/New Library:
    • Name
    • Folder
    • Enable at startup
• Import a table into the new library from an MS Access database and create a SAS data set:
  – File/Import Data
• Open a SAS data set: a SAS data set (also called a table) is a file containing descriptor information and related data values. The file is organized as a table of observations (rows) and variables (columns) that SAS can process.
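The Tools/New Library dialog has a code counterpart, the LIBNAME statement. The sketch below assumes a hypothetical folder C:\Mydata that already exists; the libref mydata is chosen only to match the later slides.

```sas
/* Assign the libref "mydata" to a folder (the path is a placeholder). */
LIBNAME mydata "C:\Mydata";

/* List the SAS data sets stored in that library. */
PROC CONTENTS DATA=mydata._ALL_ NODS;
RUN;
```

A libref assigned this way lasts only for the current SAS session, whereas the "Enable at startup" option in the dialog asks SAS to reassign it each time it starts.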

Components of SAS programs
• DATA steps typically create or modify SAS data sets. For example, you can use DATA steps to
  – put your data into a SAS data set
  – compute the values for new variables
  – check for and correct errors in your data
  – produce new SAS data sets by subsetting, merging, and updating existing data sets.
• PROC (procedure) steps typically analyze and process data in the form of a SAS data set, and they sometimes create SAS data sets that contain the results of the procedure.

A program accessing the SAS data set named “student” in the Mydata library

    DATA myStudent;
        SET Mydata.student;
    RUN;
    PROC PRINT DATA=myStudent;
    RUN;

Note: The DATA statement creates a temporary data set, myStudent, by reading the “student” data set in the Mydata library. Temporary data sets are stored in the Work library.

SAS Data Access
1. Import Wizard: File/Import Data
   Demo: Access, Excel

Process SAS Data Set
1. Reference the library name:

    PROC PRINT DATA=mydata.emp;
    RUN;

2. Reference the Windows file name directly:

    PROC PRINT DATA="c:\mydata\emp";
    RUN;

3. Create a temporary SAS data set from an existing SAS data set:

    DATA myStudent;
        *USE Mydata.student;
        *USE "C:\Mydata\student";
        *SET Mydata.student;
        SET "C:\Mydata\student";
    RUN;
    PROC PRINT DATA=myStudent;
    RUN;

Note: To add a line comment, start the statement with “*” and end it with “;”. To add a block comment, use /* … */.

Creating a Permanent SAS Data Set by Using a Windows File Name or Library.FileName in the DATA Statement

    DATA "c:\MyData\myStudent";
        *USE Mydata.student;
        *USE "C:\Mydata\student";
        *SET Mydata.student;
        SET "C:\Mydata\student";
    RUN;

    DATA MyData.myStudent;
        SET "C:\Mydata\student";
    RUN;

Note: This example creates a new permanent data set from the “student” data set in the MyData library.

Creating a Data Set Using the INPUT Statement
Temporary data set:

    DATA StGPA;
        INPUT SID $ Sname $ GPA;
        DATALINES;
    S1 Peter 3.2
    S2 Paul 2.8
    S3 Mary 3.0
    ;
    RUN;

Permanent data set:

    DATA MyData.StGPA2;
        INPUT SID $ Sname $ GPA;
        DATALINES;
    S1 Peter 3.2
    S2 Paul 2.8
    S3 Mary 3.0
    ;
    RUN;

SAS Data Access
2. Using ODBC with PROC SQL:

    PROC SQL;
        CONNECT TO ODBC (DSN='MySalesDB2007');
        CREATE TABLE temp_sas AS
            SELECT * FROM CONNECTION TO ODBC
                (SELECT * FROM Customer);
        DISCONNECT FROM ODBC;
    QUIT;
    DATA Customer;
        SET Work.temp_sas;
    RUN;
    PROC PRINT DATA=Customer;
    RUN;

Note: The CREATE TABLE statement creates a SAS data set from the Customer table.

Create a SAS Data Set as the Result of a SQL Join Statement

    PROC SQL;
        CONNECT TO ODBC (DSN='MySalesDB2007');
        CREATE TABLE temp_sas AS
            SELECT * FROM CONNECTION TO ODBC
                (SELECT Customer.CID, Cname, OID, Odate
                 FROM Customer, Orders
                 WHERE Customer.cid = Orders.cid);
        DISCONNECT FROM ODBC;
    QUIT;
    DATA CustomerOrder;
        SET Work.temp_sas;
    RUN;
    PROC PRINT DATA=CustomerOrder;
    RUN;

SAS Data Management
• Create calculated fields
• Use DROP and KEEP to select fields
• Create a subset of a data set
• Append two data sets
• Merge data sets
  – Equivalent to SQL outer join

Creating Calculated Field
• Arithmetic operators:
  – +, -, *, /, **
• Using SAS functions:
  – ABS, INT, SQRT, ROUND
  – Date functions:
    • TODAY(): returns the current date
    • INTCK('interval', from, to)
      – The 'interval' can be: DAY, WEEK, MONTH, QTR, YEAR
      – Example: Age = intck('year', dob, today());
    • YEAR, MONTH, QTR
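One detail worth seeing in code: by default INTCK counts how many interval boundaries lie between the two dates, not how much time has elapsed, so the age formula above can be one year too high before a person's birthday. The sketch below uses made-up date literals and variable names purely for illustration.

```sas
DATA _NULL_;
    /* Only one day apart, but one year boundary (1 January) is crossed. */
    years_discrete = INTCK('year', '31DEC2000'd, '01JAN2001'd);

    /* Both dates fall in the same month, so no month boundary is crossed. */
    months_discrete = INTCK('month', '01JAN2001'd, '31JAN2001'd);

    /* The optional 'continuous' method counts whole intervals measured
       from the start date instead of calendar boundaries. */
    years_continuous = INTCK('year', '31DEC2000'd, '01JAN2001'd, 'continuous');

    /* Write the three values to the SAS log. */
    PUT years_discrete= months_discrete= years_continuous=;
RUN;
```

For an age calculation, the 'continuous' method is usually closer to what is intended, since it only counts a year once the anniversary of the start date has been reached.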

Examples

    DATA GPAGroup;
        SET work.Mystudent;
        IF GPA < 2.0 THEN scholarship = 1000;
        ELSE scholarship = 3000;
        IF GPA < 2.0 THEN GPAGrp = 'Poor';
        ELSE GPAGrp = 'Good';
    RUN;
    PROC PRINT DATA=GPAGroup;
    RUN;

    DATA AgeGroup;
        SET Mydata.student2;
        Age = YEAR(TODAY()) - YEAR(DOB);
    RUN;
    PROC PRINT DATA=AgeGroup;
    RUN;

DROP/KEEP

    DATA Student;
        SET work.Mystudent;
        DROP Gender DOB;
    RUN;
    PROC PRINT DATA=Student;
    RUN;
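The slide shows only DROP. KEEP is the mirror image: it lists the variables to retain and discards the rest. The sketch below assumes that work.Mystudent contains SID, Sname, and GPA; the slides do not list its columns, so these names and the output data set names are assumptions.

```sas
/* Keep only the listed variables; all others are left out of StudentGPA. */
DATA StudentGPA;
    SET work.Mystudent;
    KEEP SID Sname GPA;
RUN;

/* The same selection written as a data set option on the input,
   so the unwanted variables are never read into the DATA step. */
DATA StudentGPA2;
    SET work.Mystudent (KEEP=SID Sname GPA);
RUN;
```

Which form to use is largely a matter of taste for small tables; the data set option can save work on wide tables because the dropped variables are not loaded at all.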

Subset a Data Set with IF

    DATA highIncome;
        SET Mydata.Emp;
        IF Salary > 60000;
    RUN;
    PROC PRINT DATA=highIncome;
    RUN;

Vertically Merging Two Data Sets (Append)

    DATA StDOB;
        SET Mydata.Student;
        Name = Sname;
        KEEP Name DOB;
    RUN;
    DATA EmpDOB;
        SET Mydata.Emp;
        DOB = Birthdate;
        KEEP Name DOB;
    RUN;
    DATA AllDOB;
        SET StDOB EmpDOB;
    RUN;
    PROC PRINT DATA=AllDOB;
    RUN;

Horizontally Merging Two Data Sets
(1. Both data sets must be sorted by the same field; 2. this operation is equivalent to a SQL outer join)

    PROC SQL;
        CONNECT TO ODBC (DSN='MySalesDB2007');
        CREATE TABLE temp_sas AS
            SELECT * FROM CONNECTION TO ODBC
                (SELECT * FROM Customer);
        CREATE TABLE temp_sas2 AS
            SELECT * FROM CONNECTION TO ODBC
                (SELECT * FROM Orders);
        DISCONNECT FROM ODBC;
    QUIT;
    PROC SORT DATA=Work.temp_sas;
        BY CID;
    RUN;
    PROC SORT DATA=Work.temp_sas2;
        BY CID;
    RUN;
    DATA CustomerOrders;
        MERGE temp_sas temp_sas2;
        BY CID;
        KEEP CID CNAME OID ODATE SALESPERSON;
    RUN;
    PROC PRINT DATA=CustomerOrders;
    RUN;
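Because MERGE with BY keeps every key value found in either input, the result behaves like a full outer join. The IN= data set option can narrow that to matching rows only. The sketch below uses small made-up tables in place of the ODBC queries; all data set names, flag names, and values are illustrative assumptions.

```sas
/* Made-up customers; C2 and C3 have no orders. */
DATA cust;
    INPUT CID $ Cname $;
    DATALINES;
C1 Ann
C2 Bob
C3 Cara
;
RUN;

/* Made-up orders; C4 has no matching customer. */
DATA ord;
    INPUT OID $ CID $;
    DATALINES;
O1 C1
O2 C1
O3 C4
;
RUN;

/* Both inputs must be sorted by the BY variable before merging. */
PROC SORT DATA=cust; BY CID; RUN;
PROC SORT DATA=ord;  BY CID; RUN;

/* No subsetting IF: every CID from either table appears in the result. */
DATA full_join;
    MERGE cust ord;
    BY CID;
RUN;

/* IN= creates temporary flags showing which table contributed to the row. */
DATA inner_join;
    MERGE cust (IN=inCust) ord (IN=inOrd);
    BY CID;
    IF inCust AND inOrd;   /* keep only customers that have orders */
RUN;
```

Changing the subsetting IF to `IF inCust;` would keep all customers whether or not they have orders, which corresponds to a left join.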

A few SAS PROCs
• PROC PRINT
• PROC SORT
• PROC MEANS
• PROC SQL
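PROC PRINT and the pass-through form of PROC SQL appear in the earlier slides; the sketch below shows PROC SORT, PROC MEANS, and a plain PROC SQL query on sashelp.class, a sample table that ships with SAS. The output names class_sorted, AvgHeight, and AvgWeight are illustrative assumptions.

```sas
/* PROC SORT: write a sorted copy so the sample table itself is untouched. */
PROC SORT DATA=sashelp.class OUT=class_sorted;
    BY Sex Age;
RUN;

/* PROC MEANS: summary statistics for Height and Weight within each Sex. */
PROC MEANS DATA=class_sorted N MEAN MIN MAX;
    CLASS Sex;
    VAR Height Weight;
RUN;

/* PROC SQL: the same group means expressed as a query. */
PROC SQL;
    SELECT Sex,
           MEAN(Height) AS AvgHeight,
           MEAN(Weight) AS AvgWeight
    FROM class_sorted
    GROUP BY Sex;
QUIT;
```

PROC SQL ends with QUIT rather than RUN, because each SQL statement executes as soon as it is submitted.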