An Introduction to SQL Structured Query Language SQL

  • Slides: 18
Download presentation
An Introduction to SQL

An Introduction to SQL

Structured Query Language (SQL) is a standardized language originally designed as a relational database

Structured Query Language (SQL) is a standardized language originally designed as a relational database query tool. SQL is currently used in software products to retrieve and update data. 2

Structured Query Language: Timeline 1970 1980 1990 2000 IBM develops SQL. 1970 – Dr.

Structured Query Language: Timeline 1970 1980 1990 2000 IBM develops SQL. 1970 – Dr. E. F. Codd of IBM proposes SQL. 1981 – First commercial SQL product is released. 1989 – More than 75 SQL-based systems exist. SAS 6. 06 includes PROC SQL. 1999 – PROC SQL is enhanced for SAS 8. 2004 – PROC SQL is enhanced for SAS® 9. 3

The SQL Procedure Enables the use of SQL in SAS Part of Base SAS

The SQL Procedure Enables the use of SQL in SAS Part of Base SAS Follows American National Standards Institute (ANSI) standards Includes enhancements for compatibility with SAS software 4

PROC SQL Features 5

PROC SQL Features 5

Query SAS data sets proc sql ; select * from orion. employee_payroll having salary=max(salary)

Query SAS data sets proc sql ; select * from orion. employee_payroll having salary=max(salary) ; quit;

Generate reports from SAS data sets proc sql ; select mean(salary) "Average Salary" format=dollar

Generate reports from SAS data sets proc sql ; select mean(salary) "Average Salary" format=dollar 12. , employee_gender from orion. employee_payroll group by employee_gender ; quit;

Combine SAS data sets in many ways Inner Joins Outer Joins Left Full Right

Combine SAS data sets in many ways Inner Joins Outer Joins Left Full Right

Inner joins Return only matching rows Maximum of 256 tables can be joined at

Inner joins Return only matching rows Maximum of 256 tables can be joined at the same time. 9

Outer Joins Return all matching rows, plus nonmatching rows from one or both tables

Outer Joins Return all matching rows, plus nonmatching rows from one or both tables Can be performed on only two tables or views at a time. Left Full Right 10

Create and delete SAS data sets, views, and indexes proc sql ; create table

Create and delete SAS data sets, views, and indexes proc sql ; create table newbp as select mean(BPXSY 1, BPXSY 2, BPXSY 3, BPXSY 4) as mnsbp, mean(BPXDI 1, BPXDI 2, BPXDI 3, BPXDI 4) as mndbp, seqn from nh 9. bloodpressure ; select n(mnsbp) "mnsbp", n(mndbp) "mndbp" from newbp ; quit;

Update existing SAS data sets data tmp; input x b $ @@; datalines; 1

Update existing SAS data sets data tmp; input x b $ @@; datalines; 1 a 1 1 a 2 2 b 1 2 b 2 4 d ; proc print data=tmp; title "tmp"; run; title; proc sql; update tmp set x=x*2 where b contains "a"; title "tmp"; select * from tmp; quit; title;

Access Meta Data proc sql ; select memname, label from dictionary. columns where libname="FRAM"

Access Meta Data proc sql ; select memname, label from dictionary. columns where libname="FRAM" and upcase(label) contains "CHOL"; quit;

Create Macro Variables proc sql ; select mean(age) into : mnage from framexam 5

Create Macro Variables proc sql ; select mean(age) into : mnage from framexam 5 subset ; quit; %put Average age: &mnage;

(Sometimes) reproduce the results of multiple DATA and procedure steps with a single query

(Sometimes) reproduce the results of multiple DATA and procedure steps with a single query proc sql; create table analysis as select a. seqn, mortstat=1 as dead, permth_exm, mean(BPXSY 1, BPXSY 2, BPXSY 3, BPXSY 4) as mnsbp, mean(BPXDI 1, BPXDI 2, BPXDI 3, BPXDI 4) as mndbp, riagendr=1 as male, ridageyr as age, ridreth 2 as race_ethn, lbdhdl as hdl, lbxtc as chol, bmxbmi as bmi from nh 9. mortality(keep=seqn eligstat mortstat permth_exm) a, nh 9. bloodpressure(keep=seqn bpxsy 1 -bpxsy 4 BPXDI 1 -BPXDI 4) b, nh 9. demographics (keep=seqn ridageyr riagendr RIDRETH 2) c, nh 9. bodymeasurements(keep=seqn bmxbmi) d, nh 9. cholesterolhdl(keep= seqn LBDHDL LBXTC) e where eligstat eq 1 and a. seqn=b. seqn and b. seqn=c. seqn and c. seqn=d. seqn and d. seqn=e. seqn order by seqn ; quit;

Structured Query Language Input Output Report SAS Data Set DBMS Table SAS Data View

Structured Query Language Input Output Report SAS Data Set DBMS Table SAS Data View PROC SQL SAS Data Set SAS Data View DBMS Table 16

Terminology Data Processing SAS SQL File Data Set Table Record Observation Row Field Variable

Terminology Data Processing SAS SQL File Data Set Table Record Observation Row Field Variable Column 17

The SQL Procedure Tool for querying data Tool for data manipulation and management An

The SQL Procedure Tool for querying data Tool for data manipulation and management An augmentation to the DATA step not A DATA step replacement 18