The Complete Guide to SAS Indexes by Michael
The Complete Guide to SAS Indexes by Michael A. Raithel Reviewed by Horst Wolter Slide 1
About the Author • Wrote two previous SAS books and over twenty SAS technical papers. • Section Chair at SUGI, SESUG and NESUG and co-chaired NESUG in 1995. • First book entitled Tuning SAS Applications in the MVS Environment resides in the Smithsonian Institution of American History’s Permanent Research Collection of Information Technology. Slide 2
Goals of Indexing • The main goal of using a SAS index is to read only a small portion of a large SAS dataset instead of reading the entire SAS dataset. • Reduce I/Os. • Reduce wall clock time. • Reduce CPU time. Slide 3
When to Index Consider • the size of the subsets (the overhead of using an index can become greater than the overhead of a sequential read of the entire dataset) • frequency of use (is building an index cost effective) • variability of the data (indexed SAS datasets have an overhead to maintain if data changes often and may not be cost effective) Slide 4
Indexing Guidelines Subset Size (Table 2. 1 from book) Indexing Action 1% to 15% An index will definitely improve processing. There should be dramatic resource savings in the lower end of this range. 16% to 30% An index will improve processing. However, the resource savings will not be as dramatic as in the lower range. 31% to 60% An index may improve processing, or it might worsen processing. Be very careful in this subset range. 61% to 100% Do not use an index. A sequential read of the entire dataset is very likely to be more efficient. Slide 5
Index Variable Selection What variables to consider for an index? • Often used to subset data • Values that represent a small subset • A SAS dataset sorted by the index variable is more efficient Slide 6
How to create an Index • Data Step Method DATA data-set-name(INDEX=(index-name=(var 1 var 2 etc. ) </UNIQUE> </NOMISS>)); • SQL Method proc sql; CREATE <UNIQUE> INDEX index-name ON table-name ( column <, . . . column>); quit; Slide 7
Viewing Index Information proc contents data=data-set-name <centiles>; run; Indexes 1 Alphabetic List of Indexes and Attributes # 1 Index var 1 # of Unique Values 1535958 Slide 8
Using Indexes • With a WHERE clause • With a BY statement SAS determines whether or not to use an index based on a number of rules as described in the book. Slide 9
Other Topics covered in the book • Composite indexes • Creating indexes while using PROC SORT, PROC DATASETS • Creating Multiple Indexes • Rules for SAS Using an Index • Removing Indexes • Index options e. g. Unique Option, No. Miss Option • Recovering Missing Index Files • Repairing Damaged Index Files • And more Slide 10
From SAS Press Slide 11
Questions? Slide 12
- Slides: 12