Developing Client/Server Applications to Maximize SAS 9 Parallel Capabilities

- Slides: 36
Developing Client/Server Applications to Maximize SAS® 9 Parallel Capabilities
Cheryl Doninger, SAS Institute
Copyright © 2003, SAS Institute Inc. All rights reserved.
The SAS Intelligence Value Chain
- Usability
- Interoperability
- Manageability
- Scalability
Scalability – SAS 9: SAS Scalable Architecture
[Diagram labels: Clients; Scalable Servers (Scalable Performance Data Server, OLAP, Metadata, Stored Process); Scalable SAS/ACCESS (Teradata, Sybase, DB2, Oracle); Piping; SAS/CONNECT; Remote Host; CPU 1, CPU 2; Threaded Procedures (THREAD 1, THREAD 2, … THREAD N)]
Scalability with SAS 9
- parallel processes
- parallel threads
Single Threaded V8 SAS
Multiple Processes
SAS 9 Multiple Threads
Multiple Processes and Multiple Threads
A Very Satisfied MP CONNECT Customer…
"I've been dreaming of this capability within SAS for approximately 12 years. The first day back in the office after the course, within 30 minutes I was able to apply the technique to an existing program and reduce processing time by over 50%."
David Walker, Centers for Disease Control and Prevention
Independent Parallelism
[Figure: elapsed-time axis starting at 0; labels: Data Source A, Data Source B, Proc Sort]
MP CONNECT – Independent Scale Up
[Diagram labels: Extract Oracle Data; PROC STEP; DATA STEP; Read and Summarize SAS Data (twice); Execute Simultaneously; SMP Server]
MP CONNECT – Independent Scale Out
[Diagram labels: Parent; SAS Session 2; SAS Session n]
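The idea behind independent parallelism on the slides above can be sketched in a few lines. This is a language-neutral analogy written in Python, not SAS or MP CONNECT syntax: the `summarize` function, the `run_independent` helper, and the two in-memory sources are hypothetical stand-ins for independent "read and summarize" tasks. The slides describe separate SAS sessions, so a process pool would be the closer analogue; a thread pool is assumed here only to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(rows):
    """Hypothetical stand-in for one independent 'read and summarize' task."""
    return {"n": len(rows), "total": sum(rows)}


def run_independent(sources):
    """Start one task per data source, then wait for all of them to finish."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # Submit every task before waiting on any result, so they can overlap.
        futures = {name: pool.submit(summarize, rows)
                   for name, rows in sources.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical data sources A and B; neither task depends on the other.
sources = {"A": [1, 2, 3, 4], "B": [10, 20, 30]}
results = run_independent(sources)
```

Because neither task needs the other's output, the parent only has to wait once, after both have been started.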
Piping – Worth the Price of Admission to SAS 9…
"…piping is the big one that has made a difference to our day - jobs have been cut by up to 60% meaning we can deliver in a much quicker time frame at end of month."
Charles Pollack, SUNCORP METWAY
Pipeline Parallelism
[Figure: elapsed-time axis starting at 0; labels: Data Step, Proc Sort]
MP CONNECT – Piping – Scale Up
[Diagram labels: DATA STEP (twice); PROC STEP; Read and Summarize SAS Data; Overlapped Execution; SMP Server]
MP CONNECT – Piping – Scale Out
[Diagram labels: Parent; SAS Session 2; SAS Session n]
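Pipeline parallelism, where a downstream step starts consuming rows while the upstream step is still producing them, can be illustrated with a producer and a consumer joined by a bounded queue. This is a Python analogy under stated assumptions, not SAS piping syntax: `producer`, `consumer`, `run_pipeline`, the doubling transform, and the buffer size are all hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the row stream


def producer(rows, pipe):
    """Stand-in for the first step: transform each row and pass it on."""
    for row in rows:
        pipe.put(row * 2)
    pipe.put(_DONE)


def consumer(pipe, out):
    """Stand-in for the second step: read rows as soon as they arrive."""
    while True:
        item = pipe.get()
        if item is _DONE:
            break
        out.append(item)
    out.sort()  # the final ordering can only complete once all rows are in


def run_pipeline(rows):
    pipe = queue.Queue(maxsize=4)  # small in-memory buffer between the steps
    out = []
    first = threading.Thread(target=producer, args=(rows, pipe))
    second = threading.Thread(target=consumer, args=(pipe, out))
    first.start()
    second.start()
    first.join()
    second.join()
    return out


piped = run_pipeline([3, 1, 2])
```

The bounded queue is the design point: the producer blocks when the buffer is full, so the two steps overlap without the first having to finish before the second begins.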
When to Use MP CONNECT
- long-running jobs
- independent data sources
- independent tasks that can be overlapped
- utilize SMP hardware or processors on a network
Considerations for MP CONNECT
- I/O bottlenecks
- WORK library
- CPU bottlenecks
Gartner’s Definition of Grid Computing
“a grid is a collection of resources owned by multiple organizations that is coordinated to allow them to solve a common problem”
MP CONNECT in Cluster Environment
- 32-node Linux cluster / MOSIX
- 1 GHz Intel P3 processors
- 1 GB RAM per processor
- 100 Mb backplane
MP CONNECT in Cluster Environment
Per-host results, by column:
- i: 4, 17, 18
- Host: task 4, task 17, task 18
- No. Iter: 3940, 3920, 3900
- Work Time/20 Iter: 0:04:17
- Estimated Time for Entire Problem: 446:05, 446:03, 445:26
- Distribution Efficiency: 96%, 96%
- Wait Time/20 Iter: 0:07, 0:09, 0:00:10
Totals:
- Total elapsed time: 14:30:03
- Cumulative working time: 447:46
- Cumulative waiting time: 15:14:54
- Scaling efficiency: 96.50%
MP CONNECT in Grid Environment
- 100 heterogeneous nodes
- W2K, WXP, variety of Unix OSs
- combination of V8 SAS and SAS 9
MP CONNECT in Grid Environment
Per-host results, by column:
- i: 7, 48, 97
- Host: ld055, in028, hd204
- No. Iter: 570, 1230, 3120
- Work Time/30 Iter: 0:15:18, 0:06:52, 0:02:40
- Estimated Time for Entire Problem: 1060:11, 476:07, 184:42
- Distribution Efficiency: 204%, 91%, 35%
- Wait Time/30 Iter: 0:00:00, 0:01
Totals:
- Total elapsed time: 5:12:19
- Cumulative working time: 468:41
- Cumulative waiting time: 0:39:42
- Scaling efficiency: 90.04%
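The totals reported for the cluster and grid runs are consistent with reading scaling efficiency as cumulative working time divided by (number of nodes × total elapsed time), and distribution efficiency as a host's estimated time for the entire problem divided by the same denominator. The slides do not state these formulas; they are inferred from the numbers. The sketch below also assumes that the cumulative and estimated times are in hours:minutes, and that all 32 cluster nodes and all 100 grid nodes took part. The function names are hypothetical.

```python
def to_seconds(stamp):
    """Convert 'H:MM' or 'H:MM:SS' to seconds (hours may exceed 24)."""
    parts = [int(p) for p in stamp.split(":")]
    if len(parts) == 2:
        parts.append(0)  # assumed: two fields mean hours and minutes
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


def scaling_efficiency(working, elapsed, nodes):
    """Inferred: share of the total available node time spent working, in %."""
    return 100.0 * to_seconds(working) / (nodes * to_seconds(elapsed))


def distribution_efficiency(estimated, elapsed, nodes):
    """Inferred: a host's whole-problem estimate relative to nodes * elapsed, in %."""
    return 100.0 * to_seconds(estimated) / (nodes * to_seconds(elapsed))


# Figures taken from the cluster and grid slides.
cluster = scaling_efficiency("447:46", "14:30:03", 32)
grid = scaling_efficiency("468:41", "5:12:19", 100)
grid_hosts = [round(distribution_efficiency(est, "5:12:19", 100))
              for est in ("1060:11", "476:07", "184:42")]
print(f"cluster {cluster:.2f}%  grid {grid:.2f}%  grid hosts {grid_hosts}")
```

Under these assumptions the computed values round to the reported 96.50% and 90.04%, and the three grid hosts round to the reported 204%, 91%, and 35%. On this reading, a distribution efficiency above 100% marks a host slower than the average and one below 100% marks a faster host.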
Combining Parallel Processes and Threads
SAS 9 Partitioned Data Model
[Diagram labels: SAS® 8: data, index; SAS 9 SPDE Engine & SPD Server®: metadata; data 1, 2, 3, 4; index metadata; hybrid index (bitmap/B-tree)]
MP CONNECT and SPDE Engine
- single input, 4.8 GB, 20 million obs
- two data steps, two PROC FREQs
- 4-way unix box
- six iterations of implementation
MP CONNECT and SPDE Engine
[Diagram labels: partitioned input; 4 parallel sessions; partitioned USER=; Data Step 1; Data Step 2; Proc Freq 1; Proc Freq 2]
MP CONNECT and SPDE Engine
- total improvement in elapsed time of 65%
MP CONNECT and Threaded SUMMARY
- two raw input files (~1.5 GB each)
- 8-way 900 MHz unix box
- two data steps, two PROC SUMMARYs, and a merge
MP CONNECT and Threaded Summary
[Diagram labels: Sales.txt; Goals.txt; Data step; Summary; Merge]
MP CONNECT and Threaded Summary
- total improvement in elapsed time of 70%
Considerations for Combining MP CONNECT and Threading
- tune threads per session on SMP
  - CPUCOUNT
  - THREADS/NOTHREADS
  - OS processor set command
- depends on
  - application,
  - data, and
  - hardware configuration
For More Info…
- Scalability and Performance Community
  - http://support.sas.com/rnd/scalability