Developing Client/Server Applications to Maximize SAS® 9 Parallel Capabilities
Cheryl Doninger
SAS Institute
Copyright © 2003, SAS Institute Inc. All rights reserved.

The SAS Intelligence Value Chain
§ Usability
§ Interoperability
§ Manageability
§ Scalability

Scalability – SAS 9
[Diagram: SAS Scalable Architecture. Labels: Clients; Scalable Servers (Scalable Performance Data Server, OLAP, Metadata, Stored Process); Scalable SAS/ACCESS (SAS, Teradata, Sybase, DB2, Oracle); SAS/CONNECT; Piping; Remote Host; Threaded Procedures (THREAD 1, THREAD 2, THREAD N…); CPU 1; CPU 2]

Scalability with SAS 9
  parallel processes
  parallel threads

Single Threaded V8 SAS

Multiple Processes

SAS 9 Multiple Threads

Multiple Processes and Multiple Threads

A Very Satisfied MP CONNECT Customer…
“I've been dreaming of this capability within SAS for approximately 12 years. The first day back in the office after the course, within 30 minutes I was able to apply the technique to an existing program and reduce processing time by over 50%.”
David Walker, Centers for Disease Control and Prevention

Independent Parallelism
[Timeline chart: Proc Sort on Data Source A and on Data Source B; axis: elapsed time]

MP CONNECT – Independent Scale Up
[Diagram labels: Extract Oracle Data; PROC STEP; DATA STEP; Read and Summarize SAS Data; Execute Simultaneously; SMP Server]

MP CONNECT – Independent Scale Out
[Diagram labels: Parent SAS Session; SAS Session 2; SAS Session n]
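
The independent-parallelism pattern on the slides above can be sketched outside SAS. What follows is a minimal Python illustration, not MP CONNECT syntax: two tasks that do not depend on each other are started together and their results are collected afterwards. The names `sort_source`, `source_a`, and `source_b` are hypothetical, and Python threads stand in here for the separate SAS sessions shown on the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def sort_source(rows):
    """Stand-in for an independent task, such as a sort of one data source."""
    return sorted(rows)


# Hypothetical data sources A and B; neither task needs the other's output.
source_a = [5, 3, 9, 1]
source_b = ["pear", "apple", "fig"]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit both tasks without waiting, then collect both results.
    future_a = pool.submit(sort_source, source_a)
    future_b = pool.submit(sort_source, source_b)
    sorted_a = future_a.result()
    sorted_b = future_b.result()

print(sorted_a)
print(sorted_b)
```

The structure is the point of the sketch: because neither task reads the other's output, both can be launched before either is awaited. When such tasks run as separate processes on separate CPUs, elapsed time tends toward that of the longer task rather than the sum of both.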

Piping – Worth the Price of Admission to SAS 9…
“…piping is the big one that has made a difference to our day - jobs have been cut by up to 60% meaning we can deliver in a much quicker time frame at end of month.”
Charles Pollack, SUNCORP METWAY

Pipeline Parallelism
[Timeline chart: Data Step and Proc Sort; axis: elapsed time]

MP CONNECT – Piping – Scale Up
[Diagram labels: DATA STEP; DATA STEP; PROC STEP; Read and Summarize SAS Data; Overlapped Execution; SMP Server]

MP CONNECT – Piping – Scale Out
[Diagram labels: Parent SAS Session; SAS Session 2; SAS Session n]
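
Pipeline parallelism, as outlined on the piping slides above, overlaps a producing step with a consuming step instead of waiting for an intermediate data set to be written in full. The following Python sketch illustrates the idea only; it is not SAS piping syntax. The names `producer`, `consumer`, `pipe`, and `DONE` are hypothetical, and a bounded in-memory queue is assumed as the stand-in for the pipe between two steps.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of the stream


def producer(pipe, n):
    """Stand-in for a DATA step: emits rows one at a time."""
    for i in range(n):
        pipe.put(i * i)
    pipe.put(DONE)


def consumer(pipe, result):
    """Stand-in for a summarizing step: reads rows as they arrive."""
    total = 0
    count = 0
    while True:
        row = pipe.get()
        if row is DONE:
            break
        total += row
        count += 1
    result["count"] = count
    result["total"] = total


# A small bounded queue: the producer blocks when the consumer falls behind.
pipe = queue.Queue(maxsize=4)
result = {}

writer = threading.Thread(target=producer, args=(pipe, 10))
reader = threading.Thread(target=consumer, args=(pipe, result))
writer.start()
reader.start()
writer.join()
reader.join()

print(result)
```

The consumer starts working as soon as the first row is available, so the two steps overlap and no complete intermediate copy of the data is held between them.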

When to Use MP CONNECT
§ long running jobs
§ independent data sources
§ independent tasks that can be overlapped
§ utilize SMP hardware or processors on a network

Considerations for MP CONNECT
§ I/O bottlenecks
§ WORK library
§ CPU bottlenecks

Gartner’s Definition of Grid Computing
“a grid is a collection of resources owned by multiple organizations that is coordinated to allow them to solve a common problem”

MP CONNECT in Cluster Environment
§ 32 node Linux cluster / MOSIX
§ 1 GHz Intel P3 processors
§ 1 GB RAM per processor
§ 100 Mb backplane

MP CONNECT in Cluster Environment

i    Host      No. Iter   Estimated Time for Entire Problem
4    task 4    3940       446:05
17   task 17   3920       446:03
18   task 18   3900       445:26

Work Time/20 Iter: 0:04:17
Distribution Efficiency: 96%, 96%
Wait Time/20 Iter: 0:07, 0:09, 0:00:10

Total elapsed time: 14:30:03
Cumulative working time: 447:46
Cumulative waiting time: 15:14:54
Scaling efficiency: 96.50%

MP CONNECT in Grid Environment
§ 100 heterogeneous nodes
§ W2K, WXP, variety of Unix OSes
§ combination of V8 SAS and SAS 9

MP CONNECT in Grid Environment

i    Host    No. Iter   Work Time/30 Iter   Estimated Time for Entire Problem   Distribution Efficiency
7    ld055   570        0:15:18             1060:11                             204%
48   in028   1230       0:06:52             476:07                              91%
97   hd204   3120       0:02:40             184:42                              35%

Wait Time/30 Iter: 0:00:00, 0:01

Total elapsed time: 5:12:19
Cumulative working time: 468:41
Cumulative waiting time: 0:39:42
Scaling efficiency: 90.04%
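
The slides do not state how scaling efficiency is calculated. One reading that reproduces both reported figures is cumulative working time divided by (number of nodes × total elapsed time). The Python sketch below checks that reading against the cluster and grid totals. The helper names `to_hours` and `scaling_efficiency` are hypothetical, and the time formats are assumptions: cumulative working time is taken as hours:minutes and elapsed time as hours:minutes:seconds.

```python
def to_hours(text):
    """Convert 'h:mm' or 'h:mm:ss' (assumed formats) to hours."""
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:
        parts.append(0)  # treat a missing seconds field as zero
    hours, minutes, seconds = parts
    return hours + minutes / 60 + seconds / 3600


def scaling_efficiency(working, elapsed, nodes):
    """Assumed formula: working time / (nodes x elapsed time)."""
    return to_hours(working) / (nodes * to_hours(elapsed))


# Totals as reported on the slides: 32-node cluster, 100-node grid.
cluster = scaling_efficiency("447:46", "14:30:03", 32)
grid = scaling_efficiency("468:41", "5:12:19", 100)

print(f"{cluster:.2%}")  # 96.50%
print(f"{grid:.2%}")     # 90.04%
```

Both results match the reported scaling efficiencies, which supports this reading of the metric without confirming it as the presenter's definition.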

Combining Parallel Processes and Threads

SAS 9 Partitioned Data Model
[Diagram: SAS® 8 data compared with SAS 9 SPDE Engine & SPD Server®. Labels: metadata; data (four partitions); index; Index metadata; Hybrid index Bitmap/B-tree]

MP CONNECT and SPDE Engine
§ single input, 4.8 GB, 20 million obs
§ two data steps, two PROC FREQs
§ 4-way Unix box
§ six iterations of implementation

MP CONNECT and SPDE Engine
[Diagram labels: partitioned input; 4 parallel sessions; partitioned USER=; Data Step 1; Data Step 2; Proc Freq 1; Proc Freq 2]

MP CONNECT and SPDE Engine
total improvement in elapsed time of 65%

MP CONNECT and Threaded SUMMARY
§ two raw input files (~1.5 G each)
§ 8-way 900 MHz Unix box
§ two data steps, two PROC SUMMARYs, and a merge

MP CONNECT and Threaded Summary
[Diagram labels: Sales.txt; Goals.txt; Data step; Summary; Merge]

MP CONNECT and Threaded Summary
total improvement in elapsed time of 70%

Considerations for Combining MP CONNECT and Threading
§ tune threads per session on SMP
  − CPUCOUNT
  − THREADS/NOTHREADS
  − OS processor set command
§ depends on
  − application,
  − data, and
  − hardware configuration

For More Info…
§ Scalability and Performance Community
  − http://support.sas.com/rnd/scalability
