To Compress or not to Compress?
Chuck Hopf


What is your precious?
• Gollum says every data center has something that is precious or hard to come by
  – CPU Time
  – DASD Space
  – Run Time
  – IO
  – Memory

Lots of talk
• On the LISTSERVE: does compression use more CPU? Does it save DASD space?
• On the LISTSERVE: what is the best BUFNO= to use with MXG?

Testing theories
• Built two tests
  – COMPRESS=NO, varying BUFNO over 2, 10, 15 and 20
  – COMPRESS=YES, again varying the BUFNO

An Epiphany!
• What if you run with COMPRESS=NO and send the output to PDB as a temporary dataset and then, at the end, turn on COMPRESS=YES and do a PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE; ?
• That would eliminate all of the compression during the reading and writing of the interim datasets but still create a compressed PDB.
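The idea can be sketched in SAS as follows. This is a minimal outline, not the author's tested job: it assumes the PDB and PERMPDB DDs are already allocated in the JCL, and it assumes the build is started with the usual %INCLUDE of the BUILDPDB member from SOURCLIB.

```sas
/* Build everything uncompressed; PDB points at a temporary library here. */
OPTIONS COMPRESS=NO;

/* Assumption: the standard MXG build member is invoked at this point. */
%INCLUDE SOURCLIB(BUILDPDB);

/* Turn compression on only for the final copy to the permanent PDB.   */
OPTIONS COMPRESS=YES;

/* NOCLONE makes the copies take COMPRESS= from the current system     */
/* option rather than inheriting it from the uncompressed input.       */
PROC COPY INDD=PDB OUTDD=PERMPDB NOCLONE;
RUN;
```

Without NOCLONE the copied datasets would keep the uncompressed attribute of the input, so the option is what makes the final PDB compressed.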

So there are now 3 Tests!
• TEST=NO: COMPRESS=NO
• TEST=NO/YES: COMPRESS=NO but final PDB is compressed
• TEST=YES: COMPRESS=YES

Results (chart slides; only the titles survive in this transcript)
• CPU Time
• Elapsed Time
• Low Memory
• High Memory
• EXCP DASD
• DASD IO Time
• DASD Space
• DASD Space by DDNAME

Conclusions?
• Running with COMPRESS=NO and then copying to a compressed PDB optimizes permanent DASD space and uses very little additional CPU.
• Even better, use the LIBNAME OPTION to turn it on where you want:
  – LIBNAME PDB COMPRESS=YES; /* z/OS only */
• Memory requirements increase with BUFNO but are not really that bad and

Caveats!
• BLKSIZE matters. SAS procs are sometimes built with a BLKSIZE of 6160 on WORK. This radically affects the IO counts. Use the recommended BLKSIZE=DASD(OPT) and leave the DCB attributes off of SAS datasets.
• REGION may have to be increased. Use REGION=0M and be sure you are using the MXG defaults for MEMSIZE.
• This all applies to z/OS, not to ASCII platforms.

So What About ASCII?
• Using the same data, tests run with SAS 9.2 on a Win 7 system
• 1.5 GB memory
• Dell 4600, P4 2.7 GHz

ASCII Results

Test           BUFNO     Elapsed    CPU
COMPRESS=NO    DEFAULT   06:45.4    03:25.0
COMPRESS=YES   DEFAULT   06:12.5    02:51.6
COMPRESS=NO    16K       07:35.1    03:56.8
COMPRESS=YES   16K       05:57.8    02:49.1
COMPRESS=NO    32K       07:39.2    02:58.0
COMPRESS=YES   32K       06:05.4    02:51.0
COMPRESS=NO    40K       08:28.6    04:17.0
COMPRESS=YES   40K       06:20.4    02:59.1
COMPRESS=NO    80K       07:44.1    04:02.8
COMPRESS=YES   80K       05:59.2    02:54.5
COMPRESS=NO    16M       07:42.1    04:01.1
COMPRESS=YES   16M       06:09.2    02:53.0
COMPRESS=NO    32M       07:43.4    03:54.9
COMPRESS=YES   32M       05:57.4    02:51.1
COMPRESS=NO    64M       08:02.7    03:58.7
COMPRESS=YES   64M       06:37.8    02:55.5
COMPRESS=NO    128M      08:14.2    03:55.0
COMPRESS=YES   128M      06:30.0    02:58.0
COMPRESS=NO    10        07:11.5    03:16.1
COMPRESS=YES   10        05:56.2    02:37.1
COMPRESS=NO    40        07:17.5    03:20.9
COMPRESS=YES   40        06:00.1    02:41.1
COMPRESS=NO    80        07:13.0    03:24.1
COMPRESS=YES   80        05:57.6    02:36.2
COMPRESS=NO    160       07:16.1    03:24.0
COMPRESS=YES   160       05:44.6    02:26.5

Memory (in transcript order; 18 values survive for 26 rows, so the row alignment is not recoverable):
95713K 95721K 275537K 179769K 275537K 179679K 275537K 179769K 275537K 179769K 96259K 96649K 97603K 98892K 99529K 102095K 103379K 108825K

Wow!
• COMPRESS=YES outperforms COMPRESS=NO!
• BUFNO makes some difference but not a lot, and BUFNO=10 looks to be optimal
  – Difference is in seconds not minutes
  – But… there is something we don’t understand in the memory numbers
• Runs faster under Win 7 than under z/OS
  – But does not include download time
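As a rough check on the claim that COMPRESS=YES outperforms COMPRESS=NO, the SAS step below recomputes the elapsed-time saving for three of the BUFNO settings in the ASCII results. It assumes that the result rows alternate COMPRESS=NO then COMPRESS=YES for each BUFNO value, which the flattened transcript suggests but does not state. The dataset and variable names are invented for illustration.

```sas
/* Elapsed times (minutes, seconds) copied from the ASCII results.    */
/* Assumption: for each BUFNO the first row is COMPRESS=NO and the    */
/* second is COMPRESS=YES.                                            */
data ascii_tests;
  input bufno :$8. no_min no_sec yes_min yes_sec;
  no_elapsed  = no_min  * 60 + no_sec;    /* seconds, COMPRESS=NO  */
  yes_elapsed = yes_min * 60 + yes_sec;   /* seconds, COMPRESS=YES */
  pct_saved   = 100 * (no_elapsed - yes_elapsed) / no_elapsed;
  datalines;
DEFAULT 6 45.4 6 12.5
10 7 11.5 5 56.2
160 7 16.1 5 44.6
;
run;

proc print data=ascii_tests noobs;
  var bufno no_elapsed yes_elapsed pct_saved;
  format pct_saved 5.1;
run;
```

Under that row assumption the saving works out to roughly 8%, 17% and 21% of elapsed time for the three settings, which is a matter of seconds to about a minute and a half per run.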

So What Should You Do?
• It depends on what your ‘precious’ is
  – Running z/OS
    • Optimal for CPU and DASD is COMPRESS=NO with a copy to a compressed dataset at the end, or setting the COMPRESS=YES option with a LIBNAME
    • Optimal for CPU is COMPRESS=NO
    • Optimal for DASD is COMPRESS=YES
    • BUFNO=10 is optimal for run time
  – Running ASCII
    • Optimal for CPU and DASD is COMPRESS=YES

JCL

//* SAMPLE JCL TO RUN BUILDPDB WITH COMPRESS=NO AND COMPRESS AT
//* THE END USING PROC COPY
//S1       EXEC MXGSASV9
//PDB      DD DSN=MXG.PDB(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SPININ   DD DSN=MXG.SPIN(0),DISP=OLD
//SPIN     DD DSN=MXG.SPIN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//CICSTRAN DD DSN=MXG.CICSTRAN(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//DB2ACCT  DD DSN=MXG.DB2ACCT(+1),SPACE=(CYL,(500,500)),
//            DISP=(,CATLG,DELETE)
//SMF      DD DSN=YOUR.SMF.DATA,DISP=SHR
//SYSIN    DD *
OPTIONS COMPRESS=NO BUFNO=10;
LIBNAME PDB COMPRESS=YES;
LIBNAME SPIN COMPRESS=YES;
%LET SPININ=SPININ;
%UTILBLDP(
  MACKEEPX=
    MACRO _LDB2ACCT %
    MACRO _KDB2ACC COMPRESS=YES %
    MACRO _KCICTRN COMPRESS=YES %
  ,
  SPINCNT=7,
  SPINUOW=2,
  OUTFILE=INSTREAM);
%INCLUDE INSTREAM;

JCL is in the 27.10 SOURCLIB as JCLCMPDB

Why UTILBLDP?
• Allows you to add data sources to BUILDPDB without having to edit the macros in the SOURCLIB.
• Allows you to suppress data sources like 110, DB2 and TYPE74 and process them in other jobs, again without editing the macros.
• Flexibility

Example

OPTIONS COMPRESS=NO BUFNO=10;
LIBNAME PDB COMPRESS=YES;
LIBNAME SPIN COMPRESS=YES;
%LET SPININ=SPININ;
%UTILBLDP(
  USERADD=42,
  SUPPRESS=110 DB2,
  SPINCNT=7,
  OUTFILE=INSTREAM);
%INCLUDE INSTREAM;
RUN;

MXG User Experience
• Running MXG with WPS instead of SAS
• Data from multiple platforms
• Processed under two virtual products
• Also, comparison of SAS/PC and WPS on zLinux

PC/SAS VMWARE/Windows versus PC/SAS Hyper-V/Windows
(four platforms’ data, three installation “groups”: PROD/QA/DEV)

Data From         VMWARE(PROD)   Hyper-V(PROD)
Unix              00:05:30       00:10:56
z/OS              00:01:30       00:04:54
z/VM/Linux        00:03:07       00:08
Windows Servers   02:43:08       09:32:57

Data From         VMWARE(QA)     Hyper-V(QA)
Unix              00:31          00:04:18
z/OS              00:01:27       00:02:46
z/VM/Linux        00:01:02       00:07:06
Windows Servers   00:41:24       02:34:19

Data From         VMWARE(DEV)    Hyper-V(DEV)
Unix              00:43          00:02:42
z/OS              00:21          00:01:42
z/VM/Linux        00:01:08       00:03:34
Windows Servers   00:09:06       00:38:47

Processing of performance data collected from Unix, z/VM/Linux, z/OS and Windows.

PC/SAS versus LNX/WPS
• PC/SAS VMWARE/Windows versus WPS z/VM/Linux
• PC/SAS VMWARE takes 2:43:08 to process the data from “Windows Servers”; the WPS z/VM/Linux environment does the same work in 1:30:00 (hh:mm:ss).
• That is, the mainframe WPS z/VM/Linux is a 45% improvement over PC/SAS VMWARE/WIN.
• This is most likely due to the extra I/O bandwidth the mainframe has compared to the Windows environment.
• The results for Windows would probably be better if WIN 2008 had been used.
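The 45% figure follows from the two elapsed times quoted on this slide. A small SAS step, with illustrative variable names, reproduces the arithmetic:

```sas
/* Percentage reduction in elapsed time, WPS z/VM/Linux vs PC/SAS VMWARE. */
data _null_;
  sas_sec = hms(2, 43, 8);    /* 2:43:08 for PC/SAS VMWARE  */
  wps_sec = hms(1, 30, 0);    /* 1:30:00 for WPS z/VM/Linux */
  pct_improvement = 100 * (sas_sec - wps_sec) / sas_sec;
  put pct_improvement= 5.1;   /* written to the SAS log */
run;
```

That is 4,388 seconds saved out of 9,788, or about 44.8%, which the slide rounds to 45%.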

PC/SAS versus WPS on z
• PC/SAS under Hyper-V
• WPS under z/VM/Linux on z-10

Z10: SAS versus WPS
• z/OS/SAS versus z/OS/WPS to run MXG
• 30% more I/Os for SAS
• TCB for WPS = 551,423
• TCB for SAS = 551,273
• NOTES: WPS version 2.4.0.1 and SAS 9.1.3; MXG from FEB 2009