Fast Benchmarks
Michele Michelotto, INFN Padova
Manfred Alef, GridKa Karlsruhe
Fast Benchmark
• Request mainly from the WLCG community, via the GDB, for a fast benchmark
• Requirements that are clear:
    • Open source
    • Easy to run
    • Small, no download (apart from the first download)
• Requirements not clear to me:
    • How fast?
    • Reproducible? Reliable?
    • Single core or multicore?
• Use cases:
    • Run every time we land on a queue/VM/cloud machine?
    • Run to sample the resources available?
    • Run to cross-check whether the declared HS06 values are reliable?
An example with Geant4
• Thanks to G. Cosmo and A. Dotti
• Based on Geant4
• Runs on Linux x86-64 and ARM
• Realistic description of the detector geometry
• Footprint 1/3 to 1/4 of a real experiment
• No digitization, no analysis
• CPU bound, no I/O
• Download a bootstrap.sh script from CERN
• Running the script downloads the rest of the program and compiles it (5 to 10 minutes)
• ./run.sh <numThreads> <numEvents>
LHCb fast benchmark
• New contact with P. Charpentier (LHCb), provided by Manfred
• Small Python script running for about one minute, single-threaded
• Results differ when measuring a slot on an idle, half-loaded or fully loaded machine
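To make the idea of a short, single-threaded, CPU-bound Python benchmark concrete, here is a minimal sketch. It is not the LHCb script: the workload (summing products of Gaussian random numbers), the function names `cpu_work` and `fast_benchmark`, and the iteration count are all assumptions chosen for illustration.

```python
import random
import time


def cpu_work(n_iterations, seed=12345):
    """Hypothetical CPU-bound workload: sum products of Gaussian random numbers."""
    rng = random.Random(seed)  # fixed seed, so every run does identical work
    total = 0.0
    for _ in range(n_iterations):
        total += rng.gauss(0.0, 1.0) * rng.gauss(0.0, 1.0)
    return total


def fast_benchmark(n_iterations=500_000):
    """Return a score in iterations per CPU second (arbitrary units)."""
    start = time.process_time()  # CPU time of this process, not wall-clock time
    cpu_work(n_iterations)
    elapsed = max(time.process_time() - start, 1e-9)  # guard against a zero interval
    return n_iterations / elapsed


if __name__ == "__main__":
    print(f"score: {fast_benchmark():.0f} iterations per CPU second")
```

Using CPU time rather than wall-clock time is one possible design choice. It reduces the effect of time spent waiting for the scheduler, but the score can still depend on machine load through shared caches, memory bandwidth and hyper-threading.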
Script from Manfred
• Manfred prepared a script that runs the LHCB.py script several times and averages the results, going from ONE loaded core up to N parallel instances
• N is the number of logical CPUs available
• I tried to compare LHCB.py with HS06 using the load curves of several architectures I had measured in the past
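The load scan described above can be sketched roughly as follows. This is not Manfred's script: `WORKLOAD` is a stand-in benchmark, and `run_parallel` and `load_scan` are hypothetical names. The sketch assumes each benchmark instance prints a single number (its score) on standard output.

```python
import os
import statistics
import subprocess
import sys

# Stand-in workload (an assumption, not LHCB.py): times a fixed loop
# and prints the number of iterations per second.
WORKLOAD = """
import time
n = 200000
start = time.perf_counter()
total = 0
for i in range(n):
    total += i * i
print(n / max(time.perf_counter() - start, 1e-9))
"""


def run_parallel(n_instances, command):
    """Start n_instances copies of command at once and return their scores."""
    procs = [
        subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
        for _ in range(n_instances)
    ]
    # Each instance is expected to print exactly one number on stdout.
    return [float(p.communicate()[0]) for p in procs]


def load_scan(command, max_instances=None, repeats=3):
    """Mean per-instance score for 1, 2, ..., N parallel instances."""
    if max_instances is None:
        max_instances = os.cpu_count() or 1  # N = number of logical CPUs
    curve = {}
    for n in range(1, max_instances + 1):
        scores = []
        for _ in range(repeats):
            scores.extend(run_parallel(n, command))
        curve[n] = statistics.mean(scores)
    return curve


if __name__ == "__main__":
    cmd = [sys.executable, "-c", WORKLOAD]
    for n, score in load_scan(cmd).items():
        print(f"{n:3d} instances: {score:14.0f} per instance")
```

To scan a real benchmark script, `cmd` would be replaced by the command that runs it, provided the script's output can be parsed as a single score.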
[Figure: Xeon E5-2660, 16C/32T, 2600 MHz. Upper plot: HS06/core and LHCB/core. Lower plot: HS06/LHCB ratio.]
[Figure: AMD Opteron 6272, 32C, 2100 MHz. Upper plot: HS06/core and LHCB/core. Lower plot: HS06/LHCB ratio.]
[Figure: Intel Atom C2750, 8C, 2400 MHz. Upper plot: HS06/core and LHCB/core. Lower plot: HS06/LHCB ratio.]
[Figure: Nvidia Tegra K1, 4C, 2.3 GHz. Upper plot: HS06/core and LHCB/core. Lower plot: HS06/LHCB ratio.]
[Figure: Odroid Exynos 5422, 4C, 2.0 GHz. Upper plot: HS06/core and LHCB/core. Lower plot: HS06/LHCB ratio.]
[Figure: LHCB single instances with HS06 as load. Series: HS06/core and LHCB (HS06 loaded).]
[Figure: LHCB single instances with HS06 as load. Series: HS06/LHCB ratio (loaded).]
[Figure: All together now. HS06/LHCB ratio for Xeon E5, Xeon E5 (loaded), Opteron, Avoton, Tegra K1 and Exynos.]
Measuring on a production cluster
• Manfred measured the LHCb.py score on a GridKa production cluster and compared it with the HS06 per slot
• I made some plots of HS06/LHCB vs. load
Sometimes LHCb gets slow
• The HS06/LHCb ratio is usually around 1.2 to 1.6
• Occasionally it can exceed 2.0
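One way to spot the occasional slow measurements is to compute the HS06/LHCb ratio per slot and flag values above a threshold. The sketch below assumes each measurement is a (slot name, HS06 per slot, LHCb score) tuple. The function names and the sample numbers are made up for illustration and are not data from the GridKa cluster.

```python
def hs06_lhcb_ratio(hs06_per_slot, lhcb_score):
    """Ratio of the declared HS06 per slot to the measured LHCb score."""
    return hs06_per_slot / lhcb_score


def flag_slow_slots(measurements, threshold=2.0):
    """Return (slot, ratio) pairs whose HS06/LHCb ratio exceeds the threshold."""
    flagged = []
    for slot, hs06, lhcb in measurements:
        ratio = hs06_lhcb_ratio(hs06, lhcb)
        if ratio > threshold:
            flagged.append((slot, ratio))
    return flagged


if __name__ == "__main__":
    # Made-up numbers for illustration only, not measurements from the talk.
    sample = [
        ("slot-a", 10.0, 8.0),
        ("slot-b", 10.0, 4.0),
        ("slot-c", 12.0, 8.0),
    ]
    for slot, ratio in flag_slow_slots(sample):
        print(f"{slot}: HS06/LHCb = {ratio:.2f}")
```

The default threshold of 2.0 simply mirrors the value quoted on the slide. A site would likely tune it against its own distribution of ratios.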
My conclusion
• LHCb.py is a small script that runs easily anywhere Python is available
• It is very simple to maintain: about 30 lines of code
• It takes about one minute to estimate the performance of the CPU it is running on
• The fast answer is of course not free: it is a tradeoff between speed and precision