VAM Benchmark (2) -- VAM vs. Sim A
By kp-ryct.com, 2019-12-17, Version 1.3
Conditions
• System info:
    • OS: CentOS Linux release 7.6.1810 (Core)
    • Kernel: Linux version 3.10.0-957.12.2.el7.x86_64
    • CPU: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
    • Memory: 4 GB
• Sim A:
    • Released in 2019.
    • Has its own in-house Verilog-A compiler.
    • Supports BSIM-CMG 110 and BSIM4 4.5.0, which are the models tested in this benchmark.
    • Features:
        • Its model interface is open, so customers can develop their own models and compile them into a shared library, which Sim A can load at run time.
• Flow of VAM:
    • First, VAM converts the Verilog-A module source code to C code.
    • Then GCC (shipped with Sim A, version 6.3.0, with -O3 -ffast-math) compiles the C code into a shared library.
    • Finally, Sim A loads the shared library at run time and links the modules inside it through a lightweight wrapper, as if they were built-in models.
    • This is similar to the flow of Sim A's in-house Verilog-A compiler.
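The compile and load steps of the flow above can be sketched in Python. This is only an illustration under stated assumptions: the file names `bsimcmg.c` and `bsimcmg.so` are hypothetical, `-shared` and `-fPIC` are the usual GCC options for building a shared library and are not stated in the slides, and `ctypes.CDLL` merely stands in for Sim A's own loader, whose interface is not described here.

```python
import ctypes
import subprocess


def gcc_command(c_source, shared_lib):
    """Build the GCC command line for the compile step of the flow.

    -O3 and -ffast-math come from the slides; -shared and -fPIC are
    assumed here because they are the usual way to build a shared library.
    """
    return ["gcc", "-O3", "-ffast-math", "-shared", "-fPIC",
            "-o", shared_lib, c_source]


def compile_model(c_source, shared_lib):
    """Run GCC on the C code that VAM generated."""
    subprocess.run(gcc_command(c_source, shared_lib), check=True)


def load_model(shared_lib):
    """Load the shared library at run time.

    Sim A does this through its own wrapper; ctypes is only a stand-in.
    """
    return ctypes.CDLL(shared_lib)


if __name__ == "__main__":
    # Hypothetical file names; nothing is compiled here, the command is only shown.
    print(" ".join(gcc_command("bsimcmg.c", "bsimcmg.so")))
```

Keeping the command construction separate from the call to GCC makes it easy to inspect or log the exact flags, which matters here because the optimization flags directly affect the measured model evaluation time.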
Conditions
• About the testing:
    • Sim A is run in normal SPICE mode, not parallel SPICE or fast SPICE.
    • The in-house Verilog-A compiler of Sim A has bugs in handling a 0 V vsource (V(a, b) <+ 0.0) in some situations. We therefore modified the Verilog-A source code by removing all internal nodes that are connected to terminals through a 0 V vsource for the given test model card.
    • By default, Sim A uses another math library (not libm), which brings a benefit of more than 30% (BSIM-CMG 110) or 15% (BSIM4 4.5.0) compared with libm. For consistency, we switched to libm when testing the built-in models.
• About the results:
    • The “tran time” is only the “intrinsic tran analysis time”: it excludes the time spent finding the IC solution, but includes the time for matrix operations and other work.
    • The “memory” is the “peak resident memory” used during tran analysis.
Testing cases of BSIM-CMG 110

          Instances                                         Equations
Case 1    100k mosfet, 1 res, 2 vsource                     5
Case 1.1  100k mosfet (self-heating ON), 1 res, 2 vsource   6
Case 2    2k mosfet, 2 vsource                              5004
Case 2.1  2k mosfet (self-heating ON), 2 vsource            5005
Case 3    ~3.7k mosfet, ~20 vsource                         ~2k
Case 4    ~1k mosfet, ~4k cap, ~40 vsource                  ~600
Case 5    ~55k mosfet, ~70 vsource                          ~27k

• NOTE:
    • Case 1 and case 1.1 are designed to reveal differences in model calculation time and memory usage: 100k mosfets in parallel (with different instance parameters, to avoid optimization by some simulators), in series with a resistor at the S terminal (see appendix B). The calculation time and memory used elsewhere are negligible compared with those used on the model side.
    • Case 2 and case 2.1 are a chain of 1k inverters.
    • The difference between case 1 and case 1.1 is that self-heating is turned on in case 1.1, with some related parameters also set. The same applies to case 2 and case 2.1.
Results of BSIM-CMG 110

          Built-in                          VAM                               Ratio
          Tran Time (s)  Steps  Mem (MB)    Tran Time (s)  Steps  Mem (MB)    Tran Time (X)  Mem (X)
Case 1    64.58          83     567         51.71          83     338         1.25           1.67
Case 1.1  83.35          83     722         68.88          83     346         1.21           2.09
Case 2    193.4          10617  94.5        150.8          10618  88.9        1.28           1.06
Case 2.1  249.9          10618  94.3        204.2          10616  90.1        1.22           1.05
Case 3    166.1          5081   93.8        129.9          5069   89.1        1.28           1.05
Case 4    41.81          4913   79.5        33.37          4913   80.1        1.25           0.99
Case 5    1491           2634   459         1155           2625   315         1.29           1.46

          In-house Verilog-A compiler       VAM                               Ratio
          Tran Time (s)  Steps  Mem (MB)    Tran Time (s)  Steps  Mem (MB)    Tran Time (X)  Mem (X)
Case 1    109.4          83     643         51.71          83     338         2.12           1.90
Case 1.1  181.4          83     713         68.88          83     346         2.63           2.06
Case 2    315.3          10616  244         150.8          10618  88.9        2.09           2.74
Case 2.1  515.5          10616  297         204.2          10616  90.1        2.52           3.30
Case 3    271.4          5062   238         129.9          5069   89.1        2.09           2.67
Case 4    66.70          4920   239         33.37          4913   80.1        2.00           2.98
Case 5    2404           2600   539         1155           2625   315         2.08           1.71
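The Ratio columns appear to be the reference value (built-in or in-house compiler) divided by the VAM value, so a ratio above 1 means VAM is faster or uses less memory. The slides do not state this definition; it is inferred from the numbers, and the small sketch below reproduces the tran-time ratios of the built-in comparison under that assumption. The function and variable names are illustrative only.

```python
def ratio(reference, vam):
    """Return reference / vam rounded to two decimals, as in the Ratio columns."""
    return round(reference / vam, 2)


# Tran time in seconds from the built-in vs. VAM results for BSIM-CMG 110:
# case -> (built-in, VAM)
tran_time = {
    "Case 1": (64.58, 51.71),
    "Case 1.1": (83.35, 68.88),
    "Case 2": (193.4, 150.8),
    "Case 2.1": (249.9, 204.2),
    "Case 3": (166.1, 129.9),
    "Case 4": (41.81, 33.37),
    "Case 5": (1491, 1155),
}

tran_ratio = {case: ratio(ref, vam) for case, (ref, vam) in tran_time.items()}

for case, value in tran_ratio.items():
    print(f"{case}: {value:.2f}X")
```

The same function applies to the memory columns, although a few published memory ratios differ from a recomputed value in the last digit, presumably because of how the original figures were rounded.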
Summary of BSIM-CMG 110
• About the model calculation time:
    • From case 1 and 1.1, the model calculation of VAM BSIM-CMG 110 is about 1.23X faster than the built-in BSIM-CMG 110 of Sim A. From case 2, 2.1, 3, 4 and 5, it is about 1.26X faster over the whole tran analysis (possibly because the code generated by VAM requests only the minimal set of matrix elements, which reduces the matrix calculation time).
    • In all these cases, VAM BSIM-CMG 110 is 2.0X or more (2.00X to 2.63X) faster than the same model compiled by the in-house Verilog-A compiler of Sim A.
• About the memory:
    • From case 1 and 1.1, VAM BSIM-CMG 110 uses about 1.7X (case 1) to 2.1X (case 1.1) less memory than the built-in BSIM-CMG 110 of Sim A.
    • From case 1 and 1.1, VAM BSIM-CMG 110 uses about 1.9X (case 1) to 2.1X (case 1.1) less memory than the in-house Verilog-A compiler of Sim A.
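The two headline speedups against the built-in model are consistent with a plain arithmetic mean of the per-case tran-time ratios. The slides do not say how the summary figures were derived, so the averaging method below is an assumption; the sketch only shows that it reproduces the quoted values.

```python
from statistics import mean

# Built-in / VAM tran-time ratios from the BSIM-CMG 110 results.
model_dominated = [1.25, 1.21]                  # case 1, case 1.1
full_circuits = [1.28, 1.22, 1.28, 1.25, 1.29]  # case 2, 2.1, 3, 4, 5

model_speedup = round(mean(model_dominated), 2)
circuit_speedup = round(mean(full_circuits), 2)

print(model_speedup, circuit_speedup)  # 1.23 1.26
```

Case 1 and 1.1 are averaged separately because they are dominated by model evaluation, whereas the other cases also include a significant share of matrix work.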
Testing cases of BSIM4 4.5.0

          Instances                           Equations
Case 1    100k mosfet, 1 res, 2 vsource       5
Case 2    2k mosfet, 2 vsource                5004
Case 3    ~3.7k mosfet, ~20 vsource           ~2k
Case 4    ~1k mosfet, ~4k cap, ~40 vsource    ~600
Case 5    ~55k mosfet, ~70 vsource            ~27k

• NOTE:
    • (Please refer to the NOTE of “Testing cases of BSIM-CMG 110”)
    • BSIM4 4.5.0 has no self-heating, so case 1.1 and case 2.1 are removed here.
Results of BSIM4 4.5.0

          Built-in                          VAM                               Ratio
          Tran Time (s)  Steps  Mem (MB)    Tran Time (s)  Steps  Mem (MB)    Tran Time (X)  Mem (X)
Case 1    55.09          135    469         34.44          135    312         1.60           1.50
Case 2    127.8          2127   182         77.30          2125   153         1.65           1.19
Case 3    182.1          13991  91.5        109.1          13958  86          1.67           1.06
Case 4    80.77          22090  81          55.40          22246  79.6        1.46           1.02
Case 5    297.5          1383   395         186.1          1379   315         1.60           1.25

          In-house Verilog-A compiler       VAM                               Ratio
          Tran Time (s)  Steps  Mem (MB)    Tran Time (s)  Steps  Mem (MB)    Tran Time (X)  Mem (X)
Case 1    65.43          135    480         34.44          135    312         1.89           1.54
Case 2    156.1          2125   194         77.30          2125   153         2.02           1.27
Case 3    211.9          13960  196         109.1          13958  86          1.94           2.28
Case 4    89.73          22090  191         55.40          22246  79.6        1.62           2.40
Case 5    339.1          1383   414         186.1          1379   315         1.82           1.31
Summary of BSIM4 4.5.0
• About the model calculation time:
    • VAM BSIM4 4.5.0 is about 1.60X faster than the built-in BSIM4 4.5.0 of Sim A.
        • We were surprised by this result (1.6X faster). Based on our experience, the two should be close. In fact, we did get close results when comparing with Sim B (VAM BSIM4 4.5.0 is about 1.05X ~ 1.10X faster than the built-in BSIM4 4.5.0 of Sim B). We checked the tests several times and did not find any mistake.
    • VAM BSIM4 4.5.0 is about 1.80X faster than the same model compiled by the in-house Verilog-A compiler of Sim A.
• About the memory:
    • VAM BSIM4 4.5.0 uses less memory than both the built-in BSIM4 4.5.0 and the in-house Verilog-A compiler of Sim A.
    • From case 1, VAM BSIM4 4.5.0 uses about 1.5X less memory than the built-in BSIM4 4.5.0 of Sim A. This also puzzles us; the two should be close as well.
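To put the gap between the Sim A and Sim B comparisons in perspective, a speedup factor can be converted into the share of run time saved. The sketch below assumes that "N X faster" means reference time divided by VAM time equals N, which matches how the Ratio columns behave; the function name is illustrative.

```python
def time_saved_percent(speedup):
    """Percentage of run time saved for a given 'N X faster' speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


for label, speedup in [("vs. Sim A built-in", 1.60),
                       ("vs. Sim B built-in (low)", 1.05),
                       ("vs. Sim B built-in (high)", 1.10)]:
    saved = time_saved_percent(speedup)
    print(f"{label}: {speedup:.2f}X -> {saved:.1f}% less time")
```

Under this reading, 1.60X corresponds to roughly 37.5% less run time, while 1.05X to 1.10X corresponds to roughly 5% to 9%, which illustrates why the Sim A result stands out.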