Will You Still Compile Me Tomorrow? Static Cross-Version Compiler Validation
Chris Hawblitzel, Shuvendu K. Lahiri (Microsoft Research)
Kshama Pawar, Hammad Hashmi, Sedar Gokbulut, Lakshan Fernando, Dave Detlefs, Scott Wadsworth (Microsoft CLR Test Team)

Finding compiler bugs
• Testing (source program + test input -> compiler -> output)
  • + high automation
  • - limited coverage
• Validation (source program -> compiler -> assembly code, checked by an automated theorem prover)
  • + covers all inputs
  • - false alarms
• Verification (compiler, checked with an interactive theorem prover)
  • + covers all programs
  • - not automated

Cross-version validation
• Source program -> compiler version 4.0 and compiler version 4.5 -> assembly code
• Assembly code, listing 1:
    mov EAX, EDX
    and EAX, 255
    push EAX
    mov EDX, 0x100000
    call WriteInternalFlag2
    ret
• Assembly code, listing 2:
    push ESI
    mov ESI, EDX
    and ESI, 255
    push ESI
    mov EDX, 0x100000
    call WriteInternalFlag2
    pop ESI
    ret
• Automated theorem prover: compare similar code, fewer false alarms
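To make the comparison concrete, here is a minimal sketch in Python of checking the two listings on this slide (in the order they appear) for equivalence by symbolic execution. It is not the SymDiff/Boogie/Z3 pipeline from the talk. The helper names (`run`, `operand`, `equivalent`) are hypothetical, the callee name has the extraction spacing removed, and the calling convention (one stack argument popped by the callee, EAX and EDX clobbered by the call) is an assumption read off the listings rather than something the slides state.

```python
# Hypothetical mini symbolic executor; handles only the opcodes on the slide.
LISTING_1 = """
mov EAX, EDX
and EAX, 255
push EAX
mov EDX, 0x100000
call WriteInternalFlag2
ret
"""

LISTING_2 = """
push ESI
mov ESI, EDX
and ESI, 255
push ESI
mov EDX, 0x100000
call WriteInternalFlag2
pop ESI
ret
"""

REGS = ("EAX", "EDX", "ESI")


def operand(regs, text):
    """Return the symbolic value of a register name or an integer literal."""
    return regs[text] if text in regs else int(text, 0)


def run(listing):
    """Symbolically execute a straight-line listing and return its observables."""
    regs = {r: ("init", r) for r in REGS}  # unknown values on method entry
    stack, calls = [], []
    for line in listing.strip().splitlines():
        op, _, rest = line.strip().partition(" ")
        args = [a.strip() for a in rest.split(",")] if rest else []
        if op == "mov":
            regs[args[0]] = operand(regs, args[1])
        elif op == "and":
            regs[args[0]] = ("and", regs[args[0]], operand(regs, args[1]))
        elif op == "push":
            stack.append(regs[args[0]])
        elif op == "pop":
            regs[args[0]] = stack.pop()
        elif op == "call":
            # Assumption: the callee reads EDX plus one stack argument and pops it.
            calls.append((args[0], regs["EDX"], stack.pop()))
            # The call is uninterpreted: its results are fresh symbolic values.
            regs["EAX"] = ("result", args[0], len(calls))
            regs["EDX"] = ("clobbered", args[0], len(calls))
        elif op == "ret":
            break
        else:
            raise ValueError(f"unsupported instruction: {line!r}")
    # Observables: calls made, return value, callee-saved ESI, leftover stack.
    return calls, regs["EAX"], regs["ESI"], stack


def equivalent(listing_a, listing_b):
    """True when both listings produce the same observables."""
    return run(listing_a) == run(listing_b)


print(equivalent(LISTING_1, LISTING_2))
```

Both listings pass the same symbolic argument, the entry value of EDX masked with 255, and leave ESI unchanged on return. The extra push/pop pair in listing 2 therefore does not count as a difference. Changing the mask in one listing makes the check fail.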

Validation across various dimensions
• Versions: v1, v2, v3, v4
• Architectures: ARM, x86
• Optimizations: ARM + optimizations, x86 + optimizations

Tools: SymDiff, Boogie, Z3
• Source program -> compiler version 4.0 and compiler version 4.5 -> assembly code
• Assembly code -> Boogie program (one per version)
• Boogie programs -> SymDiff equivalence verifier -> combined Boogie program
• Combined Boogie program -> Boogie program verifier -> verification condition
• Verification condition -> Z3 automated theorem prover
• Example encoding: "... push ESI ..." becomes "... Mem := Store4(... esi ...); esp := SUB(esp, imm(4)); ..."

Encoding assembly language
• Encode one method at a time
  • calls are uninterpreted
  • inlining not yet supported
• Our encoding is not entirely sound
  • mathematical integers vs. 32-bit vectors
    • Z3 supports both, but reasoning about integers is faster
  • non-aliasing assumptions
    • disjoint regions for stack, heap, static data
• Floating point, switch tables, etc.
• Complex instructions
  • rep stosb: ∀i. edx ≤ i < edx+ecx ⟹ Mem[i] == al
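The "mathematical integers vs. 32-bit vectors" point can be illustrated with a small Python sketch. The two code fragments and all function names below are hypothetical and do not come from the talk. They only show how a check over unbounded integers can accept a pair of fragments that disagree once addition wraps at 32 bits.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits, modelling machine wraparound


def add_int(a, b):
    """Addition over mathematical (unbounded) integers."""
    return a + b


def add_bv32(a, b):
    """Addition over 32-bit vectors: the result wraps modulo 2**32."""
    return (a + b) & MASK32


def fragment_a(add, x):
    """Hypothetical 'old' code: unsigned test of x + 1 > x."""
    return add(x, 1) > x


def fragment_b(add, x):
    """Hypothetical 'new' code: the test folded to a constant."""
    return True


def find_mismatch(add, inputs):
    """Return the first input on which the fragments disagree, else None."""
    for x in inputs:
        if fragment_a(add, x) != fragment_b(add, x):
            return x
    return None


INPUTS = [0, 1, 255, 0x7FFFFFFF, 0xFFFFFFFF]

print(find_mismatch(add_int, INPUTS))   # integer model: no disagreement found
print(find_mismatch(add_bv32, INPUTS))  # 32-bit model: disagreement at 0xFFFFFFFF
```

Under the integer model the two fragments look equivalent on every input. Under the 32-bit model they differ at `0xFFFFFFFF`, which is the kind of case an integer-based encoding can miss in exchange for faster reasoning.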

Month-to-month results (ARM), % of method bodies

             Missing  TimeOut  Different  Equivalent  Identical
month 2          1.1      1.4        1.8        57.9       37.8
month 3          1.2      3.8        1.6        24.4       69.0
month 4,5        2.6      3.8        3.3        19.2       71.1
month 6          1.9      2.7        7.0        19.4       69.1
month 7          1.8      0.3        1.5         1.9       94.5
AVG              1.7      2.4        3.0        24.5       68.3
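The category names in these results (Missing, TimeOut, Different, Equivalent, Identical) come from the slides, but the slides do not define them. The Python sketch below shows one plausible reading, labeled as an assumption throughout: a method is Missing when it is absent from either version, Identical when the two bodies match textually, and otherwise falls into whatever the prover reports. `classify`, `summarize`, and the `prove_equivalent` callback are hypothetical names.

```python
from collections import Counter

CATEGORIES = ("Missing", "TimeOut", "Different", "Equivalent", "Identical")


def classify(old_body, new_body, prove_equivalent):
    """Assign one category to a method, under the assumed definitions."""
    if old_body is None or new_body is None:
        return "Missing"      # assumption: no body available in one version
    if old_body == new_body:
        return "Identical"    # assumption: textually equal, no prover needed
    verdict = prove_equivalent(old_body, new_body)
    if verdict is None:
        return "TimeOut"      # assumption: prover gave up within its budget
    return "Equivalent" if verdict else "Different"


def summarize(old, new, prove_equivalent):
    """Return the percentage of method bodies in each category."""
    names = sorted(set(old) | set(new))
    counts = Counter(
        classify(old.get(n), new.get(n), prove_equivalent) for n in names
    )
    return {cat: 100.0 * counts[cat] / len(names) for cat in CATEGORIES}


def toy_prover(a, b):
    """Stand-in prover: bodies are equivalent if equal ignoring case."""
    if "slow" in a:
        return None  # pretend the query timed out
    return a.lower() == b.lower()


old = {"A": "ret", "B": "RET", "C": "nop", "D": "slow", "E": "ret"}
new = {"A": "ret", "B": "ret", "C": "ret", "D": "fast"}
print(summarize(old, new, toy_prover))
```

In this reading, only the Different bucket needs human attention, which is why a low Different percentage between adjacent versions matters in practice.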

Cross-architecture, optimization, % of method bodies

                     Missing  TimeOut  Different  Equivalent  Identical
x86 opt vs. unopt        1.9      2.0       19.0        77.0        0.0
ARM opt vs. unopt        5.5      1.7       18.8        73.8        0.0
x86 vs. ARM              3.6      4.7       29.0        62.8        0.0
MDIL vs. JIT            13.5      1.3       20.0        65.2        0.0

Fault injection (ARM), % of method bodies

             Missing  TimeOut  Different  Equiv-correct  Equiv-unsound
month 3,4        3.1      0.6       86.9            5.6            3.8
month 5          1.9      0.7       86.3            7.5            2.5
month 6          3.1      0.7       83.1            8.8            2.5
month 7          3.3      0.0       81.3           10.0            3.3
AVG              2.9      0.5       84.4            8.0            3.0

Counterexample traces
• Help the user find where program execution diverged
• Used by automated root cause analysis
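As a rough illustration of "where program execution diverged", the Python sketch below walks two traces in step and reports the first position at which they differ. The trace format, a list of observable events such as calls with their arguments, is an assumption made for this example and is not the format the tool produces. `first_divergence` is a hypothetical name.

```python
def first_divergence(trace_a, trace_b):
    """Return (step, event_a, event_b) for the first differing step, or None.

    A trace that ends early is reported with None for its missing event.
    """
    for step, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return step, a, b
    if len(trace_a) != len(trace_b):
        step = min(len(trace_a), len(trace_b))
        a = trace_a[step] if step < len(trace_a) else None
        b = trace_b[step] if step < len(trace_b) else None
        return step, a, b
    return None


# Hypothetical traces for one counterexample input, one per compiler version.
old_trace = [("call", "Foo", 7), ("store", 0x1000, 7), ("ret", 0)]
new_trace = [("call", "Foo", 7), ("store", 0x1000, 8), ("ret", 0)]

print(first_divergence(old_trace, new_trace))
```

Here the traces agree on the call and first differ at the store, so a user or an automated analysis would start looking at the code that computes the stored value.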

Root cause analysis

Bucketing
• Based on root cause analysis
• Users write bucket descriptions

Conclusions
• Some statistics:
  • methods analyzed: > 500,000
  • new bugs found: 12
  • false alarm rate, month-to-month versions: 2.2%
  • false alarm rate, opt vs. unopt, ARM vs. x86: > 20%
  • speed: 13 seconds per method
• Sources of false alarms:
  • aliasing, run-time system calls, embedded addresses, ...
• Counterexample traces, root cause analysis essential