BOINC Application Development Lifecycle
Rom Walton
rwalton@ssl.berkeley.edu
September 10, 2008
BOINC Application Lifecycle
• Designing Application
• Writing Code
• Instrumenting Code
• Fixing Bugs
• Checking Crash Reports
• Formulating a Plan of Action
• Running Test Workunits
• Checking for Memory Leaks
• Checking for Handle Leaks
• Uploading Application
• Uploading Support Files
• Running update_versions
Developing Apps
• Worker applications should call: boinc_init_diagnostics(BOINC_DIAG_DEFAULTS);
• Graphics applications should call: boinc_init_graphics_diagnostics(BOINC_DIAG_DEFAULTS);
• Benefits of the BOINC Diagnostic Framework
  • Stack traces on crashes (Win, Linux, Mac)
  • Prevents the JIT Debugger dialog (Win)
  • Memory leak detection (Win – VS Debug Builds)
  • Heap corruption detection (Win – VS Debug Builds)
Developing Apps (cont’d)
• Platform-neutral tips
  • Use a separate project to test your applications
• Windows tips
  • Use the symbol store technology, if possible
• Linux tips
  • Link the application with the BOINC-Build-Compat virtual machine
  • The application binds to an older glibc but still contains the optimizations of newer compilers
• Mac tips
  • Build against the Mac OS X 10.3.9 SDK
Developing Apps (cont’d)
Windows Symbol Stores
• What are they?
• How do they work?
• How does the BOINC Framework use them?
• How do I configure my Windows machine?
  • Set environment variable _NT_SYMBOL_PATH to: srv*c:\windows\symbols*http://msdl.microsoft.com/download/symbols
  • Execute: symchk c:\windows\system32\*.dll /ov
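The symbol path on this slide follows the pattern srv*<local cache>*<server URL>, and the crash report header later in the deck shows several such entries joined with semicolons. A minimal sketch of composing that string is below; the SymbolStore struct and build_symbol_path function are hypothetical names used only for illustration and are not part of BOINC or the Windows debugging tools.

```cpp
#include <string>
#include <vector>

// One symbol store entry: a local cache directory plus the server it mirrors.
// (Hypothetical helper type, for illustration only.)
struct SymbolStore {
    std::string cache_dir;
    std::string server_url;
};

// Join entries into the "srv*<cache>*<url>;srv*<cache>*<url>" form
// that _NT_SYMBOL_PATH expects.
std::string build_symbol_path(const std::vector<SymbolStore>& stores) {
    std::string path;
    for (const SymbolStore& store : stores) {
        if (!path.empty()) {
            path += ';';  // entries are separated by semicolons
        }
        path += "srv*" + store.cache_dir + "*" + store.server_url;
    }
    return path;
}
```

Passing the cache directory and Microsoft server from the slide as a single entry reproduces the value shown above; adding a project's own symbol server as a second entry yields a semicolon-separated list.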
Enable Symbols Demo
Developing Highlights
• Use the BOINC Diagnostic Framework
  • Traps structured exceptions (Win)
  • Traps signals (Linux, Mac)
  • Detects memory leaks (Win)
  • Detects heap corruption (Win)
Validating Apps
• Run through your test workunits
• Check for buffer overruns
  • Win: BOINC_DIAG_HEAPCHECKENABLED
  • Linux: Valgrind
• Check for memory leaks
  • Win: BOINC_DIAG_MEMORYLEAKCHECKENABLED
  • Linux: Valgrind
• Check for handle leaks
  • Win: Process Explorer, ntsd (!handles), oh.exe
  • Linux: lsof
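Besides inspecting a process from outside with lsof, a test build can count its own open descriptors before and after a workunit and flag any growth. The sketch below is Linux-only and assumes /proc is mounted; count_open_fds is a hypothetical helper, not a BOINC API.

```cpp
#include <cstring>
#include <dirent.h>

// Count the file descriptors currently open in this process by listing
// /proc/self/fd (Linux-specific). Returns -1 if the directory cannot be read.
int count_open_fds() {
    DIR* dir = opendir("/proc/self/fd");
    if (dir == nullptr) {
        return -1;
    }
    int count = 0;
    while (dirent* entry = readdir(dir)) {
        // Skip the "." and ".." directory entries.
        if (std::strcmp(entry->d_name, ".") == 0 ||
            std::strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        ++count;
    }
    closedir(dir);
    // The directory listing itself held one descriptor open; exclude it.
    return count - 1;
}
```

Calling this once before and once after processing a test workunit gives two numbers to compare; a count that keeps rising across runs suggests a leaked file or socket.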
Validating Apps (cont’d)
gflags.exe options:
Global Flags (gflags.exe)
• Not useful for debugging in the field
• Tweaks the heap manager (works with any type of application)
• Only detects buffer overruns for dynamically allocated memory
Validating Apps (cont’d)
Example memory leak output:
Dumping objects ->
..\lib\diagnostics_win.C(642) : {70} normal block at 0x017C3A30, 1068 bytes long.
 Data: < T > 1C 1A 00 00 54 00 00 00 01 00 00
Object dump complete.
Example handle leak output (oh.exe):
boinc.exe File 0008 ProgramData\BOINC\stdoutdae.txt
boinc.exe Key  0034 REGISTRY\MACHINE
boinc.exe File 0094 ProgramData\BOINC\stderrdae.txt
Validating Highlights
• Test the following once per application release cycle:
  • Memory leaks
  • Handle leaks
  • Heap corruption
• Test the following once per application release:
  • Exception/signal handling (abort a task from BOINC Manager)
Deploying Apps
• Add application to ‘$project_root/project.xml’:
<boinc>
  <app>
    <name>$app_name</name>
    <user_friendly_name></user_friendly_name>
  </app>
</boinc>
  • $app_name = Whatever you want to call your application
• Execute xadd to add your application to the project:
cd $project_root
./bin/xadd
Deploying Apps (cont’d)
• Upload your application and its dependencies to: $project_root/apps/$app_name/$app_ver_name
  • $app_ver_name is the name of the worker executable
• Worker executable name is defined as: $name_$version_$platform
  • Windows: uppercase_6.1_windows_intelx86.exe
  • Linux: uppercase_6.1_i686-pc-linux-gnu
  • Mac: uppercase_6.1_i686-apple-darwin
• Execute update_versions to add app versions to the project:
cd $project_root
./bin/update_versions
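The naming rule $name_$version_$platform is easy to get wrong by hand, so a build script may want to generate it. The following is a minimal sketch of that rule; worker_file_name is a hypothetical helper, and the rule that only Windows platforms get an .exe suffix is inferred from the three examples on the slide.

```cpp
#include <string>

// Compose a worker executable name of the form $name_$version_$platform.
// (Hypothetical helper for illustration; not part of the BOINC tools.)
std::string worker_file_name(const std::string& name,
                             const std::string& version,
                             const std::string& platform) {
    std::string file = name + "_" + version + "_" + platform;
    // Assumption drawn from the slide's examples: Windows platform names
    // start with "windows_" and their executables carry an .exe suffix.
    if (platform.rfind("windows_", 0) == 0) {
        file += ".exe";
    }
    return file;
}
```

With the slide's values, worker_file_name("uppercase", "6.1", "i686-pc-linux-gnu") produces the Linux name shown above, and the windows_intelx86 platform produces the name ending in .exe.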
Deployment Demo
Deployment Highlights
• Application files are assumed to be immutable
• update_versions performs the following functions:
  • Copies application files to download directory
  • Reads signature file/generates signature file
  • Adds application to database
  • Touches feeder restart file
Debugging Apps
• 4% - 5% failure rates are not uncommon
  • Machines overheat
  • Bad memory
  • Bad processors
• Crash reports are stored inside the BOINC Database
• Go to the project_ops web site to look at crash reports
• Look for reports that have:
  • PDB symbols for the application and libraries
  • Complete stack traces
Debugging Apps (cont’d)
• Determine that the environment is set up correctly
BOINC Windows Runtime Debugger Version 5.5.0
Dump Timestamp    : 09/09/08 23:41:39
Debugger Engine   : 4.0.5.0
Symbol Search Path: srv*C:\DOCUME~1\romw\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/alpha/symstore;srv*c:\windows\symbols*http://msdl.microsoft.com/download/symbols;srv*C:\DOCUME~1\romw\LOCALS~1\Temp\symbols*http://boinc.berkeley.edu/symstore
• Check that your symbol store is listed
Debugging Apps (cont’d)
• Determine that the loaded modules have symbols
ModLoad: 00400000 00060000 C:\...\uppercase_6.3_windows_intelx86.exe (PDB Symbols Loaded)
ModLoad: 7c800000 000c0000 C:\...\ntdll.dll (5.2.3790.1830) (PDB Symbols Loaded)
• Symbols may not download
  • Machine is off-line during crash
  • Took longer than 30 seconds to download symbol file(s)
Debugging Apps (cont’d)
• Crashing thread contains an exception record:
- Unhandled Exception Record
Reason: Breakpoint Encountered (0x80000003) at address 0x7C822583
• Access Violation (0xC0000005)
  • NULL reference exceptions
  • Using indexes as pointers
  • Processor flipping bits because of heat
• Breakpoint Encountered (0x80000003)
  • User aborted an active task
  • Assertion failure in debug builds
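When triaging many crash reports, it can help to turn the exception code into the short list of usual suspects from this slide. The sketch below covers only the two codes discussed here; ExceptionInfo and describe_exception are hypothetical names, and the cause lists simply restate the bullets above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Reason text plus the likely causes listed on the slide.
// (Hypothetical type, for illustration only.)
struct ExceptionInfo {
    std::string reason;
    std::vector<std::string> usual_suspects;
};

// Map a Windows exception code from a crash report to its description.
ExceptionInfo describe_exception(std::uint32_t code) {
    switch (code) {
        case 0xC0000005u:
            return {"Access Violation",
                    {"NULL reference",
                     "index used as a pointer",
                     "processor flipping bits because of heat"}};
        case 0x80000003u:
            return {"Breakpoint Encountered",
                    {"user aborted an active task",
                     "assertion failure in a debug build"}};
        default:
            // Any other code is outside the scope of this sketch.
            return {"Unrecognized exception code", {}};
    }
}
```

A report-scanning script could call describe_exception on the code parsed from the "Reason:" line and group reports by the returned reason before anyone reads individual call stacks.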
Debugging Apps (cont’d)
• Check call stack of crashing thread
- Callstack
ChildEBP RetAddr  Args to Child
00aafd60 00402221 00000000 00000001 ntdll!_DbgBreakPoint@0+0x0
00aaffb4 0042684e 77e66063 00000000 uppercase_5.10_windows_intelx86!worker+0x0 (c:\boincsrc\main\boinc\samples\uppercase\upper_case.c:181)
• Look for the interface point between your application and an OS library, typically NTDLL.DLL or KERNEL32.DLL
Debugging Apps (cont’d)
• What information is contained in a stack frame
ChildEBP RetAddr  Args to Child
00aaffb4 0042684e 77e66063 00000000 uppercase_5.10_windows_intelx86!worker+0x0 (c:\boincsrc\main\boinc\samples\uppercase\upper_case.c:181)
• ChildEBP – isn’t really useful
• RetAddr – Return address, should be within the same address space as the previous calling frame
• Args to Child – First four parameters to function
• Module!Function
• Source File: Line Number
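The Module!Function+offset field is the part of a frame most worth extracting when comparing many reports. A minimal sketch of splitting that field follows; FrameSymbol and parse_frame_symbol are hypothetical names, and the parser assumes the field looks like the examples on these slides (a "!" separator and an optional trailing "+0x<hex>").

```cpp
#include <cstdlib>
#include <string>

// The pieces of a "Module!Function+0xOFFSET" field from a call stack line.
// (Hypothetical type, for illustration only.)
struct FrameSymbol {
    std::string module;
    std::string function;
    unsigned long offset = 0;
    bool valid = false;
};

// Split a symbol field such as
// "uppercase_5.10_windows_intelx86!worker+0x0" into its parts.
FrameSymbol parse_frame_symbol(const std::string& text) {
    FrameSymbol out;
    const std::size_t bang = text.find('!');
    if (bang == std::string::npos) {
        return out;  // no module separator: leave valid == false
    }
    out.module = text.substr(0, bang);
    const std::size_t plus = text.rfind("+0x");
    if (plus == std::string::npos || plus < bang) {
        // No offset present; everything after '!' is the function name.
        out.function = text.substr(bang + 1);
    } else {
        out.function = text.substr(bang + 1, plus - bang - 1);
        out.offset = std::strtoul(text.c_str() + plus + 3, nullptr, 16);
    }
    out.valid = true;
    return out;
}
```

Grouping reports by the (module, function) pair of the first frame that belongs to the application, rather than to ntdll or kernel32, is one way to spot a common theme across many crashes.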
Debugging Apps (cont’d)
• When ‘X’ doesn’t mark the spot
  • Don’t get discouraged
  • Sample several crash reports
  • Debugging applications remotely can be hard
• One nasty bug in the Rosetta@home application
  • Only happened during application destruction
  • I looked at 120 crash reports over the course of 5 days before seeing the common theme
  • Resource contention between two threads
Debugging Apps (cont’d)
• Need extra help?
  • Add trace statements by logging to stderr
  • Release a new version of your application
• Only you know how your applications really work
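A trace statement can be as small as one timestamped line written to stderr and flushed immediately, so that it is not lost if the application crashes on the next instruction. The sketch below uses only the C++ standard library; format_trace and trace are hypothetical helpers, and the line layout is an arbitrary choice rather than a BOINC convention.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Build one trace line: "<timestamp> [TRACE] <where>: <message>".
std::string format_trace(const std::string& timestamp,
                         const std::string& where,
                         const std::string& message) {
    return timestamp + " [TRACE] " + where + ": " + message;
}

// Write a UTC-timestamped trace line to stderr and flush it right away.
void trace(const std::string& where, const std::string& message) {
    char stamp[32] = "";
    const std::time_t now = std::time(nullptr);
    if (const std::tm* utc = std::gmtime(&now)) {
        std::strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", utc);
    }
    std::fprintf(stderr, "%s\n", format_trace(stamp, where, message).c_str());
    std::fflush(stderr);
}
```

A call such as trace("worker", "finished reading input") placed around the suspect code narrows down where a crash happens when the call stack alone is inconclusive.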
Debugging Highlights
• BOINC’s diagnostic framework
  • Processes crashes for you
  • Dumps relevant information to stderr
  • No need to manage dump or mini-dump files
  • No need to understand the specifics of each platform’s infrastructure
  • Easy to enable
Conclusion
• BOINC Application Framework:
  • Mature diagnostic framework
  • Completely automatic feedback loop
  • BOINC Client uses the same framework
  • We eat our own dog food
• Should be compatible with all run-time environments
  • C/C++
  • Fortran
  • Anything that supports calling a C function
Questions and Answers
• BOINC Website: http://boinc.berkeley.edu/
• BOINC Development Mailing List: boinc_dev@ssl.berkeley.edu
• BOINC Projects Mailing List: boinc_projects@ssl.berkeley.edu
• BOINC Application Debugging: http://boinc.berkeley.edu/trac/wiki/AppDebug
Any Questions?
Tools
• Debugging Tools for Windows (Windows)
  • gflags.exe, symchk.exe, symstore.exe, windbg.exe
• Windows Resource Kit Tools (Windows)
  • oh.exe, dh.exe, consume.exe, clear.exe
• Process Explorer (Windows)
  • DLL Dependencies, Handle Leak Detection, Real-Time Thread Monitoring (Backtraces)
Tools (cont’d)
• Valgrind (Linux)
  • Memory Leak Detection, Thread Error Detection
• DDD (Linux)
  • Graphical Debugger