debugging, bug finding and bug avoidance
Part 4
Alan Dix
www.hcibook.com/alan/teaching/debugging/
outline
• part 1 – general issues and heuristics
• part 2 – the system as it is
  – understand and document
• part 3 – locating and fixing bugs
• part 4 – bug engineering
  – design to expose, avoid and recover
  – including fail-fast programming
debugging – part 4: bug engineering
• simplicity
• fail-fast programming
• factoring
• limiting effects of error
simplicity
simple code is correct code
simplicity: avoid
• Tower of Babel
  – long chains of method calls
  – deep subclass trees
• dead code
  – once was needed / may be needed someday!
• unnecessary optimisations
  – instead profile, then optimise where needed
simplicity: DO
• choose simple data structures
  – arrays / other pre-supplied data
  – static vs. dynamically allocated (esp. in C/C++)
• choose simple algorithms
  – linear searches etc.
• implement key paths first
  – design class structure for generality
  – implement methods necessary for early use/test
  + incremental delivery, no dead code
fail fast coding
• if it is going to fail, better sooner rather than later
• during testing rather than use
• 'cos then you can fix it
fail fast coding: general principles
• constant checking
  – data integrity, logical integrity
  – log, throw exception (or crash!)
• avoid special cases
  – the worst case will eventually happen
  – reused code is heavily tested code
fail fast coding: internal consistency
• C/C++ assert
    char *c = getname();
    assert( c != NULL );
• validation methods and exceptions
    obj.doSomethingComplicated();
    obj.check();
    class A {
        public void check() throws CheckException { }
    }
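The validation-method idea can be sketched as follows. This is only an illustration under assumptions: `SortedBag`, `insert` and `corruptForDemo` are hypothetical names, and `CheckException` is made unchecked here purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Assumption: an unchecked exception, so callers are not forced to catch it.
class CheckException extends RuntimeException {
    CheckException(String message) { super(message); }
}

// Hypothetical class whose invariant is "items are in ascending order".
class SortedBag {
    private final List<Integer> items = new ArrayList<>();

    // The "complicated" operation: insert while keeping ascending order.
    void insert(int value) {
        int pos = 0;
        while (pos < items.size() && items.get(pos) <= value) pos++;
        items.add(pos, value);
    }

    // Deliberately breaks the invariant; exists only to simulate a bug.
    void corruptForDemo() { items.add(0, Integer.MAX_VALUE); }

    // Validation method: fails as soon as the invariant no longer holds.
    void check() throws CheckException {
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i - 1) > items.get(i)) {
                throw new CheckException("items out of order at index " + i);
            }
        }
    }
}
```

Calling `check()` straight after each complicated operation means a broken invariant is reported near the code that broke it, rather than much later in an unrelated place.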
fail fast coding: cross checking
• test method
    list.fastSortMethod();
    if ( ! list.isSorted() ) ...
• parallel execution
    temp = list.clone();
    list.fastSortMethod();
    temp.slowButSimpleSortMethod();
    if ( ! list.equals(temp) ) ...
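Both cross checks can be combined in one wrapper, roughly as below. The names are hypothetical, and `Collections.sort` merely stands in for whatever optimised sort is under suspicion; the insertion sort plays the slow-but-simple reference.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CrossCheck {
    // Test method: is the list in ascending order?
    static boolean isSorted(List<Integer> list) {
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i - 1) > list.get(i)) return false;
        }
        return true;
    }

    // Slow but simple reference implementation: insertion sort.
    static void slowButSimpleSort(List<Integer> list) {
        for (int i = 1; i < list.size(); i++) {
            int value = list.get(i);
            int j = i - 1;
            while (j >= 0 && list.get(j) > value) {
                list.set(j + 1, list.get(j));
                j--;
            }
            list.set(j + 1, value);
        }
    }

    // Stand-in for the fast sort being checked (an assumption of this sketch).
    static void fastSort(List<Integer> list) { Collections.sort(list); }

    // Sorts in place and fails fast if either cross check disagrees.
    static void checkedSort(List<Integer> list) {
        List<Integer> temp = new ArrayList<>(list);
        fastSort(list);
        if (!isSorted(list)) {
            throw new IllegalStateException("fast sort left the list unsorted");
        }
        slowButSimpleSort(temp);
        if (!list.equals(temp)) {
            throw new IllegalStateException("fast and slow sorts disagree");
        }
    }
}
```

The parallel execution doubles (or worse) the running time, so a check like this is usually something to enable during testing rather than in deployed code.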
fail fast coding: magic values
• special arbitrary value, just to check
  – many file formats: first few bytes special, identify format
      GIF89a....binary...data...
• can add to your own data
  – file formats, version markers
  – C/C++ around dynamic structures (for buffer overrun bugs etc.)
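A magic value plus version marker for your own data might look like this sketch. The record layout (magic, version, payload as three 4-byte integers) and the constant `MAGIC` are invented for illustration and do not correspond to any real file format.

```java
import java.nio.ByteBuffer;

class MagicRecord {
    static final int MAGIC = 0x4D414749;  // arbitrary marker: ASCII "MAGI"
    static final int VERSION = 1;         // hypothetical format version
    static final int LENGTH = 12;         // three 4-byte integers

    // Writes magic value, version marker, then the payload.
    static byte[] encode(int payload) {
        return ByteBuffer.allocate(LENGTH)
                .putInt(MAGIC).putInt(VERSION).putInt(payload).array();
    }

    // Refuses to interpret data that does not start with the expected markers.
    static int decode(byte[] bytes) {
        if (bytes.length != LENGTH) {
            throw new IllegalArgumentException("wrong length: " + bytes.length);
        }
        ByteBuffer in = ByteBuffer.wrap(bytes);
        if (in.getInt() != MAGIC) {
            throw new IllegalArgumentException("bad magic value");
        }
        int version = in.getInt();
        if (version != VERSION) {
            throw new IllegalArgumentException("unsupported version " + version);
        }
        return in.getInt();
    }
}
```

A corrupted or wrong-format input is rejected at the first read instead of being parsed into plausible-looking garbage.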
fail fast coding: real time systems
• Byzantine cases: branch testing not sufficient
• evaluate and discard
  – best case = worst case
  – fast enough during testing means always fast enough
submarine cable tracker
[diagram: user interface, memory, 68xxx control processor, sensor array, A-to-D, Texas DSP, shared memory, 20 ms interrupt loop]
fail fast coding: internal reuse
• same code used in several places
  – method, function, iteration or copy
• more uses of same code, so more likely to find bugs
• copying may introduce differences
fail fast coding: table driven code
• form of reuse
• replace long-hand code
original long-hand code:
    if ( 3 < x && x <= 7 ) return "hello";
    if ( 19 < x && x <= 23 ) return "world";
    ...
    return "default";
fail fast coding: table driven code
• long-hand code -> table + loop
• table in file or code
long-hand code:
    if ( 3 < x && x <= 7 ) return "hello";
    if ( 19 < x && x <= 23 ) return "world";
    ...
    return "default";
table:
    min  max  val
    3    7    hello
    19   23   world
    ...
table + loop:
    for ( i=0; i<t.len; i++ ) {
        if ( t[i].min < x && x <= t[i].max ) return t[i].val;
    }
    return "default";
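As a runnable version of the table + loop, here is a minimal Java sketch. The class and field names are assumptions, and only the two rows shown on the slide are included.

```java
class RangeTable {
    // One table row: a value x matches when min < x <= max.
    static final class Row {
        final int min;
        final int max;
        final String val;

        Row(int min, int max, String val) {
            this.min = min;
            this.max = max;
            this.val = val;
        }
    }

    // Table held in code here; it could equally be loaded from a file.
    static final Row[] TABLE = {
        new Row(3, 7, "hello"),
        new Row(19, 23, "world"),
    };

    // The single loop that replaces every long-hand if statement.
    static String lookup(int x) {
        for (Row row : TABLE) {
            if (row.min < x && x <= row.max) return row.val;
        }
        return "default";
    }
}
```

Every row goes through the same comparison, so a boundary mistake (for example `<` where `<=` was meant) shows up on the first row tested instead of hiding in one rarely used branch.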
fail fast coding: generated code
table:
    min  max  val
    3    7    hello
    19   23   world
    ...
code generator program:
    for ( i=0; i<t.len; i++ ) {
        print("if ( " + t[i].min + " < x");
        print(" && x <= " + t[i].max + " )\n");
        print("\t return \"" + t[i].val + "\";\n");
    }
generated code (then compile):
    if ( 3 < x && x <= 7 ) return "hello";
    if ( 19 < x && x <= 23 ) return "world";
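The generator step can be sketched in Java along these lines. Passing the table as three parallel arrays and returning the generated source as a string are simplifications of this sketch; the trailing `return "default";` line is taken from the long-hand version.

```java
class RangeCodeGenerator {
    // Emits one long-hand if statement per table row, then the default case.
    static String generate(int[] min, int[] max, String[] val) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < val.length; i++) {
            out.append("if ( ").append(min[i]).append(" < x");
            out.append(" && x <= ").append(max[i]).append(" )\n");
            out.append("\treturn \"").append(val[i]).append("\";\n");
        }
        out.append("return \"default\";\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the source text that would then be compiled.
        System.out.print(generate(
                new int[] {3, 19}, new int[] {7, 23},
                new String[] {"hello", "world"}));
    }
}
```

As with the table-driven loop, a bug in the generator affects every generated branch, so it is likely to be noticed early.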
factoring
[diagram: component 1, communication, component 2; monitor; offline: record, replay into component 2]
factoring in code
    res = complexCalculation();
• introduce intermediate data structures
    intermediate_val = complexPart1();
    // check and/or print intermediate_val
    res = complexPart2(intermediate_val);
• exposes hidden state
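A small concrete instance of this split, with all names hypothetical: a calculation that totals comma-separated numbers is factored into a parsing part and a summing part, and the intermediate array is printed and checked in between. The "no negative values" rule is an invented example of a check.

```java
import java.util.Arrays;

class FactoredTotal {
    // Part 1: parse comma-separated text into an intermediate array.
    static int[] parsePart(String csv) {
        String[] fields = csv.split(",");
        int[] values = new int[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Integer.parseInt(fields[i].trim());
        }
        return values;
    }

    // Part 2: combine the intermediate values into the final result.
    static int sumPart(int[] values) {
        int total = 0;
        for (int v : values) total += v;
        return total;
    }

    static int total(String csv) {
        int[] intermediate = parsePart(csv);
        // Print and check the intermediate value before part 2 runs.
        System.err.println("parsed: " + Arrays.toString(intermediate));
        for (int v : intermediate) {
            if (v < 0) {
                throw new IllegalStateException("negative value after parsing: " + v);
            }
        }
        return sumPart(intermediate);
    }
}
```

Because each part can now be called on its own, a wrong total can be traced to either the parsing or the summing without stepping through the whole calculation.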
factoring through files
[diagram: a complex program (part 1, object, part 2) split into program 1, which calls object.write(), and program 2, which calls object.read()]
factoring through protocols
[diagram: client, network, server]
• instrument client and/or server
• log or use proxy
[diagram: client, proxy, server]
limiting effects of error
“file format error - aborting”
• when one subsystem goes wrong
  – protect the rest of the system
• best efforts to recover
  – during testing: allows further tests
  – when deployed: allows continued use
• do log/warn of problem!
limiting effects of error: safety net
[diagram: component, checks, main program]
• check bounds etc., sanitise values
• beware non-termination, memory hogs
limiting effects of error: sandbox
[diagram: component, other components]
• e.g. applet
• entirely quarantine component
• prevent unauthorized actions
limiting effects of error: guardian
[diagram: inputs, outputs, guardian]
• monitor critical interfaces
• on failure: reset or intervene
limiting effects of error: in code
• pre-post checks
  – outputs of subcomponents
  – inputs to methods
• wrapper/proxy classes
• VM/OS support
  – memory protection: Unix/WinNT/Mac OS X
  – Java classloader/security manager
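A wrapper class that checks the output of a subcomponent could be sketched like this. The `Sensor` interface, the bounds and the fallback value are all hypothetical; the point is only that the caller never sees an out-of-range value or an exception from the wrapped component, while the problem is still logged.

```java
// Hypothetical subcomponent interface.
interface Sensor {
    double read();
}

// Wrapper that sits between a possibly faulty sensor and the rest of the system.
class GuardedSensor implements Sensor {
    private final Sensor inner;
    private final double min;
    private final double max;
    private final double fallback;

    GuardedSensor(Sensor inner, double min, double max, double fallback) {
        this.inner = inner;
        this.min = min;
        this.max = max;
        this.fallback = fallback;
    }

    @Override
    public double read() {
        double value;
        try {
            value = inner.read();
        } catch (RuntimeException e) {
            // Do log the problem, then recover with a safe value.
            System.err.println("sensor failed: " + e.getMessage());
            return fallback;
        }
        // Post-check on the subcomponent's output: bounds and NaN.
        if (Double.isNaN(value) || value < min || value > max) {
            System.err.println("sensor value out of range: " + value);
            return fallback;
        }
        return value;
    }
}
```

Note that a wrapper of this kind does not protect against the concerns raised for safety nets, such as a component that never returns or exhausts memory.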
limiting effects of error: self-healing data
• parameters
  – null -> default value
• data files
  – calculate/default missing values
  – ignore unexpected values
  – sentinels for easy parsing
  – old format -> new format conversions
• data operations
  – check and tidy-up data structures
  – attribute/property lists for extensibility (beware also)
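Defaulting missing values and ignoring unexpected ones might look like the following sketch for a simple key/value settings map. The keys `timeout` and `retries` and their defaults are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class SelfHealingSettings {
    // Hypothetical known keys and their default values.
    static final Map<String, String> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put("timeout", "10");
        DEFAULTS.put("retries", "3");
    }

    // Returns a complete settings map whatever state the raw input is in.
    static Map<String, String> heal(Map<String, String> raw) {
        Map<String, String> healed = new HashMap<>(DEFAULTS);
        if (raw == null) {
            return healed;  // null parameter -> default values
        }
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            if (DEFAULTS.containsKey(entry.getKey()) && entry.getValue() != null) {
                healed.put(entry.getKey(), entry.getValue());
            } else {
                // Ignore unexpected or empty entries, but do log them.
                System.err.println("ignoring entry: " + entry.getKey());
            }
        }
        return healed;
    }
}
```

Silently repairing data can hide the bug that damaged it, which is why the sketch logs every entry it drops.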
limiting effects of error: workarounds
• avoid bad values
  – document, wrap with protective code
  – alternative code for bad cases
• restart frequently
  – especially for memory leaks
• recover after error
  – backup/recovery points
• adjust environment
  – more memory, time, etc.
last word
if you think it will be perfect, things will fail
accept you aren’t, and things may work