C for Microcontrollers Just Being Efficient Lloyd Moore
'C' for Microcontrollers, Just Being Efficient
Lloyd Moore, President
Lloyd@CyberData-Robotics.com
www.CyberData-Robotics.com
Seattle Robotics Society 9/15/2012

Agenda
• Microcontroller Resources
• Knowing Your Environment
• Memory Usage
• Code Structure
• Optimization
• Summary

Disclaimer
• Some microcontroller techniques necessarily need to trade one benefit for another, typically lower resource usage for maintainability
• Point of this presentation is to point out various techniques that can be used as needed
• Use these suggestions when necessary
• Feel free to suggest better solutions as we go along

Microcontroller Resources
• EVERYTHING resides on one die inside one package: RAM, Flash, Processor, I/O
• Cost is a MAJOR design consideration
    • Typical costs are $0.25 to $25 each (1000's)
• RAM: 16 BYTES to 256 KBytes typical
• Flash/ROM: 384 BYTES to 1 MByte
• Clock Speed: 4 MHz to 175 MHz typical
    • Much lower for battery saving modes (32 kHz)
• Bus is 8, 16, or 32 bits wide
• Have dedicated peripherals (MAC, PHYs, etc.)

Power Consumption
• Microcontrollers typically used in battery operated devices
• Power requirements can be EXTREMELY tight
    • Energy harvesting applications
    • Long term battery installations (remote controls, hard to reach devices, etc.)
• EVERY instruction executed consumes power, even if you have the time and memory!

Know Your Environment
• Traditionally we ignore hardware details
• Need to tailor code to hardware available
• Specialized hardware MUCH more efficient
• Compilers typically have extensions
    • Interrupt: specifies code as being ISR
    • Memory model: may handle banked memory and/or simultaneous access banks
    • Multiple data pointers / address generators
• Debugger may use some resources

Memory Usage
• Put constant data into program memory (Flash/ROM)
• Alignment / padding issues
    • Typically NOT an issue, non-aligned access ok
• Avoid dynamic memory allocation, even if available
    • Takes extra space and processing time
    • Memory fragmentation a big issue
• Use and reuse static buffers
    • Reduces variable passing overhead
    • Allows for smaller / faster code due to reduced indirections
    • Does bring back overwrite bugs if not done carefully
    • More reliable for mission critical systems
• Use the appropriate variable type
    • Don't use int and double for everything!!
    • Affects processing time as well as storage

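A minimal sketch of the constant-data and static-buffer points above, assuming a toolchain that places `const` objects in Flash/ROM (some compilers need an extra, vendor-specific keyword for that, which is not shown). The names `k_gain_table`, `g_scratch`, `SCRATCH_SIZE`, `scratch_fill` and `table_checksum` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Constant data: 'const' lets many embedded toolchains keep this table in
   Flash/ROM instead of copying it to RAM (assumption: toolchain dependent). */
static const uint8_t k_gain_table[4] = { 0x12u, 0x34u, 0x56u, 0x78u };

/* One statically allocated scratch buffer, reused by every caller instead of
   calling malloc()/free(). The size is a hypothetical value. */
#define SCRATCH_SIZE 32u
static uint8_t g_scratch[SCRATCH_SIZE];

/* Fills the shared buffer with 'len' copies of 'value' and returns a pointer
   to it, or NULL if the request does not fit. The bounds check guards against
   the overwrite bugs that shared buffers can bring back. */
uint8_t *scratch_fill(uint8_t value, size_t len)
{
    size_t i;
    if (len > SCRATCH_SIZE) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        g_scratch[i] = value;
    }
    return g_scratch;
}

/* 8-bit wrapping checksum computed directly from the constant table. */
uint8_t table_checksum(void)
{
    uint8_t sum = 0;
    size_t i;
    for (i = 0; i < sizeof k_gain_table; i++) {
        sum = (uint8_t)(sum + k_gain_table[i]);
    }
    return sum;
}
```

Every caller of `scratch_fill` gets the same buffer back, so a result must be consumed before the next call overwrites it. That is the trade the slide describes: less RAM and no heap, in exchange for more care.
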
C99 Datatypes: inttypes.h
• int8_t, int16_t, int32_t, int64_t
• uint8_t, uint16_t, uint32_t, uint64_t
• Avoids the ambiguity of int and unsigned int when moving code between processors of different native size
• Makes code more portable and upgradable over time

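A short sketch of choosing widths deliberately with the C99 fixed-width types (declared in `stdint.h`, which `inttypes.h` includes). The 10-bit ADC scenario, `NUM_SAMPLES` and `average_samples` are assumptions made up for this example.

```c
#include <stdint.h>

/* Hypothetical: average a small block of 10-bit ADC readings. */
#define NUM_SAMPLES 8u

/* Width choices:
   - uint8_t index: 8 iterations fit easily in 8 bits.
   - uint16_t sum: 8 * 1023 = 8184 fits in 16 bits, so no 32-bit math
     is needed on an 8-bit or 16-bit core. */
uint16_t average_samples(const uint16_t samples[NUM_SAMPLES])
{
    uint16_t sum = 0;
    uint8_t i;
    for (i = 0; i < NUM_SAMPLES; i++) {
        sum = (uint16_t)(sum + samples[i]);
    }
    return (uint16_t)(sum / NUM_SAMPLES);
}
```

The same source keeps the same ranges whether `int` is 16 or 32 bits on the target, which is the portability benefit the slide points at.
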
Char vs. Int Increment on 8051

char cX; cX++;

    000A 900000    MOV   DPTR,#cX
    000D E0        MOVX  A,@DPTR
    000E 04        INC   A
    000F F0        MOVX  @DPTR,A

• 6 bytes of Flash
• 4 instruction cycles

int iX; iX++;

    0000 900000    MOV   DPTR,#iX
    0003 E4        CLR   A
    0004 75F001    MOV   B,#01H
    0007 120000    LCALL ?C?IILDX

• 10 bytes of Flash + subroutine overhead
• Many more than 4 instruction cycles with an LCALL

Code Structure
• Count down instead of up
    • Saves a subtraction (the compare against the end value) on all processors
    • Decrement-jump-not-zero style instruction on some processors
• Pointers vs. array notation
    • Generally better using pointers
• Bit shifting
    • May not always generate what you think
    • May or may not have barrel shifter hardware
    • May or may not have logical vs. arithmetic shifts

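The count-down and pointer suggestions above can be sketched as follows. Both functions compute the same sum; whether the second one is actually smaller or faster depends on the compiler and target, so treat it as something to check in the listing. The function names are hypothetical.

```c
#include <stdint.h>

/* Counts up and indexes the array on every pass. The loop test compares
   the counter against 'len' each time around. */
uint16_t sum_up(const uint8_t *data, uint8_t len)
{
    uint16_t sum = 0;
    uint8_t i;
    for (i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Counts down to zero and walks a pointer. The loop test is a compare
   against zero, which some cores fold into a single
   decrement-and-jump-if-not-zero instruction. */
uint16_t sum_down(const uint8_t *data, uint8_t len)
{
    uint16_t sum = 0;
    while (len != 0u) {
        sum = (uint16_t)(sum + *data++);
        len--;
    }
    return sum;
}
```
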
Shifting Example on 8051

cX = cX << 3;

    0006 33        RLC   A
    0007 33        RLC   A
    0008 33        RLC   A
    0009 54F8      ANL   A,#0F8H

cA = 3; cX = cX << cA;

    000B 900000    MOV   DPTR,#cA
    000E E0        MOVX  A,@DPTR
    000F FE        MOV   R6,A
    0010 EF        MOV   A,R7
    0011 A806      MOV   R0,AR6
    0013 08        INC   R0
    0014 8002      SJMP  ?C0005
    0016         ?C0004:
    0016 C3        CLR   C
    0017 33        RLC   A
    0018 D8FC    ?C0005: DJNZ R0,?C0004

• Constants turn into separate statements
• Variables turn into loops
• Both of these can be one instruction with a barrel shifter

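One possible workaround for the variable-shift loop, offered as an assumption and not taken from the slides: when the shift only builds a single-bit mask, a small constant table can stand in for `1 << n`. Whether the table wins depends on the target and compiler, so compare the listings. `k_bit_mask`, `mask_by_shift` and `mask_by_table` are hypothetical names.

```c
#include <stdint.h>

/* Single-bit masks for bit numbers 0..7, kept as constant data. */
static const uint8_t k_bit_mask[8] = {
    0x01u, 0x02u, 0x04u, 0x08u, 0x10u, 0x20u, 0x40u, 0x80u
};

/* Variable shift: on a core without a barrel shifter this may compile
   to a loop that shifts one position per pass. */
uint8_t mask_by_shift(uint8_t bit)
{
    return (uint8_t)(1u << (bit & 7u));
}

/* Table lookup: one indexed read, independent of the bit number. */
uint8_t mask_by_table(uint8_t bit)
{
    return k_bit_mask[bit & 7u];
}
```
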
Indexed Array vs Pointer on M8C

ucMode = g_Channels[uc_Channel].ucMode;

    01DC 52FC      mov   A,[X-4]
    01DE 5300      mov   [__r1],A
    01E0 5000      mov   A,0
    01E2 08        push  A
    01E3 5100      mov   A,[__r1]
    01E5 08        push  A
    01E6 5000      mov   A,0
    01E8 08        push  A
    01E9 5007      mov   A,7
    01EB 08        push  A
    01EC 7C0000    xcall __mul16
    01EF 38FC      add   SP,-4
    01F1 5F0000    mov   [__r1],[__rX]
    01F4 5F0000    mov   [__r0],[__rY]
    01F7 060000    add   [__r1],<_g_Channels
    01FA 0E0000    adc   [__r0],>_g_Channels
    01FD 3E00      mvi   A,[__r1]
    01FF 5403      mov   [X+3],A

ucMode = pChannel->ucMode;

    01ED 5201      mov   A,[X+1]
    01EF 5300      mov   [__r1],A
    01F1 3E00      mvi   A,[__r1]
    01F3 5405      mov   [X+5],A

• Does the same thing
• Saves 29 bytes of memory AND a call to a 16 bit multiplication routine!
• Pointer version will be at least 4x faster to execute as well, maybe 10x
• Most compilers not this bad, but you do find some!

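In C source, the two forms compared above might look like the sketch below. The listing appears to multiply the index by 7, so a 7-byte record is assumed here; every field other than `ucMode`, along with `NUM_CHANNELS` and the function names, is invented for illustration.

```c
#include <stdint.h>

/* Hypothetical 7-byte channel record (only ucMode appears in the slide). */
typedef struct {
    uint8_t ucMode;
    uint8_t ucLevel;
    uint8_t aucReserved[5];
} Channel;

#define NUM_CHANNELS 4u
static Channel g_Channels[NUM_CHANNELS];

/* Indexed form: a weak compiler may redo the index-times-size
   multiplication for every member access. */
void set_channel_indexed(uint8_t ucChannel, uint8_t ucMode, uint8_t ucLevel)
{
    g_Channels[ucChannel].ucMode = ucMode;
    g_Channels[ucChannel].ucLevel = ucLevel;
}

/* Pointer form: the element address is computed once, then each member
   access is a load or store through the pointer. */
void set_channel_ptr(uint8_t ucChannel, uint8_t ucMode, uint8_t ucLevel)
{
    Channel *pChannel = &g_Channels[ucChannel];
    pChannel->ucMode = ucMode;
    pChannel->ucLevel = ucLevel;
}

/* Read-back helper used to show that both forms store the same data. */
uint8_t get_mode(uint8_t ucChannel)
{
    return g_Channels[ucChannel].ucMode;
}
```

A compiler with common-subexpression elimination will often generate the same code for both functions; the pointer form matters most on small compilers that lack it.
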
More Code Structure
• Actual parameters typically passed in registers if available
    • Keep function parameters to fewer than 3
    • May also be passed on stack or special parameter area
    • May be more efficient to pass pointer to struct
• Global variables
    • While generally frowned upon for most code, can be very helpful here
    • Typically ends up being a direct access
• Read assembly code for critical areas
• Know which optimizations are present
    • Small compilers do not always have common optimizations
    • Inline, loop unrolling, loop invariant, pointer conversion

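A sketch of the "pass a pointer to a struct" suggestion, assuming a hypothetical PWM configuration. `PwmConfig`, its fields and both functions are made up for this example; how many parameters actually fit in registers is compiler and target specific.

```c
#include <stdint.h>

/* Hypothetical settings that would otherwise travel as separate arguments. */
typedef struct {
    uint16_t uiPeriodTicks;
    uint8_t  ucDutyPercent;
    uint8_t  ucChannel;
    uint8_t  ucMode;
} PwmConfig;

/* Four separate parameters: on a small core some of these may spill to
   the stack or to a compiler-specific parameter area. */
uint16_t pwm_on_ticks_args(uint8_t ucChannel, uint8_t ucMode,
                           uint16_t uiPeriodTicks, uint8_t ucDutyPercent)
{
    (void)ucChannel;   /* unused in this simplified calculation */
    (void)ucMode;
    return (uint16_t)(((uint32_t)uiPeriodTicks * ucDutyPercent) / 100u);
}

/* One pointer parameter: only the address is passed, and the callee
   reads just the fields it needs. */
uint16_t pwm_on_ticks(const PwmConfig *pCfg)
{
    return (uint16_t)(((uint32_t)pCfg->uiPeriodTicks * pCfg->ucDutyPercent) / 100u);
}
```
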
Switch Statement Implementation
• Switch statements can be implemented in various ways
    • Sequential compares
    • In line table look up for case block
    • Special function with look up table
• Specific implementation can also vary based on the case clauses
    • Clean sequence (1, 2, 3, 4, 5)
    • Gaps in sequence (1, 10, 30, 255)
    • Ordering of sequence (5, 4, 1, 2, 3)
• Knowing which method gets implemented is critical to optimizing!

Switch Statement Example

switch(cA)
{
    case 0: cX = 4; break;
    case 1: cX = 10; break;
    case 2: cX = 30; break;
    default: cX = 0; break;
}

    0006 900000    MOV   DPTR,#cA
    0009 E0        MOVX  A,@DPTR
    000A FF        MOV   R7,A
    000B EF        MOV   A,R7
    000C 120000    LCALL ?C?CCASE
    000F 0000      DW    ?C0002
    0011 00        DB    00H
    0012 0000      DW    ?C0003
    0014 01        DB    01H
    0015 0000      DW    ?C0004
    0017 02        DB    02H
    0018 0000      DW    00H
    001A 0000      DW    ?C0005
    001C         ?C0002:
    001C 900000    MOV   DPTR,#cX
    001F 7404      MOV   A,#04H
    0021 F0        MOVX  @DPTR,A
    0022 8015      SJMP  ?C0006
    ... More blocks follow for each case

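When every case only assigns a constant, as in the example above, one alternative is a constant table indexed by the selector. This is a sketch of that idea under that assumption, not what the compiler on the slide generates; `k_values`, `value_by_switch` and `value_by_table` are hypothetical names.

```c
#include <stdint.h>

/* Same mapping as the slide's switch, written as a function. */
uint8_t value_by_switch(uint8_t cA)
{
    uint8_t cX;
    switch (cA) {
        case 0:  cX = 4;  break;
        case 1:  cX = 10; break;
        case 2:  cX = 30; break;
        default: cX = 0;  break;
    }
    return cX;
}

/* Results for selectors 0..2, kept as constant data. */
static const uint8_t k_values[3] = { 4u, 10u, 30u };

/* Table form: one range check plus one indexed read. Anything outside
   the table takes the default value, matching the switch. */
uint8_t value_by_table(uint8_t cA)
{
    return (uint8_t)((cA < 3u) ? k_values[cA] : 0u);
}
```

This works cleanly because the case values form a dense sequence starting at zero. With gaps such as 1, 10, 30, 255 the table would be mostly empty and sequential compares may be the better fit.
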
Optimization Process
• Step 0: Before coding anything, think about risk points and prototype unknowns!!!
    • Use available dedicated hardware
• Step 1: Get it working!!
    • Fast but wrong is of no use to anyone
    • Optimization will typically reduce readability
• Step 2: Profile to know where to optimize
    • Usually one or two routines are critical
    • You need to have specific performance metrics to target

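For Step 2, one simple way to get a number to target is to read a free-running hardware timer before and after the routine. The sketch below assumes a 16-bit up-counting timer reached through a hypothetical read function. `measure_ticks`, `fake_timer_read`, `fake_work` and `g_fake_ticks` are stand-ins so the idea can be tried on a host; on real hardware the read function would return the timer register.

```c
#include <stdint.h>

typedef uint16_t (*TimerReadFn)(void);
typedef void (*WorkFn)(void);

/* Returns the elapsed timer ticks for one call of 'work'. Unsigned 16-bit
   subtraction stays correct across one counter wraparound, as long as the
   routine takes fewer than 65536 ticks. */
uint16_t measure_ticks(WorkFn work, TimerReadFn read_timer)
{
    uint16_t start = read_timer();
    work();
    return (uint16_t)(read_timer() - start);
}

/* Host-side stand-ins (hypothetical): a simulated counter and a routine
   that "costs" 120 ticks. */
uint16_t g_fake_ticks = 0;

uint16_t fake_timer_read(void)
{
    return g_fake_ticks;
}

void fake_work(void)
{
    g_fake_ticks = (uint16_t)(g_fake_ticks + 120u);
}
```

Toggling a spare I/O pin around the routine and watching it on a scope is another common approach that needs no timer at all.
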
Optimization Process
• Step 3: Let the tools do as much as they can
    • Turn off debugging!
    • Select the correct memory model
    • Select the correct optimization level
• Step 4: Do it manually
    • Read the generated code! Might be able to make a simple code or structure change.
    • Last: think about assembly coding

Summary
• Microcontrollers are a resource constrained environment
• Be familiar with the hardware in your microcontroller
• Be familiar with your compiler options and how it translates your code
• For time or space critical code, look at the assembly listing from time to time

Questions?