Buffer Overflow Attackproofing of Code Binaries
Ramya Reguramalingam, Gopal Gupta
Department of Computer Science, University of Texas at Dallas

Buffer Overflow: A Serious Problem
- Some examples include Windows 2003 Server, sendmail, and the Windows HTML conversion library.
- [Figure: Percentage of buffer overflows per year, as listed in CERT advisories [1]]
ALPS LAB@UTD

Overview of Our Approach
- Buffer overflow attacks (B.O.A.): a majority of the attacks for which advisories are issued are based on B.O.A.
- Other forms of attack, such as distributed denial-of-service attacks, sometimes rely on B.O.A.
- B.O.A.s exploit the memory organization of the traditional activation stack model to overwrite the return address stored on the stack.
- This memory organization can be changed slightly so that buffer overflows cannot overwrite return addresses.
- Our system automatically transforms code binaries to follow this modified memory organization, thereby preventing the most common forms of B.O.A.
- Our tool can be used on third-party software and off-the-shelf products, and it does not require access to source code.

Traditional Memory Organization
- Sample code:
    void function(int a, int b, int c) {
        char buffer1[8];
    }
    void main() {
        function(5, 8, 2);
    }
- Memory at the start, from the highest address (ff ff) down to the lowest (00):
    Stack   <- ESP
    Heap
    Data
    Code

Stack Organization: After a Call
- Sample code:
    void function(int a, int b, int c) {
        char buffer1[8];
    }
    void main() {
        function(5, 8, 2);
    }
- Stack after the function call (higher addresses at the top):
    Param 3 = 2          (parameters)
    Param 2 = 8          (parameters)
    Param 1 = 5          (parameters)
    Return address
    ebp                  <- EBP
    Buffer variables     <- ESP
    ...
    Heap, Data & Code

Buffer Overflow
- Sample code:
    void function(char *str) {
        char buffer1[8];
        strcpy(buffer1, str);    /* no bounds check: copies past buffer1 */
    }
    void main() {
        char large_str[256];
        for (int i = 0; i < 255; i++)
            large_str[i] = 'A';
        large_str[255] = '\0';   /* terminate the string so strcpy stops */
        function(large_str);
    }
- Stack showing the buffer overflow (strcpy writes upward, toward the return address):
    large_str        (64 words)
    Return address
    ebp
    buffer1          (2 words)
- New return address = 0x41414141 ('A' is 0x41)
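The overwrite can be reproduced without corrupting a real stack. The sketch below is only an illustration, not the authors' code: it models the frame of function() as a byte array, assuming 4-byte words and the layout on the slide (buffer1 lowest, then the saved ebp, then the return address). The names simulate_overflow, BUF_OFF, EBP_OFF, RET_OFF, and FRAME_SIZE are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the frame on the slide (32-bit, 4-byte words).
   Offsets grow toward higher addresses, the direction strcpy writes. */
enum { BUF_OFF = 0, EBP_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };

/* Copy `len` bytes of input into the simulated frame, starting at buffer1
   and ignoring its 8-byte bound, as an unchecked strcpy would.
   Returns the simulated return address after the copy. */
uint32_t simulate_overflow(const unsigned char *input, size_t len,
                           uint32_t original_ret)
{
    unsigned char frame[FRAME_SIZE] = {0};
    uint32_t ret;

    assert(len <= FRAME_SIZE);             /* stay inside the model itself */
    memcpy(frame + RET_OFF, &original_ret, sizeof original_ret);
    memcpy(frame + BUF_OFF, input, len);   /* the "overflowing" copy */
    memcpy(&ret, frame + RET_OFF, sizeof ret);
    return ret;
}
```

With 8 bytes of input the return address survives; with 16 bytes of 'A' it reads back as 0x41414141, matching the slide.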

Abusing the Buffer Overflow
- Step 1: Overwrite the return address with an address that points back into the buffer area.
- Step 2: Insert the code you wish to execute in the buffer area.
- Step 3: Pad the start of the inserted code with NOP instructions.
- Step 4: Eliminate any null values in the inserted code (strcpy stops copying at the first null byte).
- Resulting stack contents: [ssssssssss][ebp][ret], where s marks the inserted code

Buffer Overflow Attack-proofing
- B.O.A.s work by overwriting return addresses.
- If return addresses are recorded in a separate area, away from the direction of the buffer overflow, they cannot be overwritten.
- So modify the memory organization to add a new return address stack, allocated in an area opposite to the direction in which buffers are allocated.
- When a function call exits, it uses the return address stored in this new return address stack.
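The idea can be sketched in C as a separate array of return addresses. This is a simplified model under stated assumptions, not the binary transformation itself: the type return_stack, the functions rs_on_call and rs_on_return, and the capacity RS_CAPACITY are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RS_CAPACITY 64  /* hypothetical limit on call depth */

typedef struct {
    uint32_t slots[RS_CAPACITY];
    size_t   depth;     /* number of return addresses currently stored */
} return_stack;

/* On call: record the return address away from the data buffers. */
void rs_on_call(return_stack *rs, uint32_t return_address)
{
    assert(rs->depth < RS_CAPACITY);
    rs->slots[rs->depth++] = return_address;
}

/* On return: ignore whatever the ordinary stack slot holds (it may have
   been overwritten) and use the recorded address instead. */
uint32_t rs_on_return(return_stack *rs, uint32_t address_on_data_stack)
{
    (void)address_on_data_stack;
    assert(rs->depth > 0);
    return rs->slots[--rs->depth];
}
```

Even if the ordinary stack slot has been overwritten with 0x41414141, rs_on_return yields the address that was recorded at call time.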

B.O.A.-proofing (Cont'd)
- Transform the executable binary code so that it is consistent with this new memory organization.
- Conditions that must hold for this transformation to be semantics-preserving:
  - The code must be re-entrant.
  - The code must not modify the stack pointer or use it for computation in the program.
- Processor: Intel 386 (x86)
- Compiler: Dev-C++ 4.9.9.1
- Platform: Windows

Transforming the Binary
- Step 1: Analyze the binary (PE format) and build a flow graph for the code.
- Step 2: Modify the binary so that the return address is stored on a separate stack (within the stack area).
  - This separate stack is allocated in an area opposite to the direction of buffer growth.
- Step 3: Convert the modified hexadecimal code back to an .exe file.
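A first check in Step 1 might be to confirm that the input really is a 32-bit x86 PE image. The sketch below is an assumption about how such a check could look, not the authors' tool: it relies only on the documented PE layout (the "MZ" DOS signature, the e_lfanew field at offset 0x3C, the "PE\0\0" signature, and the Machine value 0x014C for i386). The function names is_i386_pe and read_le32 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian value byte by byte so the code is host independent. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if `image` looks like a 32-bit x86 PE file, 0 otherwise. */
int is_i386_pe(const unsigned char *image, size_t size)
{
    uint32_t pe_off;
    uint16_t machine;

    assert(image != NULL);
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return 0;                        /* no DOS header */
    pe_off = read_le32(image + 0x3C);    /* e_lfanew: offset of PE signature */
    if (pe_off > size || size - pe_off < 6)
        return 0;                        /* signature + Machine must fit */
    if (memcmp(image + pe_off, "PE\0\0", 4) != 0)
        return 0;
    machine = (uint16_t)(image[pe_off + 4] | (image[pe_off + 5] << 8));
    return machine == 0x014C;            /* IMAGE_FILE_MACHINE_I386 */
}
```

A real analyzer would go on to read the section table and locate the code section before building the flow graph; that part is omitted here.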

Step 1 - Sample Source Code
- Sample code:
    int func1(int a, int b) {
        int temp = a + b;
        return (temp);
    }
    void main() {
        int a, b;
        a = 7;
        b = func1(3, 4);
        a += b;
    }
- The code is compiled.
- The generated .exe file is disassembled (MASM32).

Step 1 - Disassembled Code
- Sample code:
    int func1(int a, int b) {
        int temp = a + b;
        return (temp);
    }
    void main() {
        int a, b;
        a = 7;
        b = func1(3, 4);
        a += b;
    }
- Bytes of func1:
    55 89 E5 83 EC 04 8B 45 0C 03 45 08 89 45 FC 8B
    45 FC C9 C3
- Bytes of main:
    55 89 E5 83 EC 18 83 E4 F0 B8 00 00 89 45 F4 8B
    45 F4 E8 A3 04 00 00 E8 0E 01 00 00 C7 45 FC 07
    00 00 00 C7 44 24 04 04 00 00 C7 04 24 03 00 00
    00 E8 B3 FF FF FF 89 45 F8 8B 55 F8 8D 45 FC 01
    10 C9 C3
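Building a flow graph requires resolving call targets. On x86, opcode E8 is a near call whose 4-byte little-endian operand is a signed displacement from the address of the next instruction. The sketch below shows that calculation; the function name call_target and the example address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a 5-byte near call (E8 rel32) located at `call_addr` and return
   the address it transfers control to. */
uint32_t call_target(const unsigned char insn[5], uint32_t call_addr)
{
    uint32_t rel;

    assert(insn[0] == 0xE8);
    rel = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8) |
          ((uint32_t)insn[3] << 16) | ((uint32_t)insn[4] << 24);
    /* Unsigned wraparound gives two's-complement addition of the signed
       displacement to the address of the next instruction. */
    return call_addr + 5u + rel;
}
```

For the bytes E8 B3 FF FF FF in the listing, the displacement is 0xFFFFFFB3, that is -77, so the call transfers control 77 bytes back from the instruction that follows it.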

Step 1 - Intel Opcode Decoded
    55          push  ebp
    89 E5       mov   ebp, esp
    83 EC 04    sub   esp, 0x04
    8B 45 0C    mov   eax, [ebp+0x0c]
    03 45 08    add   eax, [ebp+0x08]
    89 45 FC    mov   [ebp-0x04], eax
    8B 45 FC    mov   eax, [ebp-0x04]
    C9          leave
    C3          ret
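Most of the decoding above comes down to the ModR/M byte that follows the opcode. The sketch below shows how such a byte splits into fields and how an 8-bit displacement is sign-extended; it is a minimal illustration rather than a full decoder, and the names modrm_fields, split_modrm, reg32_name, and disp8 are hypothetical.

```c
#include <assert.h>

typedef struct {
    unsigned mod;  /* addressing mode: 1 means [base + disp8], 3 means register */
    unsigned reg;  /* register operand */
    unsigned rm;   /* base or second register (rm = 4 with mod != 3 selects a SIB byte) */
} modrm_fields;

/* Split a ModR/M byte into mod (2 bits), reg (3 bits), and rm (3 bits). */
modrm_fields split_modrm(unsigned char b)
{
    modrm_fields f;
    f.mod = (unsigned)b >> 6;
    f.reg = ((unsigned)b >> 3) & 7u;
    f.rm  = (unsigned)b & 7u;
    return f;
}

/* Name of a 32-bit general-purpose register from its 3-bit number. */
const char *reg32_name(unsigned n)
{
    static const char *const names[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };
    assert(n < 8);
    return names[n];
}

/* Sign-extend an 8-bit displacement: 0xFC becomes -4. */
int disp8(unsigned char b)
{
    return b < 0x80 ? (int)b : (int)b - 0x100;
}
```

For 8B 45 0C, the byte 45 gives mod = 1, reg = 0 (eax), rm = 5 (ebp), and the displacement 0C is 12, hence mov eax, [ebp+0x0c]. For 89 E5, the byte E5 gives mod = 3, reg = 4 (esp), rm = 5 (ebp), hence mov ebp, esp.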

Step 2 - Modification of Binary
1. Before CALL, push the return address onto another stack.
   - CALL = push (eip)
2. Store the new stack pointer in a general-purpose register whose value is saved in memory.
3. Before RET is executed, use the new stack pointer to obtain the value of the return address.

Step 2 - Example of Modification
- Before the call:
    mov edx, [0x13000000]    ; location of stack pointer
    mov [edx], eip+0x04      ; return address is stored
    sub edx, 0x04            ; decrement stack pointer
    call function1
- Before the return:
    add edx, 0x04            ; increment stack pointer to get value
    mov [esp], [edx]         ; move stored address to current stack
    ret
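The order of operations in the sequence above (store, then decrement; increment, then load) can be checked with a small C model. This is a sketch under assumptions, not generated code from the tool: the return-address area is modeled as an array of words, edx as an index into it that starts at the high end, and the names machine, machine_init, before_call, before_ret, and RS_WORDS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RS_WORDS 16            /* hypothetical size of the return-address area */

typedef struct {
    uint32_t area[RS_WORDS];   /* return-address area, indexed in 4-byte words */
    int      edx;              /* models edx: index of the next free slot */
} machine;

/* Assumption: the area is filled from its high end downward. */
void machine_init(machine *m) { m->edx = RS_WORDS - 1; }

/* Models: mov [edx], <return address> ; sub edx, 0x04 */
void before_call(machine *m, uint32_t return_address)
{
    assert(m->edx >= 0);
    m->area[m->edx] = return_address;
    m->edx -= 1;               /* one word = 4 bytes */
}

/* Models: add edx, 0x04 ; mov [esp], [edx] */
void before_ret(machine *m, uint32_t *esp_slot)
{
    assert(m->edx < RS_WORDS - 1);
    m->edx += 1;
    *esp_slot = m->area[m->edx];
}
```

Calling before_ret overwrites the slot at [esp] with the recorded address, so a corrupted value there is replaced before ret executes.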

Advantages
- Binary code is analyzed and transformed. Our approach can be used on third-party software when one does not have access to the source code.
  - Run-time checks require modification of the source code.
  - Compiler modifications are costly, and changing all available compilers is not possible.
- Return addresses are stored on the stack itself. Hence the overhead incurred when accessing addresses in other areas is reduced.

Disadvantages
- The stack has to store a list of return addresses. This incurs a storage overhead proportional to the depth of the flow graph.
- The code is machine dependent. However, it covers machines from the 80x86 upwards, and a large number of machines fall into this category.