Chapter 3 Viruses

Virus Definition
• Recall the definition from Chapter 2…
• Self-replicating: yes
• Population growth: positive
• Parasitic: yes
• When executed, tries to replicate itself into other executable code
  o So, it relies in some way on other code
• Does not propagate via a network

Virus
• 3 parts to a virus
• Infection mechanism: how it spreads
  o A multipartite virus uses multiple means
• Trigger: decides when/how to deliver the payload
• Payload: what it does other than spread
  o Either intentional or accidental

Virus Pseudocode
• Without an infection mechanism…
  o It’s not a virus, it’s a logic bomb
• But trigger and payload are optional
• Generic virus pseudocode

    def virus():
        infect()
        if trigger() is true:
            payload()

Infection Pseudocode
• Targets must be “local”
• Don’t select already infected targets
  o Can be a double-edged sword

    def infect():
        repeat k times:
            target = select_target()
            if no target:
                return
            infect_code(target)

Virus Classification
• Possible to classify in many ways
• Here, we classify in 2 ways:
• Target
  o What/where does the virus infect?
• Concealment strategy
  o What does it do to remain undetected?

Classification by Target
• Briefly consider 3 cases
• Boot-sector infectors
• Executable file infectors
• Data file infectors
  o Macro viruses

Boot Sequence
• Generic boot sequence
1. Power on
2. ROM-based instructions run
  o Self-test, device detection, initialization
  o Boot device identified, boot block read from it
  o Control transferred to the loaded code; this step is known as primary boot

Boot Sequence Continued
3. Code loaded in the primary boot step loads a larger, fancier program
  o This is secondary boot
4. Secondary boot loads and runs the OS kernel

Boot Sector Infector
• Why infect the boot sector?
• A boot-sector infector (BSI)
  o Infects by copying itself to the boot block
• May copy the original boot block elsewhere
  o Could be tricky, require lots of code
  o So a fixed “safe” location is chosen
  o Different viruses may use the same “safe” location (e.g., Stoned and Michelangelo)

Boot Sector Infector
• BSIs were once popular, not so much now
• Why?
  o Machines don’t reboot so often
  o Much harder to infect, due to better defenses

Multiple Infections

File Infectors
• OS views some files as executable
  o Like “exe” and similar
• Files that can be run by a command-line "shell" are also considered executable
  o Batch files, shell scripts, …
• File infector: infects an executable file
  o Exes and shell scripts are both considered executable
  o Binary executable is the most common target

File Infectors
• Two main issues…
1. Where to put the virus within the file?
2. How to execute the virus when the infected file is run?
• Consider these two (interrelated) questions in the next few slides

Beginning of File q Older exe formats (e. g. , . COM) treat entire

Beginning of File q Older exe formats (e. g. , . COM) treat entire file as chunk of code and data o Entire file loaded into memory o Execution starts by jumping to the beginning of the loaded file q Can put virus at start of such a file o That is, prepend the virus code

Prepended Virus

End of File
• Append a virus (even easier?)
• Then how does the virus get executed?
• Some possibilities…
• Replace the first line(s) with a jump to the viral code, saving the overwritten code
• Later, transfer control back to the code
  o How to do this?

End of File
• How to transfer control back to the code?
  o Run the saved instructions in the saved location
  o Restore the infected code to its original state and run it
• Many exe file formats specify the start location in the file header
  o If so, the virus can change the start location to point to its own code and jump to the original start location when done

Appended Virus

Overwritten into File
• Virus places itself atop the original code
• Can avoid changes in file size
• Easy for the virus to get control
• But… overwriting code will break the original code
  o Making the virus easier to discover
• Is it possible to overwrite without breaking the code?

Overwritten into File
• Smart ways to overwrite?
• Overwrite repeated data
  o May be trickier to execute the virus
• Save overwritten data (like a BSI)
• Use over-allocated space in a file
• Compress code to make space
• For these to work, the virus must be small

Merged with File
• Could try to merge the virus with the target
• I.e., intermixing virus/target code
• Difficult
  o So, it’s “rarely seen”
• But, supposedly, Zmist does this
  o So, apparently it is possible
  o That’s impressive…

Not in File
• Companion virus: separate from, but naturally executed before, the target
• No modification to the infected code
• May take advantage of the process used by the OS or shell to search for exe files
• Like a Trojan horse, but it’s a virus…
  o …since it’s self-replicating

Companion Virus
• Virus is earlier in the search path
  o Same name as the target file, almost…
• E.g., MS-DOS searches for “foo” by
1. Look for foo.com
2. Look for foo.exe
3. Look for foo.bat
• If the target file is foo.exe, the companion virus is in file foo.com
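
The lookup order above can be modeled in a few lines. This is only a sketch of the search rule as the slide states it, not real MS-DOS behavior; the names `SEARCH_ORDER`, `resolve`, and `shadowed` are hypothetical. The `shadowed` helper shows the defender's side: it flags files that would never run because an earlier extension wins.

```python
# Hypothetical model of the name lookup described on the slide.
SEARCH_ORDER = (".com", ".exe", ".bat")


def resolve(command, directory_listing):
    """Return the file name that would run for `command`, or None."""
    names = {name.lower() for name in directory_listing}
    for ext in SEARCH_ORDER:
        candidate = command.lower() + ext
        if candidate in names:
            return candidate
    return None


def shadowed(directory_listing):
    """Return (winner, hidden_file) pairs where one file masks another."""
    names = {name.lower() for name in directory_listing}
    pairs = []
    for name in sorted(names):
        base, dot, ext = name.rpartition(".")
        if not dot or "." + ext not in SEARCH_ORDER:
            continue  # not a file type that takes part in the lookup
        winner = resolve(base, names)
        if winner is not None and winner != name:
            pairs.append((winner, name))
    return pairs


listing = ["foo.exe", "foo.com", "bar.bat"]
print(resolve("foo", listing))   # the .com file is found first
print(shadowed(listing))         # foo.exe is masked by foo.com
```

A directory where `shadowed` returns a non-empty list is not necessarily infected, but an unexpected .com file next to a known .exe is worth a closer look.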

Companion Virus
• Windows registry associates file types with applications
• Can modify the registry so that the companion virus runs instead of the exe
  o Then the companion can transfer control to the corresponding exe
• In effect, all exes are infected at once!

Companion Virus
• ELF file format is used on recent Unix systems
• Has an "interpreter" specified in each exe file header
  o Points to the run-time linker
• A companion virus can replace the run-time linker
  o As above, the effect is that all exe files are infected at once

Companion Virus
• Companion viruses are possible in a GUI
• App’s icon can be overwritten with the icon for the companion virus
• When a user clicks on the “app” icon…
  o Companion virus runs instead

Macro Virus
• Some apps allow data files to have macros embedded in them
• Macros are short snippets of “code” interpreted by the application
• Such languages often provide enough functionality to write a virus

Macro Virus
• Macros often run automatically when a file is loaded
  o Easy to write compared to low-level code
• First proof of concept in 1989
• Hit “mainstream” in 1995
  o Virus known as Concept
  o Targeted Microsoft Word (of course)
  o Installed in “global macros”
  o Infected all edited documents

Macro Virus: Concept
• Targeted Word docs
• AutoOpen macro: runs automatically when a file is opened
  o How you get the virus from an infected file
• FileSaveAs: runs when “file save as” is selected from the menu
  o So the virus can infect other docs

Macro Virus: Concept

Classification by Concealment Strategy
• Most viruses try to hide
  o Why?
• So, how do they hide?
  o Encryption
  o Polymorphism
  o Etc., etc.
• Yet another way to classify viruses…

No Concealment
• Do nothing to hide
• This is easiest for the virus writer…
  o …but also easiest to detect and analyze

Encryption
• Why encrypt?
• Virus body is “hidden” from view
  o In particular, the signature is hidden
• Distinguish between strong encryption and obfuscation
• Viruses are usually only obfuscated
  o Very weak encryption

Encrypted Virus

Encryption
• How to encrypt?
  o Let me count the ways…
1. Simple encryption
  o Rotate, increment, negate, etc.
2. Static encryption key
  o E.g., XOR a fixed byte with all bytes
3. Variable encryption key
  o Like static, but the key changes

Encryption (Continued)
4. Substitution cipher
  o Permute the bytes
  o Could be via lookup table
  o Could even have multiple ciphertexts decrypt to the same plaintext
5. Strong encryption
  o DES, AES, RC4, etc.
  o Might use crypto libraries
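
To see why a static single-byte XOR key counts as obfuscation rather than encryption, consider the sketch below. It works on a harmless marker string, and the function names and the key value 0x5A are illustrative assumptions. XOR hides the marker from a naive byte search, but a scanner only has to try 256 keys to find it again.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same one-byte key (its own inverse)."""
    return bytes(b ^ key for b in data)


def keys_revealing(blob: bytes, signature: bytes) -> list:
    """Return every one-byte key under which `signature` shows up in `blob`."""
    return [k for k in range(256) if signature in xor_bytes(blob, k)]


SIGNATURE = b"HARMLESS-MARKER"                 # stand-in for a scan signature
plain = b"header|" + SIGNATURE + b"|trailer"
hidden = xor_bytes(plain, 0x5A)                # static key, chosen arbitrarily

print(SIGNATURE in hidden)                     # naive search no longer matches
print(xor_bytes(hidden, 0x5A) == plain)        # applying the key again decodes
print(keys_revealing(hidden, SIGNATURE))       # exhaustive search recovers the key
```

The same exhaustive approach does not scale to the strong ciphers in item 5, which is why the key-handling questions on the later Strong Encryption slides matter.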

Stealth
• Tries to hide the infection
  o Not just hide the virus signature
• Examples of stealth techniques
  o Change timestamp and/or other file info to pre-infection values
  o Intercept I/O calls to hide presence (in MS-DOS, user-accessible interrupts)
  o Hijack the secondary boot loader
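
The first technique above, restoring file metadata, defeats checks that look only at size and timestamp. The sketch below is an illustration from the defender's side, assuming a baseline was recorded while the file was known to be clean; `fingerprint`, `content_changed`, and `demo` are hypothetical names. It shows that a content hash still notices a same-length modification after the timestamp has been put back.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return (size, mtime_ns, sha256 hex digest) for a file."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return st.st_size, st.st_mtime_ns, digest


def content_changed(path, baseline):
    """Compare only the content hash, ignoring size and timestamp."""
    return fingerprint(path)[2] != baseline[2]


def demo():
    """Modify a scratch file, restore its mtime, and report what each check sees."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"original contents")
        path = f.name
    try:
        baseline = fingerprint(path)
        atime_ns = os.stat(path).st_atime_ns
        with open(path, "wb") as f:
            f.write(b"modified contents")            # same length as before
        os.utime(path, ns=(atime_ns, baseline[1]))    # put the old mtime back
        after = fingerprint(path)
        same_size = after[0] == baseline[0]
        same_mtime = after[1] == baseline[1]
        return same_size, same_mtime, content_changed(path, baseline)
    finally:
        os.remove(path)


print(demo())
```

A hash check run on the infected system is only as trustworthy as the I/O calls it relies on, so the second technique on the slide (intercepting I/O) can still defeat it.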

Stealth
• Stealth viruses “overlap” rootkits
• Rootkit: installed on a compromised machine so the attacker can use it
  o Stealth is critical to rootkit success
• Some malware uses rootkits
  o For example, the Ryknos Trojan hid itself using a rootkit designed for DRM

Reverse Stealth Virus
• What is “reverse stealth”?
• Make everything look infected!
• Why is this malicious?
  o Damage may be done by AV software trying to disinfect

Oligomorphism
• Oligomorphic or semi-polymorphic
• Code is encrypted
• Decryptor code is morphed
  o But not too many different decryptors
• For example
  o Whale had 30 different decryptors
  o Memorial had 96 decryptors
• How to detect?
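
One plausible answer to “How to detect?” is to keep one signature per decryptor, which is feasible when there are only tens of them. The sketch below is illustrative only: the byte strings are made-up placeholders, not real signatures, and `scan` is a hypothetical helper.

```python
def scan(blob: bytes, signatures: dict) -> list:
    """Return the names of all signatures found in `blob`, sorted."""
    return sorted(name for name, sig in signatures.items() if sig in blob)


# Placeholder patterns standing in for a small, fixed set of decryptors.
DECRYPTOR_SIGNATURES = {
    "decryptor-01": b"\x10\x20\x30\x40",
    "decryptor-02": b"\x11\x21\x31\x41",
    "decryptor-03": b"\x12\x22\x32\x42",
}

sample = b"\x00\x00" + b"\x11\x21\x31\x41" + b"\xff"
print(scan(sample, DECRYPTOR_SIGNATURES))
print(scan(b"clean data", DECRYPTOR_SIGNATURES))
```

The signature table grows linearly with the number of decryptors, so this approach only works while that number stays small.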

Polymorphism
• Like oligomorphic, but lots more decryptors
• Essentially, an infinite number
• For example
  o Tremor has almost 6 billion decryptors
• So, AV software cannot have a signature for each decryptor

Polymorphism
• 2 problems for the polymorphic writer…
• How to generate decryptors?
  o Use a mutation engine
  o Engine is part of the encrypted virus
• How to detect previous infections?
  o Data “hiding”: timestamp, file size, file system features, external storage, …
  o “Inoculate” the system by faking infection?

Mutation Engine
1. Equivalent instruction substitution
  o One or more instructions
2. Instruction reordering
3. Register swap
4. Reorder data
5. Spaghetti code
6. Insert junk code
7. Run-time code modification/generation

Mutation Engine
8. Subroutine permutation
9. DIY virtual machine
10. Concurrency: threads
11. Inlining/outlining
12. “Threaded” code (not threads)
  o Jump directly from one subroutine to another, without returning
13. Subroutine interleaving

Mutation Engine
• Many, many other possibilities
• Possible overlap with optimizing compilers?
  o Seems more like de-optimizing…

Equivalent Instructions
• All of these lines set register r1 to 0

    clear r1
    xor r1, r1
    and 0, r1
    move 0, r1
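
The equivalence can be checked with a toy interpreter. This is a sketch for a made-up register machine, assuming the operand order `source, destination` that the slide's `and 0, r1` and `move 0, r1` suggest; `run` is a hypothetical helper and does not model any real instruction set.

```python
def run(instruction, registers):
    """Execute one toy instruction and return the new register state."""
    regs = dict(registers)
    op, _, operands = instruction.partition(" ")
    args = [a.strip() for a in operands.split(",")]
    if op == "clear":                 # clear dst
        regs[args[0]] = 0
    elif op == "xor":                 # xor src, dst  ->  dst = dst ^ src
        src, dst = args
        regs[dst] = regs[dst] ^ regs[src]
    elif op == "and":                 # and imm, dst  ->  dst = dst & imm
        imm, dst = args
        regs[dst] = regs[dst] & int(imm)
    elif op == "move":                # move imm, dst ->  dst = imm
        imm, dst = args
        regs[dst] = int(imm)
    else:
        raise ValueError("unknown instruction: " + instruction)
    return regs


VARIANTS = ["clear r1", "xor r1, r1", "and 0, r1", "move 0, r1"]
results = [run(v, {"r1": 0xBEEF})["r1"] for v in VARIANTS]
print(results)
```

All four byte sequences differ, which is what breaks a fixed signature, yet the register state afterwards is identical.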

Concurrency Example r 1 = 12 r 2 = 34 r 3 = rl

Concurrency Example r 1 = 12 r 2 = 34 r 3 = rl + r 2 => start thread T r 1 = 12 wait for signal r 3 = r 1 + r 2. . . T: r 2 = 34 send signal exit thread T
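
The transformation above can be written out with Python's `threading` module. This is a minimal sketch: the function names are hypothetical, and a dictionary stands in for the registers. The `Event` plays the role of the signal, so the main thread cannot compute r3 before thread T has set r2.

```python
import threading


def sequential():
    """Original straight-line version."""
    r1 = 12
    r2 = 34
    r3 = r1 + r2
    return r3


def threaded():
    """Same computation, with the r2 assignment moved into thread T."""
    regs = {}
    signal = threading.Event()

    def thread_t():
        regs["r2"] = 34
        signal.set()                  # send signal, then the thread exits

    t = threading.Thread(target=thread_t)
    t.start()                         # start thread T
    regs["r1"] = 12
    signal.wait()                     # wait for signal
    regs["r3"] = regs["r1"] + regs["r2"]
    t.join()
    return regs["r3"]


print(sequential(), threaded())
```

Both versions produce the same value, but the second one spreads the data flow across two threads, which is what makes it harder to follow in a disassembler or debugger.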

Concurrency
• Aside: concurrency may be a very effective anti-reversing technique
  o Use multiple threads
  o Intentional deadlock
  o “Junk” threads
• Described in a master’s project: Improved software activation using multithreading

Mutation
• Mutation also can be used for good
1. Makes reverse engineering attacks more difficult
2. Makes software more “diverse”

Metamorphism
• Apply polymorphism to the virus body
  o Aka “body polymorphic”
• No encryption/decryption needed
• Body must change a lot
  o Goal is to have no common signature
• Mutation code must be mutated too!
  o Otherwise, a signature will exist
  o Different from polymorphic (why?)

Metamorphism
• Two types of metamorphic generators
  o Both types difficult to produce
1. Standalone
  o Apply generator offline
  o Easy to make old malware into “new”
2. Malware “carries its own generator”
  o Necessary if self-propagating
  o A much more difficult problem

Metamorphism: Apparition
• Apparition: a metamorphic virus
• Delivered in source code (Pascal)
• If a compiler is present…
  o Insert junk code and compile
• A very lame approach
• Real metamorphism must be done in assembly or (better yet) machine code

Metamorphism: Simile
• Simile: a metamorphic virus
• Simile’s metamorphic generator
  o 12,000 lines of assembly
  o Translate Simile to intermediate form
  o Then remove all old transformations
  o Obtains a base form of the virus
  o Apply new set of transformations
  o Generate new (morphed) machine code

Metamorphism: Meta. PHOR q Metamorphic Permutating High. Obfuscating Reassembler o That is, Meta. PHOR

Metamorphism: Meta. PHOR q Metamorphic Permutating High. Obfuscating Reassembler o That is, Meta. PHOR q Described in How I Made Metaphor and What I’ve Learnt by The Mental Driller q Complex expander/shrinker strategy q Almost impossible to analyze

Metamorphism: MWOR q Metamorphic Worm, i. e. , MWOR q Experimental metamorphic malware designed

Metamorphism: MWOR q Metamorphic Worm, i. e. , MWOR q Experimental metamorphic malware designed by former masters student q Modeled on Meta. PHOR, but… o Easier to understand o Better for experiments and testing o A useful research tool q How to detect?

Metamorphism
• The bottom line…
• Metamorphics are difficult to detect
  o Machine learning works well on hacker malware, but can be defeated
• Metamorphics are also difficult to write
  o Most “metamorphic” generators aren’t
• Current state of the art?
  o “Undetectable” metamorphic viruses

Strong Encryption
• What is strong encryption?
• Use a real cipher
• For this to be useful, must not store the key with the code
  o Why not?
• But must decrypt the virus
• How to get the key to the code?

Strong Encryption: Key
• Store the key on the web
  o Then must go fetch the key
  o But then how to get the key?
• Binary virus: 2 parts
  o Low probability that both parts arrive
• “Environmental” key generation
  o Key based on machine-specific info
  o Key derived at runtime
  o Harder to analyze
• Other???
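
The idea behind “environmental” key generation can be illustrated with a plain hash. This sketch shows only the derivation step, and the choice of SHA-256 and of hostname and user name as inputs are assumptions made for illustration; `derive_key` is a hypothetical name. The point for an analyst is that the key never appears in the sample itself, so it cannot be recovered without knowing the intended environment.

```python
import hashlib


def derive_key(*environment_values: str) -> bytes:
    """Hash a sequence of environment strings into a 32-byte key."""
    h = hashlib.sha256()
    for value in environment_values:
        h.update(value.encode("utf-8"))
        h.update(b"\x00")             # separator so ("ab", "c") != ("a", "bc")
    return h.digest()


key_on_target = derive_key("host-a", "alice")
key_in_lab = derive_key("analysis-vm", "analyst")

print(len(key_on_target))             # SHA-256 digest length in bytes
print(key_on_target == key_in_lab)    # a different environment gives a different key
```

Anything encrypted under `key_on_target` stays opaque in the lab environment, which is why the slide lists this approach as harder to analyze.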

Virus Kits
• Many malware construction kits
  o See VX Heavens
• Many kits claim to be metamorphic
  o Or polymorphic, or encrypted, or …
  o You should be very skeptical of such claims
  o Some have a nice GUI
• Success is failure?
  o The more successful a kit is, the more likely it has been studied and can be detected