- Slides: 60

Chapter 3 Viruses

Virus Definition
- Recall definition from Chapter 2…
- Self-replicating: yes
- Population growth: positive
- Parasitic: yes
- When executed, tries to replicate itself into other executable code
  - So, it relies in some way on other code
- Does not propagate via a network

Virus
- 3 parts to a virus
- Infection mechanism --- how it spreads
  - Multipartite virus uses multiple means
- Trigger --- decides when/how to deliver payload
- Payload --- what it does other than spread
  - Either intentional or accidental

Virus Pseudocode
- Without infection mechanism…
  - It’s not a virus, it’s a logic bomb
- But trigger and payload are optional
- Generic virus pseudocode

    def virus():
        infect()
        if trigger() is true:
            payload()

Infection Pseudocode
- Targets must be “local”
- Don’t select already infected targets
  - Can be a double-edged sword

    def infect():
        repeat k times:
            target = select_target()
            if no target:
                return
            infect_code(target)

Virus Classification
- Possible to classify in many ways
- Here, we classify in 2 ways:
- Target
  - What/where does the virus infect?
- Concealment strategy
  - What does it do to remain undetected?

Classification by Target
- Briefly consider 3 cases
- Boot-sector infectors
- Executable file infectors
- Data file infectors
  - Macro viruses

Boot Sequence
- Generic boot sequence
1. Power on
2. ROM-based instructions run
  - Self-test, device detection, initialization
  - Boot device identified, boot block read from it
  - Control transferred to the loaded code; this step is known as primary boot

Boot Sequence Continued
3. Code loaded in primary boot step loads larger, fancier program
  - This is secondary boot
4. Secondary boot loads/runs OS kernel

Boot Sector Infector
- Why infect boot sector?
- A boot-sector infector (BSI)
  - Infects by copying itself to boot block
- May copy boot block elsewhere
  - Could be tricky, require lots of code
  - So a fixed “safe” location chosen
  - Different viruses may use same “safe” location (e.g., Stoned and Michelangelo)

Boot Sector Infector
- BSI once popular, not so much now
- Why?
  - Machines don’t reboot so often
  - Much harder to infect, due to better defenses

Multiple Infections

File Infectors
- OS views some files as executable
  - Like “exe” and similar
- Files that can be run by a command-line "shell" also considered executable
  - Batch files, shell scripts, …
- File infector --- infects executable file
  - Exes and shell scripts are both considered executable
  - Binary executable is most common target

File Infectors
- Two main issues…
1. Where to put the virus within file?
2. How to execute the virus when infected file is run?
- Consider these two (interrelated) questions in next few slides

Beginning of File
- Older exe formats (e.g., .COM) treat entire file as chunk of code and data
  - Entire file loaded into memory
  - Execution starts by jumping to the beginning of the loaded file
- Can put virus at start of such a file
  - That is, prepend the virus code

Prepended Virus

End of File
- Append a virus (even easier?)
- Then how does virus get executed?
- Some possibilities…
- Replace first line(s) with a jump to viral code --- save overwritten code
- Later, transfer control back to code
  - How to do this?

End of File
- How to transfer control back to code?
  - Run saved instructions in saved location
  - Restore the infected code back to its original state and run it
- Many exe file formats specify start location in file header
  - If so, virus can change start location to point to its own code and jump to the original start location when done

Appended Virus

Overwritten into File
- Virus places itself atop original code
- Can avoid changes in file size
- Easy for virus to get control
- But… overwriting code will break the original code
  - Making virus easier to discover
- Is it possible to overwrite without breaking the code?

Overwritten into File
- Smart ways to overwrite?
- Overwrite repeated data
  - May be trickier to execute virus
- Save overwritten data (like BSI)
- Use over-allocated space in a file
- Compress code to make space
- For these to work, virus must be small

Merged with File
- Could try to merge virus with target
- I.e., intermixing virus/target code
- Difficult
  - So, it’s “rarely seen”
- But, supposedly, Zmist does this
  - So, apparently it is possible
  - That’s impressive…

Not in File
- Companion virus --- separate from, but naturally executed before target
- No modification to infected code
- May take advantage of process used by OS or shell to search for exe files
- Like a Trojan horse but it’s a virus…
  - …since it’s self-replicating

Companion Virus
- Virus is earlier in the search path
  - Same name as the target file, almost…
- E.g., MS-DOS searches for “foo” by
1. Look for foo.com
2. Look for foo.exe
3. Look for foo.bat
- If the target file is foo.exe, companion virus is in file foo.com
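The lookup order above can be modeled without touching any real files. The sketch below is a toy resolver, assuming only the three-step order listed on the slide; the names `resolve` and `shadowed` are hypothetical helpers, not part of any real shell. It shows why a foo.com silently wins over foo.exe, and how a defender might list such shadowed names.

```python
# Toy model of the MS-DOS lookup order from the slide; no real files are touched.
SEARCH_ORDER = (".com", ".exe", ".bat")  # order taken from the slide


def resolve(command, files):
    """Return the file name that would run for `command`, or None if none matches."""
    present = {name.lower() for name in files}
    for ext in SEARCH_ORDER:
        candidate = command.lower() + ext
        if candidate in present:
            return candidate
    return None


def shadowed(files):
    """List (winner, hidden) pairs where an earlier extension hides a later one."""
    present = sorted({name.lower() for name in files})
    pairs = []
    for name in present:
        stem, dot, ext = name.rpartition(".")
        if not dot or "." + ext not in SEARCH_ORDER:
            continue  # not a file type the toy resolver knows about
        winner = resolve(stem, present)
        if winner is not None and winner != name:
            pairs.append((winner, name))
    return pairs
```

With only foo.exe present, `resolve("foo", ...)` picks foo.exe; once a foo.com appears in the same listing, the same call picks foo.com, and `shadowed` reports the pair.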

Companion Virus
- Windows registry associates file types with applications
- Can modify registry so that companion virus runs instead of exe
  - Then companion can transfer control to the corresponding exe
- In effect, all exes infected at once!

Companion Virus
- ELF file format used on recent Unix systems
- Has "interpreter" specified in each exe file header
  - Points to run-time linker
- Companion virus can replace the run-time linker
  - As above, effect is that all exe files infected at once

Companion Virus
- Companion viruses possible in GUI
- App’s icon can be overwritten with the icon for the companion virus
- When a user clicks on “app” icon…
  - Companion virus runs instead

Macro Virus
- Some apps allow data files to have macros embedded in them
- Macros are short snippets of “code” interpreted by the application
- Such languages often provide enough functionality to write a virus

Macro Virus
- Macros often run automatically when file is loaded
  - Easy to write compared to low-level code
- First proof of concept in 1989
- Hit “mainstream” in 1995
  - Virus known as Concept
  - Targeted Microsoft Word (of course)
  - Installed in “global macros”
  - Infected all edited documents

Macro Virus: Concept
- Targeted Word docs
- AutoOpen macro --- runs automatically when file opened
  - How you get the virus from infected file
- FileSaveAs --- when “file save as” selected from menu
  - So the virus can infect other docs

Macro Virus: Concept

Classification by Concealment Strategy
- Most viruses try to hide
  - Why?
- So, how do they hide?
  - Encryption
  - Polymorphism
  - Etc., etc.
- Yet another way to classify viruses…

No Concealment
- Do nothing to hide
- This is easiest for virus writer…
  - …but also easiest to detect, analyze

Encryption
- Why encrypt?
- Virus body is “hidden” from view
  - In particular, the signature is hidden
- Distinguish between strong encryption and obfuscation
- Viruses usually only obfuscated
  - Very weak encryption
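To see why this counts as obfuscation rather than strong encryption, consider a one-byte XOR over a harmless placeholder string. This is a minimal sketch, not taken from the slides: `SIGNATURE`, `BODY`, and the key `0x5A` are made-up illustration values. The signature disappears from the encoded bytes, yet trying all 256 keys recovers it immediately.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


def recover_key(encoded: bytes, signature: bytes):
    """Try all 256 one-byte keys; return the first that exposes the signature."""
    for key in range(256):
        if signature in xor_bytes(encoded, key):
            return key
    return None


# Harmless placeholder data, assumed for illustration only.
SIGNATURE = b"EXAMPLE-SIGNATURE"
BODY = SIGNATURE + b" harmless placeholder text"
ENCODED = xor_bytes(BODY, 0x5A)

print(SIGNATURE in BODY)      # the plain body contains the signature
print(SIGNATURE in ENCODED)   # the encoded body does not
print(recover_key(ENCODED, SIGNATURE) == 0x5A)  # exhaustive search finds the key
```

The key space here is so small that a scanner can afford to try every key, which is one way to read the slide's "very weak encryption".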

Encrypted Virus

Encryption
- How to encrypt?
  - Let me count the ways…
1. Simple encryption
  - Rotate, increment, negate, etc.
2. Static encryption key
  - E.g., XOR fixed byte to all bytes
3. Variable encryption key
  - Like static, but key changes

Encryption (Continued)
4. Substitution cipher
  - Permute the bytes
  - Could be via lookup table
  - Could even have multiple ciphertexts decrypt to same plaintext
5. Strong encryption
  - DES, AES, RC4, etc.
  - Might use crypto libraries
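The remark that multiple ciphertexts can decrypt to the same plaintext is easiest to see with a small lookup table. The following sketch is one possible construction, assumed for illustration and not described in the slides: each 4-bit half of a plaintext byte is encoded as one of 16 interchangeable code bytes, so the same harmless message has many encodings. The helper names are hypothetical.

```python
import random


def build_tables(seed):
    """Split the 256 byte values into 16 groups of 16 codes, one group per nibble."""
    rng = random.Random(seed)
    codes = list(range(256))
    rng.shuffle(codes)
    decode = [0] * 256                   # code byte -> nibble
    encode = {n: [] for n in range(16)}  # nibble -> list of usable code bytes
    for index, code in enumerate(codes):
        nibble = index % 16
        decode[code] = nibble
        encode[nibble].append(code)
    return encode, decode


def encode_bytes(data, encode, rng):
    """Encode each byte as two code bytes, choosing randomly among equivalents."""
    out = bytearray()
    for b in data:
        out.append(rng.choice(encode[b >> 4]))
        out.append(rng.choice(encode[b & 0x0F]))
    return bytes(out)


def decode_bytes(data, decode):
    """Invert encode_bytes with a plain table lookup."""
    return bytes(
        (decode[data[i]] << 4) | decode[data[i + 1]]
        for i in range(0, len(data), 2)
    )
```

Encoding the same message twice with different random choices gives two different byte strings that both decode to the original, at the cost of doubling the length.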

Stealth
- Tries to hide the infection
  - Not just hide the virus signature
- Examples of stealth techniques
  - Change timestamp and/or other file info to pre-infection values
  - Intercept I/O calls to hide presence (in MS-DOS, via user-accessible interrupts)
  - Hijack secondary boot loader
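From the defender's side, the first technique suggests that timestamps and file sizes alone are weak evidence that a file is unchanged. The sketch below is an illustration under stated assumptions, not something from the slides: it works on a throwaway temporary text file, changes its contents, restores the old modification time, and compares a SHA-256 digest before and after. The function names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def restored_timestamp_demo():
    """Change a scratch file, restore its old mtime, and report what still differs."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "sample.txt")
        with open(path, "wb") as handle:
            handle.write(b"original contents\n")
        before = os.stat(path)
        before_hash = sha256_of(path)

        # Same-length replacement, then put the old access/modify times back.
        with open(path, "wb") as handle:
            handle.write(b"modified contents\n")
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))

        after = os.stat(path)
        return {
            "mtime_same": after.st_mtime_ns == before.st_mtime_ns,
            "size_same": after.st_size == before.st_size,
            "hash_same": sha256_of(path) == before_hash,
        }
```

In this toy case the size and modification time match the original values while the digest does not, which is why integrity checks usually rely on content hashes. A virus that also intercepts I/O calls could still lie to a checker running on the same machine.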

Stealth
- Stealth viruses “overlap” rootkits
- Rootkit --- installed on compromised machine so attacker can use it
  - Stealth is critical to rootkit success
- Some malware use rootkits
  - For example, Ryknos Trojan hid itself using a rootkit designed for DRM

Reverse Stealth Virus
- What is “reverse stealth”?
- Make everything look infected!
- Why is this malicious?
  - Damage may be done by AV software trying to disinfect

Oligomorphism
- Oligomorphic or semi-polymorphic
- Code is encrypted
- Decryptor code is morphed
  - But not too many different decryptors
- For example
  - Whale had 30 different decryptors
  - Memorial had 96 decryptors
- How to detect?

Polymorphism
- Like oligomorphic, but lots more decryptors
- Essentially, an infinite number
- For example
  - Tremor has almost 6 billion decryptors
- So, AV software cannot have a signature for each decryptor
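One plausible reading of the contrast between the last two slides: with 30 or 96 decryptors, a scanner can simply carry one signature per decryptor, while billions of variants make that list impractical. The sketch below shows the per-variant approach on placeholder data; the byte patterns are invented labels, not real signatures, and `placeholder_signatures` and `scan` are hypothetical names.

```python
def placeholder_signatures(count):
    """Build `count` made-up byte patterns standing in for decryptor signatures."""
    return {f"variant-{i:03d}": b"DECRYPTOR-%03d" % i for i in range(count)}


def scan(data, signatures):
    """Return the names of all signatures that occur somewhere in `data`."""
    return [name for name, pattern in signatures.items() if pattern in data]


# 96 entries, matching the decryptor count the slide gives for Memorial.
SIGNATURES = placeholder_signatures(96)
sample = b"xx" + b"DECRYPTOR-017" + b"yy"
print(scan(sample, SIGNATURES))
```

The scan cost and the size of the signature list both grow with the number of variants, which is manageable for an oligomorphic virus and hopeless at the scale quoted for Tremor.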

Polymorphism
- 2 problems for polymorphic writer…
- How to generate decryptors?
  - Use a mutation engine
  - Engine is part of encrypted virus
- How to detect previous infections?
  - Data “hiding”: timestamp, file size, file system features, external storage, …
  - “Inoculate” system by faking infection?

Mutation Engine
1. Equivalent instruction substitution
  - One or more instructions
2. Instruction reordering
3. Register swap
4. Reorder data
5. Spaghetti code
6. Insert junk code
7. Run-time code modification/generation

Mutation Engine
8. Subroutine permutation
9. DIY virtual machine
10. Concurrency --- threads
11. Inlining/outlining
12. “Threaded” code --- not threads
  - Jump directly from one subroutine to another, without returning
13. Subroutine interleaving

Mutation Engine
- Many, many other possibilities
- Possible overlap with optimizing compilers?
  - Seems more like de-optimizing…

Equivalent Instructions
- All of these lines set register r1 to 0

    clear r1
    xor r1, r1
    and 0, r1
    move 0, r1
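The equivalence can be checked mechanically with a tiny interpreter. This is a sketch for a made-up register machine that understands only the four mnemonics on the slide, with the source operand first and the destination last; it is not a model of any real instruction set, and `execute` is a hypothetical name.

```python
def operand_value(operand, regs):
    """Read a register if the operand names one, else treat it as an integer literal."""
    return regs[operand] if operand in regs else int(operand)


def execute(instruction, regs):
    """Run one toy instruction and return the updated register file."""
    op, *args = instruction.replace(",", " ").split()
    out = dict(regs)
    if op == "clear":
        out[args[0]] = 0
    elif op == "move":
        out[args[1]] = operand_value(args[0], regs)
    elif op == "xor":
        out[args[1]] = operand_value(args[0], regs) ^ regs[args[1]]
    elif op == "and":
        out[args[1]] = operand_value(args[0], regs) & regs[args[1]]
    else:
        raise ValueError(f"unknown instruction: {instruction}")
    return out


VARIANTS = ["clear r1", "xor r1, r1", "and 0, r1", "move 0, r1"]
for variant in VARIANTS:
    print(variant, "->", execute(variant, {"r1": 1234})["r1"])
```

Each variant leaves r1 at 0 regardless of its starting value, so the four forms are interchangeable in effect while differing byte for byte.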

Concurrency Example

    r1 = 12
    r2 = 34
    r3 = r1 + r2

  =>

    start thread T
    r1 = 12
    wait for signal
    r3 = r1 + r2
    ...
    T: r2 = 34
       send signal
       exit thread T
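The transformation above can be tried directly with standard threads. In this minimal sketch a dictionary stands in for the registers and a `threading.Event` plays the role of the signal; the function names are hypothetical. Both versions compute the same r3.

```python
import threading


def run_sequential():
    """The original straight-line version."""
    r1 = 12
    r2 = 34
    return r1 + r2


def run_concurrent():
    """The same computation with r2 assigned in a second thread T."""
    regs = {}
    signal = threading.Event()

    def thread_t():
        regs["r2"] = 34
        signal.set()      # send signal

    worker = threading.Thread(target=thread_t)
    worker.start()        # start thread T
    regs["r1"] = 12
    signal.wait()         # wait for signal, so r2 is set before it is read
    regs["r3"] = regs["r1"] + regs["r2"]
    worker.join()
    return regs["r3"]


print(run_sequential(), run_concurrent())
```

The result is unchanged, but anyone reading the code now has to follow two threads and their synchronization to see that r3 is just 12 + 34.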

Concurrency
- Aside: Concurrency may be very effective anti-reversing technique
  - Use multiple threads
  - Intentional deadlock
  - “Junk” threads
- Described in masters project: Improved software activation using multithreading

Mutation
- Mutation also can be used for good
1. Makes reverse engineering attacks more difficult
2. Makes software more “diverse”

Metamorphism
- Apply polymorphism to virus body
  - Aka, “body polymorphic”
- No encryption/decryption needed
- Body must change a lot
  - Goal is to have no common signature
- Mutation code must be mutated too!
  - Otherwise, a signature will exist
  - Different from polymorphic (why?)

Metamorphism
- Two types of metamorphic generators
  - Both types difficult to produce
1. Standalone
  - Apply generator offline
  - Easy to make old malware into “new”
2. Malware “carries its own generator”
  - Necessary if self-propagating
  - A much more difficult problem

Metamorphism: Apparition
- Apparition --- metamorphic virus
- Delivered in source code (Pascal)
- If compiler is present…
  - Insert junk code and compile
- A very lame approach
- Real metamorphism must be done in assembly or (better yet) machine code

Metamorphism: Simile
- Simile --- metamorphic virus
- Simile’s metamorphic generator
  - 12,000 lines of assembly
  - Translate Simile to intermediate form
  - Then remove all old transformations
  - Obtains a base form of virus
  - Apply new set of transformations
  - Generate new (morphed) machine code

Metamorphism: MetaPHOR
- Metamorphic Permutating High-Obfuscating Reassembler
  - That is, MetaPHOR
- Described in How I Made Metaphor and What I’ve Learnt by The Mental Driller
- Complex expander/shrinker strategy
- Almost impossible to analyze

Metamorphism: MWOR
- Metamorphic Worm, i.e., MWOR
- Experimental metamorphic malware designed by former masters student
- Modeled on MetaPHOR, but…
  - Easier to understand
  - Better for experiments and testing
  - A useful research tool
- How to detect?

Metamorphism
- The bottom line…
- Metamorphics difficult to detect
  - Machine learning works well on hacker malware, but can be defeated
- Metamorphics also difficult to write
  - Most “metamorphic” generators aren’t
- Current state of the art?
  - “Undetectable” metamorphic viruses

Strong Encryption
- What is strong encryption?
- Use a real cipher
- For this to be useful, must not store key with code
  - Why not?
- But must decrypt the virus
- How to get the key to the code?

Strong Encryption: Key
- Store key on the web
  - Then must go fetch the key
  - But then how to get the key?
- Binary virus --- 2 parts
  - Low probability that both parts arrive
- “Environmental” key generation
  - Key based on machine-specific info
  - Key derived at runtime
  - Harder to analyze
- Other???
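The "environmental" option is easier to reason about with a concrete derivation in hand. The sketch below is only an illustration of the idea that a key can be computed from attributes rather than stored: it hashes a dictionary of made-up attribute values with SHA-256. The attribute names, the values, and `derive_key` are all hypothetical, and nothing here probes a real machine.

```python
import hashlib


def derive_key(environment: dict) -> bytes:
    """Derive a 32-byte key from attribute values; the key itself is never stored."""
    material = "\n".join(f"{name}={environment[name]}" for name in sorted(environment))
    return hashlib.sha256(material.encode("utf-8")).digest()


# Made-up attribute values, for illustration only.
intended = {"hostname": "lab-machine-01", "locale": "en_US"}
elsewhere = {"hostname": "analysis-sandbox", "locale": "en_US"}

print(derive_key(intended) == derive_key(dict(intended)))  # same environment, same key
print(derive_key(intended) == derive_key(elsewhere))       # different environment, different key
```

This is why the slide calls the approach harder to analyze: an analyst holding only the encrypted body, on a machine with different attributes, derives a different key and cannot decrypt it.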

Virus Kits
- Many malware construction kits
  - See VX Heavens
- Many kits claim to be metamorphic
  - Or polymorphic, or encrypted, or …
  - You should be very skeptical of claims
  - Some have nice GUI interface
- Success is failure?
  - The more successful, the more likely it has been studied and can be detected