Malware Attribution Tracking Cyber Spies & Digital Criminals Greg Hoglund HBGary, Inc Blackhat Vegas 2010
The Bad Guys are Winning • Cybercrime & espionage is the dominant criminal problem globally, surpassing the drug trade – Russians made more money last year in banking fraud than the Colombians made selling cocaine – Chinese are crawling all over commercial & government networks • The largest computing cloud in the world is controlled by Conficker – 6.4 million computer systems* – 230 countries – 230 top level domains globally – 18 million+ CPUs – 28 terabits per second of bandwidth *http://www.readwriteweb.com/cloud/2010/04/the-largest-cloud-in-the-world.php
Humans • Attribution is about the human behind the malware, not the specific malware variants • Focus must be on human-influenced factors Move this way Binary Human We must move our aperture of visibility towards the human behind the malware
$500+ $10,000+ for 0-day $1,000+ Implant Vendor $10,000+ for 0-day Exploit Developer Exploit Pack Vendor Rootkit Developer $1000+ Rogueware Developer eGold Wizard Country that doesn’t co-op w/ LE Keep 10% A single operator here may recruit 100’s of mules per week Forger $50 Payment system developer atm Small Transfers Secondary Bot Vendor ~4% of bank customers Victims $5,000 incrm. Keep 50% Drop Man Cashier / Mule Bank Broker Keep 10% Back Office Developer Account Buyer Country where account is physically located $100.00 per 1000 infections Affiliate Botmaster ID Thief PPI Sells accounts in bulk $5.00 per Endpoint Exploiters
Installs Marketplace
Intelligence Spectrum Blacklists Net Recon C2 Developer Fingerprints TTP Nearly Useless MD5 Checksum of a single malware sample Social Cyberspace DIGINT Physical Surveillance HUMINT Nearly Impossible Sweet Spot IDS signatures with long-term viability Predict the attacker’s next moves SSN & Missile Coordinates of the Attacker
Archaeology layer Net Recon C2 Developer Fingerprints TTP Actions / Intent (attacker’s behavior, as opposed to code) Installation + Deployment method Command + Control (primary outer loops) CNA (spreader) CNE (search and exfil tools) COMS (code level view, as opposed to network sniff) Defensive / Antiforensics (usually a packer, easily changed) Exploit weaponization / delivery vehicle Shellcode DNS, C2 Protocol, Encryption Method (high rate of change)
Intel Value Window Lifetime Minutes Hours Blacklists Days Weeks Months Years ATTRIBUTION-Derived Developer Toolmarks Signatures Algorithms NIDS sans address Protocol DNS name IP Address Checksums Hooks Install
Rule #1 • The human is lazy – They use kits and systems to change checksums, hide from A/V, and get around IDS – They DON’T rewrite their code every morning
Rule #2 • Most attackers are focused on rapid reaction to network-level filtering and black-holes – Multiple DynDNS C2 servers, multiple C2 protocols, obfuscation of network traffic • They are not-so-focused on host level stealth – Most malware is simple in nature, and works great – Enterprises rely on A/V for host, and A/V doesn’t work, and the attackers know this
Rule #3 • Physical memory is King – Once executing in memory, code has to be revealed, data has to be decrypted
DISK FILE IN MEMORY IMAGE 100% dynamic Copied in full OS Loader Copied in part In memory, traditional checksums don’t work MD5 Checksum reliable MD5 Checksum is not consistent Software Traits remain consistent
IN MEMORY IMAGE Packer #1 Packer #2 OS Loader Decrypted Original Starting Malware Packed Malware Software Traits remain consistent Physical memory tends to get around the ‘packing’ problem
DISK FILE IN MEMORY IMAGE OS Loader Same malware compiled in three different ways MD5 Checksums all different Software Traits remain consistent
Attribution is Not Hard • If you can read a packet sniffer, you can attribute malware – Yes, this means more people in your organization can do this – Focus on strings and human-readable data within a malware program – In most cases, code-level reverse engineering is not required
The Flow of Forensic Toolmarks Machine Developer Core ‘Backbone’ Sourcecode Sample Tweaks & Mods Compiler 3rd party Sourcecode 3rd party libraries Time Paths Runtime Libraries MAC address Malware Packing
Developer Fingerprints Communications Functions Developer Installation & Deployment Method Sample Command & Control Functions Malware Compiler Environment Packing Stealth & Antiforensic Techniques
Toolkit Fingerprints Machine PPI Affiliate Packed Malware Toolkit
OS Loader IN MEMORY IMAGE Malware Toolkit Toolkits can be detected Different Malware Authors Using Same Toolkit traits are apparent Packed
Paths [The Flow of Forensic Toolmarks diagram, repeated]
Example: Gh0stNet
GhostNet
GhostNet: Dropper UPX! ¶üÿÿU‹ìƒìSVW3ÿÿ Packer Signature MZ\x90 This prog.Ry.y cannot be run in DOS mode Embedded executable NOTE: Packing is not fully effective here
GhostNet: Dropper UPX! ¶üÿÿU‹ìƒìSVW3ÿÿ Resource Culture Code 0x0804 MZ\x90 This prog.Ry.y cannot be run in DOS mode The embedded executable is tagged with Chinese PRC Culture code
GhostNet: Dropper UPX! ¶üÿÿU‹ìƒìSVW3ÿÿ 0x0804 The embedded executable is extracted to disk. The extracted module is not packed. PDB path reveals malware name, E: drive. MZ\x90 This prog.Ry.y cannot be run in DOS mode MZ\x90 This program cannot be run in DOS mode E:\gh0st\Server\Release\install.pdb Embedded PDB Path
For Immediate Defense… Useless MD5 of the Gh0stNet dropper.EXE Human Query: “Find Attacker’s PDB Path” PDB Path found within extracted EXE RawVolume.File.BinaryData contains “gh0st”
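The PDB-path query above can be approximated outside any particular product. The following is a minimal sketch, assuming the sample is available as raw bytes; the function names `find_pdb_paths` and `mentions` are hypothetical helpers, not part of any HBGary tool, and the regex only covers drive-letter paths stored as ASCII.

```python
import re

# Assumption: PDB paths of interest look like "X:\...\name.pdb" in printable ASCII.
PDB_PATH = re.compile(rb"[A-Za-z]:\\[\x20-\x7e]{1,255}?\.pdb", re.IGNORECASE)


def find_pdb_paths(data: bytes) -> list[str]:
    """Return every drive-letter PDB path embedded in the raw bytes."""
    return [m.group(0).decode("ascii") for m in PDB_PATH.finditer(data)]


def mentions(data: bytes, needle: bytes) -> bool:
    """Case-insensitive 'binary data contains' check, e.g. for b'gh0st'."""
    return needle.lower() in data.lower()
```

A hit on the path is useful for attribution; the `mentions` check is the cheaper filter to run across a whole volume first.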
Link Analysis “gh0st” The web reveals Chinese hacker sites that reference the “gh0st” artifact
GhostNet: Backdoor UPX! MZ\x90 The dropped EXE is loaded as svchost.exe on the victim. It then drops another executable, a device driver. MZ\x90 This program cannot be run in DOS mode E:\gh0st\Server\Release\install.pdb MZ\x90 Another embedded EXE MZ\x90 e:\gh0st\server\sys\i386\RESSDT.pdb Another PDB path
Our defense… Query: “Find Attacker’s PDB Path” RawVolume.File.BinaryData contains “gh0st” Even if we had not known about the second executable, our defense would have worked. This is how moving towards the human offers predictive capability.
What do we know… i386 directory is common to device drivers. Other clues: 1. sys directory 2. ‘SSDT’ in the name SSDT means System Service Descriptor Table – this is a common place for rootkits and HIPS products to place hooks. Also, embedded strings in the binary are known driver calls: 1. IoXXXX family 2. KeServiceDescriptorTable 3. ProbeForXXXX KeServiceDescriptorTable is used when SSDT hooks are placed. We know this is a hooker.
What do we know… IofCompleteRequest, IoCreateDevice, IoCreateSymbolicLink, and friends are used when the driver communicates to usermode. This means there is a usermode module (a process EXE or DLL) that is used in conjunction with the device driver. When communication takes place between usermode & kernelmode, there will be a device path.
For Immediate Defense… Device Path of the kernel mode driver and the Symbolic Link name MD5 of the Gh0stNet dropper.EXE Useless Human Query: “Find Rootkit Device Path or Symlink” Physmem.WindowsObject.Name contains “RESSDT”
Link Analysis “RESSDT” A readme file on Kaspersky’s site references a Ressdt rootkit.
TMC Rootkit e:\gh0st\server\sys\i386\RESSDT.pdb Dropper e:\job\gh0st\Release\Loader.pdb .?AVCgh0stDoc@@ .?AVCgh0stApp@@ GUI (MFC) .?AVCgh0stView@@ Cgh0stView Doc/View is Cgh0stDoc usually MFC e:\job\gh0st\Release\gh0st.pdb C:\gh0st3.6_src\HACKER\i386\HACKE.pdb \gh0st3.6_src\Server\sys\i386\CHENQI.pdb Already at version 3.6 Rootkits
gh0st_RAT, source code, team, and forum www.wolfexp.net
Case Study: Chinese APT 2004 2005 2007 2009 2010 SvcHostDLL.log & “bind cmd frist!” SvcHostDLL.log Just “bind cmd frist!”
Timestamps [The Flow of Forensic Toolmarks diagram, repeated]
PE Timestamps PE file Module timestamp* time_t (32 bit) The ‘lmv’ command in WinDBG will show this value. e_lfanew Image File Header Optional Header Debug timestamp time_t (32 bit) This is present if an external PDB file is associated with the EXE *This is not the same as NTFS file times, which are 64 bit and stored in the NTFS file structures. Image Data Directories IMAGE DEBUG DIRECTORY
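The module timestamp described above sits at a fixed place in the PE layout: `e_lfanew` at offset 0x3C of the DOS header points to the `PE\0\0` signature, and `TimeDateStamp` is the third field of the Image File Header that follows it. A minimal sketch of reading it, assuming the whole file is in memory; `pe_compile_time` is a hypothetical helper name.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Read the 32-bit time_t module timestamp from a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE signature, stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Image File Header: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

The value is written by the linker and can be forged, so it is best treated as one toolmark among several rather than proof of a build date.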
Timestamp Formats • time_t – 32 bit, seconds since Jan. 1 1970 UTC – 0x3DE03E0A usually start with ‘3’ or ‘4’ • ‘3’ started in 1995 and ‘4’ ends in 2012 – Use ‘ctime’ function to convert • FILETIME – 64 bit, 100-nanosecond intervals since Jan. 1 1601 UTC – 0x01C195C2.5100E190 usually start with ‘01’ and a letter • 01A began in 1972 and 01F ends in 2057 – Use FileTimeToSystemTime(), GetDateFormat(), and GetTimeFormat() to convert
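Both formats can be converted without the Windows API. The sketch below assumes the values have already been read as integers; `from_time_t` and `from_filetime` are hypothetical helper names. FILETIME counts 100-nanosecond ticks from January 1, 1601 UTC, so the conversion adds the tick count to that epoch.

```python
from datetime import datetime, timedelta, timezone

# FILETIME epoch: January 1, 1601 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def from_time_t(value: int) -> datetime:
    """Convert a 32-bit time_t (seconds since 1970) to a UTC datetime."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def from_filetime(value: int) -> datetime:
    """Convert a 64-bit FILETIME (100 ns ticks since 1601) to a UTC datetime."""
    # Ten ticks per microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)
```

For example, `from_time_t(0x3DE03E0A)` and `from_filetime(0x01C195C25100E190)` both land in 2002, which matches the ‘3’ and ‘01C’ prefixes described on the slide.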
Case Study: Chinese APT 2004 2005 2007 2009 2010 XX/XX/2005 – XX:XX PM 12/XX/2007 – X:XX AM 12/XX/2007 – X:XX PM 11/XX/2009 – 9:XX AM 12/XX/2009 – 11:XX PM Compile times extracted from ‘soysauce’ backdoor program. 2/XX/2010 – XX:XX AM 3/XX/2010 – XX:XX PM
For Immediate Defense… Compile time Useless Human Query: “Find Modules Created Within Attack Window” RawVolume.File.CompileTime > 3/1/2010 < 3/31/2010
MAC Address [The Flow of Forensic Toolmarks diagram, repeated]
GUID V1 • The OSF specified algorithm for GUID V1 uses the MAC address of the network card for the last 48 bits of the 128 bit GUID – This was deprecated on Windows 2000 and greater, so this has limited value {21EC2020-3AEA-1069-A2DD-08002B30309D} V1 GUIDs have a 1 in this position (the first digit of the third group) This is the MAC of the machine (the last group) This technique was used to track the author of the Melissa virus
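Python's standard `uuid` module exposes both fields the slide points at, so the check is short. A minimal sketch; `guid_v1_mac` is a hypothetical helper name, and it returns `None` for any GUID that is not version 1.

```python
import uuid


def guid_v1_mac(guid: str):
    """Return the node (MAC) field of a version-1 GUID, or None for other versions."""
    u = uuid.UUID(guid)  # accepts the {braced} form as well
    if u.version != 1:
        return None
    # The node is the last 48 bits; format it as six colon-separated octets.
    return ":".join(f"{(u.node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))
```

Note that a version-1 generator without access to a network card may substitute a random node value, so a recovered "MAC" still needs corroboration.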
Compiler Version [The Flow of Forensic Toolmarks diagram, repeated]
Visual Studio • Static or dynamic linked runtime library? • Single-threaded or multi-threaded? • Use of STL? • Use of older iostream libraries?* *See: support.microsoft.com/kb/154753
Visual Studio – Static Linking
Version                      Libraries linked with        Type                     Compiler flag
VC++ .NET 2003 and earlier   LIBC.LIB, LIBCP.LIB          Single Threaded Static   /ML
VC++ .NET 2003 and earlier   LIBCD.LIB, LIBCPD.LIB        Single Threaded Static   /MLd
All                          LIBCMT.LIB, LIBCPMT.LIB      Multi-threaded Static    /MT
All                          LIBCMTD.LIB, LIBCPMTD.LIB    Multi-threaded Static    /MTd
Visual Studio – Dynamic Linking
Version           DLL linked with
VC++ 4.2          MSVCRT.DLL/MSVCRTD.DLL
VC++ 5.0          MSVCR50.DLL
VC++ 6.0          MSVCR60.DLL
VC++ .NET 2002    MSVCR70.DLL
VC++ .NET 2003    MSVCR71.DLL
VC++ .NET 2005    MSVCR80.DLL
VC++ .NET 2008    MSVCR90.DLL
MFC
"^MFC(?<type>(|O|D|N|S))(?<version>[0-9]+)(?<debug>(|U|D|UD))\.DLL"
type:
O: "Microsoft Foundation Classes (MFC) for Active Technologies, version: " + version;
D: "Microsoft Foundation Classes (MFC) for database, version: " + version;
N: "Microsoft Foundation Classes (MFC) for network (sockets), version: " + version;
S: "Microsoft Foundation Classes (MFC) statically linked code, version: " + version;
default: "Microsoft Foundation Classes (MFC) standard, version: " + version;
debug:
D: "ANSI Debug"; UD: "Unicode Debug"; U: "Unicode Release"; default: "ANSI Release";
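The same naming rule translates directly to Python's `re` syntax, where named groups are written `(?P<name>...)`. This is a sketch of the mapping on the slide, not the fingerprint tool's own code; `describe_mfc` is a hypothetical helper name and the output wording is an assumption.

```python
import re

# Same structure as the slide's pattern: optional type letter, version digits, build suffix.
MFC_DLL = re.compile(
    r"^MFC(?P<type>[ODNS]?)(?P<version>[0-9]+)(?P<debug>UD|U|D|)\.DLL$",
    re.IGNORECASE,
)
TYPES = {
    "O": "for Active Technologies",
    "D": "for database",
    "N": "for network (sockets)",
    "S": "statically linked code",
    "": "standard",
}
BUILDS = {"D": "ANSI Debug", "UD": "Unicode Debug", "U": "Unicode Release", "": "ANSI Release"}


def describe_mfc(dll_name: str):
    """Describe an MFC DLL name, or return None if the name does not match."""
    m = MFC_DLL.match(dll_name)
    if m is None:
        return None
    kind = TYPES[m["type"].upper()]
    build = BUILDS[m["debug"].upper()]
    return f"Microsoft Foundation Classes (MFC) {kind}, version: {m['version']} ({build})"
```

Applied to the import table of a sample, the version number narrows the Visual Studio release and the suffix shows whether a debug build was shipped.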
Static Linking • C runtime library strings will be embedded in the EXE itself, as opposed to being in an external DLL – DOMAIN error – TLOSS error – SING error – R6027 • Other libraries can also be detected in same manner (MFC, OpenSSL, etc)
Debug Symbols • Debug timestamp (time_t – seconds since 01.1970) • Version of the PDB file – NB09 - Codeview 4.10 – NB11 - Codeview 5.0 – NB10 - PDB 2.0 – RSDS - PDB 7.0 • Age – number of times the malware has been compiled
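For the RSDS (PDB 7.0) case, the CodeView record is the `RSDS` signature, a 16-byte GUID, a 32-bit age, and a null-terminated PDB path. A minimal sketch that locates the record by scanning for the signature rather than walking the debug directory, which is an assumption made to keep the example short; `parse_rsds` is a hypothetical helper name.

```python
import struct
import uuid


def parse_rsds(data: bytes):
    """Return (guid, age, pdb_path) from the first RSDS record, or None."""
    pos = data.find(b"RSDS")
    if pos < 0 or pos + 24 > len(data):
        return None
    guid = uuid.UUID(bytes_le=data[pos + 4:pos + 20])   # 16-byte GUID, little-endian fields
    (age,) = struct.unpack_from("<I", data, pos + 20)   # incremented as the PDB is rebuilt
    end = data.find(b"\0", pos + 24)
    if end < 0:
        end = len(data)
    path = data[pos + 24:end].decode("latin-1")
    return guid, age, path
```

Because the four signature bytes can occur by chance in unrelated data, a production tool would follow the IMAGE_DEBUG_DIRECTORY entry to the record instead.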
Debug Information Format • Types: – Standard Program Database – Program Database for Edit and Continue (/ZI) – C 7 Compatible
Name Mangling
Undecorate
Visual C++ demangle:
DWORD WINAPI UnDecorateSymbolName( __in PCTSTR DecoratedName, __out PTSTR UnDecoratedName, __in DWORD UndecoratedLength, __in DWORD Flags );
Also, see source to winedbg
GNU C++ demangle: see libiberty/cplus-dem.c and include/demangle.h
Delphi • Give-away strings: SOFTWARE\Borland\Delphi\RTL This program must be run under Win32 - Borland’s tlink32 linker
Delphi • Uses specific function names – easy to identify • Language is derived from Pascal 78 hits for pascal, only 2 for c++
DOS stubs • MZ\x50 • MZ\x90 • “This program cannot be run in DOS mode” – VC, gcc, MASM • “This program requires Microsoft Windows” • “This program must be run under win32”
Embedded Manifest • Contains name, description, platform • Contains list of dependent modules + versions – May contain key tokens that identify specific dependent modules (aka strongly named) • May contain public key that is tied to the developer if assembly itself is strongly named – not likely! – Public/private key pair (sn.exe)
Manifest
assemblyIdentity version=.+\<description\>(?<description>.+)\</description\>
assemblyIdentity version=\"(?<version>[0-9\.]+)\" processorArchitecture=\"(?<proc>.+)\" name=\"(?<asm_name>.+)\" type
\<dependentAssembly\>\<assemblyIdentity.+ name=\"(?<asm_name>.+)\" version=\"(?<version>[0-9\.]+)\" processorArchitecture=\"(?<proc>.+)\" publicKeyToken=\"(?<token>[0-9a-fA-F]+)\"
I hope you like Regex
Choice of string handling functions – UNICODE, ASCII, MultiByte – “wprintf” – wide – “f_sprintf” – safe – “(n|w)printf” – length check – “_v” - var-arg – “_f” - file output
Compiler Options • Optimize for Size / Speed • Inline Function Expansion • Intrinsic Functions • Fast code over small code
Frame Pointer Omission • Look for a certain number of [esp] variable initializations Example: C7 44 24 08 00 00 mov dword ptr [esp+0x8], 0x0 Don’t need a disassembler, this can be byte pattern based
Exception Handling Structured (SEH) “__except_handler3” or “__local_unwind3” – VS < 8.0 “__except_handler4” or “__local_unwind4” or “_XcptFilter” – VS 8.0+ 64 ff 35 00 00 - push dword fs:[0] (SEH save) 64 89 25 00 00 - mov fs:[0], esp (SEH init) Vectored “AddVectoredExceptionHandler” or “RemoveVectoredExceptionHandler”
Buffer Security Checks
0041140F  8B 4D FC          mov ecx, dword ptr [ebp-0x4]
00411412  33 CD             xor ecx, ebp
00411414  E8 05 FC FF FF    call 0x0041101E ▲ // sub_0041101E
AddPattern(theList, "Buffer Security Checks", "8B 4D FC 33 CD E8", 1, 0, null);
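Byte-pattern traits like the one above need no disassembler. The sketch below is a naive scanner, not the `AddPattern` implementation from the slide; `compile_pattern` and `find_pattern` are hypothetical helper names, and the `??` wildcard token is an assumed extension for bytes that vary between builds (for example a stack offset).

```python
def compile_pattern(text: str):
    """Turn '8B 4D FC 33 CD E8' into a list of byte values; '??' becomes a wildcard."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def find_pattern(data: bytes, pattern) -> list[int]:
    """Return every offset in data where the pattern matches."""
    hits = []
    n = len(pattern)
    for i in range(len(data) - n + 1):
        # A None entry matches any byte at that position.
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern)):
            hits.append(i)
    return hits
```

The same scanner covers the frame-pointer-omission check: a pattern such as `C7 44 24 ?? 00 00` matches `[esp+N]` initializations regardless of the offset.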
Runtime Type Information (RTTI) "Run-Time Check Failure #%d"
Calling Convention • __cdecl • __stdcall • __fastcall
C versus C++ • Pattern is apparent when C++ objects are used • Call thru vtable
UAC • asInvoker • highestAvailable • requireAdministrator • “Bypass” UI Protection
Tracking Source Code [The Flow of Forensic Toolmarks diagram, repeated]
Main Functions • Main – Same argument parsing – Init of global variables – WSAStartup • DllMain • ServiceMain
Service Routines • Install / Uninstall Service • RunDll32 • Service Start/Stop • ServiceMain • ControlService
Skeleton of a service
DllMain()
{
    // store the HANDLE to the module in a global variable
}
ServiceMain()
{
    // RegisterServiceCtrlHandler & store handle to service in global variable
    // call SetServiceStatus, set PENDING, then RUNNING
    // call to main malware function(s)
}
ServiceCtrlHandler_Callback
{
    // handle various commands, start/stop/pause/etc
}
Annotations: Size of local buffer; Sleep loop at end; dwWaitHint; Hard coded sleep( ) times
Skeleton of a service
Main_Malware_Function
{
    // do stuff
}
InstallService()
{
    // OpenSCManager
    // CreateService
}
UninstallService()
{
    // OpenSCManager
    // DeleteService
}
Annotations: Size of local buffer; Service Name; Exception Handling; Registry Keys
Filename Creation • Log files, EXE’s, DLL’s • Subdirectories • Environment Variables • Random numbers
Case Study: Chinese APT 2004 2005 2009 2010 2005 posting of similar source code, includes poster’s handle.
Case Study: Chinese APT Continued searching will reveal many, many references to the base source code of this malware. All malware samples for this attacker are derived from this basic framework, but many additions & modifications have been made.
3rd Party SourceCode [The Flow of Forensic Toolmarks diagram, repeated]
Format Strings • These are written by humans, so they provide good uniqueness http://%s:%d/%d%04d
Logging Strings Searching for: -“Unable to determine” & -“Unknown type!” Reveals that the attacker is using the source-code of BO2k for cut-and-paste material.
Mutex Names Mutex names remain consistent at least for one infection-push, as they are designed to prevent multiple-infections for the same malware.
Link Analysis
3rd Party Libraries [The Flow of Forensic Toolmarks diagram, repeated]
Copyright & Version Strings
OpenSSL/0.9.6
RAND part of OpenSSL 0.9.8e 23 Feb 2007
MD5 part of OpenSSL 0.9.8k 25 Mar 2009
libdes part of OpenSSL 0.9.7b 10 Apr 2003
inflate 1.2.1 Copyright 1995-2003 Mark Adler
inflate 1.1.4 Copyright 1995-2002 Mark Adler
inflate 1.2.3 Copyright 1995-2005 Mark Adler
inflate 1.0.4 Copyright 1995-1996 Mark Adler
inflate 1.1.3 Copyright 1995-1998 Mark Adler
inflate 1.1.2 Copyright 1995-1998 Mark Adler
inflate 1.2.2 Copyright 1995-2004 Mark Adler
zlib Fingerprinting • Every new version of zlib has a unique pattern of bits in the data tables – these are modified for each version specifically • This pattern is a data constant and can be used even if the copyright notices have been removed http://www.enyo.de/fw/security/zlib-fingerprint/zlib.db
inflate library patterns • Not as specific as zlib patterns but can be used to detect the inflate decompressor http://www.enyo.de/fw/security/zlib-fingerprint/inflate.db
Installation & Deployment [Developer Fingerprints diagram, repeated]
Case Study: Chinese APT 2004 2005 2009 2010 Alters the DLL value of an existing service named “RemoteRegistry”: Original ServiceDll value: regsvc.dll Trojan ServiceDll value: regsvr.dll Registers a service named “IPRIP” which operates as a DLL loaded under svchost.exe
Method used to find base of kernel32
FindKernel32:
    pushad
    and esi, 0FFFF0000h          ; Mask off ESI to a page boundary
    mov ecx, 100h                ; Load ECX w/ a length to scan backwards from
FK32_Loop:
    call TryAddress
    jnc FK32_Success
    sub esi, 010000h             ; Subtract, try again
    loop FK32_Loop
FK32_Hardcodes:                  ; Try a bunch of hard coded offsets if the scan fails
    mov esi, KERNEL32_WIN9X
    call TryAddress
    jnc FK32_Success
    mov esi, KERNEL32_WINNT
    call TryAddress
    jnc FK32_Success
    mov esi, KERNEL32_WIN2K
    call TryAddress
    jnc FK32_Success
    mov esi, KERNEL32_WINME
    call TryAddress
    jnc FK32_Success
FK32_Fail:
    popad
    stc
    ret
FK32_Success:
    mov [ebp + Kernel32], esi
    popad
    clc
    ret
Command & Control [Developer Fingerprints diagram, repeated]
Command Control Once installed, the malware phones home… TIMESTAMP SOURCE COMPUTER USERNAME VICTIM IP ADMIN? OS VERSION HD SERIAL NUMBER
C&C Hello Message 1) this queries the uptime of the machine. . 2) checks whether it's a laptop or desktop machine. . . 3) enumerates all the drives attached to the system, including USB and network. . . 4) gets the windows username and computername. . . 5) gets the CPU info. . . and finally, 6) the version and build number of windows.
Command Control Server • The C&C system may vary – Custom protocol (Aurora-like) – Plain Old URL’s – IRC (not so common anymore) – Stealth / embedded in legitimate traffic • Machine identification – Stored infections in a back end SQL database
Aurora C&C parser A) Command is stored as a number, not text. It is checked here. B) Each individual command handler is clearly visible below the numerical check C) After the command handler processes the command, the result is sent back to the C&C server
Command & Control [Developer Fingerprints diagram, repeated]
Antidebugging
DetectDebuggers:
    pushad
    PUT_SEH_HANDLER FD_Continue  ; Place SEH handler: use SEH to kill debuggers
    xor eax, eax                 ; Generate a exception (divide by 0)
    div eax                      ; Divide by zero error
    RESTORE_SEH_HANDLER          ; Here some abnormal occured
    jmp FD_Debugger_Found        ; So lets quit
FD_Continue:                     ; Execution should resume at this pnt
    RESTORE_SEH_HANDLER          ; Remove handler
    mov eax, fs:[20h]            ; Detect application-level debugger
    test eax, eax                ; Is present?
    jnz FD_Debugger_Found        ; Quit!
    popad                        ; No debuggers found, so restore
    clc                          ; registers, clear carry flag and
    ret                          ; return!
FD_Debugger_Found:
    popad
    stc
    ret
Debugger Detection
• Call IsDebuggerPresent – Or, read offset 2 from the PEB structure
    mov eax, fs:[30h]
    movzx eax, byte ptr [eax+2]
    test eax, eax
    jnz __found_debugger
• Check the Heap Manipulation Flags in NtGlobalFlag (FLG_HEAP_ENABLE_TAIL_CHECK, FLG_HEAP_ENABLE_FREE_CHECK, FLG_HEAP_VALIDATE_PARAMETERS)
    mov eax, fs:[30h]
    mov eax, [eax+68h]
    and eax, 0x70
    test eax, eax
    jnz __found_debugger
Debugger Detection
• Heap Flags, not the same as NtGlobalFlag but affected by the use of FLG_HEAP_*
    mov eax, fs:[30h]
    mov eax, [eax+18h]           ; process heap
    // EAX now points to the first heap header…
    mov eax, [eax+10h]           ; heap flags member in the header
    // EAX can now be tested for any heap flags that may be enabled
    test eax, eax
    jnz __found_debugger
Debugger Detection • NtQueryInformationProcess – Called with a ProcessInformationClass of 7 (ProcessDebugPort), will set the value behind the ProcessInformation pointer to 0xFFFFFFFF if the process is being debugged
Debugger Detection • CheckRemoteDebuggerPresent – This just wraps NtQueryInformationProcess, but in this case the OUT DWORD is set to 1 (TRUE) if a debugger is present
Debugger Detection • TRAP_FLAG – Checking to see if it’s set – Or, setting it with an exception handler • The debugger would process the single step and the exception handler would not be called if a debugger were present
Debugger Detection • ZwClose – If a program is being debugged, calling ZwClose with an invalid handle will generate an exception STATUS_INVALID_HANDLE (0xC0000008)
Debugger Detection • SetUnhandledExceptionFilter – The registered filter will not be called if a debugger is attached – If a debugger is attached, the program will terminate due to the unhandled exception
Debugging and Timers • Calling QueryPerformanceCounter • Calling GetTickCount • RDTSC instruction
Hiding a Thread from a Debugger • Call NtSetInformationThread with a ThreadInformationClass of 0x11 (ThreadHideFromDebugger) – the thread will be detached from any debuggers
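From the defender's side, the API names used by the tricks above are themselves strings that can be searched for in a memory image or unpacked binary. A minimal sketch, assuming ASCII import names; `anti_debug_indicators` is a hypothetical helper name and the list is limited to the calls named on these slides.

```python
# API names taken from the debugger-detection slides above.
ANTI_DEBUG_APIS = (
    "IsDebuggerPresent",
    "CheckRemoteDebuggerPresent",
    "NtQueryInformationProcess",
    "NtSetInformationThread",
    "ZwClose",
    "SetUnhandledExceptionFilter",
    "QueryPerformanceCounter",
    "GetTickCount",
)


def anti_debug_indicators(data: bytes) -> list[str]:
    """Return the anti-debugging related API names that appear in the raw bytes."""
    return [name for name in ANTI_DEBUG_APIS if name.encode("ascii") in data]
```

Several of these calls (the timers and SetUnhandledExceptionFilter in particular) also appear in ordinary software, so the combination present in a sample is more telling than any single hit.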
Advanced Fingerprinting
GhostNet: Screen Capture Algorithm Loops, scanning every 50th line (cY) of the display. Reads screenshot data, creates a special DIFF buffer LOOP: Compare new screenshot to previous, 4 bytes at a time If they differ, enter secondary loop here, writing a ‘data run’ for as long as there is no match. Data run layout: Offset in screenshot, Len in bytes, Data….
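The diff loop described above can be sketched in a few lines, which helps when recognizing the same structure in a disassembly. This is an illustration of the described idea, not the recovered gh0st code: the 32-bit little-endian encoding of offset and length is an assumption, the every-50th-line scan is not modelled, and `diff_runs` and `apply_runs` are hypothetical names.

```python
import struct


def diff_runs(prev: bytes, new: bytes, word: int = 4) -> bytes:
    """Encode the differences between two equal-sized frames as data runs."""
    if len(prev) != len(new):
        raise ValueError("frames must be the same size")
    out = bytearray()
    i, n = 0, len(new)
    while i < n:
        if prev[i:i + word] == new[i:i + word]:
            i += word
            continue
        start = i
        # Secondary loop: extend the run for as long as the frames keep differing.
        while i < n and prev[i:i + word] != new[i:i + word]:
            i += word
        end = min(i, n)
        # Run record: offset in screenshot, length in bytes, then the data itself.
        out += struct.pack("<II", start, end - start) + new[start:end]
    return bytes(out)


def apply_runs(prev: bytes, runs: bytes) -> bytes:
    """Rebuild the new frame from the previous frame plus the encoded runs."""
    buf = bytearray(prev)
    pos = 0
    while pos < len(runs):
        offset, length = struct.unpack_from("<II", runs, pos)
        pos += 8
        buf[offset:offset + length] = runs[pos:pos + length]
        pos += length
    return bytes(buf)
```

The pairing of a fixed-width compare with an offset/length/data record is the kind of algorithmic toolmark that survives recompilation and repacking.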
GhostNet: Searching for sourcecode Large grouping of constants Search source code of the ‘Net
GhostNet: Refining Search Has something to do with audio… Further refine the search by including ‘WAVE_FORMAT_GSM610’ in the search requirements…
GhostNet: Source Discovery We discover a nearly perfect ‘c’ representation of the disassembled function. Clearly cut-and-paste. We can assume most of the audio functions are this implementation of ‘CAudio’ class – no need for any further low-level RE work.
On link analysis…
Example: Link Analysis with Palantir™ 1. Implant 2. Forensic Toolmark specific to Implant 3. Searching the ‘Net reveals source code that leads to Actor 4. Actor is supplying a backdoor 5. Group of people asking for technical support on their copies of the backdoor
Keylogger (link analysis)
Working back the timeline • Who sells it, when did that capability first emerge? – Requires ongoing monitoring of all open-source intelligence, presence within underground marketplaces – Requires budget for acquisition of emerging malware products
Penetrating Cyberspaces • Maintaining and building digital cover • Non-attrib pop on ‘net • Multiple identities • Contribution for bonafides
carders.cc [Screenshot: listing of carders.cc forum accounts, one per line, in the form handle : password hash : e-mail address]
Defining Threat Groups • Smallest atomic unit: the individual • Largest cloud unit: the scam – Fraud, IP-theft, access reseller • A. B. C narrowing cloudspace to individual • Developers – Less than number of malware (with malware defined before MD 5 created aka pre-packing) • Users – Larger than number of developers
Fingerprint.exe
Fingerprint Utility
Developer Fingerprint Utility, Copyright 2010 HBGary, INC
File: 1228ad2e39befa4319733e98d8ed2890.livebin
Original project name: RESSDT
Developer's project directory: e:\gh0st\server\sys\i386
Compiler: Microsoft Visual C++ 6.0 release
User interface: Windows GDI/Common Controls
Media: Windows multimedia API, Microsoft VfW (Video for Windows)
Compression: Inflate Library version: 1.1.4
Networking: Windows sockets (TCP/IP), Windows Internet API
Source directory: e:\gh0st\server\sys\i386
“Smars” malware [Chart: % Match of the sample compiled at 5:50:13 AM compared against the DB; compile times shown: 5:48:28 AM, 5:50:13 AM; % Match axis from 81.1 to 101.1] All samples have different MD5 checksums, may have been packed in various ways. All but one score in 90%+ range.
The set of Mark Russinovich’s free system tools. You can see which ones are just variants of the same source base, or were compiled on the same platform in or around the same time.
Clustering a malware collection • Large number of samples • Need to group self-similar items into “clusters” – Like a “strange attractor” • From the cluster, perform link analysis into social cyberspaces to find “participants” – Some participants may “resolve” into a developer, user, or other archetype
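One simple way to group self-similar samples is to treat each sample as a set of extracted traits (PDB path, compiler, library versions, and so on) and cluster on set overlap. The sketch below uses Jaccard similarity with a greedy single pass; this is a swapped-in illustration, not the scoring used by the fingerprint utility, and the trait labels, threshold, and function names are assumptions.

```python
def jaccard(a: set, b: set) -> float:
    """Share of traits two samples have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(samples: dict, threshold: float = 0.8) -> list:
    """Greedily group sample names whose trait sets overlap above the threshold."""
    clusters = []
    for name, traits in samples.items():
        for group in clusters:
            # Join the first cluster containing any sufficiently similar member.
            if any(jaccard(traits, samples[other]) >= threshold for other in group):
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters
```

A greedy pass depends on input order and never merges two existing clusters, so it is only a starting point; the clusters it produces are the inputs to the link analysis step, not a conclusion about authorship.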
system32 directory – Windows 7 64 bit Professional [Cluster chart labels:] Old-school DOS command EXE’s; These were very small binaries with almost no fingerprint data; More old school, but these have extra cmd-parsing features; Hypigon; Virut; Rebel Base; Autorun infecting sysinternals; Language support binaries (NLS); tskill, tsdiscon, logoff, changelogon, etc; Vobfus; 1/41 on VirusTotal; Azero; YahLover; Rungbu
HBGary, Inc. www.hbgary.com
Conclusion
Takeaways • Actionable intelligence can be obtained from malware infections for immediate defense: – File, Registry, and IP/URL information • Existing security doesn’t stop ‘bad guys’ – Go ‘beyond the checkbox’ • Adversaries have intent and funding • Need to focus on the criminal, not malware – Attribution is possible thru forensic toolmarking combined with open and closed source intelligence
Continued Work • Will be performing large-scale fingerprint analysis over 400 gigs+ of malware captured by the U.S. Intelligence Community • HBGary is interested in processing as many malware collections as possible, both targeted/APT and non-targeted, both classified and unclassified, commercial or govt/govt contractor
Fingerprint Download • Get fingerprint from www.hbgary.com – or – • Stop by the HBGary booth to get a CD
Thank You • HBGary, Inc. (www.hbgary.com) • HBGary Federal (www.hbgaryfederal.com)