Spyware Detection Jeff Rosenberg Advisor: Professor Hemmendinger Computer Science Senior Project Winter 2006
Why bother?
- Most spyware removal tools use a list of known spyware programs
  - List must be constantly updated
  - Won’t catch anything that’s not listed
  - At the mercy of those who create it
- Rather than use a reference list, detect spyware based on patterns it shows
  - Avoids the need to update
The Program
- C++ program that uses MFC
  - Easy to make a dialog-based interface
- Searches a computer, testing all files and directories it encounters
- Displays a list of detected files and directories, along with their probabilities of being spyware
  - Based on which tests they passed
Searching
[Figure: directory tree with root, Dir 1 to Dir 4, and files numbered between 1 and 10, shown with labels "Bad Files" and "List"]
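The search step (visit every file and directory, test each one, and collect the suspicious entries) might be sketched as follows. This is a minimal sketch, not the project's code: it uses C++17 `std::filesystem` rather than MFC, and the names `Detection`, `Scorer`, and `scanTree` are hypothetical. The scoring callback stands in for whatever tests the real program runs.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical record for one flagged entry; field names are illustrative.
struct Detection {
    fs::path path;
    double probability;  // 0.0 to 1.0, as returned by the scoring callback
};

// Hypothetical scoring callback: returns a spyware probability for one entry.
using Scorer = std::function<double(const fs::directory_entry&)>;

// Walk every file and directory under `root`, score each entry, and collect
// the entries whose score is above zero.
std::vector<Detection> scanTree(const fs::path& root, const Scorer& score) {
    std::vector<Detection> bad;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec);
    const fs::recursive_directory_iterator end;
    while (!ec && it != end) {
        const double p = score(*it);
        assert(p >= 0.0 && p <= 1.0);  // the callback must return a probability
        if (p > 0.0) {
            bad.push_back({it->path(), p});
        }
        it.increment(ec);  // non-throwing advance; the loop stops on error
    }
    return bad;
}
```

A caller would pass a starting directory and a function that combines the individual tests into one probability, then display the returned list.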
Testing 123
- Patterns: size, name, type
- Tests: combinations of patterns that spyware often exhibits

Pattern      Argument   Bit
SizePattern  400, 800   00000001
NamePattern  “spy”      00000010
TypePattern  “exe”      00000100

Test        Mask      Looks for
SpywareExe  00000110  executable files with “spy” in the name
SmallExe    00000101  executable files between 400 and 800 bytes

File: ThisIsSpyWare.exe, 2 KB, matched patterns 00000110
- 00000110 & 00000110 = 00000110 (equals the SpywareExe mask, so that test passes)
- 00000110 & 00000101 = 00000100 (does not equal the SmallExe mask, so that test fails)
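The bitmask check can be sketched in a few lines of C++. The bit values and test masks come from the slide; the function names (`matchPatterns`, `passes`, `toLower`), the inclusive 400 to 800 byte range, and the case-insensitive name match are assumptions made for illustration. Case-insensitivity is assumed because the example file name contains "Spy" and still matches the "spy" pattern.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>

// One bit per pattern, using the values from the slide.
constexpr std::uint8_t kSizePattern = 0b00000001;  // size between 400 and 800 bytes
constexpr std::uint8_t kNamePattern = 0b00000010;  // name contains "spy"
constexpr std::uint8_t kTypePattern = 0b00000100;  // extension is "exe"

// A test is the OR of the patterns it requires.
constexpr std::uint8_t kSpywareExe = kNamePattern | kTypePattern;  // 00000110
constexpr std::uint8_t kSmallExe   = kSizePattern | kTypePattern;  // 00000101

// Hypothetical helper: lower-case a copy so matching ignores case.
std::string toLower(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// Build the mask of patterns that a file matches.
std::uint8_t matchPatterns(const std::string& fileName, std::uintmax_t sizeBytes) {
    const std::string lower = toLower(fileName);
    std::uint8_t mask = 0;
    if (sizeBytes >= 400 && sizeBytes <= 800) mask |= kSizePattern;
    if (lower.find("spy") != std::string::npos) mask |= kNamePattern;
    if (lower.size() >= 4 && lower.compare(lower.size() - 4, 4, ".exe") == 0) {
        mask |= kTypePattern;
    }
    return mask;
}

// A file passes a test only if every bit required by the test is set.
bool passes(std::uint8_t fileMask, std::uint8_t test) {
    assert(test != 0);  // an empty test would match every file
    return (fileMask & test) == test;
}
```

For the slide's example, `matchPatterns("ThisIsSpyWare.exe", 2048)` sets the name and type bits only, so `passes` succeeds for `kSpywareExe` and fails for `kSmallExe`.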
Time is on my side
- Spyware often appears in groups, with all files created at the exact same time
- Can also use these bad dates to find spyware in other locations
- Algorithm to find date clusters in a given directory
  - Sort a list of files by date
  - Starting with the first file in the list, look through all of the files that follow as long as their dates are within a certain range of each other
  - Continue until a date is found outside of this range. The probability of being spyware for files in this cluster depends on how many files are in it.
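The date-cluster steps above could look roughly like this in C++. This is a sketch under stated assumptions, not the project's implementation: `FileInfo`, `findDateClusters`, and `clusterProbability` are hypothetical names, "within a certain range of each other" is read as "within `windowSeconds` of the first file in the cluster", and the scoring rule is a placeholder, since the slides only say that the probability depends on cluster size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ctime>
#include <string>
#include <vector>

// Hypothetical record for one file in the directory being examined.
struct FileInfo {
    std::string name;
    std::time_t created;  // creation time in seconds
};

// Sort by creation time, then group each run of files whose times fall within
// `windowSeconds` of the first file in the run. Runs with fewer than `minSize`
// files are not reported as clusters.
std::vector<std::vector<FileInfo>> findDateClusters(
        std::vector<FileInfo> files, std::time_t windowSeconds, std::size_t minSize) {
    assert(windowSeconds >= 0);
    std::sort(files.begin(), files.end(),
              [](const FileInfo& a, const FileInfo& b) { return a.created < b.created; });

    std::vector<std::vector<FileInfo>> clusters;
    std::size_t start = 0;
    while (start < files.size()) {
        // Extend the run until a date falls outside the window.
        std::size_t end = start + 1;
        while (end < files.size() &&
               files[end].created - files[start].created <= windowSeconds) {
            ++end;
        }
        if (end - start >= minSize) {
            clusters.emplace_back(files.begin() + static_cast<std::ptrdiff_t>(start),
                                  files.begin() + static_cast<std::ptrdiff_t>(end));
        }
        start = end;
    }
    return clusters;
}

// Placeholder scoring rule (an assumption): larger clusters score higher,
// capped at 1.0.
double clusterProbability(std::size_t clusterSize) {
    return std::min(1.0, 0.1 * static_cast<double>(clusterSize));
}
```

The creation times collected from the reported clusters could then serve as the "bad dates" to look for in other directories.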
Program Interface
Conclusions/Future Work
- Can be hard to distinguish between good and bad files
- Still did a good job of finding all the spyware on the test machine
  - Tests were developed from infections in October, but were still able to find spyware from new infections in February
- Learning: adjusting tests on the fly and creating new ones
- Optimization
  - Many of the algorithms used can be sped up significantly
  - Still does okay: took 3 minutes and 35 seconds to scan 131,301 files (37 GB)
Questions?