Spam: An Analysis of Spam Filters
Joe Chiarella, Jason O’Brien
Advisors: Professor Wills and Professor Claypool
Project Goals
- To analyze the effectiveness of different kinds of spam filters.
- Focused on SpamAssassin and Bogofilter.
SpamAssassin
- Rule-based filter: over 400 rules.
- Each rule has an associated weight.
- The score of an email is the sum of the weights of all matching rules.
- User-adjustable threshold.
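The scoring scheme on this slide can be sketched in a few lines of Python. This is only an illustration of "sum the weights of matching rules, then compare to a threshold": the rule names, patterns, and weights below are hypothetical and are not taken from SpamAssassin's rule set, and the threshold value is an assumption standing in for the user's setting.

```python
import re

# Hypothetical rules (name, pattern, weight). Illustrative only; these are
# NOT real SpamAssassin rules or weights.
RULES = [
    ("FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("ALL_CAPS_SUBJECT", re.compile(r"^Subject: [A-Z !]+$", re.M), 1.5),
    ("KNOWN_SENDER", re.compile(r"^From: .*@example\.edu", re.M), -1.0),
]


def score_email(text, rules=RULES):
    """Return the sum of the weights of every rule whose pattern matches."""
    return sum(weight for _name, pattern, weight in rules if pattern.search(text))


def is_spam(text, threshold=5.0, rules=RULES):
    """Flag the message when its score reaches the (user-adjustable) threshold."""
    return score_email(text, rules) >= threshold
```

Lowering `threshold` catches more spam at the cost of flagging more ham, which is the trade-off the adjustable threshold exposes to the user.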
Bogofilter
- Bayesian filter.
- Calculates the probability that an email is spam using past email.
- Looks at the frequency of words (not the order of words).
- Accuracy should improve over time.
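To make the idea of a word-frequency Bayesian filter concrete, here is a minimal sketch. It is a plain naive Bayes classifier with add-one smoothing, swapped in for illustration; it is not Bogofilter's actual computation, and the class and method names are hypothetical.

```python
import math
from collections import Counter


class WordBayes:
    """Toy Bayesian filter: learns from past email, ignores word order."""

    def __init__(self):
        # Number of training messages per class that contain each word.
        self.words = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record which words appear in one message labelled 'spam' or 'ham'."""
        self.words[label].update(set(text.lower().split()))
        self.msgs[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | words) from the messages seen so far."""
        n_spam, n_ham = self.msgs["spam"], self.msgs["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))  # smoothed prior
        # Sorting the word set makes the result independent of word order.
        for word in sorted(set(text.lower().split())):
            p_spam = (self.words["spam"][word] + 1) / (n_spam + 2)
            p_ham = (self.words["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))
```

With no training data every message scores 0.5; the estimates only become informative as more labelled email is seen, which is why accuracy is expected to improve over time.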
Data Collection
- Email collected from students, professors, small business employees, and free email accounts.
- 4626 ham emails and 5010 spam emails, separated into ham and spam mailboxes for each user.
Methodology
- Compared the accuracy of SpamAssassin and Bogofilter on each user’s email.
- Tested the same number of ham emails and spam emails from each user.
- Ignored results from the first 50 emails to allow Bogofilter to learn.
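One way to read the warm-up step is sketched below: every message is classified, a learning filter is then shown the true label, and only results after the first 50 messages count toward accuracy. The slides do not say how training was interleaved with testing, so the ordering here is an assumption, and `accuracy_after_warmup`, `classify`, and `learn` are hypothetical names.

```python
def accuracy_after_warmup(messages, classify, learn=None, warmup=50):
    """Fraction of correct verdicts, ignoring the first `warmup` messages.

    messages: iterable of (text, is_spam) pairs in arrival order.
    classify: function text -> bool (True means "spam").
    learn:    optional function (text, is_spam) called after each verdict,
              for filters that train on past email; None for static filters.
    """
    correct = counted = 0
    for index, (text, is_spam) in enumerate(messages):
        predicted = classify(text)
        if learn is not None:
            learn(text, is_spam)  # assumed: the filter sees the true label afterwards
        if index < warmup:
            continue  # warm-up results are not scored
        counted += 1
        correct += predicted == is_spam
    return correct / counted if counted else float("nan")
```

Passing the same message list to both filters, with `learn=None` for a rule-based filter, keeps the comparison on equal footing.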
Comparison of Bogofilter and SpamAssassin on Ham
(Chart legend: CP = Company Person, PR = Professor, ST = Student, FE = Free Email)
Comparison of Bogofilter and SpamAssassin on Spam
(Chart legend: CP = Company Person, PR = Professor, ST = Student, FE = Free Email)
SpamAssassin Score Analysis
Conclusion
- The effectiveness of Bogofilter and SpamAssassin depends greatly on the user.
- Neither filter outperformed the other in all cases.
- Filtering spam is hard.
Questions?