CECR (“Seeker”): Centralized Event Correlation and Response
Ramon Kagan, Chris Russel
York University, Toronto

Agenda
• How and why Automated Incident Response enhances an Information Security program
• Initial Phase: Detection and Compliance systems
• Implementation of Centralized Event Correlation and Response (CECR) system
  – Detection and Compliance
  – Correlation and Classification
  – Automated Response

Context: York University
• Located in Toronto
• Canada’s third largest University
  – 60,000+ students
  – 10,000+ staff and faculty
  – 25,000+ network drops
• In 2003, 2 FT information security staff positions (now 3)

“Traditional” Incident Response
• Preparation
• Identification
• Containment
• Eradication
• Recovery
• Lessons Learned
(From SANS Step-by-Step Incident Response)

August 2003

Short-Circuited Incident Response
• Information Security becomes “Worm Management Services”
• No time for normal response procedures
• We adapted and made it through, but…
  – Is this really security?
  – What are we missing in the noise and mayhem of constant worm attacks?

Prevention
• Don’t let these things happen in the first place
• Lots of products to buy
  – Firewalls, IPS, Anti-Virus, Silver Bullets, etc.
• These are all good things but not without their challenges

Prevention Challenges in the Academic Environment
• Porous and increasingly fuzzy perimeter
  – Dialup, Wireless, VPN, Mobile devices, etc. Where does the firewall go now?
• Political hurdles to implement controls
  – I want my dancing pigs!
• Increase in operational management overhead
• $$$++

Detection and Response are Essential Too
• Why?
  – Prevention measures require increasing amounts of money and strong policy, with diminishing returns
  – They cannot prevent everything
  – What if they fail?
• How useful is a bank vault without an alarm and police response?
  – Ultimately it can only buy time

Automated Detection and Response
• Improving detection and response speed
  – Makes best use of and complements existing prevention measures
  – Better ROI than additional prevention?
  – Allows a 24/7/365 response even when staff are absent
  – Frees up incident handlers to focus on less obvious/potentially more serious matters

Where Automated Detection and Response Matter
• Botnets
  – Compromised host waits for commands
  – Detect it first and take it out before it spreads behind your perimeter
• Spyware (Marketscore, etc.)
• Leveraged/Low and Slow Hacking
  – Automated correlation can help detect things otherwise below the radar
• Large virus/worm infestation
  – Can scale to greatly assist with a future large-scale event

Detection
• Gather as much information as possible from anywhere you can
• Syslog
• Flow logs
• IDS/IPS/Firewall logs
• Honeypots

Syslog
• Login attempts
• Port scans
• Local exploits
• Anti-virus alerts

Flow logs
• Network traffic patterns
• Scanning detection
• Anomaly detection
• Historical context and forensic information

IDS/IPS/Firewall Logs
• Scanning
• Invalid access
• Hacking attempts

Honeypots
• Great for internal detection
  – No need for expensive hardware
  – Much cheaper than gigabit (multi-gig?) IDS sensors at every router
• By definition, very few false positives
• Darknets or Honeynets

Compliance
• Agent-based compliance detection
• Network-side vulnerability scanning
  – Nessus or other commercial tools
  – NOXscan: FAST scanner for Microsoft vulnerabilities used by many worms (MS04-007, MS04-011, MS05-039)
    http://infosec.yorku.ca/tools/

Correlation and Reaction
• Map events to an IP or MAC
• Map IP or MAC to user, support group, network drop, etc.
• Initiate a response as appropriate to the incident type, severity and context
• Do this very fast!
• Enter CECR… large drop in incidents within 3 months after implementation
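The correlation and reaction steps on this slide can be sketched roughly as follows. This is an illustrative Python sketch, not the CECR code: the lookup tables, the severity levels ("low"/"high") and the response names are all assumptions made for the example, and a real deployment would presumably query DHCP, switch, or login records instead of in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Event:
    incident_type: str
    ip: str
    severity: str  # hypothetical levels: "low" or "high"


# Hypothetical lookup tables standing in for DHCP/switch/registration data.
IP_TO_MAC = {"10.0.0.5": "00:11:22:33:44:55"}
MAC_TO_OWNER = {
    "00:11:22:33:44:55": {
        "user": "jdoe",
        "support_group": "residence",
        "drop": "R-214",
    }
}


def correlate(event: Event) -> dict:
    """Map the event's IP to a MAC, then the MAC to user, group and drop."""
    mac = IP_TO_MAC.get(event.ip)
    owner = MAC_TO_OWNER.get(mac, {}) if mac else {}
    return {"ip": event.ip, "mac": mac, **owner}


def choose_response(event: Event, context: dict) -> str:
    """Pick a response from severity and context (illustrative policy only)."""
    if "user" not in context:
        return "notify_infosec"  # nobody to contact, leave it to a human
    if event.severity == "high":
        return "quarantine"
    return "notify_user"
```

For example, `choose_response(e, correlate(e))` for a high-severity event on 10.0.0.5 selects the hypothetical "quarantine" response, while an event from an unknown address falls back to notifying the security team.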

Implementation

Lots of info, so what?
• All this great information being gathered
• How to sift through it
• How to react to it
• How to record our actions

Manual Handling
• Manual correlation
• Manually enter each incident (ELOG)
• Basic reporting available
• Very time consuming to enter all the tickets

Manual Handling
• Needed to increase correlation speed
• Needed better reporting
• Needed a way to distinguish incident types more easily
• Needed a tool that portrayed a process
• Needed a way to enter incidents automatically

Impetus for Change
• In a single word - LAZINESS
• September 2004 - Outbreak of virus activity on our dialup network
• Two problems
  – Mapping users to IP/Mapping IP to network segment - time consuming
  – Entering all those tickets - even more time consuming and oh the pain

CECR v1.0
• Shell script designed to accomplish two menial tasks
  – Correlate incidents to users
  – Submit tickets to RTIR automagically
• Great first step for dealing with mass breakouts
  – Allowed for initial automation of specific triggers

CECR v1.0
• Limitations
  – Not abstracted and difficult to extend
  – Haphazard script to ease the pain
  – Wasn’t really designed for more central usage
  – Unable to effectively take actions based on incident severity

CECR v2.0
• Rewritten in Perl
• Designed for extension and real-time updating
• Able to conduct many more tasks
  – Different actions depending on severity
  – Plugins can be added at any time
  – Exclusions now possible
  – Repeat notifications suppressed - limited to once daily
  – Automated contact to end-users/support groups

Framework of CECR v2.0
[Diagram. Labels: Sensors, Reporting Process, Central Processor, Correlation Plugins, Logging and Ticket Creation, Action Plugins, Automated Notification]

Components of CECR v2.0
• Reporting Process
  – Wrapping scripts around some IDS sources
    • Argus not “tail-able”
    • Vulnerability scanner results
  – Logsurfer+ for real-time processing of others
    • PIX log trends - context cognition
    • Snort

Components of CECR v2.0
• Central Process
  – Perl script - the coordinator
    • Param: incident type, IP, time, port (optional)
    • Two configuration files
      – Actions - what action to take per incident
        Incident type: Action: RTIR Category: Reason Tag: Email file: Exclusion List
      – Contacts - whom to contact for non-user access
        Regex domain: email: RTIR support group
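The two configuration files could be read along these lines. The slide lists only the field names, so everything else here is an assumption: that each entry is one colon-separated line, that the exclusion list is comma-separated, and that a contact regex contains no colon. The sketch is in Python for illustration (the original coordinator was a Perl script), and all function and field names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class ActionRule(NamedTuple):
    incident_type: str
    action: str
    rtir_category: str
    reason_tag: str
    email_file: str
    exclusions: tuple


class Contact(NamedTuple):
    pattern: re.Pattern
    email: str
    support_group: str


def parse_action_line(line: str) -> ActionRule:
    """Parse one assumed 'type:action:category:tag:emailfile:exclusions' line."""
    fields = [f.strip() for f in line.split(":")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    *head, excl = fields
    # Assumed: exclusions are a comma-separated list of addresses.
    exclusions = tuple(x.strip() for x in excl.split(",") if x.strip())
    return ActionRule(*head, exclusions)


def parse_contact_line(line: str) -> Contact:
    """Parse one assumed 'regex:email:support group' line (regex has no colon)."""
    pattern, email, group = [f.strip() for f in line.split(":")]
    return Contact(re.compile(pattern), email, group)


def find_contact(hostname: str, contacts: List[Contact]) -> Optional[Contact]:
    """Return the first contact whose domain regex matches the hostname."""
    for contact in contacts:
        if contact.pattern.search(hostname):
            return contact
    return None
```

A line such as `worm:quarantine:Incident:infected:worm.txt:10.0.0.1,10.0.0.2` (made up for this example) would parse into a rule whose `exclusions` holds the two listed addresses.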

Components of CECR v2.0
• Correlation plugins
  – 6 plugins
  – Correlate IP (depending on connection):
    • Username
    • MAC
    • Port
    • TTY (dialup)

Components of CECR v2.0
• Action Plugins
  – 5 plugins
  – Conduct various tasks including
    • Account lockout
    • Deregistration
    • Disconnection from network
    • Quarantine
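One common way to let plugins "be added at any time" is a name-to-function registry. The sketch below shows that idea in Python; it is not the CECR plugin interface, which the slides do not describe. The decorator, the plugin names and the context keys are hypothetical, and the plugins only return a description of what they would do rather than touching any real system.

```python
from typing import Callable, Dict, FrozenSet

# Registry of action plugins, keyed by the action name used in configuration.
ACTION_PLUGINS: Dict[str, Callable[[dict], str]] = {}


def action_plugin(name: str):
    """Decorator that registers a function as the plugin for `name`."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        ACTION_PLUGINS[name] = func
        return func
    return register


@action_plugin("lockout")
def lockout(ctx: dict) -> str:
    # Placeholder: a real plugin would disable the account.
    return f"locked account {ctx['user']}"


@action_plugin("quarantine")
def quarantine(ctx: dict) -> str:
    # Placeholder: a real plugin would move the host to a quarantine network.
    return f"quarantined {ctx['ip']}"


def run_action(name: str, ctx: dict,
               exclusions: FrozenSet[str] = frozenset()) -> str:
    """Run the named plugin unless the host is on the exclusion list."""
    if ctx["ip"] in exclusions:
        return "skipped: excluded"
    plugin = ACTION_PLUGINS.get(name)
    if plugin is None:
        return f"no plugin named {name}"
    return plugin(ctx)
```

Adding a new action then means writing one decorated function; the dispatcher itself does not change.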

Components of CECR v2.0
• Automated notification
  – Template-based emails by incident type
  – Contact either username (LDAP verified) or contact listed in contacts file
  – Notification of the incident sent to infosec group
    • If no contact information is found, the infosec email says so
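The notification rules above can be illustrated with Python's standard `string.Template`. This is a sketch under assumptions: the template text, the addresses, and the `build_notifications` function are invented for the example, and the LDAP check and the actual mail delivery are left out.

```python
from string import Template
from typing import List, Optional, Tuple

# Hypothetical per-incident-type templates; the real wording is not in the slides.
TEMPLATES = {
    "worm": Template(
        "Subject: Possible worm infection on $ip\n\n"
        "The host at $ip appears to be infected. Please disconnect and clean it."
    ),
}

INFOSEC_ADDRESS = "infosec@example.edu"  # placeholder address


def build_notifications(incident_type: str, ip: str,
                        user_email: Optional[str] = None,
                        contact_email: Optional[str] = None
                        ) -> List[Tuple[str, str]]:
    """Return (recipient, body) pairs; the infosec group always gets a copy."""
    body = TEMPLATES[incident_type].substitute(ip=ip)
    # Prefer the verified user; fall back to the contact from the contacts file.
    recipient = user_email or contact_email
    messages = []
    if recipient:
        messages.append((recipient, body))
        note = f"Notified {recipient}."
    else:
        note = "No contact information was found."
    messages.append((INFOSEC_ADDRESS, body + "\n\n" + note))
    return messages
```

When neither a user nor a contact is known, only the infosec copy is produced, and its body states that no contact information was found.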

Components of CECR v2.0
• Logging & Ticket Creation
  – All actions and decisions are logged via syslog for audit purposes
  – E-mail notification to RTIR to automagically create tickets in appropriate queues
  – Time-based record of event maintained to limit repeat notification
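The time-based record used to limit repeat notification might look something like this. It is a minimal Python sketch that assumes the record is keyed on incident type plus IP and uses a one-day window (the deck mentions "once daily" elsewhere); the class name and the in-memory storage are illustrative, and a long-running service would need to persist the record.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class NotificationLimiter:
    """Suppress repeat notifications for the same (incident type, IP) pair."""

    def __init__(self, window: timedelta = timedelta(days=1)):
        self.window = window
        self._last_sent: Dict[Tuple[str, str], datetime] = {}

    def should_notify(self, incident_type: str, ip: str, now: datetime) -> bool:
        key = (incident_type, ip)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window:
            return False  # already notified within the window
        self._last_sent[key] = now
        return True
```

Passing `now` explicitly keeps the check easy to test; a caller would normally supply `datetime.now()`.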

RTIR
• CERT-sponsored add-on to RT from Best Practical - open source, with support available
• Queues helped define process
• Manual insertion still required, but contributions existed for e-mail ticket creation - the light!!

CECR v2.0
• Net Results
  – Extendable framework for ever-changing landscape
  – Force multiplier allowing handlers to worry about more significant events
  – 24x7x365 monitoring of known issues
  – Automated tracking of events - allows for statistics

Questions?
