International Secure Systems Lab
Vienna University of Technology
SIDAR Graduierten-Workshop über Reaktive Sicherheit 2009
Detecting Bots with Automatically Generated Network Signatures
Peter Wurzinger, Leyla Bilge, Thorsten Holz,
Jan Goebel, Christopher Kruegel, Engin Kirda
International Secure Systems Lab,
Vienna University of Technology, {pw,tho}@seclab.tuwien.ac.at
Institute Eurecom, France, {bilge,kirda}@eurecom.fr
University of Mannheim, goebel@informatik.uni-mannheim.de
University of California, Santa Barbara, chris@cs.ucsb.edu
Outline
 Introduction
 Detection models - overview
 Generating detection models
 Analysis and evaluation
The Botnet Threat
 Tool of choice for Internet criminals
 Useful for many purposes:
▫ Spam
▫ DDoS
▫ Fast Flux
 Extremely powerful
 Simple to deploy and maintain
The Botnet Threat
 Network of compromised computers
 Remotely operated by botmaster
 Command and control channel (C&C)
▫ IRC: classic, Agobot
▫ HTTP: more stealthy, Bobax
▫ P2P: robust, Storm worm
Botnet Counter Measures
 Host-based
▫ Anti-virus software
▫ Relies on binary signature database (polymorphism)
▫ Host installation required
 Network-based
▫ Intrusion detection
▫ No requirements from end-user
▫ Relies on (hand-crafted) network signatures
Goal of our Work
 Network-based botnet detection
▫ Deployed on gateway
▫ Transparent to the user
 Automatically generated signatures
▫ No costly work has to be performed by human experts
▫ Signatures for new botnets can be added easily
 C&C protocol agnostic
▫ Signatures can be generated regardless of C&C protocol
▫ No expert knowledge about a specific botnet is required
Detection Models
 Characterisation of bot traffic using two phases
▫ Phase 1: Bot receives command
▫ Phase 2: Bot executes command
 Both phases are visible in network traffic
 Example:
▫ Phase 1 (command): string "advscan" is transmitted to host X
▫ Phase 2 (response): X transmits many SYN packets to different recipients
Generating Detection Models
Generating Detection Models - Overview
 Input: Network traces of similar bot programs
 Find sudden changes in the bot's network behavior
 These changes are most likely caused by a previously received command
 Characterize traffic content before the change -> command model (phase 1)
 Characterize network behavior after the change -> response model (phase 2)
Obtaining Bot Network Traces
 Assemble a "bot family"
▫ Set of similar sample bot programs
▫ Similar C&C mechanism
▫ Not necessarily from same botnet
 Execute samples in a controlled environment
▫ Internet access open, so C&C communication works
▫ Run-time: several days
▫ Goal: collect command/response pairs
Locating Bot Behavior Changes
 Identify points in time where the bot's network behavior changes suddenly
 Assumptions
▫ The change is due to a previously received command
▫ The new network behavior is a manifestation of a bot response
▫ The command (data that is directly related to the bot's action) was received within a restricted time interval before the change
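The last assumption can be sketched as a simple time window: for each detected change, collect the payload received shortly before it. This is only an illustration; the function name, the `(timestamp, payload)` packet format, and the 30-second window are assumptions and do not come from the slides.

```python
def snippets_before(change_times, packets, window=30.0):
    """Collect the payload seen in the `window` seconds before each change.

    packets: list of (timestamp, payload_bytes) tuples (hypothetical format).
    Returns one concatenated byte string per change point.
    """
    result = []
    for t in change_times:
        # Keep only packets inside the half-open interval [t - window, t).
        result.append(b"".join(p for ts, p in packets if t - window <= ts < t))
    return result
```

Each returned snippet is a candidate location for the command that triggered the corresponding response.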
Locating Bot Behavior Changes
 Treat the traffic as a time series
 Partition it into discretization intervals of equal length
 Each interval is inspected for a set of low-level network features:
▫ Number of packets
▫ Cumulative size of packets
▫ Number of different IPs contacted
▫ Number of different ports contacted
▫ Number of non-ASCII bytes in payload
▫ Number of UDP packets
▫ Number of HTTP packets (Port 80)
▫ Number of SMTP packets (Port 25)
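A minimal sketch of how the eight features above could be computed per interval. The `Packet` record, its field names, the use of payload length as packet size, and the 10-second interval length are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Packet:
    # Hypothetical packet record; field names are illustrative assumptions.
    timestamp: float   # seconds since the start of the trace
    dst_ip: str
    dst_port: int
    protocol: str      # "TCP" or "UDP"
    payload: bytes

def interval_features(packets, interval_len=10.0):
    """Group packets into equal-length intervals and compute one feature dict per interval."""
    buckets = defaultdict(list)
    for p in packets:
        buckets[int(p.timestamp // interval_len)].append(p)
    if not buckets:
        return []
    features = []
    for idx in range(max(buckets) + 1):   # include empty intervals as all-zero rows
        pkts = buckets.get(idx, [])
        features.append({
            "packets": len(pkts),
            "bytes": sum(len(p.payload) for p in pkts),   # payload size as a stand-in
            "distinct_ips": len({p.dst_ip for p in pkts}),
            "distinct_ports": len({p.dst_port for p in pkts}),
            "non_ascii_bytes": sum(b > 127 for p in pkts for b in p.payload),
            "udp_packets": sum(p.protocol == "UDP" for p in pkts),
            "http_packets": sum(p.dst_port == 80 for p in pkts),
            "smtp_packets": sum(p.dst_port == 25 for p in pkts),
        })
    return features
```

The result is one feature vector per discretization interval, which is the time series that the change point detection operates on.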
Locating Bot Behavior Changes
 Change point detection
 Modified variant of CUSUM algorithm
 We now know the interesting points in time:
▫ The command is in the traffic before a change point
▫ The response is in the traffic after it
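For orientation, here is a sketch of a textbook two-sided CUSUM detector applied to a single feature series. It is not the modified variant used in this work, whose details are not given on the slides; the `drift` and `threshold` parameters and the reset-after-detection rule are assumptions.

```python
def cusum_change_points(series, drift=0.5, threshold=5.0):
    """Return indices at which a two-sided CUSUM statistic exceeds the threshold.

    The reference value is the running mean of the current segment. After each
    detected change, the sums are reset and a new segment starts at that index.
    """
    change_points = []
    pos = neg = 0.0          # cumulative sums for upward and downward shifts
    mean, n = 0.0, 0
    for i, x in enumerate(series):
        if n == 0:
            mean, n = float(x), 1
            continue
        dev = x - mean
        pos = max(0.0, pos + dev - drift)
        neg = max(0.0, neg - dev - drift)
        if pos > threshold or neg > threshold:
            change_points.append(i)
            pos = neg = 0.0
            mean, n = float(x), 1      # start a new segment at the change
        else:
            n += 1
            mean += (x - mean) / n     # incremental update of the segment mean
    return change_points
```

For example, a packet-count series that jumps from about 1 to about 30 and back again yields one change point at the jump and one at the drop.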
Response Model (Phase 2)
 Generalisation steps:
1. Description of network behavior in one discretization interval
2. Description of network behavior of the discretization intervals 
that form one bot response
3. Description of a class of bot responses
 We already have step 1: the network features
Response Model (Phase 2)
 Generalization to describe sequence of discretization 
intervals that form one bot response
 Each period between two detected change points 
exhibits consistent bot network behavior
 This consistent behavior represents one bot response
 Behavior profile: average values of the network 
features per discretization interval
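The behavior profile can be sketched as a per-feature average over each segment of intervals. The function name and the dict-based feature format are assumptions; as a simplification, this sketch also treats the stretch before the first change point and after the last one as segments.

```python
def behavior_profiles(features, change_points):
    """Average each feature over the intervals between consecutive change points.

    features: list of dicts, one per discretization interval (hypothetical format).
    change_points: interval indices at which a behavior change was detected.
    """
    bounds = [0] + sorted(change_points) + [len(features)]
    profiles = []
    for start, end in zip(bounds, bounds[1:]):
        segment = features[start:end]
        if not segment:          # e.g. a change point at index 0
            continue
        profiles.append(
            {k: sum(f[k] for f in segment) / len(segment) for k in segment[0]}
        )
    return profiles
```

Each resulting profile summarizes one period of consistent behavior, that is, one candidate bot response.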
Response Model (Phase 2)
 Generalization to describe a class of bot responses
 Clustering of similar bot responses based on behavior 
profiles
 Each cluster represents one type of bot behavior
 The response model (phase 2) is the average of all 
behavior profiles of a cluster
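The slides do not name the clustering algorithm, so the following sketch swaps in a simple greedy, distance-threshold clustering purely for illustration. Profiles are plain numeric vectors here; `max_distance` is a hypothetical parameter, and in practice the features would likely need normalization because their scales differ.

```python
import math

def centroid(members):
    """Component-wise average of a list of equal-length vectors."""
    return [sum(col) / len(members) for col in zip(*members)]

def cluster_profiles(profiles, max_distance=5.0):
    """Assign each profile to the first cluster whose centroid is close enough."""
    clusters = []
    for p in profiles:
        for members in clusters:
            if math.dist(p, centroid(members)) <= max_distance:
                members.append(p)
                break
        else:
            clusters.append([p])   # no cluster was close enough: open a new one
    return clusters

def response_models(profiles, max_distance=5.0):
    """One response model per cluster: the average of its behavior profiles."""
    return [centroid(c) for c in cluster_profiles(profiles, max_distance)]
```

With two low-activity profiles and two scan-like profiles as input, this produces two response models, one per behavior type.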
Command Model (Phase 1)
 We have response models, now what are the 
corresponding command models?
 Reuse clusters of similar bot responses
 Inspect traffic that precedes responses in same cluster
 Extract similarities
Command Model (Phase 1)
 Find token sequences in the network traffic that are 
characteristic for triggering the observed response
 Tokens can consist of:
▫ the command itself
▫ frequently used parameters
▫ artefacts from the surrounding C&C protocol
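One simple way to approximate token extraction is to keep the maximal byte strings that occur in every traffic snippet preceding the responses of one cluster. This brute-force sketch is an assumption for illustration only; the function name, the `min_len` parameter, and the sample IRC lines in the usage note are hypothetical, and the actual method may differ.

```python
def common_tokens(snippets, min_len=4):
    """Return maximal substrings of at least min_len bytes found in every snippet."""
    if not snippets:
        return []
    base = min(snippets, key=len)
    found = set()
    for start in range(len(base)):
        # Try the longest candidate first, so only the longest match per start is kept.
        for end in range(len(base), start + min_len - 1, -1):
            cand = base[start:end]
            if all(cand in s for s in snippets):
                found.add(cand)
                break
    # Drop tokens that are fully contained in a longer token.
    maximal = [t for t in found if not any(t != o and t in o for o in found)]
    return sorted(maximal, key=base.find)   # order of appearance in the shortest snippet
```

For two made-up IRC lines such as `:bot!x PRIVMSG #chan :.advscan lsass 100` and `:op!y PRIVMSG #room :.advscan dcom 50`, the surviving tokens are the protocol artefact ` PRIVMSG #` and the command ` :.advscan `.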
Detection Model Summary
 Phase 1 – command
▫ Token sequence
▫ Network content that is characteristic to show up before a 
certain bot response begins
 Phase 2 – response
▫ Description of the response using network features
▫ Network-level characterization of a type of bot response
Evaluation
 Generated detection models for
▫ various IRC bots
▫ Bobax
▫ Storm worm
 Translated them into Bro NIDS policy script
Example
signature irc {
  dst-ip == local_nets
  payload /.* PRIVMSG #.* :\.asc .*5 0 .*/
}
# DIFFERENT IPS > 20 (within 50 seconds)
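To make the two-phase logic of this example concrete, the following Python sketch mimics it outside of Bro: a host is flagged only if it first receives a payload matching the command pattern and then contacts more than 20 distinct IPs within 50 seconds. The event tuple format and the `LOCAL_NET` range are assumptions; this is not the generated Bro policy script.

```python
import ipaddress
import re

COMMAND_RE = re.compile(rb".* PRIVMSG #.* :\.asc .*5 0 .*")   # phase 1 pattern from the example
LOCAL_NET = ipaddress.ip_network("192.168.0.0/16")            # assumed stand-in for local_nets
MAX_DISTINCT_IPS = 20
WINDOW_SECONDS = 50.0

def detect(events):
    """events: (timestamp, src_ip, dst_ip, payload) tuples sorted by timestamp.

    Returns the set of local hosts that matched both phases.
    """
    armed = {}       # host -> (time the command was seen, set of contacted IPs)
    alerts = set()
    for ts, src, dst, payload in events:
        # Phase 1: command pattern sent to a local host.
        if ipaddress.ip_address(dst) in LOCAL_NET and COMMAND_RE.search(payload):
            armed[dst] = (ts, set())
        # Phase 2: count distinct destinations of an armed host inside the window.
        if src in armed:
            start, contacted = armed[src]
            if ts - start > WINDOW_SECONDS:
                del armed[src]
                continue
            contacted.add(dst)
            if len(contacted) > MAX_DISTINCT_IPS:
                alerts.add(src)
    return alerts
```

Requiring both phases is what keeps a scan without a preceding command, or a command without a following scan, from raising an alert.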
Evaluation – Detection Performance
 Evaluation of our generated signatures using cross-
validation on bot network traces
 Detection rate: 88%
Evaluation – Precision
 Real-world deployment on well maintained networks
 No bot infections expected
 Student residence network
▫ /21 range, densely populated
▫ observation period: 55 days
▫ no false positives
 University network
▫ /20 range, medium populated
▫ observation period: 102 days
▫ only 11 IPs falsely raised an alert
Conclusion
 Two phases: command and response
 Our system produces botnet detection models
▫ for network-based detection
▫ without expert knowledge about specific botnets
▫ automatically
 Deployment on gateway, end-user not involved
 Effective detection with few false positives
Publication
This work is also presented at ESORICS 2009:
"Automatically Generating Models for Botnet Detection"
Check out the paper at http://www.iseclab.org
Questions?
Thank you for your attention! 
I'd be happy to answer all of your questions!