
A Taxonomy of Botnet Structures
David Dagon, Guofei Gu, Chris Lee, and Wenke Lee
Georgia Institute of Technology
In Proceedings of the 23rd Annual Computer Security Applications Conference (ACSAC'07), Miami Beach, FL, December 2007.

Outline
• Purpose and Goal
• Key Metrics for Botnet Structure
• Botnet Network Models
• Taxonomy-Driven Botnet Response Strategies
• Empirical Analysis - Bandwidth
• Conclusion

Purpose and Goal
• Assist the defender in identifying possible types of botnets.
• Describe key properties of botnet classes, so researchers may focus their efforts on beneficial response technologies.
• The taxonomy in this paper is driven by possible responses, and not detection.

Key Metrics for Botnet Structure

Botnet Effectiveness - Giant
• S: the “giant” component of the botnet, i.e., the largest connected (or online) portion of the graph
• The giant component lets us directly measure the damage potentially caused by certain botnet functions
• Some infected victims may not always be reachable by the botmaster, e.g., because of diurnal variations
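As an illustration of the giant component metric, the following minimal sketch finds the largest connected portion of a small overlay graph by breadth-first search. The adjacency-dict representation, the function name `giant_component_fraction`, the 6-bot example, and the choice to report S as a fraction of all bots are assumptions made for this sketch, not details from the paper.

```python
from collections import deque


def giant_component_fraction(adjacency):
    """Return the size of the largest connected component divided by the node count.

    `adjacency` maps each node to the set of its neighbors (undirected graph).
    """
    unvisited = set(adjacency)
    largest = 0
    while unvisited:
        start = unvisited.pop()
        queue = deque([start])
        size = 1
        # Breadth-first search over one connected component.
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    queue.append(neighbor)
                    size += 1
        largest = max(largest, size)
    return largest / len(adjacency) if adjacency else 0.0


# Hypothetical 6-bot overlay: bots 0-3 form one component, bots 4-5 another.
overlay = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: {5}, 5: {4}}
print(giant_component_fraction(overlay))
```

Here the giant component holds 4 of the 6 bots; offline bots could be modeled by deleting their nodes before the call.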


Botnet Effectiveness – Bandwidth (1/2)
• Average bandwidth, B, is the average available bandwidth that a botmaster could generate from the various bots under ideal circumstances.
• B varies with
  – the distribution of bandwidth available to each member of the botnet,
  – the probability that any victim is “on-line” at any given time,
  – and the amount of bandwidth already being consumed by the victims themselves (e.g., for normal use).

Botnet Effectiveness – Bandwidth (2/2)
• Three types of bots according to their transit categories:
  – Type 1: Modems
  – Type 2: DSL/cable
  – Type 3: “high speed” network
• P_i: probability that a bot belongs to type i, for i ∈ {1, 2, 3}
• M_i: maximum network bandwidth within each type
• A_i: average normal usage bandwidth within each type
• W: online time probability vector of the above 3 types of bots
  – Online time of type 1/type 2/type 3 = 2/6/24 hrs
  – W = [2/32, 6/32, 24/32]
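The quantities above can be combined into a single estimate of B. The sketch below assumes the combination B = Σ_i (M_i - A_i) · P_i · W_i, i.e., the spare bandwidth of each type weighted by how common and how often online that type is; the slide text does not reproduce the formula, so treat this form as an assumption. Only W comes from the slide; the values of M, A, and P are hypothetical placeholders.

```python
def average_bandwidth(max_bw, normal_use, type_prob, online_weight):
    """Assumed estimate: sum over bot types of (M_i - A_i) * P_i * W_i."""
    return sum(
        (m - a) * p * w
        for m, a, p, w in zip(max_bw, normal_use, type_prob, online_weight)
    )


# W is taken from the slide: online time of 2, 6, and 24 hours, normalized by 32.
W = [2 / 32, 6 / 32, 24 / 32]

# Hypothetical values (Kbps and probabilities), chosen only for illustration.
M = [56.0, 1000.0, 10000.0]   # maximum bandwidth per type
A = [6.0, 200.0, 2000.0]      # average normal usage per type
P = [0.2, 0.5, 0.3]           # share of bots in each type

print(average_bandwidth(M, A, P, W))
```

With these placeholder numbers the high-speed type contributes almost all of the estimate, because it has both the largest spare bandwidth and the largest online weight.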


Botnet Efficiency – Diameter (1/2)
• We can express the communication efficiency of a network by its diameter l, the average geodesic length of the network
• Here, the inverse geodesic length l^{-1} is used instead of l
• l^{-1} ranges from 0 (no edges) to 1 (fully connected)
• l^{-1} is measured on the overlay network of bot-to-bot connections created by the malware, not on the physical topology of the Internet.
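One common way to compute the inverse geodesic length is to average 1/d(u, v) over all ordered pairs of distinct nodes, counting unreachable pairs as 0. The sketch below assumes that definition (it matches the stated range of 0 for no edges and 1 for a fully connected graph); the function names and the adjacency-dict representation are choices made for this example.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop counts from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def inverse_geodesic_length(adjacency):
    """Average of 1/d(u, v) over ordered pairs u != v; unreachable pairs add 0."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    total = 0.0
    for u in adjacency:
        for v, d in bfs_distances(adjacency, u).items():
            if v != u:
                total += 1.0 / d
    return total / (n * (n - 1))


# A 3-bot chain 0 - 1 - 2: four pairs at distance 1 and two at distance 2.
chain = {0: {1}, 1: {0, 2}, 2: {1}}
print(inverse_geodesic_length(chain))
```

Using the inverse avoids the infinite distances that a disconnected botnet would otherwise produce, since a disconnected pair simply contributes nothing.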

Botnet Efficiency – Diameter (2/2)
• l^{-1} is also relevant to robustness because, with each message passed through a botnet, there is a probability of detection or failure
• Assume that bots u and v are connected through n possible paths, P_1, …, P_n
• Each node in a path can be recovered (cleaned) with probability α
• ε_i is the chance that path P_i is corrupted, quarantined, or blocked
• While bots u and v are connected through some path with probability 1 - (1 - α)^n, the chance of failure increases with α

Botnet Robustness – Local Transitivity
• Local transitivity measures the likelihood that nodes appear in “triad” groups.
• Γ_v: neighborhood of vertices around node v
• E_v: the number of edges in Γ_v
• k_v: the number of vertices in Γ_v
• γ: clustering coefficient, ranging over [0, 1], with 1 representing a complete mesh
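The definitions above suggest the usual clustering-coefficient computation: the number of links among a node's neighbors divided by the number of links those neighbors could have, k_v (k_v - 1) / 2. The sketch below assumes that formula, assumes Γ_v excludes v itself, and assigns 0 to nodes with fewer than two neighbors; the slide text does not spell out these details.

```python
def local_transitivity(adjacency, v):
    """E_v / (k_v choose 2) for node v, with Γ_v taken as v's neighbors (v excluded)."""
    neighbors = adjacency[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here: no triad is possible
    # Each link inside the neighborhood is seen from both endpoints, hence // 2.
    links = sum(len(adjacency[a] & neighbors) for a in neighbors) // 2
    return links / (k * (k - 1) / 2)


def average_transitivity(adjacency):
    """Mean of the local values over all nodes."""
    if not adjacency:
        return 0.0
    return sum(local_transitivity(adjacency, v) for v in adjacency) / len(adjacency)


# Hypothetical overlay: a triangle 0-1-2 with one extra bot 3 hanging off bot 0.
overlay = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
for bot in overlay:
    print(bot, local_transitivity(overlay, bot))
```

In this example bots 1 and 2 score 1 because their two neighbors are linked to each other, while bot 0 scores 1/3 because only one of its three possible neighbor pairs is linked.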

Botnet Network Models
• Erdős-Rényi Random Graph Models
• Watts-Strogatz Small World Models
• Barabási-Albert Scale Free Models
• P2P Models
  – Structured (CAN, CHORD) → Random
  – Unstructured (Gnutella, Kazaa) → Scale Free
[Figure panels: Random Graph, Small World, Scale Free]

Erdős-Rényi Random Graph Models
• In a random graph, each node is connected with equal probability p to each of the other N-1 nodes.
• Such networks have a logarithmically increasing l^{-1}
• The chance that a bot has a degree of k follows the binomial distribution P(k) = C(N-1, k) p^k (1-p)^{N-1-k}
• Each bot must carry a list of all members in the network to form the random graph
• Botmasters may select 〈k〉 ≤ 10, so that bots appear to have flow behavior similar to many peer-to-peer applications
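A small simulation can make the random graph model concrete. The sketch below links every pair of bots independently with probability p and compares the observed mean degree with the target 〈k〉; it also evaluates the standard binomial degree distribution for such graphs. The network size of 500, the seed, and all function names are assumptions chosen for illustration.

```python
import math
import random


def erdos_renyi(n, p, rng):
    """Link every pair of the n nodes independently with probability p."""
    adjacency = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency


def degree_probability(n, p, k):
    """Binomial chance that a node has exactly k of the n - 1 possible links."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


rng = random.Random(0)          # fixed seed so the run is repeatable
n, mean_degree = 500, 10        # hypothetical size; <k> = 10 as on the slide
p = mean_degree / (n - 1)
graph = erdos_renyi(n, p, rng)
observed_mean = sum(len(neighbors) for neighbors in graph.values()) / n
print("observed mean degree:", observed_mean)
print("P(k = 10):", degree_probability(n, p, 10))
```

Choosing p = 〈k〉 / (N - 1) is what ties the link probability to the average degree a botmaster wants each bot to show.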

Watts-Strogatz Small World Models
• A regional network of local connections is created in a ring, within a range r.
• Each bot is further connected with probability P to nodes on the opposite side of the ring through a “shortcut”.
• Typically, P is quite low, and the resulting network has a length l ≈ log N
• To frustrate remediation and recovery, r is typically small (r ≈ 5).
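The construction described above can be sketched as a ring lattice plus occasional shortcuts. This sketch makes several simplifying assumptions that are not in the slides: r is treated as the number of local links per bot (split evenly between both directions, so an odd r is rounded down), and a shortcut goes to a uniformly random bot that is not already a neighbor rather than strictly to the opposite side of the ring.

```python
import random


def small_world(n, r, shortcut_prob, rng):
    """Ring lattice with r local links per node, plus random shortcuts."""
    half = r // 2  # local links in each direction around the ring
    adjacency = {v: set() for v in range(n)}
    for v in range(n):
        for offset in range(1, half + 1):
            w = (v + offset) % n
            adjacency[v].add(w)
            adjacency[w].add(v)
    for v in range(n):
        if rng.random() < shortcut_prob:
            # Simplification: any non-neighbor may be the far end of the shortcut.
            candidates = [w for w in range(n) if w != v and w not in adjacency[v]]
            if candidates:
                w = rng.choice(candidates)
                adjacency[v].add(w)
                adjacency[w].add(v)
    return adjacency


rng = random.Random(3)
ring_only = small_world(20, 4, 0.0, rng)
with_shortcuts = small_world(20, 4, 0.1, rng)
print(sorted(len(neighbors) for neighbors in ring_only.values()))
print(sorted(len(neighbors) for neighbors in with_shortcuts.values()))
```

Without shortcuts every bot has exactly r links, which is why removing any one bot costs the network the same number of links.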

Barabási-Albert Scale Free Models
• Distinguished by their degree distribution: the distribution of k decays as a power law.
• Targeted responses can select the high degree nodes (hubs), leading to dramatic decay in the operation of the network.
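To illustrate why targeting hubs matters, the sketch below grows a network by preferential attachment (each new bot links to m existing bots chosen in proportion to their current degree) and then compares the links that survive a targeted removal of the highest-degree bots with those that survive a random removal of the same size. The size of 1,000 bots, m = 2, the removal budget of 50, and the seed are hypothetical values chosen for this example.

```python
import random


def scale_free(n, m, rng):
    """Preferential attachment: each new node links to m degree-weighted targets."""
    # Seed network: a complete graph on m + 1 nodes.
    adjacency = {v: set() for v in range(m + 1)}
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Each node appears in `endpoints` once per link, so a uniform draw
    # from this list picks a node with probability proportional to its degree.
    endpoints = [u for u in adjacency for _ in adjacency[u]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adjacency[new] = set()
        for target in targets:
            adjacency[new].add(target)
            adjacency[target].add(new)
            endpoints.extend((new, target))
    return adjacency


def remove_nodes(adjacency, victims):
    """Copy of the graph without the victims and without links to them."""
    victims = set(victims)
    return {v: nb - victims for v, nb in adjacency.items() if v not in victims}


def link_count(adjacency):
    return sum(len(neighbors) for neighbors in adjacency.values()) // 2


rng = random.Random(7)
botnet = scale_free(1000, 2, rng)
budget = 50  # number of bots a responder can clean
hubs = sorted(botnet, key=lambda v: len(botnet[v]), reverse=True)[:budget]
bystanders = rng.sample(sorted(botnet), budget)
targeted_links = link_count(remove_nodes(botnet, hubs))
random_links = link_count(remove_nodes(botnet, bystanders))
print("links before:", link_count(botnet))
print("links after targeted removal:", targeted_links)
print("links after random removal:", random_links)
```

The same cleanup budget removes far more links when it is spent on hubs, which is the intuition behind targeted responses against scale free botnets.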

TAXONOMY-DRIVEN BOTNET RESPONSE STRATEGIES

Random Graph and P2P Models (1/3)
• Botnet size = 5K

Random Graph and P2P Models (2/3)
• Random loss will not diminish the number of triads in the botnet

Random Graph and P2P Models (3/3)
• Structured P2P networks in fact have a constant k (often set equal to log N, where N is the size of the network), so they are slightly more stable than purely random networks.
• Thus, changes in γ, S, and l^{-1} are constant with the loss of random nodes.
• Botnets with random topologies are therefore extremely resilient.
• We speculate that the most effective response strategies will include technologies to remove large numbers of nodes at once.
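The resilience claim can be explored with a small simulation: build a random graph, remove a growing fraction of bots at random, and check how much of the surviving network still sits in one connected piece. This is only a sketch under assumed parameters (1,000 bots rather than 5K to keep the run short, 〈k〉 = 10, a fixed seed), and it does not reproduce the paper's experiments.

```python
import random


def random_graph(n, mean_degree, rng):
    """Erdős-Rényi style graph with link probability mean_degree / (n - 1)."""
    p = mean_degree / (n - 1)
    adjacency = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adjacency[u].add(v)
                adjacency[v].add(u)
    return adjacency


def largest_component_share(adjacency):
    """Fraction of the remaining nodes that lie in the largest connected component."""
    unvisited = set(adjacency)
    largest = 0
    while unvisited:
        stack = [unvisited.pop()]
        size = 1
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor in unvisited:
                    unvisited.remove(neighbor)
                    stack.append(neighbor)
                    size += 1
        largest = max(largest, size)
    return largest / len(adjacency) if adjacency else 0.0


def remove_random(adjacency, fraction, rng):
    """Drop a random fraction of the nodes together with their links."""
    count = int(fraction * len(adjacency))
    victims = set(rng.sample(sorted(adjacency), count))
    return {v: nb - victims for v, nb in adjacency.items() if v not in victims}


rng = random.Random(1)
botnet = random_graph(1000, 10, rng)
for fraction in (0.0, 0.1, 0.3, 0.5):
    survivors = remove_random(botnet, fraction, rng)
    print(fraction, round(largest_component_share(survivors), 3))
```

Even after half of the bots are cleaned, the survivors in a graph like this tend to remain almost entirely connected, which is consistent with the suggestion that responses need to remove many nodes at once.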

Small World Models
• The average degree in a small world is 〈k〉 ≈ r, the number of local links in the graph.
• Thus, random and targeted responses to a small world botnet produce the same result: the loss of r links with each removed node.
• The key metrics for botnets, S, γ, and l^{-1}, all decay at a constant rate in a small world.
• We presumed that shortcut links in a small world botnet are not used, but even if present, they would not affect γ with r ≥ 4.

Scale Free and P2P Models
• The “core” of a scale free botnet is the number of high-degree central nodes, i.e., the routers and hubs used to coordinate the soldier bots.
• C: core size, for a 5K scale free botnet

[Figure: Changes in link count for leaves in a scale free network]

Unstructured P2P bot – Nugache (1/2)
• Nugache uses a link-encrypted, peer-to-peer file-sharing protocol, WASTE, and uses several hard-coded IP addresses to request a list of peers.
• WINE is used to emulate many Windows servers on a Unix server.

Unstructured P2P bot – Nugache (2/2)
• Victims are counted by the SYN+ACK connections observed by a batch of WINE nodes, but new victims cannot be told apart from old ones.

Empirical Analysis - Bandwidth
• Botnet 1: 7,326/50,000, in February 2005
• Botnet 2: 3,391/48,000, in January 2006
• The tmetric tool is used to perform the bandwidth estimation.
• tmetric essentially uses successively larger probes to estimate the bandwidth to a host.

[Figure: Botnet 1]

[Figure: Botnet 2]

• Final average bandwidths
  – Botnet 1 (50K): 22.7164 Kbps
  – Botnet 2 (48K): 14.6378 Kbps
• We speculate that responders might significantly reduce a botnet’s DDoS potential by targeting the “high-speed” members of a botnet.

Conclusion
• Demonstrated the utility of this taxonomy and the proposed metrics by using both simulation and real-world botnet experiments.
• Random network models give botnets considerable resilience. Such formations resist both random and targeted responses.
• Targeted removals on scale free botnets offer the best response.