Planting trees in random graphs and finding them

Planting trees in random graphs (and finding them back)
Laurent Massoulié
Joint work with Ludovic Stephan & Don Towsley

Inference problems with planted structure
Classical examples: community detection, planted clique detection, planted dense subgraph detection
Typical objectives:
- Detection (planted structure present or not)
- Recovery (estimation of planted structure)
Recurring phenomena: 3 regimes, or phases:
- Information-theoretic (IT) impossibility
- Computational hardness though IT-feasible
- Computationally easy
Phase boundaries the same for both detection and reconstruction

‘Network Security’ scenario
Attackers may be “plotting” or not
Plotting communications among them
Goals:
- Infer from observed communication graph whether attack under way (detection)
- If attack detected, identify attackers (reconstruction)

Model
Erdős-Rényi graph
Under attack: augmented by a tree on K nodes chosen uniformly at random
Focus on the simplest tree, i.e. a path on K nodes
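The model above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names, the edge probability `p`, and the sizes `n` and `k` are hypothetical, since the slide does not state its parameters. A path on `k` uniformly chosen nodes (matching the K-path notation used on later slides) is added on top of an Erdős-Rényi sample.

```python
import random
from itertools import combinations


def sample_er_graph(n, p, rng):
    """Edge set of an Erdős-Rényi graph G(n, p) on nodes 0..n-1 (pairs with u < v)."""
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < p}


def plant_path(edges, n, k, rng):
    """Add a path on k nodes chosen uniformly at random; return (new edges, path)."""
    path = rng.sample(range(n), k)  # k distinct nodes in uniformly random order
    planted = {(min(u, v), max(u, v)) for u, v in zip(path, path[1:])}
    return edges | planted, path


rng = random.Random(0)
n, k, p = 200, 10, 2.0 / 200  # hypothetical values, not taken from the slides
graph = sample_er_graph(n, p, rng)       # "no attack" observation
attacked, path = plant_path(graph, n, k, rng)  # "under attack" observation
```

Planted edges that already exist in the random graph simply coincide with them, so the attacked graph has at most `k - 1` extra edges.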

Outline • Phase diagram for paths • Other trees

Sparse phase

First moment method: no K-path in G under the no-attack distribution, hence poly-time detection and reconstruction
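As a hedged illustration of the first moment method, the expected number of paths on K nodes in G(n, p) is n(n-1)...(n-K+1) p^(K-1) / 2, and by Markov's inequality this bounds the probability that any K-path exists. The G(n, p) parameterization and the function name below are assumptions; the slide does not state its own parameters.

```python
def expected_k_paths(n, p, k):
    """Expected number of (not necessarily induced) paths on k nodes in G(n, p).

    There are n(n-1)...(n-k+1) ordered choices of k distinct nodes, each
    undirected path is counted twice (once per direction), and each of its
    k-1 edges is present independently with probability p.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    value = n / 2.0
    for i in range(1, k):
        value *= (n - i) * p  # multiply incrementally to avoid huge integers
    return value


# Hypothetical sparse example: average degree 0.5, path length 30.
bound = expected_k_paths(1000, 0.5 / 1000, 30)
```

When this expectation is far below 1, a K-path is unlikely to appear without planting, so finding one is strong evidence of an attack.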

The argument for undetectability
Likelihood ratio between distributions without and with attack

Bounding Markov chain

The argument for detectability
1) Too many edges under attack
2) Counts of small connected components enable consistent estimation of the underlying Erdős-Rényi parameter
Arguments hold for any connected K-graph, not just lines and trees
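The two statistics named in this argument, the edge count and the counts of small connected components, can be computed as in the sketch below. The function name and graph representation are assumptions, and the slide does not say how the statistics are thresholded, so no decision rule is included.

```python
from collections import Counter, deque


def graph_statistics(n, edges):
    """Return (number of edges, Counter mapping component size -> count)."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    seen = set()
    sizes = Counter()
    for start in range(n):
        if start in seen:
            continue
        # Breadth-first search over the component containing `start`.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            x = queue.popleft()
            size += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        sizes[size] += 1
    return len(edges), sizes


num_edges, sizes = graph_statistics(6, {(0, 1), (1, 2), (3, 4)})
```

Here `sizes[1]` is the number of isolated nodes, `sizes[2]` the number of isolated edges, and so on.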

The argument for reconstruction impossibility
Optimal overlap between estimated and planted path: Maximum a Posteriori, i.e. nodes on largest number of K-paths
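To make the "nodes on largest number of K-paths" rule concrete, here is a brute-force sketch that counts, for every node, how many paths on k nodes pass through it. It is exponential in k and meant for toy graphs only; the function names are hypothetical, and it assumes k is at least 2.

```python
from collections import Counter


def k_path_counts(n, edges, k):
    """For each node, count the undirected simple paths on k nodes containing it."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    counts = Counter()

    def extend(path, on_path):
        if len(path) == k:
            if path[0] < path[-1]:  # keep one orientation of each undirected path
                counts.update(path)
            return
        for nxt in adj[path[-1]]:
            if nxt not in on_path:
                path.append(nxt)
                on_path.add(nxt)
                extend(path, on_path)
                path.pop()
                on_path.remove(nxt)

    for start in range(n):
        extend([start], {start})
    return counts


def top_nodes(counts, k):
    """The k nodes lying on the most k-paths (ties broken arbitrarily)."""
    return [node for node, _ in counts.most_common(k)]


# Toy graph: path 0-1-2-3 with an extra pendant edge 1-4.
counts = k_path_counts(5, {(0, 1), (1, 2), (2, 3), (1, 4)}, 3)
```

In the toy graph the 3-paths are 0-1-2, 1-2-3, 0-1-4 and 2-1-4, so node 1 lies on all four of them.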

The argument for reconstruction impossibility
“Lures”: can construct confounding segment s.t.

More generally: construct T symmetric paths

Other trees: stars
• Stars hard to hide! Threshold at
• Detection and reconstruction easy: inspection of degrees
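Degree inspection is simple to sketch: a planted star shows up as one node of unusually high degree. The code below only finds the maximum-degree node and its neighbours; the function name is hypothetical, and the degree level at which one would declare a star is not given here, so it is left out.

```python
from collections import Counter


def max_degree_node(edges):
    """Return (node, degree, neighbours) for a node of maximum degree."""
    if not edges:
        raise ValueError("graph has no edges")
    degree = Counter()
    neighbours = {}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    node, deg = degree.most_common(1)[0]
    return node, deg, neighbours[node]


# Toy graph: a star centred at node 0 with five leaves, plus one unrelated edge.
star_edges = {(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (6, 7)}
centre, deg, leaves = max_degree_node(star_edges)
```

The candidate centre and its neighbours give an estimate of the planted star in one pass over the edges.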

D-trees

Conclusions
Phase diagram for line detection & reconstruction:
- No computationally hard phase
- Reconstruction impossible, owing to the presence of too many copies of the planted structure
Planted subgraphs beyond cliques & dense subgraphs:
- Phase diagram for D-regular trees? More general planted graphs?
- What triggers hard phases?

Thanks!
