Erdős–Rényi Subgraph Pair Model

Updated 15 January 2026

The Erdős–Rényi Subgraph Pair Model is a formal framework that couples random graphs via subgraph extraction and vertex correspondence for network matching.
It establishes sharp information-theoretic phase transitions for both exact and partial recovery, guiding practical and adversarial network de-anonymization.
The model underpins studies in graph alignment and community detection using methods like brute-force MAP estimators and tail degree signature algorithms.

The Erdős–Rényi Subgraph Pair Model is a formal framework for studying random graphs coupled through subgraph extraction and vertex correspondence, and for quantifying the information-theoretic limits of subgraph alignment and network matching. It provides the mathematical basis for statistical and computational analyses of network alignment, planted subgraph recovery, and correlated random graph pairs. The framework is central to the rigorous treatment of exact and partial graph alignment, in particular for assessing whether recovery is feasible and optimal in both practical and adversarial regimes (Shiu et al., 8 Jan 2026, Du, 17 Feb 2025, Bozorg et al., 2019).

1. Formal Model Definitions

1.1 Subgraph-Pair Model (Alignment/Recovery Setting)

Let $n\in\mathbb{N}$, $m=m(n)<n$, and $p=p(n)\in[0,1]$. A base random graph $G\sim ER(n,p)$ is sampled on the vertex set $[n]=\{1,\ldots,n\}$. An $m$-subset $S\subset[n]$ is chosen uniformly at random; $H=G[S]$ is the induced subgraph, which is then anonymized by a uniformly random permutation $\pi:S\to[m]$ to produce $H_\pi$. The observer sees $(G,H_\pi)$ but neither $S$ nor $\pi$, and aims to recover $S$ (set recovery) and/or $\pi$ (permutation recovery) (Shiu et al., 8 Jan 2026).
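The sampling procedure above can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited work: the function name `sample_subgraph_pair` and the edge-set representation (a set of two-element frozensets) are assumptions made here for readability.

```python
import random
from itertools import combinations

def sample_subgraph_pair(n, m, p, rng):
    """Sample (G, H_pi, S, pi) from the subgraph-pair model (illustrative sketch).

    G    : edge set of an ER(n, p) graph on vertices 1..n
    S    : uniformly chosen m-subset of the vertices
    pi   : uniformly random bijection S -> {1..m}
    H_pi : induced subgraph G[S], relabeled by pi
    """
    vertices = list(range(1, n + 1))
    # Each of the C(n, 2) possible edges is present independently with probability p.
    G = {frozenset(e) for e in combinations(vertices, 2) if rng.random() < p}
    S = sorted(rng.sample(vertices, m))
    labels = list(range(1, m + 1))
    rng.shuffle(labels)
    pi = dict(zip(S, labels))
    # Keep only edges with both endpoints in S, then anonymize them with pi.
    H_pi = {frozenset((pi[u], pi[v])) for u, v in combinations(S, 2)
            if frozenset((u, v)) in G}
    return G, H_pi, set(S), pi

G, H_pi, S, pi = sample_subgraph_pair(n=8, m=4, p=0.5, rng=random.Random(0))
```

The observer would be handed only `G` and `H_pi`; `S` and `pi` are returned here so that a recovery procedure can be checked against the ground truth.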

1.2 Correlated Erdős–Rényi Subgraph Pair (Graph Matching Setting)

A "parent" graph $G\sim ER(n,p)$ is generated. Two edge-subsampled graphs $G_1,G_2$ are created by independently including each parent edge with probability $s$. One of the graphs is vertex-permuted by an unknown bijection $\pi:[n]\to[n]$. The analyst receives $(G_1,\pi(G_2))$ and aims to recover $\pi$ (Du, 17 Feb 2025, Bozorg et al., 2019).
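As a rough sketch of this construction, the following Python generates a correlated pair. The retention probability is called `s` here, and the function name and edge-set representation are likewise assumptions for illustration rather than notation taken from the cited papers.

```python
import random
from itertools import combinations

def sample_correlated_pair(n, p, s, rng):
    """Sample a correlated ER pair (illustrative sketch).

    Returns (G1, G2_perm, pi): G1 and G2 are independent edge-subsamples of a
    common ER(n, p) parent, and G2_perm is G2 with vertices relabeled by pi.
    """
    vertices = list(range(1, n + 1))
    parent = [e for e in combinations(vertices, 2) if rng.random() < p]
    # Each parent edge is kept independently in each child with probability s.
    G1 = {frozenset(e) for e in parent if rng.random() < s}
    G2 = {frozenset(e) for e in parent if rng.random() < s}
    # Hidden uniformly random relabeling of the second graph.
    shuffled = vertices[:]
    rng.shuffle(shuffled)
    pi = dict(zip(vertices, shuffled))
    G2_perm = {frozenset((pi[u], pi[v])) for u, v in G2}
    return G1, G2_perm, pi

G1, G2_perm, pi = sample_correlated_pair(n=10, p=0.4, s=0.8, rng=random.Random(0))
```

With `s = 1` the two graphs are identical up to relabeling, which is the noiseless graph isomorphism setting; smaller `s` weakens the correlation.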

1.3 Agglomerated Subgraph-Pair (Super-vertex Construction)

Given a partition of $[n]$ into $k$ disjoint nonempty subsets ("super-vertices"), a subgraph–pair model is defined on the super-vertex set: two super-vertices are connected iff at least one edge exists between their constituent nodes in the original $ER(n,p)$ graph. This construction creates an effective inhomogeneous random graph on the super-vertex level, with edge probabilities depending on the subset sizes (Kang et al., 2013).
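The agglomeration rule can be illustrated with a short Python sketch. The function name `agglomerate` and the convention of indexing super-vertices by their position in the partition list are assumptions made for this example.

```python
def agglomerate(edges, partition):
    """Collapse a graph onto super-vertices (illustrative sketch).

    edges     : iterable of 2-element tuples or frozensets of vertices
    partition : list of disjoint, nonempty vertex sets covering all vertices
    Returns the set of super-edges frozenset({i, j}), where i and j index
    blocks of the partition joined by at least one original edge.
    """
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    super_edges = set()
    for u, v in edges:
        i, j = block_of[u], block_of[v]
        if i != j:  # edges inside a block do not create a super-edge
            super_edges.add(frozenset((i, j)))
    return super_edges

# Blocks {1,2}, {3}, {4,5}: only the edge (2, 3) crosses between blocks.
print(agglomerate([(1, 2), (2, 3), (4, 5)], [{1, 2}, {3}, {4, 5}]))
```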

2. Information-Theoretic Recovery Thresholds

Sharp information-theoretic phase transitions delimit when exact or partial recovery is possible:

2.1 Exact Subgraph Set and Permutation Recovery

Set Recovery: Achievable if […], impossible (converse) if […], where […]. Under mild conditions, the sharp threshold is […] (Shiu et al., 8 Jan 2026).
Permutation Recovery: Requires, in addition, […] (unique labeling). Fails if either the set recovery converse applies or […].

2.2 Partial Recovery in Correlated Graphs

For correlated pairs with $p=n^{-\alpha+o(1)}$, $\alpha\in(0,1]$, and $nps^2=\lambda=O(1)$ (with $s$ the probability of retaining each parent edge), one cannot recover all vertices, but the fraction of recoverable correspondences is bounded tightly in terms of a limiting "balanced-load" distribution $\mu_\lambda$:

The maximal fraction of accurately aligned vertices approaches […], with […] (Du, 17 Feb 2025).

These thresholds delineate the information-theoretic feasibility of subgraph alignment and network de-anonymization.

3. Structural and Statistical Properties

3.1 Degree and Clustering Structure

For two independent graphs $ER(n,p_1)$ and $ER(n,p_2)$ on a common vertex set, their union is $ER(n,p)$ with $p=1-(1-p_1)(1-p_2)$. Degree distributions are binomial, and the clustering coefficient is $p$ (Wen et al., 2012).
Agglomerated super-vertex models produce inhomogeneous graphs, where the connection probability between super-vertices of sizes $a$ and $b$ is $1-(1-p)^{ab}$, enabling explicit degree and connectivity computations at the super-vertex level (Kang et al., 2013).
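Both probabilities follow directly from edge independence. The short derivation below is a sketch under assumed notation: $p_1,p_2$ are the edge probabilities of the two independent graphs, and $A,B$ are disjoint super-vertices with $|A|=a$ and $|B|=b$ in an $ER(n,p)$ graph $G$.

```latex
\begin{align*}
\Pr\bigl[\{u,v\}\in E(G_1\cup G_2)\bigr]
  &= 1-\Pr\bigl[\{u,v\}\notin E(G_1)\bigr]\,\Pr\bigl[\{u,v\}\notin E(G_2)\bigr]
   = 1-(1-p_1)(1-p_2),\\[4pt]
\Pr\bigl[A \text{ and } B \text{ not connected}\bigr]
  &= \prod_{u\in A}\prod_{v\in B}\Pr\bigl[\{u,v\}\notin E(G)\bigr]
   = (1-p)^{ab},\\[4pt]
\Pr\bigl[A \text{ and } B \text{ connected}\bigr]
  &= 1-(1-p)^{ab}\;\approx\; p\,ab \quad\text{when } p\,ab\ll 1.
\end{align*}
```

The last approximation shows why larger super-vertices behave like higher-weight nodes of an inhomogeneous random graph: for sparse graphs the connection probability scales with the product of the block sizes.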

3.2 Emergence of Community and Heavy-Tailed Structures

When community sizes are heavy-tailed (e.g., […]), the induced super-vertex network has a scale-free (power-law) degree distribution, depending on the partition (Kang et al., 2013).

4. Methodologies and Algorithms

4.1 Brute-force (MAP) Estimator

For subgraph alignment, the optimal MAP estimator tests all $m$-subsets $S'\subset[n]$ and bijections $\pi':S'\to[m]$, returning those for which relabeling $G[S']$ by $\pi'$ reproduces $H_\pi$. This is computationally intractable but achieves the information-theoretic threshold (Shiu et al., 8 Jan 2026).
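For very small instances the exhaustive search can be written out directly. The sketch below is illustrative only: the function name `brute_force_align`, the edge-set representation, and the edge-count shortcut are assumptions, and the running time grows like $\binom{n}{m}\,m!$.

```python
from itertools import combinations, permutations

def brute_force_align(n, G, m, H):
    """Enumerate all (subset, bijection) pairs consistent with H (sketch).

    G, H : sets of 2-element frozensets on vertices 1..n and 1..m respectively.
    Returns a list of (S', pi') such that relabeling G[S'] by pi' equals H.
    """
    matches = []
    for subset in combinations(range(1, n + 1), m):
        members = set(subset)
        induced = [tuple(e) for e in G if e <= members]
        if len(induced) != len(H):
            continue  # edge counts differ, so no relabeling can match
        for image in permutations(range(1, m + 1)):
            pi = dict(zip(subset, image))
            if {frozenset((pi[u], pi[v])) for u, v in induced} == H:
                matches.append((members, pi))
    return matches

# G is the path 1-2-3 plus an isolated vertex 4; H is a path centered at label 1.
G = {frozenset((1, 2)), frozenset((2, 3))}
H = {frozenset((1, 2)), frozenset((1, 3))}
matches = brute_force_align(4, G, 3, H)
```

In this toy example the set $\{1,2,3\}$ is identified uniquely, but two bijections are returned because the path has a nontrivial automorphism (its endpoints can be swapped). This is the kind of ambiguity that a unique-labeling condition for permutation recovery is meant to exclude.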

4.2 Tail Degree Signature (TDS)

TDS is a polynomial-time, seedless matching algorithm exploiting the robustness of tail-degree statistics in correlated ER graphs. Feature vectors consist of sorted extremes of neighbor degree distributions across multiple neighborhood shells. Theoretical analysis shows it achieves the information-theoretic threshold […] in regime […] (Bozorg et al., 2019).
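To make the idea of degree-based signatures concrete, here is a deliberately simplified Python sketch. It is not the TDS algorithm of Bozorg et al.: it uses only the first neighborhood shell, a fixed signature length `k`, squared Euclidean distance, and a greedy assignment, all of which are assumptions chosen for brevity.

```python
def neighbor_degree_signature(adj, v, k):
    """Top-k degrees among the neighbors of v, sorted descending, zero-padded."""
    degs = sorted((len(adj[u]) for u in adj[v]), reverse=True)[:k]
    return degs + [0] * (k - len(degs))

def greedy_signature_match(adj1, adj2, k=3):
    """Greedily pair vertices of two graphs by signature distance (sketch).

    adj1, adj2 : dicts mapping each vertex to the set of its neighbors.
    Returns a dict u -> v that pairs each vertex of adj1 with a distinct
    vertex of adj2, taking candidate pairs in order of increasing distance.
    """
    sig1 = {v: neighbor_degree_signature(adj1, v, k) for v in adj1}
    sig2 = {v: neighbor_degree_signature(adj2, v, k) for v in adj2}
    candidates = sorted(
        (sum((a - b) ** 2 for a, b in zip(sig1[u], sig2[v])), u, v)
        for u in adj1 for v in adj2
    )
    used1, used2, matching = set(), set(), {}
    for _, u, v in candidates:
        if u not in used1 and v not in used2:
            matching[u] = v
            used1.add(u)
            used2.add(v)
    return matching
```

Vertices whose signatures coincide (for example, two leaves attached to the same hub) cannot be told apart by this simplified rule. Richer signatures over several shells, as described above, are one way to reduce such ties.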

Complexity

Algorithm	Time Complexity	Achieves IT Threshold
Brute-force MAP	Exponential (up to $\binom{n}{m}\,m!$ candidates)	Yes (exact recovery); not practical for large instances
TDS–h (Hungarian)	[…]	Yes (matching threshold for […], sparse regime)
TDS–g (Greedy)	[…]	Yes, with high probability under threshold conditions
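The exponential cost of the brute-force search can be made tangible by counting candidates. Assuming the estimator enumerates every $m$-subset together with every bijection onto $[m]$, the count is $\binom{n}{m}\,m! = n!/(n-m)!$; the helper name below is hypothetical.

```python
import math

def map_search_space(n, m):
    """Number of (subset, bijection) candidates: C(n, m) * m! = n! / (n - m)!."""
    return math.comb(n, m) * math.factorial(m)

print(map_search_space(20, 5))  # 1860480
```

Even at $n=20$, $m=5$ there are already about 1.9 million candidates, and the count grows roughly like $n^m$, which is why polynomial-time alternatives matter in practice.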

5. Phase Transitions and Limit Theorems

5.1 Phase Diagrams in Alignment

Define […]. Set recovery is feasible for […], infeasible for […], with a grey zone in between. Sharp phase transitions demarcate algorithmic possibility from impossibility (Shiu et al., 8 Jan 2026).

5.2 Community Graph Phase Transitions

For agglomerated super-vertex graphs, thresholds for connectivity and giant component emergence follow from inhomogeneous random graph (IRG) theory (Kang et al., 2013). The key parameter is […], the average squared community size times the edge probability:

The largest component vanishes if […] and occupies […] super-vertices if […].

6. Connections to Broader Random Graph Models

The ER graph underlying the subgraph-pair model is a special case of subgraph generated models (SUGMs) in which the only generated subgraphs are links ($K_2$ type); in that case the SUGM reduces exactly to $ER(n,p)$. More general SUGMs encode dependency on motifs such as triangles, stars, and cliques, bridging ER structure and higher-order motif-based randomness (Chandrasekhar et al., 2016).

By tuning the types and rates of subgraph "atoms," SUGMs generalize ER while permitting tractable closed-form expressions for expectations, variances, and parameter inference.
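A minimal sketch of a SUGM with two atom types, links and triangles, is given below. The parameter names `p_link` and `p_triangle` and the function name are assumptions for illustration; the observed graph is taken to be the union of all generated atoms.

```python
import random
from itertools import combinations

def sample_sugm(n, p_link, p_triangle, rng):
    """Sample a links-plus-triangles SUGM on vertices 0..n-1 (sketch).

    Each pair forms a link independently with probability p_link, and each
    triple forms a triangle independently with probability p_triangle.
    Returns the union of all generated edges as a set of frozensets.
    """
    edges = set()
    for pair in combinations(range(n), 2):
        if rng.random() < p_link:
            edges.add(frozenset(pair))
    for a, b, c in combinations(range(n), 3):
        if rng.random() < p_triangle:
            edges.update({frozenset((a, b)), frozenset((a, c)), frozenset((b, c))})
    return edges

# With p_triangle = 0 only links are generated, so this is an ER(n, p_link) sample.
er_like = sample_sugm(12, 0.3, 0.0, random.Random(0))
```

Raising `p_triangle` above zero adds clustering that an ER graph of the same density would not have, which is the sense in which motif atoms extend the ER baseline.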

7. Applications and Implications

The ER subgraph-pair model underpins rigorous analysis of biological network alignment, privacy and de-anonymization of social networks, and statistical models of network community structure. Its phase diagrams and thresholds provide foundational guarantees for algorithmic graph matching and motif-based inference. Recent work indicates that seedless, polynomial-time algorithms built on robust local statistics can reach the information-theoretic limits in some regimes, suggesting routes to tractable recovery under substantial noise (Shiu et al., 8 Jan 2026, Du, 17 Feb 2025, Bozorg et al., 2019).

References:

(Shiu et al., 8 Jan 2026) Information-Theoretic Limits on Exact Subgraph Alignment Problem
(Du, 17 Feb 2025) Optimal recovery of correlated Erdős-Rényi graphs
(Bozorg et al., 2019) Seedless Graph Matching via Tail of Degree Distribution for Correlated Erdos-Renyi Graphs
(Wen et al., 2012) Edge Union of Networks on the Same Vertex Set
(Chandrasekhar et al., 2016) A Network Formation Model Based on Subgraphs
(Kang et al., 2013) Evolution of a modified binomial random graph by agglomeration