Duplication-Divergence Growing Graphs

Updated 18 January 2026
  • Duplication-divergence growing graphs are stochastic network models where nodes duplicate and selectively retain or drop edges to mirror real-world systems.
  • They employ probabilistic edge copying with parameters p and r, leading to heterogeneous degree distributions, phase transitions, and rich connectivity regimes.
  • Analytical techniques such as master equations, martingale concentration, and Markov chain embeddings rigorously establish scaling laws and structural properties of these graphs.

Duplication-divergence growing graphs are a class of stochastic network models in which network evolution is driven by the mechanisms of node (vertex) duplication and subsequent divergence, typically via partial retention or deletion of edges. This framework was developed to capture structural properties observed in real-world systems such as protein–protein interaction and other biological and social networks, where new vertices arise by copying the interaction patterns of existing ones and then undergoing random loss or rewiring of links. These models are mathematically tractable and serve as canonical descriptions for the emergence of heterogeneous degree distributions, high clustering, and modular connectivity in large-scale sparse networks.

1. Formal Model Definition

A canonical duplication-divergence model, such as the DD(t, p, r) model, operates as follows (Frieze et al., 2023):

  • Start with an initial simple graph G_{t_0} on t_0 vertices.
  • For each time step i = t_0, ..., t-1:
    • Choose a parent vertex u uniformly at random from G_i and add a new vertex v.
    • If (u,w) ∈ E(G_i), attach (v,w) independently with probability p.
    • If (u,w) ∉ E(G_i), attach (v,w) independently with probability r/i (background rewiring).
  • All Bernoulli trials are independent.

This encompasses pure and partial duplication, allowance for background mutation (r > 0), as well as tunable divergence probability (p).
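The growth rule above can be sketched in a short simulation. This is a minimal illustration under the DD(t, p, r) notation, not code from the cited papers; function and variable names are illustrative.

```python
import random

def dd_graph(t, p, r, t0=4, seed=None):
    """Grow a DD(t, p, r) duplication-divergence graph (sketch).

    Starts from a complete graph on t0 vertices; at each step i a
    uniformly chosen parent u is duplicated into a new vertex v:
    each parent edge (u, w) is copied with probability p, and each
    non-edge (u, w) becomes an edge (v, w) with probability r / i.
    """
    rng = random.Random(seed)
    # adjacency sets; initial graph: complete graph on t0 vertices
    adj = {i: set(j for j in range(t0) if j != i) for i in range(t0)}
    for i in range(t0, t):
        u = rng.randrange(i)   # parent, uniform over existing vertices
        v = i                  # new vertex label
        adj[v] = set()
        for w in range(i):
            # (u, u) is never an edge, so w == u falls into the r/i case
            prob = p if w in adj[u] else r / i
            if rng.random() < prob:
                adj[v].add(w)
                adj[w].add(v)
    return adj

g = dd_graph(t=1500, p=0.3, r=0.5, seed=1)
degrees = sorted((len(nb) for nb in g.values()), reverse=True)
print("max degree:", degrees[0], "avg degree:", sum(degrees) / len(degrees))
```

With r = 0 and p = 1 this reduces to pure duplication; smaller p models stronger divergence.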

Several generalizations exist:

For directed graphs, edges may be duplicated only in the outgoing or incoming direction, possibly also adding deterministic citations to the parent (Steinbock et al., 2018).
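The directed variant just described can be sketched as follows; this is an illustrative toy under the stated assumptions (outgoing edges copied with probability p, plus a deterministic citation to the parent), and the published models may differ in detail.

```python
import random

def directed_dd(t, p, t0=2, seed=None):
    """Directed duplication sketch: each new node copies the parent's
    outgoing edges independently with probability p and always adds a
    deterministic edge ("citation") to the parent itself."""
    rng = random.Random(seed)
    out = {i: set() for i in range(t0)}
    out[1].add(0)                      # seed: node 1 cites node 0
    for v in range(t0, t):
        u = rng.randrange(v)           # uniformly chosen parent
        out[v] = {w for w in out[u] if rng.random() < p}
        out[v].add(u)                  # deterministic citation to parent
    return out

g = directed_dd(t=1000, p=0.5, seed=7)
indeg = {}
for nbrs in g.values():
    for w in nbrs:
        indeg[w] = indeg.get(w, 0) + 1
print("max in-degree:", max(indeg.values()))
```

Because every edge points from a newer node to an older one, the resulting digraph is acyclic, as in citation networks.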

2. Degree Distribution and Concentration Phenomena

The asymptotic degree distribution in duplication-divergence models exhibits rich phenomenology:

  • Concentration of Maximum and Average Degree:
    • For the DD(t, p, r) model and any 0 < p < 1, the maximum degree Δ(G_t) is concentrated around t^p, up to polylogarithmic factors, with failure probability at most t^{-A} for any constant A > 0.
    • The average degree D(t) concentrates similarly around its expectation, with analogous high-probability bounds (Frieze et al., 2023).

  • Threshold and Phase Regimes:

    • For p < 1/2, the average degree converges to a finite constant depending on p and r.
    • For p > 1/2, the average degree sharply concentrates around t^{2p-1}, up to constants (Frieze et al., 2023).
    • There is no phase transition in the scaling of the maximum degree at p = 1/2, in contrast to the average degree (Frieze et al., 2023).
  • Tail Behavior:
    • In basic models with background mutation (r > 0), the limiting degree distribution is not always a pure power law.
    • With specific duplication/deletion rules, the degree distribution decays as a stretched exponential, P(deg = k) ≈ exp(-c k^β) for constants c > 0 and 0 < β < 1 (Backhausz et al., 2013).
    • In mean-field models, a power-law tail P(k) ~ k^{-γ} with exponent γ solving the transcendental equation

      p(γ - 1) = 1 - p^{γ-1}

      can emerge in specific partial-duplication regimes (Borrelli, 18 Jun 2025).

  • Central Limit Theorem for Log-Degree:

    • For robust supercritical duplication-divergence models, a central limit theorem holds for the log-degree:

      (log deg_t(v) - μ log t) / (σ √(log t)) → N(0, 1) in distribution,

      where μ and σ are constants determined by the effective birth and catastrophe rates of the associated Markov process (Barbour et al., 2021).
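Mean-field exponent equations of this kind are transcendental and are usually solved numerically. The sketch below solves p(γ − 1) = 1 − p^{γ−1} by bisection; the equation form follows classical mean-field analyses of partial duplication and is stated here as an assumption, not as any one paper's exact formula.

```python
def gamma_exponent(p, lo=1.0 + 1e-9, hi=10.0, tol=1e-10):
    """Solve f(g) = p*(g - 1) - (1 - p**(g - 1)) = 0 by bisection.

    g = 1 is always a trivial root, so the bracket starts just above 1;
    a sign change on (lo, hi) is assumed to exist for the given p.
    """
    f = lambda g: p * (g - 1) - (1 - p ** (g - 1))
    if f(lo) * f(hi) > 0:
        raise ValueError("no sign change in bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# For p = 1/2 the nontrivial root is exactly gamma = 2:
# 0.5 * (2 - 1) = 1 - 0.5^(2 - 1).
print(gamma_exponent(0.5))
```

Smaller retention probabilities p give larger exponents γ, i.e., thinner tails in this mean-field picture.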

3. Parameter Regimes, Structural Transitions, and Component Behavior

Parameter tuning in duplication-divergence models controls key network properties (Borrelli, 18 Jun 2025, Borrelli, 2024, Borrelli, 11 Jan 2026):

  • Divergence Rate and Densification:
    • Each edge of the duplicated vertex is retained with probability p (equivalently, lost with divergence rate δ = 1 - p).
    • For p > 1/2, the expected number of edges grows superlinearly with time; the network densifies.
    • For p < 1/2, graphs remain sparse; average degree and edge count grow sublinearly.
  • Asymmetry Parameter in Divergence (Borrelli, 2024):
    • An asymmetry parameter (here denoted q ∈ [0, 1]) controls whether divergence removes edges from the copy or the parent.
    • q = 0 or q = 1 yields completely asymmetric divergence (edges lost only from the copy or only from the parent; one giant component).
    • q = 1/2 yields symmetric divergence (edges lost from either side with equal probability, generating fragmented components with power-law distributed sizes).
  • Component Size and Percolation:
    • Models with symmetric divergence exhibit a nontrivial phase transition in the emergence of a giant component as divergence increases.
    • For the symmetric coupled divergence model, the critical divergence rate for the appearance of a giant component has been estimated numerically (Borrelli, 11 Jan 2026).
    • The component-size distribution follows a power law P(s) ~ s^{-τ}, with a divergence-dependent exponent τ, evidencing heavy-tailed modularity (Borrelli, 2024).
  • Euler Characteristic:
    • The locus where the Euler characteristic χ vanishes marks a singularity in network structure and coincides with the percolation transition (Borrelli, 11 Jan 2026).
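The fragmentation behavior under symmetric divergence can be explored numerically. The sketch below is an illustrative toy, not the exact model of the cited papers: the copy inherits all of the parent's edges, then for each common neighbor one of the two redundant edges is deleted with probability delta, the losing side (parent or copy) chosen with equal probability.

```python
import random
from collections import deque

def sym_dd(t, delta, t0=4, seed=None):
    """Toy symmetric duplication-divergence model (sketch)."""
    rng = random.Random(seed)
    adj = {i: set(j for j in range(t0) if j != i) for i in range(t0)}
    for v in range(t0, t):
        u = rng.randrange(v)
        adj[v] = set(adj[u])           # copy all parent edges
        for w in adj[v]:
            adj[w].add(v)
        for w in list(adj[v]):
            if rng.random() < delta:   # divergence: drop one redundant edge
                lose = u if rng.random() < 0.5 else v   # symmetric choice
                adj[lose].discard(w)
                adj[w].discard(lose)
    return adj

def largest_component_fraction(adj):
    """Fraction of vertices in the largest connected component (BFS)."""
    seen, best = set(), 0
    for s in adj:
        if s in seen:
            continue
        comp, queue = 0, deque([s])
        seen.add(s)
        while queue:
            x = queue.popleft()
            comp += 1
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)
        best = max(best, comp)
    return best / len(adj)

for delta in (0.2, 0.5, 0.8):
    g = sym_dd(t=1500, delta=delta, seed=3)
    print(delta, round(largest_component_fraction(g), 3))
```

Sweeping delta in finer steps (and averaging over seeds) gives a rough numerical picture of where the giant component disappears in this toy variant.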

4. Analytical Techniques and Proof Schemes

Multiple rigorous and mean-field analytical techniques have been deployed (Frieze et al., 2023):

  • Martingale and Chernoff-type Concentration:
    • Proofs for maximum and average degree concentration use telescoping meshes, careful deterministic envelopes, and repeated application of Chernoff bounds on degree increments.
    • Martingale methods establish convergence (and concentration) of the total degree and its increments.
    • Central recurrences for degree growth mimic polynomial trajectories (of order t^p for the maximum degree and t^{2p-1} for the average degree).
  • Master Equation Analysis:
    • Degree distributions are derived from master equations for vertex counts of given degree (often using binomial thinning per copied edge).
    • Linear recurrences, eigen-decomposition of the transition matrix, and asymptotics (via generating functions) yield stationary or non-stationary degree laws (Sudbrack et al., 2017, Borrelli, 18 Jun 2025).
  • Markov Chain and Birth-Catastrophe Process Embeddings:
    • Tagged-vertex Markov chains with duplication (birth) and divergence (catastrophe) transitions map the evolution of degree for specific vertices.
    • Quasi-stationary distributions and critical behavior are characterized by spectral equations for the Markov transition generator.
  • Union Bounds and High-Probability Analysis:
    • Application of Chernoff-type union bounds over vertices and time steps ensures superpolynomial concentration of extremal quantities (Frieze et al., 2023).
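The master-equation approach can be illustrated by iterating the expected degree-count recursion directly. The sketch below uses a simplified pure-duplication mean-field recursion, stated as an assumption rather than any paper's exact equation: N[k] is the expected number of degree-k vertices; each step, the newborn copy's degree is a binomial thinning Binomial(j, p) of a degree-j parent, and each retained edge bumps an existing vertex from degree k-1 to k at mean-field rate p·k·N[k]/t.

```python
from math import comb

def master_equation(p, steps, t0=4, kmax=150):
    """Iterate a simplified mean-field master equation for
    partial duplication (illustrative assumption, see text)."""
    # start: complete graph on t0 vertices (all degrees t0 - 1)
    N = [0.0] * (kmax + 1)
    N[t0 - 1] = float(t0)
    t = float(t0)
    for _ in range(steps):
        # degree distribution of the newborn copy: binomial thinning
        newborn = [0.0] * (kmax + 1)
        for j in range(kmax + 1):
            if N[j] == 0.0:
                continue
            for k in range(j + 1):
                newborn[k] += (N[j] / t) * comb(j, k) * p**k * (1 - p)**(j - k)
        # mean-field flux of existing vertices gaining one edge
        shift = [p * k * N[k] / t for k in range(kmax + 1)]
        for k in range(kmax + 1):
            N[k] += newborn[k] - shift[k]
            if k > 0:
                N[k] += shift[k - 1]
        # mass shifted past kmax is dropped (truncation error)
        t += 1.0
    total = sum(N)
    return [n / total for n in N]

dist = master_equation(p=0.4, steps=300)
print("P(deg=0..5):", [round(x, 4) for x in dist[:6]])
```

Increasing `kmax` and `steps` reduces the truncation error; solving such recursions exactly (via generating functions or eigen-decomposition) is what the rigorous treatments cited above do.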

5. Open Problems and Structural Invariants

Despite extensive progress, several open questions persist (Frieze et al., 2023, Borrelli, 18 Jun 2025):

  • The exact limiting law and support of the normalized maximum degree Δ(G_t)/t^p remain undetermined.
  • Proving the existence of a true power-law degree tail in the generic DD(t, p, r) model is unresolved, with known special cases showing only stretched-exponential decay (Backhausz et al., 2013).
  • The full degree distribution, especially in models with background mutation or asymmetric divergence, lacks a comprehensive description.
  • Further analysis is needed on component sizes, motif frequencies, graph automorphism groups, and efficient encoding schemes for duplication-divergence-generated networks.

6. Biological and Network Science Relevance

Duplication-divergence models capture central aspects of biological network evolution, such as gene or protein duplication followed by interaction loss (Borrelli, 18 Jun 2025). Empirical tests on biological datasets, such as protein-protein interaction or genetic regulatory networks, reveal signatures (e.g., negative deviation from expected distinguishability number) consistent with pure duplication–deletion histories (Crawford-Kahrl et al., 2021).

The general class of models unifies fundamental mechanisms in network science:

  • Emergence of scale-free (or nearly scale-free) degree distributions.
  • High clustering and modularity, exceeding those of preferential-attachment or Erdős–Rényi graphs.
  • Phase transitions in connectivity structure driven by model parameters.

The duplication-divergence framework thus underpins a statistical-mechanics approach to natural network formation and has become a touchstone for analytic exploration of non-equilibrium network growth (Borrelli, 18 Jun 2025, Borrelli, 11 Jan 2026, Frieze et al., 2023).


