
Rank–Nexus: Invariants in Phylogenetic Networks

Updated 4 February 2026
  • Rank–Nexus is a framework extending rank invariants from tree structures to phylogenetic networks by using tensor flattenings to identify network clades.
  • It employs tensor flattenings of joint site distributions under the general Markov model to derive rank constraints, ensuring all 5×5 minors vanish for genuine network clades.
  • The framework adapts to equivariant models by decomposing the flattening matrix into block-diagonal forms, which facilitates efficient detection of reticulate events and network topologies.

Rank–Nexus describes a framework for extending rank invariants from phylogenetic trees to phylogenetic networks, particularly in the context of molecular sequence evolution. This approach generalizes the classical rank-based invariants that characterize tree-like evolutionary relationships, providing analogous constraints (termed "Rank–Nexus invariants") for partially reticulated, acyclic, rooted graphs under both the general Markov model (GMM) and group-equivariant submodels. The principal technical device is the rank of certain tensor flattenings of the joint site distribution, which discriminates between different network topologies and reveals network-clades embedded in reticulate histories (Casanellas et al., 2020).

1. Structural Foundations and Notation

Phylogenetic networks in Rank–Nexus are specifically tree-child binary networks: rooted, acyclic, directed graphs representing $n$ labeled taxa. The structure satisfies the following:

  • The root has indegree 0 and outdegree 2.
  • Each leaf has indegree 1 and outdegree 0, uniquely labeled from $1$ to $n$.
  • Interior nodes are either tree-vertices (indegree 1, outdegree 2) or reticulation-vertices (indegree 2, outdegree 1), with every reticulation’s child required to be a tree-vertex.

Evolutionary processes on the network are modeled via assignment of a $4 \times 4$ Markov matrix $M^e$ to each edge and an initial distribution $\pi$ on nucleotides $\Sigma = \{A, C, G, T\}$. The parameterization $\theta = \{ \pi, M^e \mid e \in \mathrm{Edges}(N) \}$ governs the distribution $P_{N,\theta}$ over character patterns at the leaves.

Reticulations preclude the network from specifying a unique tree. Each of the $m$ reticulation-vertices $w_i$ corresponds to a binary choice; a vector $\sigma \in \{0,1\}^m$ selects one of the two incoming edges at each reticulation, reducing $N$ to a tree $T_\sigma$. The induced leaf pattern distribution $P_{N,\theta}$ is then a mixture over the $2^m$ possible tree resolutions:

$$P_{N,\theta}(x_1, \ldots, x_n) = \sum_{\sigma \in \{0,1\}^m} \left[ \prod_{i=1}^m \delta_i^{1-\sigma_i} (1-\delta_i)^{\sigma_i} \right] P_{T_\sigma, \theta_\sigma}(x_1, \ldots, x_n)$$

where $\delta_i \in [0,1]$ denotes the site-inheritance probability at the $i$-th reticulation.
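
The mixture weights above can be enumerated directly; a minimal sketch with hypothetical inheritance probabilities:

```python
# Sketch: mixture weights over the 2^m tree resolutions of a network
# with m reticulations (the delta values here are hypothetical).
from itertools import product

def resolution_weights(deltas):
    """Return {sigma: weight} with weight = prod_i delta_i^(1-s_i) * (1-delta_i)^s_i."""
    weights = {}
    for sigma in product((0, 1), repeat=len(deltas)):
        w = 1.0
        for d, s in zip(deltas, sigma):
            w *= d if s == 0 else (1.0 - d)
        weights[sigma] = w
    return weights

weights = resolution_weights([0.7, 0.4])   # m = 2 reticulations, 4 resolutions
print(sum(weights.values()))               # weights sum to 1
```

Since each factor pair $(\delta_i, 1-\delta_i)$ sums to one, the $2^m$ weights always form a probability distribution over resolutions.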

2. Tensor Flattenings and Rank-Invariants

For a bipartition $A \cup B = \{1, \ldots, n\}$, the observed joint distribution $P_{N,\theta}$ can be viewed as a vector $p$ in $\mathbb{C}^{4^n}$, and the flattening operation $\mathrm{flatt}_{A|B}(p)$ reshapes $p$ into a $4^{|A|} \times 4^{|B|}$ matrix with entries

$$\left[ \mathrm{flatt}_{A|B}(p) \right]_{i,j} = p(\text{states } i \text{ on } A,\ j \text{ on } B)$$

In the traditional GMM on a phylogenetic tree, if $A|B$ corresponds to an edge split, the flattening factors as $M_A^\top D_\pi M_B$, with $M_A$, $M_B$ the conditional probability matrices and $D_\pi$ the diagonal matrix of the base distribution. This structure implies:

$$\mathrm{rank}\, \mathrm{flatt}_{A|B}(p) \leq 4$$

Casanellas and Fernández-Sánchez extend this invariant to networks: if $A$ forms a network-clade (a subtree present in every tree resolution), the factorization remains valid after the network-to-tree mixture. Consequently, the rank bound applies and all $5 \times 5$ minors must vanish; these polynomial relations constitute the Rank–Nexus invariants.
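
A minimal numerical check of this factorization, with random (hypothetical) stochastic parameters for a split with $|A| = |B| = 2$:

```python
# Sketch: for an edge split A|B on a tree under the GMM, the flattening
# factors as M_A^T D_pi M_B, so its rank is at most 4 and every 5x5
# minor vanishes. Parameters are random, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
pi = rng.dirichlet(np.ones(4))              # root distribution
M_A = rng.dirichlet(np.ones(16), size=4)    # 4 x 16 conditional probabilities for A
M_B = rng.dirichlet(np.ones(16), size=4)    # 4 x 16 conditional probabilities for B

F = M_A.T @ np.diag(pi) @ M_B               # 16 x 16 flattening

print(np.linalg.matrix_rank(F))             # at most 4
rows, cols = [0, 3, 5, 8, 12], [1, 2, 7, 9, 15]
minor = np.linalg.det(F[np.ix_(rows, cols)])
print(minor)                                # one 5x5 minor, numerically ~0
```

Every $5 \times 5$ submatrix of a rank-$4$ matrix is singular, so each such determinant vanishes up to floating-point noise.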

3. Equivariant Model Extensions

For many biological models, the transition matrices $M^e$ exhibit symmetry under a permutation group $G \leq S_4$. These $G$-equivariant models, including Jukes–Cantor ($G = S_4$), Kimura 2-parameter, Kimura 3-parameter, and strand-symmetric models, constrain the distribution $p$ to the $G$-invariant subspace $(\mathbb{C}^{4^n})^G$. Maschke's theorem yields a simultaneous decomposition of the state spaces and, correspondingly, the flattening $\overline{\mathrm{flatt}}_{A|B}(p)$ becomes block-diagonal:

$$\overline{\mathrm{flatt}}_{A|B}(p) = \mathrm{diag}(B_1, \ldots, B_k)$$

with block sizes $m_i \times m_i$ determined by the isotypic decomposition of $\mathbb{C}^4$. The Rank–Nexus bound now reads:

$$\mathrm{rank}\, B_i \leq m_i$$

for each block $i$, with the vanishing of all $(m_i+1) \times (m_i+1)$ minors constituting the system of invariants specific to the equivariant model.

4. Theorems and Proof Structure

Theorem 2.1 (GMM Rank–Nexus)

Let $N$ be a tree-child binary network and $A$ a subset of its leaves forming a true subtree (no internal reticulations). Denoting $B = \mathrm{Leaves} \setminus A$, then for any GMM parameters $\theta$, $\mathrm{flatt}_{A|B}(P_{N,\theta})$ has rank at most $4$; thus, all $5 \times 5$ minors vanish.

Sketch: Reroot every resolution $T_\sigma$ at the root of the $A$ subtree. The flattening of each tree factors with a common clade factor for $A$, so the mixture is a convex combination sharing that factor; the sum therefore also has rank at most $4$.

Theorem 2.2 (Equivariant Rank–Nexus)

Under a $G$-equivariant model for $N$ and the same $A|B$ clade partition, the block-diagonal flattening has block ranks bounded by $m_i$, with all corresponding minors vanishing.

Sketch: Equivariance ensures the mixture factors through the appropriate $\mathrm{Hom}_G$ spaces on each isotypic component, enforcing the blockwise rank constraints (Casanellas et al., 2020).

5. Illustrative Example

Consider a four-leaf network $N$ with a single reticulation $w$, leaves $1, 2$ assigned to $A$ and $3, 4$ to $B$:

  • $N$ admits two tree resolutions, indexed by $\sigma \in \{0,1\}$.
  • Given a root distribution $\pi$ and edge Markov matrices $M^e$, the flattening $\mathrm{flatt}_{A|B}$ is a $16 \times 16$ matrix.
  • For each $T_\sigma$, the flattening is $M_A^\top D_{\pi^\sigma} M_B^\sigma$; because the clade factor $M_A$ is shared by both resolutions, mixing yields a matrix of rank at most $4$.
  • Numerically, a singular value decomposition confirms at most four nonzero singular values; all $5 \times 5$ minors vanish.
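
A numerical sketch of this example, with random hypothetical parameters; the key assumption, as in the theorem, is that both resolutions share the clade factor $M_A$:

```python
# Sketch: mixture of two tree-resolution flattenings that share the clade
# factor M_A for A = {1,2}. All parameters are random, for illustration.
import numpy as np

rng = np.random.default_rng(1)
delta = 0.6                                     # inheritance probability at w

M_A = rng.dirichlet(np.ones(16), size=4)        # shared 4 x 16 clade factor
flats = []
for sigma in (0, 1):
    pi_s = rng.dirichlet(np.ones(4))            # per-resolution root distribution
    M_B = rng.dirichlet(np.ones(16), size=4)    # per-resolution factor for B
    flats.append(M_A.T @ np.diag(pi_s) @ M_B)   # each has rank <= 4

F = delta * flats[0] + (1 - delta) * flats[1]   # 16 x 16 mixture flattening
s = np.linalg.svd(F, compute_uv=False)
print(s[:5])  # at most four singular values are numerically nonzero
```

Because $F = M_A^\top \big( \delta\, D_{\pi^0} M_B^0 + (1-\delta)\, D_{\pi^1} M_B^1 \big)$ and $M_A^\top$ has only four columns, the mixture cannot exceed rank $4$.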

6. Applications and Computational Practices

Detecting Tree-Clades and Reticulations

Given site-frequency data $\hat{p}$ from $n$ taxa, if $\hat{p}$ arises from a GMM on a network $N$ containing a tree-clade $A$, then $\mathrm{flatt}_{A|B}(\hat{p})$ should empirically satisfy the rank bound. Conversely, partitions not supported by a true subtree generally exhibit ranks significantly exceeding $4$. Scanning bipartitions and evaluating the decay in singular values beyond the theoretical threshold reveals tree-like subsets and, in turn, network topology.
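
The flattening itself is just an axis permutation followed by a reshape. A small sketch (the `flatten_split` helper is hypothetical, not from the paper), validated on a product tensor, which has rank $1$ under every split:

```python
# Sketch: reshape a joint leaf-pattern tensor p of shape (4,)*n into the
# flattening for a bipartition A|B and inspect its singular values.
import numpy as np

def flatten_split(p, A):
    """Flattening flatt_{A|B}(p): rows indexed by states on A, columns on B."""
    n = p.ndim
    B = [i for i in range(n) if i not in A]
    return np.transpose(p, axes=list(A) + B).reshape(4 ** len(A), 4 ** len(B))

# Hypothetical check: independent leaves give a rank-1 flattening for any split.
rng = np.random.default_rng(2)
margs = [rng.dirichlet(np.ones(4)) for _ in range(4)]
p = np.einsum('i,j,k,l->ijkl', *margs)          # product distribution on 4 leaves

s = np.linalg.svd(flatten_split(p, [0, 1]), compute_uv=False)
print(np.sum(s > 1e-12))  # -> 1
```

In practice one would loop this over candidate bipartitions $A|B$ and flag those whose fifth singular value falls below a noise threshold.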

Distinguishing Network Topologies

As different networks may share certain clades but not others, their collections of Rank–Nexus invariants differ. Non-vanishing of specified minors can rule out network hypotheses, thus extending tree-invariant methodology to reticulate models.

Algorithmic and Practical Considerations

Enumerating all minors is intractable; typical practice involves:

  • Selecting a relevant bipartition ABA|B
  • Forming the $4^{|A|} \times 4^{|B|}$ empirical flattening
  • Computing an SVD (worst-case cost $O(4^n)$) and inspecting the singular-value spectrum
  • For equivariant models, first block-diagonalizing via symmetric transforms, reducing the problem to analyzing smaller blockwise SVDs.
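
The blockwise reduction in the last step can be sketched as follows; the block sizes here are chosen arbitrarily for illustration, not taken from any particular equivariant model:

```python
# Sketch: for a block-diagonal flattening, the singular values are the union
# of the blocks' singular values, so the SVD can be computed block by block.
import numpy as np

def block_diag(*blocks):
    """Assemble matrices into one block-diagonal matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    F = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        F[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return F

rng = np.random.default_rng(3)
blocks = [rng.standard_normal((m, m)) for m in (1, 3, 3, 9)]  # hypothetical sizes
F = block_diag(*blocks)

s_full = np.sort(np.linalg.svd(F, compute_uv=False))
s_block = np.sort(np.concatenate([np.linalg.svd(b, compute_uv=False) for b in blocks]))
print(np.allclose(s_full, s_block))  # -> True
```

Working blockwise replaces one large SVD with several much smaller ones, which is where the computational saving of the equivariant setting comes from.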

For taxon counts $n \leq 20$, these procedures are computationally feasible (Casanellas et al., 2020).

7. Perspective and Significance

Rank–Nexus invariants provide a linear-algebraic means to generalize robust identifiability conditions and reconstruction techniques from trees to networks. Their key principle is the preservation of flattening-rank constraints in any reticulation-free subtree across all tree resolutions of a network. These polynomial invariants afford practical, statistically grounded tests for the identification of tree-like structure in complex reticulate histories—detecting clades, distinguishing network hypotheses, and ultimately advancing phylogenomic inference under rich evolutionary scenarios.

