Rank–Nexus: Invariants in Phylogenetic Networks
- Rank–Nexus is a framework extending rank invariants from tree structures to phylogenetic networks by using tensor flattenings to identify network clades.
- It employs tensor flattenings of joint site distributions under the general Markov model to derive rank constraints, ensuring all 5×5 minors vanish for genuine network clades.
- The framework adapts to equivariant models by decomposing the flattening matrix into block-diagonal forms, which facilitates efficient detection of reticulate events and network topologies.
Rank–Nexus describes a framework for extending rank invariants from phylogenetic trees to phylogenetic networks, particularly in the context of molecular sequence evolution. This approach generalizes the classical rank-based invariants that characterize tree-like evolutionary relationships, providing analogous constraints (termed "Rank–Nexus invariants") for partially reticulated, acyclic, rooted graphs under both the general Markov model (GMM) and group-equivariant submodels. The principal technical device is the rank of certain tensor flattenings of the joint site distribution, which discriminates between different network topologies and reveals network-clades embedded in reticulate histories (Casanellas et al., 2020).
1. Structural Foundations and Notation
Phylogenetic networks in Rank–Nexus are specifically tree-child binary networks: rooted, acyclic, directed graphs representing labeled taxa. The structure satisfies the following:
- The root has indegree 0 and outdegree 2.
- Each leaf has indegree 1 and outdegree 0, uniquely labeled from $1$ to $n$.
- Interior nodes are either tree-vertices (indegree 1, outdegree 2) or reticulation-vertices (indegree 2, outdegree 1), with every reticulation’s child required to be a tree-vertex.
Evolutionary processes on the network are modeled by assigning a Markov transition matrix $M_e$ to each edge $e$ and an initial distribution $\pi$ on the nucleotide states $\{\mathtt{A},\mathtt{C},\mathtt{G},\mathtt{T}\}$ at the root. This parameterization governs the distribution over character patterns at the leaves.
Reticulations preclude the network from specifying a unique tree. Each of the $r$ reticulation vertices admits a binary choice selecting one of its two incoming edges, so each $\sigma \in \{0,1\}^r$ reduces the network $\mathcal{N}$ to a tree $T_\sigma$. The induced leaf-pattern distribution is then a mixture over the $2^r$ possible tree resolutions:
$$p_{\mathcal{N}} \;=\; \sum_{\sigma \in \{0,1\}^r} w_\sigma \, p_{T_\sigma}, \qquad w_\sigma \geq 0, \quad \sum_\sigma w_\sigma = 1,$$
where $w_\sigma$ denotes the site-inheritance probabilities.
2. Tensor Flattenings and Rank-Invariants
For a bipartition $A|B$ of the leaf set, the observed joint distribution $p$ can be viewed as a vector in $(\mathbb{C}^4)^{\otimes n}$, and the flattening operation reshapes $p$ into a $4^{|A|} \times 4^{|B|}$ matrix with entries
$$\left[ \mathrm{flatt}_{A|B}(p) \right]_{i,j} \;=\; p\left(\text{states } i \text{ at the leaves of } A,\ \text{states } j \text{ at the leaves of } B\right).$$
In the traditional GMM on a phylogenetic tree, if $A|B$ corresponds to an edge split at an internal node $v$, the flattening factors as $\mathrm{flatt}_{A|B}(p) = M_A^{\top}\,\mathrm{diag}(\pi_v)\,M_B$, with $M_A$, $M_B$ the conditional probability matrices from $v$ to the leaves of $A$ and $B$, and $\pi_v$ the distribution at $v$. This structure implies:
$$\mathrm{rank}\,\mathrm{flatt}_{A|B}(p) \;\leq\; 4.$$
Casanellas and Fernández-Sánchez extend this invariant to networks: if $A$ forms a network-clade (a subtree present in every tree resolution), the factorization remains valid after the network-to-tree mixture. Consequently, the rank bound $\mathrm{rank}\,\mathrm{flatt}_{A|B}(p) \leq 4$ applies, and all $5\times 5$ minors must vanish: these polynomial relations constitute the Rank–Nexus invariants.
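As a concrete check of the tree case, the sketch below (illustrative only, not from the source; all parameter choices are arbitrary random draws) builds a quartet distribution under the GMM on the tree $((1,2),(3,4))$, flattens along the matching split, and confirms the rank bound numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_markov(k=4):
    """Random k x k row-stochastic (Markov) matrix."""
    M = rng.random((k, k))
    return M / M.sum(axis=1, keepdims=True)

# GMM parameters on the quartet tree ((1,2),(3,4)):
# hidden node u (ancestor of leaves 1,2) and v (ancestor of 3,4).
pi = rng.random(4); pi /= pi.sum()          # distribution at u
Muv = rand_markov()                          # transition u -> v
M1, M2, M3, M4 = (rand_markov() for _ in range(4))

# Joint leaf distribution p(i1, i2, i3, i4).
p = np.einsum('u,uv,ua,ub,vc,vd->abcd', pi, Muv, M1, M2, M3, M4)

# Flattening along the split 12|34: a 16 x 16 matrix.
F = p.reshape(16, 16)

# Rank check: at most 4 singular values above numerical noise.
sv = np.linalg.svd(F, compute_uv=False)
print(np.sum(sv > 1e-12))   # rank: at most 4, generically exactly 4
```

The key structural fact is that `F` factors through the 4-dimensional state space at the hidden node `u`, which is exactly why only four singular values survive.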
3. Equivariant Model Extensions
For many biological models, transition matrices exhibit symmetry under a permutation group $G \leq \mathfrak{S}_4$ of the nucleotide states. These G-equivariant models—including Jukes–Cantor ($G = \mathfrak{S}_4$), Kimura 2-parameter, Kimura 3-parameter, and strand-symmetric models—constrain the distribution to the $G$-invariant subspace of $(\mathbb{C}^4)^{\otimes n}$. Maschke’s theorem yields a simultaneous decomposition of the state spaces into isotypic components and, correspondingly, a change of basis in which the flattening becomes block-diagonal:
$$\mathrm{flatt}_{A|B}(p) \;\cong\; \bigoplus_{\omega} F_\omega,$$
with block sizes determined by the isotypic decomposition of $(\mathbb{C}^4)^{\otimes |A|}$ and $(\mathbb{C}^4)^{\otimes |B|}$. The Rank–Nexus bound now reads:
$$\mathrm{rank}\, F_\omega \;\leq\; m_\omega$$
for each block $F_\omega$, where $m_\omega$ is the multiplicity of the irreducible representation $\omega$ in $\mathbb{C}^4$, with the vanishing $(m_\omega+1)\times(m_\omega+1)$ minors constituting the system of invariants specific to the equivariant model.
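The symmetry machinery can be illustrated in its simplest case: the Kimura 3-parameter model is group-based over $\mathbb{Z}/2 \times \mathbb{Z}/2$, so a single equivariant transition matrix is already diagonalized (blocks of size $1$, since the group is abelian) by the character table, which is the $4\times 4$ Hadamard matrix. A minimal numpy check of this fact (the rate values below are arbitrary):

```python
import numpy as np

# Kimura 3-parameter matrices are "circulant" over Z/2 x Z/2:
# M[i, j] = f(i XOR j) for a function f on {0, 1, 2, 3}.
f = np.array([0.85, 0.07, 0.05, 0.03])   # identity + 3 substitution classes
M = np.array([[f[i ^ j] for j in range(4)] for i in range(4)])

# Characters of Z/2 x Z/2 form the 4x4 Hadamard matrix:
# H[k, g] = (-1)^popcount(k AND g), with H @ H = 4 * I.
H = np.array([[(-1) ** bin(k & g).count('1') for g in range(4)]
              for k in range(4)])

# Change of basis: H M H^{-1} = H M H / 4 is diagonal, with
# eigenvalues given by the discrete Fourier transform of f.
D = H @ M @ H / 4
off = D - np.diag(np.diag(D))
print(np.allclose(off, 0))               # True: M is diagonalized
print(np.allclose(np.diag(D), H @ f))    # True: eigenvalues = H f
```

For non-abelian groups such as $\mathfrak{S}_4$ (Jukes–Cantor) the same construction yields genuine blocks rather than scalars, and the flattening block-diagonalizes in tensor powers of the symmetry-adapted basis.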
4. Theorems and Proof Structure
Theorem 2.1 (GMM Rank–Nexus)
Let $\mathcal{N}$ be a tree-child binary network and $A$ a subset of its leaves forming a true subtree (no internal reticulations). Denoting $B$ the complementary set of leaves, then for any GMM parameters on $\mathcal{N}$, $\mathrm{flatt}_{A|B}(p)$ has rank at most $4$; thus, all $5\times 5$ minors vanish.
Sketch: Reroot every tree resolution $T_\sigma$ at the root of the subtree on $A$. Because this subtree is reticulation-free and common to all resolutions, every $\mathrm{flatt}_{A|B}(p_{T_\sigma})$ shares the same $A$-side factor; the mixture therefore factors through the common $4$-dimensional state space at that root and has rank at most $4$. (The shared factor is essential: a convex sum of rank-$4$ matrices is not rank-bounded in general.)
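The point of the argument can be checked numerically (arbitrary random factors, illustrative only): mixands sharing a left factor stay at rank $4$, while a generic convex combination of two rank-$4$ matrices reaches rank $8$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two rank-4 "flattenings" sharing the left (A-side) factor L:
L = rng.random((16, 4))                    # common A-side factor
N0, N1 = rng.random((4, 16)), rng.random((4, 16))
F0, F1 = L @ N0, L @ N1

w = 0.3
shared = w * F0 + (1 - w) * F1             # = L @ (w*N0 + (1-w)*N1)

# Contrast: mixands with independent left factors.
G0 = rng.random((16, 4)) @ rng.random((4, 16))
G1 = rng.random((16, 4)) @ rng.random((4, 16))
generic = w * G0 + (1 - w) * G1

rank = lambda X: np.linalg.matrix_rank(X, tol=1e-10)
print(rank(shared), rank(generic))         # 4 8
```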
Theorem 2.2 (Equivariant Rank–Nexus)
Under a $G$-equivariant model for $G \leq \mathfrak{S}_4$ and the same clade partition $A|B$, the block-diagonal flattening has block ranks bounded by the multiplicities $m_\omega$, with all corresponding $(m_\omega+1)\times(m_\omega+1)$ minors vanishing.
Sketch: Equivariance ensures the mixture factors through the appropriate spaces on each isotypic component, enforcing the blockwise rank constraints (Casanellas et al., 2020).
5. Illustrative Example
Consider a four-leaf network $\mathcal{N}$ with a single reticulation vertex on the $\{3,4\}$ side, so that leaves $1,2$ (assigned to $A$) form a tree-clade and leaves $3,4$ are assigned to $B$:
- $\mathcal{N}$ admits two tree resolutions $T_0, T_1$, indexed by the choice of incoming reticulation edge $\sigma \in \{0,1\}$.
- Given a root distribution $\pi$ and edge Markov matrices $M_e$, the flattening $\mathrm{flatt}_{12|34}(p)$ is a $16 \times 16$ matrix.
- For each resolution $T_\sigma$, the flattening factors through a $4$-dimensional inner space with the same $A$-side factor, so the mixture $w\,\mathrm{flatt}_{12|34}(p_{T_0}) + (1-w)\,\mathrm{flatt}_{12|34}(p_{T_1})$ has rank at most $4$.
- Numerically, singular value decomposition confirms at most four nonzero singular values; all $5\times 5$ minors vanish.
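A numerical version of this example (illustrative parameters, not from the source): two tree resolutions share the root distribution and the edges above leaves $1,2$, while the reticulation varies the $\{3,4\}$ side; the mixed flattening still has rank at most $4$.

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_markov():
    M = rng.random((4, 4))
    return M / M.sum(axis=1, keepdims=True)

# Shared parameters for the clade {1,2}: both resolutions use the
# same root distribution and the same edges above leaves 1 and 2.
pi = rng.random(4); pi /= pi.sum()
M1, M2 = rand_markov(), rand_markov()

def tree_dist(Muv, M3, M4):
    """Quartet distribution for one resolution with cherry {1,2}."""
    return np.einsum('u,uv,ua,ub,vc,vd->abcd', pi, Muv, M1, M2, M3, M4)

# Two tree resolutions T0, T1 (reticulation on the {3,4} side),
# mixed with inheritance probability w.
w = 0.6
p = w * tree_dist(rand_markov(), rand_markov(), rand_markov()) \
    + (1 - w) * tree_dist(rand_markov(), rand_markov(), rand_markov())

sv = np.linalg.svd(p.reshape(16, 16), compute_uv=False)
print(sv[3] > 1e-10, sv[4] < 1e-10)   # True True: rank is exactly 4
```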
6. Applications and Computational Practices
Detecting Tree-Clades and Reticulations
Given site-frequency data $\hat{p}$ from $n$ taxa, if the data arise from a GMM on a network containing a tree-clade $A$, then $\mathrm{flatt}_{A|B}(\hat{p})$ should empirically reflect the rank bound. Conversely, partitions not supported by a true subtree generally exhibit ranks significantly exceeding $4$. Scanning bipartitions and evaluating the decay in singular values beyond the theoretical threshold reveals tree-like subsets and, in turn, network topology.
Distinguishing Network Topologies
As different networks may share certain clades but not others, their collections of Rank–Nexus invariants differ. Non-vanishing of specified minors can rule out network hypotheses, thus extending tree-invariant methodology to reticulate models.
Algorithmic and Practical Considerations
Enumerating all $5\times 5$ minors is intractable; typical practice involves:
- Selecting a relevant bipartition $A|B$
- Forming the empirical flattening $\mathrm{flatt}_{A|B}(\hat{p})$
- Deploying SVD (worst-case cost $O(mn\min(m,n))$ for an $m\times n$ flattening) and inspecting the singular-value spectrum
- For equivariant models, first block-diagonalizing via the symmetry-adapted change of basis, reducing the problem to analyzing smaller blockwise SVDs.
For realistic numbers of taxa and alignment lengths, these procedures are computationally feasible (Casanellas et al., 2020).
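One possible shape for such a scan is sketched below; the helper names `clade_score` and `scan` are hypothetical, and the normalized tail singular-value mass is only an illustrative statistic (a principled test would account for sampling error):

```python
import numpy as np
from itertools import combinations

def clade_score(p_hat, A):
    """Normalized singular-value mass beyond rank 4 for the A|B flattening.

    Near-zero scores mean the bipartition respects the rank-4 bound,
    i.e. A behaves like a tree-clade. Illustrative statistic only.
    """
    n = p_hat.ndim
    B = [i for i in range(n) if i not in A]
    F = np.transpose(p_hat, axes=list(A) + B).reshape(4 ** len(A), -1)
    sv = np.linalg.svd(F, compute_uv=False)
    return float(np.sqrt((sv[4:] ** 2).sum() / (sv ** 2).sum()))

def scan(p_hat):
    """Score every 2-element subset of the taxa."""
    n = p_hat.ndim
    return {A: clade_score(p_hat, A) for A in combinations(range(n), 2)}

# Synthetic sanity check on a quartet tree ((0,1),(2,3)) under the GMM.
rng = np.random.default_rng(3)
def rand_markov():
    M = rng.random((4, 4))
    return M / M.sum(axis=1, keepdims=True)

pi = rng.random(4); pi /= pi.sum()
p = np.einsum('u,uv,ua,ub,vc,vd->abcd', pi, rand_markov(),
              rand_markov(), rand_markov(), rand_markov(), rand_markov())

scores = scan(p)
# Only the true split {0,1} (and its complement {2,3}) score near zero.
print({A: s < 1e-8 for A, s in scores.items()})
```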
7. Perspective and Significance
Rank–Nexus invariants provide a linear-algebraic means to generalize robust identifiability conditions and reconstruction techniques from trees to networks. Their key principle is the preservation of flattening-rank constraints in any reticulation-free subtree across all tree resolutions of a network. These polynomial invariants afford practical, statistically grounded tests for the identification of tree-like structure in complex reticulate histories—detecting clades, distinguishing network hypotheses, and ultimately advancing phylogenomic inference under rich evolutionary scenarios.