Normalized Excess Co-occurrence Matrix

Updated 18 February 2026
  • The normalized excess co-occurrence matrix measures how much more often tuples of nodes co-occur than would be expected if each node occurred independently.
  • It employs incidence matrices, marginal probability normalization, and the face-splitting product to extend pairwise counts to higher-order tensors.
  • Applications span word embedding, market basket analysis, and hypergraph community detection by revealing statistically significant interactions.

A normalized excess co-occurrence matrix (and its higher-order tensor analogs) provides a measure of how much more frequently tuples of nodes (typically words, items, or graph vertices) co-occur within a collection of groups (hyperedges, contexts, baskets) than would be expected if these nodes occurred independently. This construct is foundational in hypergraph theory, NLP, and data mining, generalizing the standard co-occurrence matrix and mutual information to arbitrary tuple size via the face-splitting (rowwise Khatri–Rao) product, yielding excess co-occurrence tensors and multivariate pointwise mutual information (Bischof, 2020).

1. Incidence Matrix and Pairwise Co-occurrence

The starting point is a bipartite relationship between a finite set of nodes $X = \{x_1, \dots, x_n\}$ and a family of hyperedges $E = \{e_1, \dots, e_m\}$. The binary incidence matrix $A \in \{0,1\}^{m \times n}$ is defined by $A_{e,i} = 1$ if node $x_i$ is present in hyperedge $e$, and $A_{e,i} = 0$ otherwise.

The unnormalized pairwise co-occurrence matrix is $C = A^T A$, an $n \times n$ matrix whose entry $C_{ij}$ counts the number of hyperedges containing both $x_i$ and $x_j$. The degree $d_i = C_{ii}$ records the number of hyperedges containing node $x_i$, and $\sum_{i=1}^n d_i$ gives the total number of node-edge incidences (Bischof, 2020).
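As a concrete illustration, the incidence matrix and pairwise counts can be computed in a few lines of numpy; the 4-edge, 3-node hypergraph below is a hypothetical toy example, not data from the source:

```python
import numpy as np

# Hypothetical toy hypergraph: m = 4 hyperedges over n = 3 nodes.
# Row e of A marks which nodes belong to hyperedge e.
A = np.array([
    [1, 1, 0],   # e1 = {x1, x2}
    [1, 1, 1],   # e2 = {x1, x2, x3}
    [0, 1, 1],   # e3 = {x2, x3}
    [1, 0, 0],   # e4 = {x1}
])

C = A.T @ A        # pairwise counts: C[i, j] = #{e : x_i and x_j both in e}
d = np.diag(C)     # node degrees d_i = C_ii

# Sanity check: sum of degrees equals total node-edge incidences.
assert d.sum() == A.sum()
```

Here `C[0, 1] == 2` because $x_1$ and $x_2$ appear together in exactly two hyperedges ($e_1$ and $e_2$).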

2. Normalization and the Excess Co-occurrence Matrix

To assess the significance of co-occurrence, a null model assumes independent node participation in edges. The empirical marginal probability for node $x_i$ is $p_i = d_i/m$. The matrix of empirical pairwise probabilities is $P_{ij} = C_{ij}/m$, and under independence the reference probability is the rank-one matrix $pp^T$ with entries $p_i p_j$.

The excess (or normalized) co-occurrence quantifies deviation from this independent baseline:

  • Probability form: $P^{\rm excess} = P - pp^T$
  • Raw-count form: $C^{\rm excess} = C - \frac{1}{m} dd^T$, where $d \in \mathbb{R}^n$ is the vector of node degrees.

These forms highlight pairs co-occurring more or less frequently than independence predicts (Bischof, 2020).
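Both excess forms follow directly from the incidence matrix; a minimal numpy sketch on a hypothetical 4-edge, 3-node toy hypergraph:

```python
import numpy as np

# Hypothetical toy incidence matrix (m = 4 edges, n = 3 nodes).
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1],
              [1, 0, 0]], dtype=float)
m = A.shape[0]

C = A.T @ A
d = np.diag(C)
p = d / m                    # marginal probabilities p_i = d_i / m
P = C / m                    # empirical pairwise probabilities

P_excess = P - np.outer(p, p)        # probability form
C_excess = C - np.outer(d, d) / m    # raw-count form

# The two forms differ only by the factor m.
assert np.allclose(C_excess, m * P_excess)
```

Negative entries of `P_excess` mark pairs that co-occur less often than independence predicts; positive entries mark excess co-occurrence.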

3. Higher-Order Co-occurrence via Face-Splitting Product

Pairwise co-occurrence generalizes to $k$-way co-occurrence for arbitrary order $k$ through the face-splitting (transposed Khatri–Rao) product. For $A \in \{0,1\}^{m \times n}$, define $F^{(k-1)} = A \bullet \cdots \bullet A$ with $k-1$ factors, where $\bullet$ denotes the face-splitting (row-wise Kronecker) product, yielding an $m \times n^{k-1}$ matrix with

$F^{(k-1)}_{e, (i_1,\dots,i_{k-1})} = \prod_{t=1}^{k-1} A_{e,i_t}$

The order-$k$ tensor of raw counts is then constructed as $T := (F^{(k-1)})^T A \in \mathbb{R}^{n^{k-1} \times n}$, which can be indexed and reshaped as

$T_{i_1, i_2, \dots, i_k} = \sum_{e=1}^m \prod_{t=1}^k A_{e,i_t}$

representing the number of hyperedges containing all $k$ nodes $(x_{i_1}, \dots, x_{i_k})$ (Bischof, 2020).
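The face-splitting construction can be verified numerically. The sketch below builds the $k=3$ tensor from a hypothetical toy incidence matrix and checks it against a brute-force triple sum:

```python
import numpy as np

# Hypothetical toy incidence matrix (m = 4 edges, n = 3 nodes).
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1],
              [1, 0, 0]])
m, n = A.shape

# Face-splitting (row-wise Kronecker) product of k-1 = 2 copies of A:
# F[e, i1*n + i2] = A[e, i1] * A[e, i2], giving shape (m, n**2).
F = np.einsum('ei,ej->eij', A, A).reshape(m, n * n)

# Order-3 raw-count tensor T = F^T A, reshaped to (n, n, n).
T = (F.T @ A).reshape(n, n, n)

# T[i, j, l] counts hyperedges containing all of x_i, x_j, x_l;
# it matches the direct triple sum over edges.
brute = np.einsum('ei,ej,el->ijl', A, A, A)
assert np.array_equal(T, brute)
```

For this toy data `T[0, 1, 2] == 1`, since only one hyperedge contains all of $x_1, x_2, x_3$, while the diagonal entry `T[0, 0, 0]` recovers the degree $d_1$.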

4. Normalized Excess for k-way Co-occurrence Tensors

The normalization framework for $k=2$ extends directly to higher orders. Given the raw counts $T_{i_1, \dots, i_k}$, empirical probabilities are computed via normalization:

$P_{i_1, \dots, i_k} = \frac{T_{i_1, \dots, i_k}}{m}$

with $p_i = d_i/m$ as before. The excess form for the $k$-way case subtracts the completely independent model $\prod_{t=1}^k p_{i_t}$ from the observed probability:

$P^{\rm excess}_{i_1, \dots, i_k} = P_{i_1, \dots, i_k} - \prod_{t=1}^k p_{i_t}$

or, for raw counts:

$T^{\rm excess}_{i_1, \dots, i_k} = T_{i_1, \dots, i_k} - m^{1-k} \prod_{t=1}^k d_{i_t}$

This formulation generalizes pointwise mutual information (PMI) to $k$-tuples:

$\mathrm{PMI}_k(i_1, \dots, i_k) = \log \frac{P_{i_1, \dots, i_k}}{\prod_{t=1}^k p_{i_t}} = \log \left(1 + \frac{P^{\rm excess}_{i_1, \dots, i_k}}{\prod_t p_{i_t}}\right)$

The multivariate PMI thus obtained connects to generalized mutual information measures (Bischof, 2020).
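The $k=3$ excess tensor and $\mathrm{PMI}_3$ can be sketched directly from these definitions; the incidence matrix below is the same style of hypothetical toy data used above:

```python
import numpy as np

# Hypothetical toy incidence matrix (m = 4 edges, n = 3 nodes).
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1],
              [1, 0, 0]], dtype=float)
m, n = A.shape
k = 3

# Order-3 count tensor and marginals.
T = np.einsum('ei,ej,el->ijl', A, A, A)
d = A.sum(axis=0)
p = d / m

P = T / m
indep = np.einsum('i,j,l->ijl', p, p, p)           # independent baseline
P_excess = P - indep                               # probability form
T_excess = T - m ** (1 - k) * np.einsum('i,j,l->ijl', d, d, d)  # raw counts

# PMI_3 wherever the observed probability is nonzero; -inf for unseen triples.
with np.errstate(divide='ignore'):
    PMI3 = np.where(P > 0, np.log(P / indep), -np.inf)

assert np.allclose(T_excess, m * P_excess)
```

As in the pairwise case, the raw-count and probability forms of the excess tensor agree up to the factor $m$.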

5. Applications in Word Representations, Recommendation, and Hypergraph Analysis

Normalized excess co-occurrence matrices and their high-order analogs are central to several domains:

  • Word Embedding Models: The skip-gram with negative sampling (word2vec) and GloVe algorithms can be viewed as implicitly factorizing a transformed excess co-occurrence matrix $C^{\rm excess}$ or its PMI variant. Extending to $k=3$ enables modeling of not just word-context pairs but also triple co-occurrences (e.g., short phrases, syntactic triples) via so-called word tensors, permitting richer compositional embeddings (Bischof, 2020).
  • Market-Basket and Recommendation Systems: Transactions form a natural hypergraph, with items as nodes and baskets as edges. The third-order tensor $T_{i,j,k}$ enumerates tri-item co-occurrences, and $T^{\rm excess}$ reveals surprising triple relationships, supporting higher-order clustering and embedding (Bischof, 2020).
  • Hypergraph Community Detection and Similarity: Beyond standard Laplacians based on $C = A^T A$, higher-order excess co-occurrence tensors serve as refined similarity kernels, facilitating detection of community structure and similarity patterns not observable at the pairwise level (Bischof, 2020).

6. Algorithmic Properties and Computational Considerations

The face-splitting product provides an algorithmically tractable and highly parallelizable method for constructing $k$-way co-occurrence tensors. By leveraging row-wise Kronecker products, it systematically encodes all $k$-tuple co-occurrence frequencies without explicitly enumerating every possible tuple of nodes, enabling scalable learning and statistical testing of higher-order interactions. The excess normalization subtracts the background rate expected under independence, isolating signals of statistical dependence among nodes across a diverse range of graph- and hypergraph-based data structures (Bischof, 2020).
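When the incidence matrix is sparse, the nonzero entries of the $k$-way tensor can be accumulated per edge instead of materializing the dense $m \times n^{k-1}$ face-splitting intermediate. A minimal sketch, assuming `scipy.sparse` and hyperedges that are small relative to $n$ (the random incidence matrix is synthetic illustration data):

```python
import numpy as np
from collections import Counter
from scipy import sparse

# Synthetic sparse incidence matrix: 50 hyperedges over 30 nodes.
rng = np.random.default_rng(0)
dense = (rng.random((50, 30)) < 0.1).astype(float)
A = sparse.csr_matrix(dense)

# Accumulate order-3 counts by enumerating triples only within each
# edge's support: cost is sum_e |e|**3, cheap when edges are small.
triples = Counter()
for e in range(A.shape[0]):
    idx = A.indices[A.indptr[e]:A.indptr[e + 1]]
    for i in idx:
        for j in idx:
            for l in idx:
                triples[(i, j, l)] += 1

# Agrees with the dense order-3 tensor on every counted entry.
T = np.einsum('ei,ej,el->ijl', dense, dense, dense)
assert T.sum() == sum(triples.values())
assert all(T[i, j, l] == c for (i, j, l), c in triples.items())
```

The per-edge loop trades the dense intermediate for work proportional to the edge sizes, which is the regime where sparse hypergraph data is typically found.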
