Greedy CIM Algorithm for Causal Discovery

Updated 18 February 2026

The Greedy CIM algorithm is a geometric approach to causal discovery that recasts the search over Markov equivalence classes as a linear optimization problem over the characteristic imset polytope.
It employs a two-phase simplex-type walk, alternating between edge moves and edge reversals, to incrementally improve score metrics like BIC.
Its skeletal variant uses conditional independence tests to constrain the search space, which reduces computational cost; in the reported simulations it outperformed PC, MMHC, and Greedy SP.

The Greedy CIM algorithm is a geometric approach to score-based and hybrid causal discovery, formulated as a greedy walk along the edges of the characteristic imset polytope, denoted $\operatorname{CIM}_p$ . Each vertex of $\operatorname{CIM}_p$ represents a Markov equivalence class (MEC) of directed acyclic graphs (DAGs), enabling the causal discovery search to be transformed into a linear optimization problem over a polytope. The Greedy CIM algorithm and its skeletal hybrid variant generalize the edge-move frameworks of GES, GIES, and MMHC, providing a simplex-type search structure that can leverage both score and conditional independence information (Linusson et al., 2021).

1. Mathematical Foundation of Greedy CIM

Let $p$ denote the number of nodes. Given a DAG $G$ on $[p]=\{1,\ldots,p\}$ , its characteristic imset is a mapping

$c_G : \{S \subseteq [p]\,:\, |S|\geq2\} \to \{0,1\}$

with $c_G(S) = 1$ iff $\exists i\in S$ such that $S\setminus\{i\} \subseteq \operatorname{pa}_G(i)$ . This vector in $\mathbb{R}^{2^p-p-1}$ uniquely identifies the Markov equivalence class represented by $G$ —two DAGs are Markov equivalent if and only if their characteristic imsets coincide.
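
The definition can be checked on small graphs. The following is a minimal sketch, assuming a DAG is stored as a dictionary from each node to its parent set (a representation chosen here for illustration, not taken from the source); the function name `characteristic_imset` and the three toy graphs are hypothetical. It returns the support of $c_G$, i.e. the sets $S$ with $c_G(S)=1$.

```python
from itertools import combinations

def characteristic_imset(parents):
    """Return the sets S (|S| >= 2) with c_G(S) = 1.

    `parents` maps each node of the DAG G to the set of its parents.
    """
    support = set()
    for i, pa in parents.items():
        # c_G(S) = 1 exactly when S = {i} | T for a non-empty T within pa(i).
        for k in range(1, len(pa) + 1):
            for T in combinations(sorted(pa), k):
                support.add(frozenset(T) | {i})
    return support

# Toy DAGs on {0, 1, 2} (illustrative assumptions).
chain = {0: set(), 1: {0}, 2: {1}}          # 0 -> 1 -> 2
fork = {0: {1}, 1: set(), 2: {1}}           # 0 <- 1 -> 2
collider = {0: set(), 1: {0, 2}, 2: set()}  # 0 -> 1 <- 2

# Chain and fork are Markov equivalent, so their imsets coincide.
same_mec = characteristic_imset(chain) == characteristic_imset(fork)
# The collider adds the triple {0, 1, 2}, so it lies in a different MEC.
differs = characteristic_imset(chain) != characteristic_imset(collider)
```

Storing only the support keeps the sketch short; a dense 0/1 vector of length $2^p-p-1$ would carry the same information.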

The characteristic-imset polytope $\operatorname{CIM}_p$ is

$\operatorname{CIM}_p = \operatorname{conv}\{c_G : G \text{ is a DAG on } [p]\}$

where $\operatorname{conv}$ denotes convex hull. Each vertex corresponds to a unique MEC. Maximizing a decomposable, score-equivalent function (e.g., BIC) over DAGs thus reduces to maximizing a linear function $\langle w, c_G\rangle$ over $\operatorname{CIM}_p$ , with $w$ derived from the score function.
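
For very small $p$ the vertex set can be enumerated by brute force. The sketch below, which assumes a parent-set dictionary representation and uses a made-up weight vector `w` (not derived from BIC or any real score), lists all DAGs on three nodes, collects their distinct characteristic imsets, and maximizes the linear objective $\langle w, c\rangle$ over them. All function names are illustrative.

```python
from itertools import combinations, product

def imset(parents):
    """Support of the characteristic imset of a DAG given as node -> parent set."""
    return {frozenset(T) | {i}
            for i, pa in parents.items()
            for k in range(1, len(pa) + 1)
            for T in combinations(sorted(pa), k)}

def is_acyclic(parents):
    """Repeatedly peel off nodes that have no parents among the remaining nodes."""
    remaining = set(parents)
    while remaining:
        sources = {v for v in remaining if not (parents[v] & remaining)}
        if not sources:
            return False
        remaining -= sources
    return True

def all_dags(p):
    """Yield every DAG on nodes 0..p-1 (only feasible for tiny p)."""
    pairs = list(combinations(range(p), 2))
    # Each pair is absent (0), oriented i -> j (1), or oriented j -> i (2).
    for states in product((0, 1, 2), repeat=len(pairs)):
        parents = {v: set() for v in range(p)}
        for (i, j), s in zip(pairs, states):
            if s == 1:
                parents[j].add(i)
            elif s == 2:
                parents[i].add(j)
        if is_acyclic(parents):
            yield parents

dags = list(all_dags(3))
vertices = {frozenset(imset(g)) for g in dags}  # one entry per MEC

# Toy weights (an assumption for illustration only).
w = {frozenset({0, 1}): -1.0, frozenset({0, 2}): 1.0,
     frozenset({1, 2}): 1.0, frozenset({0, 1, 2}): 1.0}
best = max(vertices, key=lambda c: sum(w[S] for S in c))
```

On three nodes this gives 25 DAGs and 11 distinct imsets, and the toy objective is maximized by the imset of the collider $0 \rightarrow 2 \leftarrow 1$. Exhaustive enumeration grows super-exponentially in $p$, which is why a walk along edges of the polytope is used instead.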

Face restrictions: For any skeleton $G \subset K$ (undirected graphs), the restricted polytope

$\operatorname{CIM}_G = \operatorname{conv}\{c_H : H \text{ a DAG with skeleton in } [G,K]\}$

is a face of $\operatorname{CIM}_p$ , allowing skeleton constraints—such as from conditional independence (CI) tests—to reduce the search space to a lower-dimensional face.

Edge moves, or adjacency steps within the polytope, correspond to two fundamental move classes:

Edge-pairs (single edge addition/deletion in the skeleton): Each such operation corresponds to an adjacency (edge) in $\operatorname{CIM}_p$ .
Turn-pairs (single valid edge reversal): Reversing an edge $i \rightarrow j$ so that the result is still acyclic but belongs to a different MEC also corresponds to an adjacency of the polytope.

These move classes subsume and strictly generalize the three phase moves (“forward,” “backward,” “turn”) of GES/GIES and the hill climb steps of MMHC, as shown via the polytope geometry (Linusson et al., 2021).

2. Algorithmic Description

The Greedy CIM algorithm operates as a two-phase simplex-type walk on $\operatorname{CIM}_p$ , alternating between edge and turn phases. The state is the current characteristic imset $c$ for DAG $G$ ; $w$ encodes the score weights.

Edge phase: Enumerate and evaluate all edge-pair moves (additions/deletions of edges) to adjacent vertices (MECs) in the polytope. Select the best move that strictly improves the score and update $c \leftarrow c'$ .
Turn phase: Enumerate and evaluate all turn-pair moves (edge reversals) to adjacent vertices via the identified edge classes, again selecting the best strictly improving move.

The algorithm alternates these phases, performing as many improving moves as possible within each phase, and terminates when neither phase yields any improvement. The formal selection criterion per iteration is:

$c' = \arg\max_{c' \in \operatorname{Nbr}(c)} \langle w, c' \rangle \quad \text{with } \langle w, c'\rangle > \langle w, c\rangle,$

where $\operatorname{Nbr}(c)$ is the set of vertices adjacent to $c$ along known “edge-pair” or “turn-pair” classes.

Pseudocode:

initialize G ← empty DAG; c ← c_G
repeat
    old_c ← c
    // Edge phase: single-edge additions and deletions
    repeat
        bestInc ← 0; bestMove ← none
        for each unordered pair {i, j} with i ≠ j
            for each subset S^* ⊆ ne_G(i)
                if adding j→i (or deleting an existing edge between i and j) produces a valid DAG H and {G, H} is an edge-pair
                    Δ ← ⟨w, c_H⟩ − ⟨w, c_G⟩
                    if Δ > bestInc then bestInc ← Δ; bestMove ← H
        if bestMove ≠ none then G ← bestMove; c ← c_G
    until bestMove = none
    // Turn phase: single-edge reversals
    repeat
        bestInc ← 0; bestMove ← none
        for each edge i→j in G
            for partitions S_i, S_j of neighborhoods
                if reversing i→j produces a valid DAG H and {G, H} is a turn-pair
                    Δ ← ⟨w, c_H⟩ − ⟨w, c_G⟩
                    if Δ > bestInc then bestInc ← Δ; bestMove ← H
        if bestMove ≠ none then G ← bestMove; c ← c_G
    until bestMove = none
until c = old_c
return c
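
A runnable toy version of this loop might look as follows. It is a simplified sketch, not the algorithm from the source: plain single-edge additions, deletions, and reversals on one DAG representative stand in for the edge-pair and turn-pair classes (the subset conditions on $S^*$, $S_i$, $S_j$ are not reproduced), and the weight vector `w` is invented for the example rather than derived from BIC. All function names are hypothetical.

```python
from itertools import combinations

def imset(parents):
    """Support of the characteristic imset of a DAG given as node -> parent set."""
    return {frozenset(T) | {i}
            for i, pa in parents.items()
            for k in range(1, len(pa) + 1)
            for T in combinations(sorted(pa), k)}

def is_acyclic(parents):
    remaining = set(parents)
    while remaining:
        sources = {v for v in remaining if not (parents[v] & remaining)}
        if not sources:
            return False
        remaining -= sources
    return True

def score(w, parents):
    """Linear objective <w, c_G>, with missing weights treated as 0."""
    return sum(w.get(S, 0.0) for S in imset(parents))

def copy_graph(parents):
    return {v: set(pa) for v, pa in parents.items()}

def edge_moves(G):
    """DAGs obtained from G by adding or deleting a single edge i -> j."""
    for i in sorted(G):
        for j in sorted(G):
            if i == j or j in G[i]:
                continue  # skip self-loops; j -> i is handled with roles swapped
            H = copy_graph(G)
            if i in G[j]:
                H[j].discard(i)   # delete i -> j
            else:
                H[j].add(i)       # add i -> j
            if is_acyclic(H):
                yield H

def turn_moves(G):
    """DAGs obtained from G by reversing a single edge i -> j."""
    for j in sorted(G):
        for i in sorted(G[j]):
            H = copy_graph(G)
            H[j].discard(i)
            H[i].add(j)
            if is_acyclic(H):
                yield H

def run_phase(w, G, moves):
    """Take the best strictly improving move until none is left."""
    while True:
        best, best_gain = None, 0.0
        for H in moves(G):
            gain = score(w, H) - score(w, G)
            if gain > best_gain:
                best, best_gain = H, gain
        if best is None:
            return G
        G = best

def greedy_walk(w, p):
    G = {v: set() for v in range(p)}  # start from the empty DAG
    while True:
        before = imset(G)
        G = run_phase(w, G, edge_moves)
        G = run_phase(w, G, turn_moves)
        if imset(G) == before:
            return G

# Toy weights favouring the collider 0 -> 2 <- 1 (illustration only).
w = {frozenset({0, 1}): -1.0, frozenset({0, 2}): 1.0,
     frozenset({1, 2}): 1.0, frozenset({0, 1, 2}): 1.0}
result = greedy_walk(w, 3)
```

With these weights the walk adds $0 \rightarrow 2$, then $1 \rightarrow 2$, and stops at the collider. Because this sketch moves between individual DAGs rather than MECs, it can stall in places where a walk over equivalence classes would still have an improving neighbour.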

Complexity per iteration is $O(p^2 2^d p)$ in the edge phase and $O(p^2 2^{d_i+d_j} p)$ in the turn phase, with $d$ denoting neighborhood size; the number of steps is at most $O(p^2)$ since each strictly increases the finite-precision score.

3. Skeletal Greedy CIM and Hybridization

The Skeletal Greedy CIM algorithm restricts the search to a face of $\operatorname{CIM}_p$ determined by a skeleton $\hat{G}$ estimated via CI tests (e.g., using the PC algorithm). All greedy edge and turn moves are constrained to respect $\hat{G}$ ; i.e., no new edges are added outside $\hat{G}$ . Typically, only turn phases are performed, and the initial orientation can be any DAG matching $\hat{G}$ .

Skeletal Greedy CIM thus directly generalizes the hybrid methodology of MMHC (learn skeleton then hill-climb), but as a geometric walk on the skeleton-restricted face $\operatorname{CIM}_{\hat{G}}$ . This reduces computational complexity and search space size.

Hybrid strategy:

Estimate skeleton $\hat{G}$ via CI tests.
Start from any orientation $c_{G_0}$ with skeleton $\hat{G}$ .
Greedily perform turn-phase moves (edge reversals) allowed by $\hat{G}$ .
Return when no improvement is possible.
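
The hybrid strategy can be sketched in a few lines, under strong simplifying assumptions: the skeleton is taken as given (no CI tests are run here), single-edge reversals on one DAG representative stand in for turn-pairs, and the weight vector `w` is a toy example rather than a BIC-derived one. The helper names (`orient`, `skeleton_of`, `skeletal_greedy`) are hypothetical.

```python
from itertools import combinations

def imset(parents):
    """Support of the characteristic imset of a DAG given as node -> parent set."""
    return {frozenset(T) | {i}
            for i, pa in parents.items()
            for k in range(1, len(pa) + 1)
            for T in combinations(sorted(pa), k)}

def is_acyclic(parents):
    remaining = set(parents)
    while remaining:
        sources = {v for v in remaining if not (parents[v] & remaining)}
        if not sources:
            return False
        remaining -= sources
    return True

def score(w, parents):
    return sum(w.get(S, 0.0) for S in imset(parents))

def orient(skeleton, order):
    """Orient each undirected edge from the earlier to the later node in `order`."""
    rank = {v: k for k, v in enumerate(order)}
    parents = {v: set() for v in order}
    for a, b in skeleton:
        src, dst = (a, b) if rank[a] < rank[b] else (b, a)
        parents[dst].add(src)
    return parents

def skeleton_of(parents):
    return {frozenset((i, j)) for j, pa in parents.items() for i in pa}

def skeletal_greedy(w, G):
    """Turn phase only: reversals never change the skeleton."""
    while True:
        best, best_gain = None, 0.0
        for j in sorted(G):
            for i in sorted(G[j]):
                H = {v: set(pa) for v, pa in G.items()}
                H[j].discard(i)
                H[i].add(j)
                if not is_acyclic(H):
                    continue
                gain = score(w, H) - score(w, G)
                if gain > best_gain:
                    best, best_gain = H, gain
        if best is None:
            return G
        G = best

# Assumed skeleton 0 - 2 - 1 and toy weights favouring the collider at node 2.
skeleton = [(0, 2), (1, 2)]
w = {frozenset({0, 2}): 1.0, frozenset({1, 2}): 1.0, frozenset({0, 1, 2}): 1.0}
start = orient(skeleton, order=[0, 2, 1])   # chain 0 -> 2 -> 1
result = skeletal_greedy(w, start)
```

Starting from the chain, a single reversal of $2 \rightarrow 1$ reaches the collider $0 \rightarrow 2 \leftarrow 1$ and the walk stops. Starting instead from the fork $0 \leftarrow 2 \rightarrow 1$, every single reversal stays in the same MEC, so this DAG-level simplification would not move; that limitation belongs to the sketch, not to the method described above.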

4. Theoretical Properties and Relations to Prior Methods

Every step of Greedy CIM or Skeletal Greedy CIM increases the score (e.g., BIC), and since the score is discrete, both algorithms terminate in $O(p^2)$ steps. If all edges of $\operatorname{CIM}_p$ were known and searchable, a greedy simplex walk would find the global optimum; in practice, only edge- and turn-pair classes are used, guaranteeing local optimality within those adjacency graphs.

GES (Greedy Equivalence Search) is realized as a greedy edge-walk using edge-pair and turn-pair steps, alternating forward and backward edge addition/deletion, with a final turn phase. GIES extends this to the interventional setting. MMHC fits the Skeletal Greedy CIM template, since it first imposes skeleton restrictions and then performs a greedy climb. These methods are thus special cases of greedy walks along $\operatorname{CIM}_p$ or its faces.

The main theorem states that GES, GIES, MMHC, and Greedy SP can all be realized as greedy edge-walks on $\operatorname{CIM}_p$ or its faces, establishing the geometric nature of greedy causal discovery (Linusson et al., 2021).

5. Empirical Evaluation

For $p=8$ nodes, synthetic data were generated from linear-Gaussian SEMs on random Erdős–Rényi DAGs (expected degree $d \in [0.5, 7]$ ) with 10,000 i.i.d. samples. The benchmarking metrics were the recovery rate (fraction of true MECs recovered) and the structural Hamming distance (SHD).
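
The two metrics can be illustrated with a short sketch. It assumes one common SHD convention, computed directly on DAGs, in which each node pair whose edge status differs (absent, $i \rightarrow j$, or $j \rightarrow i$) counts once; the source does not say which convention was used, and SHD is often computed on CPDAGs instead. The function names and toy graphs are hypothetical.

```python
from itertools import combinations

def imset(parents):
    """Support of the characteristic imset of a DAG given as node -> parent set."""
    return {frozenset(T) | {i}
            for i, pa in parents.items()
            for k in range(1, len(pa) + 1)
            for T in combinations(sorted(pa), k)}

def edge_status(parents, i, j):
    if i in parents[j]:
        return "i->j"
    if j in parents[i]:
        return "j->i"
    return "absent"

def shd(g, h):
    """Number of node pairs on which the two DAGs disagree (assumed convention)."""
    return sum(edge_status(g, i, j) != edge_status(h, i, j)
               for i, j in combinations(sorted(g), 2))

def recovery_rate(truths, estimates):
    """Fraction of runs whose estimate lies in the true MEC (equal imsets)."""
    hits = sum(imset(t) == imset(e) for t, e in zip(truths, estimates))
    return hits / len(truths)

# Toy graphs on {0, 1, 2} (illustration only).
collider = {0: set(), 1: set(), 2: {0, 1}}   # 0 -> 2 <- 1
chain = {0: set(), 1: {2}, 2: {0}}           # 0 -> 2 -> 1
fork = {0: {2}, 1: {2}, 2: set()}            # 0 <- 2 -> 1
empty = {0: set(), 1: set(), 2: set()}

d_chain = shd(collider, chain)
d_empty = shd(collider, empty)
rate = recovery_rate([collider, chain], [chain, fork])
```

Here the chain differs from the collider on one node pair and the empty graph on two. The recovery rate is 0.5: the chain is not in the collider's MEC, while the fork is in the chain's MEC.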

Results demonstrate:

Skeletal Greedy CIM outperformed PC, MMHC, and Greedy SP for all sparsity levels, and, when the skeleton was correctly identified, almost always recovered the correct MEC.
Breadth-first Greedy CIM matched GIES in recovery and SHD; depth-first was competitive with GES but slightly inferior to GIES.
Runtime: Skeletal Greedy CIM executes in $\approx 1$ s/model; Greedy CIM in $\approx 10$ s; both comparable to GES/GIES for this regime.

6. Impact and Significance

Greedy CIM provides a unifying geometric framework for score-based and hybrid causal network discovery, clarifying the relationships among GES, GIES, and MMHC algorithms through explicit polyhedral structure. By allowing for general edge-walk classes (edge-pair and turn-pair), Greedy CIM strictly generalizes move sets available to legacy algorithms, with empirical results confirming performance gains, particularly for hybrids. Its geometric perspective suggests new directions for designing causal structure learning algorithms as polytope walks and motivates further investigation into the facets and adjacency structures of characteristic-imset polytopes for even higher-order generalizations (Linusson et al., 2021).


References

Linusson et al. (2021). Greedy Causal Discovery is Geometric.
