
Fast-Block-Select Algorithm

Updated 19 January 2026
  • The paper introduces a heuristic that rapidly identifies near-optimal shared block selections to maximize joint log-likelihood in multi-graph stochastic block models.
  • It employs a greedy, injective block selection procedure to efficiently navigate exponential candidate spaces while circumventing NP-hard ILP formulations.
  • Empirical results show runtime dropping from over 8 hours with ILP to under one second, with ARI scores approaching those of the exact solution in practical settings.

The Fast-Block-Select algorithm refers to a class of greedy block selection procedures that rapidly identify optimal or near-optimal block assignments in large-scale partitioning and inference problems. Most prominently, in the context of shared stochastic block modeling (SSBM) over multiple graphs, Fast-Block-Select provides a scalable heuristic for selecting $s$ shared blocks, circumventing the computational hardness of integer linear programming (ILP) formulations. This paradigm has emerged as a practical response to the NP-hardness and inapproximability of “shared block detection” in multi-graph SBMs (Kumpulainen et al., 2024). The term is occasionally overloaded in the literature, for instance in network clustering or compressed bitmap implementations (Grabowski et al., 2016), but the defining hallmark is the rapid, injective, greedy construction of block assignments to optimize a statistical objective.

1. Algorithmic Objective and Formal Setup

In the SSBM setting, the goal is to select $s$ block vectors $S = \{r^{(1)}, \ldots, r^{(s)}\}$ across $n$ input graphs, each partitioned into $B_k$ blocks, in order to maximize the joint log-likelihood $\mathit{LLH} = \sum_{k=1}^n \log P(G_k \mid \Theta_k, \mathbf{b}_k)$ subject to injective mapping constraints between graphs and a shared parameterization for the selected blocks. Each candidate $r = (r_1, \ldots, r_n)$ is an element of the product space $\mathcal{T} = [B_1] \times \cdots \times [B_n]$. Block-pair parameters are re-estimated in closed form for both private and shared assignments, yielding respective log-likelihood scores $U^k_{ij}$ for private blocks and $Q_{rt}$ for shared blocks, based on empirical edge and non-edge counts $C^k_{ij}, F^k_{ij}$. The objective is equivalently an NP-hard combinatorial maximization over valid $s$-sized injective subsets $S \subseteq \mathcal{T}$ (Kumpulainen et al., 2024).
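To make the candidate space and the injectivity constraint concrete, here is a minimal Python sketch. The function names `candidate_space` and `is_injective` are illustrative, not taken from the paper, and blocks are assumed to be indexed from 0.

```python
from itertools import product

def candidate_space(block_counts):
    """Enumerate T = [B_1] x ... x [B_n] as tuples of 0-based block indices."""
    return list(product(*(range(b) for b in block_counts)))

def is_injective(selection):
    """True if, within every graph, no block index is used by two selected vectors."""
    n = len(selection[0]) if selection else 0
    return all(
        len({r[k] for r in selection}) == len(selection) for k in range(n)
    )

T = candidate_space([2, 3])               # two graphs with 2 and 3 blocks
print(len(T))                             # 6
print(is_injective([(0, 0), (1, 2)]))     # True
print(is_injective([(0, 0), (0, 1)]))     # False: graph 0 reuses block 0
```

The size of the candidate list is the product of the block counts, which is why full enumeration is only feasible when $n$ and $B_k$ are small.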

2. Heuristic Procedure and Pseudocode

Fast-Block-Select employs a greedy, iterative construction of $S$. At each step, the procedure scans the candidate set $T \subseteq \mathcal{T}$ to find the vector $r^*$ whose addition induces the maximal increase (or minimal drop) in log-likelihood, computed via incremental updates to $Q_{rt}$ and removal or merging of private $U^k_{ij}$ contributions. Candidate vectors sharing any block index with vectors already in $S$ are eliminated to maintain injectivity. The process repeats until $|S| = s$. The annotated pseudocode is as follows (Kumpulainen et al., 2024):

Procedure Fast-Block-Select(s, {G_k, b_k}_{k=1}^n):
    Input: s, graphs G_k, block partitions b_k
    Precompute all per-block and per-pair scores
    Initialize S = ∅, T = all block-vectors in product space
    while |S| < s:
        for r in T:
            Δ(r) = net increase in objective if r is appended to S
        select r* with maximal Δ(r)
        S ← S ∪ {r*}
        remove from T any candidate sharing block indices with r*
    return S

Complexity per iteration is dominated by candidate scans ($O(n|S|)$ per candidate) and injective pruning ($O(n \sum_k (B_k - 1))$) (Kumpulainen et al., 2024). No guarantee on approximation ratio is provided; the procedure is entirely heuristic.
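The pseudocode above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the objective change is abstracted into a hypothetical callable `gain(S, r)` supplied by the caller, blocks are 0-indexed, and the score table in the usage example is invented toy data.

```python
from itertools import product

def fast_block_select(s, block_counts, gain):
    """Greedy sketch: choose up to s injective block vectors.

    gain(S, r) is a hypothetical callable returning the change in the
    objective if candidate r is appended to the current selection S.
    """
    S = []
    T = list(product(*(range(b) for b in block_counts)))
    while len(S) < s and T:
        # Scan all remaining candidates and keep the one with the largest gain.
        best = max(T, key=lambda r: gain(S, r))
        S.append(best)
        # Injective pruning: drop candidates sharing any block index with best.
        T = [r for r in T if all(rk != bk for rk, bk in zip(r, best))]
    return S

# Toy usage: gains are read from a fixed, made-up table.
toy_gain = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 0.5}
selected = fast_block_select(2, [2, 2], lambda S, r: toy_gain[r])
print(selected)  # [(0, 1), (1, 0)]
```

The loop also stops when the candidate set is exhausted, which happens once $s$ exceeds the smallest $B_k$. Ties are broken by enumeration order here, a choice the source does not specify.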

3. Mathematical Structure and Underlying Scores

The algorithm’s efficiency and correctness depend on precise calculation of the relevant scores:

  • For each graph $k$, private block-pair contribution:

$$U^k_{ij} = C^k_{ij} \log \theta^k_{ij} + F^k_{ij} \log(1-\theta^k_{ij}), \quad \theta^k_{ij} = \frac{C^k_{ij}}{C^k_{ij} + F^k_{ij}}$$

  • For shared blocks $r, t$:

$$Q_{rt} = \sum_{k=1}^n \left[ C^k_{r_k t_k} \log \theta_{rt} + F^k_{r_k t_k} \log(1-\theta_{rt}) \right], \quad \theta_{rt} = \frac{\sum_k C^k_{r_k t_k}}{\sum_k (C^k_{r_k t_k} + F^k_{r_k t_k})}$$
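The two scores above translate directly into code. The sketch below assumes the counts are stored as nested lists `C[k][i][j]` and `F[k][i][j]` (a layout chosen here for illustration) and uses the convention $0 \log 0 = 0$ for empty or saturated block pairs, which the source does not state explicitly.

```python
import math

def bernoulli_llh(c, f):
    """c*log(theta) + f*log(1 - theta) at theta = c / (c + f), with 0*log(0) := 0."""
    total = c + f
    if total == 0:
        return 0.0
    theta = c / total
    score = 0.0
    if c > 0:
        score += c * math.log(theta)
    if f > 0:
        score += f * math.log(1.0 - theta)
    return score

def private_score(C, F, k, i, j):
    """U^k_ij for block pair (i, j) of graph k."""
    return bernoulli_llh(C[k][i][j], F[k][i][j])

def shared_score(C, F, r, t):
    """Q_rt: with a single pooled theta_rt, the sum over graphs collapses
    to one Bernoulli log-likelihood on the pooled counts."""
    c = sum(C[k][r[k]][t[k]] for k in range(len(C)))
    f = sum(F[k][r[k]][t[k]] for k in range(len(F)))
    return bernoulli_llh(c, f)

# Toy counts for two graphs, each with a single block.
C = [[[1]], [[1]]]
F = [[[1]], [[1]]]
print(private_score(C, F, 0, 0, 0))      # 2 * log(0.5)
print(shared_score(C, F, (0, 0), (0, 0)))  # 4 * log(0.5)
```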

The candidate pool $\mathcal{T}$ grows exponentially with $n$ and $B_k$, but in practice $n$ is small and $B_k$ moderate, allowing storage and precomputation of $C^k_{ij}, F^k_{ij}, U^k_{ij}, Q_{rt}$ in compact tables. This enables real-time greedy selection in large graphs.
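As an illustration of the precomputation step, the following sketch builds the edge and non-edge count tables for one graph. It assumes an undirected simple graph without self-loops, given as an edge list with one block label per node, and fills only the upper triangle ($i \le j$); these conventions and the function name `count_tables` are assumptions made here, not details from the source.

```python
from collections import Counter

def count_tables(edges, labels, num_blocks):
    """Return (C, F): edge and non-edge counts per block pair (i <= j)."""
    sizes = Counter(labels)
    C = [[0] * num_blocks for _ in range(num_blocks)]
    for u, v in edges:
        i, j = sorted((labels[u], labels[v]))
        C[i][j] += 1
    F = [[0] * num_blocks for _ in range(num_blocks)]
    for i in range(num_blocks):
        for j in range(i, num_blocks):
            if i == j:
                pairs = sizes[i] * (sizes[i] - 1) // 2  # node pairs inside block i
            else:
                pairs = sizes[i] * sizes[j]             # node pairs across blocks
            F[i][j] = pairs - C[i][j]
    return C, F

# Four nodes in two blocks: nodes 0,1 in block 0 and nodes 2,3 in block 1.
C1, F1 = count_tables([(0, 1), (0, 2), (2, 3)], [0, 0, 1, 1], 2)
print(C1)  # [[1, 1], [0, 1]]
print(F1)  # [[0, 3], [0, 0]]
```

Each table has $B_k^2$ entries per graph, so storage stays small even when the graphs themselves are large.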

4. Complexity, Theoretical Properties, and Limitations

The total running time is $O(n s^2 \prod_k B_k)$ for $s$ selections, with the candidate pool shrinking after each selection. The underlying optimization is NP-hard; Theorem 1 in (Kumpulainen et al., 2024) proves inapproximability to any constant factor. Fast-Block-Select therefore has no worst-case optimality guarantees: suboptimal block selection is theoretically possible, although its empirical efficacy has been validated in practical settings.
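The shrinking candidate pool can be quantified: after $j$ injective selections, each graph has $j$ blocks already used, so $\prod_k (B_k - j)$ candidates remain. The short sketch below computes this count; the function name is illustrative.

```python
import math

def remaining_candidates(block_counts, picked):
    """Candidates left after `picked` injective selections: prod_k (B_k - picked)."""
    return math.prod(max(b - picked, 0) for b in block_counts)

for j in range(3):
    print(j, remaining_candidates([20, 20], j))
# 0 400
# 1 361
# 2 324
```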

5. Empirical Results and Practical Utility

Experiments on synthetic benchmarks and real-world graphs (e.g., large Wikipedia link networks, $B_k=20$, $s=2$) demonstrate that Fast-Block-Select runs orders of magnitude faster than exact ILP methods (under one second versus over 8 hours), while typically approaching the assignment quality of the exact solution as measured by Adjusted Rand Index (ARI: 0.75–0.90 versus 0.95–1.0 for ILP). For synthetic “planted” SBMs, greedy selection consistently outperforms random assignment (ARI 0.2–0.5) (Kumpulainen et al., 2024).

6. Illustrative Example

Consider two graphs ($n=2$), each with two blocks. With $s=1$, the algorithm computes the gain for each candidate shared block:

  • Private scores: $U^1_{AA} = -2.501$, $U^2_{XX} = -2.616$
  • Shared score: $Q_{(A,X),(A,X)} = -3.767$
  • Log-likelihood gain from merging $A$ and $X$: $-3.767 - (-5.117) = +1.350$, where $-5.117 = -2.501 + (-2.616)$ is the sum of the two private scores

The highest gain is chosen, conflicting candidates are removed, and selection continues for larger $s$ (Kumpulainen et al., 2024).
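The arithmetic of this example can be checked in a few lines. The helper `merge_gain` is a name introduced here for illustration; it computes the shared score minus the sum of the private scores it replaces, using the numbers quoted above.

```python
def merge_gain(q_shared, private_scores):
    """Gain from replacing private block scores with one shared score."""
    return q_shared - sum(private_scores)

gain = merge_gain(-3.767, [-2.501, -2.616])
print(round(gain, 3))  # 1.35
```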

While the Fast-Block-Select term is closely associated with SSBM selection, analogous “fast block select” procedures appear in hierarchical block model inference (Park et al., 2017), compressed bitmap search (Grabowski et al., 2016), and active set methods for $\ell_1$-regularized regression (Santis et al., 2014). In each, rapid identification and update of relevant blocks form a core computational strategy, though mathematical details and objectives differ. In SSBM, the concept is uniquely tied to maximizing multi-graph joint likelihood under block-sharing constraints.

Summary Table: Key Attributes of Fast-Block-Select

| Attribute | SSBM (shared SBM) | HSBM (hierarchical SBM) | Bitmap/block selection |
| --- | --- | --- | --- |
| Objective | Maximize joint log-likelihood with $s$ injective shared blocks | Maximize marginal likelihood, hierarchical assignment | Accelerate select/rank queries |
| Block selection mechanism | Greedy, injective vector selection | Dynamic programming + coordinate ascent | Precomputed block offsets |
| Complexity per step | $O(n s^2 \prod_k B_k)$ | $O(m \log K + nK)$ | $O(\ell/64)$ popcount ops |
| Guarantee | None (heuristic); NP-hard | Local convergence, Bayes pruning | Pareto-optimal space-time tradeoff |

The Fast-Block-Select paradigm is a practically validated heuristic for high-dimensional block assignment tasks in multi-graph and hierarchical models, offering tractable computation on dense, large-scale real-world data and staying empirically close to the optimum when exact solvers are computationally infeasible (Kumpulainen et al., 2024, Park et al., 2017, Grabowski et al., 2016, Santis et al., 2014).
