
Weighted Group Testing

Updated 1 February 2026
  • Weighted group testing is a framework that extends traditional group tests by integrating weighted designs, non-binary outcomes, and cost models to enhance detection efficiency.
  • It employs constant column weight designs, non-binary measurement models, and high-dimensional inference techniques to achieve significant improvements over classic binary methods.
  • Its practical applications include clinical trial design, genomic studies, and industrial testing, providing improved detection rates, power, and cost efficiency.

Weighted group tests generalize classical group testing by incorporating either weighted designs in the construction of test matrices, non-binary measurement outcomes reflecting item “loads,” or explicitly parametrized weights in hypothesis testing frameworks. These approaches are prominent in nonadaptive group testing theory, clinical trial design, high-dimensional inference, and applications requiring sensitivity to heterogeneous item properties or correlated test structures. Advances in weighted group testing have led to provable improvements in detection rates, power, and efficiency compared to traditional unweighted, binary methods. This entry surveys the main mathematical frameworks, methodologies, theoretical limits, and practical implications established in recent research.

1. Weighted Designs in Nonadaptive Group Testing

Classical nonadaptive group testing seeks to identify $K$ defectives among $n$ items using $T$ pooled binary tests such that each test is positive iff at least one defective is included. Weighted, or more precisely "constant column weight," designs fix an integer $L$ per item so that each column of the test matrix $X \in \{0,1\}^{T \times n}$ contains exactly $L$ ones, their locations selected uniformly at random with replacement over the $T$ tests. The key parametrization is $L = \nu T/K$, with $\nu = \ln 2$ maximizing the detection rate.

In contrast, the Bernoulli design draws $X_{t,i} \sim \mathrm{Bernoulli}(p)$ independently for all entries, introducing significant variance in column weights. Constant-weight matrices guarantee more uniform test coverage, minimizing the probability that a nondefective item is entirely "uncovered" (i.e., not appearing in any negative test), which is critical for practical algorithms such as COMP (Combinatorial Orthogonal Matching Pursuit).

The achievable rate for COMP under constant-weight designs is

$$R_{\mathrm{COMP}} = (\ln 2)(1-\theta) \approx 0.693(1-\theta),$$

where $K = \Theta(n^\theta)$, $\theta \in (0,1)$. This rate is a $\sim 31\%$ improvement over the Bernoulli design, which is limited to

$$R_{\mathrm{COMP,Bern}} \approx \frac{1}{e \ln 2}(1-\theta) \approx 0.531(1-\theta).$$

Algorithm-independent upper bounds show similar improvements, especially in the dense regime ($\theta$ large). For all practical decoders (COMP, DD, SCOMP), constant-weight designs uniformly outperform Bernoulli constructions in both theory and simulation (Aldridge et al., 2016).
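The COMP decoder and the two designs above can be compared in a short simulation. The sketch below is illustrative, not taken from the cited paper; the parameter values ($n$, $K$, $T$) and the Bernoulli rate $p = \nu/K$ are stand-in choices.

```python
# Sketch: COMP decoding under constant-column-weight vs Bernoulli designs.
# Parameters are illustrative, not from Aldridge et al.
import numpy as np

rng = np.random.default_rng(0)

def constant_weight_design(T, n, L, rng):
    """Each column gets L ones placed uniformly at random with replacement."""
    X = np.zeros((T, n), dtype=bool)
    for i in range(n):
        X[rng.integers(0, T, size=L), i] = True
    return X

def comp_decode(X, y):
    """COMP: declare non-defective every item appearing in a negative test."""
    covered_by_negative = X[~y].any(axis=0)
    return ~covered_by_negative

n, K, T = 2000, 10, 200
defective = np.zeros(n, dtype=bool)
defective[rng.choice(n, K, replace=False)] = True

L = int(np.log(2) * T / K)                    # nu = ln 2 maximizes the rate
X_cw = constant_weight_design(T, n, L, rng)
X_bern = rng.random((T, n)) < np.log(2) / K   # Bernoulli design, p ~ nu/K

for name, X in [("constant-weight", X_cw), ("Bernoulli", X_bern)]:
    y = (X & defective).any(axis=1)  # a test is positive iff it hits a defective
    est = comp_decode(X, y)
    print(name, "false positives:", int(est.sum() - K))
```

COMP never produces false negatives (a defective item never appears in a negative test), so the estimated set always contains the true one; the designs differ only in how many nondefectives escape coverage.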

2. Weighted (Non-Binary) Measurement Models

Recent models extend binary group testing to non-binary test outcomes, reflecting "loads" or weights, e.g., viral load in RT-qPCR SARS-CoV-2 testing. In this framework, each item $i$ has $x_i = \xi_i Z_i$, where $\xi_i \in \{0,1\}$ is the defective indicator and $Z_i \in (0,1]$ its "intrinsic load." Group tests return

$$y_S = \max_{i \in S} x_i,$$

over pool $S$, capturing the maximal load in each pool. Non-adaptive pooling can be constructed on a toroidal $n \times n$ grid with $L$ pools per item. Decoding exploits the multiplicity of identical loads across pools: a local rule flags item $(i,j)$ as defective if its minimal measurement across pools is repeated at least twice and strictly positive.

Efficiency is dramatically improved over binary schemes. For prevalence $p$, the optimal asymptotic test-efficiency is $E^*(p) = \Theta(p)$ rather than $O(p|\ln p|)$. In practical terms, weighted measurement frameworks reduce the number of required tests for exact recovery, particularly for low $p$, and facilitate extensions to continuous loads and multi-level measurements (Joly et al., 2020).
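A stripped-down version of the grid scheme can be sketched with $L = 2$ pools per item (its row and its column); the actual construction uses more pools on a toroidal grid, so this is a simplification with illustrative parameters, and it can miss defectives that are not the maximum of one of their pools.

```python
# Simplified sketch of max-load grid pooling: items on an n x n grid, each in
# L = 2 pools (row and column); a pool reports the maximum load it contains.
import numpy as np

rng = np.random.default_rng(1)

n, p = 30, 0.03                  # grid side and prevalence (illustrative)
xi = rng.random((n, n)) < p      # defective indicators xi_{ij}
Z = rng.random((n, n))           # intrinsic loads Z_{ij} in (0, 1]
x = np.where(xi, Z, 0.0)         # x_{ij} = xi_{ij} * Z_{ij}

y_row = x.max(axis=1)            # one pooled measurement per row
y_col = x.max(axis=0)            # one pooled measurement per column

# Local rule with L = 2: the minimal measurement across an item's pools is
# "repeated at least twice and strictly positive" exactly when the row and
# column measurements coincide and are positive.
est = (y_row[:, None] > 0) & (y_row[:, None] == y_col[None, :])
print("true defectives:", int(xi.sum()), "declared:", int(est.sum()))
```

With continuous loads, two distinct items share the same value with probability zero, so every declared item is (almost surely) genuinely defective; the repeated-load coincidence is what makes the non-binary outcome informative.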

3. Weighted Group Testing in High-Dimensional Inference

Weighted group testing in high-dimensional statistics focuses on inference for functional quantities of the form $Q_w = \beta_G^\top W \beta_G$, where $G \subset \{1,\ldots,p\}$ indexes groups in a regression parameter vector and $W$ is a positive-definite weight matrix (e.g., identity, covariance, or user-specified). The framework supports hierarchical testing, interaction analysis, and assessment of variance explained by groups ("local heritability").

Debiased estimators $\hat Q_w$ are constructed via convex quadratic programming with precise control of bias and asymptotic variance. The resulting test statistics are asymptotically Gaussian with variance $V_w^0$, supporting valid one- and two-sided confidence intervals and efficient detection power under mild regularity and sparsity conditions. The methodology is robust for large groups, avoiding degradation in inference quality even in dense or collinear settings (Guo et al., 2019).
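The functional itself is a simple quadratic form; the sketch below evaluates it for the two weight choices named above, with illustrative coefficients and group indices, and omits the debiasing step entirely.

```python
# Sketch: the group functional Q_w = beta_G^T W beta_G for two weight choices.
# Coefficients and group are illustrative; the debiasing QP step is omitted.
import numpy as np

rng = np.random.default_rng(2)
p = 50
beta = np.zeros(p)
beta[:5] = [1.0, -0.5, 0.8, 0.0, 0.3]
G = np.array([0, 1, 2, 3, 4])            # group indices (illustrative)
beta_G = beta[G]

# W = identity: Q_w is the squared Euclidean norm of the group coefficients.
Q_identity = beta_G @ np.eye(len(G)) @ beta_G

# W = Sigma_{G,G}: Q_w measures variance explained ("local heritability").
X = rng.standard_normal((500, p))        # design matrix stand-in
Sigma_hat = np.cov(X, rowvar=False)
Q_sigma = beta_G @ Sigma_hat[np.ix_(G, G)] @ beta_G

print("Q (identity):", Q_identity, " Q (covariance):", Q_sigma)
```

Both choices are positive whenever the group carries any signal, which is why a one-sided test of $Q_w = 0$ against $Q_w > 0$ is the natural group-level hypothesis.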

4. Weighted Parametric Group Sequential Designs (WPGSD) in Multiple Hypothesis Testing

Weighted group tests also arise in clinical trial designs with correlated endpoints or populations. The WPGSD (Weighted Parametric Group Sequential Design) methodology assigns weights to multiple hypotheses and leverages known correlations (via a full covariance structure on sequential test statistics) to optimally control familywise error rates:

$$\mathrm{FWER} = \Pr\left\{\bigcup_{J \subseteq I} \{H_J\ \text{rejected}\}\right\} \le \alpha.$$

Weights are reallocated upon test rejections according to specified transition matrices. Sequential monitoring and error-spending functions permit power gains and sample size reductions compared to Bonferroni-based designs. MVN (multivariate normal) integration is used to compute exact critical boundaries at each interim analysis. Simulation studies across diverse scenarios demonstrate uniform improvements in statistical power (2–3% absolute) and efficient sample usage (Anderson et al., 2021).
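The gain from MVN integration can be illustrated with two correlated test statistics: the exact critical boundary solving the FWER equation is smaller (less conservative) than the Bonferroni value. The scenario below is a toy two-hypothesis, single-stage case with assumed correlation $\rho = 0.7$, not a full WPGSD computation.

```python
# Sketch: exact critical value via bivariate-normal integration vs Bonferroni
# for two one-sided tests under the global null; scenario is illustrative.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

alpha, rho = 0.025, 0.7                  # overall one-sided level; known correlation
cov = np.array([[1.0, rho], [rho, 1.0]])

def fwer(c):
    # Pr(Z1 > c or Z2 > c) = 1 - Pr(Z1 <= c, Z2 <= c) under the global null
    return 1.0 - multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf([c, c])

c_bonf = norm.ppf(1 - alpha / 2)         # Bonferroni critical value
c_exact = brentq(lambda c: fwer(c) - alpha, 1.0, 4.0)
print("Bonferroni:", c_bonf, " exact:", c_exact)
```

The exact boundary spends the full error budget $\alpha$, whereas Bonferroni leaves correlation-dependent slack; in sequential designs this slack compounds across interim analyses, which is where the power and sample-size gains come from.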

5. Weighted Cost Models and Adaptive Algorithms

Group testing in settings where test costs are outcome-dependent is handled by explicitly modeling positive and negative test costs ($C_+$, $C_-$; cost ratio $r = C_+/C_-$). In hypergraph sampling, minimizing total or positive test cost dictates the choice between deterministic splitting (e.g., generalized binary search, SIGHT) and stochastic splitting (e.g., Random Chemistry, RC). Theoretically, the optimal pool size scales as $n^* \approx (1/p)\ln r$ for test prevalence $p$.

Deterministic algorithms minimize total tests and are favored for small $r$, while stochastic algorithms minimize positive tests and are superior for large $r$. Real-world applications include cascading-failure analysis in power systems and feature selection, where balancing test costs is crucial for scalable inference (Clarfeld et al., 2020).
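The pool-size scaling is easy to evaluate numerically. The expected-cost function below is a simplified first-stage model (one pooled test per group, costs split by outcome), introduced here for illustration rather than taken from the cited paper.

```python
# Sketch: outcome-dependent test costs. With prevalence p and cost ratio
# r = C_plus / C_minus, the optimal pool size scales as n* ~ (1/p) ln r.
import math

def optimal_pool_size(p, r):
    """Pool size from the asymptotic scaling n* ~ (1/p) ln r."""
    return max(1, round(math.log(r) / p))

def expected_cost_per_item(n_pool, p, c_minus, c_plus):
    """Simplified first-stage cost per item for pools of size n_pool:
    a pool is positive with probability 1 - (1 - p)^n_pool."""
    q_pos = 1 - (1 - p) ** n_pool
    return (q_pos * c_plus + (1 - q_pos) * c_minus) / n_pool

p, c_minus, c_plus = 0.01, 1.0, 20.0     # illustrative prevalence and costs
n_star = optimal_pool_size(p, c_plus / c_minus)
print("pool size:", n_star,
      "cost/item:", expected_cost_per_item(n_star, p, c_minus, c_plus))
```

As $r$ grows, positive tests dominate the bill and larger pools (or stochastic splitting) become attractive, matching the deterministic-vs-stochastic trade-off described above.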

6. Weighted Group $\chi^2$-Testing for Histograms

Weighted group tests are used for assessing homogeneity between weighted and/or unweighted histograms (e.g., simulation outputs vs. experiment). Gagunashvili's CHICOM framework supplies three test statistics depending on the type of weights (normalized vs. unnormalized) in the two samples. Statistics are constructed by omitting one bin at a time and minimizing the test statistic over unknown bin probabilities subject to normalizing constraints, with the final test statistic taken as the median over all omitted-bin schemes. Null distributions are $\chi^2_{m-1}$ or $\chi^2_{m-2}$, depending on the weighting mode.

Implementation via the CHICOM Fortran subroutine enables rigorous handling of weighted entries, accommodating both normalized and unnormalized weighting schemes and preserving correct degrees of freedom in the limit of large sample sizes (Gagunashvili, 2011).
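The basic ingredients can be sketched in Python: per-bin sums of weights and of squared weights give normalized bin contents and their variance estimates. The statistic below is a simplified homogeneity measure built from those ingredients, not the CHICOM statistic itself, which additionally minimizes over bin probabilities and takes a median over omitted-bin schemes.

```python
# Simplified homogeneity statistic for two weighted histograms -- an
# illustration of the ingredients, not Gagunashvili's CHICOM statistic.
import numpy as np

rng = np.random.default_rng(3)

def weighted_hist(samples, weights, edges):
    """Per-bin sums of weights and of squared weights (for variance)."""
    w, _ = np.histogram(samples, bins=edges, weights=weights)
    s, _ = np.histogram(samples, bins=edges, weights=weights ** 2)
    return w, s

edges = np.linspace(0.0, 1.0, 11)                    # m = 10 bins
a = rng.random(5000); wa = rng.uniform(0.5, 1.5, 5000)
b = rng.random(4000); wb = rng.uniform(0.5, 1.5, 4000)

w1, s1 = weighted_hist(a, wa, edges)
w2, s2 = weighted_hist(b, wb, edges)
W1, W2 = w1.sum(), w2.sum()

# Compare normalized bin contents; variances estimated from squared weights.
stat = np.sum((w1 / W1 - w2 / W2) ** 2 / (s1 / W1**2 + s2 / W2**2))
print("statistic:", stat)   # compare against a chi^2 reference, here ~9 d.o.f.
```

Tracking the squared weights per bin is the key step: for unweighted histograms it reduces to the bin count, recovering the classical Pearson-style variance.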

7. Practical Implications and Applications

Weighted group testing frameworks yield rigorous improvements in detection, efficiency, and inference:

  • In nonadaptive group testing, constant column weight designs are optimal for practical decoders, uniformly outperforming Bernoulli random designs.
  • Non-binary measurement models facilitate exact recovery in multiplex biological assays under typical laboratory constraints and noise, notably in viral load testing.
  • High-dimensional weighted group inference enables powerful and valid testing in structured regression and genomics settings, with theoretical guarantees extending to hierarchical testing regimes.
  • In clinical trials and multi-arm hypothesis testing, WPGSD methods exploit known correlations for exact FWER control and sample-size efficiency.
  • Adaptive algorithms optimized for weighted test costs directly address emerging computational constraints in industrial, biomedical, and engineering test systems.
  • Weighted $\chi^2$ testing rigorously compares experimental and simulation histograms with arbitrary weights, enhancing statistical validation in the physical sciences and engineering.

Weighted group testing thus comprises a diverse set of models, algorithms, and theoretical tools central to contemporary statistical, computational, and applied research, reflecting a mature and rapidly advancing domain.
