On a Near-Optimal & Efficient Algorithm for the Sparse Pooled Data Problem
Abstract: The pooled data problem asks to identify the unknown labels of a set of items from condensed measurements. More precisely, given $n$ items, assume that each item has a label in $\{0,1,\ldots,d\}$, encoded via the ground truth $\boldsymbol{\sigma}$. We call the pooled data problem sparse if the number of non-zero entries of $\boldsymbol{\sigma}$ scales as $k \sim n^{\theta}$ for $\theta \in (0,1)$. The information revealed about $\boldsymbol{\sigma}$ comes from pooled measurements, each indicating how many items of each label are contained in the pool. The most basic question is to design a pooling scheme that uses as few pools as possible while reconstructing $\boldsymbol{\sigma}$ with high probability. Variants of the problem and its combinatorial ramifications have been studied for at least 35 years. However, the study of the modern question of \emph{efficient} inference of the labels has suggested a statistical-to-computational gap of order $\log n$ between the minimum number of pools needed for information-theoretically possible inference and the number needed for efficient inference. In this article, we resolve the question of whether this $\log n$ gap is artificial or of a fundamental nature by designing an efficient algorithm, called \algoname, based on a novel pooling scheme that uses a number of pools very close to the information-theoretic threshold.
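To make the measurement model concrete, the following is a minimal Python sketch of sparse pooled data as described in the abstract. It only illustrates what a single pool reveals. The uniformly random half-size pools, the parameter values, and the helper names `sample_ground_truth` and `pooled_measurement` are assumptions chosen for illustration; they are not the paper's pooling scheme, and no decoding algorithm is shown.

```python
import random
from collections import Counter


def sample_ground_truth(n, k, d, rng):
    """Return a length-n label vector with exactly k non-zero labels in {1, ..., d}."""
    sigma = [0] * n
    for i in rng.sample(range(n), k):
        sigma[i] = rng.randint(1, d)
    return sigma


def pooled_measurement(sigma, pool, d):
    """Return, for each label 0..d, how many items of that label the pool contains."""
    counts = Counter(sigma[i] for i in pool)
    return [counts.get(label, 0) for label in range(d + 1)]


rng = random.Random(0)
n, theta, d = 1000, 0.5, 3      # illustrative values, not taken from the paper
k = round(n ** theta)           # sparsity k ~ n^theta
sigma = sample_ground_truth(n, k, d, rng)

# Assumed design for illustration only: each pool is a uniformly random half of the items.
pools = [rng.sample(range(n), n // 2) for _ in range(5)]
results = [pooled_measurement(sigma, pool, d) for pool in pools]

for result in results:
    print(result)
```

Each printed vector has $d+1$ entries that sum to the pool size, and the entries for labels $1,\ldots,d$ sum to at most $k$, which is the structure a sparse reconstruction method can exploit.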