
BatchGFN: Generative Flow Networks for Batch Active Learning

Published 26 Jun 2023 in cs.LG and stat.ML (arXiv:2306.15058v1)

Abstract: We introduce BatchGFN, a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. Given an appropriate reward function that quantifies the utility of acquiring a batch, such as the joint mutual information between the batch and the model parameters, BatchGFN constructs highly informative batches for active learning in a principled way. On toy regression problems, we show that our approach samples near-optimal-utility batches at inference time with a single forward pass per point in the batch. This alleviates the computational complexity of batch-aware algorithms and removes the need for greedy approximations to maximize the batch reward. We also present early results on amortizing training across acquisition steps, which will enable scaling to real-world tasks.
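The batch reward the abstract describes can be made concrete with a small example. The sketch below is not the paper's implementation: it assumes a Bayesian linear regression model, for which the joint mutual information between a batch's labels and the weights has a closed form, and it enumerates all candidate batches of a tiny pool to build the reward-proportional target distribution that BatchGFN's GFlowNet sampler would amortize. The function and parameter names (`batch_information_gain`, `noise_var`) are illustrative, not from the paper.

```python
import itertools
import numpy as np

def batch_information_gain(X_batch, prior_cov, noise_var=0.1):
    # Joint mutual information I(y_batch; w) for Bayesian linear
    # regression with weight prior N(0, prior_cov) and Gaussian
    # observation noise: 0.5 * logdet(I + X S X^T / sigma^2).
    K = X_batch @ prior_cov @ X_batch.T / noise_var
    _, logdet = np.linalg.slogdet(np.eye(len(X_batch)) + K)
    return 0.5 * logdet

rng = np.random.default_rng(0)
pool = rng.normal(size=(6, 2))   # tiny unlabelled pool of 2-d points
prior_cov = np.eye(2)

# Target distribution: p(batch) proportional to the batch reward.
# Here we enumerate every size-2 batch; BatchGFN instead trains a
# GFlowNet whose sampling distribution matches these probabilities.
batches = list(itertools.combinations(range(len(pool)), 2))
rewards = np.array([batch_information_gain(pool[list(b)], prior_cov)
                    for b in batches])
probs = rewards / rewards.sum()
best_batch = batches[int(np.argmax(probs))]
```

Enumerating all C(N, k) subsets is exactly the cost the paper avoids: a trained BatchGFN constructs a batch sequentially, one forward pass per point, while still sampling (approximately) from this reward-proportional distribution. Note also that duplicating a point in a batch yields less than twice the single-point information gain, which is why batch-aware rewards favor diverse batches.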
