Infinite dSprites for Disentangled Continual Learning: Separating Memory Edits from Generalization
Abstract: The ability of machine learning systems to learn continually is hindered by catastrophic forgetting, the tendency of neural networks to overwrite previously acquired knowledge when learning a new task. Existing methods mitigate this problem through regularization, parameter isolation, or rehearsal, but they are typically evaluated on benchmarks comprising only a handful of tasks. In contrast, humans are able to learn over long time horizons in dynamic, open-world environments, effortlessly memorizing unfamiliar objects and reliably recognizing them under various transformations. To make progress towards closing this gap, we introduce Infinite dSprites, a parsimonious tool for creating continual classification and disentanglement benchmarks of arbitrary length and with full control over generative factors. We show that over a sufficiently long time horizon, the performance of all major types of continual learning methods deteriorates on this simple benchmark. This result highlights an important and previously overlooked aspect of continual learning: given a finite modelling capacity and an arbitrarily long learning horizon, efficient learning requires memorizing class-specific information and accumulating knowledge about general mechanisms. In a simple setting with direct supervision on the generative factors, we show how learning class-agnostic transformations offers a way to circumvent catastrophic forgetting and improve classification accuracy over time. Our approach sets the stage for continual learning over hundreds of tasks with explicit control over memorization and forgetting, emphasizing open-set classification and one-shot generalization.
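To make the idea of a benchmark "of arbitrary length and with full control over generative factors" concrete, below is a minimal, hypothetical sketch of such a generator: an unbounded stream of tasks, each introducing novel procedurally generated shapes rendered under explicitly sampled factors (scale, orientation, position). This is not the paper's actual code or API; every name here (`random_shape`, `render`, `task_stream`, the factor ranges, the image size) is our own assumption about what such a tool might look like.

```python
"""Illustrative sketch of an endless dSprites-like task stream.

NOT the paper's implementation: all function and parameter names
are assumptions made for this example.
"""
import itertools
import numpy as np

def random_shape(rng, n_vertices=6, jitter=0.35):
    # Sample a random closed polygon: points on the unit circle with
    # jittered radii, so each new class gets a unique silhouette.
    angles = np.sort(rng.uniform(0, 2 * np.pi, n_vertices))
    radii = 1.0 + rng.uniform(-jitter, jitter, n_vertices)
    return np.stack([radii * np.cos(angles), radii * np.sin(angles)], axis=1)

def render(shape, scale, angle, tx, ty, size=64):
    # Rasterize the polygon after applying the generative factors
    # (scale, orientation, x/y position) via a crossing-number test.
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    verts = (shape * scale) @ rot.T + np.array([tx, ty])
    ys, xs = np.mgrid[0:size, 0:size]
    px = (xs + 0.5) / size * 2 - 1  # pixel centres in [-1, 1]
    py = (ys + 0.5) / size * 2 - 1
    inside = np.zeros((size, size), dtype=bool)
    n = len(verts)
    for i in range(n):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % n]
        crosses = (y0 > py) != (y1 > py)  # edge straddles the scanline
        x_at = x0 + (py - y0) / (y1 - y0 + 1e-12) * (x1 - x0)
        inside ^= crosses & (px < x_at)
    return inside.astype(np.float32)

def task_stream(classes_per_task=10, samples_per_class=32, seed=0):
    # Yield an unbounded sequence of tasks, each with novel classes.
    # Every image is labelled with both its class id and the exact
    # generative factors, enabling disentanglement-style supervision.
    rng = np.random.default_rng(seed)
    for task_id in itertools.count():
        shapes = [random_shape(rng) for _ in range(classes_per_task)]
        images, labels, factors = [], [], []
        for cls, shape in enumerate(shapes):
            for _ in range(samples_per_class):
                f = (rng.uniform(0.2, 0.5),      # scale
                     rng.uniform(0, 2 * np.pi),  # orientation
                     rng.uniform(-0.5, 0.5),     # x position
                     rng.uniform(-0.5, 0.5))     # y position
                images.append(render(shape, *f))
                labels.append(task_id * classes_per_task + cls)
                factors.append(f)
        yield np.stack(images), np.array(labels), np.array(factors)

# Usage: draw the first two tasks from the (in principle infinite) stream.
stream = task_stream()
for _ in range(2):
    x, y, f = next(stream)
    print(x.shape, y.min(), y.max())
```

The point of the sketch is the structure, not the rendering details: because shapes are generated on the fly and factors are sampled rather than enumerated from a fixed dataset, the stream never repeats a class and never ends, which is what lets an evaluation run for hundreds of tasks while retaining exact ground-truth factors for every image.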