URLOST: Unsupervised Representation Learning without Stationarity or Topology
Abstract: Unsupervised representation learning has seen tremendous progress. However, it is constrained by its reliance on domain-specific stationarity and topology, a limitation not found in biological intelligence. For instance, unlike computer vision, human vision can process visual signals sampled from highly irregular and non-stationary sensors. We introduce a novel framework that learns from high-dimensional data without prior knowledge of stationarity or topology. Our model, abbreviated as URLOST, combines a learnable self-organizing layer, spectral clustering, and a masked autoencoder (MAE). We evaluate its effectiveness on three diverse data modalities: simulated biological vision data, neural recordings from the primary visual cortex, and gene expression data. Compared to state-of-the-art unsupervised learning methods such as SimCLR and MAE, our model excels at learning meaningful representations across these modalities without knowing their stationarity or topology. It also outperforms other methods that do not depend on these factors, setting a new benchmark in the field. We position this work as a step toward unsupervised learning methods capable of generalizing across diverse high-dimensional data modalities.