Brain decoding: toward real-time reconstruction of visual perception
Abstract: In the past five years, the use of generative and foundational AI systems has greatly improved the decoding of brain activity. Visual perception, in particular, can now be decoded from functional Magnetic Resonance Imaging (fMRI) with remarkable fidelity. This neuroimaging technique, however, suffers from a limited temporal resolution ($\approx$0.5 Hz), which fundamentally constrains its real-time usage. Here, we propose an alternative approach based on magnetoencephalography (MEG), a neuroimaging device capable of measuring brain activity with high temporal resolution ($\approx$5,000 Hz). For this, we develop an MEG decoding model trained with both contrastive and regression objectives and consisting of three modules: i) pretrained embeddings obtained from the image, ii) an MEG module trained end-to-end and iii) a pretrained image generator. Our results are threefold. First, our MEG decoder shows a 7x improvement in image retrieval over classic linear decoders. Second, late brain responses to images are best decoded with DINOv2, a recent foundational image model. Third, image retrievals and generations both suggest that high-level visual features can be decoded from MEG signals, although the same approach applied to 7T fMRI also recovers better low-level features. Overall, these results, while preliminary, provide an important step towards the decoding, in real time, of the visual processes continuously unfolding within the human brain.
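To make the training objective and the retrieval metric more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the MEG module outputs one embedding per trial, that the contrastive term is an InfoNCE-style loss over the batch, that the regression term is a mean squared error, and that the two are mixed by a convex weight. The function names, `weight`, `temperature`, and `k` are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def contrastive_loss(pred, target, temperature=0.1):
    """InfoNCE-style loss (assumed form): row i of `pred` should match row i of
    `target` better than any other row in the batch."""
    logits = l2_normalize(pred) @ l2_normalize(target).T / temperature
    return float(-np.mean(np.diag(log_softmax(logits))))


def regression_loss(pred, target):
    """Mean squared error between predicted and pretrained image embeddings."""
    return float(np.mean((pred - target) ** 2))


def combined_loss(pred, target, weight=0.5, temperature=0.1):
    """Convex mix of both objectives; `weight` is a hypothetical hyperparameter."""
    return (weight * contrastive_loss(pred, target, temperature)
            + (1.0 - weight) * regression_loss(pred, target))


def top_k_retrieval_accuracy(pred, candidates, k=5):
    """Fraction of trials whose true image (same row index) is among the k
    candidates most similar to the predicted embedding."""
    sims = l2_normalize(pred) @ l2_normalize(candidates).T
    top_k = np.argsort(-sims, axis=1)[:, :k]
    true_index = np.arange(pred.shape[0])[:, None]
    return float(np.mean((top_k == true_index).any(axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_embeddings = rng.normal(size=(8, 16))  # stand-in for pretrained image embeddings
    meg_predictions = image_embeddings + 0.1 * rng.normal(size=(8, 16))  # stand-in decoder output
    print("combined loss:", combined_loss(meg_predictions, image_embeddings))
    print("top-1 accuracy:", top_k_retrieval_accuracy(meg_predictions, image_embeddings, k=1))
```

In this sketch the contrastive term rewards picking the correct image among the batch, which is what a retrieval metric measures, while the regression term keeps the predicted embedding close to the target in absolute terms, which matters if the embedding is later passed to a pretrained image generator.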