Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity
Abstract: Reconstructing viewed images from human brain activity bridges human and computer vision through the brain-computer interface. Because brain function varies between individuals, existing literature focuses on training a separate model for each individual on that individual's brain signal data, ignoring commonalities across these data. In this article, we devise Psychometry, an omnifit model for reconstructing images from functional Magnetic Resonance Imaging (fMRI) data obtained from different subjects. Psychometry incorporates an omni mixture-of-experts (Omni MoE) module in which all experts work together to capture inter-subject commonalities, while each expert, associated with subject-specific parameters, copes with individual differences. Moreover, Psychometry is equipped with a retrieval-enhanced inference strategy, termed Ecphory, which enhances the learned fMRI representation by retrieving from prestored subject-specific memories. Together, these designs render Psychometry omnifit and efficient, enabling it to capture both inter-subject commonality and individual specificity. The enhanced fMRI representations then serve as conditional signals that guide a generation model to reconstruct high-quality, realistic images, establishing Psychometry as state-of-the-art on both high-level and low-level metrics.
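The two mechanisms named in the abstract can be illustrated with a toy sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the softmax gate, the additive per-subject bias standing in for "subject-specific parameters", cosine-similarity top-k retrieval, the blending weight `alpha`, and all function and variable names (`omni_moe`, `ecphory_retrieve`, `subj01`, and so on) are hypothetical and are not the authors' method.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def omni_moe(x, subject, gate_W, expert_Ws, subject_bias):
    """Toy mixture of experts (assumed design, not the paper's).

    All experts share their weight matrices across subjects; each expert
    additionally adds a bias vector looked up by subject id.
    """
    weights = softmax(matvec(gate_W, x))  # one gate weight per expert
    out = [0.0] * len(expert_Ws[0])
    for w, W, biases in zip(weights, expert_Ws, subject_bias):
        h = [v + b for v, b in zip(matvec(W, x), biases[subject])]
        out = [o + w * v for o, v in zip(out, h)]
    return out, weights

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + 1e-12)

def ecphory_retrieve(query, memory, k=2, alpha=0.5):
    """Toy retrieval-enhanced inference (assumed design).

    Blends the query with the mean of its k most similar stored vectors
    from one subject's memory.
    """
    ranked = sorted(memory, key=lambda m: cosine(query, m), reverse=True)[:k]
    mean = [sum(col) / len(ranked) for col in zip(*ranked)]
    return [(1 - alpha) * q + alpha * m for q, m in zip(query, mean)]

# Hypothetical toy data: 2-dimensional features, 2 experts, 2 subjects.
x = [1.0, 0.0]
gate_W = [[1.0, 0.0], [0.0, 1.0]]
expert_Ws = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
subject_bias = [
    {"subj01": [0.1, 0.0], "subj02": [0.0, 0.0]},
    {"subj01": [0.0, 0.2], "subj02": [0.0, 0.0]},
]
memory_subj01 = [[0.0, 1.0], [1.0, 0.1]]

rep_a, gate_a = omni_moe(x, "subj01", gate_W, expert_Ws, subject_bias)
rep_b, gate_b = omni_moe(x, "subj02", gate_W, expert_Ws, subject_bias)
enhanced = ecphory_retrieve(rep_a, memory_subj01, k=1, alpha=0.5)
```

In this sketch the gate weights depend only on the input, so `gate_a` and `gate_b` are identical (the shared part), while `rep_a` and `rep_b` differ only through the per-subject biases (the individual part). The retrieval step then pulls the representation toward the closest entry in that subject's memory.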