Omni-fMRI: A Universal fMRI Paradigm

Updated 6 February 2026
  • Omni-fMRI is a universal fMRI approach that integrates adaptive voxel-level foundation models, dynamic patch tokenization, and modality fusion to overcome traditional limitations.
  • The methodology leverages deep learning (ViT, scale-aware masked autoencoders) and advanced Bayesian models to achieve state-of-the-art performance and enhanced signal interpretability.
  • It facilitates cross-subject semantic alignment and multi-modal integration, enabling robust brain decoding and comprehensive clinical and neuroscientific applications.

Omni-fMRI is a term that denotes a class of comprehensive approaches in functional magnetic resonance imaging (fMRI) characterized by their ability to unify disparate aspects of fMRI analysis: subject variability, spatial and temporal scales, signal complexity (including magnitude and phase), and even modality integration. Distinct from standard fMRI methods constrained by subject-specific processing, spatial atlases, or real-only BOLD signal modeling, Omni-fMRI methodologies are designed to be universal, adaptive, and information-maximal. They include foundation models for voxel-level brain representation learning, advanced Bayesian analyses of complex-valued data, cross-subject and cross-modal semantic alignment, and even next-generation scanner architectures enabling simultaneous multi-modal acquisition. The following sections survey the key technological pillars of Omni-fMRI systems, their methodological innovations, benchmarked performance, and practical and clinical implications, as codified in the recent literature.

1. Atlas-Free Voxel-Level Foundation Models

Omni-fMRI as an fMRI foundation model paradigm is exemplified by the "Omni-fMRI" system of Qiao et al. (Wang et al., 30 Jan 2026), which dispenses with predefined anatomical parcellations and directly processes the 4D (space × time) fMRI volumes at the voxel level. The core technological advances include:

  • Dynamic Patch Tokenization: Instead of uniformly extracting 4³ patches (~13,000 tokens per 96³ volume), Omni-fMRI partitions each 4D volume into content-adaptive patches based on local spatiotemporal variance σ²_P. High-variance, information-rich regions are recursively subdivided to the 4³-voxel base resolution, while low-complexity regions remain as coarse tokens, reducing the token count by two-thirds without loss of informative content.
  • Vision Transformer Backbone: Patch tokens are embedded via a dual-path multi-scale embedding mechanism and encoded with a 12-layer, 12-head, 768-dimension ViT encoder.
  • Scale-Aware Masked Autoencoder: The model leverages a MAE objective with scale-conditioned decoders and per-token, per-scale normalization, reconstructing the masked fraction (75%) of tokens per volume. This formulation maintains scale invariance and prevents large patches from dominating the loss.
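As a rough illustration of the dynamic-patching idea, the sketch below recursively subdivides a 3D volume wherever local variance exceeds a threshold. The function name, the threshold value, and the octree-style stopping rule are illustrative assumptions, not the published algorithm, which operates on 4D spatiotemporal blocks:

```python
import numpy as np

def adaptive_patches(volume, base=4, var_threshold=0.5):
    """Recursively partition a cubic 3D volume into content-adaptive patches.

    High-variance regions are subdivided down to base^3 voxels; smooth
    regions stay as single coarse tokens. `var_threshold` is a hypothetical
    knob; the paper's actual criterion on sigma^2_P may differ.
    """
    patches = []

    def split(x0, y0, z0, size):
        block = volume[x0:x0 + size, y0:y0 + size, z0:z0 + size]
        if size <= base or block.var() <= var_threshold:
            patches.append((x0, y0, z0, size))      # emit one token
            return
        half = size // 2
        for dx in (0, half):                        # subdivide into 8 octants
            for dy in (0, half):
                for dz in (0, half):
                    split(x0 + dx, y0 + dy, z0 + dz, half)

    split(0, 0, 0, volume.shape[0])
    return patches

# A smooth volume collapses to one coarse token; a noisy, information-rich
# volume is fully subdivided to the 4^3 base resolution (here 512 tokens).
smooth = np.zeros((32, 32, 32))
noisy = np.random.default_rng(0).normal(size=(32, 32, 32))
print(len(adaptive_patches(smooth)), len(adaptive_patches(noisy)))
```

The same mechanism, applied to real fMRI, spends tokens only where the signal varies, which is what drives the reported two-thirds token reduction.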

The end-to-end pipeline yields a universal, atlas-free embedding of fMRI sessions that is readily transferable across demographics, pathologies, and tasks (Wang et al., 30 Jan 2026). Benchmarking across 16 downstream tasks and 11+ datasets, including demography, disease prediction, and image retrieval, shows Omni-fMRI achieving state-of-the-art performance across supervised evaluation settings (full fine-tuning, linear probing, and few-shot).

2. Cross-Subject Semantic Alignment and Brain Decoding

A principal challenge in multi-subject fMRI is the drastic variation in measured signals owing to neuroanatomical and physiological heterogeneity. Omni-fMRI strategies address this by learning subject-normalized representations supporting cross-subject decoding and robust brain–machine–interface applications:

  • MindFormer (Han et al., 2024): Utilizes subject-specific linear projections (W_s) and learnable subject tokens (t_s) to map fMRI voxel vectors x^{(s)} from diverse subjects into a unified semantic embedding space. The shared Transformer encoder effectively "explains away" idiosyncratic features while preserving sufficient individual information. Alignment is enforced at the level of patch embeddings, matched to IP-Adapter image tokens from corresponding stimuli, using joint L1 and contrastive losses. This unified model conditions generative diffusion models (e.g., Stable Diffusion) to reconstruct images from raw fMRI regardless of subject.
  • Shallow Adapter + Unified Decoder Paradigm (Liu et al., 2024): Subject-specific shallow adapters map native fMRI spaces to a common latent space H; a shared, deeper decoder then performs high-level (CLIP) and low-level (VAE latent) brain decoding, supervised via multi-modal contrastive and pixel-wise losses. Transfer to novel subjects requires only retraining the shallow adapters, demonstrating efficient generalization and "Omni-fMRI" capability.

Both approaches significantly outperform subject-specific decoders, especially under low-data regimes, and achieve robust semantic and pixel-level alignment across individuals, meeting the criterion of an Omni-fMRI encoder that both normalizes for inter-subject variance and preserves individual nuances.
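The adapter-plus-shared-encoder pattern common to both approaches can be sketched in a few lines. All shapes, voxel counts, and the single linear layer standing in for the shared Transformer are illustrative assumptions; only the names W_s and t_s follow the MindFormer description:

```python
import numpy as np

rng = np.random.default_rng(0)
D_EMBED = 64   # width of the shared semantic space (an illustrative choice)

# Each subject has a different native voxel count; the projection W_s and
# subject token t_s are the only subject-specific parameters.
native_dims = {"subj01": 1500, "subj02": 2300}
W = {s: rng.normal(0, 0.02, (D_EMBED, n)) for s, n in native_dims.items()}
t = {s: rng.normal(0, 0.02, D_EMBED) for s in native_dims}

# Stand-in for the shared Transformer encoder: a single fixed linear layer.
A_shared = rng.normal(0, 0.02, (D_EMBED, D_EMBED))

def encode(subject, x):
    """Map a native-space fMRI vector into the unified embedding space."""
    z = W[subject] @ x + t[subject]   # subject-specific adapter
    return A_shared @ z               # subject-agnostic shared encoder

for s, n in native_dims.items():
    emb = encode(s, rng.normal(size=n))
    print(s, emb.shape)               # every subject lands in the same 64-d space
```

Because only W_s and t_s are per-subject, transferring to a new individual means fitting a small adapter while the shared encoder stays frozen, which is why these models generalize efficiently in low-data regimes.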

3. Complex-Valued Bayesian Modeling: Magnitude and Phase Integration

Standard fMRI analyses discard the imaginary (phase) component of the complex-valued BOLD signal, potentially missing critical neurophysiological information. Omni-fMRI statistical models explicitly incorporate both signal magnitude and phase, enabling comprehensive mapping of brain activation:

  • Complex-valued Bayesian SGLMM (Wang et al., 2023): For voxel v, models the entire complex time series y^{(v)} ∈ ℂ^T via regression on task design and AR(1) noise, embedding spike-and-slab priors (variable selection) and spatial sparsity via structured GMRFs on voxel neighborhoods. Posterior inference is performed efficiently via block-parallel Gibbs sampling on partitioned image parcels, vastly reducing computation without edge artifacts.
  • CV-M&P Model (Wang et al., 2024): Extends the above by directly parameterizing the complex signal in polar coordinates, modeling magnitude (ρ_{v,t}) and phase (θ_{v,t}) as independent responses to task regressors. Joint spike-and-slab priors and spatial GMRFs are employed for both components, and inference is parallelized at the parcel level. This approach accurately distinguishes voxels with magnitude-only, phase-only, and mixed activation, which magnitude-only models fail to identify.

These models have shown, through simulation and real-data application, marked gains in sensitivity, specificity, and interpretability of activation maps over standard pipelines (Wang et al., 2023, Wang et al., 2024).
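To make the magnitude-and-phase decomposition concrete, the toy below simulates one voxel whose complex signal responds to a boxcar task in both polar components, then recovers each effect with ordinary least squares. The simulation parameters are invented, and least squares stands in for the papers' spike-and-slab Bayesian inference with spatial GMRFs:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
design = (np.arange(T) % 40 < 20).astype(float)   # boxcar task regressor

# One voxel in polar form: the task modulates magnitude AND phase, as in
# the CV-M&P parameterization (all coefficients here are illustrative).
rho = 100 + 2.0 * design + rng.normal(0, 0.5, T)       # magnitude response
theta = 0.3 + 0.05 * design + rng.normal(0, 0.01, T)   # phase response
y = rho * np.exp(1j * theta)                           # complex BOLD series

# Point estimates via separate least squares on magnitude and phase.
X = np.column_stack([np.ones(T), design])
beta_mag, *_ = np.linalg.lstsq(X, np.abs(y), rcond=None)
beta_phase, *_ = np.linalg.lstsq(X, np.angle(y), rcond=None)

print(f"magnitude effect ~ {beta_mag[1]:.2f}, phase effect ~ {beta_phase[1]:.3f}")
# A magnitude-only pipeline discards np.angle(y) and so cannot detect a
# voxel whose activation appears only in the phase component.
```

The Bayesian models add what this sketch lacks: per-component variable selection (is this voxel active in magnitude, phase, both, or neither?) and spatial smoothing of activation across neighboring voxels.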

4. Simultaneous Multi-Modal Acquisition and Grand-Fusion Imaging

Omni-fMRI can also signify the integration of fMRI with other imaging modalities within a single acquisition session, leveraging the collective strengths of complementary contrast mechanisms (“grand fusion”):

  • Omni-tomography Scanner Design (Wang et al., 2011): Realizes simultaneous acquisition of MRI/fMRI, PET, SPECT, CT, ultrasound, and optical/X-ray fluorescence, by stacking rotating and stationary rings around a shared patient bore and applying “interior tomography” design principles (localized ROI imaging, compressed sensing, modality-specific priors).
  • Multi-Modal Joint Reconstruction: Enables co-registration at sub-millimeter resolution, fuses anatomical with molecular/functional data in a single variational framework, and allows cross-modality constraints (e.g., CT-guided anatomical priors improving BOLD map stability).

Such architectures allow dynamic neurovascular, metabolic, perfusion, and molecular imaging concurrently, supporting neuroscience, cardiology, oncology, and translational medicine studies (Wang et al., 2011).
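A minimal numerical sketch of joint multi-modal reconstruction, under strong simplifying assumptions: a 1D "image", two random linear forward models standing in for different modalities, and a quadratic coupling term standing in for the cross-modality priors described above. None of this reflects the actual omni-tomography reconstruction pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64
truth = np.zeros(N); truth[20:40] = 1.0     # shared underlying structure (1D toy)

# Two modalities observe the same underlying map through different forward
# models; shapes and sampling densities are arbitrary illustrative choices.
A1 = rng.normal(size=(56, N)) / 8.0         # densely sampled modality (MRI-like)
A2 = rng.normal(size=(32, N)) / 8.0         # sparsely sampled modality (PET-like)
y1, y2 = A1 @ truth, A2 @ truth

# Joint variational objective: per-modality data fidelity plus a coupling
# term pulling the two reconstructions toward agreement -- a toy stand-in
# for cross-modality constraints such as CT-guided anatomical priors.
lam, step = 1.0, 0.1
x1, x2 = np.zeros(N), np.zeros(N)
for _ in range(5000):
    g1 = A1.T @ (A1 @ x1 - y1) + lam * (x1 - x2)
    g2 = A2.T @ (A2 @ x2 - y2) + lam * (x2 - x1)
    x1, x2 = x1 - step * g1, x2 - step * g2   # simultaneous gradient step

# Alone, 32 measurements cannot pin down 64 unknowns; coupled to the
# well-sampled modality, the sparse reconstruction is stabilized.
print(np.linalg.norm(x2 - truth) / np.linalg.norm(truth))
```

The design point being illustrated is that the coupling term lets information flow between modalities, so an under-determined channel inherits structure from a well-determined one, which is the stated rationale for joint rather than sequential reconstruction.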

5. Benchmarks, Evaluation, and Empirical Insights

Extensive quantitative evaluations substantiate the efficacy and universality of Omni-fMRI approaches:

| Task / Dataset (Metric)                  | Baseline            | Omni-fMRI approach | Source                     |
|------------------------------------------|---------------------|--------------------|----------------------------|
| HCP Task 23-way (Full fine-tuning)       | BrainMASS: 49.85%   | 50.17%             | (Wang et al., 30 Jan 2026) |
| NSD Image Retrieval (Top-1)              | NeuroSTORM: 3.00%   | 6.93%              | (Wang et al., 30 Jan 2026) |
| ADNI-AD Diagnosis (Linear probing)       | 71.64%              | 84.26%             | (Wang et al., 30 Jan 2026) |
| Multi-Subject Decoding (PixCorr, Multi)  | MindBridge: .151    | .243               | (Han et al., 2024)         |
| NSD Reconstr. (EffNet-B, Multi-subj)     | .712                | .648               | (Han et al., 2024)         |
| GOD Zero-shot Class. (Top-1, 5-subj avg) | BCLIP-VAE: 18.4%    | 23.2%              | (Liu et al., 2024)         |

Ablation studies demonstrate that dynamic patching (Omni-fMRI), cross-subject alignment (MindFormer, STTM), learnable subject/adapter tokens, and joint magnitude-phase modeling all contribute substantive accuracy gains and information recovery (Han et al., 2024, Liu et al., 2024, Wang et al., 2023, Wang et al., 2024, Wang et al., 30 Jan 2026).

6. Limitations and Future Prospects

While Omni-fMRI systems have demonstrated significant advances, current limitations include:

  • Complexity Heuristic: Reliance on variance-based patch partitioning (Omni-fMRI) may be suboptimal; differentiable, learned patch routing could enhance adaptivity (Wang et al., 30 Jan 2026).
  • Scaling Subject-Specific Components: Linear projection matrices or adapter tokens will grow in parameter count with more subjects; scalable or compositional alternatives are proposed (Han et al., 2024).
  • Pure Reconstruction Objective: Absence of cross-modal or semantic supervision in foundation pre-training may limit downstream transfer to tasks beyond reconstruction (Wang et al., 30 Jan 2026).
  • Hardware Integration: Although grand-fusion is feasible in principle, implementation complexities (magnetic field homogeneity, cross-modality interference, shielding) remain active engineering challenges (Wang et al., 2011).
  • Phase-Only and Mixed Activation Detection: Magnitude-only pipelines miss phase-only activations; only recent Bayesian models recover these (Wang et al., 2023, Wang et al., 2024).

Anticipated directions involve end-to-end adaptive patching, joint objectives incorporating semantic/clinical/behavioral targets, scaling to multimodal and longitudinal data, cross-site harmonization, ultra-fast inference for real-time feedback, and clinical deployment in personalized therapeutics.

7. Theoretical and Practical Significance

Omni-fMRI, as a unifying paradigm, illustrates the convergence of methodological innovations spanning data representation (atlas-free, voxel-level, multimodal), statistical inference (complex-valued, hierarchical Bayesian, parallel MCMC), deep learning (transformers, subject adaptation, contrastive alignment), and systems engineering (grand-fusion scanners). Its adoption is redefining the standards for universality, transferability, and interpretability in brain imaging, with empirically validated improvements across a spectrum of neuroscientific and translational benchmarks. The paradigm's extensibility to new modalities, populations, and clinical domains marks it as a foundational development in functional neuroimaging (Wang et al., 30 Jan 2026, Han et al., 2024, Liu et al., 2024, Wang et al., 2023, Wang et al., 2024, Wang et al., 2011).
