FoMo-0D: Zero-Dimensional Models and Applications

Updated 10 February 2026

FoMo-0D is a framework of zero-dimensional (0D) abstractions that simplifies complex phenomena by using lumped parameters and statistical models.
In legal eDiscovery, FoMo-0D rigorously estimates the likelihood of factoid omission using confidence bounds derived from document review probabilities.
The method accelerates cardiovascular simulations and enables zero-shot outlier detection by leveraging reduced-order modeling and Bayesian Transformer techniques.

FoMo-0D refers to a class of zero-dimensional (0D), or lumped-parameter, models, algorithms, and theoretical constructs used for diverse tasks in computational science and machine learning. Three distinct uses of FoMo-0D are prominent in the literature: (1) in legal eDiscovery as a probabilistic model for factoid omission; (2) as an automated framework for cardiovascular blood flow simulation from patient-specific geometries; and (3) as a foundation model for zero-shot outlier detection on tabular data. Each application shares the central motif of “0D”: it drops spatial extent in favor of a statistical, network, or Bayesian abstraction.

1. FoMo-0D in Legal eDiscovery: Probabilistic Model of Factoid Omission

The FoMo-0D model in legal eDiscovery formalizes the probability that a distinct fact or topic (termed "factoid") is omitted from a set of responsive documents identified during the discovery process. Let $M$ denote the unknown total number of factoids, $p_i$ the prevalence of factoid $i$ in the document population, and $N$ the number of reviewed documents. The omission probability for a specific factoid $i$ after $N$ draws is

$P_{\mathrm{miss},i}(N) = (1 - p_i)^{N}$

Aggregating over all factoids, the expected number missed is $\sum_{i=1}^M (1-p_i)^N$. For practical scenarios where $p_i$ and $M$ are unknown, FoMo-0D focuses on confidence-bounded estimation of the rarest unseen factoid. If no instance of a given factoid is found after reviewing $n_1$ documents, the upper confidence bound $\hat p$ on its prevalence at confidence level $c$ satisfies

$(1-\hat p)^{n_1} = 1-c$

with solution $\hat p = 1 - (1 - c)^{1/n_1}$. The likelihood $k$ of encountering the factoid in the $n_2$ remaining documents is $k = 1 - (1 - \hat p)^{n_2}$, so the probability that the factoid is missed in the first $n_1$ documents but present in the remaining $n_2$ is approximately $(1-c) k$.
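As a worked instance of the bound above, the following Python sketch evaluates $\hat p$, $k$, and the joint estimate $(1-c)k$. The function name `omission_risk` and the sample sizes in the usage lines are illustrative assumptions, not values from the cited study.

```python
def omission_risk(n1, n2, c=0.95):
    """Bound the risk that a factoid unseen in n1 reviewed documents
    appears among n2 unreviewed ones, assuming i.i.d. document draws."""
    if n1 <= 0 or n2 < 0 or not 0.0 < c < 1.0:
        raise ValueError("need n1 > 0, n2 >= 0 and 0 < c < 1")
    # Upper confidence bound on prevalence: solve (1 - p)^n1 = 1 - c.
    p_hat = 1.0 - (1.0 - c) ** (1.0 / n1)
    # Chance of meeting the factoid at least once in n2 further draws.
    k = 1.0 - (1.0 - p_hat) ** n2
    # Approximate joint probability: missed in n1, present in n2.
    return {"p_hat": p_hat, "k": k, "joint": (1.0 - c) * k}


if __name__ == "__main__":
    result = omission_risk(n1=1000, n2=1000, c=0.95)
    print(result)
```

Because $(1-\hat p)^{n_2} = (1-c)^{n_2/n_1}$, the special case $n_1 = n_2$ gives $k = c$, which is a convenient sanity check on any implementation.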

This model’s simplifying assumptions (that each document carries only one factoid, that factoid-document draws are i.i.d., and that factoids are independent) enable tractable analysis and are reported to be empirically justified. Heaps’ and Zipf’s laws support the expectation that a handful of common factoids are captured quickly, while the rarest determine the likelihood and effort required for complete coverage. Empirical studies on microaggression texts and web page tags indicate that, even at incomplete recall, almost all factoids are captured early, and that residual omission risk can be bounded rigorously (Roitblat, 2021).
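To illustrate how a heavy-tailed prevalence distribution shapes the expected number of missed factoids, the sketch below evaluates $\sum_i (1-p_i)^N$ for Zipf-like prevalences. The choice of $M = 100$ factoids, the exponent `s`, and the normalisation of prevalences to sum to one (consistent with one factoid per document) are assumptions made for the example only.

```python
def zipf_prevalences(m, s=1.0):
    """Prevalences proportional to 1 / rank**s, normalised to sum to 1."""
    weights = [1.0 / (i ** s) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_missed(prevalences, n):
    """Expected number of factoids still unseen after n i.i.d. draws."""
    return sum((1.0 - p) ** n for p in prevalences)


if __name__ == "__main__":
    p = zipf_prevalences(100)
    for n in (0, 100, 1000, 10000):
        print(n, round(expected_missed(p, n), 3))
```

For any $N > 0$ the largest term in the sum belongs to the rarest factoid, which is why the confidence bound focuses on it.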

2. FoMo-0D for Automated Generation of Cardiovascular 0D Models

FoMo-0D is also the name of a fully automated pipeline for generating zero-dimensional reduced-order models (ROMs) of blood flow from 3D patient-specific vascular geometries. This system, implemented in SimVascular, proceeds through:

Geometry import: Surface mesh of the vessel from imaging (CT/MRI) serves as the only required input.
Centerline extraction: Using VMTK, the centerlines of all inlet–outlet paths are generated and merged, with centerline points annotated by maximum-inscribed-sphere radius.
Area sampling and anatomical segmentation: At each centerline point, the cross-sectional area $A(z)$ is measured. Junctions are detected by color segmentation of the surface mesh; stenoses by identifying extrema of $A(z)$ .
Network construction: Vascular network is abstracted as a graph, with nodes for junctions and edges for vessel segments or sub-segments (including stenoses).
Lumped parameter assignment: Each edge is mapped to serial elements: resistance ($R$), inertance ($L$), and compliance ($C$), with closed-form expressions parameterized by the segment length $\ell$, radius $r$, wall thickness $h$, Young's modulus $E$, blood viscosity $\mu$, and blood density $\rho$:

$R = \frac{8\mu \ell}{\pi r^4}, \qquad L = \frac{\rho \ell}{\pi r^2}, \qquad C = \frac{3\pi r^3 \ell}{2 E h}$

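For stenoses, $R_{\mathrm{stenosis}} = K_t \frac{\rho}{2} \frac{(A_0/A_s - 1)^2}{A_0^2} |Q|$, where $A_0$ is the unobstructed cross-sectional area, $A_s$ the area at the stenosis, $Q$ the flow rate, and $K_t$ an empirical coefficient.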
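The closed-form expressions above translate directly into code. The sketch below is a minimal illustration, not the SimVascular implementation: the function and argument names are hypothetical, consistent SI units are assumed, and the compliance is multiplied by segment length so that it has units of volume per pressure (an assumption of this sketch). The coefficient `k_t` must be supplied by the caller.

```python
import math


def segment_parameters(length, radius, mu, rho, youngs_modulus, wall_thickness):
    """Lumped resistance, inertance and compliance of one vessel segment."""
    # Poiseuille resistance of a straight cylindrical segment.
    resistance = 8.0 * mu * length / (math.pi * radius ** 4)
    # Inertance of the blood column in the segment.
    inertance = rho * length / (math.pi * radius ** 2)
    # Assumption: compliance scales with segment length (volume / pressure).
    compliance = (3.0 * math.pi * radius ** 3 * length
                  / (2.0 * youngs_modulus * wall_thickness))
    return {"R": resistance, "L": inertance, "C": compliance}


def stenosis_resistance(k_t, rho, area_ref, area_stenosis, flow):
    """Flow-dependent resistance of a stenosis; grows linearly with |flow|."""
    expansion = (area_ref / area_stenosis - 1.0) ** 2
    return k_t * (rho / 2.0) * expansion / area_ref ** 2 * abs(flow)


if __name__ == "__main__":
    # Illustrative values only: 5 cm segment, 4 mm radius, SI units.
    print(segment_parameters(0.05, 0.004, 0.004, 1060.0, 1.0e6, 0.0004))
```

The $r^{-4}$ dependence means that halving the radius raises the resistance sixteenfold, so errors in the sampled cross-sectional area propagate strongly into $R$.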

Boundary and junction conditions enforce static pressure continuity and mass conservation. The assembled ODE system is discretized and solved using implicit time integration (generalized-$\alpha$ scheme with Newton–Raphson iterations).
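To show what implicit time integration of a 0D circuit looks like in the simplest case, the sketch below integrates a single compliance chamber draining through one resistance, $C\,dP/dt = Q_{in}(t) - P/R$. This toy circuit is hypothetical, and backward Euler is swapped in for the generalized-$\alpha$ scheme; because the equation is linear, the implicit update has a closed form and no Newton–Raphson iteration is needed.

```python
def simulate_rc(q_in, resistance, compliance, dt, steps, p0=0.0):
    """Backward-Euler integration of C dP/dt = q_in(t) - P / R.

    q_in is a callable returning the inflow at time t.
    Returns the pressure history, including the initial value.
    """
    pressures = [p0]
    p = p0
    for n in range(1, steps + 1):
        t = n * dt
        # Implicit step: C (p_new - p) / dt = q_in(t) - p_new / R,
        # solved for p_new in closed form.
        p = (p + dt * q_in(t) / compliance) / (1.0 + dt / (resistance * compliance))
        pressures.append(p)
    return pressures


if __name__ == "__main__":
    history = simulate_rc(lambda t: 2.0, resistance=3.0, compliance=0.5,
                          dt=0.1, steps=200)
    print(history[-1])  # approaches the steady state q_in * R
```

The implicit update stays bounded for any positive time step, which is the practical reason implicit schemes are preferred for stiff lumped-parameter networks.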

Validation on 72 anatomical models demonstrates outlet pressure and flow errors of 1–10% relative to reference 3D CFD, with runtimes around $0.8$ minutes per simulation cycle (versus over two days for 3D, and five minutes for 1D). This framework enables real-time exploration of patient-specific hemodynamics and is robust to anatomical diversity. Limitations include coarse internal pressure resolution for minimal segmentation, lack of dynamic junction losses, and a reliance on assumed vessel wall properties (Pfaller et al., 2021).

3. FoMo-0D in Zero-Shot Outlier Detection

In the context of unsupervised anomaly detection, FoMo-0D refers to a pre-trained foundation model, built as a Prior-data Fitted Network (PFN), that delivers zero-shot outlier detection (OD) on tabular data. This method eliminates the model selection and hyperparameter search bottleneck by approximating the Bayesian posterior predictive distribution (PPD) for the outlier/inlier decision, given only an inlier training set and test samples.

Theoretical basis: Define a generative prior over hypotheses $\varphi$, from which synthetic datasets are sampled. The PFN is trained to emulate the posterior predictive distribution $p(y_{test} \mid x_{test}, D_{train}) \propto \int p(y_{test} \mid x_{test}, \varphi)\, p(D_{train} \mid \varphi)\, p(\varphi)\, d\varphi$ by minimizing expected cross-entropy on synthetic OD problems.
Network architecture: The model receives as input a set of context (inlier) samples $D_{train}$ and a query $x_{test}$ . All are embedded as tokens and processed via $L$ Transformer layers implementing routerized self-attention (for scalability, $O(nR)$ complexity with $R \ll n$ ). The test point attends to the context via standard cross-attention, producing a context-conditioned representation, which is mapped to the outlier probability by an MLP and softmax.
Synthetic data prior: Pretraining uses a mixture-of-Gaussians model with variable dimension, number of clusters, and inflated covariance subspaces for outliers, with Mahalanobis distance used to define inlier/outlier regions. Random invertible linear transforms expand the diversity of synthetic tasks.
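A simplified version of the synthetic data prior described above can be sketched as follows. This is not the authors' generator: it assumes diagonal covariances, a hypothetical Mahalanobis cutoff of $d + 2\sqrt{2d}$ on the squared distance, arbitrary ranges for cluster means and scales, and it omits the random invertible linear transform. All function and parameter names are illustrative.

```python
import math
import random


def mahalanobis_sq(x, mean, std):
    """Squared Mahalanobis distance for a diagonal covariance."""
    return sum(((xi - mi) / si) ** 2 for xi, mi, si in zip(x, mean, std))


def sample_task(n_in, n_out, dim, n_clusters, inflate=5.0, seed=0):
    """Draw labelled inliers and outliers from a Gaussian mixture."""
    rng = random.Random(seed)
    means = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_clusters)]
    stds = [[rng.uniform(0.5, 2.0) for _ in range(dim)] for _ in range(n_clusters)]
    cutoff = dim + 2.0 * math.sqrt(2.0 * dim)  # hypothetical threshold

    def draw(inflated_dims, factor):
        k = rng.randrange(n_clusters)
        return [rng.gauss(means[k][j],
                          stds[k][j] * (factor if j in inflated_dims else 1.0))
                for j in range(dim)]

    def min_dist(x):
        return min(mahalanobis_sq(x, m, s) for m, s in zip(means, stds))

    inliers, outliers = [], []
    while len(inliers) < n_in:
        x = draw(set(), 1.0)
        if min_dist(x) <= cutoff:        # keep points close to some cluster
            inliers.append(x)
    while len(outliers) < n_out:
        # Inflate the spread on a random subset of dimensions.
        dims = set(rng.sample(range(dim), max(1, dim // 2)))
        x = draw(dims, inflate)
        if min_dist(x) > cutoff:         # keep points far from every cluster
            outliers.append(x)
    return inliers, outliers, (means, stds, cutoff)
```

Rejection sampling keeps the labels consistent with the Mahalanobis rule: a candidate drawn from an inflated component is only labelled an outlier if it actually lands outside every cluster's inlier region.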

Zero-shot inference is performed by passing the inlier training set of a previously unseen dataset and each unlabeled test point into the frozen model; no adaptation or parameter tuning is required.

Empirical evaluation on 57 datasets (tabular and embedding-based) against 26 baseline OD methods shows FoMo-0D to be highly competitive, with mean AUROC rank statistically indistinguishable from top methods (e.g., $p=0.106$ vs. kNN on all datasets). Inference is efficient—$7.7$ ms/sample (NVIDIA RTX A6000)—and requires no retraining per new dataset. Limitations include reliance on Gaussian-like data marginals and a pretraining prior that does not capture categorical or complex dependencies, though generalization holds to $d\le 500$ with minor accuracy loss (Shen et al., 2024).

4. Mathematical Foundations and Simplifying Assumptions

Across its applications, FoMo-0D employs simplifications that underlie both its tractability and limitations. In legal eDiscovery, the model presumes documents are i.i.d. carriers of single (mutually independent) factoids, for which the coverage of rare factoids dominates the discovery process. In cardiovascular modeling, 0D abstraction is justified as an asymptotic reduction: spatial details are collapsed into lumped circuits parameterized by geometry, yielding fast and interpretable surrogates at the expense of intra-segment granularity. In zero-shot OD, the Bayesian formulation is dependent on the fidelity of the synthetic data prior, and attention-based Transformer modules are made tractable by routerization.

These frameworks benefit when object frequency distributions are heavy-tailed (e.g., following Zipf’s or Heaps’ law), as a small sample quickly captures high-prevalence entities, while rare entities determine asymptotic coverage or detection limits.

5. Practical Implications and Empirical Performance

In eDiscovery, FoMo-0D provides a quantitative tool to argue that additional document review is unlikely to yield novel factoids, thus guiding efficient discovery processes with rigorously bounded omission risk. For cardiovascular engineering, the FoMo-0D pipeline enables rapid development, uncertainty quantification, and exploration of patient models on commodity hardware, matching 3D reference simulations to within 10% for clinically relevant outputs in anatomically variable scenarios. As a zero-shot outlier detector, FoMo-0D streamlines OD deployment in practice, dispensing with model selection hurdles and enabling millisecond-scale inference in streaming or real-time settings.

The main limitations across domains include sensitivity to the statistical assumptions inherent in each abstraction—e.g., factoid independence in eDiscovery, linear wall mechanics in hemodynamics, and distributional mismatch in zero-shot OD on real-world data.

6. Future Directions and Extensions

Potential advances for FoMo-0D frameworks include:

In eDiscovery, using occupancy models parameterized by empirically estimated frequency distributions or integrating semantic relations between factoids.
For lumped-parameter cardiovascular models, incorporating dynamic loss models at junctions, segmenting vessels more finely, and representing nonlinear or active vessel wall properties via variable elastance or viscoelasticity. Augmentation with machine learning at key network elements (e.g., junctions or stenoses) could improve fidelity.
In zero-shot OD, enriching the synthetic pretraining prior to model categorical features, contextual/correlated anomalies, and causal dependencies, as well as incorporating real datasets or pre-trained generative models. Extension to contaminated, streaming, or multimodal (image, text, graph) OD tasks is a plausible trajectory.

Each branch of FoMo-0D demonstrates that 0D abstractions, when judiciously constructed and validated, provide interpretable, computationally efficient, and scalable solutions to complex real-world problems, with substantial room for further innovation across both methodological and applied axes.