Medical Event Data Standard (MEDS) Overview

Updated 14 January 2026

MEDS is a minimal, event-centric schema that represents each clinical observation as a tuple, promoting standardized and portable ML workflows.
MEDS-Tab automates feature extraction and model tuning through sliding-window and fixed-horizon aggregations, achieving robust performance on tasks like MIMIC-IV mortality prediction.
MEDS-OWL extends the framework for semantic interoperability, ensuring FAIR data practices via formal ontologies and rigorous SHACL validation.

The Medical Event Data Standard (MEDS) is a minimal, event-centric schema for structured electronic health record (EHR) data that is designed to enable reproducible, scalable, and semantically precise ML workflows across diverse clinical datasets. The MEDS framework facilitates the tabularization and semantic integration of longitudinal clinical data, supporting both baseline ML modeling through standardized aggregation pipelines and interoperability with the Semantic Web ecosystem via formal ontologies.

1. Foundational Data Model and Representation

At its core, MEDS represents each clinical observation as a tuple $e = (s, t, c, v)$, where $s$ is the subject identifier, $t \in \mathbb{R}^+$ is a timestamp, $c$ is a standardized code (such as ICD, LOINC, or ATC), and $v \in \mathbb{R} \cup \{\varnothing\}$ is an optional numeric measurement, with the null value $\varnothing$ used for categorical codes (Oufattole et al., 2024, Marfoglia et al., 7 Jan 2026). MEDS distinguishes four event types:

Static categorical codes (e.g., sex, race)
Static numeric values (e.g., birthweight)
Time-series categorical codes (e.g., diagnoses, medications)
Time-series numeric measurements (e.g., lab results, vitals)

Storage is inherently subject-sharded, utilizing Parquet, JSON, or CSV files with companion metadata files that record the code taxonomy, static/time-series assignments, and code frequencies. Formally, a MEDS dataset is defined as a tuple $(D_{\text{shards}}, \text{code\_metadata})$ , where each shard contains all events for a subset of subjects.
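The tuple representation and subject sharding described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the MEDS reference schema: the `Event` class, the use of `None` as the timestamp of a static event, the example code strings, and the modulo sharding rule are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    subject_id: int                        # s: subject identifier
    time: Optional[float]                  # t: timestamp; None marks a static event (assumption)
    code: str                              # c: standardized code string (examples are made up)
    numeric_value: Optional[float] = None  # v: None for purely categorical codes


def event_type(e: Event) -> str:
    """Classify an event into one of the four event types listed above."""
    temporal = "static" if e.time is None else "time-series"
    kind = "categorical" if e.numeric_value is None else "numeric"
    return f"{temporal} {kind}"


def shard_by_subject(events, n_shards):
    """Group events so that all events of one subject land in the same shard."""
    shards = defaultdict(list)
    for e in events:
        # Hypothetical sharding rule: any function of subject_id alone would do.
        shards[e.subject_id % n_shards].append(e)
    return dict(shards)


events = [
    Event(1, None, "SEX//F"),
    Event(1, None, "BIRTHWEIGHT", 3.2),
    Event(1, 10.0, "ICD10//I10"),
    Event(2, 12.5, "LOINC//2345-7", 5.4),
]
shards = shard_by_subject(events, n_shards=2)
```

Because the shard key depends only on the subject identifier, any per-subject computation can run on one shard without consulting the others.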

Key entities:

Subject: Primary unit of analysis.
Event: Atomic record linking subject, code, time, and value.
Code: Standardized classification identifier for observations.
DatasetMetadata: Provenance and versioning.
SubjectSplit: Experimental partitioning (e.g., train/validation/test).
SubjectLabel: Supervised task target (with cutoff time and label value).

This normalized, schema-agnostic representation supports generic, portable data extraction and machine-learning pipelines, circumventing idiosyncrasies and heterogeneities of existing EHR database schemas (Oufattole et al., 2024, Marfoglia et al., 7 Jan 2026).

2. Tabularization and Baseline ML Pipeline (MEDS-Tab)

MEDS-Tab is a fully automated, end-to-end system for generating tabular features and high-caliber baseline models from MEDS-format datasets (Oufattole et al., 2024). The process consists of five main stages:

Describe: Compute per-code frequencies and static/time-series status, yielding metadata for downstream featurization.
Tabularize-Static: Transform static events into a subject-by-feature matrix that is logically dense but stored sparsely.
Tabularize-Time-Series: For each subject, window $\Delta$ , and aggregation function $f$ , compute features at every event time. This utilizes efficient pre-sorting and rolling-index computations (via Polars) and sparse storage formats (CSR/CSC).
Cache-Task: Extract per-label slices aligned to prediction events for memory efficiency; only rows relevant to label triplets $(s, t_{\text{pred}}, y)$ are loaded during modeling.
Model: Use only task-relevant shards to launch an Optuna-powered AutoML routine over XGBoost or scikit-learn classifiers, tuning both model and featurization parameters.

Key implementation techniques include parallelization by subject shard, exploitation of event-code sparsity, and task-specific caching to minimize memory overhead.
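As a rough illustration of the Describe stage, the sketch below computes per-code frequencies and a static/time-series flag from raw event tuples. It is not the MEDS-Tab implementation; the function name `describe`, the tuple layout, and the rule that a code is static only if it never carries a timestamp are assumptions of this example.

```python
from collections import Counter


def describe(events):
    """Summarize codes from (subject_id, time, code, value) tuples.

    A time of None is taken to mean a static event (assumption of this sketch).
    """
    counts = Counter()
    is_static = {}
    for _subject, time, code, _value in events:
        counts[code] += 1
        # A code stays "static" only while every occurrence lacks a timestamp.
        is_static[code] = is_static.get(code, True) and time is None
    return {
        code: {"count": counts[code], "static": is_static[code]}
        for code in counts
    }


events = [
    (1, None, "SEX//F", None),
    (1, 3.0, "LAB//GLU", 5.4),
    (2, 4.0, "LAB//GLU", 6.1),
    (2, None, "SEX//M", None),
]
meta = describe(events)
```

The resulting dictionary plays the role of the code metadata that later stages read to decide which feature columns to build.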

3. Feature Engineering and Aggregation Methods

MEDS-Tab supports two main featurization paradigms:

Sliding-Window: Generate a feature vector at each event timestamp for dynamic prediction tasks.
Fixed-Horizon: Aggregate events within varying window widths up to a fixed cutoff $t_{\text{pred}}$ .

For any time-series code $c$ and subject $s$, define

$E_{s,c} = \{(t', v') : (s, t', c, v') \text{ is an event in the dataset}\}$

and the window

$W_{s,c}(t, \Delta) = \{(t', v') \in E_{s,c} : t - \Delta \le t' \le t\}$

Common aggregation functions $f$ applied over each window yield:

$f_{\text{count}}$: Event counts.
$f_{\text{mean}}$: Mean of numeric values if present.
$f_{\text{std}}$: Standard deviation of numeric values.
$f_{\text{last}}$: Last observed numeric value prior to $t$.

Categorical codes are featurized via count aggregations only. Each $(c, \Delta, f)$ triplet forms a unique sparse feature column. Users specify the set of windows $\{\Delta_1, \dots, \Delta_K\}$ and aggregation functions as pipeline parameters.
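A minimal pure-Python sketch of the window aggregations for a single subject and code follows. It assumes a closed window covering times from t − Δ to t inclusive and uses the sample standard deviation; both choices, along with the function names, are assumptions of this example rather than documented MEDS-Tab behavior, which relies on Polars rolling computations.

```python
import math


def window_values(history, t, delta):
    """Values of events whose time t' satisfies t - delta <= t' <= t.

    history is a time-sorted list of (time, value) pairs for one subject and code.
    """
    return [v for (tp, v) in history if t - delta <= tp <= t]


def aggregate(history, t, delta):
    """Compute count, mean, std, and last over one window ending at t."""
    vals = window_values(history, t, delta)
    numeric = [v for v in vals if v is not None]
    n = len(numeric)
    mean = sum(numeric) / n if n else None
    # Sample standard deviation; undefined for fewer than two values.
    std = math.sqrt(sum((x - mean) ** 2 for x in numeric) / (n - 1)) if n > 1 else None
    return {
        "count": len(vals),                      # counts categorical events too
        "mean": mean,
        "std": std,
        "last": numeric[-1] if numeric else None,
    }


history = [(1.0, 5.0), (3.0, 7.0), (8.0, 6.0)]
short = aggregate(history, t=8.0, delta=5.0)          # covers t' in [3, 8]
lifetime = aggregate(history, t=8.0, delta=math.inf)  # lifetime window
```

Passing `math.inf` as the width gives the long or lifetime window, so one function covers every window size in the configured set.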

4. Baseline Modeling and Empirical Benchmarking

The default modeling strategy in MEDS-Tab is XGBoost classification, with automatic tuning across a constrained space of five hyperparameters.

Datasets are automatically split by subject (default 70/15/15 for train/validation/test). Core evaluation metrics include AUC-ROC (primary), accuracy, and Brier score for calibration. Owing to sparse storage and streaming, MEDS-Tab supports efficient training on cohorts with hundreds of millions of events (Oufattole et al., 2024).
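Splitting by subject rather than by event keeps all of a patient's records in one partition, which avoids leakage between train and test. The sketch below shows one way to do a seeded 70/15/15 subject-level split; the function name, the seed handling, and the rounding rule are assumptions of this example, not the MEDS-Tab splitting code.

```python
import random


def split_subjects(subject_ids, fractions=(0.70, 0.15, 0.15), seed=0):
    """Assign each distinct subject to exactly one of train/validation/test."""
    ids = sorted(set(subject_ids))       # sort first so the result depends only on the seed
    random.Random(seed).shuffle(ids)
    n_train = round(fractions[0] * len(ids))
    n_val = round(fractions[1] * len(ids))
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],   # remainder, so no subject is dropped
    }


splits = split_subjects(range(100))
```

Events are then routed to a partition by looking up their subject identifier, so no subject contributes rows to more than one partition.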

Empirical results on canonical tasks show:

30-day post-discharge mortality (MIMIC-IV): AUC = 0.935
1-year post-discharge mortality (MIMIC-IV): AUC = 0.898
30-day readmission (MIMIC-IV): AUC = 0.708
In-hospital mortality (MIMIC-IV, eICU): AUC = 0.812, 0.855
ICU and hospital LOS > 3 days: AUC = 0.946, 0.943

As for scalability, benchmark comparisons with TSFresh and CaTabRa on MIMIC-IV and eICU datasets show that MEDS-Tab uses far less memory and wall time by exploiting event sparsity and parallelization: for 500 MIMIC-IV patients it averages 1.4 GB of RAM, versus 217 GB for TSFresh and 14 GB for CaTabRa.

Dataset (patients)	MEDS-Tab time (avg RAM)	TSFresh time (avg RAM)	CaTabRa time (avg RAM)
MIMIC-IV (10)	0:02 (0.42 GB)	1:41 (84 GB)	0:15 (2.5 GB)
MIMIC-IV (500)	0:16 (1.4 GB)	5:09 (217 GB)	3:17 (14 GB)
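The memory gap comes largely from storing only non-zero feature cells. The sketch below builds a compressed sparse row (CSR) layout by hand to show how little is stored when most cells are empty. It is a teaching example in plain Python with made-up data, not the storage code used by MEDS-Tab.

```python
def to_csr(rows):
    """Build CSR arrays from rows given as {column_index: value} dicts.

    Only non-zero entries appear in the dicts, so only they are stored.
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col in sorted(row):
            data.append(row[col])
            indices.append(col)
        indptr.append(len(data))  # marks where the next row starts
    return data, indices, indptr


def csr_row(csr, i, n_cols):
    """Expand row i back to a dense list, filling missing cells with 0.0."""
    data, indices, indptr = csr
    dense = [0.0] * n_cols
    for k in range(indptr[i], indptr[i + 1]):
        dense[indices[k]] = data[k]
    return dense


n_cols = 10
rows = [{0: 2.0, 7: 1.0}, {}, {3: 5.5}]   # three subjects, mostly empty features
csr = to_csr(rows)
stored = sum(len(part) for part in csr)   # numbers actually kept in memory
dense_cells = len(rows) * n_cols          # cells a dense matrix would need
```

Here the sparse layout keeps 10 numbers where a dense matrix would hold 30, and the ratio improves as the number of feature columns grows while each subject touches only a few of them.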

5. Semantic Interoperability: MEDS-OWL and FAIR Alignment

Recognizing the need for semantic interoperability and FAIR (Findable, Accessible, Interoperable, Reusable) data principles, MEDS is further formalized as the MEDS-OWL ontology, an OWL 2 DL-based model that makes MEDS datasets usable with Semantic Web tooling (Marfoglia et al., 7 Jan 2026).

Ontology Components

13 Classes: Including local MEDS classes (Subject, Event, Code, DatasetMetadata, SubjectSplit, SubjectLabel) and imported classes (e.g., dcat:Dataset, prov:Entity).
10 Object Properties: e.g., meds:hasSubject, meds:hasCode, meds:parentCode, prov:wasDerivedFrom.
20 Data Properties: Including meds:subjectId, meds:time, meds:codeString, meds:numericValue.
24 Key OWL Axioms: Enforce subclass relationships, cardinality, and domain/range restrictions (e.g., each Event must have exactly one subject and code).

Conversion and Validation

The meds2rdf Python library deterministically maps MEDS event rows to corresponding RDF triples using rdflib, supporting seamless conversion of MEDS datasets to knowledge-graph form. Structural and semantic validation is enforced using SHACL NodeShapes, which guarantee, for instance, that each Event node has exactly one linked subject and code, correct data types, and mutually exclusive label modalities.

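A representative SHACL constraint for Event validation targets meds:Event nodes and requires exactly one meds:hasSubject value and exactly one meds:hasCode value (minimum and maximum count of 1 on each property).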
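To make the row-to-triple mapping and the cardinality rule concrete, the sketch below uses plain Python tuples in place of rdflib triples and a hand-written check in place of a SHACL engine. It does not use or reproduce the meds2rdf API; the node naming scheme, the function names, and the prefixed strings such as `meds:Event` are stand-ins chosen for this example.

```python
def event_to_triples(row_index, event):
    """Deterministically map one (subject_id, time, code, value) row to triples."""
    subject_id, time, code, value = event
    node = f"event:{row_index}"          # hypothetical node naming scheme
    triples = [
        (node, "rdf:type", "meds:Event"),
        (node, "meds:hasSubject", f"subject:{subject_id}"),
        (node, "meds:hasCode", f"code:{code}"),
    ]
    if time is not None:
        triples.append((node, "meds:time", time))
    if value is not None:
        triples.append((node, "meds:numericValue", value))
    return triples


def cardinality_violations(triples, predicate, min_count=1, max_count=1):
    """Return Event nodes whose number of `predicate` values is out of range.

    This mimics what a min/max count constraint on an Event shape asserts.
    """
    counts = {s: 0 for (s, p, o) in triples if p == "rdf:type" and o == "meds:Event"}
    for s, p, _o in triples:
        if s in counts and p == predicate:
            counts[s] += 1
    return sorted(n for n, c in counts.items() if not min_count <= c <= max_count)


rows = [(1, 10.0, "ICD10//I10", None), (2, 12.5, "LOINC//2345-7", 5.4)]
graph = [t for i, row in enumerate(rows) for t in event_to_triples(i, row)]
ok = cardinality_violations(graph, "meds:hasCode")

# Drop the code link of the second event to see the check fire.
broken = [t for t in graph if not (t[0] == "event:1" and t[1] == "meds:hasCode")]
bad = cardinality_violations(broken, "meds:hasCode")
```

In a real pipeline the same rule would be stated declaratively in a SHACL shape and evaluated by a validator, which also reports data type and other violations.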

6. Best Practices and Limitations

For enhanced performance and reproducibility in ML applications:

Include at least three window widths (Δ), including a long/lifetime window.
Apply both f_count and f_last aggregations for time-series codes.
Filter codes with extremely low prevalence via automatic options.
Rely on tree-based models' native handling of missingness (no additional imputation or normalization is needed).
Use task-specific caching to limit in-memory data for large tasks (setting iterator flags as needed).
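The low-prevalence filter from the list above can be sketched as follows. Prevalence is measured here as the number of distinct subjects in which a code appears, and `min_subjects` is a hypothetical parameter name; both are assumptions of this example and do not correspond to a documented MEDS-Tab option.

```python
def filter_rare_codes(events, min_subjects=2):
    """Drop events whose code appears in fewer than min_subjects distinct subjects.

    events is a list of (subject_id, time, code, value) tuples.
    """
    subjects_per_code = {}
    for subject_id, _time, code, _value in events:
        subjects_per_code.setdefault(code, set()).add(subject_id)
    keep = {c for c, subs in subjects_per_code.items() if len(subs) >= min_subjects}
    return [e for e in events if e[2] in keep]


events = [
    (1, 1.0, "LAB//GLU", 5.4),
    (2, 2.0, "LAB//GLU", 6.1),
    (2, 3.0, "LAB//GLU", 6.3),
    (3, 4.0, "DX//RARE", None),
]
kept = filter_rare_codes(events, min_subjects=2)
```

Counting distinct subjects rather than raw occurrences prevents a code that is recorded many times for a single patient from passing the filter.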

To adapt to new datasets or prediction tasks, it suffices to prepare the MEDS-formatted event file and associated label table, then run the standard five-stage CLI workflow; no additional custom coding is required (Oufattole et al., 2024).

Limitations primarily reflect the event-centric model’s design: while well-suited to tabular/longitudinal feature extraction and time-aware ML workflows, higher-fidelity graph-based analytics or semantic queries require the additional transformation and ontological mapping provided by MEDS-OWL and meds2rdf (Marfoglia et al., 7 Jan 2026).

7. Scientific Impact and Outlook

MEDS and its derivative systems (MEDS-Tab, MEDS-OWL) establish a standardized, practical, and semantically robust foundation for EHR-based ML research. The minimal event-centric schema, automated tabularization, and semantic graph export collectively address the major challenges of interoperability, reproducibility, and scalability in clinical informatics. The integration with the Semantic Web via MEDS-OWL and SHACL supports FAIR alignment and provenance tracking, thus facilitating transparent publishing and distributed analytics over heterogeneous, event-based clinical data.

A plausible implication is that widespread adoption of MEDS and MEDS-OWL could enable the community to develop, evaluate, and disseminate generalizable, portable machine-learning models across diverse clinical data sources and analytic workflows, while maintaining robust semantic interoperability and provenance tracking (Oufattole et al., 2024, Marfoglia et al., 7 Jan 2026).

References

Oufattole et al. (2024). MEDS-Tab: Automated tabularization and baseline methods for MEDS datasets.

Marfoglia et al. (2026). Clinical Data Goes MEDS? Let's OWL make sense of it.
