RS-FMD: Remote Sensing Foundation Model Database
- RS-FMD is a structured, queryable database that unifies fragmented remote sensing model documentation into a standardized schema.
- It streamlines model discoverability and reproducibility by consolidating metadata on architectures, modalities, datasets, and benchmark performance.
- The database employs automated filtering and ranking using dense embeddings and LLM re-ranking to efficiently support diverse Earth observation applications.
The Remote Sensing Foundation Model Database (RS-FMD) is a structured, machine-readable knowledge base that consolidates technical and performance metadata for over 150 publicly released remote sensing foundation models (RSFMs). Designed to address the challenges of model discoverability, reproducibility, and automated retrieval, RS-FMD serves as the core reference infrastructure for large-scale model selection, benchmarking, and deployment across diverse remote sensing tasks, modalities, and application domains.
1. Scope, Purpose, and Core Principles
RS-FMD aims to unify the fragmented documentation surrounding remote sensing foundation models—encompassing vision, multimodal, and vision-language architectures—into a schema-guided, fully queryable catalog. It exposes granular metadata for each RSFM, including architecture, input modalities (e.g., multispectral, SAR, hyperspectral), spatial and temporal resolution, pretraining datasets, downstream benchmark results, and deployment constraints. The database supports both human and agent-based retrieval, enabling transparent, constraint-aware model recommendation and formal comparison for critical Earth observation applications such as semantic segmentation, land-use classification, change detection, object detection, and visual question answering (Chen et al., 21 Nov 2025, Xiao et al., 2024).
RS-FMD's explicit objectives are:
- Standardizing model documentation and linking every record to primary sources (papers, code, weights) for computational reproducibility.
- Providing maximal coverage of RSFM diversity, ranging from unimodal optical models to large multimodal transformers, with spatial resolutions spanning sub-meter to multi-kilometer.
- Enabling automated filtering and ranking of models under precise user- or system-defined constraints, such as target modality, minimum performance threshold, or deployment cost.
2. Schema Design and Metadata Structure
Each RS-FMD entry is materialized as a JSON object validated against a pydantic/JSON Schema definition with strict typing and mandatory provenance fields. Top-level fields encode unique model identifiers, canonical names, version tags, release/update timestamps, free-text summaries, paper/code/weight URLs, and hardware requirements. Technical metadata includes backbone architecture (e.g., SwinV2, ViT-Large), model size (parameter count), layer depth, modalities (e.g., "multispectral", "SAR", "optical"), supported sensors (e.g., Sentinel-2, MODIS), spatial and temporal resolution, and pretext training paradigm (e.g., "Masked Autoencoder", "contrastive", "multi-task").
Additional nested structures describe two core phases:
- PretrainingPhase: dataset name, geographic and temporal coverage, number of images, patch/token size, masking ratio, batch size, epochs, learning rate, sampling, augmentations.
- Benchmark: each downstream application with dataset name, task type, sensor, region, evaluation metrics (with values), and performance details such as number of classes and image resolution.
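The record layout described above can be sketched in code. The actual database validates entries against a pydantic/JSON Schema definition; the following is a minimal stand-in using stdlib dataclasses, with field names taken from the abridged table in this section and a hypothetical `validate` method standing in for schema validation.

```python
from dataclasses import dataclass, field

@dataclass
class Benchmark:
    # One downstream evaluation entry, per the Benchmark structure above.
    dataset: str   # e.g. "EuroSAT"
    task: str      # e.g. "scene classification"
    metric: str    # e.g. "accuracy"
    value: float

@dataclass
class ModelRecord:
    # Top-level RS-FMD fields, following the abridged example table.
    model_name: str
    backbone: str
    num_parameters: int
    spatial_resolution: str
    modalities: list[str]
    pretext_training_type: str
    pretraining_dataset: str
    supported_tasks: list[str]
    license: str
    benchmarks: list[Benchmark] = field(default_factory=list)

    def validate(self) -> None:
        # Stand-in for the schema's strict typing / mandatory-field checks.
        if not self.model_name or not self.modalities:
            raise ValueError("model_name and modalities are mandatory")
        if self.num_parameters <= 0:
            raise ValueError("num_parameters must be positive")

record = ModelRecord(
    model_name="SatVision-TOA",
    backbone="SwinV2-Giant",
    num_parameters=3_000_000_000,
    spatial_resolution="1 km",
    modalities=["multispectral"],
    pretext_training_type="Masked-Image-Modeling",
    pretraining_dataset="MODIS L1B All-Sky",
    supported_tasks=["cloud retrieval"],
    license="CC0",
)
record.validate()  # passes: all mandatory fields present and typed
```

In the real pipeline, JSON Schema validation additionally enforces the provenance fields (paper/code/weight URLs) omitted here for brevity.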
Example (abridged) main fields:
| Field | Example Value | Description |
|---|---|---|
| model_name | "SatVision-TOA" | Official model name |
| backbone | "SwinV2-Giant" | Network backbone |
| num_parameters | 3,000,000,000 | Model size (parameters) |
| spatial_resolution | "1 km" | Ground sampling distance |
| modalities | ["multispectral"] | Input data types |
| pretext_training_type | "Masked-Image-Modeling" | Self-supervised paradigm |
| pretraining_dataset | "MODIS L1B All-Sky" | Main training corpus |
| supported_tasks | ["cloud retrieval"] | Downstream capabilities |
| license | "CC0" | Usage license |
The complete schema supports logical AND/OR queries and can be extended to encode new task types, sensor classes, and application domains as the field evolves (Chen et al., 21 Nov 2025, Xiao et al., 2024, Huang et al., 28 Mar 2025).
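The logical AND/OR queries the schema supports can be illustrated with a small recursive evaluator. The nested query syntax below (`{"AND": [...]}`, `{"OR": [...]}`) is our own illustration, not the official RS-FMD query API.

```python
def matches(record: dict, query: dict) -> bool:
    """Recursively evaluate a nested AND/OR query against one JSON record."""
    if "AND" in query:
        return all(matches(record, q) for q in query["AND"])
    if "OR" in query:
        return any(matches(record, q) for q in query["OR"])
    # Leaf clause: exact match or list membership on a single field.
    value = record.get(query["field"])
    if "equals" in query:
        return value == query["equals"]
    if "contains" in query:
        return isinstance(value, list) and query["contains"] in value
    return False

records = [
    {"model_name": "SatVision-TOA", "modalities": ["multispectral"], "license": "CC0"},
    {"model_name": "SAR-FM", "modalities": ["SAR"], "license": "MIT"},
]
# "multispectral input AND (CC0 OR MIT license)"
query = {"AND": [
    {"field": "modalities", "contains": "multispectral"},
    {"OR": [{"field": "license", "equals": "CC0"},
            {"field": "license", "equals": "MIT"}]},
]}
hits = [r["model_name"] for r in records if matches(r, query)]
# hits == ["SatVision-TOA"]
```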
3. Coverage, Sources, and Curation Pipeline
RS-FMD encompasses over 150 RSFMs (as of the most recent update), covering:
- Modalities: SAR, multispectral, hyperspectral, LiDAR, optical, vision–language pairs.
- Model types: Vision Foundation Models (VFMs), vision–language models (VLMs), LLMs for RS, generative models, multimodal MAEs, and task-specific specialized FMs.
- Spatial resolution range: sub-meter (ultra-high-res aerial) to coarse (e.g., MODIS TOA radiances at 1 km).
- Temporal coverage: from fixed single-date snapshots to dense annual/seasonal revisits.
The database is populated through a pipeline involving:
- Source aggregation: systematic survey of arXiv, primary literature, model cards, GitHub repositories.
- Automated field extraction by schema-aware LLMs (multiple GPT calls per record, with consistency voting).
- Confidence and provenance tracking for extracted metadata; fields below a 0.75 confidence threshold are manually verified.
- Storage as versioned JSONL, change-tracked via DVC, with regular periodic and author-submitted update cycles.
All entries are explicitly linked to publications, code bases, and (where available) downloadable pretrained weights for direct reproducibility and operational deployment.
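The consistency-voting and confidence-threshold steps of the curation pipeline can be sketched as follows. The 0.75 threshold comes from the text; the majority-vote rule and the agreement-fraction confidence measure are our assumptions about how the multiple GPT extractions per record are reconciled.

```python
from collections import Counter

CONFIDENCE_THRESHOLD = 0.75  # fields below this are manually verified

def vote(extractions: list[str]) -> tuple[str, float, bool]:
    """Majority-vote one field's value across repeated LLM extraction calls.

    Returns (value, confidence, needs_manual_review), where confidence is
    the fraction of calls agreeing with the majority answer (an assumed
    stand-in for the pipeline's actual confidence measure).
    """
    winner, count = Counter(extractions).most_common(1)[0]
    confidence = count / len(extractions)
    return winner, confidence, confidence < CONFIDENCE_THRESHOLD

# Four extraction calls agree 3-to-1 on the backbone field:
value, conf, review = vote(
    ["SwinV2-Giant", "SwinV2-Giant", "SwinV2-G", "SwinV2-Giant"]
)
# value == "SwinV2-Giant", conf == 0.75, review is False (0.75 meets the bar)
```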
4. Model Indexing, Retrieval, and Ranking
To support automated and agentic model recommendation (notably in the REMSA LLM-based selection framework), RS-FMD deploys a multi-stage selection and ranking mechanism:
- Dense embedding retrieval: Each field is prepended with a typed token and Sentence-BERT encoded, resulting in high-fidelity vector representations. User queries are encoded similarly, and FAISS index-based searches retrieve top-K by cosine similarity.
- Rule-based filtering: Any candidate not meeting hard constraints (e.g., required modality, spatial resolution) is dropped preemptively.
- In-context LLM ranking: remaining candidates are re-ranked by GPT-4.1 with a few-shot prompt, prioritizing exact constraint matches. Each ranking carries a confidence score computed as a weighted combination of log-probability and self-consistency scores; when confidence falls below the threshold θ = 0.75, the system issues a clarification request rather than a final recommendation.
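The dense-retrieval and rule-filtering stages above can be sketched end to end. A toy bag-of-tokens embedding stands in for Sentence-BERT and a linear scan stands in for the FAISS index; both substitutions, along with the field names, are ours. The typed-token prefixing (each value tagged with its field name) follows the description above.

```python
import math
from collections import Counter

def embed(record: dict) -> Counter:
    """Typed-token encoding: prefix each value with its field name."""
    tokens = []
    for field_name, value in record.items():
        values = value if isinstance(value, list) else [value]
        tokens += [f"{field_name}:{v}" for v in values]
    return Counter(tokens)

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: dict, records: list[dict], k: int,
             required_modality: str) -> list[dict]:
    # Stage 1: top-K by cosine similarity (linear scan replaces FAISS here).
    qv = embed(query)
    ranked = sorted(records, key=lambda r: cosine(embed(r), qv), reverse=True)[:k]
    # Stage 2: rule-based filtering drops candidates failing hard constraints.
    return [r for r in ranked if required_modality in r.get("modalities", [])]

records = [
    {"model_name": "SatVision-TOA", "modalities": ["multispectral"],
     "tasks": ["cloud retrieval"]},
    {"model_name": "SAR-FM", "modalities": ["SAR"],
     "tasks": ["change detection"]},
]
query = {"modalities": ["multispectral"], "tasks": ["cloud retrieval"]}
top = retrieve(query, records, k=2, required_modality="multispectral")
# top contains only SatVision-TOA; SAR-FM fails the modality constraint
```

The final LLM re-ranking stage is omitted, since it depends on a live GPT-4.1 call.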
APIs for retrieval include RESTful endpoints (e.g., /models, /models/{id}, /models/query), supporting string, range, and structured schema queries directly aligned with the RS-FMD record layout (Chen et al., 21 Nov 2025).
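A request against the `/models/query` endpoint might be assembled as below. The endpoint path comes from the text; the host and the parameter names (`modality`, `min_mIoU`) are hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://rs-fmd.example.org"  # hypothetical host, for illustration

def query_url(filters: dict) -> str:
    """Build a /models/query URL from a flat dict of filter parameters."""
    return f"{BASE}/models/query?{urlencode(filters)}"

url = query_url({"modality": "SAR", "min_mIoU": 0.7})
# url == "https://rs-fmd.example.org/models/query?modality=SAR&min_mIoU=0.7"
```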
5. Benchmarking, Contribution, and Integration Ecosystem
RS-FMD is integrated into major benchmarks, cataloging each model’s performance on primary datasets and tasks. Referenced benchmarks include BigEarthNet, EuroSAT, Onera Satellite Change Detection, SpaceNet, fMoW, DIOR-H/R, and DFC23, among others. For each RSFM, metrics (e.g., accuracy, mIoU, F1, mAP, RMSE) are recorded against canonical test protocols.
Open contribution is supported for both models and datasets; entries require complete key–value metadata per the schema, validated via automated linting and continuous integration checks. Accepted pull requests update both the Markdown/JSON index and auxiliary scripts for community queries (including shell, REST, and GraphQL examples). Versioning and provenance tracking ensure the reproducibility and reliability of all cataloged resources (Xiao et al., 2024, Huang et al., 28 Mar 2025).
Practical deployment and integration are standardized: all records encode inference requirements, permissible downstream tasks, spatial and temporal scope, and licensing. This enables rapid operationalization and integration into workflows by both human users and automated agents.
6. Impact, Applications, and Principal Limitations
RS-FMD, together with agentic query systems such as REMSA, constitutes the first schema-driven, end-to-end pipeline for model selection in remote sensing. Its direct impacts are:
- Replacing manual, ad-hoc searches with reproducible, explainable model recommendation and ranking.
- Enabling cross-model comparison along axes of modality, task type, scale, and benchmark-tested performance, facilitating methodological advances and meta-analyses.
- Supporting downstream constraint propagation, such as hardware capacity, modality alignment, or required accuracy for regulatory or mission-critical use cases.
Principal limitations include the lag between publication and schema ingestion for cutting-edge models, and the inherent need for ongoing human verification as new modalities (e.g., off-nadir, night, Doppler SAR) and complex benchmarks are introduced. Continued community contributions and automated scans partially mitigate these limitations (Chen et al., 21 Nov 2025, Xiao et al., 2024, Huang et al., 28 Mar 2025).
7. Relationship to Broader RSFM and Model Registry Ecosystems
RS-FMD is tightly aligned with other cataloging efforts, such as “awesome-RSFMs” (Xiao et al., 2024), and taxonomy-driven surveys (Huang et al., 28 Mar 2025). It adopts and extends standard metadata conventions, supporting both human curation and LLM-driven extraction, enrichment, and reasoning. As a result, RS-FMD constitutes the definitive foundation model registry for remote sensing AI, undergirding both research and operational deployments at scale.