OJALÁ: Optimizing J-PAS Astronomy for Large-scale Analysis. A foundation model for the SED of galaxies, QSOs and stars

Published 1 Apr 2026 in astro-ph.GA and astro-ph.IM | (2604.00661v1)

Abstract: The advent of large-scale surveys requires efficient ML techniques to exploit the information of massive datasets. We present OJALA, a transformer-based autoregressive foundation model designed to simultaneously classify astronomical objects and infer their physical parameters using 54 narrow bands from J-PAS, combined with broad bands from the DESI Legacy Imaging Surveys and WISE. The model is trained on $\sim20$ million synthetic SEDs generated from DESI DR1 spectra. We validate OJALA using a cross-matched sample of $\sim121,000$ objects between J-PAS and DESI. The model achieves a weighted F1-score of approximately 0.9 for spectral classification (stars, galaxies, and QSOs) at $i < 21$. For galaxies, we recover photo-z with a precision of $\sigma_{\rm NMAD} < 0.01$, while for QSOs, the precision improves significantly at $z > 1.5$, reaching $\sigma_{\rm NMAD} \approx 0.006$ at $z \approx 3.5$. We demonstrate robust estimation of physical properties for galaxies, recovering stellar masses and SFR with a scatter of approximately 0.11 dex and 0.22 dex, respectively. Furthermore, the model accurately predicts EWs for major optical emission lines, allowing for the derivation of extinction-corrected H$\alpha$ luminosities with a scatter of 0.29 dex. OJALA successfully reproduces the BPT and WHAN diagnostic diagrams, classifying SF, AGN, and passive galaxies with F1-scores typically ranging from 70% to 90% depending on the diagnostic class. For stars, the model reliably infers effective temperature and metallicity, though surface gravity remains challenging. Finally, we show the modularity of the architecture by fine-tuning the pre-trained embeddings to predict BH masses, a property not included in the primary training, recovering spectroscopic virial estimates with a precision of approximately 0.5 dex. We release the code, model weights, and a comprehensive VAC for the J-PAS EDR.

Authors (40)

Summary

The paper introduces a transformer-based model that robustly performs SED analysis and multi-task inference from J-PAS data.
It employs a masked, autoregressive architecture enabling simultaneous classification and regression for redshifts, emission lines, and physical parameters.
OJALÁ achieves high performance with F1-scores up to 0.9 and redshift precision (σ_NMAD < 0.01), underscoring its scalability for large survey datasets.

Transformer-based Large-scale SED Analysis of J-PAS Photometry: The OJALÁ Foundation Model

Motivation and Context

The exponential growth in astronomical survey data, enabled by facilities such as J-PAS, DESI, and LSST, necessitates the development of scalable, robust algorithms for the automated analysis of both photometric and spectroscopic observations. The J-PAS survey's unique 54 narrow-band system delivers quasi-spectroscopic SEDs, providing robust diagnostic power for redshift estimation, emission-line analyses, and spectral classification, bridging the traditional photometry/spectroscopy divide. However, conventional SED-fitting and supervised ML models struggle at these scales due to high dimensionality, heterogeneity, and incompleteness in survey data.

The OJALÁ model addresses this need by establishing a transformer-based, autoregressive foundation model capable of ingesting variable-context J-PAS and complementary broad-band photometry, including partial or missing inputs, and simultaneously performing classification (galaxy/star/QSO) and regression of a wide suite of physical parameters (e.g., photometric redshift, emission-line EWs, stellar mass, SFR, and stellar parameters) (2604.00661).

Data, Synthetic Mock Pipeline, and Validation

OJALÁ is trained with ∼20 million SEDs derived from DESI DR1 spectra, convolved with the J-PAS filter transmission curves, and supplemented by DESI Legacy Survey and WISE broad-band data. For reliable benchmarking and sim-to-real evaluation, a J-PAS–DESI cross-matched validation sample of 121,065 objects is constructed, comprising 86,140 galaxies, 25,958 stars, and 8,967 QSOs.
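As a rough illustration of the convolution step, the sketch below computes a photon-weighted mean flux density of a spectrum through a single filter. The function names, the wavelength grid, and the top-hat transmission curve are hypothetical, and the photon-counting weighting is an assumption rather than the paper's documented recipe.

```python
def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def synthetic_flux(wave, f_lambda, transmission):
    """Photon-weighted mean flux density of a spectrum through one filter.

    All three sequences share the same wavelength grid (hypothetical inputs).
    """
    weighted = [f * t * w for f, t, w in zip(f_lambda, transmission, wave)]
    norm = [t * w for t, w in zip(transmission, wave)]
    return trapezoid(weighted, wave) / trapezoid(norm, wave)


# Toy check: a flat spectrum through a top-hat filter returns the flat level.
wave = [5000.0 + i for i in range(301)]            # Angstrom, illustrative grid
flat_spectrum = [2.0] * len(wave)                  # arbitrary flux units
top_hat = [1.0 if 5100.0 <= w <= 5245.0 else 0.0 for w in wave]
flux = synthetic_flux(wave, flat_spectrum, top_hat)
```

Repeating the call once per filter curve would give one synthetic SED per spectrum; real transmission curves would first need to be interpolated onto the spectrum's wavelength grid.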

A critical engineering detail is the treatment of photometric errors for training mocks: errors are assigned through an embedding-similarity approach matching synthetic SEDs to real J-PAS objects across magnitude bins. This realistic error sampling ensures consistency between training and inference domains and captures the noise covariance structure characteristic of J-PAS observations.
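One way to picture the error-mapping idea is a nearest-neighbour lookup restricted to a magnitude bin. The sketch below is only an assumption about how such a match could work: the record layout (`mag`, `emb`, `err`), the cosine similarity measure, and the 0.5 mag bin width are all hypothetical choices, not details taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def assign_errors(synthetic, real, bin_width=0.5):
    """For each synthetic SED, copy the errors of the most similar real object
    found in the same magnitude bin (None if the bin has no real objects)."""
    assigned = []
    for s in synthetic:
        s_bin = math.floor(s["mag"] / bin_width)
        candidates = [r for r in real if math.floor(r["mag"] / bin_width) == s_bin]
        if not candidates:
            assigned.append(None)
            continue
        best = max(candidates, key=lambda r: cosine(s["emb"], r["emb"]))
        assigned.append(list(best["err"]))
    return assigned


# Toy data: two-dimensional embeddings and two-band error vectors.
synthetic = [{"mag": 20.2, "emb": [1.0, 0.0]}]
real = [
    {"mag": 20.4, "emb": [0.9, 0.1], "err": [0.05, 0.07]},
    {"mag": 20.1, "emb": [0.0, 1.0], "err": [0.20, 0.30]},
    {"mag": 22.3, "emb": [1.0, 0.0], "err": [0.50, 0.60]},
]
errors = assign_errors(synthetic, real)
```

Restricting the search to a magnitude bin keeps the borrowed errors consistent with the object's brightness, which is the property the paragraph above emphasizes.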

Figure 1: Examples of synthetic J-PAS fluxes from DESI (white dots) and similar real J-PAS objects (colored dots) used in the error-mapping process.

The datasets Sample distributions in magnitude and color space confirm adequate representation and coverage, with the caveat that DESI’s target selection induces biases at parameter space boundaries.

Figure 2: Distributions of $r$ -band magnitude and $g-r$ color for training, validation, and full J-PAS datasets.

Model Architecture and Training Protocol

OJALÁ departs from traditional fixed-input ML models, employing an encoder–decoder transformer with self-attention. All observed fluxes, magnitudes, and known physical parameters are cast as input tokens in a flexible sequence; the model autoregressively predicts any subset of hidden or missing tokens conditioned on the others, mimicking a masked language modeling regime.

This design enables:

Robustness to missing data and heterogeneity (partial photometry, incomplete parameter coverage),
Unified training on non-strict cross-matched catalogs,
Direct applicability to inference settings typical in next-generation surveys.
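The token-sequence idea can be sketched as follows: missing measurements are simply left out of the sequence, and a random subset of the available tokens is hidden to serve as prediction targets. The record keys, the hide fraction, and the function name are illustrative assumptions; the actual tokenization and masking schedule are not specified here.

```python
import random


def make_training_example(record, hide_fraction=0.3, seed=0):
    """Split one object's available tokens into context and hidden targets.

    record maps token names to values; None marks a missing measurement.
    """
    rng = random.Random(seed)
    observed = [(name, value) for name, value in record.items() if value is not None]
    n_hide = max(1, round(hide_fraction * len(observed)))
    hidden = set(rng.sample([name for name, _ in observed], n_hide))
    context = [(n, v) for n, v in observed if n not in hidden]
    targets = [(n, v) for n, v in observed if n in hidden]
    return context, targets


# Hypothetical object with one missing narrow-band flux.
record = {"J0378": 1.2, "J0390": None, "r": 3.4, "W1": 5.0,
          "z_spec": 0.31, "class": "galaxy"}
context, targets = make_training_example(record)
```

Because absent tokens never enter the sequence, the same routine handles partial photometry and catalogs that lack some labels, which is what makes training on loosely cross-matched data possible.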

For the training objective, a multi-task loss is minimized, combining heteroscedastic negative log-likelihood for regression tokens (with explicit modeling of both predictive and observational uncertainties), categorical cross-entropy for class and morphological labels, and a maximum mean discrepancy (MMD) penalty for unsupervised domain adaptation (UDA). The UDA term enforces distributional similarity between the latent spaces of synthetic and real J-PAS data, favoring generalization.
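The three loss ingredients can be written down compactly. This is a minimal sketch under stated assumptions: predictive and observational variances are added in quadrature, the MMD uses a single RBF kernel with the biased estimator, and the weight `lambda_mmd` is a hypothetical hyperparameter. None of these choices is confirmed by the source.

```python
import math


def gaussian_nll(y, mu, sigma_pred, sigma_obs=0.0):
    """Heteroscedastic Gaussian negative log-likelihood for one regression token."""
    var = sigma_pred ** 2 + sigma_obs ** 2      # assumed: variances add in quadrature
    return 0.5 * (math.log(2.0 * math.pi * var) + (y - mu) ** 2 / var)


def cross_entropy(logits, true_index):
    """Categorical cross-entropy from raw logits (numerically stable log-sum-exp)."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_norm - logits[true_index]


def rbf(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two latent vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sets of latent vectors."""
    kxx = sum(rbf(a, b, bandwidth) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, bandwidth) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


def total_loss(nll_terms, ce_terms, mmd_value, lambda_mmd=0.1):
    """Sum of regression, classification, and weighted domain-alignment terms."""
    return sum(nll_terms) + sum(ce_terms) + lambda_mmd * mmd_value


# Toy latents for synthetic and real objects (illustrative values only).
synthetic_latents = [[0.0, 0.0], [1.0, 0.0]]
real_latents = [[0.5, 0.5], [1.5, 0.5]]
loss = total_loss(
    [gaussian_nll(0.31, 0.30, 0.02, 0.01)],
    [cross_entropy([2.0, 0.1, -1.0], 0)],
    mmd2(synthetic_latents, real_latents),
)
```

The MMD term vanishes when the two latent sets coincide and grows as they separate, so minimizing it pulls the synthetic and real latent distributions together.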

Classification and Redshift Regression Performance

OJALÁ yields high classification accuracy across stars, galaxies, and QSOs up to $i \sim 21$, achieving a global weighted F1-score of ∼0.9 on the real test set (Test-Real). Misclassification predominantly occurs at bright magnitudes due to intrinsic ambiguity at the galaxy–low-$z$ QSO boundary (see detailed confusion analysis). The F1-score remains stable under various restricted input subsets; performance is robust even when only the blue trays and broad-band photometry are provided, retaining F1 ∼ 0.8.
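For reference, a support-weighted F1-score averages the per-class F1 values using each class's share of the true labels as its weight. The sketch below uses the standard definition; the function name and toy labels are illustrative, and purity and completeness correspond to precision and recall.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1-scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0     # purity
        recall = tp / n_true                           # completeness
        f1 = 2.0 * precision * recall / (precision + recall) if tp else 0.0
        score += (n_true / total) * f1
    return score


# Toy example with one galaxy misclassified as a star.
y_true = ["galaxy", "galaxy", "star", "qso"]
y_pred = ["galaxy", "star", "star", "qso"]
f1 = weighted_f1(y_true, y_pred)
```

Weighting by support means the dominant class (galaxies, in the validation sample described earlier) drives the global score, which is why per-class metrics are reported alongside it.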

Figure 3: Balanced metrics (Purity, Completeness, F1-score) and global weighted F1-score versus $i$-band magnitude for galaxies (red), stars (blue), QSOs (green) in real and synthetic test sets.

In photometric redshift estimation, OJALÁ reaches $\sigma_{\mathrm{NMAD}} < 0.01$ for galaxies and shows minimal bias up to $i \sim 22$, with an outlier fraction $\eta < 5\%$. For QSOs, performance improves significantly at $z > 1.5$ ($\sigma_{\mathrm{NMAD}} \approx 0.006$ at $z \approx 3.5$), where prominent emission lines enter the optical bands. Redshift regression is essentially immune to the exclusion of broad-band or morphology tokens as input.
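The three photo-z statistics quoted above are usually computed from the normalized residual $\delta = (z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$. The sketch below follows the common convention $\sigma_{\rm NMAD} = 1.4826 \times \mathrm{median}(|\delta - \mathrm{median}(\delta)|)$; the outlier threshold of 0.05 is an assumed placeholder, not the cut used in the paper.

```python
from statistics import median


def photoz_metrics(z_phot, z_spec, outlier_threshold=0.05):
    """Return (bias, sigma_nmad, outlier_fraction) for paired redshifts.

    outlier_threshold is an assumed cut on |delta|, not a value from the paper.
    """
    delta = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    bias = median(delta)
    sigma_nmad = 1.4826 * median([abs(d - bias) for d in delta])
    outlier_fraction = sum(1 for d in delta if abs(d) > outlier_threshold) / len(delta)
    return bias, sigma_nmad, outlier_fraction


# Toy sample with one catastrophic failure (illustrative values only).
z_spec = [0.10, 0.25, 0.40, 0.55, 0.70]
z_phot = [0.102, 0.247, 0.405, 0.552, 1.300]
bias, sigma_nmad, eta = photoz_metrics(z_phot, z_spec)
```

Because both the bias and the scatter are median-based, a single catastrophic outlier barely moves them; it shows up in the outlier fraction instead.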

Figure 4: Photometric redshift accuracy (bias, $\sigma_{\mathrm{NMAD}}$, outlier fraction) for galaxies and QSOs as a function of magnitude and spectroscopic redshift.

Physical Parameter Estimation

Stellar Parameters

OJALÁ’s inference of $T_{\rm eff}$ for stars is precise (negligible bias, low scatter), reflecting the direct link between photospheric temperature and SED shape, even at the moderate resolution of J-PAS. Surface gravity $\log g$ is biased for giants due to the lack of direct information on pressure-broadened, gravity-sensitive features in NB photometry. Metallicity [Fe/H] and [$\alpha$/Fe] are recovered robustly within $-1$ < [Fe/H] < 0, but the model saturates and overestimates metallicities for the most metal-poor stars, with increased scatter for [$\alpha$/Fe] in the high-enhancement regime.

Figure 5: OJALÁ predictions versus DESI DR1 values for $T_{\rm eff}$, $\log g$, [Fe/H], and [$\alpha$/Fe] for stars.

Galaxy Emission Lines and Nebular Diagnostics

OJALÁ predicts emission line EWs (Hα, Hβ, [NII], [OIII]) with low bias and scatter for the bulk of the sample, enabling robust galaxy emission line classification. For instance, the Hα EW is recovered with a bias of 0.035 dex. The performance degrades for rare, extreme-EW objects due to long-tailed EW distributions and class imbalance. Line ratio diagnostics [NII]/Hα and [OIII]/Hβ can be reconstructed photometrically, enabling the accurate reproduction of BPT and WHAN diagrams.

Figure 6: Density plots comparing OJALÁ-predicted versus DESI-measured EWs for key lines and diagnostic ratios in galaxies.

Figure 7: BPT and WHAN diagrams from DESI, OJALÁ on Test-Synth, and OJALÁ on real J-PAS.

OJALÁ achieves F1-scores up to 80% for pure star-forming galaxies, ~68% for AGN, and ~91% for passive galaxies when classes are evaluated in the appropriate diagnostic regime.
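To make the diagnostic regimes concrete, the sketch below classifies a galaxy from its predicted line ratios and EWs using demarcations that are standard in the literature: the Kauffmann et al. (2003) and Kewley et al. (2001) curves for the [NII] BPT diagram, and the Cid Fernandes et al. (2011) cuts for WHAN. Whether the paper adopts exactly these boundaries and class names is an assumption.

```python
def bpt_class(log_nii_ha, log_oiii_hb):
    """Classify on the [NII] BPT diagram from log10 line ratios."""
    # Below the Kauffmann et al. (2003) empirical curve: pure star formation.
    if log_nii_ha < 0.05 and log_oiii_hb < 0.61 / (log_nii_ha - 0.05) + 1.30:
        return "star-forming"
    # Between Kauffmann and the Kewley et al. (2001) maximum-starburst curve.
    if log_nii_ha < 0.47 and log_oiii_hb < 0.61 / (log_nii_ha - 0.47) + 1.19:
        return "composite"
    return "AGN"


def whan_class(log_nii_ha, ew_ha, ew_nii):
    """Classify on the WHAN diagram; EWs in Angstrom, positive in emission."""
    if ew_ha < 0.5 and ew_nii < 0.5:
        return "passive"
    if ew_ha < 3.0:
        return "retired"
    if log_nii_ha < -0.4:
        return "star-forming"
    if ew_ha > 6.0:
        return "strong AGN"
    return "weak AGN"


# Illustrative inputs, not measurements from the paper.
labels = [bpt_class(-0.5, -0.5), bpt_class(-0.2, 0.0), bpt_class(0.2, 0.8)]
whan_labels = [whan_class(-0.6, 20.0, 5.0), whan_class(0.0, 0.3, 0.2)]
```

WHAN needs only Hα and [NII], so it can label weak-line and passive galaxies that fall out of the BPT diagram for lack of a reliable [OIII]/Hβ ratio, which is why the per-class scores depend on the diagnostic used.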

Derivation of dust-corrected Hα luminosity, integrating predicted EWs, SED continuum, and photometric redshift, yields a scatter of 0.29 dex relative to spectroscopic estimates, verifying model coherence across multiple output modalities.
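One plausible chain from the predicted quantities to a dust-corrected luminosity is written out below. It assumes rest-frame EWs, a continuum flux density $f_\lambda^{\rm cont}$ measured in the observed frame at the line position, a Balmer-decrement correction with an intrinsic Case B ratio of 2.86, and an attenuation curve $k(\lambda)$ of the user's choice. These are conventional assumptions, not the paper's documented procedure.

```latex
% Line flux from the rest-frame equivalent width and the observed continuum
F_{\mathrm{H}\alpha} = \mathrm{EW}^{\rm rest}_{\mathrm{H}\alpha}\,(1+z)\,f_{\lambda}^{\rm cont}

% Attenuation at H-alpha from the Balmer decrement
% (k_{H\alpha}, k_{H\beta}: assumed attenuation-curve values)
A_{\mathrm{H}\alpha} = \frac{2.5\,k_{\mathrm{H}\alpha}}{k_{\mathrm{H}\beta}-k_{\mathrm{H}\alpha}}
    \,\log_{10}\!\left(\frac{F_{\mathrm{H}\alpha}/F_{\mathrm{H}\beta}}{2.86}\right)

% Extinction-corrected luminosity at luminosity distance d_L(z)
L_{\mathrm{H}\alpha} = 4\pi\,d_L^{2}(z)\,F_{\mathrm{H}\alpha}\,10^{\,0.4\,A_{\mathrm{H}\alpha}}
```

Every factor here comes from a different model output (EWs, continuum SED, photometric redshift), so errors in each propagate into $L_{\mathrm{H}\alpha}$; agreement with spectroscopy therefore tests the outputs jointly.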

Figure 8: Comparison of extinction-corrected Hα luminosity from DESI spectroscopy and OJALÁ predictions.

Stellar Mass and Star Formation Rate

OJALÁ retrieves stellar masses for galaxies with a scatter of ≈0.11 dex and minimal bias. SFR is inferred with a scatter of ≈0.22 dex for star-forming systems, limited by intrinsic degeneracies in optical SED SFR estimation and uncertainties in the reference CIGALE values. For QSOs, both mass and SFR inferences are unreliable due to AGN contamination of the host SED.

Figure 9: OJALÁ predictions vs. CIGALE-derived stellar mass and SFR for galaxies (top) and QSOs (bottom).

Black Hole Mass Regression

The autoregressive design enables seamless extension to new output variables: fine-tuning OJALÁ’s pre-trained embeddings to recover QSO black hole masses ($M_{\rm BH}$) achieves a precision of ≈0.5 dex against single-epoch MgII-based virial masses [2025ApJ...987...48P], which is competitive with direct single-epoch virial mass estimation from spectra.
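The simplest version of this kind of extension keeps the pre-trained embeddings frozen and fits a small regression head on top of them. The sketch below does that with a linear head trained by gradient descent on toy data; it is a simplified stand-in, since the paper may update the embeddings themselves or use a different head, and all names and values are hypothetical.

```python
def fit_linear_head(embeddings, targets, lr=0.1, steps=2000):
    """Fit y ~ w . e + b on frozen embeddings by minimizing mean squared error."""
    dim = len(embeddings[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(targets)
    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for e, y in zip(embeddings, targets):
            residual = sum(w * x for w, x in zip(weights, e)) + bias - y
            for j in range(dim):
                grad_w[j] += 2.0 * residual * e[j] / n
            grad_b += 2.0 * residual / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(embedding, weights, bias):
    """Apply the fitted head to one embedding."""
    return sum(w * x for w, x in zip(weights, embedding)) + bias


# Toy frozen embeddings and log10(M_BH) labels (illustrative numbers only).
frozen_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
log_mbh = [8.2, 8.7, 7.9, 8.4]
weights, bias = fit_linear_head(frozen_embeddings, log_mbh)
```

Only the head's few parameters are trained here, which is why a new property can be added with a small labeled sample and little compute.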

Figure 10: Comparison of spectroscopic and OJALÁ-predicted BH masses for DESI QSOs via fine-tuned embeddings.

Model Limitations and Theoretical Implications

OJALÁ is built as a compact (∼4.6M-parameter) foundation model, contrasting with the trend toward astronomical models with hundreds of millions of parameters. By applying masked-token modeling with flexible attention to sequences of SED and physical-property tokens, OJALÁ provides unified, efficient analysis at scale, processing 1M objects in under an hour on consumer hardware.

Model limitations are primarily:

Unimodal Gaussian regression per continuous parameter, precluding full representation of non-Gaussian or multimodal posteriors,
Performance drop-off for objects at domain boundaries or with rare/highly non-Gaussian properties,
Intrinsic dependency on the physical reliability and selection biases of labeling catalogs (e.g., CIGALE SFRs, DESI spectroscopic types),
Remaining sim-to-real gap for the rarest object types despite robust UDA.

Theoretical implications include the demonstration that a masked-token-based transformer foundation model can naturally absorb the heterogeneity of survey data, perform multi-task learning efficiently, and offer a foundation for higher-level downstream tasks, such as spatially resolved galaxy mapping, physics-aware segmentation, and multi-modal survey fusion. The modular architecture is inherently suited for expansion to richer data types (e.g., images, IFU cubes) and deeper physical parameter spaces.

Future Prospects

OJALÁ’s framework anticipates the needs of next-generation astronomical surveys and paves the way for:

Scaling to IFU-enabled, spatially resolved SED analysis using embedding-based segmentation,
Unified, multi-modal analysis incorporating photometry, low-res spectra, imaging, morphology, and time domain,
Generative and probabilistic decoders for better calibration and uncertainty quantification,
Deployment as a scientific foundation model that integrates incremental knowledge from new data releases and more sophisticated physical parameterizations.

Conclusion

OJALÁ establishes a new technical paradigm for large-scale survey SED analysis, delivering robust, multi-property inference from J-PAS data with demonstrated generalization to real observations and flexibility to extend to new tasks with minimal retraining. The model sets performance benchmarks for classification (weighted F1 ∼ 0.9), redshift estimation ($\sigma_{\mathrm{NMAD}} < 0.01$ for galaxies), physical regression (e.g., stellar mass ∼0.11 dex, SFR ∼0.22 dex), and emission line analyses, and provides an effective foundation for scalable, physically informative survey science in the big-data regime (2604.00661).