Quantitative spectroscopy of single and multiple OB-type stars. Non-LTE spectrum analysis with machine learning

Published 3 Apr 2026 in astro-ph.SR | (2604.03082v1)

Abstract: The plethora of spectra of OB-type stars in observatory archives and the much larger numbers to come from the WEAVE and 4MOST spectroscopic facilities require efficient, but also accurate and precise methods for (semi)automatic quantitative analyses. Neural networks were used to emulate the spectra of single- and multi-star systems, trained on hybrid non-local thermodynamic equilibrium (non-LTE) models that cover a wide range of atmospheric parameters and chemical compositions. To derive the full set of stellar atmospheric parameters and uncertainties, a Markov chain Monte Carlo algorithm was implemented to fit high-resolution spectra within 3000A-10500A. The neural networks and fitting algorithm were bundled into a programme called Spectral Analysis Tool Using Restricted Neural networks (SATURN). In its current implementation, SATURN facilitates the emulation of synthetic spectra for spectral types O7 to B9, which differ only negligibly from computed models. SATURN was tested on a number of benchmark stars that have been studied before, including single OB stars and a detached eclipsing binary (DEB) system. Excellent agreement of atmospheric parameters and elemental abundances for up to ten metal species is found with respect to the data in the literature, often with reduced uncertainties. For DEB components, the uncertainties are larger, in particular for the fainter secondaries when only a single-epoch spectrum is considered. Uncertainties of elemental abundances are typically <0.10dex. Some first applications of SATURN for analyses of new targets are shown to demonstrate its capabilities, such as fast rotators, including HD149757 (Zeta Ophiuchi). Consistent results are also found at reduced spectral resolutions relevant for observations with WEAVE and 4MOST.


Authors (2)

Summary

The paper introduces SATURN, a machine learning toolkit that automates non-LTE spectral fitting for both single and composite OB-type stars.
It leverages extensive hybrid NLTE model grids and region-specific neural networks to achieve precise parameter estimates with uncertainties of ~1% in Tₑff and 0.05 dex in log(g).
The study demonstrates significant computational efficiency and scalability for large spectroscopic surveys, reducing analysis time while maintaining high fidelity.

Machine Learning-Accelerated Non-LTE Quantitative Spectroscopy of OB-type Stars

Introduction and Motivation

Quantitative spectroscopy is essential for constraining the fundamental parameters of OB stars (luminosity, $T_\mathrm{eff}$, $\log(g)$, and chemical composition) through high-resolution, multiwavelength spectroscopic analysis. The complexity of OB-star atmospheres, above all the need for a non-LTE (NLTE) treatment, rules out LTE-based approaches, particularly for hot, low-density photospheres that deviate significantly from local thermodynamic equilibrium. The expansion of large-scale spectroscopic surveys (e.g., WEAVE, 4MOST) increases the need for scalable, precise methods that can extract multi-element abundances and atmospheric parameters in both single and multiple (binary, DEB) stellar systems.

This paper (2604.03082) introduces and tests SATURN, a neural-network-based spectral emulation and analysis toolkit trained on extensive hybrid NLTE model grids. SATURN delivers fully automated fits of high-resolution optical and near-infrared spectra in the range $3000$–$10500$ Å, handles both single-star and composite (multi-star) spectra, and substantially reduces computational cost relative to traditional direct NLTE modeling.

Synthetic Spectrum Emulation with Neural Networks

Model Grid Construction and Parameter Space

The NLTE models are constructed using the hybrid ATLAS + Detail/Surface (ADS) approach, subdividing the ($T_\mathrm{eff}$, $\log(g)$) parameter space into five regions corresponding to evolutionary and atmospheric regimes: O, B$_1$, B$_2$, BSG$_1$, and BSG$_2$. Each region covers the relevant ionization stages and follows evolutionary tracks at various masses and rotational histories, as shown in Figure 1.

Figure 1: Kiel diagram showing the coverage of NLTE model regions and the distribution of benchmark sample stars overlaid on evolutionary tracks.

For each region, approximately 5000 synthetic models are computed with randomly sampled $T_\mathrm{eff}$ (grid step 50 K), $\log(g)$ (0.01 dex), microturbulence $\xi$ (1 km/s), He abundance, and abundances for ten metals, spanning physically and observationally motivated intervals. These models adopt turbulence, updated line-broadening, and state-of-the-art atomic data.
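The sampling scheme described above can be illustrated with a minimal sketch. The parameter limits in `REGION_BOX`, the ±0.5 dex abundance range, and the function name `sample_labels` are assumptions made for illustration; only the grid steps (50 K, 0.01 dex, 1 km/s), the model count, and the number of metals come from the text.

```python
import numpy as np

# Hypothetical limits for one region: (low, high, grid step).
# Only the grid steps are taken from the text; the limits are placeholders.
REGION_BOX = {
    "teff": (15000.0, 20000.0, 50.0),  # K
    "logg": (3.5, 4.5, 0.01),          # dex
    "xi": (0.0, 10.0, 1.0),            # km/s, microturbulence
}

def sample_labels(n_models, box, n_metals=10, seed=0):
    """Draw random labels; gridded quantities are snapped to their step."""
    rng = np.random.default_rng(seed)
    columns = []
    for low, high, step in box.values():
        n_steps = int(round((high - low) / step))
        k = rng.integers(0, n_steps + 1, size=n_models)  # random grid index
        columns.append(low + k * step)
    # Helium plus metals: continuous abundance offsets in an assumed +-0.5 dex range.
    abundances = rng.uniform(-0.5, 0.5, size=(n_models, 1 + n_metals))
    return np.column_stack(columns + [abundances])

labels = sample_labels(5000, REGION_BOX)  # one row of 14 labels per model
```

Random sampling, rather than a regular Cartesian grid, keeps the number of models manageable when 14 labels vary at once.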

Neural Network Emulator Architecture

Instead of classical grid interpolation or PCA-based dimensionality reduction, SATURN uses region-specific multilayer perceptrons (MLPs) for direct regression of the normalized flux as a function of atmospheric labels. Each MLP has 14 inputs (one per parameter), 3 hidden layers, and an output covering a chunk of $\sim$7500 wavelength points. The grid is split into 90% training and 10% validation data, and the mean-squared error is minimized using Adam with a tuned learning-rate schedule; the loss curves show no evidence of overfitting.
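As a rough illustration of such an emulator, the sketch below builds the forward pass of an MLP with 14 inputs and 3 hidden layers in plain NumPy, together with the mean-squared-error objective and a 90/10 split. The hidden width (256), the ReLU activation, the He-style initialization, and the output size of 7500 are assumptions, and the Adam training loop is omitted; this is not SATURN's actual network.

```python
import numpy as np

def init_mlp(sizes, seed=0):
    """Random weights and zero biases for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]

def emulate(params, labels):
    """Map label vectors (n, 14) to normalized flux chunks (n, n_wave)."""
    h = labels
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU activation (assumed)
    w, b = params[-1]
    return h @ w + b                    # linear output layer

def mse(params, labels, flux):
    """Mean-squared error between emulated and reference flux."""
    return float(np.mean((emulate(params, labels) - flux) ** 2))

def split_90_10(n_models, seed=0):
    """Shuffle model indices into 90% training and 10% validation."""
    idx = np.random.default_rng(seed).permutation(n_models)
    cut = (9 * n_models) // 10
    return idx[:cut], idx[cut:]

params = init_mlp([14, 256, 256, 256, 7500])  # 3 hidden layers
train_idx, valid_idx = split_90_10(5000)
```

Comparing `mse` on the training and validation indices after each epoch is the usual way to check for overfitting.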

Empirical assessments show that emulation errors in normalized flux are small at 99% of wavelength points and remain small even in spectral line cores (Figure 2). This accuracy supports robust spectrum synthesis for downstream parameter estimation.

Figure 2: Direct comparison of ADS-model spectra and corresponding ML predictions, reinforcing sub-millipercent accuracy.

Composite models for binaries employ neural networks for continuum flux emulation and cubic splines across sparsely sampled continuum points.

Automated Spectral Fitting and Parameter Inference

SATURN combines emulated spectra with detailed broadening (rotational, macroturbulent, and instrumental), enabling robust, full-spectrum fitting via a Metropolis-Hastings MCMC engine with a physically motivated likelihood function that weights broad hydrogen/helium lines and weak metal lines uniformly. For multi-star systems, composite spectra are constructed from appropriately weighted, Doppler-shifted components.
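A composite spectrum of two components can be sketched as a flux-weighted sum of Doppler-shifted normalized spectra. The sketch assumes a single, wavelength-independent continuum flux ratio and non-relativistic shifts, which is a simplification; the function names are illustrative.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def doppler_shift(wave, flux, rv_kms):
    """Shift a normalized spectrum by a radial velocity and resample it."""
    shifted = wave * (1.0 + rv_kms / C_KMS)  # non-relativistic shift
    # Outside the covered range, fall back to the continuum level of 1.
    return np.interp(wave, shifted, flux, left=1.0, right=1.0)

def composite(wave, flux_1, flux_2, rv_1, rv_2, flux_ratio):
    """Combine two normalized spectra into one normalized composite.

    flux_ratio is the continuum flux of the secondary relative to the
    primary, assumed here to be constant over the wavelength range.
    """
    f1 = doppler_shift(wave, flux_1, rv_1)
    f2 = doppler_shift(wave, flux_2, rv_2)
    return (f1 + flux_ratio * f2) / (1.0 + flux_ratio)

# Toy example: a single line in the primary, a featureless secondary.
wave = np.linspace(4000.0, 4010.0, 1001)
line = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 4005.0) / 0.2) ** 2)
flat = np.ones_like(wave)
blend = composite(wave, line, flat, 100.0, -100.0, 1.0)
```

With equal continuum fluxes, the line of the primary appears at half its intrinsic depth in the composite, which is why the fainter component of a binary is harder to constrain.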

Parameter estimation proceeds in stages:

$T_\mathrm{eff}$ and $\log(g)$: simultaneous fitting using Balmer line wings and multiple ionization equilibria (e.g., He I/II, O II/III).
Projected rotational velocity $v\sin i$ and macroturbulent velocity $\zeta$: inferred from metal line shapes through joint MCMC.
Microturbulence $\xi$: derived by minimizing line-to-line abundance scatter (equivalent to requiring no correlation between abundance and line strength).
Elemental abundances: obtained as means over fits to individual spectral lines (for up to 10 metals).
For binaries: individual radial velocities and flux ratios are also fitted.
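The microturbulence criterion in the list above (no correlation between abundance and line strength) can be sketched as a search for the zero crossing of a fitted slope. The abundance table here is synthetic toy data, and the helper names are hypothetical; in a real analysis each row would come from refitting the lines at a trial microturbulence.

```python
import numpy as np

def slope_vs_strength(strength, abundance):
    """Slope of a straight-line fit of abundance against line strength."""
    return np.polyfit(strength, abundance, 1)[0]

def best_microturbulence(xi_grid, strength, abundance_table):
    """Return the trial value at which the abundance slope vanishes.

    abundance_table[j] holds the line-by-line abundances obtained at
    xi_grid[j]. The zero crossing is linearly interpolated between trials.
    """
    slopes = np.array([slope_vs_strength(strength, row) for row in abundance_table])
    for i in range(len(xi_grid) - 1):
        if slopes[i] == 0.0:
            return float(xi_grid[i])
        if slopes[i] * slopes[i + 1] < 0.0:
            t = slopes[i] / (slopes[i] - slopes[i + 1])
            return float(xi_grid[i] + t * (xi_grid[i + 1] - xi_grid[i]))
    # No sign change found: fall back to the flattest trial value.
    return float(xi_grid[np.argmin(np.abs(slopes))])

# Toy data: strong lines give too high an abundance when xi is too small.
xi_grid = np.arange(0.0, 9.0, 1.0)                 # km/s
strength = np.array([20.0, 45.0, 80.0, 120.0])     # equivalent widths, mA
xi_true = 3.4
abundance_table = np.array(
    [8.70 + 0.001 * (xi_true - xi) * strength for xi in xi_grid]
)
xi_best = best_microturbulence(xi_grid, strength, abundance_table)
```

In this toy setup the slope changes sign between the trial values of 3 and 4 km/s, and the interpolation recovers the input value.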

Uncertainties are taken from the MCMC posteriors and are typically about 1% in $T_\mathrm{eff}$, about 0.05 dex in $\log(g)$, and below 0.1 dex in abundances for high-S/N data. Representative posterior distributions confirm the parameter constraints (Figure 3).
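How posterior uncertainties emerge from a Metropolis-Hastings chain can be shown with a toy problem. The sketch below fits the depth and center of a single Gaussian line, which stands in for an emulated spectrum; the Gaussian likelihood, flat prior bounds, step sizes, and burn-in length are all assumptions, not SATURN's settings.

```python
import numpy as np

def line_model(x, depth, center, width=0.5):
    """Toy normalized line profile standing in for an emulated spectrum."""
    return 1.0 - depth * np.exp(-0.5 * ((x - center) / width) ** 2)

def make_log_posterior(x, y, sigma):
    """Gaussian likelihood with a flat prior on the line depth (assumed)."""
    def log_posterior(theta):
        depth, center = theta
        if not 0.0 < depth < 1.0:
            return -np.inf
        resid = (y - line_model(x, depth, center)) / sigma
        return -0.5 * np.sum(resid ** 2)
    return log_posterior

def metropolis_hastings(log_post, theta0, step, n_steps, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step * rng.normal(size=theta.size)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
        chain[i] = theta
    return chain

rng = np.random.default_rng(1)
x = np.linspace(-3.0, 3.0, 121)
sigma = 0.01  # noise level, i.e. S/N = 100
y = line_model(x, 0.4, 0.2) + sigma * rng.normal(size=x.size)

chain = metropolis_hastings(
    make_log_posterior(x, y, sigma), [0.3, 0.0], np.array([0.003, 0.005]), 6000
)
posterior = chain[2000:]  # discard burn-in
mean, std = posterior.mean(axis=0), posterior.std(axis=0)
```

The standard deviation of the retained samples serves as the quoted uncertainty for each parameter, and the joint samples are what a corner plot displays.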

Figure 3: MCMC-derived corner plots for $T_\mathrm{eff}$–$\log(g)$ and $v\sin i$–$\zeta$ for selected benchmarks, highlighting tight posteriors.

Benchmark Validation and Application Spectrum

Single OB-Star and Binary Benchmarks

SATURN is validated against established NLTE studies of five single OB stars and one DEB, reproducing literature parameters and abundances within the reported uncertainties. Uncertainties are frequently smaller than in prior work. For DEB components, parameter errors for the fainter secondaries are naturally larger unless external constraints (from photometric or orbital analysis) are imposed.

Model fits to benchmark DEB spectra (e.g., HD 259135/V578 Mon) demonstrate precise recovery of both blended and separated features, properly attributing features to primary/secondary and handling line strengths and profiles across a dynamic range (Figure 4).

Figure 4: Observed versus global best-fitting model for the DEB HD 259135, decomposing primary and secondary star contributions.

Analysis of Rapid Rotators and Additional Systems

SATURN is demonstrated on fast rotators with high $v\sin i$, such as $\zeta$ Oph (HD 149757), successfully fitting heavily blended profiles (Figure 5) and recovering atmospheric parameters, helium, and CNO abundances with physically consistent results.

Figure 5: Spectrum/model comparison for $\zeta$ Oph confirming robust performance in the extreme blending regime of very rapid rotators.

Additional tests include blue supergiants, subgiants, and chemically normal late B-types, with consistently excellent agreement between SATURN-derived parameters and expectations from stellar evolution and independent Gaia distances.

Fast-rotating bright giant/supergiant HD 93827 (Figure 6) and the DEB HD 77464 (CV Vel; Figure 7) are also robustly recovered.

Figure 6: Model fit quality for fast-rotating bright giant/supergiant HD 93827, highlighting capability in severe blending and low S/N regimes.

Figure 7: Composite spectrum fit for the DEB HD 77464 (CV Vel), clearly decomposing both stellar components and their relative contributions.

Performance at Lower Spectral Resolution

By rerunning the analyses at reduced spectral resolutions (simulating WEAVE/4MOST observations), the authors demonstrate that SATURN maintains parameter accuracy for good-quality spectra, though the number of measurable elements decreases at high rotational broadening and with declining S/N. Abundance uncertainties increase due to greater line blending but remain competitive with prior work.
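A common way to simulate lower resolving power is to convolve a high-resolution spectrum with a Gaussian kernel of constant width in velocity. The sketch below does this on a uniform grid in $\ln\lambda$; the Gaussian instrumental profile, the example resolving powers, and the function name are assumptions, not the procedure used in the paper.

```python
import numpy as np

def degrade_resolution(wave, flux, r_in, r_out):
    """Convolve a spectrum from resolving power r_in down to r_out."""
    if r_out >= r_in:
        raise ValueError("r_out must be lower than r_in")
    # Resample onto a uniform ln(lambda) grid so the kernel width is constant.
    ln_wave = np.linspace(np.log(wave[0]), np.log(wave[-1]), wave.size)
    flux_ln = np.interp(ln_wave, np.log(wave), flux)
    step = ln_wave[1] - ln_wave[0]
    # Kernel FWHM in ln(lambda): quadrature difference of the two profiles.
    fwhm = np.sqrt(1.0 / r_out**2 - 1.0 / r_in**2)
    sigma_pix = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / step
    half = int(np.ceil(5.0 * sigma_pix))
    k = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (k / sigma_pix) ** 2)
    kernel /= kernel.sum()
    # Edge padding avoids artificial dips at the ends of the spectrum.
    padded = np.pad(flux_ln, half, mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    return np.interp(np.log(wave), ln_wave, smooth)

# Toy example with illustrative resolving powers.
wave = np.linspace(4400.0, 4500.0, 5001)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - 4450.0) / 0.1) ** 2)
low_res = degrade_resolution(wave, flux, r_in=50000.0, r_out=5000.0)
```

The convolution makes lines shallower and broader while approximately conserving their equivalent width, which is why weak and blended metal lines are the first to become unmeasurable.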

Numerical Results and Contradictory Claims

Key numerical findings: SATURN achieves atmospheric parameter uncertainties of about 1% ($T_\mathrm{eff}$) and about 0.05 dex ($\log(g)$) for high-S/N spectra, and abundance errors below 0.1 dex for nearly all elements analyzed. For DEBs, the use of external constraints (e.g., from photometric analysis) substantially tightens the errors for both components, and the fitted flux ratios are consistent with geometric and physical expectations. For rapid rotators with substantial gravity darkening and deformation, the parameters reflect globally "averaged" values, with explicit caveats regarding systematic uncertainties.

Limitations and Theoretical Implications

Limitations arise for chemically peculiar stars (He-strong, HgMn, Bp, Am, Ap), whose atmospheric structures or surface compositions are not covered by the training grids, and in cases with severe atomic diffusion. Contact and interacting binaries, whose spectra cannot be decomposed as sums of single-star models, are excluded. For classical Be stars, circumstellar emission and non-spherical geometries are not considered, and for the fastest rotators, gravity darkening effects are not modeled.

Theoretically, SATURN's integration of high-fidelity NLTE spectral emulation with MCMC-driven full-spectrum fitting represents a significant step towards scalable, physically interpretable model analysis for spectroscopic survey pipelines. The method is substantially more flexible and comprehensive than label-driven, data-driven regression or shallow grid-interpolation techniques, and it supports rapid retraining for future expansions in chemical parameter space, in physics (e.g., expanded NLTE model atoms), or for adaptation to other spectral types and to hydrogen- or helium-rich stars.

Broader Implications and Future Prospects

Practically, with rapid spectrum emulation (10 ms per model), SATURN provides a viable route for automated, precision-driven OB-star analysis at survey scale (tens of thousands of spectra), directly supporting the science objectives of current and upcoming facilities. The emulation approach is extensible to broader wavelength regimes, additional spectral types (e.g., A stars, hot subdwarfs), and supports prompt public release once model grid development stabilizes.

The approach also sets the stage for the integration of more sophisticated neural architectures (e.g., transformer-based emulators), efficient uncertainty quantification, and systematic propagation of atomic data uncertainties into stellar parameter and abundance inferences. As future surveys deliver millions of spectra, physically consistent, high-dimensional emulators will be essential for extracting unbiased constraints on stellar populations, galactic evolution, and chemical enrichment.

Conclusion

This work demonstrates a robust methodology—SATURN—for quantitative NLTE spectroscopy of OB-type stars using machine learning emulators trained on large, physically motivated NLTE grids. Benchmarked against the literature and proven on diverse systems including rapid rotators, binaries, and lower-resolution data, the framework attains both accuracy and efficiency unattainable with traditional computational approaches. The integration of neural emulation and automated MCMC analysis positions SATURN as a strong candidate for deployment in large-scale spectroscopic survey pipelines and motivates further theoretical and computational innovations in the analysis of hot star spectra (2604.03082).
