SWEET-Cat & MAISTEP: Stellar Host Analysis
- SWEET-Cat and MAISTEP are complementary tools that deliver high-precision stellar parameters crucial for exoplanet demographic studies.
- SWEET-Cat provides uniform spectroscopic measurements while MAISTEP uses grid-based machine learning to infer stellar masses, radii, and ages with robust uncertainty estimates.
- Their combined application refines exoplanet population analysis, such as the radius valley, supporting insights into atmospheric evolution and host star demographics.
SWEET-Cat is a homogeneous, continually maintained spectroscopic catalog of stellar parameters for planet-hosting stars, while MAISTEP is a grid-based machine learning tool for inferring fundamental stellar properties (radius, mass, and age) from atmospheric observables. Used in combination, these resources enable precise and consistent host-star characterization, which is essential for exoplanet demographic studies such as measuring the occurrence and parameter dependence of the exoplanet radius valley. The sections below cover their construction, methodology, and scientific impact.
1. SWEET-Cat: Homogeneous Spectroscopic Stellar Parameters
SWEET-Cat (Stars With ExoplanETs Catalogue) provides a high-precision, uniform database of key spectroscopic parameters (effective temperature $T_{\rm eff}$, logarithmic surface gravity $\log g$, and metallicity [Fe/H]) and masses for stars known to host exoplanets (Santos et al., 2013, Kamulali et al., 18 Jan 2026). The catalog is designed to enable robust statistical studies of planet populations by minimizing heterogeneity in host-star properties, which can otherwise hinder cross-sample comparisons or demographic inferences.
Principles and Methodology
- Target Selection: Stars identified in exoplanet.eu as exoplanet hosts by radial velocity, transit, or astrometry methods; direct-imaging, microlensing, and degenerate-timing hosts are typically excluded.
- Spectral Types: Primary focus on FGK dwarfs and subgiants, but includes cool giants, evolved stars, and M dwarfs (the latter via empirical color or index calibrations).
- Spectroscopic Pipeline: Atmospheric parameters and masses are derived under LTE using:
- Automatic Equivalent Width (EW) measurement from high-resolution, high-S/N spectra (e.g., ARES code).
- Forced excitation/ionization balance via MOOG with ATLAS9 model atmospheres.
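- Key equilibrium conditions:
- Excitation equilibrium: $\partial A(\mathrm{Fe\,I}) / \partial \chi = 0$ (no trend of Fe I abundance with excitation potential $\chi$)
- Microturbulence equilibrium: $\partial A(\mathrm{Fe\,I}) / \partial \log(\mathrm{EW}/\lambda) = 0$ (no trend of Fe I abundance with reduced equivalent width)
- Ionization equilibrium: $\langle A(\mathrm{Fe\,I}) \rangle = \langle A(\mathrm{Fe\,II}) \rangle$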
- Mass Determination: Initial mass estimates come from the Torres calibration, an empirical relation in $T_{\rm eff}$, $\log g$, and [Fe/H], and are then corrected to the Padova isochrone scale. Uncertainties are propagated through Monte Carlo sampling.
- Validation: Cross-matching with fundamental scale benchmarks (asteroseismology, interferometry), and analysis of systematics across literature catalogs.
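The three equilibrium conditions in the pipeline above amount to two slope checks and one offset check on iron-line abundances. The following is a minimal sketch of those diagnostics, not the ARES or MOOG interface: the function names, the tuple layout, and the line data are all hypothetical, and the abundances are synthetic values chosen for illustration.

```python
# Sketch: evaluate the three Fe-line equilibrium diagnostics for one trial
# atmosphere. All names and numbers here are illustrative assumptions.

def ols_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def equilibrium_diagnostics(fe1_lines, fe2_abundances):
    """fe1_lines: list of (excitation_potential_eV, reduced_ew, abundance)."""
    chi = [line[0] for line in fe1_lines]
    reduced_ew = [line[1] for line in fe1_lines]
    abundance = [line[2] for line in fe1_lines]
    mean_fe1 = sum(abundance) / len(abundance)
    mean_fe2 = sum(fe2_abundances) / len(fe2_abundances)
    return {
        # A non-zero slope here signals that the trial Teff needs adjusting.
        "excitation_slope": ols_slope(chi, abundance),
        # A non-zero slope here signals that the microturbulence needs adjusting.
        "microturbulence_slope": ols_slope(reduced_ew, abundance),
        # A non-zero offset here signals that the trial log g needs adjusting.
        "ionization_offset": mean_fe1 - mean_fe2,
    }


# Synthetic Fe I lines with a deliberate 0.02 dex/eV abundance trend,
# mimicking a trial temperature that is slightly off.
fe1 = [
    (0.5, -5.05, 7.46),
    (1.5, -5.20, 7.48),
    (2.5, -4.95, 7.50),
    (3.5, -5.10, 7.52),
    (4.5, -5.00, 7.54),
]
fe2 = [7.50, 7.52, 7.48]
diag = equilibrium_diagnostics(fe1, fe2)
```

In an iterative analysis, the atmospheric parameters would be adjusted and the abundances recomputed until all three diagnostics are consistent with zero within their uncertainties.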
Catalog Content and Uncertainties
The database (currently covering 4200 planet hosts) provides the following fields per star: identifiers, coordinates, magnitude, parallax, $T_{\rm eff}$, $\log g$, microturbulence, [Fe/H], stellar mass, error estimates, and provenance. Typical errors:
- $T_{\rm eff}$: K
- $\log g$: dex
- [Fe/H]: dex
- $M_\star$: 5–10% (Santos et al., 2013, Kamulali et al., 18 Jan 2026)
2. MAISTEP: Grid-Based Machine Learning Stellar Parameter Inference
MAISTEP (Machine learning Algorithm for Inferring STEllar Parameters) is a tool that infers stellar radius, mass, and age using only atmospheric inputs ($T_{\rm eff}$, [Fe/H], and luminosity $L$), meeting the need for robust parameter estimation when asteroseismic constraints are unavailable (Kamulali et al., 4 Feb 2025, Kamulali et al., 18 Jan 2026).
Model Construction
- Training Data: Synthetic data from MESA evolutionary tracks spanning , , to $2.4$, from ZAMS to core-H exhaustion. Sampling along each track is optimized for uniform coverage.
- Features: Raw $T_{\rm eff}$, [Fe/H], and $L$ (derived via Gaia DR3 parallax, magnitude, bolometric corrections, and extinction), directly ingested by tree-based algorithms without additional scaling.
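The luminosity feature combines parallax, apparent magnitude, bolometric correction, and extinction through the standard distance-modulus relations. The sketch below shows that arithmetic under simplifying assumptions: the function name and argument names are hypothetical, distance is taken as the plain inverse of the parallax (which is biased for noisy parallaxes), and the bolometric correction and extinction are assumed to be supplied for the same photometric band as the magnitude.

```python
import math

M_BOL_SUN = 4.74  # IAU 2015 nominal solar absolute bolometric magnitude


def luminosity_lsun(app_mag, parallax_mas, bol_corr, extinction=0.0):
    """Luminosity in solar units from photometry and parallax.

    app_mag      : apparent magnitude in some band
    parallax_mas : parallax in milliarcseconds (must be positive)
    bol_corr     : bolometric correction for that band, in magnitudes
    extinction   : line-of-sight extinction in that band, in magnitudes
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for a simple inversion")
    distance_pc = 1000.0 / parallax_mas
    # Distance modulus: absolute magnitude at 10 pc, corrected for extinction.
    abs_mag = app_mag - 5.0 * math.log10(distance_pc / 10.0) - extinction
    m_bol = abs_mag + bol_corr
    return 10.0 ** (-0.4 * (m_bol - M_BOL_SUN))


# A star at 10 pc whose bolometric magnitude equals the solar value
# should come out at one solar luminosity.
lum = luminosity_lsun(app_mag=4.67, parallax_mas=100.0, bol_corr=0.07)
```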
Machine Learning Architecture
Four base regressors, namely Random Forest (RF), Extra Trees (XT), XGBoost, and CatBoost, are independently optimized (Optuna, 50 trials, 10-fold cross-validation) and stacked via a non-negative least-squares combination: $\hat{y}_i = \sum_{k=1}^{4} w_k\,\hat{y}_{k,i}$ with $w_k \ge 0$, where $\hat{y}_{k,i}$ is the prediction of regressor $k$ on sample $i$.
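To make the stacking step concrete, the sketch below fits non-negative weights to the predictions of four base regressors. It is an illustration only: the base predictions are synthetic (truth plus Gaussian noise), all function names are hypothetical, and the solver is a simple clipped coordinate-descent routine written for this example rather than whatever non-negative least-squares implementation MAISTEP uses. In practice the weights would normally be fit on out-of-fold predictions.

```python
import random


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def nnls_weights(base_preds, target, sweeps=500):
    """Non-negative least-squares weights via clipped coordinate descent.

    base_preds: list of K prediction lists, one per base regressor.
    """
    k = len(base_preds)
    gram = [[sum(a * b for a, b in zip(base_preds[i], base_preds[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(a * t for a, t in zip(base_preds[i], target)) for i in range(k)]
    # Warm start on the best single regressor; every coordinate update can
    # only lower the squared error, so the stack is never worse than it.
    best = min(range(k), key=lambda i: mse(base_preds[i], target))
    weights = [0.0] * k
    weights[best] = 1.0
    for _ in range(sweeps):
        for i in range(k):
            partial = rhs[i] - sum(gram[i][j] * weights[j]
                                   for j in range(k) if j != i)
            weights[i] = max(0.0, partial / gram[i][i])  # clip at zero
    return weights


def stack(base_preds, weights):
    """Weighted sum of base predictions for every sample."""
    n = len(base_preds[0])
    return [sum(w * p[i] for w, p in zip(weights, base_preds)) for i in range(n)]


# Synthetic stand-ins for four base regressors predicting stellar mass.
rng = random.Random(0)
truth = [rng.uniform(0.8, 1.4) for _ in range(200)]
noise_levels = [0.05, 0.04, 0.03, 0.06]  # arbitrary per-regressor scatter
preds = [[t + rng.gauss(0.0, s) for t in truth] for s in noise_levels]

weights = nnls_weights(preds, truth)
stacked = stack(preds, weights)
```

The non-negativity constraint keeps the ensemble interpretable as a blend of the base models and prevents one regressor from being used to cancel another.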
Uncertainty Estimation
Given observational errors in $T_{\rm eff}$, [Fe/H], and $L$, each star is realized 10,000 times in a Monte Carlo ensemble; the median and quartile-based bounds of the resulting parameter distributions define the output value and uncertainty.
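The Monte Carlo procedure can be sketched in a few lines. The example below assumes independent Gaussian errors on the three inputs and uses the Stefan-Boltzmann radius as a stand-in for the trained model, since the real regressor is not available here; `toy_radius_model` and `monte_carlo_estimate` are hypothetical names, and the input values are illustrative solar-like numbers.

```python
import random
import statistics

T_SUN = 5772.0  # K, IAU 2015 nominal solar effective temperature


def toy_radius_model(teff, feh, lum):
    """Stand-in for a trained regressor: Stefan-Boltzmann radius in R_sun.

    The metallicity argument is accepted but unused in this toy model.
    """
    return lum ** 0.5 * (T_SUN / teff) ** 2


def monte_carlo_estimate(model, teff, feh, lum, n_draws=10_000, seed=42):
    """Propagate Gaussian input errors through `model`.

    teff, feh, lum are (value, sigma) pairs. Returns the median and the
    distances from the median to the lower and upper quartiles.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        teff_draw = rng.gauss(*teff)
        feh_draw = rng.gauss(*feh)
        lum_draw = rng.gauss(*lum)
        if teff_draw <= 0 or lum_draw <= 0:
            continue  # discard unphysical realizations
        draws.append(model(teff_draw, feh_draw, lum_draw))
    q1, median, q3 = statistics.quantiles(draws, n=4)
    return median, median - q1, q3 - median


radius, err_lo, err_hi = monte_carlo_estimate(
    toy_radius_model,
    teff=(5772.0, 60.0),
    feh=(0.0, 0.04),
    lum=(1.0, 0.03),
)
```

Reporting the median with asymmetric bounds avoids assuming that the output distribution is Gaussian, which matters most for ages.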
Performance Metrics
Validation against asteroseismic samples (APOKASC, LEGACY) demonstrates negligible bias and competitive scatter:
- Radius: (bias), (scatter) vs. APOKASC; , vs. LEGACY
- Mass: , (APOKASC); , (LEGACY)
- Age: , (APOKASC); , (LEGACY) (Kamulali et al., 4 Feb 2025)
3. SWEET-Cat and MAISTEP: Combined Application for Exoplanet Science
The power of SWEET-Cat and MAISTEP emerges most strikingly in joint applications, where homogeneous spectroscopy feeds into MAISTEP's inference framework for host-star parameters, which are then propagated into planetary property determination and exoplanet population studies (Kamulali et al., 18 Jan 2026).
Workflow Integration
- Data Matching: SWEET-Cat's $T_{\rm eff}$ and [Fe/H], combined with Gaia luminosities, are supplied to MAISTEP. For any existing SWEET-Cat table, ages and radii can be appended via a database join on star identifier or coordinates.
- Propagation: The uncertainties on input parameters from both pipelines can be combined to produce robust final confidence intervals.
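The join described under Data Matching can be as simple as a keyed lookup. The sketch below performs a left join on a star identifier; the column names (`name`, `radius`, `mass_maistep`, `age_gyr`), the star labels, and every numeric value are placeholders invented for illustration and do not reflect the actual SWEET-Cat schema or any real measurement.

```python
def append_maistep(sweetcat_rows, maistep_rows, key="name"):
    """Left-join MAISTEP outputs onto catalog rows by a shared identifier.

    Rows without a MAISTEP match keep their catalog fields and get None
    for the appended columns.
    """
    lookup = {row[key]: row for row in maistep_rows}
    joined = []
    for row in sweetcat_rows:
        extra = lookup.get(row[key], {})
        merged = dict(row)  # copy so the input table is left untouched
        for column in ("radius", "mass_maistep", "age_gyr"):
            merged[column] = extra.get(column)
        joined.append(merged)
    return joined


# Placeholder tables with made-up identifiers and values.
catalog = [
    {"name": "star_A", "teff": 5700.0, "feh": 0.05},
    {"name": "star_B", "teff": 6100.0, "feh": -0.10},
]
inferred = [
    {"name": "star_A", "radius": 1.02, "mass_maistep": 1.01, "age_gyr": 4.5},
]
table = append_maistep(catalog, inferred)
```

A coordinate-based match would replace the dictionary lookup with a nearest-neighbor search within some angular tolerance, which is needed when identifiers differ between catalogs.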
Case Study: The Exoplanet Radius Valley
In (Kamulali et al., 18 Jan 2026), SWEET-Cat and MAISTEP are used to recompute radii, masses, and ages for over 1,200 main-sequence planet-hosting stars, producing a sample for refined analysis of the "radius valley"—a deficit of planets near thought to demarcate the transition between super-Earths and sub-Neptunes. With stellar radii determined to precision and propagating to in , the valley is revealed as deeper and partially filled, with trends quantified as follows:
- Period dependence: in versus
- Flux dependence:
- Stellar mass dependence: Slope (stronger for sub-Neptunes)
- Age dependence: (super-Earth to sub-Neptune ratio) rises from for Gyr to for Gyr; the valley becomes broader and shifts to larger with increasing age, consistent with gradual atmospheric loss.
A multidimensional fit yields: with , showing the age trend is weaker but significant (Kamulali et al., 18 Jan 2026).
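Trends of this kind are usually expressed as power-law slopes, which reduce to straight-line fits in log space. The following sketch shows a single-variable version of such a fit using ordinary least squares; it does not reproduce the fitted form or coefficients of the study, and the input slope of -0.1 and normalization of 2.0 are arbitrary values chosen only to check that the routine recovers what was put in.

```python
import math


def power_law_fit(x, y):
    """Fit y = norm * x**slope by least squares in log10 space.

    Returns (slope, norm). All inputs must be positive.
    """
    log_x = [math.log10(v) for v in x]
    log_y = [math.log10(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, 10.0 ** intercept


# Synthetic valley locations generated from an arbitrary power law.
periods_days = [1.0, 3.0, 10.0, 30.0, 100.0]
valley_radii = [2.0 * p ** -0.1 for p in periods_days]
slope, norm = power_law_fit(periods_days, valley_radii)
```

A multidimensional version would regress the log of the valley radius on several log-scaled predictors at once, which requires solving the normal equations or using a linear-algebra library.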
4. Scientific Implications of Precision Host-Star Characterization
The joint use of SWEET-Cat and MAISTEP has direct impact on key exoplanetary problems:
- Atmospheric loss diagnostics: Trends with period, flux, and mass support both photoevaporation and core-powered mass loss as valley-sculpting processes. The presence of gigayear-scale valley evolution and a super-Earth to sub-Neptune ratio that rises with age strongly suggests an important role for long-lived envelope loss mechanisms, favoring core-powered mass loss (Kamulali et al., 18 Jan 2026).
- Exoplanet demographics: Enhanced precision in host properties enables detection of subtle demographic features, resolving previous ambiguities due to parameter systematic errors.
- Giant-planet host ages: Application of MAISTEP to Jupiter-mass planet hosts reinforces prior findings: hosts of hot Jupiters are statistically younger (median 1.98 Gyr) than those of warm or cold Jupiters (medians 2.98 and 3.51 Gyr, respectively), consistent with tidal decay and evolutionary scenarios (Kamulali et al., 4 Feb 2025).
5. Future Directions and Expansion
Continued growth and refinement of SWEET-Cat—especially as new high-resolution spectroscopic surveys expand the catalog—and enhancements in MAISTEP’s evolutionary grid and ML architecture are expected. Further improvement in precision and age accuracy is anticipated from upcoming missions such as PLATO, which will yield direct age and radius benchmarks via asteroseismology across broader stellar populations (Kamulali et al., 18 Jan 2026).
A plausible implication is that future studies leveraging SWEET-Cat and MAISTEP will provide increasingly robust constraints on exoplanet formation and evolution scenarios, as systematic uncertainties in host parameters are suppressed to the few-percent level at scale.
References:
- "MAISTEP -- a new grid-based machine learning tool for inferring stellar parameters I. Ages of giant-planet host stars" (Kamulali et al., 4 Feb 2025)
- "Revisiting the exoplanet radius valley with host stars from SWEET-Cat" (Kamulali et al., 18 Jan 2026)
- "SWEET-Cat: A catalogue of parameters for Stars With ExoplanETs I. New atmospheric parameters and masses for 48 stars with planets" (Santos et al., 2013)