Lomb-Scargle Periodogram Explained
- The Lomb-Scargle periodogram is a spectral analysis tool that detects periodic signals in unevenly sampled data using a weighted least-squares sinusoidal fit.
- It introduces phase offsets to decorrelate sine and cosine components, yielding statistically rigorous significance measures under white Gaussian noise assumptions.
- Advanced variants like GLS and BGLS extend the method with floating means and robust noise handling, enabling efficient analysis in large-scale astronomical surveys.
The Lomb-Scargle periodogram (LSP) is a foundational spectral analysis tool for detecting and characterizing periodic signals in unevenly sampled time series, especially prevalent in astronomical applications where observational cadences are irregular. It generalizes classical Fourier-based periodograms to accommodate gaps, variable errors, and nonuniform time coverage, providing statistically meaningful period estimates and significance metrics under well-defined noise assumptions.
1. Mathematical Foundations and Derivation
The standard LSP is derived as the least-squares fit of a single-frequency sinusoid to an unevenly sampled dataset. Given measurements $\{(t_i, y_i, \sigma_i)\}_{i=1}^{N}$, where $t_i$ are observation times, $y_i$ are observed values, and $\sigma_i$ are (Gaussian) uncertainties, the measurement model is

$$y_i = a \cos(\omega t_i) + b \sin(\omega t_i) + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma_i^2).$$

Weights are defined as $w_i = \sigma_i^{-2} \big/ \sum_j \sigma_j^{-2}$. To decorrelate the sinusoidal basis, a phase offset $\tau$ is introduced such that the weighted cross-terms vanish:

$$\tan(2\omega\tau) = \frac{\sum_i w_i \sin(2\omega t_i)}{\sum_i w_i \cos(2\omega t_i)}.$$

The essential LSP power statistic at frequency $\omega$ is then

$$P(\omega) = \frac{1}{2}\left[\frac{\bigl(\sum_i w_i (y_i - \bar{y})\cos\omega(t_i-\tau)\bigr)^2}{\sum_i w_i \cos^2\omega(t_i-\tau)} + \frac{\bigl(\sum_i w_i (y_i - \bar{y})\sin\omega(t_i-\tau)\bigr)^2}{\sum_i w_i \sin^2\omega(t_i-\tau)}\right],$$

with $\bar{y} = \sum_i w_i y_i$ as the weighted mean. Under the null hypothesis of white Gaussian noise, the suitably normalized power is exponentially distributed at each frequency, $\Pr[P(\omega) > z] = e^{-z}$ (Vio et al., 2013, Vio et al., 2018).
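This least-squares formulation translates almost line-for-line into code. The following is a minimal NumPy sketch, assuming homoscedastic white noise (so the per-point weights drop out) and a pre-subtracted mean; the data and frequency grid are illustrative:

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical LSP power via the phase-offset formulation.

    Minimal sketch: homoscedastic white noise is assumed, so the
    per-point weights cancel; the mean is subtracted up front.
    """
    y = y - y.mean()
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # tau makes the sine/cosine cross-term vanish at this frequency
        tau = np.arctan2(np.sum(np.sin(2 * w * t)),
                         np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[k] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# irregular sampling; the periodogram peak should land near f = 0.7
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0, 50, 120))
y = np.sin(2 * np.pi * 0.7 * t) + 0.3 * rng.standard_normal(120)
freqs = np.linspace(0.05, 2.0, 2000)
p = lomb_scargle(t, y, freqs)
best = freqs[np.argmax(p)]
```

The inner loop makes the $\mathcal{O}(N N_f)$ cost of direct evaluation explicit: each trial frequency requires a pass over all $N$ samples.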
2. Statistical Properties and Assumptions
The periodogram formalism relies on several core assumptions:
- The noise $\epsilon_i$ is white, zero-mean, and Gaussian.
- The signal, if present, is well-modeled as a sinusoid at each frequency.
The phase offset $\tau$ ensures weak-sense decorrelation between the fitted sine and cosine components, even under uneven sampling (VanderPlas, 2017, Vio et al., 2013). The false-alarm probability (FAP) is controlled via the exponential tail of the power distribution; for $M$ independent frequencies,

$$\mathrm{FAP}(z) = 1 - \left(1 - e^{-z}\right)^{M},$$

where $M$ is the effective number of independent frequencies, often of order the number of data points for modest irregularity (Vio et al., 2018).
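The FAP exceedance expression for $M$ independent frequencies is directly computable; the values of $z$ and $M$ below are illustrative, not taken from the text:

```python
import numpy as np

def false_alarm_probability(z, M):
    """Analytic white-noise FAP for a normalized LSP peak of height z,
    given M effectively independent trial frequencies."""
    # P(max power > z) = 1 - (1 - e^{-z})^M under the null
    return 1.0 - (1.0 - np.exp(-z)) ** M

# illustrative: a peak of height 10 searched over ~1000 independent frequencies
fap = false_alarm_probability(z=10.0, M=1000)
```

Note how the multiple-comparison penalty works: $e^{-10} \approx 4.5\times10^{-5}$ is tiny per frequency, but over a thousand independent trials the global FAP rises to a few percent.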
If the signal mean is nonzero or the data contain a secular trend, spurious power can appear throughout the spectrum. Remedies include subtracting the mean or adopting a generalized ("floating-mean") formulation, in which a constant offset is refitted at each frequency (Mortier et al., 2014, Tejas et al., 2018, Pasumarti et al., 2024).
3. Generalizations: Weighted, Floating-Mean, and Bayesian Formulations
Generalized Lomb-Scargle
The generalized Lomb-Scargle periodogram (GLS) fits the model $y(t) = a\cos(\omega t) + b\sin(\omega t) + c$ at each frequency $\omega$, where $c$ is a frequency-dependent floating mean. The reduction in $\chi^2$ relative to the constant-only model gives the normalized power

$$p(\omega) = \frac{\chi^2_0 - \chi^2(\omega)}{\chi^2_0} = \frac{SS \cdot YC^2 + CC \cdot YS^2 - 2\,CS \cdot YC \cdot YS}{YY\,(CC \cdot SS - CS^2)},$$

where $YC$, $YS$, $CC$, $SS$, and $CS$ are frequency-dependent, weighted sums of the mean-subtracted data and basis functions, and $YY$ is the total weighted variance about the mean (Pasumarti et al., 2024, Dhaygude et al., 2019).
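Rather than assembling the closed-form sums, a sketch can solve the same weighted least-squares problem directly with `numpy.linalg.lstsq`; this is slower but makes the floating-mean model explicit (data below are illustrative, with heteroscedastic errors):

```python
import numpy as np

def gls_power(t, y, sigma, freqs):
    """GLS sketch: weighted fit of y(t) = a cos(wt) + b sin(wt) + c at each
    frequency, scored as the relative chi-square reduction p in [0, 1]."""
    w = 1.0 / sigma**2
    w = w / w.sum()                        # normalized weights
    ybar = np.sum(w * y)                   # weighted mean (constant model)
    YY = np.sum(w * (y - ybar) ** 2)       # chi^2_0: weighted variance
    sw = np.sqrt(w)
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        om = 2.0 * np.pi * f
        A = np.column_stack([np.cos(om * t), np.sin(om * t), np.ones_like(t)])
        # weighted least squares via sqrt-weight row scaling
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        chi2 = np.sum(w * (y - A @ coef) ** 2)
        power[k] = (YY - chi2) / YY        # p(omega)
    return power

# illustrative data: offset + sinusoid at f = 0.5, varying error bars
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 30, 100))
sigma = rng.uniform(0.2, 0.5, 100)
y = 2.0 + np.sin(2 * np.pi * 0.5 * t) + sigma * rng.standard_normal(100)
freqs = np.linspace(0.05, 2.0, 500)
p = gls_power(t, y, sigma, freqs)
best = freqs[np.argmax(p)]
```

Because the constant model is nested inside the three-parameter model, $\chi^2(\omega) \le \chi^2_0$ and the power is bounded in $[0, 1]$ at every frequency.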
Bayesian Generalized Lomb-Scargle (BGLS)
The BGLS marginalizes the Gaussian likelihood analytically over the amplitudes $a$, $b$, and the offset $c$ (and optionally trend parameters), yielding a closed-form posterior probability for each frequency expressed through explicit combinations of the weighted data sums (see Sect. 2.2 of Mortier et al., 2014). This formulation ensures positive-definite probability densities, robustly quantifies relative likelihoods between frequency hypotheses, and reduces to the classical GLS in the limit of large datasets and flat priors (Mortier et al., 2014, Olspert et al., 2017).
Bayesian generalizations can be further extended to include linear trends (BGLST), Gaussian priors on nuisance parameters, and heteroscedastic noise, yielding closed-form marginal likelihoods for model selection and period estimation even in the presence of red noise or secular trends (Olspert et al., 2017).
4. Computational Algorithms and Efficiency
The computational cost for a full periodogram scales as $\mathcal{O}(N N_f)$, where $N$ is the number of data points and $N_f$ the number of trial frequencies. Direct evaluation is typically tractable for small-to-moderate datasets (Mortier et al., 2014, Gowanlock et al., 2021). Fast algorithms leverage non-uniform FFT (NUFFT) routines to accelerate large-scale searches:
- Brute-force LSP: $\mathcal{O}(N N_f)$, suitable for moderate sample sizes.
- Press & Rybicki extirpolation + FFT: $\mathcal{O}(N_f \log N_f)$ (Garrison et al., 2024).
- State-of-the-art NUFFT (e.g., finufft, nifty-ls): $\mathcal{O}(N + N_f \log N_f)$, with relative errors down to $\sim 10^{-7}$–$10^{-8}$ at double precision and orders-of-magnitude speedups on CPU and GPU (Garrison et al., 2024, Gowanlock et al., 2021, Townsend, 2010).
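For moderate problem sizes the brute-force path is a one-liner with scipy; the fast NUFFT-backed implementations (astropy's `method="fast"`, nifty-ls) accept essentially the same inputs. A minimal sketch with illustrative data (note that `scipy.signal.lombscargle` expects angular frequencies):

```python
import numpy as np
from scipy.signal import lombscargle

# Brute-force O(N * N_f) evaluation; illustrative signal at f = 0.25
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, 300))
y = np.cos(2 * np.pi * 0.25 * t) + 0.2 * rng.standard_normal(300)

ang_freqs = 2 * np.pi * np.linspace(0.01, 1.0, 5000)  # angular frequencies
pgram = lombscargle(t, y - y.mean(), ang_freqs, normalize=True)
best_f = ang_freqs[np.argmax(pgram)] / (2 * np.pi)
```

At $N = 300$, $N_f = 5000$ this runs in well under a second; the fast methods become essential only when both $N$ and $N_f$ reach the $10^5$–$10^6$ regime typical of survey pipelines.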
Optimal frequency grids oversample the Rayleigh resolution (step $\Delta f = 1/T$ for total time baseline $T$) by a factor of 4–5 to avoid missing narrow peaks (Mortier et al., 2014, Pasumarti et al., 2024). Data gaps, irregular cadence, and varying errors are handled natively by direct summation; FFT-based acceleration requires special treatment.
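A grid built along these lines can be sketched as follows; the pseudo-Nyquist default for the maximum frequency is a common heuristic (an assumption here, not a strict bound for uneven sampling):

```python
import numpy as np

def frequency_grid(t, oversample=5, f_max=None):
    """Trial-frequency grid oversampling the Rayleigh resolution 1/T.

    f_max defaults to 0.5 / median(sampling interval), a pseudo-Nyquist
    heuristic; for strongly uneven cadences higher frequencies may still
    be recoverable and f_max should be set explicitly.
    """
    T = t.max() - t.min()                  # total time baseline
    df = 1.0 / (oversample * T)            # oversampled Rayleigh step
    if f_max is None:
        f_max = 0.5 / np.median(np.diff(np.sort(t)))
    return np.arange(df, f_max, df)

grid = frequency_grid(np.linspace(0.0, 10.0, 50))
```

With `oversample=5` and a 10-unit baseline, the grid step is 0.02 in frequency, fine enough that a peak of width $\sim 1/T$ spans several grid points.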
5. Limitations, Robustness, and Extensions
The classical LSP assumes the true signal is sinusoidal and the noise is stationary, white, and Gaussian. In non-Gaussian or heavy-tailed situations, the standard $\ell_2$ least-squares approach is sensitive to outliers and tail events; robust periodograms using the $\ell_1$ (sum-of-absolute-residuals) norm are more resilient, though at increased computational cost (nonlinear minimization or linear programming at each frequency) (Makarov et al., 2024).
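The $\ell_1$ idea can be sketched by swapping the squared-residual objective for an absolute-residual one at each trial frequency. This is only an illustration: it warm-starts a simplex search from the $\ell_2$ fit rather than using the per-frequency linear programs of production codes, and all data below are synthetic:

```python
import numpy as np
from scipy.optimize import minimize

def l1_periodogram(t, y, freqs):
    """Robust periodogram sketch: minimize the sum of absolute residuals
    of a sinusoid-plus-offset model at each frequency, scored against the
    ell_1 cost of the constant (median) model."""
    base = np.sum(np.abs(y - np.median(y)))    # ell_1 cost, constant model
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        om = 2.0 * np.pi * f
        A = np.column_stack([np.cos(om * t), np.sin(om * t), np.ones_like(t)])
        x0, *_ = np.linalg.lstsq(A, y, rcond=None)   # warm start: ell_2 fit
        res = minimize(lambda p: np.sum(np.abs(y - A @ p)),
                       x0=x0, method="Nelder-Mead")
        power[k] = 1.0 - res.fun / base        # relative ell_1 cost reduction
    return power

# sinusoid at f = 0.5 with a few large outliers that would bias an ell_2 fit
rng = np.random.default_rng(5)
t = np.sort(rng.uniform(0, 20, 80))
y = np.sin(2 * np.pi * 0.5 * t) + 0.1 * rng.standard_normal(80)
y[::16] += 5.0                                 # five strong outliers
freqs = np.linspace(0.1, 1.0, 46)              # coarse grid: the search is slow
p_l1 = l1_periodogram(t, y, freqs)
best_l1 = freqs[np.argmax(p_l1)]
```

The per-frequency nonlinear minimization is what drives the extra cost the text mentions: each grid point is itself an optimization problem rather than a closed-form sum.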
The LSP is sub-optimal for non-sinusoidal periodicities, as the signal power is distributed over higher harmonics, reducing detection efficiency compared to multi-harmonic, template-matched, or Bayesian model comparison approaches (Lin et al., 20 May 2025). For colored ("red") noise, the exponential null distribution is invalid, and significance estimation requires explicit noise modeling (e.g., via Whittle likelihoods and frequency-dependent FAPs) (Ejaz et al., 20 Dec 2025).
Multiple-frequency and multiband generalizations extend the model space to higher-dimensional "omnigrams": fits to arbitrary bases, joint frequency searches, and applications in high-dimensional time-series pipelines (VanderPlas et al., 2015, Scargle et al., 8 Jan 2026).
6. False-Alarm Probability, Significance, and Practical Recommendations
The FAP can be estimated analytically (for white noise) or via bootstrap resampling of the residuals; the latter is essential for correlated or red noise, where the analytic formulae are invalid:
- For $M$ independent frequencies, $\mathrm{FAP}(z) = 1 - (1 - e^{-z})^M$ for an LSP peak of height $z$ (Vio et al., 2013).
- The number of independent frequencies should be estimated (via spectral window/correlation analysis) to adjust FAP for oversampling and window effects (Vio et al., 2018, Lu et al., 2022).
- In practical cases, bootstrap resampling or Monte Carlo simulation of the periodogram under the null is recommended to calibrate significance (Mortier et al., 2014, Dhaygude et al., 2019, Pasumarti et al., 2024).
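A minimal permutation-based calibration along these lines shuffles the values against the fixed time stamps (destroying any coherent signal while keeping the sampling window), recomputes the periodogram maximum, and reports the exceedance fraction. All data below are illustrative, and a white-noise null is assumed:

```python
import numpy as np
from scipy.signal import lombscargle

def bootstrap_fap(t, y, freqs, n_boot=200, seed=0):
    """Monte Carlo FAP for the observed periodogram peak under a
    white-noise null, via permutation of y against fixed t."""
    ang = 2.0 * np.pi * freqs
    obs_peak = lombscargle(t, y - y.mean(), ang, normalize=True).max()
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_boot):
        ys = rng.permutation(y)
        null_peak = lombscargle(t, ys - ys.mean(), ang, normalize=True).max()
        if null_peak >= obs_peak:
            exceed += 1
    return exceed / n_boot        # fraction of null peaks beating the data

# strong sinusoid at f = 0.5: the calibrated FAP should be essentially zero
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 20, 60))
y = 1.5 * np.sin(2 * np.pi * 0.5 * t) + 0.3 * rng.standard_normal(60)
freqs = np.linspace(0.05, 2.0, 400)
fap = bootstrap_fap(t, y, freqs, n_boot=200)
```

Permutation destroys temporal correlation, so this sketch calibrates only the white-noise null; for red noise, block-bootstrap or explicit noise-model simulation is needed instead, as the text notes.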
For trend- or offset-contaminated data, Bayesian or GLS periodograms with simultaneous offset/trend fitting are preferred to pre-detrending, which can distort genuine long-period signals (Mortier et al., 2014, Olspert et al., 2017). For large surveys or real-time contexts, optimized GPU/NUFFT implementations are necessary to maintain tractability (Gowanlock et al., 2021, Garrison et al., 2024, Townsend, 2010).
7. Applications and Software Implementations
The LSP and its generalizations have become central to exoplanet radial-velocity searches, stellar rotation analysis, survey time-domain pipelines, and a host of astrophysical variability studies. Widely used software implementations include:
- astropy.timeseries.LombScargle and scipy.signal.lombscargle (Python): support classical and generalized (floating-mean) evaluations, fast NUFFT-accelerated backends via nifty-ls, and built-in FAP methods (Garrison et al., 2024, Mortier et al., 2014).
- nifty-ls: direct integration with Astropy, leveraging the finufft/cufinufft backends for high-throughput, high-precision calculations (Garrison et al., 2024).
- Custom, vectorized, or low-level C/Fortran codes for maximum performance in large-scale survey processing (Gowanlock et al., 2021, Townsend, 2010).
- Bayesian (BGLS/BGLST) and robust $\ell_1$ extensions for heavy-tailed or systematically contaminated datasets (Mortier et al., 2014, Makarov et al., 2024).
A detailed end-to-end workflow involves: selection of frequency grid; computation of weighted, time-shifted sums; model normalization; FAP calibration (via analytic, bootstrap, or red-noise approaches); and post hoc model selection among candidate frequencies or composite hypotheses (multiharmonic, template, Bayesian model comparison) (Mortier et al., 2014, Ejaz et al., 20 Dec 2025, Scargle et al., 8 Jan 2026).
References:
- Mortier et al., "BGLS: A Bayesian formalism for the generalised Lomb-Scargle periodogram" (Mortier et al., 2014)
- Olspert et al., "Estimating activity cycles with probabilistic methods I. Bayesian Generalised Lomb-Scargle Periodogram with Trend" (Olspert et al., 2017)
- Dhaygude & Desai, "Generalized Lomb-Scargle analysis of 36Cl decay rate measurements at PTB and BNL" (Dhaygude et al., 2019)
- Pasumarti & Desai, "Generalized Lomb-Scargle Analysis of 22 years of Super-Kamiokande solar 8B neutrino data" (Pasumarti et al., 2024)
- Lin et al., "Lomb-Scargle periodograms struggle with non-sinusoidal supermassive BH binary signatures in quasar lightcurves" (Lin et al., 20 May 2025)
- Ejaz et al., "Red noise-based false alarm thresholds for astrophysical periodograms via Whittle's approximation to the likelihood" (Ejaz et al., 20 Dec 2025)
- Makarov et al., "Robust $\ell_1$-norm periodograms for analysis of noisy non-Gaussian time series with irregular cadences" (Makarov et al., 2024)
- Gowanlock et al., "Fast Period Searches Using the Lomb-Scargle Algorithm on Graphics Processing Units for Large Datasets and Real-Time Applications" (Gowanlock et al., 2021)
- Townsend, "Fast Calculation of the Lomb-Scargle Periodogram Using Graphics Processing Units" (Townsend, 2010)
- VanderPlas & Ivezić, "Periodograms for Multiband Astronomical Time Series" (VanderPlas et al., 2015)
- Scargle, "Studies in astronomical time series analysis. II. Statistical aspects of spectral analysis of unevenly spaced data" (1982)
- Zechmeister & Kürster, "The generalised Lomb-Scargle periodogram" (2009)
This summary synthesizes technical details of the LSP, variants, and computational strategies, focusing on rigor and research-driven best practices.