Machine Learning Supports Existence of Previously Unrecognized Transient Astronomical Phenomena in Historical Observatory Images
Abstract: Transient, star-like point sources that appear and vanish over short timescales are described in astronomical images prior to launch of Sputnik. We have reported that transient numbers diminish significantly in Earth's shadow (shadow deficit) and are more likely within (plus/minus) one day of nuclear testing (nuclear window). These findings remain debated with some arguing that transients identified via existing automated pipelines are simply plate defects. Therefore, we use ML to enhance transient identification accuracy and validate the phenomenon. The model was trained against 250 transient image pairs taken 30 minutes apart that were classified as real versus plate defect by expert visual review; the model demonstrated good discrimination (out-of-fold AUC = 0.81; sensitivity = 0.71, specificity = 0.71). After deployment in a dataset of 107,875 previously-identified transients, the model assigned each a probability of being real. After controlling for ML-identified artifacts, transient counts were significantly elevated for dates within a nuclear window (p = .024); transients with the highest probability of being real were more likely to occur within a nuclear window (p < .0001). The shadow deficit was significant (p < .0001) and largest in the highest probability transients relative to lower probability transients (p = .003). Results strongly support existence of an unrecognized population of transient objects in historical astronomical plates warranting further study.
Explain it Like I'm 14
What this paper is about (big picture)
The paper looks at strange, very short-lived “points of light” that show up on old space photos from the 1950s, then disappear. These photos were taken long before the first satellite (Sputnik) was launched. Some scientists think many of these “transients” are just flaws on the old photographic plates (like dust or scratches), not real sky objects. This study uses machine learning (a kind of computer pattern-recognition) to sort likely real transients from plate defects and then asks: do the most “real-looking” ones show patterns that make sense in the real world?
What questions the researchers asked
They focused on two simple questions:
- Are these brief, star-like “transients” real objects, not just defects on old photos?
- If they are real, do they behave in ways we’d expect from real objects? Specifically:
- Do we see fewer of them when they would be in Earth’s shadow (where they couldn’t shine by reflecting sunlight)? The team calls this the “shadow deficit.”
- Do more of them appear around the dates of U.S. above-ground nuclear tests (within one day before or after), a timing pattern reported in earlier work?
If the strongest “real-looking” transients show both patterns more clearly, that would support the idea that at least some transients are genuine.
How they studied it (explained simply)
- Historical images: The team started with 107,875 “transient” candidates found by an earlier automated search in old Palomar Observatory (POSS-I) photographic plates taken between 1949 and 1957.
- What’s a photographic plate? Think of an old-fashioned camera using glass plates instead of digital sensors. Over decades, plates can get dust, scratches, or chemical spots—these are “plate defects” that can look like tiny stars.
- Training a machine to tell “real” from “fake”:
- They assembled a small training set of 250 image pairs taken 30 minutes apart, each containing at least one red-plate transient.
- An expert astronomer labeled each as “likely real transient” or “plate defect” by eye.
- The machine learning model looked at 23 simple, image-based features (things like how round or sharp the spot looks, how bright it is, how noisy the plate is, how close it is to the plate’s edge, etc.).
- The model combined several tree-based algorithms and learned to estimate, for each candidate, a probability that it is real.
- How good was the model?
- Area under the curve (AUC) ≈ 0.81. (Quick translation: 0.5 = guessing; 1.0 = perfect. So 0.81 is “good.”)
- Sensitivity ≈ 0.71 (it finds many of the real ones), specificity ≈ 0.71 (it rejects many of the fakes).
- Applying the model:
- They ran the model on all 107,875 candidates and gave each a “realness” probability.
- They sorted candidates into 10 equal groups (“deciles”) from lowest to highest probability of being real.
- Checking real-world patterns:
- Earth’s shadow (“shadow deficit”): If some transients are sun glints from reflective objects high above Earth, we should see fewer when those objects are inside Earth’s shadow (where sunlight can’t reach them). The team used a 3D geometry model to figure out which sky positions would be shadowed as seen from Palomar.
- Nuclear test “window”: They checked if more transients appeared on nights within ±1 day of U.S. above-ground nuclear tests (mostly at Nevada Test Site, relatively close to Palomar).
- To judge if patterns were meaningful and not random, they used statistical tests that shuffle labels (called permutation tests) and binomial tests. Think of it like repeatedly mixing up the calendar labels to see how often a pattern as strong as the real one would appear just by chance.
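The label-shuffling idea in the last bullet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `permutation_p_value`, the toy nightly counts, and the test statistic (difference in mean probability-weighted count between window and non-window dates) are all invented for the example.

```python
import random

def permutation_p_value(weighted_counts, in_window, n_perm=10_000, seed=0):
    """One-sided, date-level permutation test.

    weighted_counts: per-date sum of model probabilities (hypothetical input).
    in_window: per-date booleans, True if the date falls in the +/-1-day window.
    Returns the share of shuffles whose statistic is >= the observed one.
    """
    n_window = sum(in_window)
    if n_window == 0 or n_window == len(in_window):
        raise ValueError("need both window and non-window dates")

    def mean_diff(flags):
        inside = [c for c, f in zip(weighted_counts, flags) if f]
        outside = [c for c, f in zip(weighted_counts, flags) if not f]
        return sum(inside) / len(inside) - sum(outside) / len(outside)

    observed = mean_diff(in_window)
    rng = random.Random(seed)
    flags = list(in_window)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)  # reassign the window labels to dates at random
        if mean_diff(flags) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy data (invented): 20 dates, 4 of them in a window with higher counts.
counts = [9.0, 8.5, 10.0, 9.5] + [3.0, 4.0, 2.5, 3.5] * 4
window = [True] * 4 + [False] * 16
p = permutation_p_value(counts, window, n_perm=2000)
print(f"one-sided permutation p-value: {p:.4f}")
```

Shuffling whole dates rather than individual transients is the design choice that matters here: several transients from one plate or night stay together, so clustering within a night cannot inflate the apparent significance.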
What they found (main results)
- Many candidates are probably not real:
- Only about the top 20% of candidates had more than a 66% chance of being real, and only the top 10% approached 80% or higher. This means the original automated catalog likely contains lots of plate defects. That’s not surprising for such old data—and it shows why cleaning with machine learning helps.
- Strongest “real-looking” transients show the clearest real-world patterns:
- Shadow deficit is largest in the highest-probability group:
- The top decile (the most “real-looking” 10%) had the fewest transients in Earth’s shadow, a much bigger shortage than expected by chance. This matches the idea that at least some are reflective objects that need sunlight to be seen.
- More transients around nuclear test dates:
- Even after down-weighting likely defects (using the model’s probabilities), dates within the ±1-day “nuclear window” showed significantly more transients than expected by chance.
- The effect was strongest on the day of the test and the night before (remember the plates were taken at night and tests usually in the morning, so “the night before” is actually closest in time to the test event).
- The highest-probability group again showed the strongest increase.
- Clustering:
- Among the highest-confidence transients, they sometimes appeared in pairs or small groups on the same night and close together in the sky, which is interesting and may hint at shared causes.
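To make the shadow-deficit comparison concrete, the sketch below computes a one-sided binomial p-value for "this few or fewer candidates in shadow" in two probability groups. It is an illustration only: the counts are invented, the helper names are hypothetical, and the expected fraction reuses the 0.692% control expectation quoted in the glossary, assumed here to be the expected in-shadow fraction.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term built in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log(1.0 - p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)

def shadow_deficit_p(n_in_shadow, n_total, expected_fraction):
    """One-sided p-value: chance of seeing this few (or fewer) in shadow."""
    return binom_cdf(n_in_shadow, n_total, expected_fraction)

expected = 0.00692  # 0.692% control expectation (assumed to be the shadow fraction)

# Invented counts for two hypothetical probability groups of 10,000 candidates.
groups = {"lowest decile": (65, 10_000), "highest decile": (30, 10_000)}
for name, (k, n) in groups.items():
    print(f"{name}: {k}/{n} in shadow, p = {shadow_deficit_p(k, n, expected):.2e}")
```

With about 69 in-shadow candidates expected per 10,000, a count of 65 is unremarkable while a count of 30 is a strong deficit, which is the kind of contrast between deciles described above.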
Why these results matter
- The machine learning model’s success (AUC 0.81) argues strongly that not all transients are plate defects. If everything were just random dust and scratches, a model trained on expert labels wouldn’t do this well.
- Seeing the biggest “shadow deficit” and the strongest nuclear-test timing signal specifically in the most “real-looking” transients supports the idea that at least a subset are genuine physical phenomena, not just imaging artifacts.
- The shadow deficit is consistent with shiny objects at high altitude reflecting sunlight—objects you wouldn’t see when they’re in Earth’s shadow.
- The nuclear-test timing pattern is unusual and tightly timed, which makes a simple scheduling coincidence less likely. It doesn’t by itself explain what the objects are, but it suggests some physical link worth investigating.
What this could mean (and what it doesn’t)
- Implications:
- There may be a previously unrecognized group of short-lived, point-like events in historical sky images—potentially reflective objects high above Earth observed before the official start of the satellite era.
- Machine learning can “clean” historical astronomical data, making old plate archives more useful for studying fast events in the sky.
- Caution and limits:
- The model was trained on expert opinions, not an absolute gold standard (because none exists for this old data). Still, the performance was solid, and the real-world patterns back it up.
- The paper doesn’t claim a definite identity for these objects. It discusses possibilities (for example, unknown reflective objects in orbit) but leaves interpretation open.
- More independent checks using other archives from the same era are needed to confirm these patterns.
In short: By teaching a computer to tell likely real flashes from old-photo blemishes, the authors show that the clearest, most “real” transients line up with two physical expectations—fewer when in Earth’s shadow and more around U.S. nuclear test dates. That combination makes it more likely that at least some of these mysterious, short-lived lights were real phenomena, not just flaws, and it opens the door to new research using historical sky images.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
The following list enumerates specific gaps, limitations, and open questions left unresolved by the study that future research could address.
- Lack of an objective gold standard for transient vs. defect labeling; expert labels on only 250 pairs were used without quantifying inter-rater reliability, label noise, or systematic bias.
- No post-deployment human validation of top-decile candidates; the true precision (PPV) and false discovery rate for the highest-probability transients remain unknown.
- Training set size and diversity are limited; it is unclear whether the 250 curated exemplars adequately represent the full variety of plate defects and real phenomena in the 107,875-candidate catalog.
- Probability outputs are uncalibrated (isotonic calibration collapsed); the reliability of probability scores across deciles is uncertain and needs robust calibration (e.g., Platt scaling, Dirichlet calibration) on an independent validation set.
- The model uses only red-plate features and excludes cross-band (red–blue) comparisons or spectral/color information that could materially improve classification and reduce false positives.
- Potential covariate leakage: SHAP results indicate plate-level features (e.g., plate quality, SNR statistics) strongly drive predictions; the model may learn plate conditions rather than source-specific morphology. Ablation tests removing plate-level predictors are needed.
- Domain shift risk: model trained on a small curated subset but deployed on a large, heterogeneous catalog; generalization under varying plate scanners, emulsions, epochs, and sky regions is unverified.
- Operating thresholds are not principled; deciles are used for convenience, but ROC/PR-based threshold selection tied to specific precision/recall targets is missing.
- Shadow-deficit inference assumes geostationary orbit (GEO) altitude and specular reflections producing point sources over 50-minute exposures; the altitude assumption is untested and may not fit sub-second phenomena.
- No sensitivity analysis for shadow modeling across altitudes (LEO/MEO/GEO) or mixed altitude distributions; the observed deficit could change substantially under different orbital assumptions.
- Penumbra vs. umbra modeling choices are not stress-tested; the deficit’s sensitivity to Earth-shadow model parameters, observatory location uncertainties, and time-stamping accuracy is not quantified.
- Control expectations for the shadow fraction assume uniform sky positions; potential confounding from nonuniform airmass, sky brightness, Milky Way crowding, or plate-specific coverage is not fully accounted for.
- Spatial/temporal uncertainty is ignored in the shadow classification; coordinate errors and plate timing uncertainties are not propagated into deficit estimates.
- The nuclear-window definition (±1 day) is assumed rather than optimized; robustness to alternative windows (e.g., ±12 h, ±6 h) and precise test-time alignment with plate exposure times is not evaluated.
- Multiple lag tests (−3 to +3 days) are reported without correction for multiple comparisons; it is unclear if results remain significant under stringent FDR/Bonferroni control across lags.
- Statistical choices (one-tailed tests) raise concerns about confirmatory bias; two-tailed tests and preregistered analysis plans would strengthen inference.
- Small number of nuclear-window dates (26/370) limits power; a formal power analysis and bootstrap uncertainty for effect size estimates are missing.
- Seasonal and scheduling confounders are under-modeled; analyses do not control for lunar phase, weather, airmass, plate quality, sky brightness, observatory scheduling, or solar activity that could covary with test dates.
- Only U.S. nuclear tests were analyzed; extension to global tests (Soviet, UK, French) with distance-to-observatory weighting and time-of-day alignment is needed to probe mechanistic plausibility.
- Mechanism linking nuclear tests to transients is unspecified; hypotheses (RF emissions, ionospheric disturbances, geomagnetic effects, luminous phenomena) require direct testing against contemporaneous physical indices (Kp/Ap, Dst, ionosonde data, RF logs, lightning networks).
- Causality remains unaddressed; the study establishes correlations but provides no causal model or falsifiable predictions that differentiate alternative mechanisms.
- The absence of streaks during long exposures implies sub-second flashes, but timing is not measurable from plates; there is no effort to constrain durations via plate photometry (e.g., latent-image grain statistics) or modern time-resolved follow-ups.
- Photometric calibration is not performed; aperture fluxes are not converted to magnitudes, leaving brightness distributions, completeness limits, and potential selection biases uncharacterized.
- Color information is unused; differences between red and blue plates (taken ~30 minutes apart) could reveal spectral behavior or atmospheric contaminants (e.g., airglow, aurora), but are not exploited.
- Alternative explanations are not systematically excluded; coordinated checks against meteor databases, aircraft flight logs (1949–1957), lightning/sprite reports, cosmic ray incidence, or lab/scanner artifacts are absent.
- Spatial clustering of high-probability transients (doublets/triplets) is described but not statistically tested against Poisson expectations; alignment with anti-solar direction or orbital planes is not assessed.
- Catalog cross-match radius (5 arcsec) may miss faint or high-proper-motion counterparts; deeper catalogs, proper motion compensation, and variability catalogs should be tested to refine “no counterpart” claims.
- Scanning and digitization artifacts may vary by plate batch; replication using independent rescans or direct inspection of original glass plates is needed to rule out scanner-specific defects.
- Shadow-deficit magnitude is not compared to physical models; forward modeling of specular glint rates from hypothetical orbital populations (area, reflectance, orientation distributions) is needed to quantify expected deficits.
- The handling of the training-set class balance (134 real, 116 defect) and the sampling strategy are not detailed; the impact of class imbalance and label noise on AUC, sensitivity, specificity, and probability scores is unquantified.
- Ensemble choice is not benchmarked; comparisons with CNNs on cutouts, self-supervised or anomaly-detection methods, and image-level generative modeling could improve robustness and interpretability.
- Probability-weighted counts drive the nuclear association, yet probability calibration and model uncertainty are not propagated; Bayesian or bootstrap approaches should quantify uncertainty in weighted counts.
- The pipeline is not tested on other archival plate surveys (e.g., POSS-II, UKST, ESO, Sonneberg); cross-observatory replication, ideally including plates from the same era, is essential to validate generality.
- Geographic proximity argument (Nevada vs. Palomar) is qualitative; quantitative modeling of distance-dependent signals (if any) and atmospheric transport timescales is missing.
- The study speculates about early unpublicized satellites or non-human technosignatures but offers no discriminating tests; articulating falsifiable predictions and targeted observational strategies is necessary.
- Data and code availability are partial; detailed preprocessing, feature extraction, and plate selection protocols need full transparency for independent reproduction.
- Uncertainty quantification for shadow and nuclear effects is limited; confidence intervals for observed fractions, effect sizes, and ratios (not just p-values) should be reported.
- Robustness to catalog revisions is untested; re-running the pipeline against updated Gaia/Pan-STARRS releases and enhanced artifact filters could change the “no counterpart” classification rates.
- Plate-aware control uses 100 random positions per plate; sensitivity to control sample size and generation scheme (e.g., stratified by local sky properties) is not examined.
- Exact time stamps for plates (start/end of exposure) are not explicitly used; precise temporal alignment relative to nuclear test times is needed to refine “day-before” and “day-of” interpretations.
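The multiple-comparisons concern raised for the lag tests can be illustrated with a short sketch of Bonferroni and Benjamini-Hochberg corrections across seven lags. The p-values below are invented for illustration and are not taken from the paper; the function names are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 where p < alpha / m (family-wise error control)."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Reject H0 using the Benjamini-Hochberg step-up rule (FDR control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            reject[idx] = True
    return reject

lags = [-3, -2, -1, 0, 1, 2, 3]
p_by_lag = [0.40, 0.21, 0.004, 0.012, 0.03, 0.55, 0.70]  # invented values

bonf = bonferroni(p_by_lag)
bh = benjamini_hochberg(p_by_lag)
for lag, p, b, f in zip(lags, p_by_lag, bonf, bh):
    print(f"lag {lag:+d}: p = {p:.3f}  Bonferroni: {b}  BH: {f}")
```

In this toy case a lag with an uncorrected p of 0.03 survives neither correction, and the lag with p = 0.012 survives only the less conservative FDR procedure, which is why the choice of correction should be stated before the lags are examined.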
Practical Applications
Immediate Applications
The following applications can be deployed now using the paper’s ML methods, data hygiene workflow, and geometric modeling, with notes on sectors and practical dependencies.
- ML triage and quality control for historical sky surveys (academia, observatories, archives)
- Use the released ensemble model and feature set (catalog-, plate-, and FITS-level) to assign probabilities to candidates and prioritize expert review to the top deciles, cutting manual workload by ~80–90%.
- Tools/products/workflows: a CLI/API that ingests FITS and catalog metadata and outputs per-candidate probabilities and SHAP diagnostics; integration into Virtual Observatory pipelines.
- Assumptions/dependencies: access to plate FITS data and plate metadata; model retraining for new archives; current training labels are expert-judgment based (250 exemplars), so performance may degrade under domain shift.
- Digitization QA for photographic plates and film (cultural heritage, libraries, scanning vendors)
- Repurpose the classifier to flag dust, hair, scratches, and emulsion defects on scanned media, improving automated restoration and reducing false content detections.
- Tools/products/workflows: a plug-in for scanning software that auto-tags likely artifacts and produces per-image QA scores based on plate-level features.
- Assumptions/dependencies: requires small, domain-specific labeled sets to retune the model and features for non-astronomical media.
- Cleaned time-domain science from archival plates (academia, time-domain astronomy)
- Apply the probability-weighted counts to re-mine historical surveys for novae, M-dwarf flares, pulsar optical spikes, asteroid/KBO detections, and rare fast transients with reduced false positives.
- Tools/products/workflows: probability-weighted candidate catalogs; dashboards highlighting clustered events (doublets/triplets) in high-probability deciles for targeted follow-ups.
- Assumptions/dependencies: availability of plate sequences with timing metadata; model recalibration per archive.
- Observation planning to minimize satellite glints (observatories, amateur astrophotography)
- Use the paper’s 3D topocentric Earth-shadow/penumbra model to schedule exposures in low-illumination geometries that suppress glints, improving data cleanliness.
- Tools/products/workflows: a planner that ingests site coordinates/time and outputs per-field glint likelihood based on penumbral geometry.
- Assumptions/dependencies: glint rates correlate with solar illumination geometry; modern LEO constellations differ from GEO assumption—model should allow altitude parameterization.
- Methodological template for event-association studies (academia, geoscience, epidemiology, economics)
- Adopt date-level permutation tests and probability-weighted counts to assess temporal associations while avoiding clustering biases (e.g., linking environmental events to sensor anomalies).
- Tools/products/workflows: reusable notebooks implementing permutation testing and decile-stratified analyses.
- Assumptions/dependencies: accurate timestamping; sufficient event days to achieve statistical power.
- Citizen science vetting with ML guidance (education, public engagement)
- Present volunteers with higher-probability candidates first and surface SHAP explanations to teach artifact-vs-signal cues, increasing throughput and training value.
- Tools/products/workflows: Zooniverse-style interfaces with decile filters and annotation exports.
- Assumptions/dependencies: UI integration and curation; periodic expert spot checks to guard against model drift.
- Cross-team data hygiene standards for archival imaging (software, research IT)
- Institutionalize the paper’s “multi-scale features + explainability (SHAP) + decile binning” pattern as a QA standard for digitized imagery projects.
- Tools/products/workflows: SOPs and code templates for feature extraction at item- and batch-level, model audit reports for governance.
- Assumptions/dependencies: staff familiarity with scikit-learn/XGBoost/LightGBM; versioned data pipelines.
- Initial policy-relevant analytics for historical event correlations (history of science, defense studies)
- Apply the paper’s event study design to test whether historical observatory anomalies concentrate around documented exogenous events (e.g., weapons tests), informing archival re-examination and contextualization.
- Tools/products/workflows: reproducible pipelines pairing observatory logs with public event registries.
- Assumptions/dependencies: strong caution in interpretation; correlation ≠ causation; sensitivity to confounders remains.
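The triage workflow in the first application (score every candidate, then send only the top deciles to expert review) might look like the sketch below. It assumes per-model probabilities are already available; the function names, the toy scores, and the choice to review the top two deciles are illustrative assumptions, and no released tool or API is being reproduced here.

```python
def ensemble_probability(model_probs):
    """Unweighted mean of the per-model probabilities for one candidate."""
    return sum(model_probs) / len(model_probs)

def assign_deciles(scores):
    """Map each score to a decile 1..10 by rank (10 = highest scores).

    Ties are broken by original order; group sizes differ by at most one.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    deciles = [0] * n
    for rank, idx in enumerate(order):
        deciles[idx] = rank * 10 // n + 1
    return deciles

# Invented outputs of four hypothetical base models for 20 candidates.
bases = [((7 * i) % 20 + 0.5) / 20 for i in range(20)]
per_model = [[b - 0.02, b, b + 0.01, b + 0.01] for b in bases]

scores = [ensemble_probability(probs) for probs in per_model]
deciles = assign_deciles(scores)

# Queue only the top two deciles for expert visual review.
to_review = [i for i, d in enumerate(deciles) if d >= 9]
print(f"{len(to_review)} of {len(scores)} candidates queued for review")
```

Rank-based deciles sidestep the calibration problem for triage purposes: they only require that higher scores mean "more likely real", not that a score of 0.8 corresponds to an 80% chance.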
Long-Term Applications
These opportunities require further research, replication, scaling, or productization before broad deployment.
- Global reprocessing of archival plates with ML (academia, space agencies, archives)
- Build a Global Historical Transient Catalog by scanning and ML-vetting plates from multiple observatories (1940s–1990s), standardizing metadata, and cross-matching with modern catalogs.
- Tools/products/workflows: cloud-scale ETL, cross-archive feature harmonization, active-learning loops to improve labels.
- Assumptions/dependencies: international data-sharing; sustained funding; robust annotation programs beyond the initial 250 examples.
- Purpose-built fast-imaging validation campaigns (instrumentation, observatories)
- Deploy high-frame-rate optical sensors and coordinated multi-site observations to confirm sub-second transient populations suggested by the plates.
- Tools/products/workflows: low-latency transient pipelines, automated follow-up triggers, glint-discriminating photometry.
- Assumptions/dependencies: mitigating modern satellite contamination; synchronized timing; dedicated telescope time.
- Space situational awareness from historical glints (aerospace, defense)
- Use ML-cleaned archival detections and shadow modeling to reconstruct historical orbital object populations/glint statistics, informing SSA models and debris evolution studies.
- Tools/products/workflows: data fusion with declassified tracking logs; Bayesian inference over altitude/orbit given illumination geometry.
- Assumptions/dependencies: validation against ground-truth is limited for pre-Sputnik era; sensitive policy context.
- Technosignature/UAP research framework (academia, SETI)
- Establish standardized criteria, pipelines, and multi-modal corroboration protocols for transient glints as potential technosignatures, with rigorous artifact suppression and temporal/geometric controls.
- Tools/products/workflows: registries of candidate events with reproducibility metadata; joint optical–RF campaigns.
- Assumptions/dependencies: high evidentiary standards; replication across independent archives and sensors.
- Observatory schedulers integrating glint/penumbra models (software for observatories)
- Create automated schedulers that incorporate dynamic Earth-shadow, Sun–object–observer geometry, and constellation traffic to minimize contamination in time-domain programs.
- Tools/products/workflows: plugins for observatory operations (e.g., TOM systems), APIs for sky visibility/glint risk.
- Assumptions/dependencies: accurate satellite ephemerides and altitude-aware glint models; real-time weather integration.
- Brightness mitigation policy and standards for megaconstellations (policy, aerospace)
- Inform standards on reflectivity and orientation control using refined illumination/shadow modeling to predict and limit glints visible to surveys.
- Tools/products/workflows: simulation suites for constellation operators and regulators; compliance metrics for sky brightness.
- Assumptions/dependencies: industry buy-in; alignment with IAU/AAS/UNOOSA guidelines.
- Cross-domain artifact-detection SaaS (healthcare imaging, geospatial, industrial NDT)
- Generalize the “candidate + batch features + explainability + probability triage” stack to radiology digitization, satellite imagery QC, and non-destructive testing to reduce false alarms.
- Tools/products/workflows: domain-tuned models delivered as APIs; auditor-facing SHAP dashboards.
- Assumptions/dependencies: domain-specific labeled datasets and regulatory validation (e.g., in healthcare).
- Unsupervised/self-supervised models for artifact suppression in archives (ML R&D)
- Reduce dependence on scarce expert labels via contrastive learning and anomaly detection tuned to plate statistics, improving transfer across archives.
- Tools/products/workflows: pretraining on large unlabeled plate corpora; few-shot fine-tuning protocols.
- Assumptions/dependencies: adequate compute; careful evaluation to avoid amplifying biases.
- Probabilistic event-study toolkits for policy and finance (methods transfer)
- Package the paper’s permutation testing and probability-weighting approach for robust event studies (e.g., assessing regulatory announcements, natural disasters) without parametric assumptions.
- Tools/products/workflows: open-source libraries with templates for lag analyses and clustering-robust inference.
- Assumptions/dependencies: appropriate mapping of probability weights to event likelihoods in each domain.
- Ethical and communication frameworks for controversial correlations (science policy, education)
- Develop best practices for communicating uncertainty, separating signal from artifacts, and avoiding overinterpretation when findings intersect sensitive topics (e.g., weapons testing).
- Tools/products/workflows: training modules, disclosure checklists, and replication mandates in archival-data studies.
- Assumptions/dependencies: institutional adoption; alignment with journal and agency policies.
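The active-learning loops mentioned under global reprocessing could begin with something as simple as uncertainty sampling: ask experts to label the candidates the current model is least sure about. The sketch below is one possible starting point, not a method described in the paper; the function name and the toy probabilities are assumptions.

```python
def select_for_labeling(probabilities, batch_size):
    """Return indices of the candidates whose probability is closest to 0.5."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]

# Invented model outputs for six candidates.
probs = [0.95, 0.52, 0.10, 0.47, 0.80, 0.33]
picked = select_for_labeling(probs, batch_size=2)
print("send to expert review:", picked)
```

Labels gathered this way are concentrated near the decision boundary, so a random sample should still be labeled alongside them if the goal is also to estimate precision in the top deciles.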
Notes on cross-cutting assumptions and dependencies:
- Label quality and volume: The current model is trained on 250 expert-labeled examples; broader deployment benefits from larger, multi-expert, consensus-labeled sets and inter-rater checks.
- Generalization: Features and thresholds tuned on POSS-I red plates may not transfer directly to other surveys or media; expect recalibration.
- Physical modeling: Shadow/penumbra analyses assume GEO altitude for interpretability; real objects may span altitudes, so altitude-aware modeling is recommended for operational use.
- Interpretability: SHAP provides useful diagnostics for model trust, but users should monitor for feature drift and update models as archives and workflows change.
- Reproducibility: The released code and methods should be run under version-controlled environments with documented data provenance to ensure consistent outputs.
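The altitude-aware modeling recommended in the physical-modeling note can be explored with a deliberately simplified geometry. The sketch below uses a cylindrical Earth shadow in geocentric coordinates; it ignores the penumbra, the taper of the umbra, and the topocentric correction used in the paper, and it does not use the EarthShadow package. All names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6378.137  # equatorial radius

def in_cylindrical_shadow(position_km, sun_direction):
    """True if a geocentric position lies inside a cylindrical Earth shadow.

    position_km: (x, y, z) of the object relative to Earth's center.
    sun_direction: (x, y, z) vector pointing from Earth toward the Sun.
    """
    norm = math.sqrt(sum(c * c for c in sun_direction))
    s = [c / norm for c in sun_direction]
    along = sum(p * c for p, c in zip(position_km, s))  # component toward the Sun
    if along >= 0.0:
        return False  # on the sunlit side of the Earth
    perp_sq = sum(p * p for p in position_km) - along * along
    return perp_sq < EARTH_RADIUS_KM ** 2

def shadow_angular_radius_deg(altitude_km):
    """Angular radius of the shadow cylinder as seen from Earth's center."""
    ratio = EARTH_RADIUS_KM / (EARTH_RADIUS_KM + altitude_km)
    return math.degrees(math.asin(ratio))

for label, alt in [("LEO, 500 km", 500.0), ("MEO, 20,200 km", 20_200.0),
                   ("GEO, 35,786 km", 35_786.0)]:
    print(f"{label}: shadow radius ~ {shadow_angular_radius_deg(alt):.1f} deg")
```

Even this crude model shows why the altitude assumption matters: the shadowed patch of sky around the anti-solar point spans under 9 degrees in radius at GEO but tens of degrees at low altitude, so the expected in-shadow fraction changes substantially with the assumed orbit.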
Glossary
- Anti-solar direction: The direction opposite the Sun from a given observing location, used to determine shadow geometry relative to the observer. "using the topocentric anti-solar direction as seen from the coordinates of Palomar Observatory,"
- Aperture flux: The total measured light within a defined aperture around a source. "PSF Full Width at Half Maximum (PSF FWHM), ellipticity, sharpness, connected pixel count, aperture flux, distance to plate edge, symmetry score, gradient magnitude, proximity to bright star, and FITS-measured SNR."
- AUC: Area Under the ROC Curve; a measure of binary classifier performance, with 1.0 perfect and 0.5 random. "area under the curve (AUC) value of 0.81 +/- 0.04 across 5-fold cross-validation."
- Binomial test: A statistical test for proportions comparing observed successes to a theoretical expectation. "using a one-sided binomial test."
- Bonferroni correction: A multiple-comparisons adjustment that lowers the significance threshold to control family-wise error. "a Bonferroni-corrected significance threshold of p < 0.005 was applied."
- Complete linkage: A hierarchical clustering linkage criterion using the maximum distance between clusters. "(complete linkage, 15-degree threshold)"
- EarthShadow model: A computational model to calculate Earth’s shadow geometry for orbiting objects. "we employed the Nir et al. 2D EarthShadow model (https://github.com/guynir42/earthshadow)"
- Ellipticity: A shape measure describing how elongated a source appears (0 circular, 1 elongated). "red_ellipticity = Red FITS: ellipticity of the source (0 = circular, 1 = elongated);"
- Ensemble classifier: A model that combines multiple base learners to improve predictive performance. "The ensemble ML classifier detailed in this study combined four tree-based models (XGBoost, Random Forest, Gradient Boosting, LightGBM), each trained with 300 trees and identical hyperparameters, with final classification predictions based on the unweighted mean of the four models' predicted probabilities."
- FITS: Flexible Image Transport System; a standard file format for storing astronomical images and metadata. "Finally, the model included 10 morphometric features identified by the ML model in the red FITS images themselves:"
- FWHM (Full Width at Half Maximum): A width measure of a peak (e.g., PSF) at half its maximum amplitude. "PSF Full Width at Half Maximum (PSF FWHM)"
- Gaia DR3: The third data release of the Gaia astrometric catalog of stars. "A final criterion for classifying an object as a transient was that there were no optical counterparts either in PanStarrs DR1 or Gaia DR3 at less than 5 arcsec"
- Geostationary orbit (GEO): A circular equatorial orbit where a satellite remains fixed over one longitude on Earth. "geostationary orbit altitude, 35,786 km (GEO)."
- Geosynchronous orbit: An Earth orbit with a period equal to one sidereal day (may be inclined or elliptical). "Earth's geometric shadow at geosynchronous orbit altitude"
- Gradient Boosting: A boosting ensemble method that builds models sequentially to correct predecessor errors. "The ensemble ML classifier detailed in this study combined four tree-based models (XGBoost, Random Forest, Gradient Boosting, LightGBM), each trained with 300 trees and identical hyperparameters, with final classification predictions based on the unweighted mean of the four models' predicted probabilities."
- IAU 1976 formulae: International Astronomical Union standard formulas for precession and related astrometric transformations. "with candidate coordinates precessed from J2000.0 to the observation epoch using the IAU 1976 formulae."
- Isotonic probability calibration: A non-parametric method to calibrate predicted probabilities to observed frequencies. "Isotonic probability calibration was attempted but collapsed the probability distribution;"
- J2000.0: The standard astronomical reference epoch, 12:00 Terrestrial Time on January 1, 2000, used as a reference for coordinates. "with candidate coordinates precessed from J2000.0 to the observation epoch"
- LightGBM: A gradient boosting framework using tree-based learning, optimized for speed and efficiency. "The ensemble ML classifier detailed in this study combined four tree-based models (XGBoost, Random Forest, Gradient Boosting, LightGBM), each trained with 300 trees and identical hyperparameters, with final classification predictions based on the unweighted mean of the four models' predicted probabilities."
- Mann-Whitney U test: A non-parametric test comparing ranks between two independent samples. "A confirmatory non-parametric Mann-Whitney U test was also conducted"
- Monte Carlo: A computational technique using random sampling to estimate expected values or distributions. "plate-aware Monte Carlo control expectation of 0.692%"
- Morphometric features: Quantitative descriptors of an object’s shape, size, and intensity profile. "Seven catalog-level morphometric features were included: signal-to-noise ratio (SNR), point spread function (PSF) ratio, elongation, compactness, sharpness, number of comparison stars, and candidate score (described in Solano et al.9)."
- Nuclear window: A predefined time window around nuclear tests used to assess temporal associations with transients. "As in our prior work, the nuclear testing variable was again a nuclear testing window reflecting whether each date fell within one day of any nuclear weapons test (test date +/- 1 day)."
- Palomar Observatory Sky Survey (POSS-I): A mid-20th-century photographic survey of the sky conducted from Palomar Observatory. "Transient, star-like phenomena exhibiting point spread functions have been identified by comparing sequential images over short timescales in the Palomar Observatory Sky Survey (POSS-I) and other historical sky surveys2,8,9,11-14."
- PanStarrs DR1: The first data release of the Pan-STARRS optical sky survey catalog. "A final criterion for classifying an object as a transient was that there were no optical counterparts either in PanStarrs DR1 or Gaia DR3 at less than 5 arcsec"
- Penumbra: The partial shadow region where the Sun is partially obscured, relevant for illumination of orbiting objects. "and it accounted for the impact of Earth's penumbra on shadow deficit results."
- Permutation test: A non-parametric significance test using label shuffling to build a null distribution for a test statistic. "using a non-parametric permutation approach (i.e., no distributional assumptions) with observation dates as the independent unit of analysis"
- Plate defects: Non-astronomical artifacts on photographic plates (e.g., dust, scratches, emulsion flaws) that can mimic sources. "plate defects (e.g., emulsion errors, dust, scratches)"
- Point spread function (PSF): The response of an imaging system to a point source, describing how a point of light is distributed on the detector. "Transient, star-like phenomena exhibiting point spread functions have been identified by comparing sequential images over short timescales"
- Precession: The slow change in the orientation of Earth’s rotational axis affecting celestial coordinates over time. "with candidate coordinates precessed from j2000.0 to the observation epoch"
- Random Forest: An ensemble learning method using many decision trees with random feature and data sampling. "The ensemble ML classifier detailed in this study combined four tree-based models (XGBoost, Random Forest, Gradient Boosting, LightGBM), each trained with 300 trees and identical hyperparameters, with final classification predictions based on the unweighted mean of the four models' predicted probabilities."
- Sensitivity: The true positive rate of a classifier (probability of detecting actual positives). "sensitivity (True Positive / True Positive + False Negative)"
- SHAP values: SHapley Additive exPlanations; feature attribution values indicating each predictor’s contribution to model output. "Shapley (SHAP) values quantifying relative contributions of each predictor to the final ML model are displayed in Figure 1."
- Signal-to-noise ratio (SNR): A measure comparing the level of a desired signal to the background noise level. "Seven catalog-level morphometric features were included: signal-to-noise ratio (SNR), point spread function (PSF) ratio, elongation, compactness, sharpness, number of comparison stars, and candidate score"
- Specificity: The true negative rate of a classifier (probability of correctly identifying negatives). "specificity (True Negative / True Negative + False Positive)"
- Specular reflections: Mirror-like reflections from smooth surfaces, causing brief glints from orbiting objects. "to the extent that transients represent orbital objects exhibiting specular reflections."
- Supervised learning: An ML approach where models are trained on labeled data to learn input–output relationships. "specifically a supervised learning approach,"
- Topocentric: Relative to the observer’s location on Earth, used for geometries like shadow and direction calculations. "The primary model was the 3D topocentric penumbra,"
- Two-proportion z-test: A statistical test comparing two proportions (e.g., rates in two groups). "using a two-proportion z-test."
- VASCO v4 catalog: A catalog from the VASCO project used for candidate features in transient detection. "The ML model included 23 predictors extracted from the red FITS images and the VASCO v4 catalog."
- XGBoost: An efficient gradient-boosted tree algorithm widely used for tabular ML tasks. "The ensemble ML classifier detailed in this study combined four tree-based models (XGBoost, Random Forest, Gradient Boosting, LightGBM), each trained with 300 trees and identical hyperparameters, with final classification predictions based on the unweighted mean of the four models' predicted probabilities."
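
Several of the entries above reduce to short formulas, and the quoted sensitivity and specificity expressions omit the parentheses around their denominators. The following is a minimal Python sketch of four of them: sensitivity, specificity, the unweighted mean of per-model probabilities, and a pooled two-sided two-proportion z-test. The function names and every number in the usage lines are hypothetical and chosen only for illustration; none of them come from the paper, and the pooled-variance form of the z-test is an assumption about which variant was used.

```python
import math


def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def ensemble_probability(model_probs):
    """Unweighted mean of the probabilities predicted by each model."""
    return sum(model_probs) / len(model_probs)


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided two-proportion z-test; returns (z, p_value).

    x1 and x2 are event counts, n1 and n2 are group sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, for illustration only (not taken from the paper).
sens = sensitivity(tp=8, fn=2)
spec = specificity(tn=9, fp=3)

# Hypothetical per-model probabilities for one candidate, one per model.
prob = ensemble_probability([0.5, 0.75, 0.25, 0.5])

# Hypothetical event counts in two groups of 100 observations each.
z, p = two_proportion_z_test(x1=30, n1=100, x2=20, n2=100)
```

Writing the denominators as explicit sums removes the ambiguity in the quoted ratios, and averaging the four probabilities with equal weight mirrors the "unweighted mean" described in the ensemble entries.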