| Issue |
A&A
Volume 703, November 2025
|
|
|---|---|---|
| Article Number | A242 | |
| Number of page(s) | 15 | |
| Section | Numerical methods and codes | |
| DOI | https://doi.org/10.1051/0004-6361/202556339 | |
| Published online | 20 November 2025 | |
Interpreting the detection of anomalies in SDSS spectra
1
Departamento de Física, Universidad de Antofagasta
Avenida Angamos 601,
Antofagasta,
Chile
2
Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Laboratoire Lagrange,
06000
Nice,
France
★ Corresponding authors: edgar.ortiz@ua.cl; mederic.boquien@oca.eu
Received:
10
July
2025
Accepted:
29
September
2025
Context. The increasing use of machine-learning methods in astronomy introduces important questions about interpretability. The complexity and nonlinear nature of machine-learning methods means that it can be challenging to understand their decision-making process, especially when applied to the detection of anomalies. While these models can effectively identify unusual spectra, it remains a great challenge to interpret the physical nature of the flagged outliers.
Aims. We aim to bridge the gap between an anomaly detection and the physical understanding by combining deep learning with interpretable machine-learning (iML) techniques to identify and explain anomalous galaxy spectra from SDSS data.
Methods. We present a flexible framework that uses a variational autoencoder to compute multiple anomaly scores, including physically motivated variants of the mean-squared error. We adapted the iML LIME algorithm to spectroscopic data, systematically explored segmentation and perturbation strategies, and computed explanation weights that identified the features that are most likely to cause a detection. To uncover population-level trends, we normalized the LIME weights and applied clustering to 1% of the most strongly anomalous spectra.
Results. Our approach successfully separated instrumental artifacts from physically meaningful outliers and grouped anomalous spectra into astrophysically coherent categories. These include dusty metal-rich starbursts, chemically enriched H II regions with moderate excitation, and extreme emission-line galaxies with a low metallicity and hard ionizing spectra. The explanation weights agree with established emission-line diagnostics and enable a physically grounded taxonomy of spectroscopic anomalies.
Conclusions. Our work shows that an interpretable anomaly detection provides a scalable, transparent, and physically meaningful approach to exploring large spectroscopic datasets. Our framework opens the door for incorporating interpretability tools into quality control, follow-up targeting, and discovery pipelines in current and future surveys.
Key words: methods: data analysis / methods: miscellaneous / methods: statistical
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.