| Issue | A&A, Volume 708, April 2026 | |
|---|---|---|
| Article Number | A224 | |
| Number of page(s) | 15 | |
| Section | Catalogs and data | |
| DOI | https://doi.org/10.1051/0004-6361/202553826 | |
| Published online | 08 April 2026 | |
Stellar flare detection in XMM-Newton with gradient-boosted trees
1 INAF IASF-Milano, Via Alfonso Corti 12, 20133 Milano, Italy
2 Ciela, Computation and Astrophysical Data Analysis Institute, Montreal, Quebec, Canada
3 Département d’Informatique, École Normale Supérieure, Université PSL (Paris Sciences & Lettres), Paris, France
4 Trinity College, University of Cambridge, Cambridge, UK
5 IUSS Pavia, Piazza della Vittoria 15, 27100 Pavia, Italy
Received: 20 January 2025
Accepted: 31 May 2025
Abstract
Context. The EXTraS project, based on data collected with the XMM-Newton observatory, provides a large number of light curves for X-ray sources, each accompanied by a set of features. From the EXTraS database, we extracted a tabular dataset of 31,832 variable sources described by 108 features. Of these, 13,851 sources were manually labeled as stellar flares or non-flares based on direct visual inspection.
Aims. We employed a supervised learning approach to produce a catalog of stellar flares based on our dataset, subsequently releasing it to the community. We leveraged explainable AI tools and interpretable features to better understand our classifier.
Methods. We trained a gradient-boosting classifier on 80% of the labeled data. We computed the permutation feature importance scores, visualized the feature space using UMAP, and analyzed some false positive and false negative data points with the help of Shapley additive explanations. Specifically, we used the latter to measure the importance of each feature in determining the classifier’s prediction for each instance.
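As an illustration of the permutation feature importance mentioned above, the following is a minimal sketch in plain Python. It is not the classifier or the data used in this work: `toy_model`, the two synthetic features, and the choice of accuracy as the score are all assumptions made for the example. The idea is to shuffle one feature column at a time and record how much the score drops.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(model(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other features untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(500)]
labels = [int(r[0] > 0.5) for r in rows]

def toy_model(row):
    """Stand-in for a trained classifier; it only looks at feature 0."""
    return int(row[0] > 0.5)

scores = permutation_importance(toy_model, rows, labels)
print(scores)
```

In this toy setting the score for feature 1 is exactly zero, because the stand-in model never reads it, while shuffling feature 0 removes roughly half of the accuracy. Scores computed this way are global, whereas Shapley additive explanations attribute a single prediction to individual features.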
Results. On the test set made up of the remaining 20% of our labeled data, we obtained an accuracy of 97.1%, with a precision of 82.4% and a recall of 73.3%. Our classifier outperforms a simple criterion based on fitting the light curve with a flare template and significantly surpasses a gradient-boosted classifier trained only on model-independent features. False positives appear to be related to flaring light curves that are not associated with a stellar counterpart, while false negatives often correspond to multiple flares or otherwise peculiar or noisy curves.
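To make the relation between the three reported metrics concrete, here is a small sketch that computes accuracy, precision, and recall from confusion-matrix counts. The counts are hypothetical and are not taken from this work. They only illustrate how a high accuracy can coexist with lower precision and recall when the positive class is rare, which is assumed here for the example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Precision: fraction of predicted positives that are true positives.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: fraction of actual positives that were recovered.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical test set of 1000 sources, 50 of which are true flares.
classifier = classification_metrics(tp=40, fp=10, fn=10, tn=940)

# Trivial baseline that labels every source as a non-flare.
always_negative = classification_metrics(tp=0, fp=0, fn=50, tn=950)

print(classifier)
print(always_negative)
```

With these illustrative counts, the trivial baseline already reaches an accuracy of 0.95 with zero recall, against 0.98 for the hypothetical classifier. This is why precision and recall are quoted alongside accuracy.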
Conclusions. We applied our trained classifier to currently unlabeled sources, leading to the compilation and release of the largest catalog of X-ray stellar flares to date. We estimated that integrating our classifier into the astronomers’ workflow will reduce the time spent on visually inspecting light curves by approximately half, compared to an approach based on flare template fitting. This holds implications for the classification of sources whose variability is less well established within EXTraS as well as for other catalogs and, possibly, forthcoming missions.
Key words: stars: activity / stars: flare / X-rays: binaries / X-rays: bursts / X-rays: general / X-rays: stars
© The Authors 2026
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.