| Issue | A&A Volume 704, December 2025 |
|---|---|
| Article Number | A70 |
| Number of page(s) | 17 |
| Section | Stellar structure and evolution |
| DOI | https://doi.org/10.1051/0004-6361/202555875 |
| Published online | 01 December 2025 |
Unsupervised learning for variability detection with Gaia Data Release 3 photometry
The main sequence–white dwarf valley
1 Department of Astrophysics/IMAPP, Radboud University, P.O. Box 9010 6500 GL Nijmegen, The Netherlands
2 Instituut voor Sterrenkunde, KU Leuven, Celestijnenlaan 200D, 3001 Leuven, Belgium
3 Astrophysics group, Department of Physics, University of Surrey, Guildford GU2 7XH, United Kingdom
4 Departament de Física Quàntica i Astrofísica, Institut de Ciêncies del Cosmos, Universitat de Barcelona, Martí i Franquès 1, E-08028 Barcelona, Spain
5 Department of Astronomy, University of Cape Town, Private Bag X3, Rondebosch 7701, South Africa
6 South African Astronomical Observatory, P.O. Box 9 Observatory 7935, South Africa
7 The Inter-University Institute for Data Intensive Astronomy, University of Cape Town, Private Bag X3, Rondebosch 7701, South Africa
8 Hamburger Sternwarte, University of Hamburg, Gojenbergsweg 112, 21029 Hamburg, Germany
9 Texas Tech University, Department of Physics & Astronomy, Box 41051 79409 Lubbock, TX, USA
10 Max Planck Institute for Astronomy, Königstuhl 17, 69117 Heidelberg, Germany
⋆ Corresponding author: princy.ranaivomanana@ru.nl
Received: 9 June 2025
Accepted: 26 October 2025
Context. The unprecedented volume and quality of data from space- and ground-based telescopes present an opportunity for machine learning to identify new classes of variable stars and peculiar systems that may have been overlooked by traditional methods. The region between the main sequence and white-dwarf sequence in the colour-magnitude diagram (CMD) hosts a variety of astrophysically valuable and poorly characterised objects, including hot subdwarfs, pre-white dwarfs, and interacting binaries.
Aims. Extending prior methodological work, this study investigates whether the unsupervised learning approach scales effectively to larger stellar populations, including objects in crowded fields, without the need for pre-selected catalogues. Specifically, it focuses on 13 405 Gaia DR3 sources lying in this region of the CMD.
Methods. Our methodology incorporates unsupervised clustering techniques based primarily on statistical features extracted from Gaia DR3 epoch photometry. We used the t-distributed stochastic neighbour embedding algorithm to identify variability classes, their subtypes, and spurious variability induced by instrumental effects. Feature importance was evaluated using SHapley Additive exPlanations values to identify the most influential parameters associated with each cluster.
Results. The clustering revealed distinct groups, including hot subdwarfs (HSDs), cataclysmic variables (CVs), eclipsing binaries, and objects in crowded fields, such as those in the Andromeda (M31) field. Several potential stellar subtypes also emerged within these clusters: pulsating hot subdwarfs exhibiting pure or hybrid (pressure and/or gravity) modes appeared within the HSD cluster, and magnetic CVs and dwarf novae appeared in the CV cluster. Feature evaluation further enabled the identification of a cluster dominated purely by photometric variability, as well as clusters associated with instrumental effects and crowded fields. Notably, objects previously labelled as RR Lyrae were found in an unexpected region of the CMD, potentially due to either unreliable astrometric measurements (e.g. due to binarity) or alternative evolutionary pathways.
Conclusions. This study emphasises the robustness of the proposed method in finding variable objects in a large region of the Gaia CMD, including variable hot subdwarfs and CVs, and its efficiency in detecting variability in extended stellar populations. The proposed unsupervised learning framework scales to large datasets and yields promising results in identifying stellar subclasses.
Key words: methods: data analysis / methods: statistical / techniques: photometric / surveys / subdwarfs / stars: variables: general
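As a rough illustration of the first step named under Methods (statistical features extracted from epoch photometry), the sketch below computes a few summary statistics per light curve. The feature set, the function name `light_curve_features`, and the synthetic magnitudes are assumptions made for illustration; the abstract does not list the features actually used. In a pipeline of the kind described, a table of such features would then be passed to a t-SNE implementation (for example scikit-learn's `sklearn.manifold.TSNE`), which is not shown here.

```python
import statistics


def light_curve_features(mags):
    """Summary statistics for one light curve (a list of magnitudes).

    Illustrative feature set only; not the list used in the paper.
    """
    n = len(mags)
    if n < 3:
        raise ValueError("need at least 3 epochs")
    mean = statistics.fmean(mags)
    median = statistics.median(mags)
    std = statistics.stdev(mags)  # sample standard deviation
    # Median absolute deviation: a scatter estimate robust to outliers.
    mad = statistics.median([abs(m - median) for m in mags])
    # Population skewness (third standardised moment).
    m2 = sum((m - mean) ** 2 for m in mags) / n
    m3 = sum((m - mean) ** 3 for m in mags) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {
        "median": median,
        "std": std,
        "mad": mad,
        "skew": skew,
        "amplitude": max(mags) - min(mags),
    }


# Synthetic example (assumed values): a quiet source and one with two
# eclipse-like dips. Fainter means a larger magnitude.
quiet = [15.00, 15.01, 14.99, 15.00, 15.01, 14.99]
eclipsing = [15.00, 15.01, 15.60, 15.00, 14.99, 15.55]

features = {
    name: light_curve_features(lc)
    for name, lc in [("quiet", quiet), ("eclipsing", eclipsing)]
}
```

In this toy case the eclipsing-like source has a larger scatter and a positive skew (occasional excursions to fainter magnitudes), which is the kind of contrast a clustering step could pick up on.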
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.