Open Access
Issue
A&A
Volume 703, November 2025
Article Number A32
Number of page(s) 11
Section Catalogs and data
DOI https://doi.org/10.1051/0004-6361/202555838
Published online 30 October 2025

© The Authors 2025

Licence: Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1 Introduction

The CoRoT mission (Auvergne et al. 2009; Baglin et al. 2007) enabled groundbreaking studies across various astrophysical domains (e.g., Lanza et al. 2009; Léger et al. 2009; Alencar et al. 2010; Ferreira Lopes et al. 2015b; Chiappini et al. 2015). Among its most notable achievements were the discovery of the first Earth-like rocky planet (Léger et al. 2009; Queloz et al. 2009) and the first close-in brown dwarf with a measured radius (Bouchy et al. 2011; Csizmadia et al. 2015). The high-quality, homogeneous photometric data from CoRoT have enabled significant advancements in the study of variable stars, including RR Lyrae stars (Chadid et al. 2010), eclipsing binaries (Dolez et al. 2009; Maciel et al. 2011), and rotational and pulsating variable stars (De Medeiros et al. 2013). Key findings include the identification of 4106 CoRoT sources with semi-sinusoidal signatures (De Medeiros et al. 2013), a catalogue of 1978 stars with periodicities (Affer et al. 2012), and the discovery of 418 γ Doradus, 274 hybrid γ Doradus/δ Scuti candidates (Hareter 2012), as well as 1428 pulsating M giants (Ferreira Lopes et al. 2015c). Additionally, 74 δ Scuti and hybrid stars were analysed to advance the understanding of A–F-type stars (Moya et al. 2017). CoRoT data have also contributed significantly to the study of stellar magnetic activity cycles. For example, García et al. (2010) identified a solar-like activity cycle in the main-sequence star HD 49933 using asteroseismic analysis of CoRoT light curves. Similarly, Ferreira Lopes et al. (2015b) reported potential new activity cycles in a sample of 16 FGK main-sequence stars observed by CoRoT.

However, a comprehensive catalogue of variable stars with well-defined CoRoT light curves is still lacking. Indeed, catalogues containing a large number of false classifications or poorly selected sources are typically excluded from major variable star repositories such as SIMBAD or the VSX database, which prioritise reliable and well-vetted data. For instance, Ferreira Lopes et al. (2020) conducted a comprehensive selection of variable star candidates using multiple variability indices, producing a catalogue of 45 million candidates. However, many of these candidates likely suffer from poor-quality signals, particularly for very faint or very bright stars that often lack trustworthy photometric variability. Similarly, Debosscher et al. (2007, 2009) classified CoRoT data using Fourier models of the light curves, but shifts in baseline levels often led to unreliable parameters and false classifications. Although the authors provided probability flags to improve target selection, a complete list of high-quality variable stars was not made publicly available. Likewise, Gavras et al. (2023) compiled a comprehensive catalogue of variable stars cross-matched with Gaia data; however, only about 22% of the variable stars identified in this work are included in that compilation.

The CoRoT database has also served as a testbed for developing new methods to extract and classify information from stellar light curves (LCs) (Ferreira Lopes et al. 2018). For instance, Damiani et al. (2016) provided the first spectral and luminosity classification of CoRoT targets using broadband multi-colour photometry. Spectroscopic data have further enhanced the characterisation of CoRoT targets (e.g., Guenther et al. 2012; Sarro et al. 2013; Anders et al. 2017, 2020). However, the reliability of such classifications can be compromised if the light curves are affected by instrumental biases or discontinuities. To address this, various algorithms have been developed to correct CoRoT LCs (e.g., Mislis et al. 2010; De Medeiros et al. 2013; Leão et al. 2015), significantly improving data quality by minimising jumps, trends, and outliers. These improvements have enhanced the statistical robustness of the data, enabling more accurate confidence intervals and reducing over- or underestimation errors, yet a complete catalogue of CoRoT sources with reliable signals has not been produced.

These improvements have opened new avenues for exploring CoRoT data in different science cases, including the global characterisation of faint Be stars in the CoRoT exofield (Zorec et al. 2023), eclipsing binaries exhibiting the light-travel-time effect (Hajdu et al. 2022), the detection and characterisation of planets around intermediate-mass stars (Sebastian et al. 2022), and period fluctuations in CoRoT RR Lyrae stars (Benko 2016). However, the selection and characterisation of variable star candidates observed by CoRoT remain incomplete. Challenges arise when authors rely on such catalogues to select targets but fail to publish them, hindering efforts to compile and validate variable star discoveries. This lack of accessibility presents significant obstacles for cross-matching and validating classifications.

This work is the first in a series within the New Insight Into Time Series Analysis (NITSA) project, which aims to enhance time-series analysis techniques – particularly in the presence of level shifts – and to construct a comprehensive catalogue of variable stars with reliable signals from CoRoT data (Ferreira Lopes & Cross 2016, 2017; Ferreira Lopes et al. 2018, 2020, 2021). To this end, we used the latest version of the CoRoT database (Chaintreuil et al. 2016; Ollivier et al. 2016), which incorporates significant improvements over earlier releases, notably in mitigating instrumental jumps caused by temperature fluctuations or proton impacts on the CCDs. Despite these advances, a full selection and characterisation of the CoRoT variable star population has not yet been completed.

The paper is organised as follows: Section 2 introduces the moving average method (MAM) as a tool to address data quality issues. Section 3 describes the selection process for variable stars. Sections 4.1 and 4.2 detail the cross-matching with external catalogues and the classification of new targets. Section 5 presents the main findings, and Section 6 provides a summary of our conclusions.

2 CoRoT dataset

The CoRoT telescope observed 163 665 point sources distributed across 26 stellar fields in the faint star channel. Among these, 12 896 sources were observed in two separate campaigns, while 892 sources were observed in three. The first CoRoT field (IRa01) was monitored for 54.3 days, while the remaining fields were divided into short (S) and long (L) runs, with durations ranging from 5 to 52.3 days and 76.7 to 148.3 days, respectively. Observations were conducted with a cadence of either 512 seconds or 32 seconds, the latter being used for select oversampled targets (e.g., Auvergne et al. 2009). The survey also provided luminosity classes and spectral types for the observed stars, determined based on their positions in the colour–magnitude diagram (Damiani et al. 2016; Deleuil et al. 2009; Deleuil & Fridlund 2018). Notably, the SRc03 CoRoT run had the shortest time coverage (approximately 3.5 days), but around 85% of its stars were re-observed in subsequent runs.

A shift-level variation in a light curve refers to sudden changes in its mean value, typically caused by abrupt drops or increases in pixel sensitivity. These variations are often associated with instrumental issues or the impact of high-energy particles on the CCD (e.g., Auvergne et al. 2009). In CoRoT data, most perturbations arise from high-energy particle interactions with the CCD (see, e.g. the raw light curves in Fig. 1). To address this issue, we applied the moving average method (MAM), which computes averages over subsets of data within a defined time segment size (TSS). For evenly spaced data, the TSS parameter corresponds to the number of data points used in the calculation of the moving average. The MAM acts as a pre-whitening filter, attenuating signals with periods longer than the TSS (see Fig. 1) and effectively mitigating multiple jumps, outliers, and long-term trends (e.g. CoRoT 632896191 and 206076651 in Fig. 1). However, it does not specifically correct for abrupt shifts, which can introduce biases in segments of data with lengths comparable to the TSS.
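As an illustration of the filtering step described above, the following is a minimal sketch of a time-windowed moving average and its subtraction from a light curve. It is not the pipeline used for the catalogue: the function names `moving_average` and `mam_correct`, the centred window, and the truncated windows at the edges of the light curve are assumptions made for this example.

```python
import numpy as np

def moving_average(t, y, tss):
    """Centred moving average of y over a time window of length tss.

    t and tss share the same units (days here). Windows are truncated
    at the edges of the light curve (an assumption of this sketch).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(t)
    t_sorted, y_sorted = t[order], y[order]
    # Prefix sums give the sum over any contiguous window in O(1).
    csum = np.concatenate(([0.0], np.cumsum(y_sorted)))
    lo = np.searchsorted(t_sorted, t_sorted - tss / 2.0, side="left")
    hi = np.searchsorted(t_sorted, t_sorted + tss / 2.0, side="right")
    smooth_sorted = (csum[hi] - csum[lo]) / (hi - lo)
    smooth = np.empty_like(smooth_sorted)
    smooth[order] = smooth_sorted  # restore the input ordering
    return smooth

def mam_correct(t, y, tss=1.0):
    """Return the corrected light curve y - s and the moving average s."""
    s = moving_average(t, y, tss)
    return y - s, s

# Toy light curve: 512 s cadence over 20 days, a 0.2-day sinusoid,
# and a single level shift at day 10 (all values are illustrative).
t = np.arange(0.0, 20.0, 512.0 / 86400.0)
y = 0.01 * np.sin(2.0 * np.pi * t / 0.2) + np.where(t > 10.0, 0.05, 0.0)
y_corr, trend = mam_correct(t, y, tss=1.0)
```

In this toy case the short-period sinusoid survives the correction, while the level shift is absorbed into `trend` everywhere except within about one TSS of the jump, which is the residual bias noted in the text.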

Examples of CoRoT signals before (upper panels) and after (lower panels) applying MAM corrections are shown in Fig. 1. The effectiveness of the correction is particularly evident in long-period signals, such as for source 206076651, where the period is significantly longer than those of other targets. Additionally, the signal removed by the MAM may correspond to long-term trends, contamination from background stars (e.g., Ferreira Lopes et al. 2015c), or variability with periods longer than the TSS (see sources 102807225 and 205935266 in Fig. 1). This study includes the full CoRoT dataset, except the SRc03 subset, which was excluded due to its limited time coverage.

The MAM is a well-established technique that is straightforward to understand and implement. It effectively reduces biases in CoRoT data, aiding in the recovery of short-period signals (see Sect. 2.1). To assess the potential biases introduced by the MAM in CoRoT light curves, we conducted a series of simulations using representative signal types found in CoRoT data. These simulations enabled us to quantify the extent to which MAM modifies signal parameters and to evaluate its impact on the extracted light-curve characteristics.

2.1 Design of the simulations

To assess the impact of the smoothing process on time-series data, we generated synthetic light curves without noise or artificial jumps. A comprehensive simulation would need to account for all possible variables, including jump amplitude, number of occurrences, signal phase, and the data behaviour before, during, and after each jump, as well as overlay complexity. Given this complexity, we opted for a more straightforward simulation approach. These simulations were designed to isolate the effect of the moving average and evaluate how the smoothing window (with a length of TSS) interacts with the intrinsic period of the signal (P). In this framework, the period enters only through its ratio to TSS, and the key quantity is the TSS/P ratio. Therefore, the specific choice of period is not crucial, as long as the comparison is made between signals with equivalent TSS/P ratios. Since the moving-average filter only modifies the data locally over segments of length TSS, it primarily affects features with timescales comparable to or larger than TSS. The results obtained using the ratio between TSS length and period can be generalised to any period value. To quantify how TSS modifies the signal, we used the following parameters:

\[\chi^{(c)}=\frac{\sum_{i=1}^{N}\left|y_i^{(o)}-y_i^{(f)}\right|}{\sum_{i=1}^{N}\left|y_i^{(o)}\right|}=\frac{\sum_{i=1}^{N}\left|y_i^{(c)}\right|}{\sum_{i=1}^{N}\left|y_i^{(o)}\right|}\] (1)

and

\[\chi^{(s)}=\frac{\sigma^{(s)}}{\sigma^{(o)}},\] (2)

where $y_i^{(f)}$ is the fit obtained using the MAM, $y_i^{(o)}$ is the original data (measured in fluxes or magnitudes), $\sigma^{(o)}$ is the standard deviation of $y_i^{(o)}$, $y_i^{(c)}$ is the corrected data (i.e. $y_i^{(o)}-s_i$), and $s_i$ is the correction factor calculated from the MAM (see red lines in Fig. 1). Equation (1) defines a normalised cumulative absolute deviation between the original and filtered signals. This metric is scale-invariant and captures the impact of the filtering process as a fraction of the original signal’s magnitude, allowing for direct comparisons of light curves with different amplitudes. In contrast, Equation (2) represents the ratio of standard deviations before and after applying the smoothing. This quantity provides a direct measure of the variability suppressed by the moving average, making it a valuable diagnostic for evaluating the influence of filtering on the time-domain structure of the signal.
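The two metrics can be evaluated on a noiseless synthetic signal as in the hedged sketch below. Two points are assumptions of this example rather than statements from the text: the smoothing uses a sample-based boxcar with truncated edge windows, and σ(s) in Eq. (2) is taken to be the standard deviation of the moving-average curve s. The helper names `boxcar_smooth` and `chi_metrics` are hypothetical.

```python
import numpy as np

def boxcar_smooth(y, n_win):
    """Centred moving average over about n_win samples (evenly spaced data)."""
    csum = np.concatenate(([0.0], np.cumsum(y)))
    idx = np.arange(len(y))
    lo = np.clip(idx - n_win // 2, 0, len(y))
    hi = np.clip(idx + n_win // 2 + 1, 0, len(y))
    return (csum[hi] - csum[lo]) / (hi - lo)

def chi_metrics(y_o, s):
    """Eqs. (1) and (2) for original data y_o and moving average s."""
    y_c = y_o - s                                   # corrected data
    chi_c = np.sum(np.abs(y_c)) / np.sum(np.abs(y_o))
    # ASSUMPTION: sigma^(s) is the standard deviation of the smoothed curve.
    chi_s = np.std(s) / np.std(y_o)
    return chi_c, chi_s

# Zero-mean sinusoid with P = 1 (arbitrary units), 40 cycles, no noise.
period, dt = 1.0, 0.005
t = np.arange(0.0, 40.0, dt)
y_o = np.sin(2.0 * np.pi * t / period)

results = {}
for ratio in (0.05, 0.5, 1.0, 5.0):                 # TSS/P values to probe
    n_win = max(1, int(round(ratio * period / dt)))
    results[ratio] = chi_metrics(y_o, boxcar_smooth(y_o, n_win))
```

Under these assumptions, a window much shorter than the period follows the signal closely (χ(c) near zero, χ(s) near one), whereas a window spanning several periods averages to the mean level (χ(c) near one, χ(s) near zero).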

The rate at which the distortion metrics χ(c) and χ(s) approach their asymptotic values is governed principally by the duty cycle of the variable star signal. For example, in detached and semidetached systems the flux is nearly constant for most of the orbital cycle (see Fig. 2 Y(EA) and Y(EB)), because eclipse events last only a short fraction of the period, $d = \Delta t_{\mathrm{ecl}}/P \lesssim 0.05$–0.15 (e.g. Prša et al. 2011; Maciel et al. 2011; Carmo et al. 2020). When the smoothing window satisfies TSS ≳ dP, almost every position of the window is dominated by out-of-eclipse samples. Consequently, the moving average hardly alters the light curve: the cumulative-correction ratio χ(c) rapidly approaches unity, while the standard-deviation ratio χ(s) tends to zero, in line with the limits reported in Sect. 2.2. On the other hand, pulsating stars, spotted rotators, and ellipsoidal variables (e.g. Ferreira Lopes et al. 2015a,b,c, 2021; Baeza-Villagra et al. 2025) exhibit brightness changes throughout the entire period (see Fig. 2 Y(Ceph), Y(RR), Y(RRblz), and Y(Rot)). In this case, every placement of the window mixes epochs of different flux, so the filter removes power over a broader Fourier range. The convergence of χ(c) is therefore slower, and χ(s) decreases monotonically with TSS/P until it reaches the white-noise floor.

Fig. 1

CoRoT light curves before (upper panels) and after (bottom panels) applying the moving average method (MAM). Phase-folded light curves are shown in the top right corner of each panel. The orange line represents the MAM with a time segment size (TSS) of one day, while the blue line corresponds to a TSS equal to the variability period. Each panel is labelled with the CoRoT ID, field, and period of variability at the top.

Fig. 2

Folded CoRoT signals (left panels) and χ(c) and χ(s) parameters (see Eqs. (1) and (2)) as a function of the ratio between the TSS length and the period. The dashed black line sets the position where TSS = P; i.e. TSS/P = 1.

2.2 Simulation results

We conducted $10^6$ simulations to investigate the correlation between the TSS factor and modifications in the time-series signal. The simulations considered typical CoRoT signals representing rotating variable stars (Y(Rot)), Beta Lyrae eclipsing binaries (Y(EB)), Algol eclipsing binaries (Y(EA)), pulsating stars (Y(Ceph), Y(RR), Y(RRblz)), and white noise (Y(Uniform) and Y(Normal)), as shown in the left panels of Fig. 2 (see Ferreira Lopes et al. 2018, for more details). The synthetic signals were generated using sinusoidal functions with representative profiles for standard classes of variable stars, with periods spanning a range adequate to cover TSS/P ratios from below 0.05 to 10. The cadence and total timespan of the simulations were consistent with typical CoRoT space-based photometry, but without observational gaps or phase discontinuities. No CoRoT-like noise or instrumental effects were added, allowing us to isolate the mathematical impact of the smoothing process. Key findings from our simulations are as follows (see Fig. 2):

  • The bias introduced by the MAM, computed over segments of length TSS, on a time series with period shorter than TSS (i.e., TSS/P > 1) decreases as the TSS/P ratio increases for all non-noise signals analysed. This trend is observed through the behaviour of the parameters χ(c) and χ(s), which tend towards one and zero, respectively. These results demonstrate that increasing the segment length effectively reduces the bias introduced by the moving-average method.

  • When TSS/P > 1, only small variations in the parameters χ(c) and χ(s) are observed, particularly for signals such as Y(Ceph), Y(RR), Y(RRblz), and Y(Rot).

  • For noisy data, represented by normal and uniform distributions, the values of χ(c) and χ(s) remain approximately constant. This indicates that the MAM does not significantly affect the statistical properties of noise-dominated light curves.

  • The convergence χ(c) → 1 and χ(s) → 0 occurs more quickly for signals such as Y(EA) and Y(EB). This behaviour is explained by the quasi-constant flux outside the eclipses, where a larger number of points contribute to the moving average, particularly at higher TSS/P values.

In summary, the MAM only introduces a minor modification to the signal when TSS/P > 1, while the characteristics of noisy data remain largely unaffected across the entire range of TSS/P. With TSS = 1 day, we can effectively analyse CoRoT signals with periods shorter than one day. Since both short (SR) and long (LR) CoRoT runs typically span more than 20 days, this further supports the use of TSS = 1 day to access and analyse the full CoRoT dataset.

Each data point in the MAM calculation incorporates measurements within the TSS length. For a light curve with a duration of 20 days, approximately 10% of the data points computed by the MAM may exhibit bias in the case of a single jump. However, when additional observations before and after a jump are included in the calculation, the resulting averages become less biased and more representative of the true underlying signal (see lower right panel in Fig. 1). Consequently, the percentage of biased data points is expected to be lower in realistic scenarios involving multiple data segments.

It is important to note that signals with asymmetries exhibit smaller variations compared to the main signals, as shown in Fig. 2, and thus require more detailed analysis. Future papers in this series will investigate alternative methodologies for characterising variable stars with longer periods.

In this study, a one-day moving average was applied to the entire CoRoT dataset, and the generalised Lomb-Scargle periodogram (LSG; Lomb 1976; Scargle 1982; Zechmeister & Kürster 2009) was used to search for variability periods. The signal-to-noise ratio (e.g., De Medeiros et al. 2013; Ferreira Lopes et al. 2015b) was then computed to identify potential variable star candidates, as described in subsequent sections and illustrated in Fig. 3.
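For readers who want to experiment with the period search, the sketch below implements the floating-mean (generalised) Lomb-Scargle power of Zechmeister & Kürster (2009) with unit weights in plain NumPy. The function name `gls_power`, the frequency grid, and the toy signal are assumptions of this example. The SNR computation is not reproduced because its definition is given in the cited works rather than in this section.

```python
import numpy as np

def gls_power(t, y, freqs):
    """Generalised Lomb-Scargle power (unit weights), normalised to [0, 1]."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.full(t.size, 1.0 / t.size)       # equal weights summing to one
    Y = np.sum(w * y)
    YY = np.sum(w * y * y) - Y * Y
    power = np.empty(len(freqs))
    for k, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        c, s = np.cos(arg), np.sin(arg)
        C, S = np.sum(w * c), np.sum(w * s)
        YC = np.sum(w * y * c) - Y * C
        YS = np.sum(w * y * s) - Y * S
        CC = np.sum(w * c * c) - C * C
        SS = np.sum(w * s * s) - S * S
        CS = np.sum(w * c * s) - C * S
        D = CC * SS - CS * CS
        power[k] = (SS * YC**2 + CC * YS**2 - 2.0 * CS * YC * YS) / (YY * D)
    return power

# Toy light curve: 512 s cadence over 20 days, P = 0.37 d, Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 512.0 / 86400.0)
y = 0.01 * np.sin(2.0 * np.pi * t / 0.37) + rng.normal(0.0, 0.002, t.size)

freqs = np.linspace(0.5, 10.0, 2000)        # cycles per day (illustrative grid)
power = gls_power(t, y, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
```

In practice the light curve would first be corrected with the one-day moving average, so only periods shorter than about one day are expected to survive into the periodogram.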

3 Sample selection and initial classification

Our initial sample was selected based on two parameters: the signal-to-noise ratio (SNR) and a period bias flag. We used a bin of $\pm10^{-7}$ days to estimate the number of stars sharing similar periods. If more than ten stars fell within this interval, the corresponding period was flagged as biased. This flag is listed as Flag(Period) in the catalogue (see Table 1). Based on these criteria, we obtained a sample of 10 196 stars with SNR > 2.5 and non-biased period, as shown in Fig. 3.
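The period bias flag can be sketched as a neighbour count in period space. The function name `period_flag` is hypothetical, and whether a star counts itself among its neighbours is not stated in the text; this example excludes it, which is an assumption.

```python
import numpy as np

def period_flag(periods, half_width=1e-7):
    """Count, for each star, the other stars whose period lies within
    +/- half_width days of its own period."""
    p = np.asarray(periods, dtype=float)
    p_sorted = np.sort(p)
    lo = np.searchsorted(p_sorted, p - half_width, side="left")
    hi = np.searchsorted(p_sorted, p + half_width, side="right")
    return hi - lo - 1   # ASSUMPTION: the star itself is not counted

# Twelve stars sharing one period, plus two isolated ones (toy values).
periods = np.array([0.5] * 12 + [0.3, 0.7])
flags = period_flag(periods)
biased = flags > 10      # same threshold as Flag(Period) > 10 in the text
```

Sorting once and using binary searches keeps the count at O(n log n), which matters for samples of the order of 10^5 light curves.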

To simplify the visual inspection process, we propose a supervised selection method called the light-curve shape-selection method (LC-SSM). This method groups similar signals based on a template model, which serves as a reference light curve. Signals with a reduced chi-squared ($\chi^2$) below a defined threshold relative to the template model are considered to have similar shapes. The LC-SSM requires two free parameters: the minimum acceptable $\chi^2$ value and the number of sources (NS) assigned to each template model. However, this can be reduced to a single parameter by determining the $\chi^2$ threshold as a function of NS. The LC-SSM is implemented using the following procedure, as illustrated in Fig. 4:

  • A – the LC is folded using twice the period obtained from the LSG method (see Section 4.1 for more details).

  • B – a phase-template model, labelled MD0001, is built by binning the phase diagram. Although harmonic fits could be used, binning the phase diagram is faster and sufficiently accurate for well-sampled data. Step C is then repeated to assign all remaining sources that match MD0001.

  • E – if the number of sources assigned to MD0001 exceeds the user-defined threshold (NS), it is retained as a template model. Otherwise, the $\chi^2$ threshold is incremented, and steps C–D–E are repeated.

  • F – this procedure is repeated until a maximum $\chi^2$ threshold is reached or the number of unassigned sources falls below a minimum threshold.
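The procedure above can be sketched as a greedy grouping loop. Several choices below are assumptions of this example, because the list does not spell out steps C and D or the exact χ² definition: χ² is taken as the mean squared difference between normalised binned curves, the first unassigned source seeds each candidate template, the threshold restarts at its minimum for every candidate, and a candidate that never gathers NS sources is set aside as unmatched. The input is assumed to be in magnitudes. The function names `phase_template` and `lc_ssm` are hypothetical.

```python
import numpy as np

def phase_template(t, mag, period, n_bins=200):
    """Steps A-B: fold at twice the period, bin, and normalise to [0, 1].

    Assumes well-sampled data (no empty phase bins) and a variable signal.
    """
    phase = (np.asarray(t, dtype=float) / (2.0 * period)) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins, weights=mag, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)
    curve = sums / np.maximum(counts, 1)
    curve = (curve - curve.min()) / (curve.max() - curve.min())
    # Phase zero is placed at the minimum magnitude.
    return np.roll(curve, -int(np.argmin(curve)))

def lc_ssm(curves, ns=40, chi2_min=0.01, chi2_step=0.01, chi2_max=0.4,
           min_left=0):
    """Greedy grouping of binned phase curves into template models."""
    curves = np.asarray(curves, dtype=float)
    labels = np.full(len(curves), -1)            # -1 means unassigned
    set_aside = np.zeros(len(curves), dtype=bool)
    templates = []
    while True:
        free = np.flatnonzero((labels < 0) & ~set_aside)
        if free.size <= min_left:                # step F: stop condition
            break
        candidate = curves[free[0]]              # step B: candidate template
        # Assumed step C: similarity of every free source to the candidate.
        chi2 = np.mean((curves[free] - candidate) ** 2, axis=1)
        threshold = chi2_min
        while np.count_nonzero(chi2 < threshold) < ns and threshold < chi2_max:
            threshold += chi2_step               # step E: relax the threshold
        members = free[chi2 < threshold]         # assumed step D: assignment
        if members.size >= ns:
            labels[members] = len(templates)     # step E: keep the template
            templates.append(candidate)
        else:
            set_aside[free[0]] = True            # unmatched source
    return labels, np.array(templates)
```

A typical call would build one curve per source with `phase_template` and pass the stacked array to `lc_ssm`; sources left with label -1 correspond to those not assigned to any template model.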

The LC-SSM approach enables an efficient and systematic grouping of similar LC shapes, providing a time-effective alternative to the labour-intensive process of visual inspection (Ferreira Lopes et al. 2015a, 2020; Nikzat et al. 2022). For example, summarising a sample of approximately 10 000 sources into 175, 105, 90, and 90 template models (for NS = 20, NS = 40, NS = 60, and NS = 100, respectively) required 10.5, 8.2, 7.0, and 5.0 hours. A minimum $\chi^2$ value of 0.01 was adopted in our analysis, representing a trade-off between the number of template models and maintaining sufficient descriptive accuracy, as illustrated in Fig. 4 (e.g. MD0001 vs. MD0100). The analysis was performed on a machine equipped with an Intel® Core™ i7-4510U CPU and 16 GB of RAM. Execution time could be further reduced with a more powerful machine or by optimising the code. For this study, we selected NS = 40, resulting in a total of 102 template models. All template models are normalised to the [0,1] interval, adopting the phase corresponding to the minimum magnitude as phase zero. The $\chi^2$ values computed between each pair of models exceed 0.01, which is the threshold used to distinguish them. Considering MD0001 as the reference model, the $\chi^2$ values relative to the other templates range from 0.018 to 0.38. This wide range highlights the diversity of LC shapes represented by the template models.

The sources were grouped according to these template models, and a visual inspection of the LCs and their folded diagrams was conducted side by side, considering both raw and processed data (see Fig. 1). This inspection process facilitated the efficient identification and classification of variable stars. The main conclusions are summarised below:

  • Approximately 9% of the sources in our initial sample were removed during the visual inspection process. These sources exhibited issues such as residual bounces, trends, or other data artefacts that rendered their SNR values unreliable.

  • The majority of the removed sources (~95%) belonged to the short CoRoT runs, and ~90% had SNR < 3. The shorter time coverage in these runs reduced the significance of detected signals, leading to a higher rate of misidentifications.

  • All sources associated with models where $\chi^2 < 1.0 \times 10^{-1}$ exhibited reliable signals (i.e. signals confirmed in the light curves and not affected by data issues). Higher $\chi^2$ values were typically found for signals affected by data artefacts, resulting in uncommon shapes. This outcome is expected, as the $\chi^2$ value quantifies the similarity between the signal and the associated model.

  • Sources not assigned to any of the 102 template models were generally linked to data quality issues or unusual signal patterns. Approximately 2% of these were due to incorrect periods or atypical signals. These cases may be of interest for follow-up, as they could represent rare phenomena such as double-mode pulsation or eclipsing binaries with strong asymmetries or reflection effects (see Fig. 6).

  • Folding the LC templates using twice the period computed by the LSG method (see Sect. 2) helped reconstruct the correct signal shape for eclipsing binaries, as the LSG method frequently identifies half the true period as dominant (e.g., Papageorgiou et al. 2018; Ferreira Lopes et al. 2021; Christopoulou et al. 2022). This approach ensured proper classification of eclipsing binaries and preserved two cycles for other variable types. Template models such as MD0040, MD0043, MD0049, and MD0096 included these cases.

  • Visual inspection of sources grouped by template models significantly streamlined and accelerated the classification process compared to inspecting all sources individually.

  • Approximately 95% of reliable signals could be identified without visual inspection by selecting only sources with $\chi^2 \lesssim 1.5 \times 10^{-1}$.

These observations highlight the effectiveness of the LC-SSM approach. As a result, a final catalogue of 9272 stars was selected based on the criteria discussed above. This catalogue, known as the CoRoT-CVSP (CoRoT Catalogue of Short-Period Variable Stars), provides valuable information on the identified variable stars. The main details of the catalogue are presented in Table 1, which can be accessed through the portal of the Centre de Données astronomiques de Strasbourg (CDS). A useful parameter can be derived from the template model to determine whether the detected variability period is the true period or a harmonic of it. The $\chi_F^2$ parameter is defined as the standard deviation of $y_i - y_{i+N/2}$, where N is the number of components used in the template models (N = 200 in our case). This parameter serves as a period flag by comparing the symmetry between the two halves of the phase diagram.

For single-period LCs, where both sides of the phase diagram are symmetric, $\chi_F^2 \simeq 0$, as no asymmetry is present. However, for double-period LCs, where the two halves of the LC differ, $\chi_F^2$ increases with signal asymmetry. By applying a threshold of $\chi_F^2 > 5.0 \times 10^{-2}$, we identified 39 models for which the true variability period is likely twice the folded period.
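The half-versus-half comparison can be written in a few lines. This sketch assumes a template already normalised to [0, 1] and folded so that it spans two cycles of the detected period; the function name `chi_f2` and the two toy templates are illustrative only.

```python
import numpy as np

def chi_f2(template):
    """Standard deviation of y_i - y_{i+N/2} over the first half of the template."""
    y = np.asarray(template, dtype=float)
    half = y.size // 2
    return np.std(y[:half] - y[half:2 * half])

n = 200
phi = (np.arange(n) + 0.5) / n

# Two identical cycles: the detected period is the true one.
single = 0.5 + 0.5 * np.sin(4.0 * np.pi * phi)

# Two dips of unequal depth: the halves differ, as for many eclipsing binaries.
double = (1.0
          - 1.0 * np.exp(-0.5 * ((phi - 0.25) / 0.03) ** 2)
          - 0.4 * np.exp(-0.5 * ((phi - 0.75) / 0.03) ** 2))

fp_single = 2 if chi_f2(single) > 5.0e-2 else 1
fp_double = 2 if chi_f2(double) > 5.0e-2 else 1
```

A template with two equal dips would also return a value near zero, which is why the statistic alone can miss contact and ellipsoidal systems with nearly equal minima.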

To provide this information, we introduced a flag, denoted as FP, which classifies template models as either single period (FP = 1) or double period (FP = 2). However, for EW and ELL stars, the symmetry of their light curves may not be fully captured by the $\chi_F^2$ value alone. Thus, relying solely on $\chi_F^2$ may not always yield an accurate classification for these cases.

We found that most stars assigned to template models MD0013 and MD0036 belong to the EW type and were flagged as double-period models (FP = 2). It is important to note that some template models, such as MD0046, may contain both EW and pulsating stars and were not flagged as FP = 2. In total, 1529 sources were flagged as FP = 2, indicating that their period should be doubled.

We also investigated the sources located in the biased period regions (see Fig. 3) with Flag(Period) > 10, which correspond to periods of ~1.40 h, ~0.85 h, ~0.57 h, and ~24.07 h, and their harmonics. The periods of ~24 h, ~12 h, and ~6 h were previously reported by Degroote et al. (2009) as being caused by instrumental drift. However, that study was based on a sample of 100 stars without variability signatures, whereas many sources in these period ranges exhibit clear variability signals with smooth phase diagrams. Approximately 96% of the light curves in these regions come from only one LR field (out of 17) and seven SR fields (out of 8), with ~42% of them belonging to the LRa04 field. The remaining sources are distributed among SR fields as follows: SRa02 (~20%), SRc02 (~12%), SRa04 (~7%), SRc03 (~6%), SRa01 (~4%), SRa03 (~3%), and SRa01 (~2%). Among these fields, only sources in LRa04 exhibit periods of ~1.40 h. We did not find any reported technical issues specifically related to these fields. The significant concentration of these periods suggests the presence of periodic noise associated with CCD temperature variations (Lapeyrere et al. 2006; Auvergne et al. 2009).

Fig. 3

Distribution of the signal-to-noise ratio (SNR) as a function of period. The horizontal black line indicates the adopted threshold (SNR > 2.5) used to select variable star candidates. Histograms in the top and right panels show the marginal distributions of period and SNR, respectively. Additionally, grey dotted lines mark concentrations of periods at specific values, indicating possible instrumental or sampling artefacts such as thermal cycling, scattered light, and satellite observing cadence.

Table 1

First ten entries of CoRoT-CVSP catalogue.

Fig. 4

Workflow diagram of LC-SSM proposed in this study.

4 Classification and discussions

To incorporate both previous and newly derived classifications, we adopted a multi-step approach. The LC-SSM provides an initial estimate for various types of variable stars, serving as a helpful starting point. However, since it relies solely on the light-curve morphology, it cannot reliably classify all types of variability. As a first step, we used classifications available in the AAVSO and SIMBAD databases to construct a training set for our classification procedure (see Sect. 4.1). Subsequently, we compared and combined the preliminary classifications obtained from the LC-SSM with those from the literature to derive a more robust and consistent final classification for each object (see Sects. 4.2 and 4.3).

4.1 Cross-matching and preliminary classification

Approximately 1739 CoRoT-CVSP sources are listed in the AAVSO International Variable Star Index (VSX; Watson et al. 2014), 1961 are in the SIMBAD database2 (Wenger et al. 2000), and 2114 are in the catalogue compiled by Gavras et al. (2023) (hereafter referred to as GavrasCatalog) with a variability classification. The matches found in VSX, SIMBAD, and GavrasCatalog were combined to create a master catalogue with 2324 known variable star types. In addition, the LC-SSM provided initial classifications for various types of variable stars, serving as supplementary information for a machine-learning-based classification (MCL). To visualise the data, a t-distributed stochastic neighbour embedding (t-SNE; van der Maaten & Hinton 2008) approach was implemented using Scikit-learn (Pedregosa et al. 2011), with a perplexity parameter set to 45 and 5000 iterations.

The projection was performed onto a lower-dimensional (3D) manifold, enabling the exploration of interesting clusters, such as variable stars with asymmetric light curves, total eclipsing binaries, and contact eclipsing binaries with spurious double periods. To specifically identify Beta Lyrae eclipsing binaries (EB) among known eclipsing binaries, the modified locally linear embedding method (MLLE; Zhang & Wang 2006) was applied for dimensionality reduction, projecting the data onto a 3D manifold. Variable stars positioned close to each other in this space are likely to belong to the same class. The number of nearest neighbours was set to ten. After visually inspecting the projected light-curve data, new labels were assigned to the best sub-types of eclipsing binaries (EW, EA, EB).

A semi-supervised machine-learning approach was applied using the label-spreading (LS) method (Zhou et al. 2003). The known sample was divided into labelled and unlabelled sets, with 20% of the data designated as a test set. The LS algorithm iteratively propagated label information from labelled points to their neighbours until convergence was reached or the maximum number of iterations was completed. The final classifications for the unlabelled points were determined based on the information obtained at the end of the iterative process. For the Scikit-learn implementation of the LS algorithm, the number of nearest neighbours was set to ten, and the gamma parameter was set to 2000.
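To make the propagation step concrete, the sketch below implements the closed-form solution of the label-spreading scheme of Zhou et al. (2003) in plain NumPy. It is not the Scikit-learn implementation used in this work: the RBF affinity, the values of `gamma` and `alpha`, the function name `label_spreading`, and the toy 3D data are all assumptions chosen for illustration.

```python
import numpy as np

def label_spreading(X, y, gamma=2.0, alpha=0.2):
    """Closed-form label spreading; y holds class labels, -1 if unlabelled."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y[y >= 0])
    # RBF affinity matrix with zero diagonal.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))
    # One-hot initial labels; unlabelled rows are all zero.
    Y = (y[:, None] == classes[None, :]).astype(float)
    # Fixed point of F <- alpha * S @ F + (1 - alpha) * Y, up to a scale factor.
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return classes[np.argmax(F, axis=1)]

# Toy example: two well-separated groups in a 3D embedding, 5 labels each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(40, 3)),
               rng.normal(3.0, 0.3, size=(40, 3))])
true = np.repeat([0, 1], 40)
y = np.full(80, -1)
y[:5] = 0
y[40:45] = 1
pred = label_spreading(X, y, gamma=2.0, alpha=0.2)
```

The linear solve replaces the iteration described in the text and gives the same labels at convergence; for samples of several thousand sources a sparse nearest-neighbour graph would be the more practical choice.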

We classified the variable stars into the following types and subtypes: EW, EA, EB, δ Scuti (DSCT), γ Doradus (GDOR), and RR Lyrae (RRab, RRc, and RRd), following the classes of the sources matched in SIMBAD and VSX. Stars with ambiguous variability signatures were classified as ‘other’. The accuracy of the classification was assessed using five-fold cross-validation on the labelled subset, which was derived from cross-matched sources in VSX and SIMBAD. Performance metrics, including precision and the F1-score, were computed to evaluate the model. The overall classification accuracy was approximately 83%, with the highest accuracy for EA-type stars (~93%) and the lowest for γ Doradus and RR Lyrae stars, at ~73% and ~70%, respectively.
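The validation scheme can be sketched as follows, assuming a stratified five-fold split in which the labels of the held-out fold are hidden before label spreading and then compared with the transduced labels. The toy data from the hypothetical `make_blobs_3`, the macro averaging of precision and F1, and the kernel settings are assumptions; the reported accuracies above come from the real catalogue, not from this sketch.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score
from sklearn.model_selection import StratifiedKFold
from sklearn.semi_supervised import LabelSpreading


def make_blobs_3(n_per_class=50, seed=0):
    """Hypothetical three-class toy sample standing in for the labelled subset."""
    rng = np.random.default_rng(seed)
    centres = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
    X = np.vstack([c + rng.standard_normal((n_per_class, 2)) for c in centres])
    y = np.repeat(np.arange(len(centres)), n_per_class)
    return X, y


def cross_validate(X, y, n_splits=5, seed=0):
    """Five-fold cross-validation of label spreading on a fully labelled sample."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    y_pred = np.empty_like(y)
    for _, test_idx in folds.split(X, y):
        y_masked = y.copy()
        y_masked[test_idx] = -1  # hide the labels of the held-out fold
        model = LabelSpreading(kernel="knn", n_neighbors=10, max_iter=100)
        model.fit(X, y_masked)
        y_pred[test_idx] = model.transduction_[test_idx]
    return {
        "accuracy": accuracy_score(y, y_pred),
        "precision": precision_score(y, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y, y_pred, average="macro"),
    }
```

Per-class values, such as those quoted for EA or RR Lyrae stars, would follow from the same predictions by passing `average=None` to the metric functions.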

Table 2 presents an overview of our final classification of CoRoT-CVSP variable stars, including the number of both previously known and newly detected sources (see Sect. 4.3). It is important to note that SIMBAD and VSX compile most variable star detection catalogues, meaning some sources identified as ‘new’ in this study may have prior identifications in the literature. Additionally, sources with limited classification information, such as those labelled as ‘star’ or ‘Var’, were categorised as ‘unknown’ sources.

A note of caution is warranted regarding the limitations of the MAM used in this study. The MAM tends to remove signals with periods greater than approximately 2 days, as illustrated in Fig. 1. Therefore, our classification should be interpreted as relating to the short-period signals present in the light curves. In cases where a signal contains both a short period (P < 1 day) and a long period (P > 1 day), classifications from past and present studies are expected to agree when they are derived from light curves corresponding to the same period. This consideration is crucial when comparing our classification results with those of other studies and when interpreting the nature of the identified variable stars.
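The filtering effect described above can be illustrated with a generic moving-average detrending. This is only a sketch under assumptions: it subtracts a boxcar running mean with a window of one day from an evenly sampled, noise-free sinusoid, and the function names, the sampling step, and the time span are hypothetical. It is not the MAM implementation used to build the catalogue.

```python
import numpy as np


def moving_average_detrend(flux, window):
    """Subtract a boxcar running mean of `window` samples from `flux`."""
    kernel = np.ones(window)
    # Normalise by the number of samples inside the window so that the
    # edges of the light curve are not biased towards zero.
    counts = np.convolve(np.ones_like(flux), kernel, mode="same")
    trend = np.convolve(flux, kernel, mode="same") / counts
    return flux - trend


def residual_fraction(period, tss=1.0, span=30.0, dt=0.01):
    """Fraction of a sinusoid's scatter that survives the detrending.

    `period`, `tss` (window length), `span`, and `dt` are all in days.
    """
    time = np.arange(0.0, span, dt)
    flux = np.sin(2.0 * np.pi * time / period)
    window = int(round(tss / dt)) | 1  # force an odd number of samples
    return np.std(moving_average_detrend(flux, window)) / np.std(flux)
```

Comparing `residual_fraction(5.0)` with `residual_fraction(0.3)` shows that a window of one day suppresses most of a five-day signal while leaving a signal of a few hours essentially intact, which is why the classification refers to the short-period content of the light curves.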

Table 2

Distribution of variable stars in different classes given in CoRoT-CVSP catalogue.

Table 3

Concordance rates from comparing the DCL method and our classification.

4.2 Benchmarking against previous classifications

The classification scheme developed by Debosscher et al. (2007, 2009) (DCL) was employed both as a benchmark for evaluating our classification and as a means to refine previous categorisations. Their approach relies on period analysis and harmonic fitting, using a training set derived from OGLE data (Walkowicz & Basri 2013), which encompasses a broad range of variable star types, including BCEP (β Cephei), BE (Be stars), CLCEP (classical Cepheids), δ Sct, ECL (eclipsing binaries, all types), ELL (ellipsoidal variables), GDOR, RRab, RRc, RRd, and SPB (slowly pulsating B stars), among others. Notably, the DCL scheme groups all eclipsing binaries into a single class (ECL), whereas our LC-SSM method distinguishes between the three primary types of eclipsing binaries: EA (Algol-type), EB (β Lyrae-type), and EW (W Ursae Majoris-type). This distinction allows for a more nuanced analysis of the variability properties across different binary morphologies.

The DCL method employs a second-order polynomial fit, which is insufficient for addressing specific data quality issues. In cases where the data are relatively clean and the detected frequencies are consistent, both the MCL and DCL methods are expected to produce similar classification results. However, incorporating a probability factor significantly improves the reliability of the classifications. For instance, stars identified as δ Sct variables with a probability greater than 0.8 are considered robust classifications, as noted by Michel et al. (2017). Considering these constraints, we selected all variable stars classified with a probability greater than 0.8 by the DCL method and whose dominant frequencies were also recovered by the MCL approach. This subset was used to compare the two classification methods. The similarity of the periods obtained by the DCL and MCL methods suggests that these light curves have only minor issues, so the results from the two methods should generally agree. Comparing the outcomes of the DCL and MCL methods provides insight into their effectiveness in identifying different classes of variable stars. Running the two methods independently and with different parameters helps mitigate potential biases, ensuring a more robust evaluation of their performance.
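The selection of the comparison subset can be written as a simple filter. In the sketch below, the probability threshold of 0.8 follows the text, whereas the record layout (`Star`), the field names, and the 1% relative tolerance used to decide that two frequencies match are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Star:
    """Hypothetical record holding both classifications of one source."""
    name: str
    dcl_class: str
    dcl_prob: float
    dcl_freq: float  # dominant DCL frequency in cycles per day
    mcl_class: str
    mcl_freq: float  # dominant MCL frequency in cycles per day


def frequencies_match(f1, f2, rel_tol=0.01):
    """True when two frequencies agree within a relative tolerance (assumed value)."""
    return abs(f1 - f2) <= rel_tol * max(abs(f1), abs(f2))


def select_comparison_sample(stars, min_prob=0.8, rel_tol=0.01):
    """Keep stars with a DCL probability above `min_prob` and matching frequencies."""
    return [
        s for s in stars
        if s.dcl_prob > min_prob and frequencies_match(s.dcl_freq, s.mcl_freq, rel_tol)
    ]
```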

Table 3 presents the classification results of the DCL and MCL methods for various types of variable stars. The concordance rates for GDOR and δ Sct stars represent the percentage of stars classified as such by both methods, which are approximately 88% and 82%, respectively. The mixed rate indicates the percentage of stars misclassified by both methods as the other type, at approximately 1.4% and 16.4% for GDOR and δ Sct stars, respectively. These findings suggest that both methods are relatively effective in identifying GDOR and δ Sct stars. However, achieving a complete separation of these classes requires further analysis, as they overlap in the parameter space concerning signal shape, period, and amplitude. Additional investigation, including other characteristics, may be necessary to achieve a more precise classification of these variable star types.
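One way to compute such rates from paired labels is sketched below. The normalisation is an assumption: each rate is expressed relative to the number of stars that the DCL method assigns to the class in question, which may differ from the convention used for Table 3. The function name and the toy list of label pairs are hypothetical.

```python
def rate(pairs, dcl_label, mcl_label):
    """Percentage of stars with DCL class `dcl_label` that MCL assigns to `mcl_label`.

    `pairs` is a list of (dcl_class, mcl_class) tuples, one per star.
    """
    mcl_classes = [mcl for dcl, mcl in pairs if dcl == dcl_label]
    if not mcl_classes:
        return float("nan")
    return 100.0 * sum(mcl == mcl_label for mcl in mcl_classes) / len(mcl_classes)


# Hypothetical toy sample of ten stars that DCL classifies as GDOR.
pairs = [("GDOR", "GDOR")] * 8 + [("GDOR", "DSCT"), ("GDOR", "EA")]

concordance_gdor = rate(pairs, "GDOR", "GDOR")  # both methods agree on GDOR
mixed_gdor = rate(pairs, "GDOR", "DSCT")        # GDOR according to DCL, DSCT according to MCL
```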

Eclipsing binaries show a concordance rate greater than 77%, indicating a relatively high level of agreement between the DCL and MCL methods in identifying this category. The highest concordance rate, around 97%, is observed for EA-type stars. Conversely, the elevated mixed rate for GDOR stars can be attributed to several factors, including data issues and the similarity of GDOR light curves to those of EA/EB/EW stars.

4.3 Refined classification results

By integrating the results from the DCL, MCL, and LC-SSM methods, we obtained a more comprehensive and robust classification of the CoRoT-CVSP variable stars. The final classification scheme was based on the criteria listed below.

  • Only variable stars classified with a probability greater than 0.8 by the DCL method and with matching frequencies identified by the MCL method were considered. This subset represents approximately 85% of the total number of variables analysed by the DCL method, ensuring higher classification reliability and minimising ambiguities.

  • Classifications from the DCL and MCL methods were merged to form a unified label. In cases of disagreement between the two methods, both classifications were retained.

  • A classification of ‘other’ was assigned in cases where both the DCL and MCL methods identified the star as MISC or ‘other’.

  • After merging the results from the DCL and MCL methods, the sources were grouped according to their variability class and associated with LC-SSM shape models. Stars with clear signatures of eclipsing binaries or RRab were re-labelled, whereas stars with ambiguous morphologies remained labelled as ‘other’.

  • Finally, for stars labelled as MISC or other after the previous steps, we adopted the classification available in external catalogues such as SIMBAD, VSX, or the GavrasCatalog, if available.
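The criteria listed above can be summarised as a small decision function. This is a sketch under assumptions rather than the code used for the catalogue: the function and argument names are hypothetical, disagreeing classes are joined with a slash, a clear LC-SSM shape label is assumed to take precedence over the merged label, and the first criterion (probability and frequency selection) is assumed to have been applied beforehand.

```python
UNDEFINED = {"MISC", "other"}
CLEAR_SHAPES = {"EA", "EB", "EW", "RRab"}  # shapes regarded as unambiguous (assumption)


def merge_labels(dcl, mcl, shape_label=None, external=None):
    """Combine the DCL and MCL classes with LC-SSM and catalogue information.

    dcl, mcl     classes from the two methods for a star that passed the selection
    shape_label  class suggested by the LC-SSM shape model, or None if ambiguous
    external     class from SIMBAD, VSX, or GavrasCatalog, or None if unavailable
    """
    # Merge the two classes; keep both when they disagree.
    if dcl in UNDEFINED and mcl in UNDEFINED:
        label = "other"
    elif dcl == mcl:
        label = dcl
    else:
        label = f"{dcl}/{mcl}"
    # Re-label stars with a clear eclipsing-binary or RRab shape.
    if shape_label in CLEAR_SHAPES:
        label = shape_label
    # Fall back on external catalogues for stars still labelled 'other'.
    if label == "other" and external:
        label = external
    return label
```

For example, a star labelled MISC by DCL and ‘other’ by MCL, with no clear shape but a VSX class of RRc, would end up as RRc under these assumptions.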

This multi-method approach allowed us to leverage the strengths of each technique. The LC-SSM proved to be the most efficient, capturing nearly 98% of the light curve shapes found in the CoRoT-CVSP dataset. Its use significantly streamlined the classification process, reducing the need for visual inspection to only 102 representative light curves. It was particularly effective in identifying clear signals such as eclipsing binaries and RRab stars. However, for some cases with ambiguous or overlapping features, additional analysis was required to improve classification reliability. In contrast, the MCL method served as a valuable tool for comparison and helped validate classifications suggested by the LC-SSM and DCL approaches. Although the MCL approach may be incomplete if the training set lacks certain variable star types, it still provides valuable classifications for further analysis and comparison with other methods.

As mentioned earlier, data issues can affect classification accuracy. In particular, GDOR light curves may mimic those of eclipsing binaries, especially when the GDOR signal has a low SNR or when the distinction between the primary and secondary eclipses is unclear. Visual inspection of light curves – such as those shown in Fig. 5 for M0096 and M0057 – can reveal subtle features that help distinguish GDOR stars from eclipsing binaries. Similar challenges arise in the classification of RRab stars, where overlapping characteristics may lead to misidentification. Therefore, visual inspection remains essential for resolving ambiguities and improving classification reliability.

Fig. 5

CoRoT phase diagrams used for classifying CoRoT-CVSP. The colour density represents all sources included in the analysis. The black dashed line indicates the template model (see Section 4.1 for details). The model ID, number of sources, and maximum chi-square value used for selecting sources during model construction are indicated at the top of each panel. All template model data can be accessed through the CDS.

5 Results

The CoRoT-CVSP catalogue, the main product of this study, contains a total of 9272 variable stars, of which 6249 are not listed in the VSX, SIMBAD, or GavrasCatalog databases. The catalogue also includes atmospheric parameters from TESS (Transiting Exoplanet Survey Satellite; Ricker et al. 2015) and Gaia Data Release 3 (DR3; Gaia Collaboration 2016, 2021). These additional parameters offer important information about the physical properties and variability behaviour of the stars, particularly those located in opposite directions of the Milky Way.

Our analysis and classification procedure offer a homogeneous and reliable assessment of these variable stars, addressing the primary data challenges associated with CoRoT observations for variable stars with periods shorter than one day. The resulting catalogue represents a valuable resource for studying various variable star classes, including RR Lyrae stars, eclipsing binaries, and pulsating stars, among others. The sample consists of 5550 stars located towards the Galactic centre and 3722 stars towards the Galactic anti-centre (see Table 2). To facilitate further analysis, we define two subsamples: CoRoT-CVSP-C for stars in the Galactic centre region and CoRoT-CVSP-A for those in the Galactic anti-centre.

Figure 7 presents violin plots illustrating the distributions of period and amplitude for our sample of variable stars. The different colours represent the positions of these stars in the Galactic centre (CoRoT-CVSP-C) and anti-centre (CoRoT-CVSP-A). These violin plots reveal a significant disparity in the period and amplitude distributions between the CoRoT-CVSP-C and CoRoT-CVSP-A samples. Interestingly, variable star classifications often overlook these variations (e.g., Marsakov et al. 2011; De Medeiros et al. 2013; Zoccali et al. 2017; Eilers et al. 2022; Ratcliffe et al. 2023). This finding highlights the importance of integrating spatial information and other relevant parameters, such as metallicity and age, into the classification process. Accounting for Galactic location and additional factors, such as period and amplitude distributions, can lead to more refined and accurate classification models.

Comparing the CoRoT-CVSP-A and CoRoT-CVSP-C subsamples provides crucial insights into the different variability characteristics of stars across Milky Way regions. To assess variations in period and amplitude, we applied the Kuiper test, here referred to as the invariant KS test (e.g., Jetsu & Pelt 1996; Paltani 2004), to the two subsamples. Notably, according to Gaia distances, a clear separation between the centre and anti-centre CoRoT-CVSP samples emerges at a distance of approximately 4 kpc. The KS test results for period and amplitude demonstrate a strong dependence on Galactic location, yielding p values smaller than ~10⁻³ when comparing stars in the centre and anti-centre regions. These findings suggest a high probability that CoRoT-CVSP-A and CoRoT-CVSP-C originate from distinct stellar populations.
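For reference, a two-sample Kuiper test can be sketched in a few lines. The statistic is V = D⁺ + D⁻, the sum of the largest positive and negative differences between the two empirical cumulative distributions, and the p value below uses the standard asymptotic series for the Kuiper statistic. The function name and the number of series terms are choices made here, and this sketch is not necessarily the implementation used for the results quoted above.

```python
import math
from bisect import bisect_right


def kuiper_two_sample(a, b, n_terms=100):
    """Return the Kuiper statistic V and its asymptotic p value for two samples."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    d_plus = d_minus = 0.0
    # The empirical distributions only change at observed values.
    for x in a + b:
        diff = bisect_right(a, x) / n1 - bisect_right(b, x) / n2
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    v = d_plus + d_minus

    # Asymptotic significance with the effective sample size n_e.
    n_e = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_e) + 0.155 + 0.24 / math.sqrt(n_e)) * v
    if lam < 0.4:
        return v, 1.0  # the series is numerically equal to 1 in this regime
    p_value = 2.0 * sum(
        (4.0 * j * j * lam * lam - 1.0) * math.exp(-2.0 * j * j * lam * lam)
        for j in range(1, n_terms + 1)
    )
    return v, min(max(p_value, 0.0), 1.0)
```

Applied to the periods (or amplitudes) of the two subsamples, a p value below 10⁻³ would indicate, as in the text, that the two distributions are unlikely to be drawn from the same parent population.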

This result has broader implications for understanding the Milky Way, under the assumption that the regions analysed are not exceptional. It aligns with the well-established variations in age and chemical composition across the Galaxy. The Galactic bulge, for instance, exhibits a wide range of metallicities, spanning from −3.0 to +1.0 dex (e.g., Marsakov et al. 2011; Do et al. 2015; Zoccali et al. 2017; Eilers et al. 2022; Ratcliffe et al. 2023). Moreover, the anti-centre region is expected to host older, more metal-poor stellar populations compared to the central region (e.g., Feuillet et al. 2019). These complex spatial and elemental differences emphasise the necessity of considering diverse Galactic environments when analysing stellar variability across the Milky Way.

Fig. 6

Example of CoRoT LCs (left panels) and phase diagram (right panels) of ‘atypical’ sources found in our sample.

Fig. 7

Violin plot illustrating distribution of period and amplitude for CoRoT-CVSP-C (in pale orange) and CoRoT-CVSP-A (in teal) samples.

6 Conclusions

This study presents the results of a search for stars with variability periods shorter than one day, based on an extensive analysis of CoRoT mission data. A total of 9272 variable stars were identified, including 6249 not listed in SIMBAD or VSX, primarily classified through comparison with previously known variable types. Among them, we identified 309 β Cephei, 3105 δ Scuti, 599 Algol-type, 844 β Lyrae, and 497 W Ursae Majoris eclipsing binaries, as well as 1443 γ Doradus, 63 RR Lyrae, and 32 T Tauri stars. The final sample, compiled into the CoRoT-CVSP catalogue, was built using a novel semi-supervised visual inspection method that achieved approximately 90% efficiency in detecting variability signatures.

The CoRoT-CVSP catalogue, resulting from this study, is a valuable resource for studying stellar variability and the evolution of the Milky Way. It enables investigations of period-luminosity relations for δ Scuti, RR Lyrae, and γ Doradus stars, and supports the analysis of reflection effects in eclipsing binaries with ‘atypical’ light-curve signatures.

This paper also presents a comprehensive classification of variable stars, combining multiple methods and literature-based information. The LC-SSM and MCL methods proved to be effective for summarising and classifying large photometric datasets. With the integration of double-period analysis, the MCL approach enables the identification of variable stars through their smoother phase curves. These methodologies are broadly applicable beyond CoRoT, including to missions such as Kepler and TESS. The LC-SSM is particularly suited for large-scale surveys where manual inspection is unfeasible.

In future papers of this series, we will extend our approach to identify CoRoT variable stars with periods longer than one day. This additional step will enable the detection of a broader range of variability types and contribute to a more comprehensive understanding of their properties.

Data availability

The full version of Table 1 and the model templates are available at the CDS via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/703/A32.

Acknowledgements

C.E.F.L. and this project are supported by ANID’s Millennium Science Initiative through grants ICN12_12009 and AIM23-0001, awarded to the Millennium Institute of Astrophysics (MAS); by ANID/FONDECYT Regular grant 1231637; by DIUDA 88231R11; by the LSST Discovery Alliance grant; and by GEMINI/ANID grant 32240028. D.H. acknowledges support from ANID through doctoral fellowship grant 21232262 for pursuing a Ph.D. M.C. acknowledges additional support from ANID’s Basal project FB210003. This study was partially funded by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)—Finance Code 001, and the CAPES-Print program. D.O.F. acknowledges CAPES graduate fellowships, and R.L.G. acknowledges a CNPq PDE fellowship. B.L.C.M. (grant no. 305804/2022-7), I.C.L. (grant no. 313103/2022-4), and J.R.M. (grant no. 308928/2019-9) acknowledge CNPq research fellowships. R.L.G. acknowledges CNPq postdoctoral fellowships (Grant nos. 200031/2023-6 and 200744/2024-0).

References

  1. Affer, L., Micela, G., Favata, F., & Flaccomio, E. 2012, MNRAS, 424, 11
  2. Alencar, S. H. P., Teixeira, P. S., Guimarães, M. M., et al. 2010, A&A, 519, A88
  3. Anders, F., Chiappini, C., Minchev, I., et al. 2017, A&A, 600, A70
  4. Anders, F., Minchev, I., & Chiappini, C. 2020, in IAU General Assembly, 257
  5. Auvergne, M., Bodin, P., Boisnard, L., et al. 2009, A&A, 506, 411
  6. Baeza-Villagra, K., Rodríguez-Segovia, N., Catelan, M., et al. 2025, A&A, 694, A72
  7. Baglin, A., Auvergne, M., Barge, P., et al. 2007, in American Institute of Physics Conference Series, 895, Fifty Years of Romanian Astrophysics, eds. C. Dumitrache, N. A. Popescu, M. D. Suran, & V. Mioc, 201
  8. Benko, J. M. 2016, Inform. Bull. Variable Stars, 6189, 1
  9. Bouchy, F., Deleuil, M., Guillot, T., et al. 2011, A&A, 525, A68
  10. Carmo, A., Ferreira Lopes, C. E., Papageorgiou, A., et al. 2020, Bol. Asoc. Argentina Astron. La Plata Argentina, 61C, 88
  11. Chadid, M., Benkő, J. M., Szabó, R., et al. 2010, A&A, 510, A39
  12. Chaintreuil, S., Bellucci, A., Baudin, F., et al. 2016, II. 5 Where to find the CoRoT data?, 109
  13. Chiappini, C., Anders, F., Rodrigues, T. S., et al. 2015, A&A, 576, L12
  14. Christopoulou, P.-E., Lalounta, E., Papageorgiou, A., et al. 2022, MNRAS, 512, 1244
  15. Csizmadia, S., Hatzes, A., Gandolfi, D., et al. 2015, A&A, 584, A13
  16. Damiani, C., Meunier, J. C., Moutou, C., et al. 2016, A&A, 595, A95
  17. De Medeiros, J. R., Ferreira Lopes, C. E., Leão, I. C., et al. 2013, A&A, 555, A63
  18. Debosscher, J., Sarro, L. M., Aerts, C., et al. 2007, A&A, 475, 1159
  19. Debosscher, J., Sarro, L. M., López, M., et al. 2009, A&A, 506, 519
  20. Degroote, P., Aerts, C., Ollivier, M., et al. 2009, A&A, 506, 471
  21. Deleuil, M., & Fridlund, M. 2018, CoRoT: The First Space-Based Transit Survey to Explore the Close-in Planet Population, 79
  22. Deleuil, M., Meunier, J. C., Moutou, C., et al. 2009, AJ, 138, 649
  23. Do, T., Kerzendorf, W., Winsor, N., et al. 2015, ApJ, 809, 143
  24. Dolez, N., Vauclair, S., Michel, E., et al. 2009, A&A, 506, 159
  25. Eilers, A.-C., Hogg, D. W., Rix, H.-W., et al. 2022, ApJ, 928, 23
  26. Ferreira Lopes, C. E., & Cross, N. J. G. 2016, A&A, 586, A36
  27. Ferreira Lopes, C. E., & Cross, N. J. G. 2017, A&A, 604, A121
  28. Ferreira Lopes, C. E., Dékány, I., Catelan, M., et al. 2015a, A&A, 573, A100
  29. Ferreira Lopes, C. E., Leão, I. C., de Freitas, D. B., et al. 2015b, A&A, 583, A134
  30. Ferreira Lopes, C. E., Neves, V., Leão, I. C., et al. 2015c, A&A, 583, A122
  31. Ferreira Lopes, C. E., Cross, N. J. G., & Jablonski, F. 2018, MNRAS, 481, 3083
  32. Ferreira Lopes, C. E., Cross, N. J. G., Catelan, M., et al. 2020, MNRAS, 496, 1730
  33. Ferreira Lopes, C. E., Cross, N. J. G., & Jablonski, F. 2021, MNRAS, 501, 4123
  34. Feuillet, D. K., Frankel, N., Lind, K., et al. 2019, MNRAS, 489, 1742
  35. Gaia Collaboration (Prusti, T., et al.) 2016, A&A, 595, A1
  36. Gaia Collaboration (Brown, A. G. A., et al.) 2021, A&A, 649, A1
  37. García, R. A., Mathur, S., Salabert, D., et al. 2010, Science, 329, 1032
  38. Gavras, P., Rimoldini, L., Nienartowicz, K., et al. 2023, A&A, 674, A22
  39. Guenther, E. W., Gandolfi, D., Sebastian, D., et al. 2012, A&A, 543, A125
  40. Hajdu, T., Matécsa, B., Sallai, J. M., & Bódi, A. 2022, MNRAS, 516, 5165
  41. Hareter, M. 2012, Astron. Nachr., 333, 1048
  42. Jetsu, L., & Pelt, J. 1996, A&AS, 118, 587
  43. Lanza, A. F., Pagano, I., Leto, G., et al. 2009, A&A, 493, 193
  44. Lapeyrere, V., Bernardi, P., Buey, J. T., Auvergne, M., & Tiphène, D. 2006, MNRAS, 365, 1171
  45. Leão, I. C., Pasquini, L., Ferreira Lopes, C. E., et al. 2015, A&A, 582, A85
  46. Léger, A., Rouan, D., Schneider, J., et al. 2009, A&A, 506, 287
  47. Lomb, N. R. 1976, Ap&SS, 39, 447
  48. Maciel, S. C., Osorio, Y. F. M., & De Medeiros, J. R. 2011, New A, 16, 68
  49. Marsakov, V. A., Koval’, V. V., Borkova, T. V., & Shapovalov, M. V. 2011, Astron. Rep., 55, 667
  50. Michel, E., Dupret, M.-A., Reese, D., et al. 2017, in European Physical Journal Web of Conferences, 160, 03001
  51. Mislis, D., Schmitt, J. H. M. M., Carone, L., Guenther, E. W., & Pätzold, M. 2010, A&A, 522, A86
  52. Moya, A., Suárez, J. C., García Hernández, A., & Mendoza, M. A. 2017, MNRAS, 471, 2491
  53. Nikzat, F., Ferreira Lopes, C. E., Catelan, M., et al. 2022, A&A, 660, A35
  54. Ollivier, M., Tiphène, D., Samadi, R., Levacher, P., & CoRot Team 2016, V.2 CoRoT heritage in future missions, 237
  55. Paltani, S. 2004, A&A, 420, 789
  56. Papageorgiou, A., Catelan, M., Christopoulou, P.-E., Drake, A. J., & Djorgovski, S. G. 2018, ApJS, 238, 4
  57. Pedregosa, F., Varoquaux, G., Gramfort, A., et al. 2011, J. Mach. Learn. Res., 12, 2825
  58. Prša, A., Batalha, N., Slawson, R. W., et al. 2011, AJ, 141, 83
  59. Queloz, D., Bouchy, F., Moutou, C., et al. 2009, A&A, 506, 303
  60. Ratcliffe, B., Minchev, I., Anders, F., et al. 2023, MNRAS, 525, 2208
  61. Ricker, G. R., Winn, J. N., Vanderspek, R., et al. 2015, J. Astron. Telesc. Instrum. Syst., 1, 014003
  62. Sarro, L. M., Debosscher, J., Neiner, C., et al. 2013, A&A, 550, A120
  63. Scargle, J. D. 1982, ApJ, 263, 835
  64. Sebastian, D., Guenther, E. W., Deleuil, M., et al. 2022, MNRAS, 516, 636
  65. van der Maaten, L., & Hinton, G. 2008, J. Mach. Learn. Res., 9, 2579
  66. Walkowicz, L. M., & Basri, G. S. 2013, MNRAS, 436, 1883
  67. Watson, C., Henden, A. A., & Price, A. 2014, VizieR Online Data Catalog: 1, 2027
  68. Wenger, M., Ochsenbein, F., Egret, D., et al. 2000, A&AS, 143, 9
  69. Zechmeister, M., & Kürster, M. 2009, A&A, 496, 577
  70. Zhang, Z., & Wang, J. 2006, in NIPS
  71. Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schölkopf, B. 2003, in NIPS
  72. Zoccali, M., Vasquez, S., Gonzalez, O. A., et al. 2017, A&A, 599, A12
  73. Zorec, J., Hubert, A. M., Martayan, C., & Frémat, Y. 2023, A&A, 676, A81

1

The fraction of time that a signal or phenomenon is active or observable during a given period.

2

Accessed in February 2025.

3

Transiting Exoplanet Survey Satellite.

4

Data Release 3.

All Tables

Table 1

First ten entries of CoRoT-CVSP catalogue.

All Figures

Fig. 1

CoRoT light curves before (upper panels) and after (bottom panels) applying moving average method (MAM). Phase-folded light curves are shown in the top right corner of each panel. The orange line represents the MAM with a time segment size (TSS) of one day, while the blue line corresponds to a TSS equal to the variability period. Each panel is labelled with the CoRoT ID, field, and period of variability at the top.

Fig. 2

Folded CoRoT signals (left panels) and χ(c) and χ(s) parameters (see Eqs. (1) and (2)) as a function of the ratio between the TSS length and the period. The dashed black line sets the position where TSS = P; i.e. TSS/P = 1.

Fig. 3

Distribution of signal-to-noise ratio (SNR) as function of the period. The horizontal black line indicates the adopted threshold (SNR > 2.5) used to select variable star candidates. Histograms in the top and right panels show the marginal distributions of period and SNR, respectively. Additionally, grey dotted lines mark concentrations of periods at specific values, indicating possible instrumental or sampling artefacts such as thermal cycling, scattered light, and satellite observing cadence.

Fig. 4

Workflow diagram of LC-SSM proposed in this study.

