Open Access
Issue
A&A
Volume 707, March 2026
Article Number A85
Number of page(s) 12
Section Galactic structure, stellar clusters and populations
DOI https://doi.org/10.1051/0004-6361/202556426
Published online 10 March 2026

© The Authors 2026

Licence Creative Commons. Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model.

1 Introduction

The Canis Major (CMa) star-forming region is a remarkable laboratory for studying the formation and evolution of stars. Located approximately 1.2 kpc from the Sun (Gregorio-Hetem 2008), this region is home to more than 200 luminous B-type stars and a few late-type O stars. First identified as a stellar association by Ambartsumian (1947), subsequent studies (Herbst et al. 1978) have revealed that CMa hosts a predominantly young stellar population (more than a hundred young stars), rich in pre-main-sequence stars with an average age of just 0.5 Myr. This cosmic nursery is closely linked to the reflection nebula CMa R1, which is complemented by three prominent HII regions (Sh 2-292, Sh 2-296, and Sh 2-297; Herbst et al. 1978), and contains over a dozen known open clusters (Hunt & Reffert 2023), establishing it as a dynamic hub of star formation.

The region also includes six dense dark clouds, LDN 1653–LDN 1658 (Dobashi et al. 2005), which obscure parts of CMa and contain numerous embedded stars, indicating ongoing star formation. Initially referred to as I Canis Majoris by Ambartsumian (1947), this area was later reclassified as CMa OB1 by Ruprecht (1966). The relationship between CMa OB1 and CMa R1 has since been clarified, with CMa R1 identified as a reflection nebula embedded within the broader CMa OB1 association. Notably, Clariá (1974) confirmed that the majority of stars studied by Racine (1968) in CMa R1 are indeed members of CMa OB1.

X-ray studies have also contributed to the understanding of CMa. The first wide-field X-ray study of the young stellar population associated with CMa R1 was performed with ROSAT data, revealing a previously unknown older, fainter low-mass stellar population (Gregorio-Hetem et al. 2009). Using XMM-Newton, Santos-Silva et al. (2018) analysed the Sh 2-296 nebula, identifying 58 members of the region, including 41 T Tauri stars and 15 additional pre-main-sequence objects. These studies revealed that half of the young stars in the region have masses below 1 M⊙ and ages between 1 and 2 Myr. Despite these complexities, CMa remains an invaluable site for investigating young stellar populations and their interactions with the interstellar medium.

In Fischer et al. (2016), young stellar objects (YSOs) in the CMa region were classified as Class I and Class II within a 100 deg² field. A total of 335 sources were identified as Class II, while 144 were classified as Class I. Class I objects correspond to an earlier evolutionary phase, in which the protostar remains deeply embedded within its natal envelope. In contrast, Class II objects are more evolved, characterized by the presence of a prominent protoplanetary disc and reduced circumstellar obscuration. Based on these classifications, Fischer et al. (2016) concluded that the stellar population in this region consists predominantly of Class II objects. This trend was also noticed by Fernandes et al. (2015) in the characterization of 41 T Tauri stars associated with the Sh 2-296 nebula. They found that half of the sample has ages <1–2 Myr, but only a small fraction (25%) shows evidence of IR excess due to the presence of circumstellar discs.

The distance to the CMa region remains debated, with estimates varying from 1000 to 1300 pc (Shevchenko et al. 1999; Pettersson & Reipurth 2019; Zucker et al. 2020), depending on the objects studied. For instance, Clariá (1974) used 36 confirmed association members to derive an average distance of 1150 pc. Also, Shevchenko et al. (1999) analysed 165 stars brighter than magnitude 13 in the CMa R1 region, identifying 88 early-type candidate members with a colour excess E(B – V) = 0.16 mag, corresponding to a distance of approximately 1 kpc.

In Pettersson & Reipurth (2019), the study focused on Hα emitters detected by WISE, which are predominantly concentrated in star-forming regions. The authors identified 398 objects classified as Hα emitters. The distance distribution for these objects spans the range of 1050 to 1350 pc, with a pronounced peak at 1185 pc. Additionally, using OB stars as reference points, they determined a median distance of 1282 pc.

In Zucker et al. (2020), the authors focused on calculating precise distances to local molecular clouds using Bayesian inference. Their study included four molecular clouds in the CMa OB1 region, with distances ranging from 1169 to 1268 pc.

Santos-Silva et al. (2021) identified four subgroups within the CMa region, designated as CMa05, CMa06, CMa07, and CMa08, which consist of young stellar populations with ages between 10 and 20 Myr, located at distances ranging from 1000 to 1200 pc. Among these, CMa06 stands out due to its distinct proper motion distribution compared to the overlapping proper motions of the other subgroups. More recently, Dong et al. (2024) analysed the distributions of molecular gas, based on the positions and velocities of independent structures revealed by the ¹²CO data. These gas structures were combined with the differences in distances and motions of the YSOs to suggest a division of the CMa region into seven subregions. The distances found for the subregions range from 1080 pc to 1159 pc.

The CMa region is part of the Radcliffe Wave (Alves et al. 2020), a large-scale galactic structure composed of interconnected star-forming regions, extending over more than 2.7 kpc. Canis Major is located near the outer edge of this structure, making it one of the most distant regions along the wave, similar to the Cygnus-X complex (Alves et al. 2020).

Despite the wealth of available data, precise radial velocity measurements for this region remain scarce, as Gaia’s radial velocity data lack sufficient precision due to large uncertainties. Accurate kinematic measurements are therefore crucial for characterizing the region and understanding its relationship with the larger galactic structures.

With the recent third data release of the Gaia space mission (Gaia Collaboration 2023), the CMa region can be reanalysed using high-precision astrometric and photometric data. Membership analysis methods allow us to characterize the stellar population, while the inclusion of unprecedented radial velocity data establishes CMa as a fresh starting point for this study, through the unique combination of Gaia DR3 data with ground-based spectroscopy. The primary goal of this work is to characterize the CMa region in terms of its stellar population, spatial structure, kinematics, and age.

This paper is structured as follows. In Sect. 2, we perform a new membership analysis of the CMa region centred around the youngest subgroups identified by Santos-Silva et al. (2021) based on Gaia DR3 data. In Sect. 3, we describe our observations and procedure to derive precise radial velocity measurements of CMa stars from ground-based observations. In Sect. 4, we reassess key properties of the cluster, such as distance, kinematics, age, and spatial distribution, using our new sample of cluster members. Finally, we summarize our findings in Sect. 5.

2 Membership analysis

In this section, we outline our strategy to identify new members of the CMa region and confirm previous candidates. The methodology employed to select the most likely members of CMa is based on the methods developed by Sarro et al. (2014) and Olivares et al. (2019).

Our membership analysis is based on data from the Gaia DR3 catalogue. We downloaded the Gaia DR3 catalogue in the CMa region defined by 104° < α < 108° and −13.4° < δ < −10° that encompasses the youngest clusters identified by Santos-Silva et al. (2021). The initial catalogue comprised 736 632 sources. After downloading the data, we applied a cleaning process by imposing a 3σ cut on the proper motion modulus, resulting in a final selection of 162 331 sources in the field. The representation space (i.e. space of parameters) that we use in the membership analysis includes both astrometric and photometric features from the Gaia DR3 catalogue (μα cos δ, μδ, π, G, and GRP).

We identified probable CMa members by modelling the field and cluster populations. The field population was characterized with a Gaussian mixture model (GMM) applied to both astrometric and photometric spaces. The optimal number of Gaussian components was determined using the Bayesian information criterion (BIC). We tested models with the number of components ranging from 40 to 180. The model with 80 components yielded the lowest BIC value and was subsequently adopted to construct the field model. This field model was computed at the beginning and remained static throughout the analysis.
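The BIC-based choice of the number of Gaussian components can be illustrated with a minimal sketch. It assumes scikit-learn's `GaussianMixture` as a stand-in for the field-model code (the actual implementation follows Sarro et al. 2014 and Olivares et al. 2019 and is not reproduced here); the function name, the toy data, and the small grid of component counts are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def select_gmm_by_bic(features, candidate_components, seed=0):
    """Fit one GMM per candidate size and return the model with the lowest BIC."""
    best_model, best_bic = None, np.inf
    bic_by_size = {}
    for n_components in candidate_components:
        model = GaussianMixture(
            n_components=n_components, covariance_type="full", random_state=seed
        )
        model.fit(features)
        bic = model.bic(features)  # lower BIC means a better fit/complexity trade-off
        bic_by_size[n_components] = bic
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model, bic_by_size


# Toy data (assumption): two well-separated blobs standing in for the field sources.
rng = np.random.default_rng(42)
toy = np.vstack([
    rng.normal(0.0, 0.5, size=(300, 2)),
    rng.normal(5.0, 0.5, size=(300, 2)),
])
field_model, bic_by_size = select_gmm_by_bic(toy, candidate_components=[1, 2, 3, 4])
print(field_model.n_components, bic_by_size)
```

In the analysis described above the same idea would be applied over a much wider grid (40 to 180 components) in the full astrometric and photometric feature space.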

The cluster model is generated independently, combining a GMM in the astrometric space (μα cos δ, μδ, and π) with a photometric model whose mean is represented by a principal curve that follows the cluster's sequence. Our method calculates membership probabilities for each source and classifies the sources as members or non-members based on a user-defined probability threshold, pin. In this study, we tested threshold values ranging from 0.5 to 0.9 (corresponding to 50% to 90%).

Our membership analysis takes an initial list of candidate members to define the locus of the cluster in the space of parameters in the first iteration of the code. We used as the initial list of members the sample of cluster members for the youngest populations of CMa subgroups (CMa05, CMa06, CMa07, and CMa08) provided by Santos-Silva et al. (2021). This sample of candidate members was obtained from Gaia DR2 data. We updated this dataset with Gaia DR3 data and included it in our analysis.

From this point, the algorithm iteratively refined the list of members based on their membership probabilities. The algorithm iterated until convergence, which occurs when the membership list stabilises after successive iterations. Then, we generated synthetic data using the field and cluster models that were computed in our analysis as explained before. We determined the optimal probability threshold of our analysis after computing the true positive rate (TPR) and contamination rate (CR) of the sample as explained in Olivares et al. (2019).
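As an illustration of how the TPR and CR can be evaluated on synthetic data with known labels, the sketch below uses the standard definitions TPR = TP/(TP + FN) and CR = FP/(TP + FP). We assume these match the definitions adopted by Olivares et al. (2019); the function name and the toy values are hypothetical.

```python
import numpy as np


def tpr_and_cr(probabilities, is_true_member, threshold):
    """True positive rate and contamination rate at a given probability threshold."""
    probabilities = np.asarray(probabilities, dtype=float)
    is_true_member = np.asarray(is_true_member, dtype=bool)
    selected = probabilities >= threshold

    tp = np.sum(selected & is_true_member)    # members correctly recovered
    fp = np.sum(selected & ~is_true_member)   # field stars wrongly selected
    fn = np.sum(~selected & is_true_member)   # members that were missed

    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    cr = fp / (tp + fp) if (tp + fp) > 0 else 0.0
    return float(tpr), float(cr)


# Toy synthetic sample (assumption): membership probabilities and true labels.
probs = [0.95, 0.90, 0.85, 0.60, 0.40, 0.10]
truth = [True, True, False, True, False, False]
tpr, cr = tpr_and_cr(probs, truth, threshold=0.8)
print(tpr, cr)
```

Scanning the threshold over a grid and inspecting both rates is one way to arrive at an optimal threshold; the exact criterion used to define popt is the one given in Olivares et al. (2019).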

Table 1 summarizes the results for different user-defined probability thresholds, pin. The results with pin = 0.5 were omitted from Table 1 due to convergence issues of the field model that are most likely caused by the high contamination in this solution. We note from Table 1 that the number of members retrieved in each solution decreases with the increasing pin threshold (the only exception is the solution obtained with pin = 0.7). It is also clear from Table 1 that a less restrictive pin value is often associated with a more conservative popt (see, for example, the solutions obtained with pin = 0.6 and pin = 0.9). The closest match between these two probability thresholds occurs with pin = 0.8 (and popt = 0.858).

As shown in Table 1, the most extreme solutions (pin = 0.6 and pin = 0.9) exhibit the lowest TPRs and can therefore be excluded. We note that the faintest sources are missing in the solution with pin = 0.9, which prevents the detection of low-mass cluster members and the study of the initial mass function (IMF) in the faint end. When comparing the other two solutions, we note a more significant dispersion in the proper motion and parallax diagrams of the sources in the pin = 0.7 solution compared to the results obtained with pin = 0.8.

We have therefore chosen to work with the solution obtained from pin = 0.8, as it provides the highest TPR and the smallest difference between the user-defined and optimal probability thresholds. Consequently, the final list of cluster members derived from our study includes 1531 stars (see Fig. 1). Table A.1 lists all the 1531 cluster members identified in our membership analysis. In Table A.2 we provide the membership probabilities of individual sources in the field for the solutions investigated in this study with different pin thresholds.

Figures 1 and 2 show the distribution of position, proper motion, and parallax of the sample of CMa stars selected in our analysis. It is apparent from Fig. 2 that our sample of CMa stars consists of different subgroups (see Sect. 4.1 for more details). Moreover, we note that the more dispersed sources in the space of proper motions and parallax have the lowest membership probabilities.

Figure 2 also shows the colour–magnitude diagram (CMD) for the sources in our sample. The CMD displays apparent magnitudes from Gaia without any extinction correction applied. For extinction-corrected magnitudes, refer to Fig. 9. As depicted, CMa stars primarily populate the magnitude range of 10 to 18 mag in the Gaia G band. We observe a distinct magnitude cutoff, which becomes more pronounced as the membership probability threshold increases. For instance, at a threshold of pin = 0.8, no sources fainter than 18 mag are found, indicating that objects at higher (fainter) magnitudes generally exhibit lower membership probabilities. Similarly, sources brighter than G = 10 mag are absent from our sample due to selection constraints. The brightest star included has a magnitude of G = 10.4 mag and corresponds to Gaia DR3 3046209991397371392, classified as a B5-type star.

Figure 3 presents a comparison of cluster members identified in this study with those of Santos-Silva et al. (2021). Our membership analysis confirms 401 stars from the initial list of candidate members based on the sample identified by Santos-Silva et al. (2021), while rejecting 136 sources from that list. We retrieved 296 objects from the CMa06 group and 105 objects from the CMa05, CMa07, and CMa08 groups. Consequently, 102 objects were rejected from CMa06, and 34 from CMa05, CMa07, and CMa08. As explained in Section 4.1, we collectively refer to CMa05, CMa07, and CMa08 as one single population in the remainder of this paper.

The discarded objects in the analysis predominantly exhibited probabilities between 50% and 80%. Our analysis uses a higher probability threshold compared to the Santos-Silva et al. (2021) paper where the standard threshold of 50% is employed, which naturally leads to the rejection of some sources identified in that study. However, it is interesting to note that we have identified 1130 new members (see Fig. 3). We have therefore tripled the number of cluster members associated with these subgroups of the CMa region with respect to the most recent census of the stellar population conducted with Gaia data (Santos-Silva et al. 2021).

Table 1

Results of our membership analysis with various probability thresholds.

Fig. 1

Spatial distribution of the 1531 stars identified as members of the CMa region, overlaid on a Digitized Sky Survey 2 (DSS2) red-band image. The stars are colour-coded according to their subgroup membership: yellow symbols indicate stars belonging to Cluster A, while orange symbols represent stars assigned to Cluster B, as described in Section 4.1.

Fig. 2

Parallaxes and proper motions of the 1531 stars identified in our membership analysis, and colour-magnitude diagram of the CMa sample found by our membership investigation. The empirical isochrone, derived in our analysis, is indicated by the solid black line. The individual membership probabilities of the stars are scaled from 0 to 1 and shown with different colours.

Fig. 3

Venn diagram comparing the number of stars in common between our analysis and the previous study by Santos-Silva et al. (2021).

3 Analysis of radial velocities

The scarcity of radial velocity measurements is currently the main limitation to investigating the kinematic properties of the CMa region. The Gaia DR3 catalogue provides radial velocity information for 197 sources in our list of 1531 cluster members, with a poor precision ranging from 2 to 40 km s−1 that is insufficient for many astrophysical purposes. We searched for high-resolution spectra in public databases but could not find any spectrum for our targets. We therefore resorted to making our own observations.

Our radial velocity analysis is based on observations (program 108.2250, PI: Galli) from the FLAMES spectrograph (Pasquini et al. 2002). The observations were conducted in service mode on the nights of November 12 and 16, 2021. FLAMES enables simultaneous observations with both UVES and GIRAFFE. For UVES, we utilized the 7+1 mode (580 nm setup, R=47 000), employing seven fibres for target observations and one fibre to simultaneously record the ThAr lamp. For GIRAFFE, we used the MEDUSA mode (H665.0/HR15N setup, R=19 000), with some fibres illuminated by a ThAr lamp for wavelength calibration. Most of our targets were observed with GIRAFFE, which can simultaneously observe up to 132 targets, including sky fibres.

We observed six fields in the CMa region with 1800 s exposure times. To maximise fibre allocation, we targeted 700 sources across the region. However, only 188 were confirmed as cluster members and included in our analysis.

The spectra were reduced with the ESO pipeline, and afterwards we used iSpec (Blanco-Cuaresma et al. 2014) to measure the radial velocity of each target. We determined radial velocities by cross-correlating each spectrum with templates of different spectral types (A0, F0, G2, K0, K5, and M5) and selecting the template whose spectral type is closest to that of the target. We computed the cross-correlation function (CCF) for each target-template pair and visually inspected them to remove radial velocities resulting from a poor match between the target spectrum and the template, or from a low signal-to-noise ratio (S/N).
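The principle behind a CCF-based radial velocity measurement can be sketched as follows. This is not the iSpec implementation: it is a simplified, noise-free illustration in which a template is Doppler-shifted over a grid of trial velocities and the velocity maximising the normalised cross-correlation is retained. The function names, the toy line list, and the wavelength grid are assumptions.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s


def ccf_radial_velocity(wave, flux, template_wave, template_flux, trial_velocities):
    """Return the trial velocity maximising the normalised CCF, and the CCF itself."""
    flux_centered = flux - flux.mean()
    ccf = np.empty(len(trial_velocities))
    for i, velocity in enumerate(trial_velocities):
        # Non-relativistic Doppler shift of the template wavelengths.
        shifted_wave = template_wave * (1.0 + velocity / C_KMS)
        model = np.interp(wave, shifted_wave, template_flux)
        model_centered = model - model.mean()
        norm = np.sqrt(np.sum(flux_centered**2) * np.sum(model_centered**2))
        ccf[i] = np.sum(flux_centered * model_centered) / norm
    return float(trial_velocities[np.argmax(ccf)]), ccf


def toy_spectrum(wave, line_centres, depth=0.5, sigma=0.3):
    """Flat continuum with Gaussian absorption lines (toy model, assumption)."""
    flux = np.ones_like(wave)
    for centre in line_centres:
        flux -= depth * np.exp(-0.5 * ((wave - centre) / sigma) ** 2)
    return flux


wave = np.linspace(6450.0, 6800.0, 7000)            # wavelength grid in Angstrom
lines = [6500.0, 6563.0, 6600.0, 6700.0, 6750.0]    # hypothetical line list
template_flux = toy_spectrum(wave, lines)

true_rv = 30.0  # km/s, injected shift for the toy "observed" spectrum
observed_flux = toy_spectrum(wave, [c * (1.0 + true_rv / C_KMS) for c in lines])

trial = np.arange(-100.0, 100.5, 0.5)
best_rv, ccf = ccf_radial_velocity(wave, observed_flux, wave, template_flux, trial)
print(best_rv)
```

With real spectra the CCF peak is usually fitted (for example with a Gaussian) to obtain a sub-grid velocity and its uncertainty, and a noisy or poorly matched template produces the broad or multi-peaked CCFs that were rejected by visual inspection.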

In doing so, we retained 90 high-quality spectra that produced reliable radial velocity measurements. The mean S/N for this sample was 48.6. The significant reduction in the number of spectra can be attributed to the characteristics of the data collection process. FLAMES operates as a multi-object spectrograph, capturing multiple sources simultaneously with the same exposure time. Consequently, faint sources would require longer exposures or individualized measurements. The lack of such adjustments for these sources resulted in spectra with low S/N and consequently poor CCFs that result in uncertain radial velocity measurements.

Figure 4 reveals an approximately symmetric radial velocity distribution with a peak near 30 km s−1 and a few outliers (see Sect. 4.2). The uncertainty of individual measurements ranges from 0.2 to 2.9 km s−1, and the mean radial velocity uncertainty is about 1 km s−1. Figure 5 shows a comparison of the radial velocities derived in this paper with the ones given in Gaia DR3 for 17 stars in common between the two projects. The mean difference between the radial velocities in the two datasets is 5 km s−1, and the root mean square of the differences is about 42 km s−1. These high values arise from the poor precision of the radial velocities in Gaia DR3. The mean uncertainty of the Gaia DR3 radial velocity for this subsample is 10 km s−1. As explained above, our radial velocity measurements are more precise than those given in the Gaia DR3 catalogue. In the following, we use these newly derived radial velocity data to investigate the kinematic properties of the CMa region.

Fig. 4

Left panel: kernel density estimation of the radial velocity distribution for the sample of 90 members with available measurements. Right panel: distribution of radial velocity uncertainties. The ticks on the horizontal axes mark the measurements for individual stars.

4 Properties of the CMa subgroups

4.1 Subgroups of the CMa region

As illustrated in Fig. 1, our sample of CMa stars comprises multiple populations of young stars. In this section, we employ the partitioning around medoids (PAM, Kaufman & Rousseeuw 1990) clustering algorithm to explore the underlying structures and patterns within our sample of cluster members, aiming to deepen our understanding of the overall properties of the CMa region.

Partitioning around medoids (PAM) is a robust clustering method that divides a dataset into k distinct clusters. Unlike the k-means algorithm, which uses centroids (mean points) to represent clusters, PAM selects actual data points, called medoids, as cluster representatives. This approach makes PAM less sensitive to outliers and noise.

The algorithm begins by randomly selecting k data points as initial medoids. Each data point is then assigned to the nearest medoid based on a specified distance metric, such as the Euclidean distance. Within each cluster, the medoid is iteratively updated to the most centrally located point, provided a better one exists. This assignment and update process continues until the medoids stabilize or a predefined number of iterations is reached. In this paper, we use the k-medoids routine from the PyClustering library (Novikov 2019) that implements the PAM algorithm.
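The assignment and update loop described above can be sketched in a few lines of NumPy. This is a simplified illustration of the alternating k-medoids scheme, not the PyClustering routine used in this paper; the function name, the random seed, and the toy data are assumptions.

```python
import numpy as np


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternating k-medoids: assign points to the nearest medoid, then update medoids."""
    rng = np.random.default_rng(seed)
    n_points = len(points)
    # Pairwise Euclidean distance matrix.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    medoids = rng.choice(n_points, size=k, replace=False)  # random initial medoids

    for _ in range(max_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the previous medoid if a cluster becomes empty
            # The most central member minimises the summed distance to the others.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # medoids have stabilised
        medoids = new_medoids

    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Toy 3D data (assumption) standing in for (pmra, pmdec, parallax).
rng = np.random.default_rng(1)
toy = np.vstack([
    rng.normal([-4.0, 1.5, 0.85], 0.05, size=(100, 3)),
    rng.normal([-3.0, 0.5, 0.87], 0.05, size=(100, 3)),
])
medoid_indices, labels = k_medoids(toy, k=2)
print(toy[medoid_indices])
```

Because the medoids are always actual data points, a single distant outlier cannot drag a cluster representative away from the bulk of the members, which is the robustness property mentioned above.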

To determine the optimal number of clusters (k) for our dataset, we employed three widely used methods: the elbow method (Thorndike 1953), calculated using pairwise distances from the scikit-learn library; the silhouette score (Rousseeuw 1987), also implemented in scikit-learn; and the gap statistic (Tibshirani et al. 2002), available in the PyClustering library. We performed the clustering analysis in a 3D space defined by proper motions and parallaxes. These methods suggest that the optimal number of clusters in our analysis is k = 2.
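As an example of one of these diagnostics, the sketch below selects k by maximising the mean silhouette score with scikit-learn. For brevity it uses `KMeans` as a stand-in clustering step rather than PAM, which is an assumption of this illustration; the function name and the toy data are also hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(features, k_values, seed=0):
    """Return the k with the highest mean silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        # KMeans is only a stand-in here; any routine returning labels would do.
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return max(scores, key=scores.get), scores


# Toy 3D data (assumption): two compact, well-separated groups.
rng = np.random.default_rng(7)
toy = np.vstack([
    rng.normal(0.0, 0.3, size=(200, 3)),
    rng.normal(3.0, 0.3, size=(200, 3)),
])
best_k, scores = best_k_by_silhouette(toy, k_values=range(2, 7))
print(best_k, scores)
```

The silhouette score is only defined for k ≥ 2, so the single-cluster case has to be assessed with a different diagnostic such as the gap statistic.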

Figure 6 displays the results of our clustering analysis performed with the PAM algorithm in the 3D astrometric space defined by proper motions (μα cos δ, μδ) and parallax (π). Two major subgroups were identified within CMa. The first subgroup, referred to as Cluster A, corresponds to the CMa06 subgroup defined by Santos-Silva et al. (2021). The second subgroup, designated as Cluster B, encompasses sources previously assigned to CMa05, CMa07, and CMa08 in that study. Based on the higher-precision Gaia DR3 astrometry used in our analysis, we found no evidence of substructure within Cluster B. This outcome might be due to the specific clustering method applied (as Santos-Silva et al. (2021) used HDBSCAN) and the inclusion of newly identified sources in our sample. Therefore, throughout this paper, we collectively refer to CMa05, CMa07, and CMa08 as Cluster B.

The upcoming Gaia DR4 catalogue, combined with ground-based spectroscopic surveys, will enable us in the near future to identify additional substructures within the CMa region, similar to what was done in Sco-Cen by Ratzenböck et al. (2023). Such substructures could not be resolved in this study due to the relatively small number of sources with radial velocity measurements. This will provide new insights into the star formation history of CMa.

Fig. 5

Comparison of radial velocity data obtained from our observations and Gaia DR3 data. One source of our sample (namely Gaia DR3 3046207139539172864) is not shown to improve the visibility of the plot.

Fig. 6

PAM clustering results with k = 2, showing identified groups.

4.2 Distance and spatial velocities

We used our new list of cluster members identified in this study to revisit the distance to CMa. We employed the Kalkayotl 2.0 method (Olivares et al. 2025), a Bayesian inference framework designed to infer the 3D positions and 3D velocities of stellar clusters (and individual members) based on Gaia DR3 astrometry. The distances and velocities thus derived for individual sources in our sample are given in Table A.3.

We obtain a distance of $1165_{-96}^{+95}$ pc for the full sample of 1531 stars, which is consistent with values in the literature that place the region between 1000 and 1200 pc (Gregorio-Hetem 2008). This estimate is more precise than the distance of $1177_{-107}^{+130}$ pc calculated from the inverse of the mean parallax ($\varpi = 0.850 \pm 0.085$ mas). Our result is also fully consistent with the mean of the photogeometric distance estimate of $1177_{-121}^{+96}$ pc that is computed from a direction-dependent prior on distance, the colour and apparent magnitude of the sources (Bailer-Jones et al. 2021). Moreover, we also note that the distances inferred for the two subgroups in our sample ($d_A = 1150_{-87}^{+79}$ pc and $d_B = 1183_{-108}^{+103}$ pc) are consistent between themselves and with the mean distance to CMa within the reported uncertainties (see Table 2).
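For reference, the inverse-parallax estimate can be checked with a short calculation. The sketch assumes that the asymmetric interval is obtained by inverting the parallax at its ±1σ bounds, which is an assumption on our part; the function name is hypothetical. With the rounded inputs quoted above it reproduces the quoted values to within about 1 pc.

```python
def distance_from_parallax(parallax_mas, parallax_err_mas):
    """Distance (pc) from a parallax (mas), with bounds from the +/-1 sigma parallaxes."""
    distance = 1000.0 / parallax_mas
    nearest = 1000.0 / (parallax_mas + parallax_err_mas)  # larger parallax, closer
    farthest = 1000.0 / (parallax_mas - parallax_err_mas)  # smaller parallax, farther
    return distance, distance - nearest, farthest - distance


distance, minus, plus = distance_from_parallax(0.850, 0.085)
print(f"{distance:.1f} pc (-{minus:.1f}, +{plus:.1f})")
```

The asymmetry of the interval follows directly from the non-linear 1/ϖ transformation, which is one reason why a Bayesian treatment of the distances is preferred for this sample.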

As mentioned before, Kalkayotl 2.0 also returns the UVW velocity components of the stars (see Table A.3). The U, V, and W velocity components are given in a right-handed system with its origin at the Sun where X points to the Galactic centre, Y points to the direction of Galactic rotation, and Z points to the Galactic North pole. We applied the interquartile range (IQR) criterion to identify and remove potential outliers from the distribution of the U, V, and W velocity components. In doing so, we rejected eight stars (out of 90 stars, see Section 3), yielding a final sample of 82 stars with complete data. Clusters A and B have 60 and 22 stars from this sample, respectively. Figure 7 presents the velocity vector of the stars. Although no significant differences are observed in the direction of motion between the two subgroups, there is a clear distinction in the spatial positions of the subgroups.
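The IQR rejection step can be sketched as follows. The conventional factor of 1.5 times the IQR is assumed here, since the exact value adopted in the analysis is not stated; the function names and the toy velocities are hypothetical.

```python
import numpy as np


def iqr_inlier_mask(values, factor=1.5):
    """True for values within [Q1 - factor*IQR, Q3 + factor*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values >= q1 - factor * iqr) & (values <= q3 + factor * iqr)


def reject_uvw_outliers(uvw, factor=1.5):
    """Keep only stars that are inliers in all three velocity components."""
    uvw = np.asarray(uvw, dtype=float)
    mask = np.ones(len(uvw), dtype=bool)
    for column in range(uvw.shape[1]):
        mask &= iqr_inlier_mask(uvw[:, column], factor)
    return mask


# Toy velocities in km/s (assumption): a compact group plus one discrepant star.
rng = np.random.default_rng(3)
uvw = rng.normal([-20.0, -15.0, -8.0], 2.5, size=(50, 3))
uvw = np.vstack([uvw, [60.0, -15.0, -8.0]])  # injected outlier in U
keep = reject_uvw_outliers(uvw)
print(keep.sum(), "stars retained out of", len(uvw))
```

A star is discarded as soon as one of its components falls outside the allowed interval, so the criterion is applied jointly to U, V, and W.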

It is well known that the transformation of parallaxes, proper motions, and radial velocities into 3D velocities can result in correlated errors even in the absence of measurement errors (see e.g. Brown et al. 1997; Perryman et al. 1998). Here, we employed the Kalkayotl 2.0 code that implements a simultaneous modelling of positions, velocities, and their correlations, as well as the angular correlations of parallaxes and proper motions, making full use of the astrometric data given in the Gaia DR3 catalogue to derive accurate 3D positions and 3D velocities. Figure 8 illustrates the correlation among the velocity components as we compare the velocity distributions of the subgroups in our sample. There is a clear overlap between the velocity distributions of the two subgroups within the confidence region of 99.7%, which suggests that their 3D space motions are still consistent.

The median uncertainty in each component of the UVW velocity of the full sample is 0.4, 0.3, and 0.3 km/s, respectively. These values are smaller than the observed velocity dispersion measured from the standard deviation in each direction (see Table 2), suggesting that the intrinsic velocity dispersion of the CMa population is resolved ($\sigma_U^{\rm int} \simeq 2.6$ km/s, $\sigma_V^{\rm int} \simeq 2.9$ km/s, $\sigma_W^{\rm int} \simeq 2.2$ km/s). This implies that the velocity dispersion in CMa is broadly consistent with the 1D velocity dispersion reported for other star-forming regions, for example Orion and Taurus, which typically ranges between 2 and 3 km/s (Kounkel et al. 2018; Galli et al. 2019).

Table 2

Distance and spatial velocity of CMa subgroups.

Fig. 7

Spatial velocities of 82 members with known RVs projected on XY, YZ, and ZX planes.

Fig. 8

Distribution of 3D velocity of CMa stars. The different symbols and colours indicate the two subgroups in our sample. The contours indicate the 68%, 95.4%, and 99.7% confidence levels computed from the mean covariance matrix of each population. Solid and dashed lines indicate the confidence ellipses of Clusters A and B, respectively.

Table 3

Isochronal age of CMa subgroups.

Fig. 9

Colour-magnitude diagram of the CMa region obtained for our original sample (left panel) and control sample (right panel). The different symbols indicate the two subgroups identified in our analysis, and the lines mark the 2 Myr isochrone obtained from different models.

4.3 Ages

4.3.1 Isochronal ages

In the following step, we computed isochronal ages of the stars in our sample. The stellar ages were computed by interpolating the position of individual stars in the absolute colour-magnitude diagram, constructed from Gaia DR3 photometry, among the isochrones of pre-main-sequence evolutionary models. We used three different models to compute isochronal ages: the Baraffe et al. (2015) models (hereafter, BHAC15), PARSEC v.1.2S (Bressan et al. 2012), and MIST (Dotter 2016).

We computed isochronal ages for 1130, 1531, and 1531 cluster members using the BHAC15, PARSEC, and MIST models, respectively. Extinction corrections for the G band magnitudes were applied based on the Bayestar catalogue (Schlafly et al. 2016). BHAC15 isochrones do not cover the full range of colours displayed in our sample of CMa stars, which explains the smaller number of stars with ages computed from this model.

Table 3 presents a comparative analysis of the stellar ages derived using three different evolutionary models. The mean isochronal age of the CMa region lies between approximately 2 and 3 Myr, depending on the adopted model. We do not find any significant age difference between the two subgroups, indicating that they are likely coeval. This result suggests that the CMa subgroups share a common formation history, similar to the stellar populations observed in the subgroups of the Lupus star-forming region (Galli et al. 2020).

We constructed a control sample using the mean distance of the CMa region (see Table 2) to compute the absolute magnitude for all stars. This was done to investigate whether the observed vertical spread in the colour-magnitude diagram can be explained by parallax measurement errors. However, as illustrated in Fig. 9, we do not see significant changes in the colour-magnitude diagram caused by distance spread, and the resulting ages are also fully consistent with those derived from our original sample (see Table 3). This analysis confirms the robustness of our results and indicates that the stellar population under investigation is remarkably young, with some objects as young as approximately 1 Myr. In the Perseus region, Olivares et al. (2023) reported a median stellar age of 10 Myr, while Bertout et al. (2007) found that the Taurus region hosts a younger population, with a median age of approximately 5 Myr. In the case of Orion, Kounkel et al. (2018) identified subgroups with ages ranging between 2 and 7 Myr. Based on these results, we infer that the CMa sample is younger than both the Perseus and Taurus star-forming regions and more closely resembles the younger subgroups within Orion.
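The difference between the original and control samples comes down to which distance enters the distance modulus. A minimal sketch, assuming the standard relation M_G = G − 5 log10(d / 10 pc) − A_G and using hypothetical magnitudes, individual distances, and extinctions, is given below; the mean distance of 1165 pc is the value quoted in Sect. 4.2.

```python
import numpy as np


def absolute_magnitude(apparent_mag, distance_pc, extinction_mag=0.0):
    """Absolute magnitude from apparent magnitude, distance (pc), and extinction (mag)."""
    distance_pc = np.asarray(distance_pc, dtype=float)
    return apparent_mag - 5.0 * np.log10(distance_pc / 10.0) - extinction_mag


# Hypothetical stars: apparent G magnitudes, individual distances, and extinctions.
g_mag = np.array([12.0, 15.0, 17.0])
individual_distance_pc = np.array([1100.0, 1165.0, 1250.0])
a_g = np.array([0.5, 0.6, 0.4])

mean_distance_pc = 1165.0  # mean distance of the full sample (Sect. 4.2)

m_original = absolute_magnitude(g_mag, individual_distance_pc, a_g)
m_control = absolute_magnitude(g_mag, mean_distance_pc, a_g)
print(m_original - m_control)  # vertical shift in the CMD due to distance spread
```

For distances within about 100 pc of the mean, the resulting shift is of order 0.1 to 0.2 mag, which is consistent with the small differences seen between the two panels of Fig. 9.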

Table 4

Results of classification scheme of Koenig & Leisawitz (2014) applied to our sample.

4.3.2 Fraction of disc-bearing stars

To complement our age analysis, we assessed the fraction of disc-bearing stars in our sample of cluster members. In this context, we used the methodology developed by Koenig & Leisawitz (2014) to detect and classify young stellar objects based on their infrared excess emission. We cross-matched our list of CMa members with the 2MASS and AllWISE catalogues, using the pre-computed cross-match tables available in the Gaia archive, to retrieve the infrared photometry of the stars in our sample. We found 2MASS and AllWISE photometry for 1031 stars (out of our sample of 1531 cluster members), but 604 of them have poor measurements and were discarded according to the Koenig & Leisawitz (2014) quality criteria. This leaves us with an effective sample of 427 sources with available infrared photometry for this analysis.

In our sample, we identified 67 Class II stars based on the classification scheme proposed by Koenig & Leisawitz (2014). No Class I sources were detected. One object, Gaia DR3 3048979420667363072, was initially classified as an active galactic nucleus (AGN); however, its estimated age (2 Myr; see Sect. 4.3.1) and parallax ($\varpi = 0.918 \pm 0.062$ mas) are inconsistent with this classification. We therefore consider this source to have been misclassified. The remaining 359 stars were not assigned a class by this method; nevertheless, considering their young ages (see Fig. 9) and their loci in the colour–colour diagrams (Fig. 10), it is plausible to classify them as Class III objects. A similar reasoning applies to Gaia DR3 3048979420667363072, bringing the total number of Class III sources to 360.
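As a rough cross-check of the parallax argument above, the quoted measurement can be converted into a detection significance and an approximate distance. This is only a sketch: it uses naive parallax inversion, which is an assumption made here for illustration and not the Bayesian distance estimate adopted in the paper.

```latex
\frac{\varpi}{\sigma_\varpi} = \frac{0.918}{0.062} \approx 14.8,
\qquad
d \approx \frac{1}{\varpi} = \frac{1}{0.918\ \mathrm{mas}} \approx 1.09\ \mathrm{kpc}
```

A parallax detected at this significance places the source at a distance comparable to that of the CMa subgroups, whereas an extragalactic object would be expected to show a parallax consistent with zero.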

In Table 4, we present the number of objects corresponding to each subclass of YSOs identified in the two clusters of our sample. In Cluster A, 44 members were classified as Class II (28.6%), while 110 were identified as Class III (71.4%). For Cluster B, a total of 273 sources were classified, comprising 23 Class II stars (8.4%) and 249 Class III stars (91.2%). When comparing the fraction of disc-bearing stars between the two populations, we observe a slight indication that Cluster B exhibits a lower disc fraction. The separation between the classifications is illustrated in Fig. 10.
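The disc fractions quoted above follow directly from the source counts. The sketch below reproduces them and attaches a simple binomial standard error to each. The error estimate is our illustrative assumption and is not taken from the paper; it reflects counting statistics only and ignores the incompleteness of the infrared sample. The Cluster A total of 154 is the sum of the 44 Class II and 110 Class III sources.

```python
import math


def disc_fraction(n_class2, n_total):
    """Return the Class II fraction and its binomial standard error."""
    fraction = n_class2 / n_total
    std_err = math.sqrt(fraction * (1.0 - fraction) / n_total)
    return fraction, std_err


# (Class II sources, classified sources) per subgroup, from the counts in the text.
counts = {"Cluster A": (44, 154), "Cluster B": (23, 273)}

results = {name: disc_fraction(n2, n) for name, (n2, n) in counts.items()}
for name, (fraction, std_err) in results.items():
    print(f"{name}: {100 * fraction:.1f}% +/- {100 * std_err:.1f}%")
```

Because only 427 of the 1531 members have usable infrared photometry, such counting errors are a lower bound on the true uncertainty of the disc fractions.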

When comparing the proportion of classified sources within each subgroup, there is a subtle indication that Cluster B may be at a more advanced evolutionary stage than Cluster A, that is, that Cluster A could be relatively younger. It is interesting to note that Cluster A, which appears to be somewhat younger (see Table 3), is also more concentrated in the vicinity of the molecular clouds of the CMa region, while Cluster B defines a more dispersed population of young stars (see Figure 1). However, the limited number of classified sources in this region prevents us from drawing firm conclusions. The most plausible scenario, based on our results, is that both clusters formed contemporaneously.

Fig. 10

Classification of objects using method developed by Koenig & Leisawitz (2014).

5 Conclusions

In this paper, we performed a new membership analysis of the CMa region based on Gaia DR3 data. We inferred membership probabilities for 162 331 sources in the field of CMa and identified 1531 high-probability cluster members. We confirm 401 members that were known from previous studies and identified 1 130 new members. We have therefore increased the number of cluster members in this region by a factor of 3 with respect to the previous study by Santos-Silva et al. (2021).

We successfully calculated radial velocities for 90 sources in the CMa region using observations conducted by our team with the FLAMES spectrograph. The radial velocities derived in this paper have a typical (mean) uncertainty of about 1 km s−1. This represents a major improvement with respect to the Gaia DR3 catalogue, where the mean radial velocity uncertainty of the stars in our sample is about 10 km s−1. This is the most precise radial velocity survey of the CMa region to date.

We used our new census of cluster members to investigate the structure of the CMa region using the PAM clustering and identified two subgroups labelled Cluster A and Cluster B. The former corresponds to the CMa06 subgroup and the latter encompasses CMa05, CMa07, and CMa08, which were all previously identified by Santos-Silva et al. (2021). We estimate the distance of the two subgroups in CMa from Bayesian inference and show that they are roughly located at the same distance ($d = 1165_{-96}^{+95}$ pc) and have similar space motions.

We investigated the age of the CMa subgroups using two different approaches. First, we computed isochronal ages from different models, and we showed that the two subgroups have a mean age of 2–3 Myr. We do not see significant age differences between the two subgroups with this method. Second, we investigated the fraction of disc-bearing stars in the sample based on their infrared excess emission. We show that the fraction of disc-bearing stars in Cluster A is somewhat higher, implying that it may be at a younger evolutionary stage. However, the high number of sources with missing data for this analysis prevents us from drawing firm conclusions, and a more careful analysis will be necessary in future studies.

The main limitation of this study was the lack of high-resolution spectroscopy, which is crucial for characterizing the cluster members individually in terms of their physical properties and kinematics. In particular, radial velocities are the missing piece for unlocking the 3D space motion of the stars and reconstructing the local history of star formation of the CMa region. We therefore encourage astronomers to perform high-resolution spectroscopy of the CMa region, which, combined with the precise astrometry delivered by the Gaia satellite, will allow for a much more complete picture of this star-forming complex.

Data availability

Tables A.1–A.3 are available at the CDS via https://cdsarc.cds.unistra.fr/viz-bin/cat/J/A+A/707/A85.

Acknowledgements

We thank the referee for constructive criticism. S.N.S. acknowledges financial support from the São Paulo Research Foundation (FAPESP, grant: 2022/06054-4). P.A.B.G. acknowledges financial support from the São Paulo Research Foundation (FAPESP, grant: 2020/12518-8) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq, grant: 303659/2024-6). J.O. acknowledges financial support from “Ayudas para contratos postdoctorales de investigación UNED 2021” and project PID2022-142707NA-I00 financed by MCIN/AEI/10.13039/501100011033/FEDER, UE. N.M.R. acknowledges support from the Beatriu de Pinós postdoctoral program under the Ministry of Research and Universities of the Government of Catalonia (Grant Reference No. 2023 BP 00215). J.G.H. acknowledges financial support from the São Paulo Research Foundation (FAPESP, grant: 2023/08726-2). This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). This research has also made use of data products from the Two Micron All Sky Survey (2MASS) and the AllWISE program, which combines data from the WISE and NEOWISE missions. Additionally, this study has made use of the SIMBAD database, operated at CDS, Strasbourg, France.

References

  1. Alves, J., Zucker, C., Goodman, A. A., et al. 2020, Nature, 578, 237
  2. Ambartsumian, V. 1947, Commun. Byurakan Astrophys. Observ., 69, 127
  3. Bailer-Jones, C. A. L., Rybizki, J., Fouesneau, M., Demleitner, M., & Andrae, R. 2021, AJ, 161, 147
  4. Baraffe, I., Homeier, D., Allard, F., & Chabrier, G. 2015, A&A, 577, A42
  5. Bertout, C., Siess, L., & Cabrit, S. 2007, A&A, 473, L21
  6. Blanco-Cuaresma, S., Soubiran, C., Heiter, U., & Jofré, P. 2014, A&A, 569, A111
  7. Bressan, A., Marigo, P., Girardi, L., et al. 2012, MNRAS, 427, 127
  8. Brown, A. G. A., Perryman, M. A. C., Kovalevsky, J., et al. 1997, in ESA Special Publication, 402, Hipparcos – Venice 1997, ed. B. Battrick, 681
  9. Clariá, J. J. 1974, A&A, 37, 229
  10. Dobashi, K., Uehara, H., Kandori, R., et al. 2005, PASJ, 57, S1
  11. Dong, Y., Xu, Y., Hao, C., et al. 2024, AJ, 168, 225
  12. Dotter, A. 2016, ApJS, 222, 8
  13. Fernandes, B., Gregorio-Hetem, J., Montmerle, T., & Rojas, G. 2015, MNRAS, 448, 119
  14. Fischer, W. J., Padgett, D. L., Stapelfeldt, K. L., & Sewilo, M. 2016, ApJ, 827, 96
  15. Gaia Collaboration (Vallenari, A., et al.) 2023, A&A, 674, A1
  16. Galli, P. A. B., Loinard, L., Bouy, H., et al. 2019, A&A, 630, A137
  17. Galli, P. A. B., Bouy, H., Olivares, J., et al. 2020, A&A, 634, A98
  18. Gregorio-Hetem, J. 2008, The Canis Major Star Forming Region, 5, ed. B. Reipurth, 1
  19. Gregorio-Hetem, J., Montmerle, T., Rodrigues, C. V., et al. 2009, A&A, 506, 711
  20. Herbst, W., Racine, R., & Warner, J. W. 1978, ApJ, 223, 471
  21. Hunt, E. L., & Reffert, S. 2023, A&A, 673, A114
  22. Kaufman, L., & Rousseeuw, P. J. 1990, Partitioning Around Medoids (Program PAM) (John Wiley & Sons, Ltd), 68
  23. Koenig, X. P., & Leisawitz, D. T. 2014, ApJ, 791, 131
  24. Kounkel, M., Covey, K., Suárez, G., et al. 2018, AJ, 156, 84
  25. Novikov, A. 2019, J. Open Source Softw., 4, 1230
  26. Olivares, J., Bouy, H., Sarro, L. M., et al. 2019, A&A, 625, A115
  27. Olivares, J., Bouy, H., Miret-Roig, N., et al. 2023, A&A, 671, A1
  28. Olivares, J., Bouy, H., Dorn-Wallenstein, T. Z., & Berihuete, A. 2025, A&A, 693, A12
  29. Pasquini, L., Avila, G., Blecha, A., et al. 2002, The Messenger, 110, 1
  30. Perryman, M. A. C., Brown, A. G. A., Lebreton, Y., et al. 1998, A&A, 331, 81
  31. Pettersson, B., & Reipurth, B. 2019, A&A, 630, A90
  32. Racine, R. 1968, AJ, 73, 233
  33. Ratzenböck, S., Großschedl, J. E., Möller, T., et al. 2023, A&A, 677, A59
  34. Rousseeuw, P. J. 1987, J. Computat. Appl. Math., 20, 53
  35. Ruprecht, J. 1966, Bull. Astron. Inst. Czech., 17, 33
  36. Santos-Silva, T., Gregorio-Hetem, J., Montmerle, T., Fernandes, B., & Stelzer, B. 2018, A&A, 609, A127
  37. Santos-Silva, T., Perottoni, H. D., Almeida-Fernandes, F., et al. 2021, MNRAS, 508, 1033
  38. Sarro, L. M., Bouy, H., Berihuete, A., et al. 2014, A&A, 563, A45
  39. Schlafly, E. F., Meisner, A. M., Stutz, A. M., et al. 2016, ApJ, 821, 78
  40. Shevchenko, V. S., Ezhkova, O. V., Ibrahimov, M. A., van den Ancker, M. E., & Tjin A Djie, H. R. E. 1999, MNRAS, 310, 210
  41. Thorndike, R. 1953, Psychometrika, 18, 267
  42. Tibshirani, R., Walther, G., & Hastie, T. 2002, J. Roy. Statist. Soc. Ser. B: Statist. Methodol., 63, 411
  43. Zucker, C., Speagle, J. S., Schlafly, E. F., et al. 2020, A&A, 633, A51

Appendix A Tables (online material)

Table A.1

Properties of 1531 cluster members selected from our membership analysis.

Table A.2

Membership probability of 162 331 sources in the input catalogue.

Table A.3

Properties of 90 members with radial velocities measured in our analysis.


All Figures

Fig. 1

Spatial distribution of 1531 stars identified as members of CMa region, overlaid on Digitized Sky Survey 2 (DSS2) red-band image. The stars are colour-coded according to their subgroup membership: yellow symbols indicate stars belonging to Cluster A, while orange symbols represent stars assigned to Cluster B, as described in Section 4.1.

Fig. 2

Parallaxes and proper motions of 1531 stars identified in our membership analysis and colour-magnitude diagram of CMa sample found by our membership investigation. The empirical isochrone, derived in our analysis, is indicated by the solid black line. The individual membership probabilities of the stars are scaled from 0 to 1 and shown with different colours.

Fig. 3

Venn diagram comparing the number of stars in common between our analysis and the previous study by Santos-Silva et al. (2021).

Fig. 4

Left panel: kernel density estimation of the radial velocity distribution for the sample of 90 members with available measurements. Right panel: distribution of radial velocity uncertainties. The ticks in the horizontal axes mark the measurements for individual stars.

Fig. 5

Comparison of radial velocity data obtained from our observations and Gaia DR3 data. One source of our sample (namely Gaia DR3 3046207139539172864) is not shown to improve the visibility of the plot.

Fig. 6

PAM clustering results with k = 2, showing identified groups.

Fig. 7

Spatial velocities of 82 members with known RVs projected on XY, YZ, and ZX planes.

Fig. 8

Distribution of 3D velocity of CMa stars. The different symbols and colours indicate the two subgroups in our sample. The contours indicate the 68%, 95.4%, and 99.7% confidence levels computed from the mean covariance matrix of each population. Solid and dashed lines indicate the confidence ellipses of Clusters A and B, respectively.

Fig. 9

Colour-magnitude diagram of CMa region obtained for our original sample (left panel) and control sample (right panel). The different symbols indicate the two subgroups identified in our analysis, and the lines mark the 2 Myr isochrone obtained from different models.

Fig. 10

Classification of objects using method developed by Koenig & Leisawitz (2014).
