| Issue | A&A, Volume 704, December 2025 |
|---|---|
| Article Number | A312 |
| Number of page(s) | 9 |
| Section | Extragalactic astronomy |
| DOI | https://doi.org/10.1051/0004-6361/202556635 |
| Published online | 18 December 2025 |
An accurate measure of the size of dark matter haloes using the size of galaxies
1 Instituto de Astrofísica de Canarias, C/ Vía Láctea s/n, 38205 La Laguna, Spain
2 Departamento de Astrofísica, Universidad de La Laguna, Av. del Astrofísico Francisco Sánchez s/n, 38206 La Laguna, Spain
★ Corresponding author: claudio.dalla.vecchia@iac.es
Received: 28 July 2025
Accepted: 17 October 2025
The physically motivated definition of galaxy size proposed recently, linked to the farthest location of in situ star formation, considerably reduces the scatter of the galaxy mass–size relation and provides a viable method to infer the stellar mass of a galaxy from its size. We provide a similar relation correlating the size of galaxies with the size of their dark matter haloes by leveraging the small scatter of the aforementioned relation. We analysed the simulated galaxies of the two main cosmological volumes of the EAGLE simulations and computed the size of the galaxies and their mass while mimicking the observational analysis. For central galaxies, we computed the relation between galaxy size and halo size. We show that the simulated galaxies reproduce the normalisation and slope of the observed stellar mass–size relation. The scatter of this relation, 0.06 dex, matches the intrinsic scatter measured in observations. We then computed the correlation between galaxy size and halo size and found that the relation is steeper than when using the half-mass radius as a measure of size, with the scatter (0.1 dex) being a factor of two smaller than that of the observed relation. In addition, the stellar-to-halo mass relation derived from the simulations has a scatter that is a factor of two smaller than the observed one. This opens the possibility of measuring the size of dark matter haloes with greater accuracy (an uncertainty of less than 50%, i.e. around six times better than when using the effective radius) by using only deep imaging data.
Key words: galaxies: evolution / galaxies: fundamental parameters / galaxies: halos / galaxies: structure
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
A physically motivated definition of galaxy size, given by the farthest radial location where gas could efficiently collapse and turn into stars, was proposed by Trujillo et al. (2020). As a proxy for measuring this position, these authors suggest using the radial location of a stellar mass surface density contour at 1 M⊙ pc^−2, denoted R1. This measure is consistent with the gas density threshold for star formation derived theoretically by Schaye (2004). The use of 1 M⊙ pc^−2 was calibrated using deep observations of Milky Way-sized galaxies at z ∼ 0, and it provides a more representative boundary for galaxies than traditional metrics such as the effective radius, Re (Sérsic 1968). Using this definition over a wide range of stellar masses (from 10^7 to 10^12 M⊙) significantly reduces the scatter in the stellar mass–size relation to about 0.06 dex, thus outperforming conventional measures such as Re and the Holmberg radius, RH (Holmberg 1958).
Even though R1 is only a proxy to characterise the physical radius where in situ star formation is or has been taking place, the enormous reduction in the scatter of the stellar mass–size relation underscores the robustness of R1 as a galaxy size metric. The results of using R1 also reveal clear trends within the stellar mass–size relation. Galaxies with stellar masses between 10^7 and 10^11 M⊙ follow a consistent power-law relationship with a slope of around 1/3. Across the range of galaxy stellar masses studied, the scatter is consistently small (∼0.06 dex), so that the stellar mass of a galaxy can be estimated accurately from its physical size alone.
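As a rough illustration of why such a small scatter in size translates into a useful mass estimate, the power law can be inverted. This is a back-of-the-envelope sketch that assumes a pure power law with a slope of exactly 1/3 and treats the 0.06 dex scatter in size as the only source of error; it is not a calculation taken from Trujillo et al. (2020).

```latex
R_1 \propto M_\star^{1/3}
\;\Rightarrow\;
\log_{10} M_\star = 3\,\log_{10} R_1 + \mathrm{const},
\qquad
\sigma_{\log_{10} M_\star} \simeq 3\,\sigma_{\log_{10} R_1}
= 3 \times 0.06\ \mathrm{dex} = 0.18\ \mathrm{dex}
```

Under these assumptions, the stellar mass would be recovered to within a factor of about 10^0.18 ≈ 1.5.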
Observed disc galaxies follow the relation R1 ∝ M★^(1/3) over several orders of magnitude in stellar mass. A similar relation has been proposed for neutral hydrogen by Broeils & Rhee (1997). The mass–size relation for neutral hydrogen is
, where DH I is the diameter of the neutral hydrogen disc of a spiral galaxy. Wang et al. (2016), using observations of 500 galaxies, showed that this correlation holds irrespective of galaxy luminosity, fraction of neutral gas, and morphology, with a scatter of 0.06 dex. This points to the non-trivial interplay of gas accretion, star formation efficiency, and stellar feedback in maintaining the two mass–size relations over several orders of magnitude in stellar mass.
Chamba et al. (2022) tested the validity of the hypothesis of using R1 as an indicator of the radius up to which in situ star formation has occurred. To do this, they identified the position of the stellar edge of galaxies, Redge, as the outermost cut-off in their stellar mass radial profile using deep multi-band optical imaging; verified that it corresponds to a change in the radial colour profile and surface brightness profile; and measured the stellar mass surface density it corresponds to. Chamba et al. (2022) find that Redge ∼ R1 for galaxies with masses similar to the Milky Way (see also Golini et al. 2025). For lower-mass galaxies (dwarfs) or more massive galaxies (ellipticals), the stellar mass surface density at the galaxy edge is slightly different from 1 M⊙ pc^−2. It is smaller by a factor of about two for low-mass galaxies and larger by a factor of about three for massive elliptical galaxies. Nevertheless, within a range of five orders of magnitude in stellar mass, the approximation of using R1 as an indicator of galaxy size is quite good, and its practical implementation is straightforward. One of the main results of Trujillo et al. (2020) is that as long as the definition of galaxy size includes most of the stellar mass, the mass–size relation should have a small scatter (e.g. Miller et al. 2019). Sánchez Almeida (2020) has shown that for a fixed stellar mass, stellar profiles with different Sérsic indices all cross at a similar stellar mass surface density, and concluded that this explains the small scatter when using R1 as the size.
In a subsequent paper, Buitrago & Trujillo (2024) studied the evolution of the stellar mass–size relation with redshift for a sample of Milky Way-like galaxies, directly measuring Redge (instead of R1) of galaxies up to z = 1. They showed that the edge is located at higher stellar surface densities at a higher redshift and that disc galaxies of the same stellar mass are smaller with increasing redshift, with their size inversely proportional to (1 + z). This shows that using R1 as a proxy for galaxy size is a valid option only at z ∼ 0. The dependence of galaxy size on redshift reveals the connection between the growth of the dark matter halo and the galaxy disc, as the former grows in size with the same redshift scaling (Navarro et al. 1997).
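The halo scaling invoked here can be made explicit with a short derivation. It is a sketch that uses the definition of R200 adopted in this paper (mean density 200 times the critical density) and assumes a matter-dominated expansion, so the (1 + z)^−1 scaling is only approximate at low redshift, where the cosmological constant contributes to H(z).

```latex
\rho_{\rm crit}(z) = \frac{3H^2(z)}{8\pi G},
\qquad
M_{200} = \frac{4}{3}\pi\,200\,\rho_{\rm crit}(z)\,R_{200}^3
\;\Rightarrow\;
R_{200} = \left[\frac{G\,M_{200}}{100\,H^2(z)}\right]^{1/3}

H^2(z) \simeq H_0^2\,\Omega_0\,(1+z)^3
\;\Rightarrow\;
R_{200} \propto M_{200}^{1/3}\,(1+z)^{-1}
```

At fixed halo mass, the halo radius therefore shrinks with redshift in the same way as the observed disc sizes quoted above.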
In a companion paper, Arjona-Gálvez et al. (2025) show that when using R1 as the definition of size, galaxies extracted from a variety of zoom-in simulations performed with different models for galaxy formation all share the same mass–size relation. In the simulations, the relation does not change with redshift; rather, the galaxies move along the relation as they grow in mass and size. The difference between this and the observations of Buitrago & Trujillo (2024) is that the stellar surface density threshold is kept fixed with redshift in the analysis of the simulations, whilst observations show that the stellar surface density at which the disc truncation is observed is larger at higher redshift. What the simulations have in common is that their star formation models all try to reproduce the observed (local) Kennicutt-Schmidt law and its surface density threshold for star formation (Kennicutt 1998; Martin & Kennicutt 2001; Bigiel et al. 2008), which is assumed to hold at any redshift. Combining this with a fixed stellar surface density may result in galaxies of the same mass having the same R1 at any redshift.
Given the close connection between the new definition of galaxy size and the physical process of in situ star formation, as well as the observation that the size thus defined scales with the expected evolution of dark matter halo size, in this paper we explore the correlation between galaxy size, R1, and dark matter halo size, R200, defined as the radius of the sphere with an average mass density 200 times the critical density of the Universe. We find that the simulations reproduce the observed relation between stellar mass and size remarkably well. Moreover, they predict that the relation between R1 and R200 is extremely tight so that when given R1, one can estimate R200 with an uncertainty of less than 50%. This is a factor of more than two better than the predicted relation between the effective radius and R200 (Kravtsov 2013) and the predicted stellar-to-halo mass relation for observed galaxies when the galaxy size is given by the radius containing 80% of the light (Mowla et al. 2019).
In this work, we analyse the two main cosmological hydrodynamical simulations of the EAGLE project (Schaye et al. 2015; Crain et al. 2015) for the stellar mass range ≈10^7.5–10^12 M⊙. We describe the simulation data and the analysis in the following section. In Section 3, we compare the results from the simulations with observations. In Section 4, we link the size of the galaxies to the size of their halo, showing that this correlation can predict the halo mass and size with a higher accuracy than when using the half-mass radius or Re. A discussion and our conclusions are presented in Section 5.
2. Analysis of the simulations
2.1. Simulations and simulation data
The simulations employed in this work were performed with the EAGLE model for galaxy formation and evolution. For more information on the EAGLE model and its calibration against global relations of the observed galaxy population (galaxy stellar mass function, stellar mass–size relation, and stellar mass–black hole mass relation), we refer to Schaye et al. (2015) and Crain et al. (2015). For more details on the numerical algorithms describing star formation, photo-ionisation equilibrium cooling, stellar evolution, stellar feedback, and black hole growth and feedback, we refer to Schaye & Dalla Vecchia (2008), Wiersma et al. (2009a), Wiersma et al. (2009b), Dalla Vecchia & Schaye (2012), and Rosas-Guevara et al. (2015), respectively. A description of the benefits of the hydrodynamic scheme used in the EAGLE model can be found in Schaller et al. (2015a) (see also Durier & Dalla Vecchia 2012). For the analysis presented here, the raw particle data at z = 0 and the public database were used (McAlpine et al. 2016; The EAGLE Team 2017).
The EAGLE model implements the sub-grid prescription for star formation of Schaye & Dalla Vecchia (2008). Briefly, the model converts the empirical Kennicutt-Schmidt law (Kennicutt 1998), which relates the star formation rate surface density to the gas surface density, into a volumetric law that expresses the star formation rate as a function of gas pressure. The density threshold above which the gas is able to form stars is assumed to vary with the metallicity of the gas (Schaye 2004), decreasing with increasing metallicity. Star formation in EAGLE therefore occurs only in gas at or above the threshold density set by its metallicity. Details on the numerical implementation of the metallicity dependence of the density threshold for star formation can be found in Schaye et al. (2015). We emphasise that the theoretical predictions of Schaye (2004) are used as a starting hypothesis for the definition of galaxy size by Trujillo et al. (2020).
The post-processing of the simulations was performed with the friends-of-friends algorithm (e.g. Einasto et al. 1984; Davis et al. 1985) and the SubFind algorithm (Springel et al. 2001; Dolag et al. 2009), which identify the overdense regions in the simulated cosmological volumes and their substructure. We further post-processed the z = 0 outputs of the EAGLE simulations to compute dust-attenuated stellar absorption spectra that we convolved with filter response curves and integrated in the wavelength bands of the Sloan Digital Sky Survey (Abazajian et al. 2003). We employed the EMILES stellar spectra library (Vazdekis et al. 2016) and the methodology described in Negri et al. (2022), where more details are given.
We analysed the z = 0 output of two EAGLE simulations. The reference EAGLE volume of (100 Mpc)^3 – the ‘reference’ simulation – initially contained 2 × 1504^3 dark matter and gas particles, had a physical spatial resolution of ϵ = 0.7 kpc, and had particle masses of mgas = 1.81 × 10^6 M⊙ and mdm = 9.70 × 10^6 M⊙ for gas and dark matter, respectively. The recalibrated EAGLE volume of (25 Mpc)^3 – the ‘recalibrated’ simulation – initially contained 2 × 752^3 dark matter and gas particles, had a physical spatial resolution of ϵ = 0.35 kpc, and had particle masses of mgas = 2.26 × 10^5 M⊙ and mdm = 1.21 × 10^6 M⊙. For the recalibrated simulation, the EAGLE galaxy formation model was recalibrated in order to match the same global relations at z = 0 reproduced within the larger volume (Crain et al. 2015). The recalibrated simulation was used to extend the analysis down to galaxy stellar masses of ≈2 × 10^7 M⊙ (about 1 dex smaller than in the reference simulation), after restricting the minimum stellar mass to about 100 stellar particles. Finally, the initial conditions for the aforementioned simulations were generated with the cosmological parameters inferred by the Planck Collaboration XVI (2014): Ω0 = 0.307, ΩΛ = 0.693, Ωb = 0.04825, h = 0.6777, σ8 = 0.8288, and ns = 0.9611.
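The quoted particle masses follow from the box size, the particle number, and the cosmological parameters. The sketch below is an illustrative consistency check, not code from the EAGLE project. It assumes that the box lengths are comoving Mpc (not Mpc/h) and uses the standard value of the critical density, 2.775 × 10^11 h^2 M⊙ Mpc^−3; the function name `particle_masses` is hypothetical.

```python
# Critical density today in units of h^2 Msun / Mpc^3 (standard value).
RHO_CRIT_H2 = 2.775e11


def particle_masses(box_mpc, n_per_side, omega_0=0.307, omega_b=0.04825, h=0.6777):
    """Return (m_gas, m_dm) in Msun for equal numbers of gas and dark matter particles.

    Assumes a cubic box of side `box_mpc` (comoving Mpc) filled at the mean
    density, with n_per_side**3 particles of each species.
    """
    rho_crit = RHO_CRIT_H2 * h**2  # Msun / Mpc^3
    critical_mass_per_particle = rho_crit * box_mpc**3 / n_per_side**3
    m_gas = omega_b * critical_mass_per_particle
    m_dm = (omega_0 - omega_b) * critical_mass_per_particle
    return m_gas, m_dm


for label, box, n in [("reference", 100.0, 1504), ("recalibrated", 25.0, 752)]:
    m_gas, m_dm = particle_masses(box, n)
    print(f"{label}: m_gas = {m_gas:.3g} Msun, m_dm = {m_dm:.3g} Msun")
```

Under these assumptions, the computed values agree with the masses quoted in the text to within about one per cent.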
2.2. Determination of galaxy size and stellar mass
We describe here how we measured the size of the simulated galaxies. The R1 radius has been defined as the projected distance from the centre of the galaxy to the location where the stellar surface density reaches 1 M⊙ pc^−2. We determined R1 by first rotating the galaxy face-on using the angular momentum vector of the stellar component, thus avoiding the uncertainties introduced by de-projecting the stellar distribution. Because of the relatively low spatial and mass resolution of the simulations, we did not calculate maps of the stellar mass surface density as usually done in observational work. Indeed, given the stellar particle mass of about 10^6 M⊙ for the reference simulation, at the threshold surface density of 1 M⊙ pc^−2, the particle number density would be about 1 kpc^−2, making any projected surface density very noisy. We therefore opted to use the stellar surface density profile derived by averaging the projected stellar mass in circular annuli. We set the centre of the galaxy at the position of the minimum of the gravitational potential of the halo, as provided by the sub-halo finder. This is a good estimate of the centre of the galaxy, as the offset of the stellar component with respect to the dark matter halo is less than the spatial resolution of the simulation for the majority of galaxies (Schaller et al. 2015b). R1 was then derived by interpolating the surface density profile at the position of the threshold surface density.
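The procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. It assumes the particle coordinates are already centred and rotated face-on, uses linearly spaced annuli, and interpolates the logarithm of the surface density at the first outward crossing of the threshold. The function name `r1_from_particles`, the bin width, and the synthetic exponential disc are all assumptions made for the example.

```python
import numpy as np


def r1_from_particles(x_kpc, y_kpc, mass_msun, threshold=1.0,
                      r_max_kpc=100.0, n_bins=200):
    """Radius (kpc) where the annulus-averaged stellar surface density first
    drops below `threshold` (Msun pc^-2). Returns nan if no crossing is found."""
    radius = np.hypot(x_kpc, y_kpc)
    edges = np.linspace(0.0, r_max_kpc, n_bins + 1)
    annulus_mass, _ = np.histogram(radius, bins=edges, weights=mass_msun)
    # Annulus areas converted from kpc^2 to pc^2.
    annulus_area_pc2 = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2) * 1.0e6
    sigma = annulus_mass / annulus_area_pc2
    centres = 0.5 * (edges[1:] + edges[:-1])

    below = np.nonzero(sigma < threshold)[0]
    if below.size == 0 or below[0] == 0:
        return float("nan")
    i = below[0]
    s_in, s_out = sigma[i - 1], sigma[i]
    if s_out > 0.0:
        # Linear interpolation of log(sigma) between the two bracketing annuli.
        frac = np.log(s_in / threshold) / np.log(s_in / s_out)
    else:
        # Empty outer annulus: fall back to linear interpolation of sigma.
        frac = (s_in - threshold) / (s_in - s_out)
    return float(centres[i - 1] + frac * (centres[i] - centres[i - 1]))


# Synthetic face-on exponential disc: sigma(R) = sigma0 * exp(-R / h).
rng = np.random.default_rng(0)
h_kpc, sigma0 = 3.0, 100.0            # scale length (kpc), central density (Msun pc^-2)
n_part = 200_000
total_mass = 2.0 * np.pi * sigma0 * 1.0e6 * h_kpc**2   # Msun
r = rng.gamma(shape=2.0, scale=h_kpc, size=n_part)     # p(R) ~ R exp(-R/h)
phi = rng.uniform(0.0, 2.0 * np.pi, size=n_part)
mass = np.full(n_part, total_mass / n_part)

r1 = r1_from_particles(r * np.cos(phi), r * np.sin(phi), mass)
r1_expected = h_kpc * np.log(sigma0 / 1.0)
print(f"R1 measured = {r1:.2f} kpc, analytic = {r1_expected:.2f} kpc")
```

For the synthetic disc the analytic answer is h ln(Σ0 / 1 M⊙ pc^−2), which gives a direct check on the binning and interpolation. With real simulation data the choice of bin width trades noise against radial resolution, as the paragraph above notes.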
We determined the galaxy stellar mass as the projected stellar mass within a radius of constant surface brightness. This was done in order to consistently include the same bias introduced by observers in determining the relation. Indeed, the measurements of R1 and M★ in Trujillo et al. (2020) are not directly linked. R1 was derived by converting the surface brightness into stellar mass surface density, while M★ was derived by integrating the stellar mass within the radius at which the surface brightness reaches the threshold of 29 mag arcsec^−2 in the g band, R29, g. This corresponds to a stellar surface density threshold lower than 1 M⊙ pc^−2. In order to calculate the position of the threshold in surface brightness, we converted the stellar mass surface density into surface brightness by inverting Equation (1) of Bakos et al. (2008):
$$\mu_\lambda = m_{\mathrm{abs},\odot,\lambda} + 21.572 - 2.5\,\log_{10}\!\left[\frac{\Sigma_\star}{(M/L)_\lambda}\right], \qquad (1)$$
where Σ★ is the stellar surface density in M⊙ pc^−2, M/L is the mass-to-light ratio at the wavelength λ, and mabs, ⊙, λ is the absolute magnitude of the Sun at the same wavelength. We calculated M/L with the Sloan Digital Sky Survey g band magnitude that we computed in the post-processing of the simulations for each stellar particle. From here on, we denote the projected stellar mass within R29, g as M★.
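The conversion between stellar surface density and surface brightness can be sketched as follows. The sketch assumes the commonly quoted form of the Bakos et al. (2008) relation, log10 Σ★ = log10(M/L) − 0.4 (μ − mabs,⊙) + 8.629, with Σ★ in M⊙ pc^−2 and μ in mag arcsec^−2. It also assumes an AB absolute magnitude of the Sun of 5.11 in the g band, a value not given in the text. The function names are hypothetical.

```python
import math

M_SUN_G = 5.11  # assumed AB absolute magnitude of the Sun in the g band


def surface_brightness(sigma_star, mass_to_light, m_sun=M_SUN_G):
    """Surface brightness (mag arcsec^-2) for a stellar surface density
    sigma_star (Msun pc^-2) and a mass-to-light ratio in solar units."""
    return m_sun + 2.5 * (8.629 + math.log10(mass_to_light) - math.log10(sigma_star))


def surface_density(mu, mass_to_light, m_sun=M_SUN_G):
    """Inverse conversion: stellar surface density (Msun pc^-2) at surface
    brightness mu (mag arcsec^-2)."""
    return 10.0 ** (math.log10(mass_to_light) - 0.4 * (mu - m_sun) + 8.629)


# Surface density reached at 29 mag arcsec^-2 for an illustrative M/L of 1.
print(f"{surface_density(29.0, 1.0):.3f} Msun pc^-2")
```

For mass-to-light ratios of order unity, the 29 mag arcsec^−2 isophote falls well below 1 M⊙ pc^−2, consistent with the statement that the mass is integrated out to a lower surface density than the one defining R1.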
For each galaxy, we also calculated the stellar half-mass radius, R50, as the radius containing half the projected stellar mass. This radius is a good proxy for Re. The computation of R1, R50, and R29, g was performed for all simulated galaxies with a mass larger than 100 initial gas particle masses (2.26 × 10^7 and 1.81 × 10^8 M⊙ for the recalibrated and reference simulations, respectively). However, the algorithm failed to converge for a fraction of galaxies that are barely resolved or have a disturbed morphology. We marked them and did not use them in the analysis. The final selection of the samples of galaxies used in the analysis is described in Appendix A.
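A projected half-mass radius of the kind described above can be obtained from the cumulative mass profile. The following is a minimal sketch that assumes centred, projected particle coordinates and takes the total to be the sum of all particles passed in. The function name is hypothetical and the aperture used by the authors to define the total mass is not specified here.

```python
import numpy as np


def projected_half_mass_radius(x, y, mass):
    """Radius enclosing half of the projected mass of the given particles."""
    x, y, mass = (np.asarray(a, dtype=float) for a in (x, y, mass))
    radius = np.hypot(x, y)
    order = np.argsort(radius)
    cumulative = np.cumsum(mass[order])
    # Interpolate the sorted radii at half of the total mass.
    return float(np.interp(0.5 * cumulative[-1], cumulative, radius[order]))


# Four equal-mass particles at radii 1, 2, 3, 4: half the mass is reached at r = 2.
print(projected_half_mass_radius([1, 2, 3, 4], [0, 0, 0, 0], [1, 1, 1, 1]))
```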
We show in Figure 1 one example of a simulated spiral galaxy with a stellar mass of 3.8 × 10^10 M⊙. From left to right, we show the radial profiles for the projected stellar surface density, the surface brightness, and the star formation surface density. The insets are the corresponding maps and have an arbitrary logarithmic colour scale. The circles in the maps and the vertical lines in the profile plots indicate the values of the half-mass radius, R50 (dashed line); R29, g (dash-dotted line); and the galaxy size, R1 (solid line). For the galaxy shown in Figure 1, R1 is an excellent proxy for measuring the edge of the star forming disc and thus a good approximation of the physically motivated definition of galaxy size. The stellar disc truncation is close to R1, and both the gas surface density and the star formation surface density decline steeply beyond R1. If the galaxy ceased to form stars for the rest of its evolution and received no contribution to its stellar mass from mergers, the edge of the in situ star formation would still be marked by R1.
Fig. 1. From left to right, projected density maps and radial profiles of stellar mass, stellar luminosity, and gas star formation rate for a simulated spiral galaxy of stellar mass 3.8 × 10^10 M⊙ selected from the recalibrated simulation. The vertical dashed, solid, and dot-dashed lines mark the positions of R50, R1, and R29, g, respectively. For this star forming galaxy, R1 is a better measure of the size of the galaxy because of its proximity to the edge (truncation radius) of the disc. The edge of the disc coincides with a net drop in the star formation rate surface density, the starting hypothesis for the new definition of galaxy size.
3. Results
The results of the study of the galaxy mass–size relation in simulations are presented here. We show in Figure 2 the correlation between R1 and galaxy stellar mass for all galaxies in the sample, central and satellite. The sample was selected as described in Appendix A. We emphasise that even when using the same definition of R1, the analysis of the simulated data differs substantially from that of observational data. The large dots in the figure are the median values of R1 measured as described in Section 2.2, while the small dots are individual galaxies in stellar mass bins that contain fewer than ten galaxies. The shaded area represents the 16% and 84% quantiles in each bin. The simulated relation follows the trend of observations, with steeper slopes for stellar masses below 10^8.5 M⊙ and above 10^10.5 M⊙ than in the mass range in between. As mentioned in the introduction, Trujillo et al. (2020) explain that this is due to the fixed surface density threshold employed in their analysis. The same trend is shown in the figure of Appendix B, where we show the calculated relation for the samples of central and satellite galaxies.
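The binned statistics used for the figure (median, 16% and 84% quantiles, and a minimum of ten galaxies per bin) can be sketched as below. This is an illustration on synthetic data rather than the authors' code. The function name `binned_median`, the bin edges, and the mock relation with a slope of 1/3 and a Gaussian scatter of 0.06 dex are assumptions made for the example.

```python
import numpy as np


def binned_median(log_mass, log_size, edges, min_count=10):
    """Return rows of (bin centre, median, 16% quantile, 84% quantile) for bins
    holding at least `min_count` galaxies; sparser bins are skipped so that
    their galaxies can be plotted individually."""
    which_bin = np.digitize(log_mass, edges) - 1
    rows = []
    for i in range(len(edges) - 1):
        values = log_size[which_bin == i]
        if values.size < min_count:
            continue
        q16, q50, q84 = np.percentile(values, [16, 50, 84])
        rows.append((0.5 * (edges[i] + edges[i + 1]), q50, q16, q84))
    return np.array(rows)


# Mock sample: log10 R1 = (1/3) log10 M* + Gaussian scatter of 0.06 dex.
rng = np.random.default_rng(1)
log_mass = rng.uniform(9.0, 10.0, size=20_000)
log_size = log_mass / 3.0 + rng.normal(0.0, 0.06, size=log_mass.size)

table = binned_median(log_mass, log_size, np.linspace(9.0, 10.0, 6))
half_width = 0.5 * (table[:, 3] - table[:, 2])
print(np.round(half_width, 3))
```

Half the distance between the 84% and 16% quantiles is a robust estimate of the scatter. In the mock data it comes out slightly above the input value because the slope of the relation adds a small spread within each bin.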
Fig. 2. Correlation between R1 and stellar mass for all galaxies in the sample (orange symbols). The median values in each bin are plotted together with the 16% and 84% quantiles (shaded area). For bins with fewer than ten galaxies, single galaxies are plotted as small dots. The orange solid line depicts the double power-law fit for M★ > 10^8.6 M⊙ (extrapolated below that mass). We show the relation between R50 and stellar mass for all galaxies in the sample in purple. For visual comparison, the galaxies in the sample of Trujillo et al. (2020) are represented with small dots, for both R1 and R50. The distribution of the residuals around the best-fit linear relation for all galaxies is shown in the inset. The distribution has been fitted with a Gaussian (solid line) with a dispersion of σΔlog10R1 = 0.06 dex.
For comparison, in Figure 2 we also plot the median distribution of projected stellar half-mass radii, R50, with its 16% and 84% quantiles (purple solid line and shaded area). The scatter around the median is more than double that for R1. Moreover, the correlation between R50 and stellar mass is very weak in the mass range 10^8.5 < M★/M⊙ < 10^10.5, where the mass of galaxies with similar R50 ≈ 2–3 kpc spans over two orders of magnitude. The weak correlation and its large scatter provide a relation with little information on the properties of galaxies. However, for a given stellar mass, the scatter around R50 can provide insight into how gas has been redistributed by feedback processes within a star forming galaxy (e.g. Crain et al. 2015; Rohr et al. 2022).
In order to provide an analytic fit to the relation, we chose the extended double power-law function defined as follows:
with free parameters (A, xs, α, β, s). Parameter A sets the normalisation, xs is the scale at which the slope transitions between the asymptotic values α and β, and s is the strength of this transition. Here, we assumed (x, y, xs)≡(M★, R1, Ms). We then fit the median values for M★ > 10^8.5 M⊙, excluding the low-mass end drop of R1 with decreasing mass. The χ-square minimisation gives the following set of parameters for the best-fit function: (A, Ms, α, β, s)≈(22.8 kpc, 3.21 × 10^10 M⊙, 0.35, 0.60, 4.25). The asymptotic slopes, α = 0.35 and β = 0.60, are remarkably close to the observed values given by Trujillo et al. (2020) (and the high-mass slope given by Mowla et al. 2019).
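To make the role of the five parameters concrete, the sketch below evaluates a smoothly broken double power law with the best-fit values quoted above. The functional form, y = A (x/xs)^α [1 + (x/xs)^s]^((β − α)/s), is an assumption: it is one common parametrisation with asymptotic slopes α and β and transition sharpness s, and it may differ from the expression used in the paper, for instance in its normalisation. The helper names are hypothetical.

```python
import numpy as np


def double_power_law(x, amplitude, x_s, alpha, beta, s):
    """Assumed form: slope alpha for x << x_s, slope beta for x >> x_s,
    with a transition that sharpens as s increases."""
    u = np.asarray(x, dtype=float) / x_s
    return amplitude * u**alpha * (1.0 + u**s) ** ((beta - alpha) / s)


def local_slope(x, *params, eps=1e-3):
    """Numerical logarithmic slope d ln y / d ln x at x."""
    lo = double_power_law(x * (1.0 - eps), *params)
    hi = double_power_law(x * (1.0 + eps), *params)
    return float((np.log(hi) - np.log(lo)) / (np.log(1.0 + eps) - np.log(1.0 - eps)))


# Best-fit values quoted in the text for the R1 - M* relation.
params = (22.8, 3.21e10, 0.35, 0.60, 4.25)   # A [kpc], Ms [Msun], alpha, beta, s

for mass in (1e9, 3.21e10, 1e12):
    r1 = float(double_power_law(mass, *params))
    print(f"M* = {mass:.2e} Msun: R1 = {r1:.1f} kpc, slope = {local_slope(mass, *params):.2f}")
```

With this form the logarithmic slope equals (α + β)/2 exactly at the scale mass, which gives a quick way to read off where the transition sits.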
In Appendix B, we provide similar fits for central and satellite galaxies and a table with the double power-law fit parameters and their errors for the three sub-samples of galaxies. As a check of the quality of the fit to the binned values, we also calculated the best-fit linear function to individual galaxies within the mass range 10^8.5 ≤ M★/M⊙ ≤ 10^10.5 (dashed line), which yielded the correlation
which successfully matches the asymptotic slope of the fit to the median.
The slope of the size–mass relation increases above the mass Ms ≈ 10^10.5 M⊙. In several numerical models, this is a characteristic mass that marks the transition between supernova and black hole feedback dominance (e.g. Bower et al. 2017; McAlpine et al. 2017; Pillepich et al. 2018). Observations show a similar characteristic mass. Trujillo et al. (2020) give a value close to 10^10.8 M⊙, whilst Mowla et al. (2019) quote a mass of 10^10.2 M⊙. In both cases, the characteristic stellar mass marks the morphological transition from disc to elliptical galaxies, which is, from theoretical studies, closely connected to the transition between stellar and black hole feedback and to the transition in the process of stellar mass growth from gas accretion to dry mergers with increasing halo mass.
Finally, we show in the inset in Figure 2 the distribution of residuals around the best-fit stellar mass–size relation and in the same mass range used for the calculation of the best-fit analytic function for the full sample of galaxies. The distribution is well represented by a Gaussian with a central value of ⟨Δlog10R1⟩ = 0.00 dex and a dispersion of σΔlog10R1 = 0.06 dex, in agreement with the intrinsic scatter measured by Trujillo et al. (2020). We show in the next section that this very small scatter yields a tighter correlation between galaxy and halo size than what was inferred by Kravtsov (2013) and a smaller scatter in the stellar-to-halo mass relation than that observed by Mowla et al. (2019) with their definition of galaxy size.
4. Correlation between galaxy and halo sizes
In this section, we present the main result of this work, which is the study of the correlation between galaxy and halo sizes. We show in Figure 3 the relation between R1 and R200. The top horizontal axis gives the mass of the halo, M200, proportional to R200^3. By definition, M200 and R200 are the mass and radius of a sphere centred on the potential minimum and with an average density 200 times the critical density of the Universe: Mδ = (4/3)πδρcritRδ^3, with δ = 200.
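The definition above gives a one-to-one mapping between halo radius and halo mass at z = 0. A minimal sketch of that mapping follows. It assumes the standard critical density of 2.775 × 10^11 h^2 M⊙ Mpc^−3 and the value h = 0.6777 adopted for the simulations; the function names are hypothetical.

```python
import math

RHO_CRIT_H2 = 2.775e11  # critical density at z = 0 in h^2 Msun Mpc^-3


def m200_from_r200(r200_kpc, h=0.6777, delta=200.0):
    """Halo mass (Msun) enclosed by a sphere of radius r200_kpc (kpc)."""
    rho_crit = RHO_CRIT_H2 * h**2 / 1.0e9  # Msun kpc^-3
    return 4.0 / 3.0 * math.pi * delta * rho_crit * r200_kpc**3


def r200_from_m200(m200_msun, h=0.6777, delta=200.0):
    """Halo radius (kpc) for a halo mass m200_msun (Msun)."""
    rho_crit = RHO_CRIT_H2 * h**2 / 1.0e9  # Msun kpc^-3
    return (3.0 * m200_msun / (4.0 * math.pi * delta * rho_crit)) ** (1.0 / 3.0)


print(f"log10 M200 = {math.log10(m200_from_r200(175.8)):.2f}")
```

Under these assumptions, a radius of 175.8 kpc, the scale radius of the fit discussed in this section, maps to a halo mass of roughly 10^11.8 M⊙.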
Fig. 3. Correlation between R1 and R200 for central galaxies. Median (dots and solid lines) and 16% and 84% quantiles (shaded areas) are shown. We quote in the legend the average of the scatter for the entire sample of central galaxies and with respect to the analytic fit as well as the dispersion around the mean. The purple line and shaded area are the relation between R50 and R200 and its 16% and 84% quantiles. The dotted line is the linear correlation proposed by Kravtsov (2013) for early-type galaxies, R50 = 0.015 R200, with a scatter of 0.2 dex (shaded area). The top x-axis gives the corresponding halo mass, M200.
We restricted the sample to only central galaxies because the calculation of the mass and radius of their haloes is more reliable. The data distribution is shown as the median in equally spaced logarithmic bins (orange dots), with the orange shaded area delimiting the 16% and 84% quantiles. The correlation between R1 and R200 can be described by a double power-law function. We employed the extended double power-law function of Equation (2), with (x, y, xs)≡(R200, R1, Rs), whose χ-square minimisation fit provides the parameter values (A, Rs, α, β, s)≈(16.4 kpc, 175.8 kpc, 2.04, 1.25, 15.1). The scale radius, Rs, corresponds to a halo mass of about 10^11.8 M⊙. All fitted parameters with errors are given in Table B.2. Given the tight correlation between R1 and M★, the above correlation can be interpreted as the stellar-to-halo mass relation (see Arjona-Gálvez et al. 2025, for more details, and later in this section for the comparison with observations).
For comparison, in Figure 3 we also plot the relation between R50 and R200 (purple line and shaded area) and the corresponding linear relation of Kravtsov (2013) (dotted line). The R50 = 0.015 R200 relation was derived using a large sample of observed galaxies with stellar masses between 10^5 and 10^12 M⊙, and it qualitatively agrees with the EAGLE simulation data. The scatter measured by Kravtsov (2013), ≈0.2 dex, is twice that of the simulated galaxies, σΔlog10R1 = 0.1 dex.
Kravtsov (2013) assumed that the relation between galaxy and halo sizes is linear, with a well-defined proportionality for all galaxies and their haloes (see also Somerville et al. 2018, for a similar hypothesis). The reasoning is that simulations show a very tight distribution of the spin parameter, λ, of dark matter haloes at any scale (e.g. Peebles 1969; Bullock et al. 2001), and if the halo angular momentum dictates the size of the galaxy’s disc, there should be a well-defined proportionality between the distribution of stars in the disc and the size of its dark matter halo. At a fixed stellar mass, variations of the stellar half-mass radius should reflect the dispersion around the average value of λ. Although we found some indication that the scatter in the mass–size relation may correlate with λ at a fixed stellar mass (not shown in this work), this correlation is weak.
The R1 − R200 relation asymptotically approaches the slope β ≈ 1.25 for halo masses larger than ≈10^11.5 M⊙, which is steeper than what is inferred for observed galaxies. We conclude that given the small scatter and steep slope of the R1 − R200 relation, the physical size of observed galaxies can be used to infer the size (and mass) of their host haloes with good accuracy. For a galaxy such as the Milky Way, using R1 to infer R200 would be six times more precise than using R50.
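Why a steep slope helps can be seen from a simple error-propagation argument: inverting a power law y ∝ x^β with a scatter σ_y (in dex) at fixed x gives a scatter of roughly σ_y/β in the inferred x. The sketch below applies this to the numbers quoted in the text. It is an illustration under the assumption of a locally pure power law with Gaussian scatter, and it is not necessarily how the uncertainties quoted in this paper were derived. The function names are hypothetical.

```python
def inverted_scatter_dex(scatter_y_dex, slope):
    """Scatter (dex) in the inferred x when inverting y ~ x**slope."""
    return scatter_y_dex / abs(slope)


def fractional_uncertainty(scatter_dex):
    """One-sigma fractional uncertainty corresponding to a scatter in dex."""
    return 10.0**scatter_dex - 1.0


# Scatter of 0.1 dex in R1 at fixed R200, high-mass slope beta = 1.25.
sigma_r200 = inverted_scatter_dex(0.1, 1.25)
sigma_m200 = 3.0 * sigma_r200   # M200 is proportional to R200**3
print(f"R200: {sigma_r200:.2f} dex ({100 * fractional_uncertainty(sigma_r200):.0f}%)")
print(f"M200: {sigma_m200:.2f} dex")
```

The same argument shows why a shallow relation with large scatter, such as the one based on R50, constrains the halo size far less tightly.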
As a final exercise, we analysed the stellar-to-halo mass relation, shown in Figure 4. We computed the best-fit parameters for the double power-law function of Equation (2), with (x, y, xs)≡(M★, M200, Ms), which gives the values (A, Ms, α, β, s)≈(5.8 × 10^11 M⊙, 3.13 × 10^10 M⊙, 0.41, 1.69, 0.98), confirming the change of slope at Ms ≈ 10^10.5 M⊙. The residual distribution is shown in the inset of Figure 4, with a calculated dispersion around the mean of 0.11 dex, which is half the value reported by Mowla et al. (2019).
Fig. 4. Correlation between M200 and M★ for central galaxies. Median (dots and solid lines) and 16% and 84% quantiles (shaded areas) are shown. We quote in the legend the average of the scatter for the entire sample of central galaxies and with respect to the analytic fit, together with the dispersion around the mean. We plot in the inset the distribution of residuals for the analytic fitting function and all the galaxies in the sample.
5. Discussion
The definition of galaxy size of Trujillo et al. (2020) is physically motivated by the theoretical prediction that the onset of star formation requires specific conditions for the gas to collapse into star forming clouds (Schaye 2004). Accretion of gas into the galaxy and its subsequent cooling provide the material for the galaxy to grow both in stellar mass and size through star formation. Therefore, the physical edge of a star forming galaxy can be defined by the extension of its in situ star formation activity. The threshold gas surface density for star formation given by Schaye (2004) can be converted into a stellar surface density that, even if the galaxy stops forming stars, marks the maximum extension of its in situ stellar mass buildup.
We have shown in this work that the observed galaxy mass–size relation as defined by Trujillo et al. (2020) holds in cosmological hydrodynamic simulations of the formation of galaxies. The remarkable agreement between simulations and observations means that galaxy size can be used to infer the size and mass of dark matter haloes with an uncertainty of less than 50%. The advantage of this definition of size is that the mass–size relation not only has reduced scatter but is also steeper for intermediate-mass galaxies when compared to other empirical definitions of size, such as R50. This is what makes it a powerful (around six times better than R50) predictor of galaxy and halo mass.
We stress that R1 is only a proxy of the physically motivated definition of size, calibrated for late-type galaxies of mass similar to that of the Milky Way. On the other hand, as shown by Trujillo et al. (2020), Arjona-Gálvez et al. (2025), and this work, R1 provides a consistently tight relation between size and stellar mass over many orders of magnitude in mass. When applied to early-type galaxies, where the extent of in situ star formation has been modified by mergers, the threshold that defines the size should be between 3 and 10 M⊙ pc^−2 (Chamba et al. 2022), suggesting that in situ star formation has happened at higher gas densities and higher redshift.
In a companion article, Arjona-Gálvez et al. (2025) have shown that simulations performed with different galaxy formation models all reproduce the same mass–size relation when R1 defines the galaxy size. How the interplay of gas accretion, star formation, and stellar feedback yields the same result for a variety of numerical models is not trivial to understand. What all models have in common is that they are trying to reproduce the Kennicutt-Schmidt law and its threshold for star formation. Moreover, it seems that, independently of the overall efficiency of star formation, galaxies of the same mass end up having the same size, even though they live in different haloes (see their Fig. 8, top panel).
The finding of Arjona-Gálvez et al. (2025) highlights a caveat for the prediction of the halo size. If the numerical model was not tuned to abundance matching models (e.g. Moster et al. 2013; Girelli et al. 2020) and galaxies were under- or over-massive with respect to the halo in which they were formed, the prediction of the halo size would suffer systematic errors. It is, however, worth mentioning that empirical abundance matching models do not agree with one another for intermediate- and low-mass galaxies, and the systematic difference would also depend on which model is adopted. On the other hand, the difference in halo mass at fixed stellar mass is less than a factor of two between abundance matching models, and slightly larger between different numerical models.
The question that remains open is why (simulated and observed) galaxies with stellar masses spanning many orders of magnitude and different morphologies follow such a tight mass–size relation. Sánchez Almeida (2020) investigated how the scatter in the mass–size relation significantly decreases when the size is measured at a fixed surface density, showing that galaxies with the same stellar mass always share at least one radius with identical surface density. Miller et al. (2019) hinted at this when defining the size of galaxies as the radius containing a given fraction of the stellar light, and they showed that when increasing the fraction of light, star forming and quiescent galaxies tend to move closer in the mass–size plane (see their Fig. 2). None of these explanations is physical, and we will investigate this question further in the future.
6. Conclusions
With this work we have explored the relation between galaxy and halo size by applying the physically motivated definition of galaxy size of Trujillo et al. (2020) to simulated galaxies. We leveraged the public data of the EAGLE simulation project, namely, the galaxies produced in two simulated cosmological volumes of different resolutions, but calibrated to give similar global relations at z = 0. In the following, we provide a summary of our conclusions:
- We show that the observed relation between galaxy stellar mass and size, as defined by Trujillo et al. (2020), holds in cosmological, hydrodynamical simulations of galaxy formation over five orders of magnitude in stellar mass. The remarkable agreement between the simulations and observations can be used to precisely infer the galaxy stellar mass from its observed physical size (Mowla et al. 2019; Sánchez Almeida 2020). We provide a fit to the distribution for galaxies more massive than 10^8.5 M⊙ (Figure 2 and Table B.1). We recovered the same scatter found by Trujillo et al. (2020).
- We propose the theoretical prediction that the size (R200) and mass (M200) of the dark matter halo hosting the galaxy can be accurately inferred by knowing the size of its central galaxy. Figure 3 and Table B.2 quantify this prediction. The small scatter in the galaxy mass–size relation is a factor of two smaller than the observed scatter in the galaxy-to-halo size relation, whereas the slope of the relation greatly decreases the range of uncertainty when inferring the stellar mass of intermediate-mass galaxies. These results confirm the superiority of this galaxy size definition by nearly a factor of six when compared to observations using the half-mass radius (Kravtsov 2013).
The origin of the relation and its connection to gas accretion, star formation, and stellar feedback is still poorly understood. More theoretical work will be dedicated to it in the near future.
Trujillo et al. (2020) did not discriminate between central and satellite galaxies.
We refer the reader interested in the low-mass regime to Arjona-Gálvez et al. (2025) for a similar study employing high-resolution simulations.
Acknowledgments
This work has been supported by the Spanish Ministry of Science, Innovation and Universities (Ministerio de Ciencia, Innovación y Universidades, MICIU) through the research grants PID2021-122603NB-C22 and PID2022-140869NB-I00. CDV also acknowledges support from MICIU in the early stages of development of this work through grants RYC-2015-18078 and PGC2018-094975-C22. This research made use of computing time on the high-performance computing systems DEIMOS and DIVA of the Instituto de Astrofísica de Canarias. IT acknowledges support from: the ACIISI, Consejería de Economía, Conocimiento y Empleo del Gobierno de Canarias and the European Regional Development Fund (ERDF) under grant PROID2021010044; the IAC project P/302302, financed by the Ministry of Science and Innovation, through the State Budget, and by the Canary Islands Department of Economy, Knowledge, and Employment, through the Regional Budget of the Autonomous Community. This research also acknowledges support from the European Union through the following grants: UNDARK and ‘Excellence in Galaxies-Twinning the IAC’ of the EU Horizon Europe Widening Actions programmes (project numbers 101159929 and 101158446). Funding for this work/research was provided by the European Union MSCA EDUCADO (GA 101119830). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or European Research Executive Agency (REA). Neither the European Union nor the granting authority can be held responsible for them. The following Python packages have been used for this research: H5PY (https://www.h5py.org), NUMPY (https://numpy.org) (Harris et al. 2020), SCIPY (https://scipy.org) (Virtanen et al. 2020), MATPLOTLIB (https://matplotlib.org/) (Hunter 2007).
References
- Abazajian, K., Adelman-McCarthy, J. K., Agüeros, M. A., et al. 2003, AJ, 126, 2081
- Arjona-Gálvez, E., Cardona-Barrero, S., Grand, R. J. J., et al. 2025, A&A, 699, A301
- Bakos, J., Trujillo, I., & Pohlen, M. 2008, ApJ, 683, L103
- Bigiel, F., Leroy, A., Walter, F., et al. 2008, AJ, 136, 2846
- Bower, R. G., Schaye, J., Frenk, C. S., et al. 2017, MNRAS, 465, 32
- Broeils, A. H., & Rhee, M. H. 1997, A&A, 324, 877
- Buitrago, F., & Trujillo, I. 2024, A&A, 682, A110
- Bullock, J. S., Dekel, A., Kolatt, T. S., et al. 2001, ApJ, 555, 240
- Chamba, N., Trujillo, I., & Knapen, J. H. 2022, A&A, 667, A87
- Crain, R. A., Schaye, J., Bower, R. G., et al. 2015, MNRAS, 450, 1937
- Dalla Vecchia, C., & Schaye, J. 2012, MNRAS, 426, 140
- Davis, M., Efstathiou, G., Frenk, C. S., & White, S. D. M. 1985, ApJ, 292, 371
- Dolag, K., Borgani, S., Murante, G., & Springel, V. 2009, MNRAS, 399, 497
- Durier, F., & Dalla Vecchia, C. 2012, MNRAS, 419, 465
- Einasto, J., Klypin, A. A., Saar, E., & Shandarin, S. F. 1984, MNRAS, 206, 529
- Girelli, G., Pozzetti, L., Bolzonella, M., et al. 2020, A&A, 634, A135
- Golini, G., Trujillo, I., Zaritsky, D., et al. 2025, A&A, 700, A91
- Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357
- Holmberg, E. 1958, Meddelanden fran Lunds Astronomiska Observatorium Serie II, 136, 1
- Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90
- Kennicutt, R. C. 1998, ApJ, 498, 541
- Kravtsov, A. V. 2013, ApJ, 764, L31
- Martin, C. L., & Kennicutt, R. C. 2001, ApJ, 555, 301
- McAlpine, S., Helly, J. C., Schaller, M., et al. 2016, Astron. Comput., 15, 72
- McAlpine, S., Bower, R. G., Harrison, C. M., et al. 2017, MNRAS, 468, 3395
- Miller, T. B., van Dokkum, P., Mowla, L., & van der Wel, A. 2019, ApJ, 872, L14
- Moster, B. P., Naab, T., & White, S. D. M. 2013, MNRAS, 428, 3121
- Mowla, L., van der Wel, A., van Dokkum, P., & Miller, T. B. 2019, ApJ, 872, L13
- Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493
- Negri, A., Dalla Vecchia, C., Aguerri, J. A. L., & Bahé, Y. 2022, MNRAS, 515, 2121
- Peebles, P. J. E. 1969, ApJ, 155, 393
- Pillepich, A., Nelson, D., Hernquist, L., et al. 2018, MNRAS, 475, 648
- Planck Collaboration XVI. 2014, A&A, 571, A16
- Rohr, E., Feldmann, R., Bullock, J. S., et al. 2022, MNRAS, 510, 3967
- Rosas-Guevara, Y. M., Bower, R. G., Schaye, J., et al. 2015, MNRAS, 454, 1038
- Sánchez Almeida, J. 2020, MNRAS, 495, 78
- Schaller, M., Dalla Vecchia, C., Schaye, J., et al. 2015a, MNRAS, 454, 2277
- Schaller, M., Robertson, A., Massey, R., Bower, R. G., & Eke, V. R. 2015b, MNRAS, 453, L58
- Schaye, J. 2004, ApJ, 609, 667
- Schaye, J., & Dalla Vecchia, C. 2008, MNRAS, 383, 1210
- Schaye, J., Crain, R. A., Bower, R. G., et al. 2015, MNRAS, 446, 521
- Sérsic, J. L. 1968, Atlas de Galaxias Australes (Cordoba, Argentina: Observatorio Astronomico)
- Somerville, R. S., Behroozi, P., Pandya, V., et al. 2018, MNRAS, 473, 2714
- Springel, V., White, S. D. M., Tormen, G., & Kauffmann, G. 2001, MNRAS, 328, 726
- The EAGLE Team 2017, arXiv e-prints [arXiv:1706.09899]
- Trujillo, I., Chamba, N., & Knapen, J. H. 2020, MNRAS, 493, 87
- Vazdekis, A., Koleva, M., Ricciardelli, E., Röck, B., & Falcón-Barroso, J. 2016, MNRAS, 463, 3409
- Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nat. Meth., 17, 261
- Wang, J., Koribalski, B. S., Serra, P., et al. 2016, MNRAS, 460, 2143
- Wiersma, R. P. C., Schaye, J., & Smith, B. D. 2009a, MNRAS, 393, 99
- Wiersma, R. P. C., Schaye, J., Theuns, T., Dalla Vecchia, C., & Tornatore, L. 2009b, MNRAS, 399, 574
Appendix A: Selection of the sample
We show in figure A.1 the distribution of processed galaxies in the (M★, R1) (left panel, all galaxies), (M200, M★) (middle panel, only central galaxies) and (M200, R1) (right panel, only central galaxies) planes. The median values in bins of halo mass are shown as dots and diamonds for the low- and high-resolution simulations, respectively. The shaded areas indicate the 16% and 84% quantiles. Smaller dots and diamonds are for individual galaxies that are in bins with fewer than ten galaxies. The two simulations overlap remarkably well over more than two orders of magnitude in both stellar mass and halo mass. For the computation of the R1 − M★ relation, we selected galaxies with M★ > 10^7.5 M⊙ from the recalibrated simulation, and with M★ > 8 × 10^7.5 M⊙ from the reference simulation, according to their different mass resolution. These mass thresholds are depicted with vertical dashed lines in the left panel. At the low-mass ends, the relations with M200 (middle and right panels) flatten when reaching the resolution limit. For the computation of the R1 − R200 relation, we therefore considered central galaxies in haloes with mass above Mthr = 10^10.3 M⊙ and 8 × Mthr for the recalibrated and reference simulations, respectively, according to their particle mass resolution. This choice is adequate for both the M★ − M200 and R1 − M200 relations.
Fig. A.1. Left panel. Relation between size and stellar mass for all galaxies in the two simulations. Dots and diamonds are the median values in bins of halo mass, and the shaded areas indicate the 16% and 84% quantiles. Single dots are for individual galaxies, in bins containing fewer than ten galaxies. The vertical lines mark the minimum masses considered for the two simulations in the computation of the analytic fits. Middle panel. Relation between stellar mass, M★, and halo mass, M200, for all central galaxies in the two simulations. Symbols are as in the left panel. The vertical lines mark the lower halo mass limit of the samples extracted from the two simulations and employed in the computation of the R1 − R200 relation. Right panel. Same as the middle panel, but for the R1 − M200 relation.
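The binned statistics described above (medians with 16% and 84% quantiles, and individual galaxies in bins with fewer than ten members) can be sketched as follows. This is a minimal NumPy illustration, not the code used for this work; the function name, the choice of bin edges, and the NaN convention for sparsely populated bins are assumptions.

```python
import numpy as np


def binned_quantiles(x, y, edges, min_count=10):
    """Return the 16th, 50th and 84th percentiles of y in bins of x.

    Bins with fewer than min_count points are left as NaN, so that their
    members can be plotted individually instead (hypothetical convention).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.asarray(edges, dtype=float)
    nbins = len(edges) - 1
    # np.digitize returns 1-based bin indices; values outside the edges
    # end up with index -1 or nbins after the shift and are ignored.
    idx = np.digitize(x, edges) - 1
    quantiles = np.full((nbins, 3), np.nan)
    counts = np.zeros(nbins, dtype=int)
    for i in range(nbins):
        in_bin = idx == i
        counts[i] = in_bin.sum()
        if counts[i] >= min_count:
            quantiles[i] = np.percentile(y[in_bin], [16, 50, 84])
    return quantiles, counts
```

A mass selection such as the thresholds quoted above would be applied as a boolean mask on the input arrays before binning.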
Appendix B: Mass–size relation for central and satellite galaxies
We split the sample into central and satellite galaxies, and repeated the analysis of section 3. We show in figure B.1 the stellar mass–size relations for central (left panel) and satellite (right panel) galaxies. The figure is similar to figure 2, where median values are fitted with a double power-law (equation 2) for stellar masses M★ > 10^8.5 M⊙. For the same galaxies we also show the relation between stellar mass and half-mass radius, R50, in purple. The median is given by the purple solid lines, whereas the shaded areas represent the 16% and 84% quantiles. We report in table B.1 the best-fit parameters for the stellar mass–size relation for all galaxies and for the subsamples of central and satellite galaxies, and in table B.2 the best-fit parameters for the galaxy size-halo size relation for central galaxies. The errors on the parameters are given by the χ-square minimisation. The dispersion around the best-fitting function is σΔlog10R1 = 0.05 dex and σΔlog10R1 = 0.07 dex for central and satellite galaxies, respectively.
Fig. B.1. Correlation between R1 and stellar mass for central (left panel) and satellite (right panel) galaxies in the sample. The median values for each bin are plotted together with the 16% and 84% quantiles (shaded area). For bins with fewer than ten galaxies, single values are plotted. Both fits to the data are for stellar masses M★ > 10^8.5 M⊙. For comparison, we plot the relation for R50 in purple.
Table B.1. Best-fit parameters for the stellar mass–size relation for all galaxies and central and satellite galaxies. The errors are given by the χ-square minimisation.
Table B.2. Best-fit parameters for the galaxy size-halo size relation for central galaxies. The errors are given by the χ-square minimisation.
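To illustrate the kind of fit summarised in these tables, the sketch below performs a χ-square fit of a double power-law to binned medians with SciPy and derives parameter errors from the covariance matrix. It should be read as an assumption-laden example: the functional form is a generic smooth double power-law and not necessarily equation 2, the function names are hypothetical, and the data are synthetic numbers rather than values from the simulations.

```python
import numpy as np
from scipy.optimize import curve_fit


def log_double_power_law(logm, logr0, logm0, alpha, beta):
    """Assumed smooth double power-law, evaluated in log10 space.

    The logarithmic slope tends to alpha for M << M0 and to beta for
    M >> M0; logr0 is log10 of the size at M = M0.
    """
    dx = np.asarray(logm, dtype=float) - logm0
    return logr0 + alpha * dx + (beta - alpha) * np.log10(0.5 * (1.0 + 10.0**dx))


def fit_relation(logm, logr, logr_err, p0):
    """Chi-square fit returning parameters, 1-sigma errors, residual scatter."""
    popt, pcov = curve_fit(log_double_power_law, logm, logr, p0=p0,
                           sigma=logr_err, absolute_sigma=True)
    perr = np.sqrt(np.diag(pcov))  # errors from the covariance matrix
    scatter = np.std(logr - log_double_power_law(logm, *popt))
    return popt, perr, scatter


# Synthetic medians (illustrative numbers only, not values from the paper).
rng = np.random.default_rng(1)
logm = np.linspace(8.5, 12.0, 30)
truth = (1.1, 10.2, 0.2, 0.8)
err = np.full(logm.size, 0.02)
logr = log_double_power_law(logm, *truth) + rng.normal(0.0, 0.02, logm.size)
popt, perr, scatter = fit_relation(logm, logr, err, p0=(1.0, 10.0, 0.3, 0.7))
```

Note that the scatter returned here is the dispersion of the fitted points around the model. The dispersion quoted in the text would instead be measured from the residuals of individual galaxies with respect to the best-fitting function.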
All Figures
Fig. 1. From left to right, projected density maps and radial profiles of stellar mass, stellar luminosity, and gas star formation rate for a simulated spiral galaxy of stellar mass 3.8 × 10^10 M⊙ selected from the recalibrated simulation. The vertical dashed, solid, and dot-dashed lines mark the positions of R50, R1, and R29,g, respectively. For this star forming galaxy, R1 is a better measure of the size of the galaxy for its proximity to the edge (truncation radius) of the disc. The edge of the disc coincides with a net drop in the star formation rate surface density, the starting hypothesis for the new definition of galaxy size.
Fig. 2. Correlation between R1 and stellar mass for all galaxies in the sample (orange symbols). The median values in each bin are plotted together with the 16% and 84% quantiles (shaded area). For bins with fewer than ten galaxies, single galaxies are plotted as small dots. The orange solid line depicts the double power-law fit for M★ > 10^8.6 M⊙ (extrapolated below that mass). We show the relation between R50 and stellar mass for all galaxies in the sample in purple. For visual comparison, the galaxies in the sample of Trujillo et al. (2020) are represented with small dots, both R1 and R50. The distribution of the residuals around the best-fit linear relation for all galaxies is shown in the inset. The distribution has been fitted with a Gaussian (solid line) with a dispersion of σΔlog10R1 = 0.06 dex.
Fig. 3. Correlation between R1 and R200 for central galaxies. Median (dots and solid lines) and 16% and 84% quantiles (shaded areas) are shown. We quote in the legend the average of the scatter for the entire sample of central galaxies and with respect to the analytic fit as well as the dispersion around the mean. The purple line and shaded area are the relation between R50 and R200 and its 16% and 84% quantiles. The dotted line is the linear correlation proposed by Kravtsov (2013) for early-type galaxies, R50 = 0.015 R200, with a scatter of 0.2 dex (shaded area). The top x-axis gives the corresponding halo mass, M200.
Fig. 4. Correlation between M200 and M★ for central galaxies. Median (dots and solid lines) and 16% and 84% quantiles (shaded areas) are shown. We quote in the legend the average of the scatter for the entire sample of central galaxies and with respect to the analytic fit, together with dispersion around the mean. We plot in the inset the distribution of residuals for the analytic fitting function and all the galaxies in the sample.