| Issue |
A&A
Volume 703, November 2025
|
|
|---|---|---|
| Article Number | A149 | |
| Number of page(s) | 15 | |
| Section | Cosmology (including clusters of galaxies) | |
| DOI | https://doi.org/10.1051/0004-6361/202556282 | |
| Published online | 14 November 2025 | |
Challenges of standard halo models in constraining galaxy properties from cosmic infrared background anisotropies
1
Institutes of Computer Science and Astrophysics, Foundation for Research and Technology Hellas (FORTH), Heraklion, Crete, Greece
2
Aix Marseille Univ, CNRS, CNES, LAM, Marseille, France
3
Université de Strasbourg, CNRS, Observatoire astronomique de Strasbourg, UMR 7550, 67000 Strasbourg, France
4
Kavli Institute for Particle Astrophysics and Cosmology, Stanford University, 452 Lomita Mall, Stanford, CA 94305, USA
5
SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
⋆ Corresponding author: agkogkou@ia.forth.gr
Received:
7
July
2025
Accepted:
19
September
2025
The halo model, combined with halo occupation distribution (HOD) prescriptions, is widely used to interpret cosmic infrared background (CIB) anisotropies and extract physical information about star-forming galaxies and their connection to large-scale structures. Recent CIB-specific implementations of the halo model have adopted more physical parameterizations. However, the extent to which these models can reliably recover meaningful physical parameters remains uncertain. We assessed whether the current parameterization of CIB halo models is sufficient to recover astrophysical quantities, such as star formation efficiency, η(Mh, z), and halo mass at which the peak of star formation efficiency occurs, Mmax, when fit to mock data. We also assessed whether discrepancies arise from assumptions about galaxy emission (the HOD ingredients) or from more fundamental components in the halo model, such as bias and matter clustering. We fit the M21 CIB HOD model, implemented within the halo model framework, to mock CIB power spectra and star formation rate density (SFRD) data generated from the SIDES-Uchuu simulation, and compared the best-fit parameters to the known simulation inputs. We then repeated the analysis using a simplified version of the simulation (SSU), explicitly designed to match the HOD assumptions. A detailed comparison of model and simulation outputs was carried out to trace the origin of observed discrepancies. While the M21 HOD model provides a good fit to the mock data, it failed to recover the intrinsic parameters accurately, particularly the halo mass at which star formation efficiency peaks. This mismatch persists even when fitting data generated with the same model assumptions. We find strong agreement (within 5%) in the emission-related components (SFRD, emissivity), but observe a scale- and redshift-dependent offset exceeding 20% in the two-halo term of the CIB power spectrum. This likely arises from limitations in the treatment of halo bias and matter clustering within the linear approximation. Additionally, incorporating scatter in the SFR–halo mass relation and the spectral energy distribution (SED) templates significantly affects the shot noise (∼50%), but has only a modest impact (less than 10%) on the clustered component. These results suggest that recovering physical parameters from CIB clustering requires improvements to the cosmological ingredients of the halo model framework, such as adopting scale-dependent halo bias and nonlinear matter power spectra in addition to careful modeling of emission physics.
Key words: cosmic background radiation / large-scale structure of Universe / infrared: diffuse background
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.
1. Introduction
The cosmic infrared background (CIB) is the second most prominent cosmic background. It originates from emissions of dust-enshrouded galaxies undergoing star formation. Studying it helps to understand the cosmic star formation history up to high redshifts, i.e., z ∼ 6 (e.g., Gispert et al. 2000; Maniyar et al. 2018; Jego et al. 2023). Over the years, high-precision measurements of CIB anisotropies have been achieved (Lagache et al. 2007; Planck Collaboration XVIII 2011; Thacker et al. 2013; Viero et al. 2013; Planck Collaboration XXX 2014; Viero et al. 2019), with additional advancements expected from upcoming missions (Zemcov & SPHEREx Science Team 2018; Matsuura et al. 2024). These efforts have offered insights into the spatial distribution of the numerous unresolved extragalactic sources. We can thus probe the dusty star-forming galaxy distribution within the large-scale structures of the Universe and connect it to the underlying dark matter.
For the interpretation of the CIB anisotropies, the halo model introduced by Cooray & Sheth (2002) has been extensively employed in the literature (see Asgari et al. 2023, for a review). This model is based on the assumption that the dark matter is collapsed into symmetric halos and that all galaxies reside within these halos. Its key components include the halo mass function (HMF), the halo bias quantifying the over-density or under-density of dark matter halos compared to the overall matter distribution, the halo density profile, and the halo occupation distribution (HOD). Together, these ingredients describe the abundance, clustering, internal structure of halos and the galaxy properties within them.
Initial HOD models connected total CIB emissivity solely with the power spectrum of the galaxy number density contrast (Viero et al. 2009; Planck Collaboration XVIII 2011; Amblard et al. 2011), evolving to models considering the luminosity of galaxies dependent on host dark matter halo mass (Shang et al. 2012; Viero et al. 2013; Planck Collaboration XXX 2014; Zagatti et al. 2024). When fitting a model to the available CIB power spectrum data, there is a need to retrieve key quantities relevant to galaxy evolution, such as the star formation rate density (SFRD). In this context, Béthermin et al. (2013) introduced a connection between the galaxy star formation rate (SFR) and baryonic accretion rate (BAR) onto the host dark matter halos. Expanding on this, Maniyar et al. (2021) (hereafter M21) maintained this link to propose a four-parameter HOD model. This model successfully fits CIB power spectrum data from Planck and Herschel, while also aligning with SFRD available data, allowing for high-z star formation rate (SFR) information to be extracted from the CIB power spectrum.
While these models adequately fit observational data, uncertainties persist regarding their obtained parameters. For instance, estimates of the halo mass hosting the most efficient star formation range from 1012.1 M⊙ as estimated by Viero et al. (2013) to 1012.94 M⊙ in the latest model by Maniyar et al. (2021). This discrepancy emphasizes the need for validation of the standard HOD models within the halo model framework. Such validation is crucial not only for interpreting CIB data but also for forecasting line intensity mapping (LIM) experiments, which employ similar methodologies but focus on narrower redshift slices (Kovetz et al. 2017; Bernal & Kovetz 2022). To address this, we aim to validate the latest HOD model from Maniyar et al. (2021) by comparing it to realistic simulations, advancing toward a new generation of more accurate and precise HOD models for the CIB.
The paper is structured as follows: In Sect. 2, we describe both the SIDES-Uchuu simulation and the HOD model employed in our analysis. Sect. 3 presents results from fitting the HOD model to mock SIDES-Uchuu data, simulating Planck High Frequency Instrument (HFI) data. This analysis concentrates on Planck frequencies, offering a detailed exploration of high-z effects, though it is also applicable to Herschel data. Sect. 4 identifies discrepancies within the halo model framework by fitting to mock data from a simplified SIDES-Uchuu (SSU) simulation version, closely aligning with HOD model assumptions and parameterization. Sect. 5 delves into the sources of discrepancy originating from fundamental halo model components. In Sect. 6 we explore the impact of excluding scatter in astrophysical recipes within the HOD model. Finally, we summarize our results and present the conclusions in Sect. 7.
2. Method
2.1. SIDES-Uchuu simulation
The SIDES-Uchuu simulation is the product of two distinct simulations, namely the Simulated Infrared Dusty Extragalactic Sky (SIDES)1 and the Uchuu N-body simulation from the Uchuu suite2. The combination of these simulations is detailed in Gkogkou et al. (2023). SIDES simulates the far-infrared (FIR) and submillimeter sky by using observed empirical relations (Béthermin et al. 2017, 2022). It populates the dark-matter halos initially provided by the Uchuu N-body cosmological simulation with galaxies of specific stellar masses through abundance matching (Gkogkou et al. 2023).
The Uchuu cosmological simulation (Ishiyama et al. 2021), provides a cosmological box with a comoving volume of 2000 h−1 Mpc and a mass resolution of 3.27 × 108 h−1 M⊙ (with halos resolved at ≳40 particles). The resulting lightcone spans nearly six orders of magnitude in halo masses at z = 0, gradually decreasing to 3–4 orders of magnitude by z = 7. The completeness of the Uchuu lightcone down to approximately ∼1.3 × 1010 h−1 M⊙ across all redshifts is established by comparing halo mass functions (HMFs) with theoretical predictions from Despali et al. (2016). The Uchuu simulation adopts cosmological parameters consistent with Planck Collaboration VI (2020).
In the SIDES framework the stellar mass regulates various galaxy properties (e.g., SFR, LIR) using a population of main-sequence galaxies (e.g., Daddi et al. 2007; Schreiber et al. 2015) and starburst galaxies (Sargent et al. 2012; Béthermin et al. 2012). There is also a fraction of passive galaxies that are assumed to be non-emitting but still populate the dark matter halos. Additionally, far-infrared and submillimeter line emissions are generated based on this framework (Béthermin et al. 2022).
In the simulation, galaxy SFR is determined based on its stellar mass and redshift, with values drawn from the distributions of main sequence and starburst galaxies as described in Schreiber et al. (2015), including a scatter of 0.3 dex. LIR values are derived from SFR values using the relation established by Kennicutt (1998). Spectral energy distribution (SED) templates are selected from the library provided by Magdis et al. (2012) and updated at z > 2 in Béthermin et al. (2015) and Béthermin et al. (2017). In addition, a scatter of 0.2 dex on the mean intensity of the radiation field (⟨U⟩) is implemented in the model, a parameter strongly correlated with dust temperature and influencing the SED template selection for each galaxy. All sources are assigned a magnification value (μ), resulting in either strong or weak lensing.
2.2. CIB halo model
The CIB comes from the dust emission from all the star-forming galaxies across cosmic time; it can thus probe the large-scale structures and provide important information about the clustering of the galaxies. In order to study the large-scale fluctuations of the CIB (or any other large-scale tracer with a specified halo profile), a halo model has been introduced (for a review Cooray & Sheth 2002; Asgari et al. 2023). In this framework the underlying assumptions are: 1) all the dark matter is enclosed in collapsed dark matter halos, 2) all the galaxies lie in the dark matter halos, and 3) the clustering of the galaxies can be described by the sum of two components: the one- and two-halo terms. The former describes the correlation of the galaxies within a single halo and the latter describes the correlation of the galaxies within different halos. The required ingredients for such a model are: a halo mass function (HMF) that specifies the number density of dark matter halos of a given mass, a halo bias model that quantifies the degree to which the distribution of dark matter halos deviates from that of the underlying matter distribution, a halo profile specifying the distribution of the matter within the halos, and a halo occupation distribution (HOD) prescription that defines how dark matter halos are populated with galaxies. More specifically in the CIB studies, it determines how the light, often quantified as emissivity, is distributed within these dark-matter halos.
Maniyar et al. (2021) (hereafter M21) have developed and presented a halo model for the CIB fluctuations. This model links the baryonic matter accretion rate (BAR) onto the dark matter halos to their total star formation rate (SFR). The BAR is defined by the halo mass at a given redshift as
where Ωb and Ωm are the dimensionless cosmological baryonic density and total matter density, respectively, and ̇M(Mh, z) is the mean mass growth rate, taken from Fakhouri et al. (2010) and is defined as
The SFR is connected to the BAR assuming that the stars are formed from the accreted gas with a certain efficiency. This efficiency is parameterized with a lognormal function
where Mmax is the halo mass with the highest star formation efficiency ηmax. σMh(z) is the lognormal variance, which characterizes the halo mass range with efficient star formation and evolves with redshift as
zc is the pivot redshift. Above zc the width equals σMh0, below it, σMh varies with redshift, and τ sets the strength of that evolution. The selection of the lognormal parameterization served the purpose of regulating the star formation efficiency in low and high halo masses. It has been shown by several studies (e.g., Silk 2003; Kereš et al. 2005; Béthermin et al. 2013) that the efficiency peaks at ∼1012 − 1013 M⊙ and significantly drops below and above this range. Different parameterizations have been proposed, as for instance from Moster et al. (2013), who proposed a double power-law. This parameterization was tested in Eq. (3) but led to an unrealistically low contribution from the halos above the mass of maximum efficiency at high redshift, as explained in the appendix of Maniyar et al. (2021). Therefore, the lognormal parameterization was used since it offered both a good fit to the observational data and physically accepted results.
The M21 model accounts for the quenching of the satellite galaxies by computing the SFR for the subhalos in two different ways. The first one simply follows Eq. (3), where Mh is substituted by the mass of the subhalo. And the second way is by weighting the SFR of the host halo by the ratio of the subhalo mass over the host halo mass (SFRsub = SFRc × (msub/mh)). The subhalo’s final SFR is determined by selecting the lower value of the two.
Knowing the SFR of both the main halos and their corresponding subhalos at any given redshift, it becomes possible to calculate the differential emissivity (
). This quantity represents the emissivity per halo mass bin. For the main halos it is
where
is the halo-mass function, taken from Tinker et al. (2008), and χ(z) is the comoving distance to redshift z.
is the effective SED of the infrared galaxies at a given redshift for a given frequency (Béthermin et al. 2013). SFRc is the SFR for the central galaxies with a given halo mass. K is the Kennicutt constant (K = SFR/LIR), which has a value of
for a Chabrier IMF, and LIR is the infrared luminosity (8-1000 μm). The differential emissivity for the subhalos is computed as
where
is the subhalo mass function for the satellite galaxies with a subhalo mass msub in a main halo of mass Mh. The subhalo mass function is the one from Tinker & Wetzel (2010)3. The effective SEDs for the satellite galaxies are assumed to be the same as those of the central galaxies.
Finally, the differential emissivity of the main halos and their subhalos are used to construct the one- and two-halo term along with the other halo model ingredients mentioned previously. The one-halo term of the CIB power spectrum describes the clustering of the galaxies within a single halo and is defined as
where a is the scale factor of the Universe, u(k, Mh, z) is the Fourier transform of the halo density profile, which describes the density distribution inside the halo. A Navarro-Frenk-White (NFW, Navarro et al. 1997) profile was considered. The two-halo term describes the correlation between two galaxies in two different halos of mass Mh and
and is calculated as
where b(Mh,z) is the halo bias prescription given by Tinker et al. (2010), Plin(k, z) is the linear matter power spectrum.
The clustering of the galaxies can be described by the sum of the one-halo (
) and two-halo (
) term. There is additionally a shot-noise term (
), which is a flat component and comes from the fact that the number of sources within a certain surveyed area follows the Poissonian statistics. This component is not predicted by the M21 model, it is rather included as a constant term. Therefore, the total CIB power spectrum is computed as
where the one-halo term arises from galaxy pairs within the same halo, the two-halo term from pairs in different halos, and the shot-noise term is flat in ℓ and captures the Poisson contribution from unresolved sources.
The novelty of the M21 model can be summarized in the following aspects. It is a simple model with only four free parameters (Mmax, ηmax, σMh0, τ) as opposed to other models with a larger number of parameters (e.g., Viero et al. 2013). By connecting the matter accretion onto the dark matter halos to their SFR, it provides a more physically driven parameterization compared to other models that assume a luminosity-halo mass relation (e.g., Shang et al. 2012; De Bernardis & Cooray 2012; Planck Collaboration IX 2014; McCarthy & Madhavacheril 2021). The model was fit to CIB angular power spectrum data measured by Planck, while also taking into account external observational constraints, such as the SFRD and the mean level of the CIB. With only four free parameters, it both obtained a very good fit and it is compatible with the SFRD and mean CIB data.
2.3. Philosophy of our work
Halo models are widely used to interpret CIB observational data and infer astrophysical and cosmological parameters. However, the extent to which we can trust the parameters derived from these models remains uncertain. Our work has two main goals. The first one is to examine the effectiveness of the parameterization of the M21 HOD model, the latest halo model used for the CIB analysis, to properly extract astrophysical information. The second objective involves a comprehensive evaluation of the family of halo models as a whole for CIB anisotropies. This entails an investigation into their individual components and their ability to establish a realistic connection between the large-scale structures and the emission from individual DSFGs. Our approach involves comparing the halo model to a simulation, where all components and parameters are known beforehand. For this purpose we use the SIDES-Uchuu simulation, which is modified as needed throughout this paper to facilitate various validation tests.
It is worth stressing that our goal is not to capture the full physical complexity of DSFGs in the real Universe. SIDES-Uchuu itself is based on heuristic prescriptions linking halos to galaxies, and thus inevitably omits some processes. Instead, our objective is to test whether a simplified halo-model parameterization can recover the astrophysical quantities present within this simulation, providing a controlled benchmark where the true inputs are known.
The analysis begins by fitting the halo model to mock data, generated using the original SIDES-Uchuu simulation. Subsequently, we compare the SFR-BAR efficiency curve determined by the best-fit parameters to the one inherent in the simulation used to generate the mock data. This step aims to test the accuracy of the chosen parameterization in the M21 HOD model.
As a follow up test, we fit the halo model to a new set of mock data, generated using a modified version of the SIDES-Uchuu simulation. These modifications align the simulation’s prescriptions to those employed in the halo model and they involve two key adjustments. First, the SFR drawing method is adjusted from the redshift-evolved main sequence to the analytic formula given in Eq. (3). Second, instead of using SED templates dependent on both redshift and the galaxy’s mean radiation field (⟨U⟩), effective SED templates that depend solely on redshift are incorporated. These modified simulations are referred to as simplified SIDES-Uchuu (SSU). This step aims to assess whether the resulting efficiency now closely matches that of the simulation. However, as we find, persistent offsets between the simulation and the efficiency curves from the M21 HOD model suggested a potential inconsistency not just in the specific HOD model (Eq. (5) and Eq. (6)) but also in the halo model as a framework (Eq. (7) and Eq. (8)).
To delve into the investigation of the halo model, we once again used the SSU. We conducted a thorough comparison of the various quantities used to model the CIB power spectrum (Eq. (8)) in the halo model with the simulation. This allowed us to systematically isolate and explore different components of the halo model (e.g., SFR, differential emissivity) as potential sources of discrepancies. Finally, we investigate and quantify the impact of introducing some scatter in various recipes (e.g., SFR) on the obtained power spectrum, shedding light on the importance of modeling the scatter in the employed recipes.
3. Fitting the M21 HOD model to SIDES-Uchuu mock data
In this section, we fit the M21 HOD model4 to mock data obtained with SIDES-Uchuu. We then present the results and discuss their implications.
3.1. SIDES-Uchuu mock data
In order to generate mock CIB data as they would be observed from Planck, we used the SIDES-Uchuu simulation. We first generated four maps, corresponding to a simulation area of 117 deg2, for the four highest Planck frequencies (217, 353, 545, and 857 GHz from the HFI instrument). To fulfill this task, we had to multiply the SEDs by the corresponding Planck bandpasses (see Sect. 2.6 from Béthermin et al. 2022, for more details on the cube/map making). Next, we computed the auto- and cross-power spectra of all the maps using the powspec5 python package. From the resulting auto- and cross-power spectra we kept the data points corresponding to the same ℓ multipoles as those in the Planck datasets. Data points from SIDES-Uchuu and the M21 HOD model are not color corrected (Lagache et al. 2020, Eq. A.2), since they are already in the same convention. Therefore, we did not apply any color corrections.
We estimated the covariance matrix (see Appendix A) using the flat-sky implementation of pymaster6 (Alonso et al. 2019). The computation uses four frequency channels (217, 353, 545, and 857 GHz) and eight multipole bins for each power spectrum. As a result, the full Gaussian covariance matrix has dimensions (8 × 4, 8 × 4) = (32, 32). The error bars on the simulated data points are derived from the square roots of the diagonal elements of this analytically computed covariance matrix.
To determine the shot noise in each simulated map, we shuffled the sources within the simulated catalog before generating the map. In this way any clustering information is removed. We then computed the power spectrum of the shuffled map, which provided a direct measure of the shot-noise level for all the auto and cross power spectra. Having this information allows us to calculate the correlation matrix (ℛSN), which is used during the fitting process (see Sect. 3.2). This matrix describes the degree of correlation between the shot noise power of the various frequencies. The analytic formula for calculating the correlation matrix is
The M21 model uses external constrains, such as the mean CIB intensity and the SFRD. We thus had to generate mock data for these two quantities. For the mean CIB intensity (CIB monopole), we generated three maps after applying the three Herschel/SPIRE filters accordingly (250, 350, and 500 μm). We use Herschel/SPIRE data for the CIB monopole even though we use Planck data for the power spectrum to maintain consistency with the methodology followed by Maniyar et al. (2021). We then computed the mean of the generated maps and converted the units from Jy sr−1 to nW m−2 sr−1. In order to obtain the error bars for each CIB mean, we used the error bars of the Herschel/SPIRE data after normalizing accordingly for SIDES, as
Finally, we obtained the mock SFRD data by computing the total SFR from the SIDES catalogs and summing the SFR values from all the sources in the given redshift bin. This value was divided by the corresponding comoving volume. From the resulting SFRD curve, we only kept the points at the exact same redshifts as the observational data (Madau & Dickinson 2014; Khusanova et al. 2021) and used the same relative error bars.
3.2. Fitting results
We performed a Markov chain Monte Carlo (MCMC) analysis of the global M21 CIB model parameter space using the Python package emcee (Foreman-Mackey et al. 2013). There are eight free parameters:
-
Physical model parameters: Mh, max, ηmax, σMh0, τ
-
Shot noise: SN217, SN353, SN545, SN857.
The global χ2 has a contribution from the CIB auto/cross power spectra and the priors imposed by the ⟨ICIB⟩ and SFRD external observational constraints.
In contrast to M21, where measurement uncertainties were assumed to be Gaussian and uncorrelated, we use the full covariance matrix (see Appendix A) when computing the χ2 statistic. This allows us to account for correlations between frequencies and provides a more accurate assessment of the model fit. To establish prior distributions for the eight model parameters, a uniform distribution is adopted, with a predefined range. A comprehensive summary of these priors can be found in Table 1. In the M21 model, concerning the shot noise, only the auto-power shot noise parameters are considered as free parameters. Meanwhile, for the cross-power shot noise, the model computes the values following the correlation matrix from Béthermin et al. (2013), which accounts for the fact that different frequencies can have contributions from overlapping redshifts and thus the same sources. As a result, the shot noise for the cross frequencies is defined as
Best-fit values with uncertainties and the ranges of uniform distributions used as priors for fitting the M21 HOD model to mock data from SIDES-Uchuu.
where ℛ is the correlation matrix we have computed from the simulation (Eq. (10)).
The priors that have been used for the fit are shown in detail in Table 1 together with the best-fit values. For the four physical parameters the priors are exactly the same as those used in Maniyar et al. (2021). But for the priors of the shot-noise levels we chose a ±20% range centered on the real value of the shot noise for each frequency. We obtained a global χ2 of 384 for 80 Planck Cℓ points. The χ2 value obtained in Maniyar et al. (2021) for the same amount of points was 113. The difference in the quality of the fit is somewhat surprising but it could be the result of the fact that the M21 analysis used the wrong version of the Tinker & Wetzel (2010) subhalo mass function, while in our analysis we used the updated version with the corrected value. It could additionally be the result of using different priors for the shot noise parameters. M21 placed broad flat priors on the auto-power shot noise with width of [0.1–2] times the value estimated by the Béthermin et al. (2012) model. In our analysis we used a width of [0.8–1.2] times the value we obtained from the shuffled SIDES-Uchuu maps. Narrowing the range further (e.g., [0.9, 1.1]) pulls the inferred shot-noise levels closer to the true values but does not improve the recovery of the four astrophysical parameters (log Mh, max, ηmax, σMh, 0, τ). Conversely, widening the shot-noise priors (e.g., [0.5, 1.5]) broadens the astrophysical posteriors but does not improve the fit to the large-scale, two-halo–dominated part of the spectra, nor does it reduce the overall χ2. We therefore retain the informative yet conservative [0.8, 1.2] baseline.
The resulting best-fit model (one-halo term + two-halo term + shot noise) together with the mock data points are shown in Fig. 1. In Fig. 2 we show the SFRD data from our simulation and the contours of the multiple models defined by the various sets of parameters explored by the MCMC process. The corner plots of the marginal distributions of the model parameters are displayed in the Appendix C in Fig. C.1.
![]() |
Fig. 1. Mock data of the CIB auto- and cross-power spectra obtained by SIDES-Uchuu (black points) and the best-fit CIB halo model (black line) with its components (one-halo term, two-halo term, shot noise). The ratio of the model over the data is shown in Fig. C.2. |
3.3. Discussion
As evident in Fig. 1 and supported by the resulting χ2 value, the M21 HOD model provides a statistically good fit to the mock data. Nonetheless, the goal of this analysis is to draw conclusions regarding the resulting best-fit parameters and their effectiveness in describing the physical quantities embedded in the simulation.
The exact level of the shot noise can be easily computed from the simulated maps we used to obtain the mock data. We can thus directly compare the intrinsic to the best-fit value on a one-to-one basis. Table 2 shows both the intrinsic and best-fit values of the four Planck frequencies shot noise, along with their corresponding statistical uncertainties. The fractional difference is below 18%, which is seemingly low, but the best-fit value does not fall within the range defined by the 1σ-statistical uncertainty for any of the shot noise parameters. This shows the weakness of the fit to recover the shot noise values, potentially due to the strong degeneracy of the shot noise with the one-halo term, as well as the correlation between the various frequencies.
Intrinsic and best-fit values of the shot noise for the four Planck frequencies (217, 353, 545, 857 GHz).
Regarding the ability of the model to reproduce the right SFRD, we can conclude from Fig. 2 that the overall shape of the models agrees with that of the mock SFRD. Between z = 0.3 and z = 1 there seems to be a systematic underestimate of the models but they still lay within the error bars. At z > 3 the range of the potential models increases as well as the size of the error bars. However, the agreement still holds.
![]() |
Fig. 2. Mock data of the SFRD obtained by SIDES-Uchuu (blue points) and the 1 and 2 σ contours of the SFRD models defined by the various sets of parameters explored by the MCMC walkers (red shaded areas). |
In order to assess the reliability of the remaining four best-fit parameters (Mmax, ηmax, σMh0, τ), which define the SFR-BAR efficiency, it is essential to compare the simulation efficiency with the efficiency curve defined by the best-fit parameters. This comparison is depicted in Fig. 3, where the 1 σ and 2 σ contours of the models are plotted together with the SIDES data points and their mean. The model manages to capture the general shape of the intrinsic SFR efficiency, but falls short in accurately reproducing the amplitude (ηmax) and the halo mass of the highest star formation efficiency (Mmax). Mmax seems to be always overestimated by the halo model. This inconsistency becomes more pronounced at z < 2 and can significantly impact the interpretation of the CIB, as the primary flux contribution is expected to originate from 1 < z < 2 (e.g., Fig. 4 from Maniyar et al. 2018).
![]() |
Fig. 3. SFR-BAR efficiency from SIDES-Uchuu (blue gradient) and the mean efficiency (solid blue line). The various models from the MCMC parameter exploration of the M21 HOD model is shown in red (red contours for 1σ and 2σ). Each subplot corresponds to a different redshift, with the center of the redshift bin displayed on the top right of each panel. The redshift interval used to compute the density of the SFR/BAR values of the sources in the simulation is Δz = 0.05. The colorbar indicates the SFR/BAR value density, with blue representing high density and white indicating lower density, implying fewer sources with these values. |
This result could imply that there may be more suitable parameterizations for the efficiency between SFR and BAR that can capture the redshift evolution more accurately while still maintaining the simplicity of having only four free parameters.
4. Investigation of discrepancies within the halo model
In this section we aim to investigate in detail the source of the inconsistency we noticed in Sect. 3. For this purpose, we modify the SIDES-Uchuu simulation by incorporating the same parameterizations as in the M21 HOD model. This allows for a direct comparison.
4.1. Simplified SIDES-Uchuu (SSU)
In the previous test the halo model failed to reproduce the quantities intrinsic to SIDES-Uchuu. However, there are recipes that are different from those in the halo model. Therefore, we produce an alternative version of SIDES-Uchuu using the same exact parameterization as in the M21 HOD model. In this way we aim to explore potential biases that are not caused by the parameterization of the model. We implemented the following modifications: the SFR recipe, the implemented SED templates for the galaxies, and the incorporation of the quenching process.
SFR recipe: Within the existing version of SIDES, the SFR values are determined based on the redshift-evolving main sequence. To align SIDES with the M21 HOD model, we incorporated in SIDES the SFR parameterization of the HOD model, described by Eq. (3). Consequently, in the simplified version of our simulation, the SFR value is solely determined by the halo mass (Mh) and the redshift, precisely matching the assumption made in the M21 HOD model. One can choose to simplify this recipe further by disabling the redshift-induced evolution of the efficiency curve’s width. This can be achieved by setting τ = 0 in Eq. (4). This is further investigated in Appendix B. Finally, no scatter is added to the assigned SFR values.
SED templates: To estimate the continuum emission (LIR) SIDES adopts a SED template sourced from the updated template library introduced in Béthermin et al. (2017). Each template accounts for the galaxy type (main sequence or starburst), redshift, and the mean radiation field (⟨U⟩) closely linked to the dust temperature. However, in the M21 HOD model in order to determine the emissivity of galaxies within a dark matter halo, an effective SED is assumed. The effective SED at a specific redshift is calculated as the average of all SEDs. These SEDs are from the same library as those used in SIDES (Béthermin et al. 2017), corresponding to different ⟨U⟩ values for each galaxy type at that redshift. Each SED is weighted based on its contribution to the infrared luminosity density. After obtaining the final effective SED (Eq. (5)), a specific bandpass (e.g., Planck, Herschel/SPIRE) is applied. Subsequently, we adopted the same effective SEDs for the simplified version of SIDES.
Quenching and subhalos: In the M21 HOD model there is a way to account for the quenching as explained in Sect. 2.2. This allows for the incorporation of the influence of the subhalo’s environment, which can potentially result in a reduced SFR of the subhalos. However, in SIDES all dark-matter halos, whether they are main or subhalos, are treated uniformly. This implies that the same scaling relations are applied without distinguishing between centrals and subhalos. In order to directly compare the model to the simulation, it is more convenient to deactivate the quenching module in the model rather than introducing a new recipe in SIDES specifically designed to account for quenching. It is also possible and straightforward to discard the subhalos contribution, either in the model, the simulation, or both if needed.
Lensing: Finally, in the simplified version of SIDES we switch the lensing module off (i.e., μ = 1) for all the galaxies. This aims to prevent any additional contributions to the tension between the model and the simulation.
4.2. Fitting process and results
We generated two mock datasets by using the SSU. We choose the parameters log Mh, max, ηmax, σMh0, and τ to be the same as the values obtained in M217. The first data set includes both central dark-matter halos and subhalos (referred to as the “with subs” data set), and the second one is without subhalos (referred to as the “no subs” data set)8. We have followed the same exercise by setting the τ parameter to 0 to investigate the effect of an even more simplified parameterization. The results can be found in Appendix B.
Each mock data set comprises of: 1) CIB auto- and cross-power spectrum data points for the four highest Planck frequencies, 2) mean CIB intensity data at the three Herschel/SPIRE bandpasses, and 3) SFRD data points within the redshift range of 0 to 5. The error bars for the power spectra are derived from the square root of the diagonal elements of the covariance matrix, as detailed in Appendix A. The SFRD and mean CIB intensity data points are computed following the same procedure described in Sect. 3.1.
Similarly to Sect. 3.2 we conducted a MCMC analysis for both mock datasets. The obtained best-fit parameters and the global χ2 values are gathered in Table 3. Comparing the input values to the best-fit results reveals that, although the model provides a good fit to the simulated data, the recovered parameter values deviate significantly from the true inputs. The resulting corner plots are shown in Fig. C.3. We furthermore show in Fig. 4 the fractional difference between the intrinsic and best-fit values for each parameter across the two different datasets.
![]() |
Fig. 4. Percentage difference between the intrinsic and the best-fit values of the model parameters from the halo model fit to the SSU simulated data. The green color corresponds to the “no subs” case, and the orange color corresponds to “with subs”. |
Best-fit values together with the uncertainties obtained after fitting the HOD model to SSU mock data.
4.3. Discussion
The M21 HOD model integrated in the halo model framework can fit the mock data. However, similarly to Sect. 3, the recovered parameters do not exactly match the intrinsic ones, even though the exact same parameterization was used in the SSU (see Fig. 4).
In the “with subs” case, the shot noise parameters are recovered within 20%, but the four astrophysical parameters (log Mh, max, ηmax, σMh0, τ) show discrepancies exceeding 50%, indicating a poor recovery of the physical quantities that govern the star formation efficiency. This suggests that degeneracies within the model, particularly between the one-halo term and shot noise, may obscure the true values of these parameters. To mitigate this, we fit the halo model to the “no subs” mock data, which excludes subhalos and thus suppresses the one-halo contribution. The fit quality improves, with all parameters recovered within 20% of their intrinsic values, except for log Mh, max. This supports the hypothesis that reducing the one-halo–shot noise degeneracy improves parameter constraints. However, the persistent offset in log Mh, max, even in the absence of subhalos, suggests that its poor recovery is likely rooted in a more fundamental limitation of the model’s parameterization or in residual degeneracies with other parameters such as ηmax or σMh0.
We repeated the above fitting exercise with an even simpler parameterization, setting the τ parameter to zero. While some parameters were better constrained, others showed worse results. Overall, this simplification did not lead to significantly better outcomes. The resulting plots are presented and discussed further in Appendix B.
5. LSS and cosmological components: Impacts on the halo model-simulation tension
In the previous sections, we fit the halo model to mock data generated using both the original and simplified versions of SIDES-Uchuu. However, the obtained fit values consistently deviated from the intrinsic ones, even when the same parameterization was applied to both the halo model and the simulation. To better understand the origin of these deviations, we now focus on directly comparing the halo model with the SSU simulation for the exact same parameter values.
At various steps of the computation we compare the quantities used by the model to construct the power spectrum. First, the SFRD, next the differential emissivity
(Eq. (5)), and finally the power spectra. For all these comparisons, in addition to using the SSU (which shares the same parameterization as the M21 HOD model) we incorporate the exact halo mass function (HMF) from the SIDES-Uchuu simulation into the M21 HOD model. This direct comparison could eliminate potential uncertainties arising from the fitting process (e.g., parameter degeneracies) or differences between the analytical and the Uchuu HMF.
To simplify the analysis and focus on intermediate and large scales, we discard subhalos. This choice is supported by “no subs” best-fit values that closer align with intrinsic parameters. The intrinsic parameters used in the SSU framework throughout this section correspond to those labeled as “intrinsic” in Table 3.
5.1. SSU vs. HOD model: Star formation rate density (SFRD)
As shown in Eq. (5), the SFR combines with the selected effective SED at a given frequency to construct the differential emissivity. We, thus, compare the SFRD of the M21 HOD model and the SSU as a function of redshift to determine whether they fully coincide, given that the same parameterization and assumptions for the SFR (Eq. (3)) were applied. The result of the comparison is presented in Fig. 5.
![]() |
Fig. 5. Comparison of the SFRD from the M21 HOD model (red line) and the SSU (blue line) as a function of redshift. The bottom panel displays the ratio of the SSU SFRD to that of the M21 HOD model. |
For the SFRD of the HOD model, we first computed the SFR using Eq. (3) at redshifts ranging from 0 to 7. Next, using the same HMF as the SSU, we computed the SFRD according to Eq. (13),
For the SFRD of the SSU, we binned the redshift range with centers corresponding to the same values used in the analytical calculation, assigning a bin width of dz = 0.5 for each central redshift. We summed the SFR within each bin and divided by the corresponding comoving volume.
The two lines exhibit strong agreement across the entire redshift range. As indicated by the ratio of the lines, there is only a minor offset of less than 5%, which is most noticeable at higher redshifts. This offset likely originates from numerical factors, such as the redshift binning required to compute the SFRD in the simulation.
5.2. SSU vs. HOD model: Differential emissivity
The next quantity used to construct the power spectrum is the differential emissivity. For the halo model, we compute it using Eq. (5) over the redshift range from 0 to 7. In the simulation, the flux of each galaxy is calculated as
, where the redshift of each galaxy determines the comoving distance χ(z), the SFR via Eq. (3), and the effective SED at the corresponding frequency band. For a given redshift bin, we then bin galaxies by halo mass and sum the fluxes of all galaxies within the corresponding redshift and halo mass bins, dividing by the comoving volume of the redshift bin and the size of the halo mass bin.
In the top panel of Fig. 6, the results for z = 2.412 demonstrate very good agreement between the two lines. However, the ratio of the SSU model to the halo model, shown in the bottom panel of Fig. 6, reveals a slight offset (less than 5%) across all halo masses. This offset is observed across all redshifts, with a slight increase at higher redshifts.
![]() |
Fig. 6. Comparison of the differential emissivity predicted by the M21 HOD model (red line) and the SSU model (blue line) at z = 2.412. The bottom panel illustrates the ratio of the SSU model to the M21 HOD model at z = 2.412 (black line), alongside the ratios for other redshifts (represented by lines in various colors). |
This offset arises from the need to discretize redshift when computing the analytical formulas. The effective SED (
) as a function of redshift is continuous. In the simulation, each galaxy is assigned a
value based on its redshift, covering a very dense range of values along this curve. However, the analytical differential emissivity is computed at specific, equally spaced redshift intervals, corresponding to the given redshift grid. This means that all galaxies within a given redshift bin are assigned the same effective SED, based on the central value of the redshift bin. In contrast, galaxies in the simulation with redshifts close to each other but within the same bin are assigned slightly different
values based on their exact redshifts. These small discrepancies between the simulation and the analytical model accumulate within each redshift bin, leading to the observed systematic mismatch. Although this discrepancy is unavoidable, quantifying it helps us understand and predict how it will manifest in the final quantity of interest, the power spectrum.
5.3. SSU Simulation vs. halo model: Power spectrum
The emission-related components, which are part of the HOD formalism, show strong agreement between the M21 HOD and the SSU, and even the small offset between the two has been quantified. We now turn to a direct comparison of the two-halo term of the power spectrum, which is the primary quantity we aim to model using the halo model formalism.
The analytical computation of the two-halo term was performed using Eq. (8). For the simulation, we generated maps at all Planck frequencies from the SSU catalogs with subhalos excluded, and computed the power spectra from these subhalo-free maps. To obtain the clustering component, we subtracted the shot noise. Fig. 7 shows the direct comparison of the analytical two-halo term and the clustering component of the simulation. There is a clear offset starting from the intermediate scales (ℓ > 2 × 103) and increasingly extending down to smaller scales (ℓ > 104). The small mismatch from the emission-related components cannot account for the amplitude of the mismatch observed in the power spectrum, nor can it explain its strong dependence on scale.
![]() |
Fig. 7. Comparison of the two-halo term from the M21 HOD model (red line) and the SSU (blue line), excluding subhalos. The bottom panel displays the ratio of the SSU to that of the M21 HOD model. |
To better understand the source of this offset, we repeated the comparison between the two-halo term of the halo model and the simulation, focusing on two redshift bins: 0 < z < 3 and 3 < z < 7. These bins were chosen because sources at z < 3 are expected to dominate the CIB (e.g., Béthermin et al. 2013; Schmidt et al. 2015), while those at z > 3 contribute significantly less. This division allows us to examine potential differences in the offset across these redshift ranges. The resulting ratios are presented in the bottom plot of Fig. 7. Although the power spectra become noisy at smaller scales, it is evident that the offset between the SSU and the M21 HOD model depends on scale. Interestingly, this scale-dependent offset appears more pronounced at z > 3. The fact that the discrepancy grows with increasing ℓ may help explain, at least in part, the mismatch between the intrinsic and recovered shot noise observed in the fitting process.
The next logical step is thus to investigate the dependencies of the remaining cosmological components in the power spectrum parameterization (Eq. (8)), such as the bias model and the matter power spectrum. This investigation is particularly relevant given that the M21 HOD model employs a linear model for both the bias and the matter power spectrum.
5.4. Discussion
Our analysis shows that the emission-related components, which are governed by the HOD parameterization (e.g., SFR, emissivity) are in excellent agreement between the simulation and the analytical model. The differences are consistently below 5% and well within expected numerical uncertainties. In contrast, the two-halo term shows a much larger discrepancy, exceeding 20%. This points not to issues in the emission modeling but rather to limitations in the cosmological components of the broader halo model framework, such as the treatment of halo bias or matter clustering.
Specifically, when decomposing the two-halo term (Eq. (8)), the most likely contributors to the mismatch are the halo bias and matter power spectrum ingredients. Their product, b2Pmatter, sets both the amplitude and the scale dependence of clustering. In the M21 framework, both factors are taken from linear theory (bias from Tinker et al. 2010 and Pmatter from CAMB), whereas SIDES–Uchuu includes the full nonlinear evolution of structure, naturally producing scale-dependent halo bias and a nonlinear matter power spectrum. This difference plausibly explains the redshift- and scale-dependent residuals we observe in the two-halo component.
Given these considerations, the discrepancy is best interpreted as arising from limitations in the overall halo modeling framework, rather than the HOD prescription alone. In particular, the use of linear approximations for halo bias and Pmatter appears insufficient to match the fully evolved clustering signal of the simulation. This interpretation is consistent with other studies highlighting the breakdown of linear theory at intermediate scales and higher redshifts (Pénin et al. 2018; Acuto et al. 2021; Mahony et al. 2022; Dizgah et al. 2022; Dvornik et al. 2023; Jun et al. 2025b).
Future improvements to CIB modeling will likely require extending the halo model to include scale-dependent or nonlinear bias prescriptions (e.g., Fedeli et al. 2014; Mead et al. 2021; Mead & Verde 2021; Dizgah et al. 2022) and replacing the linear matter power spectrum with more accurate nonlinear models. These modifications would enhance the robustness of parameter inference, particularly as the precision of dataincreases.
6. The effect of scatter on the SFR and SED
The motivation to compare the M21 HOD model with a simplified version of the SIDES-Uchuu simulation came about when the model failed to accurately recover the exact parameters of the simulation. Therefore, the subsequent analysis presented in the previous sections, focused on the recipes governing the distribution and emission of sources (e.g., HMF, SFR). To isolate the effects of these recipes, we initially omitted any scatter in the relations used to generate the SFR and SED for each galaxy. In this section, we reintroduce this scatter to assess the extent of any disagreement that may arise from the SFR scatter, the SED scatter, or a combination of both.
6.1. Estimating the impact of the SFR and SED scatter
The original SIDES-Uchuu simulation considers the SFR and SED scatter by construction. However, the way that the original simulation is built, it is rather complicated to switch on and off the scatter. In order to get around this, we use the SSU version of the simulation, where activating or deactivating the scatter in SFR or SED is straightforward. In this way we can quantify the impact of scatter’s presence (or absence) on the final powerspectrum.
The SFR recipe used in the SSU is described by Eq. (3). Following the same approach as in the original SIDES-Uchuu, where a scatter of 0.36 dex is used in the Mh-SFR recipe (0.2 dex scatter in the Mh-M* relation through the abundance matching and 0.3 dex in the M*-SFR relation), we also used the same lognormal scatter into the SFR value of each source. However, introducing lognormal scatter requires an additional correction due to a property of the lognormal distribution. Its positive skewness shifts the mean to higher values compared to the central value. To ensure the mean SFR remains consistent with the case without scatter, we applied a correction to shift the mean SFR with scatter back to match the original value.
The M21 HOD model uses effective SEDs that depend only on the redshift of the sources, as detailed in Sect. 4.1. Introducing scatter to the SEDs, however, allows each galaxy’s SED template to be selected based on its mean radiation field value, ⟨U⟩, which in turn is determined by the galaxy’s redshift with an added scatter of 0.2 dex. As a result, galaxies at the same redshift can be assigned different SED templates. In contrast, when using effective SEDs, all galaxies at a given redshift share the same SED template. Furthermore, only main-sequence SED templates are employed, excluding starburst templates, as opposed to the original SIDES model. By enabling or disabling SED scatter, we aim to evaluate the impact of this simplification and assess the validity of the HOD model’s approach.
We consider the four following cases:
-
No Scatter: Both SFR and SED scatter are disabled,
-
SFR Scatter: SFR scatter enabled, SED scatter disabled,
-
SED Scatter: SFR scatter disabled, SED scatter enabled,
-
SFR & SED Scatter: Both SFR and SED scatter are enabled.
The goal is to compare these four cases with each other, in order to quantify the extent to which each type of scatter influences the resultant power spectrum. We investigate this effect separately for the clustering and the shot-noise component of the power spectrum.
As previously, to calculate the shot noise, we generate maps by randomly rearranging the positions of the sources while keeping their luminosity unchanged, and then compute the power spectrum of these maps. To compute the clustering component, we subtract the shot noise spectrum from the total powerspectrum.
6.2. Results
Figure 8 shows the power spectra at 217 × 217 GHz for both clustering and shot-noise components. The four different cases that result in slightly different power spectra, are indicated with different colors. Furthermore, the bottom panels feature a visual representation of the power spectrum ratios relative to the “no scatter” scenario, clearly demonstrating and quantifying the difference among the various cases.
![]() |
Fig. 8. Power spectra computed using the simplified SIDES-Uchuu simulation for the 217 × 217 GHz Planck frequency, featuring varying scatter scenarios, each represented by distinct colors, as detailed in the legend. The top panels display the raw power spectra, while the bottom panels depict the power spectrum ratios relative to the “no scatter” case in linear scale. Left: the clustering component of the power spectrum. Right: the shot-noise component. |
Introducing scatter significantly affects the shot-noise component, leading to noticeable differences in the power spectra compared to the “no scatter” case. Specifically, the SFR scatter results in a shift of ∼20%, while SED scatter alters the shot noise by ∼30%. Combining both SFR and SED scatter results in a power spectrum that deviates by over 50% from the scenario without any scatter. This behavior is expected since the effect of the two different types of scatter adds up when used together.
Things are different for the case of the clustering component, where it is obvious that the scatter has a smaller impact than in the shot noise. The addition of SFR scatter has an impact on the clustering below 5%, while the SED scatter can introduce a shift on the order of 10%.
6.3. Discussion
We anticipated that scatter in the emission relations would notably affect the shot noise, due to its high sensitivity to the flux of individual sources. The presence of scatter in either the SFR or the SED modifies the source luminosity, which directly affects their flux. Since the shot noise depends on an integral over the squared flux, even modest variations can substantially impact its amplitude. Typically, the shot noise of the CIB power spectrum provides insight into the abundance of faint, unresolved sources. However, our results indicate that the implementation of scatter in the relations of the M21 HOD model can lead to significantly different levels of shot noise. This variation could result in misinterpretations when analyzing observed shot noiseamplitudes.
In contrast, the effect of scatter on the clustered (two-halo) component of the power spectrum is more modest in our study, with deviations remaining below 10%. This result is consistent with the findings of Jun et al. (2025a), who studied the impact of secondary bias, specifically, correlations between star formation rate and halo structural properties, on the two-halo term of the LIM power spectrum. Using the IllustrisTNG simulation, they found that such correlations can enhance the two-halo term by up to 5%. While their focus is on secondary environmental dependencies rather than stochastic scatter in emission relations, both mechanisms introduce similar scale-dependent modulations to the large-scale power. This strengthens the interpretation that SFR-related stochasticity and environmental effects may modestly bias clustering-based inferences, but are not dominant.
In comparison, Murmu et al. (2023) found that scatter in the luminosity–halo mass relation can impact the [CII] power spectrum by up to 50%. This is more aligned with our findings for the shot-noise component. However, a direct comparison is challenging: their study examined a single spectral line, whereas we analyze broadband continuum emission integrated over many sources and many redshifts. Moreover, Murmu et al. (2023) focused on the total power spectrum, without disentangling the one-halo, two-halo, and shot-noise components, making it difficult to isolate which part is most affected. It is likely that the majority of the offset in their study stems from the shot-noise component, which aligns with our findings and is also consistent with Gkogkou et al. (2023), where it was shown that shot noise exhibits much greater variation for [CII] than for the CIB.
Overall, incorporating scatter in the emission relations of the HOD model would enhance its physical realism and versatility. However, scatter has only a limited impact on the clustered component and can reasonably be neglected when the focus is on large-scale structure modeling. On the other hand, it plays a critical role in the amplitude of the shot noise. Fully modeling scatter in the emission relations would significantly increase the complexity of the halo model framework, as simple mean scaling relations would no longer be sufficient. However, treating the shot noise as a free parameter offers a practical and effective alternative. It allows the model to absorb uncertainties introduced by scatter or sample variance (Gkogkou et al. 2023) without overcomplicating the parameter space.
7. Summary and conclusions
Halo models are a key tool for interpreting CIB anisotropies. They aim to extract astrophysical and cosmological parameters by linking galaxy emission to large-scale dark matter structure. In this work, we evaluated the four-parameter HOD model developed by Maniyar et al. (2021), which is embedded within the broader halo model framework and tailored for CIB data. Our goal was to test its ability to recover physical quantities by fitting it to simulated data with known ground truth.
We first fit the M21 HOD model to mock data generated from the SIDES-Uchuu simulation, which includes a realistic treatment of galaxy formation and clustering. While the model reproduced the observed power spectra and SFRD reasonably well, it systematically failed to recover key intrinsic parameters, notably overestimating the halo mass corresponding to peak star formation efficiency.
To isolate the cause, we constructed a simplified version of the simulation (SSU), adopting the same prescriptions as the M21 HOD model for SFR, SEDs, and halo occupation. Despite this alignment, best-fit parameters still deviated from the known inputs, suggesting that the problem lies beyond the choice of parameterization.
A detailed, one-to-one comparison between the HOD model and the SSU simulation revealed that the SFRD and differential emissivity agree to within 5%, confirming that emission-related components are robustly modeled. However, we find a persistent, scale- and redshift-dependent offset in the two-halo component of the power spectrum, exceeding 20%. This points not to shortcomings of the HOD itself, but to limitations of the broader halo-model ingredients, most notably the use of linear, scale-independent halo bias together with a linear matter power spectrum. These limitations are expected to be most prominent across the one- to two-halo transition, where nonlinear growth, scale-dependent and higher order bias, and possible environmental dependencies become important. Addressing these effects, for example by incorporating scale-dependent or nonlinear bias together with a nonlinear P(k), will be necessary for robust parameter recovery from high precision CIB data.
We also assessed the impact of including scatter in the SFR and SED assignments. This primarily affected the shot-noise component of the power spectrum (with shifts exceeding 50%), while the clustered signal remained largely unchanged (below 10% variation). This validates the treatment of shot noise as a free nuisance parameter but suggests that shot noise interpretations should be treated with caution when comparing to data.
In summary, although the halo model provides a good fit to the CIB power spectrum, it does not reliably recover the underlying physical parameters, even when matched to the exact simulation prescriptions. The dominant source of this discrepancy appears to lie in the broader halo model framework, particularly in its linear treatment of halo bias and matter clustering. Incorporating scale-dependent or nonlinear models for these cosmological components is likely necessary to improve the accuracy of halo model-based CIB analyses. This work underscores the value of simulation-based validation and has broader implications for both current CIB studies and future high redshift line-intensity mapping efforts.
All the results published in M21 were obtained using the astroph version of Tinker & Wetzel (2010). It should be noted that this version contains a typographical error in the subhalo mass function, where a value of 0.13 was used instead of the correct value, which is 0.3. In this work we have used the correct one.
Acknowledgments
This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 788212). This work was supported by the TITAN ERA Chair project (contract no. 101086741) within the Horizon Europe Framework Program of the European Commission. We thank M. Jose Luis Bernal, Azadeh Moradinezhad Dizgah, and Mathilde Van Cuyck for the helpful discussions.
References
- Acuto, A., McCarthy, I. G., Kwan, J., et al. 2021, MNRAS, 508, 3519 [NASA ADS] [CrossRef] [Google Scholar]
- Alonso, D., Sanchez, J., Slosar, A., & LSST Dark Energy Science Collaboration 2019, MNRAS, 484, 4127 [NASA ADS] [CrossRef] [Google Scholar]
- Amblard, A., Cooray, A., Serra, P., et al. 2011, Nature, 470, 510 [NASA ADS] [CrossRef] [Google Scholar]
- Asgari, M., Mead, A. J., & Heymans, C. 2023, Open J. Astrophys., 6, 39 [NASA ADS] [CrossRef] [Google Scholar]
- Bernal, J. L., & Kovetz, E. D. 2022, Line-Intensity Mapping: Theory Review [Google Scholar]
- Béthermin, M., Daddi, E., Magdis, G., et al. 2012, ApJ, 757, L23 [Google Scholar]
- Béthermin, M., Wang, L., Doré, O., et al. 2013, A&A, 557, A66 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Béthermin, M., Daddi, E., Magdis, G., et al. 2015, A&A, 573, A113 [Google Scholar]
- Béthermin, M., Wu, H.-Y., Lagache, G., et al. 2017, A&A, 607, A89 [Google Scholar]
- Béthermin, M., Gkogkou, A., Van Cuyck, M., et al. 2022, A&A, 667, A156 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Cooray, A., & Sheth, R. 2002, Phys. Rep., 372, 1 [Google Scholar]
- Daddi, E., Dickinson, M., Morrison, G., et al. 2007, ApJ, 670, 156 [NASA ADS] [CrossRef] [Google Scholar]
- De Bernardis, F., & Cooray, A. 2012, ApJ, 760, 14 [Google Scholar]
- Despali, G., Giocoli, C., Angulo, R. E., et al. 2016, MNRAS, 456, 2486 [NASA ADS] [CrossRef] [Google Scholar]
- Dizgah, A. M., Nikakhtar, F., Keating, G. K., & Castorina, E. 2022, JCAP, 2022, 026 [Google Scholar]
- Dvornik, A., Heymans, C., Asgari, M., et al. 2023, A&A, 675, A189 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Fakhouri, O., Ma, C.-P., & Boylan-Kolchin, M. 2010, MNRAS, 406, 2267 [Google Scholar]
- Fedeli, C., Semboloni, E., Velliscig, M., et al. 2014, JCAP, 2014, 028 [Google Scholar]
- Foreman-Mackey, D., Hogg, D. W., Lang, D., & Goodman, J. 2013, PASP, 125, 306 [Google Scholar]
- Gispert, R., Lagache, G., & Puget, J. L. 2000, A&A, 360, 1 [NASA ADS] [Google Scholar]
- Gkogkou, A., Béthermin, M., Lagache, G., et al. 2023, A&A, 670, A16 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Ishiyama, T., Prada, F., Klypin, A. A., et al. 2021, MNRAS, 506, 4210 [NASA ADS] [CrossRef] [Google Scholar]
- Jego, B., Ruiz-Zapatero, J., García-García, C., Koukoufilippas, N., & Alonso, D. 2023, MNRAS, 520, 1895 [Google Scholar]
- Jun, R. L., Theuns, T., Moriwaki, K., & Bose, S. 2025a, MNRAS, 543, 1494 [Google Scholar]
- Jun, R. L., Theuns, T., Moriwaki, K., & Bose, S. 2025b, MNRAS, 540, 433 [Google Scholar]
- Kennicutt, R. C., Jr 1998, ARA&A, 36, 189 [NASA ADS] [CrossRef] [Google Scholar]
- Kereš, D., Katz, N., Weinberg, D. H., & Davé, R. 2005, MNRAS, 363, 2 [Google Scholar]
- Khusanova, Y., Bethermin, M., Le Fèvre, O., et al. 2021, A&A, 649, A152 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Kovetz, E. D., Viero, M. P., Lidz, A., et al. 2017, arXiv e-prints [arXiv:1709.09066] [Google Scholar]
- Lagache, G., Bavouzet, N., Fernandez-Conde, N., et al. 2007, ApJ, 665, L89 [NASA ADS] [CrossRef] [Google Scholar]
- Lagache, G., Bethermin, M., Montier, L., Serra, P., & Tucci, M. 2020, A&A, 642, A232 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Madau, P., & Dickinson, M. 2014, ARA&A, 52, 415 [Google Scholar]
- Magdis, G. E., Daddi, E., Bethermin, M., et al. 2012, ApJ, 760, 6 [NASA ADS] [CrossRef] [Google Scholar]
- Mahony, C., Dvornik, A., Mead, A., et al. 2022, MNRAS, 515, 2612 [NASA ADS] [CrossRef] [Google Scholar]
- Maniyar, A., Béthermin, M., & Lagache, G. 2018, A&A, 614, A39 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Maniyar, A., Béthermin, M., & Lagache, G. 2021, A&A, 645, A40 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Matsuura, S., Bock, J. J., Cooray, A., et al. 2024, in Space Telescopes and Instrumentation 2024: Optical, Infrared, and Millimeter Wave, eds. L. E. Coyle, S. Matsuura, & M. D. Perrin, SPIE Conf. Ser., 13092, 130920V [Google Scholar]
- McCarthy, F., & Madhavacheril, M. S. 2021, Phys. Rev. D, 103, 103515 [CrossRef] [Google Scholar]
- Mead, A. J., & Verde, L. 2021, MNRAS, 503, 3095 [NASA ADS] [CrossRef] [Google Scholar]
- Mead, A., Brieden, S., Tröster, T., & Heymans, C. 2021, MNRAS, 502, 1401 [Google Scholar]
- Moster, B. P., Naab, T., & White, S. D. M. 2013, MNRAS, 428, 3121 [Google Scholar]
- Murmu, C. S., Olsen, K. P., Greve, T. R., et al. 2023, MNRAS, 518, 3074 [Google Scholar]
- Navarro, J. F., Frenk, C. S., & White, S. D. M. 1997, ApJ, 490, 493 [Google Scholar]
- Pénin, A., Umeh, O., & Santos, M. 2018, MNRAS, 473, 4297 [Google Scholar]
- Planck Collaboration XVIII. 2011, A&A, 536, A18 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Planck Collaboration IX. 2014, A&A, 571, A9 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Planck Collaboration XXX. 2014, A&A, 571, A30 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Planck Collaboration VI. 2020, A&A, 641, A6 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Sargent, M. T., Béthermin, M., Daddi, E., & Elbaz, D. 2012, ApJ, 747, L31 [Google Scholar]
- Schmidt, S. J., Ménard, B., Scranton, R., et al. 2015, MNRAS, 446, 2696 [NASA ADS] [CrossRef] [Google Scholar]
- Schreiber, C., Pannella, M., Elbaz, D., et al. 2015, A&A, 575, A74 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Shang, C., Haiman, Z., Knox, L., & Oh, S. P. 2012, MNRAS, 421, 2832 [NASA ADS] [CrossRef] [Google Scholar]
- Silk, J. 2003, MNRAS, 343, 249 [Google Scholar]
- Thacker, C., Cooray, A., Smidt, J., et al. 2013, ApJ, 768, 58 [Google Scholar]
- Tinker, J. L., & Wetzel, A. R. 2010, ApJ, 719, 88 [NASA ADS] [CrossRef] [Google Scholar]
- Tinker, J., Kravtsov, A. V., Klypin, A., et al. 2008, ApJ, 688, 709 [Google Scholar]
- Tinker, J. L., Robertson, B. E., Kravtsov, A. V., et al. 2010, ApJ, 724, 878 [NASA ADS] [CrossRef] [Google Scholar]
- Viero, M. P., Ade, P. A. R., Bock, J. J., et al. 2009, ApJ, 707, 1766 [NASA ADS] [CrossRef] [Google Scholar]
- Viero, M. P., Wang, L., Zemcov, M., et al. 2013, ApJ, 772, 77 [Google Scholar]
- Viero, M. P., Reichardt, C. L., Benson, B. A., et al. 2019, ApJ, 881, 96 [NASA ADS] [CrossRef] [Google Scholar]
- Zagatti, G., Calabrese, E., Chiocchetta, C., et al. 2024, A&A, 692, A190 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
- Zemcov, M., & SPHEREx Science Team 2018, Am. Astron. Soc. Meeting Abstr., 231, 354.24 [Google Scholar]
Appendix A: Covariance matrix estimation
To accurately estimate uncertainties and correlations between the power spectra at different frequencies and angular scales, we computed the full Gaussian covariance matrix using the NaMaster flat-sky formalism. This was done across the four Planck frequencies (217, 353, 545, and 857 GHz) and eight multipole bins (see Fig. 1). The computation uses simulated maps covering the full SIDES-Uchuu area with a uniform mask, under the flat-sky approximation.
NaMaster internally accounts for mode coupling via the coupling matrix derived from the sky mask. However, due to the large and nearly uniform sky coverage of the SIDES-Uchuu maps, we find that off-diagonal terms from mode coupling are negligible. Therefore, while mode coupling is included in the computation, its impact is minimal in our case.
The resulting 32 × 32 matrix captures the joint covariance structure of all power spectra. The square roots of its diagonal elements are used as error bars in the analysis. This procedure is repeated for each simulation configuration considered in this work. The corresponding correlation matrices for the SIDES-Uchuu and SSU cases are shown in Fig. A.1 and Fig. A.2, respectively. The correlation matrix Ri, j is computed from the covariance matrix Ci, j as:
![]() |
Fig. A.1. Correlation matrix of the SIDES-Uchuu power spectra for the four Planck frequency bands at eight ℓ bins. |
![]() |
Fig. A.2. Correlation matrices of the four Planck frequency bands for the four different SSU configurations. Top row, left to right: "with subs" and τ ≠ 0, "with subs" and τ = 0. Bottom row, left to right: "no subs" and τ ≠ 0, "no subs" and τ = 0. |
Appendix B: Fitting the M21 HOD model to SSU data with τ = 0
In this section we show the results of fitting the M21 HOD model to the SSU data, with the τ parameter (which regulates the evolution of the efficiency curve width with redshift) set to zero. We explored this scenario to evaluate whether a simpler parameterization of the efficiency in converting baryonic accretion into star formation could improve the M21 HOD model’s ability to describe the simulated SSU data.
The fitting procedure follows the same methodology described in Sect. 4. The resulting corner plot is displayed in Fig. B.1 and the percentage differences between the intrinsic simulation values and those derived from the HOD model are shown in Fig. B.2. All parameters are recovered with comparable accuracy to the τ ≠ 0 case, showing no significant improvement. This result is unexpected, as removing τ was intended to break the degeneracy with σMh and thereby improve parameter recovery.
![]() |
Fig. B.1. Resulting corner plots from fitting the M21 model to mock data generated with SSU for the case of τ = 0 no evolution of the width of the efficiency curve with respect to redshift. The orange contours and histograms correspond to "with subs", while the green color represents the "no subs" case. |
![]() |
Fig. B.2. Percentage difference between the intrinsic and the best-fit values of the model parameters for the case of τ = 0 while keeping the τ ≠ 0 case from Fig. 4 for reference. |
The aim of this exercise was not to advocate for a simpler parameterization, but rather to evaluate whether this modification could eliminate the observed offsets and clarify their origin. Since the offsets persist, the results suggest that they likely stem from a more fundamental limitation within the components of the M21 HOD model.
Appendix C: Complementary plots
Fig. C.1 presents the corner plots obtained from fitting the M21 HOD model to simulated data from the original SIDES-Uchuu simulation. Details of the fitting process are provided in Sect. 3.
![]() |
Fig. C.1. Corner plots depicting the outcomes of fitting the M21 HOD model to mock SIDES data. The best-fit values of the model parameters are highlighted in red. |
To complement our results, we include additional plots for all other frequencies analyzed in our tests. Fig. C.2 shows the ratio of the original SIDES-Uchuu simulated data to the power spectra derived from the HOD model’s best-fit parameters for all combinations of the Planck frequency bands. The ratios exhibit the same behavior across all frequencies as presented in the main text for the 217 × 217 GHz case, with only minor variations in amplitude.
![]() |
Fig. C.2. Ratio of the CIB halo model over the mock data obtained with the original SIDES (Fig. 1). The different colors correspond to different frequency pairs. |
Fig. C.3 presents the corner plots obtained from fitting the M21 HOD model to SSU simulated data in the case of "with subs" and τ ≠ 0. Details of the data generation and fitting process are provided in Sect. 4.
![]() |
Fig. C.3. Resulting corner plots from fitting the M21 model to mock data generated with SSU (see Sect. 4.1 for a detailed description). The orange contours and histograms correspond to the scenario where the subhalos are considered ("with subs"), while the green color represents the case when subhalos are excluded ("no subs"). The intrinsic parameter values for the "with subs" and "no subs" cases are illustrated by the dotted dashed orange line and the dashed green line, respectively. Above each histogram, the best-fit value is displayed in the corresponding color, while the intrinsic value is shown in parentheses and in black. Note that the intrinsic values of the shot noise in the "no subs" and "with subs" case are not the same. |
All Tables
Best-fit values with uncertainties and the ranges of uniform distributions used as priors for fitting the M21 HOD model to mock data from SIDES-Uchuu.
Intrinsic and best-fit values of the shot noise for the four Planck frequencies (217, 353, 545, 857 GHz).
Best-fit values together with the uncertainties obtained after fitting the HOD model to SSU mock data.
All Figures
![]() |
Fig. 1. Mock data of the CIB auto- and cross-power spectra obtained by SIDES-Uchuu (black points) and the best-fit CIB halo model (black line) with its components (one-halo term, two-halo term, shot noise). The ratio of the model over the data is shown in Fig. C.2. |
| In the text | |
![]() |
Fig. 2. Mock data of the SFRD obtained by SIDES-Uchuu (blue points) and the 1 and 2 σ contours of the SFRD models defined by the various sets of parameters explored by the MCMC walkers (red shaded areas). |
| In the text | |
![]() |
Fig. 3. SFR-BAR efficiency from SIDES-Uchuu (blue gradient) and the mean efficiency (solid blue line). The various models from the MCMC parameter exploration of the M21 HOD model is shown in red (red contours for 1σ and 2σ). Each subplot corresponds to a different redshift, with the center of the redshift bin displayed on the top right of each panel. The redshift interval used to compute the density of the SFR/BAR values of the sources in the simulation is Δz = 0.05. The colorbar indicates the SFR/BAR value density, with blue representing high density and white indicating lower density, implying fewer sources with these values. |
| In the text | |
![]() |
Fig. 4. Percentage difference between the intrinsic and the best-fit values of the model parameters from the halo model fit to the SSU simulated data. The green color corresponds to the “no subs” case, and the orange color corresponds to “with subs”. |
| In the text | |
![]() |
Fig. 5. Comparison of the SFRD from the M21 HOD model (red line) and the SSU (blue line) as a function of redshift. The bottom panel displays the ratio of the SSU SFRD to that of the M21 HOD model. |
| In the text | |
![]() |
Fig. 6. Comparison of the differential emissivity predicted by the M21 HOD model (red line) and the SSU model (blue line) at z = 2.412. The bottom panel illustrates the ratio of the SSU model to the M21 HOD model at z = 2.412 (black line), alongside the ratios for other redshifts (represented by lines in various colors). |
| In the text | |
![]() |
Fig. 7. Comparison of the two-halo term from the M21 HOD model (red line) and the SSU (blue line), excluding subhalos. The bottom panel displays the ratio of the SSU to that of the M21 HOD model. |
| In the text | |
![]() |
Fig. 8. Power spectra computed using the simplified SIDES-Uchuu simulation for the 217 × 217 GHz Planck frequency, featuring varying scatter scenarios, each represented by distinct colors, as detailed in the legend. The top panels display the raw power spectra, while the bottom panels depict the power spectrum ratios relative to the “no scatter” case in linear scale. Left: the clustering component of the power spectrum. Right: the shot-noise component. |
| In the text | |
![]() |
Fig. A.1. Correlation matrix of the SIDES-Uchuu power spectra for the four Planck frequency bands at eight ℓ bins. |
| In the text | |
![]() |
Fig. A.2. Correlation matrices of the four Planck frequency bands for the four different SSU configurations. Top row, left to right: "with subs" and τ ≠ 0, "with subs" and τ = 0. Bottom row, left to right: "no subs" and τ ≠ 0, "no subs" and τ = 0. |
| In the text | |
![]() |
Fig. B.1. Resulting corner plots from fitting the M21 model to mock data generated with SSU for the case of τ = 0 no evolution of the width of the efficiency curve with respect to redshift. The orange contours and histograms correspond to "with subs", while the green color represents the "no subs" case. |
| In the text | |
![]() |
Fig. B.2. Percentage difference between the intrinsic and the best-fit values of the model parameters for the case of τ = 0 while keeping the τ ≠ 0 case from Fig. 4 for reference. |
| In the text | |
![]() |
Fig. C.1. Corner plots depicting the outcomes of fitting the M21 HOD model to mock SIDES data. The best-fit values of the model parameters are highlighted in red. |
| In the text | |
![]() |
Fig. C.2. Ratio of the CIB halo model over the mock data obtained with the original SIDES (Fig. 1). The different colors correspond to different frequency pairs. |
| In the text | |
![]() |
Fig. C.3. Resulting corner plots from fitting the M21 model to mock data generated with SSU (see Sect. 4.1 for a detailed description). The orange contours and histograms correspond to the scenario where the subhalos are considered ("with subs"), while the green color represents the case when subhalos are excluded ("no subs"). The intrinsic parameter values for the "with subs" and "no subs" cases are illustrated by the dotted dashed orange line and the dashed green line, respectively. Above each histogram, the best-fit value is displayed in the corresponding color, while the intrinsic value is shown in parentheses and in black. Note that the intrinsic values of the shot noise in the "no subs" and "with subs" case are not the same. |
| In the text | |
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.


![$$ \begin{aligned} \begin{split} \mathrm{SFR}(M_h, z)&= \eta (M_h, z) \times \mathrm{BAR}(M_h, z) \\&= \eta _{\rm max} \, e^{- \frac{\left[ ln(M_h) - ln(M_{\rm max}) \right]^2}{2\sigma ^2_{M_h}(z)}} \times \mathrm{BAR}(M_h, z), \end{split} \end{aligned} $$](/articles/aa/full_html/2025/11/aa56282-25/aa56282-25-eq3.gif)



![$$ \begin{aligned} \begin{split} C^{1h}_{\ell ,\nu ,\nu ^{\prime }}&= \int \int \frac{d\chi }{dz} \left( \frac{a}{\chi } \right)^2 \left[ \frac{dj_{\nu ,c}}{dlogM_h} \frac{dj_{\nu ^{\prime },sub}}{dlogM_h} u(k,M_h,z) \right. \\&+ \frac{dj_{\nu ^{\prime },c}}{dlogM_h} \frac{dj_{\nu ,sub}}{dlogM_h} u(k,M_h,z) \\&+ \left. \frac{dj_{\nu ,sub}}{dlogM_h} \frac{dj_{\nu ^{\prime },sub}}{dlogM_h} u^2(k,M_h,z) \right] \left( \frac{d^2N}{dlogM_h dV} \right)^{-1} dz dlogM_h\,, \end{split} \end{aligned} $$](/articles/aa/full_html/2025/11/aa56282-25/aa56282-25-eq12.gif)
![$$ \begin{aligned} \begin{split} C^{2h}_{\ell ,\nu ,\nu ^{\prime }}&= \int \int \int \frac{d\chi }{dz} \left( \frac{a}{\chi } \right)^2 \left[ \frac{dj_{\nu ,c}}{dlogM_h} + \frac{dj_{\nu ,sub}}{dlogM_h} u(k,M_h,z) \right] \\&\times \left[ \frac{dj_{\nu ^{\prime },c}}{dlogM^{\prime }_h} + \frac{dj_{\nu ^{\prime },sub}}{dlogM^{\prime }_h} u(k,M_h,z) \right] \\&\times b(M_h,z) b(M^{\prime }_h,z) P_{\rm lin}(k,z) d \mathrm{log} M_h d \mathrm{log} M^{\prime }_h dz\,, \end{split} \end{aligned} $$](/articles/aa/full_html/2025/11/aa56282-25/aa56282-25-eq14.gif)




















