Stochastic analysis of ultrahigh-energy cosmic ray interactions

Leonel Morejon; Karl-Heinz Kampert

doi:10.1051/0004-6361/202557405

Open Access

Issue		A&A Volume 708, April 2026


Article Number		A21
Number of page(s)		23
Section		Astrophysical processes
DOI		https://doi.org/10.1051/0004-6361/202557405
Published online		25 March 2026

A&A, 708, A21 (2026)

Stochastic analysis of ultrahigh-energy cosmic ray interactions

Leonel Morejon^★ and Karl-Heinz Kampert^★

Bergische Universität Wuppertal, Gaußstraße 20, 42119 Wuppertal, Germany

^★ Corresponding authors: This email address is being protected from spambots. You need JavaScript enabled to view it. ; This email address is being protected from spambots. You need JavaScript enabled to view it.

Received: 25 September 2025
Accepted: 13 February 2026

Abstract

Context. Photonuclear interactions between ultrahigh-energy cosmic ray (UHECR) nuclei and photon fields are key to connecting the compositions observed at Earth to those emitted from the sources as well as understanding their evolution during propagation. The stochastic nature of these interactions implies that any deterministic description is an approximation, but an exact probabilistic formulation is not yet available. Currently, Monte Carlo simulations are the only means to probe the probability space of these events.

Aims. We aim to obtain a rigorous probabilistic description of UHECR interactions based on the theory of stochastic processes and to derive closed-form expressions for the probability distributions of the distance, with the inclusion of all energy losses and cosmological effects. Additionally, we aim to demonstrate the application of this description in astrophysical scenarios where UHECR-photon interactions are relevant, such as radiation models of UHECR sources and extragalactic propagation of UHECRs.

Methods. We applied the theory of Markov jump processes to the nuclear cascades produced by photonuclear interactions of UHECRs, providing definitions to characterize the cascades within this new framework. By comparing different choices of input originating from nuclear physics experiments and electromagnetic observations, we improved our understanding of the underlying physics.

Results. We developed the sought-after probabilistic description and derived analytical expressions for various types of cascades, which are defined in terms of the number of nuclei and disintegration channels involved. We discuss the fundamental properties of these cascades and their relation to physical parameters, such as the UHECR composition, the Lorentz boost, and the target photon thickness. We discuss the role of continuous energy losses in this description and show their impact in the propagation horizons of UHECRs. We computed several quantities of interest, such as the UHECR disintegration distance as a function of the nuclear mass, the composition evolution in a gamma-ray burst (GRB) source scenario, and the angular diffusion of disintegration products produced during extragalactic propagation, while accounting for interactions.

Key words: astroparticle physics / relativistic processes

© The Authors 2026

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. This email address is being protected from spambots. You need JavaScript enabled to view it. to support open access publication.

1. Introduction

Experimental observations of cosmic rays alone are insufficient to answer the fundamental questions about their origins. A precise understanding of the magnetic deflections and interactions that affect their production and propagation is essential for reconstructing their past history. In the case of ultrahigh-energy cosmic rays (UHECRs), it is now understood that their sources must be extragalactic (Aab et al. 2015; Abdul Halim et al. 2024a). The plausibility of hypothetical sources is assessed by using knowledge of interactions and magnetic deflections to produce synthetic quantities that can be compared with the main observables, such as the energy spectrum and fluctuations in the depth of shower maximum (Abdul Halim et al. 2023) as well as, more recently, including arrival directions (Abdul Halim et al. 2024b). The present work focuses on the interactions of UHECRs, with some mention of the effects of turbulent magnetic fields. The effect of Galactic magnetic fields (Unger & Farrar 2024; Korochkin et al. 2025) will not be addressed here.

During acceleration and diffusion within the sources, as well as during propagation, UHECRs interact with surrounding photon fields¹, such as the cosmic microwave background (CMB), the cosmic infrared background (IRB), or nonthermal spectra in the source. These interactions result in the loss of energy and photodisintegration of the UHECR. Some interactions do not change the nuclear species of the UHECR and are well characterized as deterministic (often referred to as continuous energy losses, CEL) if fluctuations of the inelasticity and the interaction length are negligible. Some examples include Bethe-Heitler pair production and synchrotron losses. Conversely, stochastic interactions often result in the transformation or loss of the interacting particle (referred to as stochastic losses, SL) with discrete changes in interaction lengths and multiple possible outcomes for the resulting species. Examples of such interactions include the photodisintegration of cosmic ray nuclei, where the number of nucleons lost is not deterministic, as well as photomeson production, where multiple meson-producing channels are available (depending on the energy), each with a distribution of inelasticity and number of secondaries. This process also leads to a distribution of nuclear fragments (Morejon et al. 2019).

Although the fundamental differences between SL and CEL were already recognized in early works (Puget et al. 1976; Yoshida & Teshima 1993), a CEL approach has been widely employed (e.g. Hill & Schramm 1985; Berezinskii & Grigor’eva 1988) to describe the evolution of the cosmic ray spectrum. However, improvements in the experimental precision and indications of a heavier composition have increased the need for more sophisticated methods that account for the probabilistic nature of SL efficiently. Today, approaches to computing the interactions of UHECR nuclei can be grouped into two types. The first type uses the continuous-limit approximation (Hooper et al. 2008; Aloisio et al. 2013a,b; Boncioli et al. 2017; Biehl et al. 2018; Heinze et al. 2019; Lia & Tamborra 2024), where the energy densities of different nuclear species are computed by solving a coupled system of differential equations in which SL and CEL are treated as continuous losses. The second type uses Monte Carlo methods (Epele & Roulet 1998; Globus et al. 2015; Alves Batista et al. 2016; Aloisio et al. 2017), which simulate the underlying stochasticity by sampling each interaction and tracking the products individually. The former approach has the advantage of faster computation and analytic solutions have even been put forward by limiting the number of disintegration channels (Hooper et al. 2008; Ahlers & Taylor 2010; Ahlers et al. 2013; Aloisio et al. 2013a,b). The latter is considered to be more theoretically correct because it best reflects the nature of the interactions and allows for the estimation of stochastic effects. However, there are intrinsic limitations to the method, such as the computational expense involved, depending on assumptions, and problems of convergence (e.g., Asmussen & Glynn 2007). More importantly, Monte Carlo simulations provide limited theoretical insight because the impact of input uncertainties (e.g., nuclear cross-sections and photon field models) cannot be easily determined without an exhaustive and computationally demanding parameter space scan. In contrast, an analytic framework can facilitate the study of correlations between inputs (in some cases, explicitly) and the computation is considerably more efficient. It can also achieve arbitrary precision at modest computational effort. Conversely, Monte Carlo methods often waste computational resources on uninteresting events as they are blind to the underlying probability space (see Appendix E). Currently, there is no formal theoretical framework that can describe the stochasticity of the UHECR interactions analytically.

This paper presents an analytic theoretical framework that addresses the interactions of UHECRs with photon fields prevalent in extragalactic propagation and within sources. The resulting closed-form expressions describe the probability distributions as a function of target thickness for an arbitrary initial condition. This approach can easily be extended to include nuclear masses beyond iron, enabling the independent study of the effects of uncertainties in inputs such as nuclear cross-sections and photon fields.

2. Stochastic description

The continuous-limit temporal evolution of the energy densities of UHECR nuclei interacting with photon fields is described by a coupled system of ordinary differential equations,

$\begin{matrix} \frac{\partial}{\partial t} n_{i} (E_{i}) = \frac{\partial}{\partial E_{i}} (b n_{i}) + q_{i}^{ext} (E_{i}) + \sum_{j} λ (E_{j} / m_{j}) n_{j} (E_{j}), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\partial }{\partial t} n_i (E_i) = \frac{\partial }{\partial E_i} (b n_i) + q_i^{\mathrm{ext} }(E_i) + \sum _j \lambda (E_j/m_j) n_j(E_j), \end{aligned} $$$ (1)

where n_i(E_i)≡n_i(E_i, t) is the differential number density of nuclear species i (noting that quantities are time dependencies although not written explicitly). The first term on the right-hand side includes all CEL processes, such as synchrotron and escape losses in the case of source scenarios, and pair production and adiabatic losses in the case of extragalactic propagation. The term q_i^ext describes the injection of particles with energy, E_i, which could represent the injections through acceleration mechanisms within the sources, or emissions from multiple sources in the case of extragalactic propagation. The terms λ(E_j/m_j) denote the interaction rates for all SL processes incurred by species j, leading to the production of species i, with photons of energy ϵ and number density n(ϵ). This includes the term j = i, which has a different form −λ_in_i(E_i) with λ(E_i/m_i), the total interaction rate. The total cross-section for species i as a function of photon energy, ε, (in the center-of-mass rest frame), σ_i(ε) = ∑_kσ_i → k(ε), includes all possible products, k, and is given by the sum of the interaction rates λ_i(γ) = ∑_kλ_i → k(γ), which are computed as

$\begin{matrix} λ_{i \to k} (γ) = \frac{1}{2 γ^{2}} \int_{0}^{\infty} \frac{n (ϵ)}{ϵ^{2}} d ϵ \int_{0}^{2 ϵ γ} ε σ_{i \to k} (ε) d ε, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda _{i\rightarrow k}(\gamma ) = \frac{1}{2\gamma ^2}\int _0^{\infty } \frac{n(\epsilon )}{\epsilon ^2} d\epsilon \int _0^{2\epsilon \gamma } \varepsilon \sigma _{i \rightarrow k} (\varepsilon ) d\varepsilon \, , \end{aligned} $$$ (2)

with γ representing the Lorenz factor. The system described by Eq. (1) may comprise between ∼50–200 nuclear species when including elements up to iron and an energy grid with enough resolution (∼100 bins in logarithmic scale) to capture the details of the spectra.

The approach presented here aims at describing nuclear cascades initiated by individual cosmic rays and because of the boost preserving property of SLs, this implies solving Eq. (1) for individual values of the Lorentz boost γ ≈ E_k/m_k, so we can write

$\begin{matrix} \frac{\partial}{\partial t} {\tilde{n}}_{i} (γ) = \frac{\partial}{\partial γ} (\tilde{b} {\tilde{n}}_{i}) + {\tilde{q}}_{i}^{ext} (γ) + \sum_{j} λ_{j \to i} (γ) {\tilde{n}}_{j} (γ), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\partial }{\partial t} \tilde{n}_i (\gamma ) = \frac{\partial }{\partial \gamma } (\tilde{b} \tilde{n}_i) + \tilde{q}_i^{\mathrm{ext} }(\gamma ) + \sum _j \lambda _{j \rightarrow i}(\gamma ) \tilde{n}_j(\gamma ) , \end{aligned} $$$ (3)

where the tilde reflects that the quantities are now differential in boost instead of energy. This linear system of ordinary differential equations can be written as a matrix differential equation,

$\begin{matrix} \frac{\partial}{\partial t} N - N Λ = \frac{\partial}{\partial γ} (\tilde{b} N) + Q^{ext}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\partial }{\partial t}\boldsymbol{N} - \boldsymbol{N}\boldsymbol{\Lambda } = \frac{\partial }{\partial \gamma } (\tilde{b} \boldsymbol{N}) + \boldsymbol{Q}^{\mathrm{ext} } , \end{aligned} $$$ (4)

where N is a row vector containing all densities ${{\tilde{n}}_{i} (γ, t)}$ $Mathematical equation: $ \{\tilde n_i(\gamma, t)\} $$ , Q^ext is a row vector, with elements ${{\tilde{q}}_{i}^{ext} (γ)}$ $Mathematical equation: $ \{ \tilde q_i^{\mathrm{ext}}(\gamma) \} $$ , and Λ is the interaction rate matrix, {λ_ji = λ_j → i(γ)} ({λ_ii = −λ_i(γ)}), which is a square matrix with zeros for elements j with no production of element i. The numerical integration of Eq. (1) (or, equivalently, Eq. (4)) yields the time evolution of the species densities N(t) requiring knowledge of the initial densities N(t = 0)≡N₀ and injections Q^ext (including possible time dependence). Notably, for certain functions Q^ext, the solution may have an analytic form when the term $\frac{\partial}{\partial γ} (\tilde{b} N)$ $Mathematical equation: $ \frac{\partial}{\partial \gamma} (\tilde b \boldsymbol{N}) $$ is negligible (no CEL) since this term is the only one coupling the equations corresponding to different values of γ.

Equation (4) (as Eq. (1)) reflects the mean behavior of individual cascades (continuous limit), but it does not describe the stochastic behavior of the interactions and the resulting fluctuations of the stochastic quantities. The accurate underlying process is as follows: an initial UHECR nucleus propagates along a path of random length (determined by the relevant magnetic field) until it decays or interacts with the surrounding photon field. Each interaction produces a random number of secondaries according to a given set of probabilities. The secondaries and the remnant species (the secondary with the largest mass) continue to propagate under the influence of magnetic fields and subsequently interact stochastically with further random products. This corresponds to a Markov jump process (Bladt & Nielsen 2017), where the transient states are the nuclear species with transition probabilities determined by the current state. The transitions (jumps) are exponentially distributed as a function of the path length (or time). In this probabilistic framework, the homogeneous form of Eq. (4) (without CEL and no injections) is analogous to Kolmogorov’s differential equation,

$\begin{matrix} \frac{d}{dt} P^{t} = G P^{t} = P^{t} G, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{d}{d t} \boldsymbol{P}^t = \boldsymbol{G} \boldsymbol{P}^t = \boldsymbol{P}^t\boldsymbol{G} , \end{aligned} $$$ (5)

where instead of the density vector, N, the more appropriate P^t appears, which is a matrix where each row i contains the probabilities p_ij^t of transitioning from the i-th state to each possible state j at a time t. The infinitesimal generator, G, is equivalent to the interaction matrix, Λ, when no “absorbing state” is considered and fulfills G1 = 0, where 1 and 0 are column vectors of ones and zeros, respectively, of the same dimension as G. An “absorbing state” in our context is a species or set of species we wish to observe; thus, it does not interact or decay further once it is reached. This is of relevance to obtain the distance distributions until observing sets of species of interest (distance until absorption). When considering absorption,

$\begin{matrix} G (γ) = (\begin{matrix} Λ & - Λ 1 \\ 0 & 0 \end{matrix}), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{G}(\gamma ) = \begin{pmatrix} \boldsymbol{\Lambda }&-\boldsymbol{\Lambda } \boldsymbol{1} \\ \boldsymbol{0}&0 \\ \end{pmatrix} , \end{aligned} $$$ (6)

where 1 (0) are a column (row) vector of ones (zeros) of the same dimension as Λ. This reflects that G contains the absorbing state, in addition to all species included in Λ, and, consequently, the rows in Λ do not add up to zero for species which can produce species in the absorbing state. For these species, the main diagonal terms in Λ, {−λ_i} contain also jumps to the absorbing state; hence, the last column −Λ1 to ensure G is properly defined.

The connection between Eq. (5) and the homogeneous form of Eq. (4) is evident since the solution P^t = e^Gt for time-homogeneous conditions (length and time independence of Λ) in the former, is equivalent to the solution of the latter N(t) = N₀e^Λt, and, as mentioned e^Gt ≡ e^Λt in the absence of any absorbing states. However, it should be emphasized that the two equations are in fact describing different quantities and are not completely equivalent: Eq. (5) describes the time evolution of stochastic quantities, such as the occupation probability for each state of the nuclear cascade. Meanwhile, Eq. (4) describes the time evolution of the deterministic quantities (the number density distribution for each nuclear species). These two descriptions can be connected in the continuous limit, when stochastic effects are less important and the evolution of the system behaves similarly to a fluid flow between a network of containers (Bladt & Nielsen 2017, Section 4.6). On the other hand, the differences between approaches are more appreciable for inhomogeneous scenarios or noncontinuous injection. For example, these descriptions may be equivalent for a delta-like injection term, Q^ext, but for a time-varying injection, the stochastic treatment needs additional assumptions (see Section 4.3). Similarly, the inclusion of CEL leads to inhomogeneities; thus, P^t ≠ e^Gt and the equivalence of the approaches is not as evident as in the homogeneous case.

Nevertheless, the stochastic description presented here is not limited to stationary and homogeneous cases, changes in time or in the propagation path of UHECRs can be treated as a type of time-inhomogeneity; for instance, a global change in the normalization of the target photon field density (see Section 3). This is also how CELs are incorporated within this framework and other effects, such as the influence of magnetic fields. Before addressing the more complex inhomogeneous scenarios, it is helpful to outline the fundamental properties of the homogeneous ones and establish benchmarks. To this end, we focus here on the distributions of distance until reaching the absorption state.

2.1. Serial and regular cascades; the canonical form

First, we consider the case with only one nuclear species for each mass and only one possible interaction channel at each state; alternatively, there could be multiple channels, but some m-th channel has the largest branching ratio λ_i(γ)≈λ_i → m(γ). A typical case is the one-nucleon-loss assumption (Hooper et al. 2008), where the cascade of a nucleus with mass A proceeds in a chain of nuclei with descending mass {A, A − 1, A − 2, ..., A − k + 2, A − k + 1}, denoting the sequence of states visited over k consecutive interactions. The interactions of the species are governed by the respective rates, making up the interaction vector λ_{A → A − k}(γ):={λ_A(γ),λ_A − 1(γ),λ_A − 2(γ),...,λ_{A − k + 2}(γ),λ_{A − k + 1}(γ)}, computed by substituting the relevant cross-section into Eq. (2), and evaluated on the common boost, γ. The sequential nature of these cascades implies that the probability distribution of the propagation path lengths until k disintegrations, L_k, is the convolution of k exponential distributions. This is a hypoexponential distribution with parameter vector λ(γ). The expected value of this distribution has a straightforward physical meaning: $E [L_{k}] = \sum_{i = A - k + 1}^{A} 1 / λ_{i}$ $Mathematical equation: $ \mathbb{E}[L_k] = \sum_{i=A-k+1}^A 1/\lambda_i $$ , the sum of the mean interaction lengths of each species in the chain. This assumption was used in Morejon (2021) to understand the behavior of more complex disintegration networks for nuclei of masses up to lead. Cascades of these type, in which each interaction produces only one channel with one nuclear species at each stage are referred to as serial cascades (SeCs) herein.

For the canonical cascade, we assume that the photonuclear interaction rates are proportional to the mass number of the species, namely, λ_A(γ) = Aλ₁(γ), where λ₁(γ) is the interaction rate per nucleon, which implies the relations $λ_{A_{l}} = \frac{A_{l}}{A_{k}} λ_{A_{k}}$ $Mathematical equation: $ \lambda_{A_l} = \frac{A_l}{A_k} \lambda_{A_k} $$ for any k and l. This is motivated by the approximate proportionality of the photonuclear cross-section to the mass number, as reflected by the Thomas-Reiche-Kuhn sum rule ( $\int σ (ε) d ε \propto \frac{ZN}{A}$ $Mathematical equation: $ \int \sigma(\varepsilon) d\varepsilon \propto \frac{ZN}{A} $$ ) for giant dipole resonances (GDR) and the mass scaling of the cross-section in photomeson interactions (Morejon et al. 2019). Cascades where the rates follow this type of proportionality with mass are called regular herein. In general, photonuclear cross-sections deviate from this behavior from one species to another. However, these relations are a good approximation of the mean interaction rates and constitute a suitable benchmark for analyzing realistic distributions (see Fig. 1).

Fig. 1.

Estimation of the deviation from regularity of photonuclear cross-sections. Top: Dependence of the energy-weighted photodisintegration cross-section divided by the nuclear mass as a function of photon energy. The lines show the average over all nuclear species in the respective model. The shaded bands represent the standard deviation at each energy and are centered on the mean. Bottom: The coefficient of variation (standard deviation divided by the mean) at each energy.

Thus, we define the canonical form that we use as a benchmark, the regular serial cascade (RSeC): a SeC that obeys the regularity condition. The probability density of distances until k nucleons are stripped from a nucleus of mass number A is given by (see Appendix A):

$\begin{matrix} f_{A \to A - k}^{RS} (L) = λ_{A - k + 1} e^{- λ_{A - k + 1} L} (\begin{matrix} A \\ k - 1 \end{matrix}) {(1 - e^{- λ_{1} L})}^{k - 1} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{RS} }_{A \rightarrow A-k}(L) = \lambda _{A-k+1}e^{-\lambda _{A-k+1}L} \left({\begin{array}{c}A\\ k-1\end{array}}\right) \left(1 - e^{-\lambda _1 L}\right)^{k-1} . \end{aligned} $$$ (7)

The interpretation of this expression is very intuitive: the distribution consists of k independent events: the probability that any k − 1 nucleons out of the initial A interact within the trajectory length, L, (the term $(\binom{A}{k - 1}) {(1 - e^{- λ_{1} L})}^{k - 1}$ $Mathematical equation: $ {A \choose k-1} \left(1 - e^{-\lambda_1 L}\right)^{k-1} $$ ) and the probability density for the interaction of species with mass A − k + 1 (the term λ_{A − k + 1}e^{−λ_{A − k + 1}L}), which is the last species that leads to the production of A − k. This interpretation becomes clearer in terms of the binomial distribution. Setting the interaction probability (success) for one nucleon to be equal to ξ = 1 − e^−λ₁L the equation can be transformed to

$\begin{matrix} f_{A \to A - k}^{RS} (ξ) = \frac{k}{ξ} B (A, k, ξ) = \frac{A - k + 1}{1 - ξ} B (A, k - 1, ξ), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{RS} }_{A \rightarrow A-k}(\xi ) = \frac{k}{\xi } \mathrm{B} (A, k, \xi ) = \frac{A-k+1}{1-\xi } \mathrm{B} (A, k-1, \xi ) , \end{aligned} $$$ (8)

where the relation $\frac{d}{dL} = λ_{1} (1 - ξ) \frac{d}{d ξ}$ $Mathematical equation: $ \frac{d}{dL} = \lambda_1 (1 - \xi) \frac{d}{d\xi} $$ is used. The binomial distribution, denoted by B(A, k, ξ), is the probability of obtaining k disintegrations (successes) out of A independent trials. This is a consequence of the regularity of the cascade: the constancy of the interaction rate per nucleon, λ₁, implies that nuclear effects are negligible and that the cascade is insensitive to the specific nuclei involved. The factors $\frac{k}{ξ}$ $Mathematical equation: $ \frac{k}{\xi} $$ , $\frac{A - k + 1}{1 - ξ}$ $Mathematical equation: $ \frac{A-k+1}{1-\xi} $$ result from the change of differential variable in the density and the arbitrary choice of the “success” probability, ξ or 1 − ξ.

Equation (7) is also equivalent to the beta distribution ℬ(α, β) with parameters (α = k, β = A − k + 1) and defined expressions for the moments from which trivial relations for the RSeCs can be obtained (see Table 1). Given the previous expressions, the distribution for a specified initial composition, represented by the set of fractions {C_i}={η_i, A_i}, where the fractions η_i add up to 1, can be constructed as a linear combination of the distributions for each initial mass,

$\begin{matrix} f_{{C_{i}} \to A_{f}}^{RS} (ξ) = \sum_{i} η_{i} f_{A_{i} \to A_{f}}^{RS} (ξ) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{RS} }_{\{C_i\} \rightarrow A_f}(\xi ) = \sum _i \eta _i f^{\mathrm{RS} }_{A_i \rightarrow A_f}(\xi ) . \end{aligned} $$$ (9)

Table 1.

Characteristics of the distributions with a closed form.

2.2. Irregular cascades and the nuclear decays

The regularity condition assumes that nuclear cross-sections are unaffected by nuclear effects. In reality, changes in the number of protons and neutrons have a significant impact on the properties of the GDR, including the peak energy and the width. Consequently, the mass scaling of the interaction rates exhibits deviations from the regular values².

The deviations from regularity can be quantified independently of the target photon spectrum using the energy-weighted cross-section

$\begin{matrix} {\bar{σ}}_{ε} (ε) = \frac{2}{ε^{2}} \int_{0}^{ε} ε^{'} σ (ε^{'}) d ε^{'}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \bar{\sigma }_{\varepsilon }(\varepsilon ) = \frac{2}{\varepsilon ^2} \int _0^{\varepsilon } \varepsilon ^{\prime } \sigma (\varepsilon ^{\prime }) d\varepsilon ^{\prime } \, , \end{aligned} $$$ (10)

which forms part of Eq. (2) when rewritten in the form $λ (γ) = \int_{0}^{\infty} n (ϵ) {\bar{σ}}_{ε} (2 γ ϵ) d ϵ$ $Mathematical equation: $ \lambda(\gamma) = \int_0^{\infty} n(\epsilon) \bar\sigma_{\varepsilon}(2\gamma\epsilon)d\epsilon $$ . Figure 1 represents the deviations from regularity for different cross-section datasets as the average over all nuclear species of the energy-weighted cross-section divided by the mass number. The shaded bands represent one standard deviation centered around the mean of the respective curves, and the bottom plot shows the coefficient of variation (ratio of width of the band to the line values). For a regular model the bands collapse to the mean line, since the standard deviation would be null. The models shown illustrate different existing choices for the set of nuclear species and the functional shape of their cross-sections: some contain only one species per mass number, as the PSB model (Puget et al. 1976) or the model available in SimProp v2r4 (Aloisio et al. 2017) with command-line option -M 2 < xsect_BreitWigner_TALYS-1.6.txt, both with 56 species; meanwhile, others contain larger collections of species, such as the default model in CRPropa 3.2 (Kampert et al. 2013; Alves Batista et al. 2022) with 184 species, or the much larger collection of cross-sections, the GDR Atlas (Kawano et al. 2020), which has two different parameterizations for the GDR (SLO / SMLO) and covers 532 species up to nuclear mass 56 (larger masses are also available in the Atlas). The coefficient of variation is large for energies below the GDR and reduces after the peak for all models, typically to about 10% or less for all except the serial models which remain above 30%. The mean energy-weighted cross-section divided by the mass number is a fundamental quantity for a cross-section model, as it is connected to the mean interaction rate per nucleon by $⟨ λ_{1} ⟩ (γ) = \int_{0}^{\infty} n (ϵ) ⟨ {\bar{σ}}_{ε} / A ⟩ (2 γ ϵ) d ϵ$ $Mathematical equation: $ \langle\lambda_1\rangle(\gamma) = \int_0^{\infty} n(\epsilon) \langle\bar\sigma_{\varepsilon}/A\rangle(2\gamma\epsilon)d\epsilon $$ which is the analogous of λ₁ in irregular models.

Another cause of irregularity is spontaneous nuclear decay because in this framework the decay rate is part of the total interaction rate,

$\begin{matrix} λ_{A_{i} \to A_{j}}^{tot} (γ) = λ_{A_{i} \to A_{j}} (γ) + γ / c τ, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda ^{\mathrm{tot} }_{A_i \rightarrow A_j}(\gamma ) = \lambda _{A_i \rightarrow A_j} (\gamma ) + \gamma /c \tau , \end{aligned} $$$ (11)

which produces deviations from regularity for boost values and decay times, τ, where the second term is comparable to the first. SeCs with rates that deviate from the regular relations are referred to as irregular serial cascades (ISeCs) herein.

In contrast to Eq. (8), the probability density for ISeCs cannot be reduced to a dependence on the masses, because the mass scaling regularity does not apply. An ISeC is distributed according to a hypoexponential distribution, while its density can be expressed as a linear combination of the exponential distributions with interaction vector λ_{A → A − k}(γ) as long as they are all different³,

$\begin{matrix} f_{A \to A - k}^{IS} (L) = \sum_{i = 1}^{k} p_{i} (0) λ_{A - i} e^{(- λ_{A - i} L)} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{IS} }_{A \rightarrow A-k}(L) = \sum _{i = 1}^{k} p_i(0) \lambda _{A-i} e^{(-\lambda _{A-i} L)} . \end{aligned} $$$ (12)

Here the coefficients p_i(0) are the Lagrange interpolation polynomials evaluated at λ = 0,

$\begin{matrix} p_{j} (λ) = \prod_{j = 1, j \neq i}^{k} \frac{λ_{A - j} - λ}{λ_{A - j} - λ_{A - i}} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} p_j(\lambda ) = \prod _{j = 1,\, j \ne i}^{k} \frac{\lambda _{A-j} - \lambda }{\lambda _{A-j} - \lambda _{A-i}} . \end{aligned} $$$ (13)

Equation (12) facilitates estimating the impact of irregularity on the density and it can be reduced to Eq. (7) when the regularity condition is imposed, as expected (see Appendix B).

The more general expression, which is also applicable to cases where not all rates differ, is

$\begin{matrix} f_{A \to A - k}^{IS} (L) = - ϕ e^{Λ L} Λ 1, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{IS} }_{A \rightarrow A-k}(L) = -\boldsymbol{\phi } e^{\boldsymbol{\Lambda } L } \boldsymbol{\Lambda } \boldsymbol{1} , \end{aligned} $$$ (14)

where ϕ is a row vector denoting the initial fractions. Therefore, the vector is all zeros except for a one in the first element, as in this case, there is only one starting species which corresponds to mass A. The interaction matrix Λ contains the negative interaction rates λ ≡ {λ_A(γ),λ_A − 1(γ),...,λ_{A − k + 1}(γ)} on the main diagonal and on the upper diagonal, contiguous to the main diagonal, the positive interaction rates in λ except the last one (all rows add up to zero except the last row).

Equation (14) can be written as a combination of the base exponential distributions, as in Eq. (12), which is particularly useful for comparisons to other cascades. Since the interaction matrix Λ is upper triangular, it is nonsingular (provided all diagonal rates are different) and diagonalizable. Its diagonalized form $D_{Λ} = J^{- 1} Λ J$ $Mathematical equation: $ \boldsymbol{D}_{\Lambda} = \boldsymbol{J}^{-1} \boldsymbol{\Lambda} \boldsymbol{J} $$ has the same diagonal elements as Λ (where J is an invertible matrix). Thus, Eq. (14) can be written as

$\begin{matrix} f_{A \to A - k}^{IS} (L) = - b e^{D_{Λ} L} D_{Λ} d . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{IS} }_{A \rightarrow A-k}(L) = -\boldsymbol{b} e^{\boldsymbol{D}_{\Lambda } L } \boldsymbol{D}_{\Lambda } \boldsymbol{d} . \end{aligned} $$$ (15)

The starting vector b = ϕJ and the ending vector d = J⁻¹1 depend on the contents of Λ and the central term $e^{D_{Λ} L} D_{Λ}$ $Mathematical equation: $ e^{\boldsymbol{D}_{\Lambda} L } \boldsymbol{D}_{\Lambda} $$ has a diagonal form and is common to all interaction matrices, Λ, having the same diagonal elements. Thus, it is useful for comparing cascades with the same total interaction rates but with a different number of channels. Equation (15) implies a linear combination of exponentials with rates from λ_{A → A − k} and coefficients c_k = −b_kd_k given by the elements of the starting and ending vectors. In this form, the physical meaning of the starting and ending vectors is lost and the coefficients c_k may take complex values.

The expression for the distribution function of ISeCs is

$\begin{matrix} F_{A \to A - k}^{IS} (L) = 1 - ϕ e^{Λ L} 1 . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F^{\mathrm{IS} }_{A \rightarrow A-k}(L) = 1 - \boldsymbol{\phi } e^{\boldsymbol{\Lambda } L} \boldsymbol{1} . \end{aligned} $$$ (16)

Some moments of interest are listed in Table 1. In the cases where analytic expressions for the moments and variance are not available, some bounds can be established (He et al. 2019; He 2021). The distribution functions for an arbitrary mixture can be computed as in Eq. (9), where the distribution for each individual cascade has the form of Eq. (14). However, it is more convenient to build the starting vector with the initial fractions, ϕ_mix = (η_A, η_A − 1, ....,η_A − k), and substitute in Eq. (14),

$\begin{matrix} f_{A \to A - k}^{IS, mix} (L) = ϕ_{mix} e^{Λ L} Λ 1 . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{IS, mix} }_{A \rightarrow A-k}(L) = \boldsymbol{\phi }_{\mathrm{mix} } e^{\boldsymbol{\Lambda } L} \boldsymbol{\Lambda } \boldsymbol{1} . \end{aligned} $$$ (17)

2.3. Concurrent cascades

The general cascade requires the inclusion of multiple channels at each step, producing a network of states. Unlike serial cascades where the path between any pair of states is unique, in the general case, each node may branch into multiple options forming a network of intersecting ISeCs that develop concurrently. These cascade types are referred to as concurrent cascades (CoCs) herein. One of the simplest examples in the literature is the disintegration scheme proposed by Puget et al. (1976), the PSB model. In the PSB model, while there is only one species for each mass, each nucleus can jump to multiple other nuclei due to the additional disintegration channels, such as one- and two-nucleon emission in the GDR region and 6–15 nucleon emission in the quasi-deuteron region. The density function for the distance until absorption is the same as in Eq. (14), but the matrix Λ has additional terms in each row representing jumps to other nuclei in the chain. This is unlike the matrix for ISeCs, which contains only jumps to the immediate species with lower mass. An expression in the form of Eq. (15) may not exist in general for CoCs as no set of coefficients c_k can produce the equivalent function (see Appendix B).

In their most general form, CoCs should include all known nuclear species, having multiple nuclei with the same mass number. The density is described as in Eq. (14) although the interaction matrix Λ and the starting vector ϕ contain a larger number of rows, matching the increased number of species. The nondiagonal elements of the matrix Λ in this case are expressed as

$\begin{matrix} λ_{S_{i} \to S_{j}} = λ_{S_{i} \to S_{j}}^{tot} (γ) = \sum_{k} λ_{S_{i} \to S_{j}}^{k} (γ) + γ / c \sum_{m} τ_{m} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda _{S_i \rightarrow S_j} = \lambda ^{\mathrm{tot} }_{S_i \rightarrow S_j}(\gamma ) = \sum _k \lambda ^k_{S_i \rightarrow S_j} (\gamma ) + \gamma /c \sum _m \tau _m \end{aligned} $$$ (18)

which denotes all k photonuclear channels where species S_i produces species S_j, and all m decays having a decay time τ_m, where S_i decays into S_j. The sequence of indices i, j in ϕ and Λ is chosen in order of descending mass and charge numbers, as for RSeCs and ISeCs. This ensures that the matrix Λ is upper triangular, since disintegrations can only produce species with lower masses⁴. However, the lower triangular section of the matrix Λ may contain nonzero elements if there are nuclear decays that preserve the mass number, while increasing the charge number (e.g., β⁻ decays). The main diagonal of the interaction matrix contains the total interaction rate for each species, S_i, which is the sum of all processes that lead to any other species, S_j, in the disintegration cascade,

$\begin{matrix} λ_{S_{i}} = λ_{S_{i}}^{tot} (γ) = \sum_{S_{j}} λ_{S_{i} \to S_{j}}^{tot} (γ) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda _{S_i} = \lambda ^{\mathrm{tot} }_{S_i}(\gamma ) = \sum _{S_j} \lambda ^{\mathrm{tot} }_{S_i \rightarrow S_j} (\gamma ) . \end{aligned} $$$ (19)

With these elements, the resulting interaction matrix takes the form

$\begin{matrix} Λ (γ) = (\begin{matrix} - λ_{S_{1}} & λ_{S_{1} \to S_{2}} & λ_{S_{1} \to S_{3}} & . . . & λ_{S_{1} \to S_{N}} \\ 0 & - λ_{S_{2}} & λ_{S_{2} \to S_{3}} & . . . & λ_{S_{2} \to S_{N}} \\ 0 & 0 & - λ_{S_{3}} & . . . & λ_{S_{3} \to S_{N}} \\ . . . & . . . & . . . & . . . & . . . \\ 0 & 0 & 0 & . . . & - λ_{S_{N}} \end{matrix}) \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{\Lambda }(\gamma ) = \begin{pmatrix} -\lambda _{S_1}&\lambda _{S_1 \rightarrow S_2}&\lambda _{S_1 \rightarrow S_3}&...&\lambda _{S_1 \rightarrow S_N}\\ 0&-\lambda _{S_2}&\lambda _{S_2 \rightarrow S_3}&...&\lambda _{S_2 \rightarrow S_N}\\ 0&0&-\lambda _{S_3}&...&\lambda _{S_3 \rightarrow S_N}\\ ...&...&...&...&...&\\ 0&0&0&...&-\lambda _{S_N}\\ \end{pmatrix} \end{aligned} $$$ (20)

where N is the total number of species included, and the probability density and distribution functions for the distance until absorption are

$\begin{matrix} f^{CC} (L) = - ϕ exp (Λ L) Λ 1 \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^{\mathrm{CC} }(L) = -\boldsymbol{\phi } \exp \left( \boldsymbol{\Lambda } L\right) \boldsymbol{\Lambda } \boldsymbol{1} \end{aligned} $$$ (21)

$\begin{matrix} F^{CC} (L) = 1 - ϕ exp (Λ L) 1 . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F^{\mathrm{CC} }(L) = 1 - \boldsymbol{\phi } \exp \left( \boldsymbol{\Lambda } L\right) \boldsymbol{1} . \end{aligned} $$$ (22)

In CoCs, the “absorption state” may be a group of states (and not always a unique species) and it is represented by the absorption vector ω = −Λ1 whose components are the rates of transitioning to absorption from each of the species. For instance, when computing transitions between mass groups, ϕ would contain nonzero values for nuclei with a mass equal to the injection mass number and the absorption vector ω would be nonzero for nuclei with a mass equal to the final mass. Thus, this formulation allows us to study any possible type of cascade, and the construction of the matrix Λ encodes also the absorption state.

Figure 2 (left) exemplifies these cascades by showing the probability that specific primary nuclei or a composition of primary nuclei with a Lorenz factor of γ = 7 ⋅ 10⁹ have fully disintegrated after propagating a certain distance. The green and purple outer lines represent primary ⁴He and ⁵⁶Fe, respectively.

Fig. 2.

Left: Probability of finding an injected nucleus or composition of nuclei fully disintegrated after propagating a specified distance (γ = 7 ⋅ 10⁹). The green and purple solid lines in the extremes correspond to ⁴He and ⁵⁶Fe injection, respectively, representing the lightest and heaviest initial compositions. The black solid line uses a similar composition as obtained in fits of the UHECR spectrum (Abdul Halim et al. 2024b). The specific nuclei and their approximate fractions are given in the legend. The dashed black line represents the case where all species share the same fraction and the dot-dashed black line a composition reflecting solar abundances. Right: Occupation probabilities for species in the nuclear cascade for ⁵⁶Fe injection (γ = 7 ⋅ 10⁹). The values are given for two propagation distances in the range where the full disintegration probability is negligible: 2 Mpc (purple) and 10 Mpc (green).

Additionally, three examples of mixed composition are shown: solar abundance (black dot-dashed line), which is very light but contains a nonzero fraction of species heavier than helium; a UHECR-like composition (black solid line), which has elemental fractions similar to those obtained by fitting the UHECR spectrum and composition; and an equal fraction for all species (dashed black line) which places a larger fraction on heavier species because they are more numerous.

The figure illustrates the marked differences in composition evolution over typical UHECR propagation lengths ranging from 1 to 100 Mpc. After a propagation distance of ∼100 Mpc, all nuclei are fully disintegrated in the three cases, but the lighter mixtures reach 50% probability of complete disintegration within 20-30 Mpc, while the heavier mixtures require 2-3 times larger distances.

It should be noted that these distributions do not describe the occupation probabilities of species in the nuclear cascade, but only the probability of having the initial species fully disintegrated. Figure 2 (right) represents such occupation probabilities for an initial ⁵⁶Fe (γ = 7 ⋅ 10⁹) after two different propagation distances for which the probability of full disintegration is almost null. After 2 Mpc the average mass is reduced by 8-10 nucleons, given the short interaction lengths for iron and nuclei of similar mass. It might seem counterintuitive that after five times larger distance (10 Mpc) the average mass is still ∼20 instead of five times the loss for 2 Mpc, but this is the expected result given the reduction of the interaction lengths as the cascade moves to lower masses. The details of the cascade evolution demonstrate that injecting certain species as surrogates for mass groups is not a valid simplification for probability distributions.

Nevertheless, the correlation between a certain mass loss and the corresponding distance needed is remarkably stable. Figure 3 illustrates this relation over a broad boost range. Since the distributions span from a few to thousands of megaparsecs depending on the boost, the mean of the distribution was used to regularize the distance scale. The density distributions for the distance until the initial state ⁵⁶Fe is absorbed into a certain mass group (indicated with different colors) are shown with solid lines representing the mean, and shaded bands bracketing the extreme values at each distance point, as the boost moves in the range 4 ⋅ 10⁸ to 3 ⋅ 10¹⁰. The remarkable regularity of the distributions is evident from the negligible variation, especially considering the broad range of length scales and the differences in the target photon fields. Indeed, ⟨L⟩ is in the sub- to few megaparsec scales for γ ≳ 3 ⋅ 10⁹ (predominantly CMB interactions) and in the tens to gigaparsec scales for γ ≲ 3 ⋅ 10⁹ (predominantly IRB interactions). Distributions involving only a few species have a broader relative width, as seen for absorption at mass 50, becoming narrower with the increase of intermediate species, and showing little change for absorption masses below 40. A discussion of the implications of this regularity in extragalactic propagation is included in Sect. 4.

Fig. 3.

Density functions of distance until reaching different values of nuclear mass, the variation for the boost γ ∈ [4 ⋅ 10⁸, 3 ⋅ 10¹⁰] is represented by the shaded bands. The distributions are standardized and centered at the expected value, as they span different scales at different boosts.

2.4. Light secondary products

In addition to the leading mass, nuclear cascades produce multiple light nuclei, such as deuterium and α-particles, which can be considered boost-preserving products. They also produce light secondaries, such as pions and single nucleons, which are produced with a broad spectrum of energies. Larger nuclear fragments may also be present. For example, photo-fission yields at least two fragments of similar mass. In this stochastic description, all of these products are treated as additional particles in each stochastic jump. In the discussions above, the largest mass nucleus has been used as the nominal species, denoting the current state of the cascade. Here, we describe the treatment of the aforementioned secondary products, which are produced during state jumps in the cascade.

Clearly, the production of light secondaries is also stochastic, as it relates to transitions in the developing cascade. A simple approach to compute the production of the k-th secondary, $\frac{d}{dL} Q_{k} (γ, L)$ $Mathematical equation: $ \frac{d}{dL}Q_k(\gamma, L) $$ , as a function of the path length L and the Lorentz boost γ

$\begin{matrix} \frac{d Q_{k}}{dL} (γ, L) = ϕ \frac{d}{dL} P^{L} Y_{k} 1 = ϕ P^{L} (L, γ) Λ (γ) Y_{k} (γ) 1, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{dQ_k}{dL}(\gamma , L) = \phi \frac{d}{d L} \boldsymbol{P}^L \boldsymbol{Y}_k \boldsymbol{1} = \phi \boldsymbol{P}^L(L, \gamma ) \boldsymbol{\Lambda }(\gamma ) \boldsymbol{Y}_k(\gamma ) \boldsymbol{1} , \end{aligned} $$$ (23)

where the yield matrix Y_k(γ) = {y_ij^k(γ)} contains the number of light secondaries of species, k, produced in jumps from any species j to i. The Y_k matrix is strictly lower triangular, although some of the upper triangular elements could be nonzero, as discussed for the lower triangular part of the matrix Λ.

Boost-preserving products are injected into the same boost. For products with a broad spectrum, the boost distribution is described by a function dn_i → j^k/dx, where x is the fraction to the primary energy, x = E_k/E_i ≈ A_k/A_iγ_k/γ_i (generally independent of the boost). The norm is equal to the yield y_ij^k = ∫dn_i → j^k/dx. The treatment of the production of these light particles is well understood (Hümmer et al. 2010; Morejon et al. 2019) and the evolution of their spectrum over propagation can be computed analytically (Berezinsky et al. 1990).

3. Continuous energy losses

The stochastic processes discussed so far do not account for the effect of CELs, which are deterministic (nonstochastic) interactions that cause energy losses without altering the nuclear species. This degradation in energy affects the Markov property of the cascade because the rates are no longer constant, but evolve as the Lorentz boost changes. Thus, CELs correspond to inhomogeneous continuous-time Markov chains, which violate the time homogeneity (the temporal independence of the rates of jumps between states). This means that the current state of the cascade depends on the entire past history rather than just the previous state.

In our context, it is useful to distinguish between two types of inhomogeneities caused by CELs: coherent inhomogeneities (CI), in which the present state depends on the total time (distance) elapsed, but not on the specific history of the process (i.e., the sequence of species), and dispersive inhomogeneities (DI), where the probability of the present state depends on the specific sequence of species in the past history. The latter type (DI) leads to differences in the boost evolution of the underlying concurrent cascades (dispersion), whereas in the former (CI) all concurrent cascades experience the same boost evolution (coherence). The effects of CI can be accommodated analytically through variable transformations if the time-dependence of the CI is known. DI effects are generally not analytically computable, but approximations and numerical methods are available for such cases (e.g., Arns et al. 2010). The relevant scenarios are discussed below for both the propagation of UHECRs and in-source interactions.

3.1. Coherent inhomogeneities

Cases of coherent inhomogeneities involve target photon fields that vary over time, since all rates are affected by the time-dependence of the field regardless of the state of the cascade. For sources, a fireball scenario fits this description, given the adiabatic cooling of the interaction volume as it expands. In the case of propagation, the cosmological evolution of the target photon backgrounds and the adiabatic losses experienced by UHECRs can produce such inhomogeneities.

In general, when the interaction rates can be expressed as the product of a scaling function μ(L) dependent on distance (or redshift, time, etc.) and a rate dependent on the boost, the distribution and the density functions are analogous to the homogeneous ones (Albrecher & Bladt 2019; Zhang & Comput 2021):

$\begin{matrix} f (L) = - μ (L) ϕ exp (\int_{0}^{L} μ (s) d s Λ) Λ 1 \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f(L) = -\mu (L) \boldsymbol{\phi } \exp \left( \int _0^L \mu (s) ds \boldsymbol{\Lambda }\right) \boldsymbol{\Lambda }\boldsymbol{1} \end{aligned} $$$ (24)

$\begin{matrix} F (L) = 1 - ϕ exp (\int_{0}^{L} μ (s) d s Λ) 1 . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F(L) = 1 - \boldsymbol{\phi } \exp \left( \int _0^L \mu (s) ds \boldsymbol{\Lambda }\right) \boldsymbol{1} . \end{aligned} $$$ (25)

For example, suppose the target photon density is a function of time of the form n(ϵ, t) = m(t)n₀(ϵ). The corresponding rates after integrating Eq. (2) take the form λ(γ, t) = m(t)λ₀(γ), the interaction matrix constructed using the rates λ(γ, t) results in a product of a time dependent scalar and a time independent matrix Λ(γ, t) = m(t)Λ₀(γ), and Eqs. (24)–(25) apply with μ(s)≡m(s/c). Comparing Eqs. (24)–(25) to Eqs. (21)–(22) makes it clear that they are equivalent if the propagated length in Eqs. (24)–(25) is understood as the target thickness traversed δ = ∫₀^Lμ(s)ds, and the expressions become identical when no scaling occurs, as μ(s) = 1 produces δ ≡ L. The application to source scenarios is evident when the interaction region expands adiabatically. In these cases, the geometry of the volume informs the functional dependence of m(t), which governs the scaling of the target photon field. Similarly, for plasmoids moving along jets the scaling of the external photon fields could result in a change of only the norm (e.g., Hoerbe et al. 2020), in which case the temporal evolution would determine the form of m(t). Examples of time dependence in a GRB are shown in Sect. 4.3.

In the case of extragalactic propagation, the redshift scaling of the photon densities for the CMB and IRB leads to the convenient form for the interaction rates,

$\begin{matrix} λ (γ, z) = a (z) {(1 + z)}^{3} \cdot λ ((1 + z) γ, z = 0), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda (\gamma , z) = a(z)(1+z)^3 \cdot \lambda ((1+z)\gamma , z = 0) , \end{aligned} $$$ (26)

using the scaling prescription of Kampert et al. (2013), where a(z) reflects the ratio of the photon number density at redshift z and to the present one (a(z)≡1 for the CMB). The redshift-dependent argument (1 + z)γ of the rates in Eq. (26) produces an additional boost drift in the rates so they are not truly separable into a redshift-dependent and boost-dependent components. However, the volume compression factor plays a larger role and the argument (1 + z)γ can be considered constant for sufficiently small propagation lengths⁵. With this assumption, the interaction matrix can be written as a constant matrix multiplied by a(z)(1 + z)³, yielding a thickness of

$\begin{matrix} δ = \int_{0}^{L} a (z) {(1 + z)}^{3} d s = \int_{z (0)}^{z (L)} a (z) \frac{c {(1 + z)}^{2}}{H (z)} d z . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \delta = \int _0^L a(z)(1+z)^3 ds = \int _{z(0)}^{z(L)} a(z) \frac{c(1+z)^2}{H(z)} dz . \end{aligned} $$$ (27)

This thickness has units of distance and can be interpreted as the equivalent propagation distance that a cosmic ray would need to cross to experience the same number of interactions as produced by the increased photon density. The distributions for extragalactic propagation assuming only the thickness is affected by the redshift are analogous to the homogeneous ones

$\begin{matrix} f^{CI} (δ) = - ϕ exp (Λ (γ) δ) Λ (γ) 1, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^\mathrm{{CI}}(\delta ) = -\boldsymbol{\phi } \exp \left( \boldsymbol{\Lambda }(\gamma )\delta \right) \boldsymbol{\Lambda }(\gamma )\boldsymbol{1} , \end{aligned} $$$ (28)

$\begin{matrix} F^{CI} (δ) = 1 - ϕ exp (Λ (γ) δ) 1, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F^\mathrm{{CI}}(\delta ) = 1 - \boldsymbol{\phi } \exp \left( \boldsymbol{\Lambda }(\gamma )\delta \right) \boldsymbol{1} , \end{aligned} $$$ (29)

with the cosmological thickness replacing the propagation distance and noting that

$\begin{matrix} f^{CI} (δ) = \frac{d}{d δ} F^{CI} (δ) = \frac{1}{a (z) {(1 + z)}^{3}} \frac{d}{dL} F^{CI} (δ) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f^\mathrm{{CI}}(\delta ) = \frac{d}{d\delta }F^\mathrm{{CI}}(\delta ) = \frac{1}{a(z)(1+z)^3}\frac{d}{dL}F^\mathrm{{CI}}(\delta ) . \end{aligned} $$$ (30)

Figure 4 compares propagation horizons for ¹⁴N (left) and ⁵⁶Fe (right) caused by disintegration. The abscissa uses the comoving boost γ_c = γ/(1 + z), because it does not change during propagation under CI. For reference to previous works, the energy loss length,

$\begin{matrix} L_{EL} = \sum_{j} \frac{E_{i}}{(E_{i} - E_{j}) λ_{A i \to A_{j}} (γ_{c})} \approx \sum_{j} \frac{A_{i}}{(A_{i} - A_{j}) λ_{A i \to A_{j}} (γ_{c})} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} L_{\rm {EL}}=\sum _j \frac{E_i}{(E_i - E_j) \lambda _{Ai \rightarrow A_j}(\gamma _c)} \approx \sum _j \frac{A_i}{(A_i - A_j) \lambda _{Ai \rightarrow A_j}(\gamma _c)} \end{aligned} $$$ (31)

Fig. 4.

Cosmic ray horizons of ¹⁴N (left) and ⁵⁶Fe (right) in the background photon fields. The widely used energy loss length (dashed red) overestimates the effect of interactions. The L_FD horizons correspond to the distance where the full disintegration probability is 99%: in dot-dashed green the values for the homogeneous case, in solid purple the values assuming coherent inhomogeneities, and in dotted orange the values obtained numerically treating the redshift dependence of the rates and adiabatic losses.

has been included (with present interaction rates, z = 0), as it is often used to estimate the propagation horizons. The continuous limit is implicitly assumed in L_EL as it estimates the distance required to dissipate the initial energy, E_i, at an average energy loss rate computed from the interaction channels of the initial species. The horizons derived here take into account the stochastic nature of the process by including the probability that the initial species has fully disintegrated. The distance $L_{FD}^{ℓ}$ $Mathematical equation: $ L^{\ell}_\mathrm{{FD}} $$ at which the full disintegration distribution $F_{FD} (L_{FD}^{ℓ}) = ℓ$ $Mathematical equation: $ F_\mathrm{{FD}}( L^{\ell}_\mathrm{{FD}}) =\ell $$ , reaches a desired limit ℓ, constrains the probability of not fully disintegrating to 1 − ℓ. The nominal values employed here take ℓ = 99% to define $L_{FD} = L_{FD}^{99 %}$ $Mathematical equation: $ L_\mathrm{{FD}} = L^{99\%}_\mathrm{{FD}} $$ as the full disintegration horizon. The various lines for L_FD correspond to the different approaches employed. The dot-dashed-green line employed the distribution of the homogeneous case, where cosmological effects are ignored; hence, the values of lookback distance larger than the Hubble length. The solid-purple line employed Eq. (29), where the coherent inhomogeneities are taken into consideration only in the thickness, neglecting the redshift dependence of the second term in Eq. (26). The dotted-orange values were computed by numerically integrating Kolmogorov’s differential equation and updating the boost at each step to reflect the redshift dependence of the second term in Eq. (26) and also adiabatic losses. Additional grids denote the initial energy of the cosmic ray (top x-axis), the redshift corresponding to the lookback distance (rightmost inset scale, using a flat Λ-CDM cosmology with values fitted to the WMAP data, Bennett et al. 2013) and the thickness corresponding to the lookback distance (left of the redshift scale, assuming a(z) = 1).

The cosmological effects are negligible for distributions spanning a few hundred megaparsecs and for Lorentz boosts where the CMB is the predominant photon target (γ_c ≳ 5 ⋅ 10⁹) as reflected in the identical values of L_FD for all approaches with and without cosmological effects. In this range of boosts, L_EL yields much shorter horizons than L_FD, even considering the widths of the probability distributions. This is not surprising as L_EL assumes that the average energy loss rate is comparable to the initial values, neglecting the rate decrease as the mass of the cascade decreases and the stochastic fluctuations of the interaction lengths. For lower boosts (γ ≲ 5 ⋅ 10⁹) interactions with IRB photons become dominant, and cosmological effects become appreciable in the separation of the homogeneous horizons from the CI horizons. The solid-purple line includes the cosmological effects only on the thickness, so it illustrates the difference between δ and the lookback distance as it comes from evaluating the homogeneous distribution on δ instead of the lookback distance as in the dot-dashed-green line. The differences between these two curves quantify the error of employing the lookback distance instead of the thickness, and the limitations of the homogeneous approach. The comparison of the thickness and the lookback distance scales is sufficient to decide which approach is needed.

The dotted-orange line includes, besides the thickness increase present in the purple-solid line, the boost shifts produced by adiabatic losses and the cosmological energy increase of the photon backgrounds. We note that while the physical boost changes, the comoving boost, γ_c, does not change in the case of CI (see Sect. 3.2). The shorter horizons below γ ≲ 5 ⋅ 10⁹ are a consequence of the boost increase with redshift, which probes larger values of the interaction rates as they are monotonically increasing with the boost. Thus, the inclusion of δ alone is not sufficient to describe the cosmological effects, but it is a reasonable upper limit with an error appreciable here by comparing the solid-purple and the dotted-orange curves. The present description confirms the so called “explosive regime” in the mass evolution observed by Aloisio et al. (2013a) using a continuous approach and the one-nucleon loss approximation. We demonstrate this is also a feature of the stochastic cascades produced by the increase of the thickness with lookback distance, which becomes very large as redshifts above 0.1. Naturally, this is a general property of UHECR interaction cascades during propagation, irrespective of the nuclear interaction model and the target photon field (see Appendix C).

3.2. Dispersive inhomogeneities

Energy losses that depend on the nuclear species affect the cascade development in variable degrees depending on the specific sequence of states, thus the total energy loss after multiple disintegrations can vary significantly among the concurrent disintegration chains. This implies that different sequences within CoCs could produce diverging evolutions of the Lorentz boost, thus gradually rendering the cascade incoherent.

Examples of CELs that cause DIs include: synchrotron losses, which are relevant within sources with strong magnetic fields; and pair production losses, which are relevant for extragalactic propagation. The rate at which the boost changes (equivalent to the energy loss rate) for synchrotron losses is

$\begin{matrix} - \frac{1}{γ} \frac{d γ}{dL} = \frac{σ_{T} m_{e}^{2}}{6 π m_{p}^{4}} γ B^{2} {(\frac{Z}{A})}^{4} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} -\frac{1}{\gamma }\frac{d\gamma }{dL} = \frac{\sigma _T m_e^2}{6\pi m_p^4} \gamma B^2 \left(\frac{Z}{A}\right)^4 \end{aligned} $$$ (32)

where σ_T is the Thomson cross-section, m_e and m_p are the masses of electrons and protons, respectively, and B is the magnetic field intensity in the source. The relation for the boost change in this expression depends on the nuclear species, given the factor ${(\frac{Z}{A})}^{4}$ $Mathematical equation: $ \left(\frac{Z}{A}\right)^4 $$ . Thus, the losses are affected by the specific sequence of nuclei and the distances traveled by each nucleus.

The rate of boost change for pair production losses (Blumenthal 1970) is

$\begin{matrix} - \frac{1}{γ} \frac{d γ}{dL} = α r_{0} m_{e}^{2} c^{4} \frac{Z^{2}}{γ A} \int_{2}^{\infty} d ξ n (\frac{ξ m_{e} c^{2}}{2 γ}) \frac{ϕ (ξ)}{ξ^{2}} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} -\frac{1}{\gamma }\frac{d\gamma }{dL} = \alpha r_0 m_e^2c^4 \frac{Z^2}{\gamma A} \int _2^\infty d\xi \, n\left(\frac{\xi m_ec^2}{2\gamma }\right) \frac{\phi (\xi )}{\xi ^2} . \end{aligned} $$$ (33)

This expression is also dependent on the nuclear charge and mass numbers. Following the notation $β_{0} / c = \frac{1}{γ^{p}} \frac{d γ^{p}}{dL}$ $Mathematical equation: $ \beta_0 / c = \frac{1}{\gamma^p}\frac{d\gamma^p}{dL} $$ for the loss length of protons (Aloisio et al. 2013a), the losses of nuclei in general can be written as

$\begin{matrix} \frac{1}{γ} \frac{d γ}{dL} = \frac{Z^{2}}{A} \frac{β_{0} (γ)}{c} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{1}{\gamma }\frac{d\gamma }{dL} = \frac{Z^2}{A} \frac{\beta _0(\gamma )}{c} . \end{aligned} $$$ (34)

In the previous section, we discuss how adiabatic losses can be treated as CI, as well as the redshift scaling of the rates produced by the corresponding scaling of photon backgrounds. The complete form of the boost evolution with the redshift dependencies and adiabatic losses has been obtained for the kinetic equation formalism (Aloisio et al. 2013a) via

$\begin{matrix} \frac{1}{γ} \frac{d γ}{dL} - \frac{1}{1 + z} \frac{dz}{dL} = a (z) {(1 + z)}^{3} \frac{Z^{2}}{A} \frac{β_{0} ((1 + z) γ)}{c}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{1}{\gamma }\frac{d\gamma }{dL} -\frac{1}{1+z} \frac{dz}{dL}= a(z)(1+z)^3 \frac{Z^2}{A} \frac{\beta _0((1+z)\gamma )}{c} , \end{aligned} $$$ (35)

where we can consider a(z) = 1 since pair production losses are dominated by the CMB. Rewriting Eq. (35) in terms of the comoving Lorentz boost and the thickness, we obtain the principal relation that governs boost changes for cosmological propagation,

$\begin{matrix} \frac{1}{γ_{c}} \frac{d γ_{c}}{d δ} = \frac{Z^{2}}{A} \frac{β_{0} ({(1 + z)}^{2} γ_{c})}{c} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{1}{\gamma _c}\frac{d\gamma _c}{d\delta } = \frac{Z^2}{A} \frac{\beta _0 ((1+z)^2\gamma _c)}{c} . \end{aligned} $$$ (36)

The CI approach corresponds to the homogeneous form of this differential equation, since in the CI approach γ_c is constant, as obtained for the solution in the absence of CELs (right-hand side null). Thus the effect of pair production losses is a modification of the CI result caused by a drift in the comoving boost.

The evolution of the boost over cosmological thickness can be obtained by integrating numerically Eq. (36), or via a variable separation for sufficiently short propagation distances, such that z is almost constant,

$\begin{matrix} \int_{γ_{c}^{1}}^{γ_{c}^{2}} \frac{c d γ_{c}}{γ_{c} β_{0} ({(1 + z)}^{2} γ_{c})} = Φ (γ_{c}^{2}) - Φ (γ_{c}^{1}) = \frac{Z^{2}}{A} δ, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \int _{\gamma ^1_c}^{\gamma ^2_c} \frac{cd\gamma _c}{\gamma _c\,\beta _0 ((1+z)^2\gamma _c)} = \Phi (\gamma _c^2)-\Phi (\gamma _c^1) = \frac{Z^2}{A} \delta , \end{aligned} $$$ (37)

where it is evident that the change in comoving boost from the initial γ_c¹ to a final γ_c² for any nuclear species is proportional to the thickness. For protons, this change is given by the function Φ(γ_c), which can be precomputed numerically for ranges of the redshift. For nuclei, the thickness required for a comparable change in the comoving boost is smaller by a factor of Z²/A.

The impact of DI is limited to a small region of the boost and redshift phase space relevant to the propagation of UHECRs (see Appendix C). Within this range, they can be accounted for using a quasi-homogeneous approach, in which the thickness is divided into segments small enough that DI are negligible, ensuring the CI description is sufficient. This is possible because, in any cascade, there is a dominant rate (typically for the species with the largest mass) and for sufficiently small values of δ, the constancy of γ_c can be assumed. The cascade can then be described by a set of CI descriptions, each applying within a segment in which the interaction matrix is evaluated at the constant boost of the segment. The boost values are updated after each segment by selecting the most likely value. The total boost change is also stochastic, but its distribution can be determined with a transformation via rewards (Bladt & Nielsen 2017).

4. Astrophysical examples

4.1. Distance horizons and mass evolution

The regularity of the mass evolution with distance, described in Morejon (2021) as a disciplined disintegration (DD), was invoked to explain the gradual decrease of the average mass with propagation distance observed in CoCs computed with PriNCe (Heinze et al. 2019). The DD was derived assuming the regularity condition and only one-nucleon emission per interaction, which lead to the distance until a set nucleon-loss being proportional to the inverse of the mean interaction rate per nucleon and the logarithm of the ratio of initial and final masses (see Eq. 6.11 in Morejon 2021). This is consistent with the expectation value in Table 1 for RSeCs. The validity of DD in CoCs was not quantitatively verified in Morejon (2021), but rather inferred from a small number of simulations. We expand on this idea here by first considering ISeCs, which are serial but do not follow the regularity condition, and then considering the more general CoCs, which include multiple branching channels.

Figure 5 compares the relationship between the initial mass and the expectation of the full disintegration length (L_FD, mean of the distribution) scaled by the mean interaction rate per nucleon λ₁. Different boost values contrast the changes in photodisintegrations as the rates transition from IRB-dominated to CMB-dominated with boost increase. The RSeC reference represents the relation λ₁L_FD = lnA with a solid black line. The ISeCs, based on the total cross-sections in CRPropa 3.2 (Kampert et al. 2013; Alves Batista et al. 2022), are represented by dashed lines, with colors indicating the boost values listed in the legend. Since the ISeCs cover only one species per mass and only one-nucleon loss, the rates employed are an average over nuclei of the same mass, and all other channels present in the cross-section table are ignored. This ensures a consistent comparison with CoCs (dots), which are based on the same cross-section table but include all channels and multiple species per nuclear mass.

Fig. 5.

Expected distance until full disintegration in units of the inverse of the mean interaction rate per nucleon 1/λ₁. The black line corresponds to the canonical cascade (RSeC), the dashed lines represent ISeCs, and the scattered points show values for CoCs, where multiple points appear for each mass corresponding the multiple isobars. The boost is indicated by the color, as listed in the legend.

Overall, ISeCs exhibit a similar behavior to RSeCs, apart from a boost-dependent offset that can be attributed to the variance in the mean interaction rate per nucleon. The proportionality to lnA found in RSeCs is a consequence of the serial character that is also found in ISeCs. However, the irregularities in ISeC rates produce offsets in the mass dependence that can be as large as $3 | 1 - \frac{A_{k} ⟨ λ_{1} ⟩}{λ_{A_{k}}} |$ $Mathematical equation: $ 3\left|1 - \frac{A_k\langle\lambda_1\rangle}{\lambda_{A_k}}\right| $$ (c.f., Appendix B), where A_k is the mass of the species in the cascade for which the interaction rate, λ_{A_k}, deviates the most from the regular rate, ⟨λ₁⟩A_k. These offsets can also vary with starting mass, because additional species included in cascades of heavier masses can slightly contribute to the rate variability. However, the main impact comes from the dependence of the rates on the boost: at the lowest and highest boost values the offsets are comparable, and they increase at intermediate values. This progression is related to the onset of photodisintegrations with the CMB: in the boost region 4 ⋅ 10⁹ − 6 ⋅ 10⁹, the rates integrate the energy-weighted cross-section from ε ≲ 20 MeV to ε ≲ 40 MeV, where the variance among species is the largest (see Fig. 1, CRPropa). As the boost increases and the variance decreases, the offset values become comparable to those around γ = 4 ⋅ 10⁸, where interactions with the IRB dominate. For masses lower than A = 12, the trend is visibly disrupted, possibly due to limitations in cross-section data employed for these nuclei (Kampert et al. 2013).

The additional disintegration channels have a significant effect on the CoCs models, which include all possible nucleon losses in the cross-section table. Multiple dots in each mass correspond to the different isobars; however, their differences become negligible for A ≳ 23 as the number of concurrent cascades increases, thereby smoothing the isobar variance. CoCs exhibit a linear rather than logarithmic mass dependence, which is a clear sign that the multiple concurrent cascades enhance the efficiency of the disintegration, thereby shortening the length scales (see Appendix B). Nevertheless, the proportionality of L_FD to the mass is why the DD effect holds in CoCs, as evidenced by PriNCe simulations (Morejon 2021) at γ = 2 ⋅ 10¹⁰ for nuclei up to lead (A = 208). However, the explanation proposed by Morejon (2021) is incomplete and applies only to serial cascades; it fails to reproduce the linear behavior demonstrated here.

The significant changes in length scales associated with boosts are a valuable feature that could be leveraged in future studies using the precise description proposed here and assuming the required accuracy in the cross-section data. Focusing on UHECRs in the boosts where CMB interactions begin to dominate, comparisons of events of adjacent boosts could allow probing different origins. Specifically, in the boost region 3 ⋅ 10⁹ − 1 ⋅ 10¹⁰, the horizons end up shortened considerably (see Fig. 4), while the full disintegration length scale can vary drastically between adjacent boosts. For example, comparing 3 ⋅ 10⁹ to 5 ⋅ 10⁹ (∼66% change) implies a reduction by more than half in L_FD, for both light (¹⁴N) and heavy nuclei (⁵⁶Fe). Additionally, dispersive inhomogeneities have a larger influence in this range (see Appendix C), which could enhance the differences between adjacent boosts. Extending the comparisons to slightly lower values, where IRB interactions still dominate, could allow testing the emitted spectrum in the paradigm of identical sources, as the expected changes in composition can now be computed with remarkable accuracy, including the stochastic effects and the probability distributions for individual events. In this paradigm, changes in composition for different energies would encode the relative contribution from different distances, since the observed composition can be efficiently computed with arbitrary precision. This allows us to employ the approach in minimization algorithms.

The verified DD effect implies that the cosmic ray horizon can be precisely defined as a quantity that naturally results from the photodisintegration cross-sections, the opacity of the target photon field, and the stochastic nature of cosmic ray propagation, rather than as an effective quantity dependent on source properties, such as the emission spectrum or the cosmic density. This quantity should be a function of the initial species, boost and redshift, such as the full disintegration horizon $L_{FD}^{ℓ}$ $Mathematical equation: $ L^{\ell}_\mathrm{{FD}} $$ obtained from the distributions of distance until full disintegration, as discussed in Sect. 3.1. Such a limit is meaningful even when considering magnetic deflections. The heaviest species in a composition has the largest horizon due to the DD effect, and their rigidity tends to be the largest. Indeed, R = E/Z = γ/κ and the charge-to-mass ratio κ = Z/A (typically within 0.3–0.6 for all nuclei and within 0.4-0.5 for stable nuclei) tends to be lower for heavier nuclei. Thus, the products of the heaviest nuclei emitted would propagate farther and experience the least magnetic deflections (see Sect. 4.4). The full disintegration limit constrains the propagation length, which is equivalent to the distance reached under ballistic propagation. However, under diffusive propagation, the distance reached by nuclei would be shorter because the lengths of diffusive paths tend to be larger than the radial distances reached. The effect of diffusive motion in sources is illustrated in Sect. 4.3.

4.2. Reverse propagation

Under certain conditions, the direct Markov jump process that describes nuclear cascades can be reversed. This is particularly relevant to the problem of inferring the composition of cosmic rays at their source, given the composition measured on Earth.

The simplest case of the reverse-propagation process is the quasi-stationary regime. In Markov jump processes, the stationary distribution, ϕ^s, is determined by the condition ϕ^sΛ(γ) = 0, meaning that the composition remains unchanged as time evolves. However, in the cascades discussed here, all nuclear states are transient; therefore, no such stationary distribution exists. Nevertheless, a quasi-stationary state can be reached with a corresponding distribution ${\tilde{ϕ}}^{s}$ $Mathematical equation: $ \boldsymbol{\tilde \phi}^s $$ , defined by the relation ${\tilde{ϕ}}^{s} Λ (γ) = - {\tilde{λ}}^{s} {\tilde{ϕ}}^{s}$ $Mathematical equation: $ \boldsymbol{\tilde\phi}^s\boldsymbol{\Lambda}(\gamma) = -\tilde\lambda^s \boldsymbol{\tilde\phi}^s $$ , which implies that reverse-propagation preserves the Markov property. The corresponding reverse interaction matrix can be easily constructed via

$\begin{matrix} {\tilde{Λ}}_{r} (γ) = diag {({\tilde{ϕ}}^{s})}^{- 1} Λ {(γ)}^{T} diag ({\tilde{ϕ}}^{s}) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{\tilde{\Lambda }}_r(\gamma ) = \mathrm{diag} (\boldsymbol{\tilde{\phi }}^s)^{-1} \boldsymbol{\Lambda }(\gamma )^T \mathrm{diag} (\boldsymbol{\tilde{\phi }}^s) . \end{aligned} $$$ (38)

By construction, ${\tilde{ϕ}}^{s}$ $Mathematical equation: $ \boldsymbol{\tilde \phi}^s $$ is the same for both the forward and reverse processes. The reverse process can then be computed with ${\tilde{Λ}}_{r} (γ)$ $Mathematical equation: $ \boldsymbol{\tilde \Lambda}_r(\gamma) $$ integrating Kolmogorov’s differential equation or by building the probability distributions of distance until absorption as above. Here, absorption corresponds to probing the original species or composition assumption.

Figure 6 illustrates the likelihood of observing a cosmic ray nucleus of a given energy from different distances under different assumptions about the original species. These likelihoods were computed using Kolmogorov’s differential equation to evolve the probability vector. The likelihood for each species is the point probability for that species as a function of distance, normalized to a common value for comparison with the other species. However, the relative likelihoods can also be inferred by a different approach. As expected, the heavier the assumed original species, the greater the maximum likelihood distance. This approach can be used to estimate the origin of individual events with extreme energies (Morejon 2025), such as the recent Amaterasu particle detected by the Telescope Array (TA Collaboration 2023).

Fig. 6.

Likelihood of the distance of origin of an observed 70 EeV ¹²C (left) and ¹⁶O nucleus assuming different initial nuclei. Even slight variations in the mass of the observed nuclei can lead to significant differences in their most likely distance of origin.

Assuming a quasi-stationary distribution is a very specific condition that may not be met in reality. Verifying this assumption for the observed UHECR spectrum would require a level of precision in energy and composition that is currently out of experimental reach. A more general approach is to numerically solve Kolmogorov’s differential equation for the inverse process.

4.3. UHECR sources

There are two ways to apply the approach of solving Kolmogorov’s differential equation to model UHECR sources. The simplest method is to compute the distributions until absorption to e.g. determine the probability of escape of a given species (Morejon 2023; Morejon & Rautenberg 2025). For example, the probability vector ϕ_esc with the composition at the time of escape can be obtained applying Eq. (5) on an assumed injected composition, ϕ_inj,

$\begin{matrix} ϕ_{esc} (γ) = ϕ_{inj} P^{t_{s}} (γ) \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{\phi }_{\rm {esc}}(\gamma ) = \boldsymbol{\phi }_{\rm {inj}} \boldsymbol{P}^{t_s}(\gamma ) \end{aligned} $$$ (39)

with t_s ≈ L_s/c as the characteristic crossing time of the source. Here, the rates contained in Λ (and therefore G(γ), used to find P(γ)) are computed with the source target photons (e.g. a broken power law). The effects of CI and DI can be taken into account (as discussed in Sect. 3) to define the boost evolution and the corresponding evolution of G(γ(t)). Furthermore, additional effects can be taken into consideration, such as the impact of different assumptions for the escape. For instance, if the distribution of trajectory lengths until escape, F_esc(γ, L), is known (a cumulative density as a function of trajectory lengths and the boost), the escape probability vector as a function of the boost would be

$\begin{matrix} ϕ_{esc} (γ) = ϕ_{inj} \int_{0}^{L} P {(γ)}^{L^{'} / c} (1 - F_{esc} (γ, L^{'})) d L^{'} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{\phi }_{\rm {esc}}(\gamma ) = \boldsymbol{\phi }_{\rm {inj}} \int _0^L\boldsymbol{P}(\gamma )^{ L^{\prime }/c} \left(1 - F_{\rm {esc}}(\gamma , L^{\prime })\right) dL^{\prime } . \end{aligned} $$$ (40)

This expression assumes that changes in rigidity during successive disintegrations can be ignored, but this needs to be assessed for each specific scenario. When this is not the case, a more nuanced approach is also available (as illustrated in Sect. 4.4 for propagation).

The second method, of special interest, is simulating the time evolution of the composition in sources with a time-dependent cosmic ray injection. This type of modeling has been achieved with full nuclear cascades (e.g., NEUCOSMA Biehl et al. 2018; Rodrigues et al. 2018) by numerically integrating Eq. (4), yielding time-dependent spectral densities for each nuclear species. This task can also be accomplished using our stochastic approach with regularization; namely, on the condition that all jumps occur at regular intervals of the elapsed time or distance, which can be arbitrarily small. Such assumption is valid given the large luminosities, which justify a continuous limit approach. This allows us to treat the occupation probabilities as volumes in a system of equations such as Eq. (4) describing a fluid, where the changes in the occupation probability represent the amounts transferred between species as a function of time. In these cases, ${\tilde{Q}}^{ext} (γ, t)$ $Mathematical equation: $ \boldsymbol{\tilde Q}^\mathrm{{ext}}(\gamma, t) $$ represents the injection vector, which may be a function of time, and is typically a power-law of the energy or boost. The time evolution of the probability vector at a later time, t′, is given as above ${\tilde{Q}}^{ext} (γ, t) P^{t' - t} (γ)$ $Mathematical equation: $ \boldsymbol{\tilde Q}^\mathrm{{ext}}(\gamma, t)\boldsymbol{P}^{t{\prime}-t}(\gamma) $$ , and the total yield can be computed by integrating over a certain injection time, t_inj, or by taking a convolution product,

$\begin{matrix} N (γ, t_{inj}) = \int_{0}^{t_{inj}} {\tilde{Q}}^{ext} (γ, t') P^{t_{inj} - t'} (γ) d t^{'}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{N}(\gamma , t_{\rm {inj}}) = \int _0^{t_{\rm {inj}}} \boldsymbol{\tilde{Q}}^\mathrm{{ext}}(\gamma , t\prime )\boldsymbol{P}^{t_{\rm inj} - t\prime }(\gamma ) dt^{\prime } , \end{aligned} $$$ (41)

where N(γ, t_inj) is a vector with the final yields for each species in the cascade as a function of the boost. In simple cases where the DIs are negligible, we have P^t = e^Gt, as discussed previously. However, the general form, including the DIs, requires computing P^t numerically or following a similar approach, as described in Sect. 3.2. This expression allows for arbitrary choices in the temporal evolution of the injection.

Figure 7 illustrates the densities for different mass groups obtained by modeling a GRB example in the optically thick scenario (Biehl et al. 2018). More details are given in Appendix D. In this case, the injection rate vector ${\tilde{Q}}^{ext} (γ, t)$ $Mathematical equation: $ \boldsymbol{\tilde Q}^\mathrm{{ext}}(\gamma, t) $$ consists of only one species, ⁵⁶Fe, having a power-law dependence on the boost with a cutoff, while its norm C′ is determined by energy arguments. The effect of the temporal behavior is illustrated in Fig. 7 (left), using a constant injection of cosmic rays as the baseline (solid lines), a quadratically increasing injection as the lower limit (dotted bound of the shaded regions), and a linearly decreasing injection (dash-dotted bound of the shaded regions). All parameters were fixed by requiring the same total injection over the fixed injection time t_inj. This example assumes that nuclei escape after propagating a characteristic distance (or time scale), corresponding to an advective escape.

Fig. 7.

UHECR spectral densities of a GRB source in the optically thick scenario. Shaded regions and line styles indicate the effect of different model assumptions. Left: Influence of a time-varying injection with a fixed total injection. The case of a constant injection (solid lines) is contrasted with a quadratically increasing injection (dotted bound) and a linearly decreasing injection (dash-dotted bound). Right: Influence of rigidity-dependent escape assumptions. The solid lines represent advective escape, as in the left figure. The dash-dotted lines show the Bohmian diffusion, and the dotted lines show the effect of diffusion under a Kolmogorov-distributed turbulent magnetic field. Additional details are given in the text and in Appendix D.

In addition, as discussed above, other assumptions for the escape can be included. Figure 7 (right) presents the effect of different escape assumptions computed according to

$\begin{matrix} N (γ, t_{inj}) = {\tilde{Q}}^{ext} (γ) \int_{0}^{t_{inj}} P^{t_{inj} - t'} (γ) ° (1 - F_{esc} (R, t^{'})) d t^{'}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \boldsymbol{N}(\gamma , t_{\rm {inj}}) = \boldsymbol{\tilde{Q}}^\mathrm{{ext}}(\gamma ) \int _0^{t_{\rm {inj}}} \boldsymbol{P}^{t_{\rm {inj}} - t\prime }(\gamma ) \circ \left( 1 - \boldsymbol{F}_{\rm {esc}}(R, t^{\prime }) \right)dt^{\prime } , \end{aligned} $$$ (42)

where the injection rate used corresponds to the constant injection of iron, as shown in the left plot. The matrix F_esc(R, t′) describes the probability distribution of escape as a function of rigidity R = E/Z = γ/κ, which varies with the nuclear species as κ = Z/A. The operation ° denotes the element-wise product of the two matrices, each evaluated at the time since injection t′. The solid lines represent the advective escape, as shown in Fig. 7 (right). Two other line styles represent alternative assumptions of rigidity-dependent escape: a Bohmian diffusion case and diffusive escape under a Kolmogorov-distributed turbulent magnetic field. In both cases, escape is exponentially distributed according to F_esc = 1 − exp(−t/t_diff) with dependencies t_diff = 3 ⋅ 10⁶/R for the Bohmian case (where the diffusion coefficient is proportional to rigidity) and t_diff = 2 ⋅ 10²/R^1/3 for the Kolmogorov case (where the diffusion coefficient is proportional to the cube root of rigidity).

One advantage of this approach is that it allows for consistent handling of interactions within the source and during propagation within a single model. This differs from current approaches, which simulate each environment separately, and allows us to fit the spectrum and composition of cosmic rays by including the source parameters in the minimization, such as the source’s optical thickness or the injected composition. For example, this approach enables us to connect UHECR emissions to nuclear cascade models, which describe optical observations of kilonovae resulting from neutron star mergers and associated GRBs.

4.4. Magnetic deflections and distribution of arrival direction

The probabilistic disintegration of nuclei during propagation also affects the arrival directions of cosmic rays. Previous studies (e.g. Lee et al. 1995; Waxman & Miralda-Escudé 1996) have discussed the angular deviations that occur during UHECR propagation under the influence of extragalactic magnetic fields (EGMFs) in the small-angle scattering regime (i.e., when the gyroradius exceeds the EGMF coherence length λ_B). However, these and similar studies neglect energy losses and disintegrations by assuming a constant rigidity R, which leads to the mean squared angular deviation formula ${⟨ Δ θ ⟩}^{2} \approx \frac{4}{π^{2}} \frac{B^{2}}{R^{2}} λ_{B} d$ $Mathematical equation: $ \langle\Delta \theta \rangle^2 \approx \frac{4}{\pi^2}\frac{B^2}{R^2}\lambda_Bd $$ (Lee et al. 1995), where d denotes the distance to the source and B the strength of the magnetic field.

This expression is sometimes used even in the presence of disintegrations, arguing that changes in nuclear species do not affect the rigidity, since R = E/Z = γ/κ, γ is conserved and κ = Z/A can be considered 0.5 for most stable species up to iron. However, the actual variation of κ in nuclear cascades can be up to 30% of the mean value, and the changes are stochastic as the cascade itself. Currently, there is no expression for the angular deviation that incorporates disintegrations. The most realistic treatment involves Monte Carlo simulations with a code that includes magnetic effects and disintegrations, such as CRPropa (Alves Batista et al. 2022).

Using the stochastic description in this work, we can derive an analytic expression for the expected angular spread of individual products. The mean squared angular deviation for each species i in the cascade is given by $θ_{ms, i} = {⟨ Δ θ_{i} ⟩}^{2} \approx \frac{4}{π^{2}} \frac{B^{2} κ_{i}^{2}}{γ^{2}} λ_{B} d_{i}$ $Mathematical equation: $ \theta_\mathrm{{ms, i}}=\langle\Delta \theta_i \rangle^2 \approx \frac{4}{\pi^2}\frac{B^2\kappa_i^2}{\gamma^2}\lambda_Bd_i $$ , where d_i is the rectilinear distance that species i travels from creation until interacting. The distribution of the total mean squared angular deviation is the sum of all the stochastic deviations experienced by each species θ_ms = ∑_iθ_ms, i (i.e., it is a stochastic variable itself). This also implies that θ_ms is phase-type distributed and can be obtained by applying a “transformation via rewards” to the distribution of the distance traveled until absorption. In this transformation, the “reward” is the deviation per unit distance for each species, which is given by $\frac{4}{π^{2}} \frac{B^{2} κ_{i}^{2}}{γ^{2}} λ_{B}$ $Mathematical equation: $ \frac{4}{\pi^2}\frac{B^2\kappa_i^2}{\gamma^2}\lambda_B $$ . It should be noted the linear relation between the propagation distance and θ_ms, i, which is a requirement of this type of transformations. Examples of the obtained distributions for θ_ms are shown in Fig. 8 (top) for the total mean squared angular deviation accumulated until reaching ¹⁰B, ¹²C and ¹⁴N in the propagation of a 300 EeV ⁵⁶Fe nucleus over a propagation distance of 50 Mpc. The products were chosen as species with similar masses and charges are produced at comparable distance scales, although their production in this example ranges from 10–100 Mpc, hence the choice of 50 Mpc for ⁵⁶Fe.

Fig. 8.

Impact of disintegration on the dispersion angle of different arriving nuclei resulting from the disintegration of 300 EeV ⁵⁶Fe nuclei over a distance of 50 Mpc. Top: Distributions of mean squared deflection angles for different secondaries with similar mass and charge to that of ¹²C. Bottom: The angular distributions of the same products in galactic coordinates are compared to the distribution for the parent ⁵⁶Fe disregarding disintegration. The star represents the position of the source in the sky and the circular lines denote the 95% confidence limit.

Without disintegration (i.e., constant rigidity), the distribution of the angular deviation θ from the source direction P(θ | θ_ms) is a Rayleigh distribution with parameter θ_ms = ⟨Δθ⟩². The inclusion of disintegration is reflected in the probability distribution of mean squared deviations f(θ_ms), and the sought distribution of the angular deviation is a mixture of Rayleigh distributions P(θ) = ∫P(θ | θ_ms)f(θ_ms)dθ_ms, using the density f(θ_ms) obtained with the transformation via rewards,

$\begin{matrix} P (θ) = \int_{0}^{\infty} 2 \frac{θ}{θ_{ms}} e^{- \frac{θ^{2}}{θ_{ms}}} f (θ_{ms}) d θ_{ms} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} P(\theta ) = \int _0^{\infty } 2\frac{\theta }{\theta _{\rm {ms}}} e^{ -\frac{\theta ^2}{\theta _{\rm {ms}}}} f(\theta _{\rm {ms}}) d\theta _{\rm {ms}} . \end{aligned} $$$ (43)

The bottom plot in Fig. 8 shows the distributions of angular deviation from the source’s direction for each of the aforementioned products. For comparison, the distribution of the initial iron after 50 Mpc neglecting disintegrations (as is typically assumed) as the initial iron nuclei do not survive beyond a few megaparsecs. The differences in angular distributions are quantified by the 95% containment angle, θ₉₅, represented with the corresponding lines matching the top plot. Differences between secondaries reflect mainly the differences in the range of production distances, as lighter products tend to have larger means of the distance distributions. The angular deviation for ¹⁴N is slightly reduced as it is produced at a mean distance of 42 Mpc, whereas for ¹²C the value is 46 Mpc and in the case of ¹⁰B is much larger at 68 Mpc and spans beyond 100 Mpc. An important consequence of this result is that precisely identifying the observed species could drastically alter their association with existing astrophysical objects as possible origins.

It should be noted that studying these types of distributions using Monte Carlo methods is extremely computationally expensive. This is because the phase space of distances, starting nuclei, and final products would require multiplying the already large number of candidates to be simulated by a factor of about 1000 to obtain an adequate description of these cascades (see Appendix E).

5. Conclusions

Until now, the stochasticity of UHECR interactions has been addressed using Monte Carlo approaches, which are limited by the availability of appropriate computational resources. This work demonstrates that interactions of UHECRs with photon fields in astrophysical scenarios can be described analytically with arbitrary precision. This description has additional advantages, including the ability to obtain closed-form probability distributions, such as the distance until loss of a number of nucleons and the deflections of UHECRs in the EGMF including interactions and secondary nuclei.

The stochastic approach presented here provides physical insights; for example, the regularity of the distributions under variations of the boost, or the mass dependence of the full disintegration distances, which can span from logarithmic to linear depending on the number of available disintegration channels. These insights are possible because of the closed form of the moments of the distributions and the distributions themselves. Furthermore, we have established a clear distinction of the types of cascades produced by the different types of nuclear cross-section models used in UHECR astrophysics. The classification of the cascades types relates to the number of disintegration channels and determines the type of distributions applicable and the impact on certain relations, such as the disintegration evolution with distance. Thus, this approach can enhance our understanding of the connections between the cross-section data and the propagation of UHECRs.

The detailed knowledge of the disintegration cascades strengthens the methods used to connect observations to the possible origins. We have implemented precise expressions for horizons, which can help establish distances of likely origin using priors on the original composition. The distributions of angular deviation derived here can discriminate between nuclei emitted by the source and their disintegration products. In addition, we show how the precise description of the cascade allows us to construct backpropagation processes to explore the connections of observed nuclei to their possible parent species.

Our approach is not limited to studying probabilities for compound quantities; it can also be applied to interactions in scenarios where stochasticity is less important due to the large number of expected events. For example, we computed the cosmic ray production for a GRB source scenario and demonstrated how source properties such as the magnetic field and time-dependent injection affect the variability of the produced spectrum. Furthermore, this approach could enable a combined source-propagation framework that circumvents “coupling defects” between the two scenarios, given that propagated compositions are currently often reduced to a few mass groups. At the same time, a common treatment of both the in-source and propagation processes allows for consistency in the cross-section and decay tables without the need for the simplifications that are often employed in existing frameworks.

The expressions presented here (along with additional utility functions) have been made available through an open-source Python package called Cosmic Ray Stochastic Interactions for Propagation (CRISP). The package is meant to provide a common implementation for the community to facilitate the computation of quantities of interest in UHECR astrophysics and for the reproducibility of the results shown in this work. Future research will be focused on studying the observed UHECR spectrum and composition with fewer assumptions (e.g., excluding the spectral index or source evolution) and exploring the sensitivity of the fitted parameters.

Acknowledgments

This work has received funding via the grant MultI-messenger probe of Cosmic Ray Origins (MICRO) from the DFG through project number 445990517. Further support was provided by Institut Pascal at Université Paris-Saclay within the program “Investissements d’avenir” ANR-11-IDEX-0003-01, the P2I axis of the Graduate School of Physics of Université Paris-Saclay, as well as IJCLab, CEA, IAS, OSUPS, and APPEC. This work employed the software packages Astropy (Astropy Collaboration 2013, 2018, 2022), Matplotlib (Hunter 2007), Numpy (Harris et al. 2020), Scipy (Virtanen et al. 2020).

References

Aab, A., Abreu, P., Aglietta, M., et al. 2015, ApJ, 802, 111 [Google Scholar]
Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2023, JCAP, 2023, 024 [CrossRef] [Google Scholar]
Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2024a, ApJ, 976, 48 [Google Scholar]
Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2024b, JCAP, 2024, 022 [CrossRef] [Google Scholar]
Ahlers, M., & Taylor, A. M. 2010, Phys. Rev. D, 82, 123005 [Google Scholar]
Ahlers, M., Anchordoqui, L. A., & Taylor, A. M. 2013, PRD, 87, 023004 [Google Scholar]
Albrecher, H., & Bladt, M. 2019, J. Appl. Probab., 56, 1044 [Google Scholar]
Aloisio, R., Berezinsky, V., & Grigorieva, S. 2013a, Astropart. Phys., 41, 73 [Google Scholar]
Aloisio, R., Berezinsky, V., & Grigorieva, S. 2013b, Astropart. Phys., 41, 94 [Google Scholar]
Aloisio, R., Boncioli, D., di Matteo, A., et al. 2017, JCAP, 2017, 009 [CrossRef] [Google Scholar]
Alves Batista, R., Dundovic, A., Erdmann, M., et al. 2016, JCAP, 2016, 038 [Google Scholar]
Alves Batista, R., Becker Tjus, J., Dörner, J., et al. 2022, JCAP, 2022, 035 [CrossRef] [Google Scholar]
Arns, M., Buchholz, P., & Panchenko, A. 2010, INFORMS J. Comput., 22, 416 [Google Scholar]
Asmussen, S., & Glynn, P. W. 2007, in Stochastic Simulation: Algorithms and Analysis, (New York, NY: Springer), Stochastic Model. Appl. Probab., 57 [Google Scholar]
Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Astropy Collaboration (Price-Whelan, A. M., et al.) 2018, AJ, 156, 123 [Google Scholar]
Astropy Collaboration (Price-Whelan, A. M., et al.) 2022, ApJ, 935, 167 [NASA ADS] [CrossRef] [Google Scholar]
Bennett, C. L., Larson, D., Weiland, J. L., et al. 2013, ApJS, 208, 20 [Google Scholar]
Berezinskii, V. S., & Grigor’eva, S. I. 1988, A&A, 199, 1 [Google Scholar]
Berezinsky, V. S., Bulanov, S. V., Dogiel, V. A., & Ptuskin, V. S. 1990, in Astrophysics of Cosmic Rays, ed. V. L. Ginzburg [Google Scholar]
Biehl, D., Boncioli, D., Fedynitch, A., & Winter, W. 2018, A&A, 611, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bladt, M., & Nielsen, B. F. 2017, Matrix-Exponential Distributions in Applied Probability (Springer), 81 [Google Scholar]
Blumenthal, G. R. 1970, PRD, 1, 1596 [Google Scholar]
Boncioli, D., Fedynitch, A., & Winter, W. 2017, Sci. Rep., 7, 4882 [Google Scholar]
Daw, A., & Pender, J. 2023, Adv. Appl. Probab., 55, 126 [Google Scholar]
Epele, L. N., & Roulet, E. 1998, J. High Energy Phys., 1998, 009 [Google Scholar]
Gilmore, R. C., Somerville, R. S., Primack, J. R., & Domínguez, A. 2012, MNRAS, 422, 3189 [NASA ADS] [CrossRef] [Google Scholar]
Globus, N., Allard, D., Mochkovitch, R., & Parizot, E. 2015, MNRAS, 451, 751 [NASA ADS] [CrossRef] [Google Scholar]
Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357 [NASA ADS] [CrossRef] [Google Scholar]
He, Q.-M. 2021, J. Appl. Prob, 58, 880 [Google Scholar]
He, Q.-M., Horváth, G., Horváth, I., & Telek, M. 2019, Adv. Appl. Probab., 51, 168 [Google Scholar]
Heinze, J., Fedynitch, A., Boncioli, D., & Winter, W. 2019, ApJ, 873, 88 [Google Scholar]
Hill, C. T., & Schramm, D. N. 1985, PRD, 31, 564 [Google Scholar]
Hoerbe, M. R., Morris, P. J., Cotter, G., & Becker-Tjus, J. 2020, MNRAS, 496, 2885 [NASA ADS] [CrossRef] [Google Scholar]
Hooper, D., Sarkar, S., & Taylor, A. M. 2008, Phys. Rev. D, 77, 103007 [Google Scholar]
Hümmer, S., Rüger, M., Spanier, F., & Winter, W. 2010, ApJ, 721, 630 [CrossRef] [Google Scholar]
Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90 [NASA ADS] [CrossRef] [Google Scholar]
Kampert, K.-H., Kulbartz, J., Maccione, L., et al. 2013, Astropart. Phys., 42, 41 [Google Scholar]
Kawano, T., Cho, Y., Dimitriou, P., et al. 2020, Nucl. Data Sheets, 163, 109 [CrossRef] [Google Scholar]
Korochkin, A., Semikoz, D., & Tinyakov, P. 2025, A&A, 693, A284 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Lee, S., Olinto, A. V., & Sigl, G. 1995, ApJ, 455, L21 [Google Scholar]
Lia, V. D., & Tamborra, I. 2024, JCAP, 2024, 054 [Google Scholar]
Morejon, L. 2021, Ph.D. Thesis, Humboldt U., Berlin, Humboldt U., Berlin [Google Scholar]
Morejon, L. 2023, PoS, ICRC2023, 284 [Google Scholar]
Morejon, L. 2025, Cent. Eur. Astrophys. Bull., 47 [Google Scholar]
Morejon, L., & Rautenberg, J. 2025, in Proceedings of 7th International Symposium on Ultra High Energy Cosmic Rays - PoS(UHECR2024), 484, 110 [Google Scholar]
Morejon, L., Fedynitch, A., Boncioli, D., Biehl, D., & Winter, W. 2019, JCAP, 2019, 007 [Google Scholar]
Puget, J. L., Stecker, F. W., & Bredekamp, J. H. 1976, ApJ, 205, 638 [Google Scholar]
Rodrigues, X., Fedynitch, A., Gao, S., Boncioli, D., & Winter, W. 2018, ApJ, 854, 54 [NASA ADS] [CrossRef] [Google Scholar]
Schmuland, B. 2003, Am. Math. Mon., 110, 407 [Google Scholar]
TA Collaboration 2023, Science, 382, 903 [NASA ADS] [CrossRef] [Google Scholar]
Unger, M., & Farrar, G. R. 2024, ApJ, 970, 95 [NASA ADS] [CrossRef] [Google Scholar]
Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nat. Methods, 17, 261 [Google Scholar]
Waxman, E., & Miralda-Escudé, J. 1996, ApJ, 472, L89 [Google Scholar]
Yoshida, S., & Teshima, M. 1993, Prog. Theor. Phys., 89, 833 [Google Scholar]
Zhang, B., Comput, V. R. J., & Graphical, 2021, J. Comput. Graphical Stat., 30, 25 [Google Scholar]

Hadronic interactions are less important for UHECRs but may also occur. A subsequent publication will discuss the stochastic analytic method presented here for hadronic interactions.

This possibility also implies that the interaction matrix may be defective for a number of boosts, since different nuclei may have the same rate for some boost values. To avoid numerical problems, the corresponding boost values can be identified and excluded from the computation. The probability distributions for these values can then be determined by interpolating between valid adjacent boosts.

See footnote 2.

⁴

The computation of the moments is more efficient when the matrix Λ is upper triangular, since fast computation methods are available (e.g. the Matrioshka matrices (Daw & Pender 2023) apply).

⁵

For example, for a cosmic ray starting 300 Mpc from us (z ≈ 0.073) its Lorentz boost changes by only 7.3% as it propagates to Earth, while the initial interaction rate is (1 + z)³ ≈ 1.24 which is 24% greater than at present and δ ≈ 334 Mpc, about 11% greater than the propagation distance.

Appendix A: Derivation of the canonical form

The canonical form corresponding to an RSeC describes the probability that a nucleus of mass number A will interact k times over a trajectory length L. In these cascades, each interaction results in the loss of one nucleon, reducing the mass number by one. Thus, after the final interaction, the remaining nucleus has A − k nucleons. Additionally, the interaction rates for all nuclei satisfy the regularity condition (i.e., the relations $λ_{A_{i}} = \frac{A_{j}}{A_{i}} λ_{A_{j}} = A_{j} \cdot λ_{1}$ $Mathematical equation: $ \lambda_{A_i} = \frac{A_j}{A_i} \lambda_{A_j}=A_j \cdot \lambda_1 $$ hold) with λ₁ the interaction rate per nucleon.

The probability density that nucleus A will interact within the differential length dx at position x from a starting point follows an exponential distribution,

$\begin{matrix} f_{A \to A - 1} (x) = λ_{A} e^{- λ_{A} x}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f_{A \rightarrow A-1} (x) = \lambda _A e^{-\lambda _A x}, \end{aligned} $$$ (A.1)

with λ_A the interaction rate per unit length. The probability P_{A → A − 1}(x ≤ L) that nucleus A will interact within a trajectory length smaller than or equal to L is given by the corresponding distribution function,

$\begin{matrix} F_{A \to A - 1} (L) = 1 - e^{- λ_{A} L} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-1} (L) = 1 - e^{-\lambda _A L} . \end{aligned} $$$ (A.2)

To describe the probability that the sequence of nuclei {A, A − 1, ..., A − k + 1} occurs within the path length L, we must integrate over all possible intermediate trajectory lengths corresponding to the distances covered by each nucleus until interacting {x_A, x_A − 1, ..., x_{A − k + 1}}, such that L = x_A + x_A − 1 + ... + x_{A − k + 1}. In essence, this involves finding the distribution describing L as the sum of k exponentially distributed functions with rate parameters {λ_A, λ_A − 1, ..., λ_{A − k + 1}}, given by the integral,

$\begin{matrix} \begin{matrix} F_{A \to A - k} (L) = \int_{0}^{L} d x_{A} f_{A \to A - 1} (x_{A}) \int_{0}^{L - x_{A}} d x_{A - 1} f_{A - 1 \to A - 2} (x_{A - 1}) \int_{0}^{L - x_{A} - x_{A - 1}} d x_{A - 2} \int . . . \\ \int_{0}^{L - \sum_{l = 0}^{k - 1} x_{A - l}} d x_{A - k + 1} f_{A - k + 1 \to A - k} (x_{A - k + 1}) . \end{matrix} \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \begin{aligned} F_{A \rightarrow A-k}(L) = \int _0^{L} dx_A f_{A \rightarrow A-1} (x_A) \int _0^{L-x_A} dx_{A-1} f_{A-1 \rightarrow A-2} (x_{A-1}) \int _0^{L-x_A-x_{A-1}} dx_{A-2} \int ... \\ \int _0^{L - \sum _{l = 0}^{k-1}x_{A-l}} dx_{A-k+1} f_{A-k+1 \rightarrow A - k} (x_{A-k+1}) . \end{aligned} \end{aligned} $$$ (A.3)

The expressions for the first few values of k yield

$\begin{matrix} F_{A \to A - 1} (L) = & 1 - e^{- λ_{A} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-1} (L) =&1 - e^{-\lambda _A L},\end{aligned} $$$ (A.4)

$\begin{matrix} F_{A \to A - 2} (L) = & 1 + (A - 1) e^{- λ_{A} L} - A e^{- λ_{A - 1} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-2} (L) =&1 + (A-1)e^{-\lambda _A L} - Ae^{-\lambda _{A-1} L},\end{aligned} $$$ (A.5)

$\begin{matrix} F_{A \to A - 3} (L) = & 1 - \frac{(A - 1) (A - 2)}{2} e^{- λ_{A} L} + A (A - 2) e^{- λ_{A - 1} L} - \frac{(A - 1) (A - 2)}{2} e^{- λ_{A - 2} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-3} (L) =&1 - \frac{(A-1)(A-2)}{2}e^{-\lambda _A L} + A(A-2)e^{-\lambda _{A-1} L} - \frac{(A-1)(A-2)}{2}e^{-\lambda _{A-2}L},\end{aligned} $$$ (A.6)

$\begin{matrix} F_{A \to A - 4} (L) = & 1 + \frac{(A - 1) (A - 2) (A - 3)}{6} e^{- λ_{A} L} - \frac{A (A - 2) (A - 3)}{2} e^{- λ_{A - 1} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-4} (L) =&1 + \frac{(A-1)(A-2)(A-3)}{6}e^{-\lambda _A L} - \frac{A(A-2)(A-3)}{2}e^{-\lambda _{A-1} L}, \end{aligned} $$$ (A.7)

$\begin{matrix} - \frac{A (A - 1) (A - 3)}{2} e^{- λ_{A - 2} L} + \frac{A (A - 1) (A - 2)}{6} e^{- λ_{A - 3} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned}&-\frac{A(A-1)(A-3)}{2}e^{-\lambda _{A-2}L} +\frac{A(A-1)(A-2)}{6}e^{-\lambda _{A-3}L},\end{aligned} $$$ (A.8)

$\begin{matrix} F_{A \to A - 5} (L) = & . . . \\ (A . 8) \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-5} (L) =&... \\ & (\rm A.10)\end{aligned} $$$ (A.9)

This can be generalized, by induction, in the form

$\begin{matrix} F_{A \to A - k} (L) = 1 + k (\binom{A}{k}) \sum_{l = 0}^{k - 1} {(- 1)}^{k - l} (\binom{k - 1}{l}) \frac{e^{- λ_{A - l} L}}{A - l} . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-k} (L) = 1 + k {A \atopwithdelims ()k} \sum _{l = 0}^{k-1} (-1)^{k-l} {k-1 \atopwithdelims ()l} \frac{e^{-\lambda _{A-l}L}}{A-l} . \end{aligned} $$$ (A.11)

The density function can be found by deriving with respect to L,

$\begin{matrix} f_{A \to A - k} (L) = \frac{d}{dL} F_{A \to A - k} (L) = - k λ_{1} (\binom{A}{k}) e^{- λ_{A} L} \sum_{l = 0}^{k - 1} {(- 1)}^{k - l} (\binom{k - 1}{l}) e^{λ_{l} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f_{A \rightarrow A-k} (L) = \frac{d}{dL}F_{A \rightarrow A-k} (L) = -k \lambda _1 {A \atopwithdelims ()k} e^{-\lambda _AL}\sum _{l = 0}^{k-1} (-1)^{k-l} {k-1 \atopwithdelims ()l} e^{\lambda _l L}, \end{aligned} $$$ (A.12)

where the binomial theorem has been employed. Rearranging some terms and employing some known identities leads to Eq. 7. Noting that $k (\binom{A}{k}) = (A - k + 1) (\binom{A}{k - 1})$ $Mathematical equation: $ k {A \choose k} = (A - k + 1) {A \choose k-1} $$ and rewriting e^−λ_AL as e^{−λ_{A − k + 1}L}e^{−λ_k − 1L} brings the second factor into the sum. The expression then becomes

$\begin{matrix} f_{A \to A - k} (L) = λ_{A - k + 1} (\binom{A}{k - 1}) e^{- λ_{A - k + 1} L} \sum_{l = 0}^{k - 1} {(- 1)}^{k - 1 - l} (\binom{k - 1}{l}) e^{- λ_{k - 1 - l} L}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f_{A \rightarrow A-k} (L) = \lambda _{A-k+1} {A \atopwithdelims ()k-1} e^{-\lambda _{A-k+1} L}\sum _{l = 0}^{k-1} (-1)^{k-1-l} {k-1 \atopwithdelims ()l} e^{-\lambda _{k-1-l} L}, \end{aligned} $$$ (A.13)

where it is apparent from the binomial identity that the sum is equivalent to (1 − e^−λ₁L)^k − 1, as in Eq. 7.

Other forms of this expression can be found in terms of known functions with precomputed values. Setting the variable ξ = e^−λ₁L we have $\frac{d}{dL} = - λ_{1} ξ \frac{d}{d ξ}$ $Mathematical equation: $ \frac{d}{dL} = -\lambda_1 \xi \frac{d}{d\xi} $$ . Some terms change as e^−λ_AL = ξ^A and (e^λ_lL−1) = (1−e^−λ_lL)e^λ_lL, resulting in

$\begin{matrix} f_{A \to A - k} (ξ) = - (A - k + 1) (\binom{A}{k - 1}) ξ^{A - k} {(1 - ξ)}^{k - 1}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} f_{A \rightarrow A-k} (\xi ) = - (A - k + 1) {A \atopwithdelims ()k-1} \xi ^{A - k} \left(1 - \xi \right)^{k-1}, \end{aligned} $$$ (A.14)

where the relation $(A - k + 1) (\binom{A}{k - 1}) = 1 / B_{ξ} (A - k + 1, k)$ $Mathematical equation: $ (A -k+1){A \choose k-1} = 1/\mathrm{B}_{\xi}(A-k+1, k) $$ was used, with B(α, β) the beta function. This form corresponds to the beta distribution ℬ(α, β), namely, computing the RSeC distribution requires only evaluating ℬ(α, β) with the appropriate arguments.

The corresponding distribution function,

$\begin{matrix} F_{A \to A - k} (ξ) = 1 - \int_{0}^{ξ} \frac{{ξ^{'}}^{A - k} {(1 - ξ^{'})}^{k - 1}}{B (A - k + 1, k)} d ξ^{'} = 1 - B_{ξ} (A - k + 1, k) = B_{1 - ξ} (k, A - k + 1), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F_{A \rightarrow A-k} (\xi ) = 1 - \int _0^{\xi } \frac{{\xi ^{\prime }}^{A-k} (1 - \xi ^{\prime })^{k-1}}{\mathrm{B} (A-k+1, k)} d\xi ^{\prime } = 1 - \mathcal{B} _{\xi }(A-k+1, k) = \mathcal{B} _{1-\xi }(k, A-k+1), \end{aligned} $$$ (A.15)

is given in terms of the incomplete beta function, ℬ_z(α, β), which has known expressions for the expected value $E [ξ] = \frac{α}{α + β}$ $Mathematical equation: $ \mathbb{E}[\xi]=\frac{\alpha}{\alpha+\beta} $$ and the variance $Var [ξ] = \frac{α β}{{(α + β)}^{2} (α + β + 1)}$ $Mathematical equation: $ \mathrm{Var} [\xi]=\frac{\alpha \beta}{(\alpha +\beta)^2 (\alpha+\beta+1)} $$ . Substituting ξ and the appropriate values for α and β in these expressions yields

$\begin{matrix} E {[λ_{1} L]}_{A \to A - k} = ln (\frac{A + 1}{A - k + 1}), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \mathbb{E} [\lambda _1 L]_{A \rightarrow A-k} = \ln {\left( \frac{A+1 }{A-k+1}\right)}, \end{aligned} $$$ (A.16)

$\begin{matrix} Var {[λ_{1} L]}_{A \to A - k} = - ln (1 - \frac{k (A - k + 1)}{{(A + 1)}^{2} (A + 2)}) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \mathrm{Var} [\lambda _1L]_{A \rightarrow A-k} = -\ln \left(1 - \frac{k(A-k+1)}{(A+1)^2(A+2)} \right) . \end{aligned} $$$ (A.17)

The distribution ℬ(α, β) and the binomial distribution are well-known and interconnected. They help us interpret functions in terms of the superposition model where the interaction probability of an individual constituent nucleon determines the probability that the nucleus suffers a given number of interactions.

Appendix B: Full disintegration distance for ISeCs and CoCs

The expected distance until full disintegration for RSeCs involves the logarithm of the initial mass, as seen in Appendix A. This expression can be compared to the one for ISeCs, by transforming the expression of the expected distance for the latter,

$\begin{matrix} E [λ_{1} L] = λ_{1} \sum_{k = 1}^{A} \frac{1}{k λ_{1}} + (\frac{1}{λ_{k}} - \frac{1}{k λ_{1}}) = \sum_{k = 1}^{A} \frac{1}{k} + (\frac{1}{\tilde{k}} - \frac{1}{k}) = H_{A} + \sum_{k = 1}^{A} \frac{k - \tilde{k}}{k \tilde{k}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \mathbb{E} [\lambda _1L] = \lambda _1 \sum _{k = 1}^{A} \frac{1}{k\lambda _1} + \left(\frac{1}{\lambda _k} - \frac{1}{k\lambda _1}\right) = \sum _{k = 1}^{A} \frac{1}{k} + \left(\frac{1}{\tilde{k}} - \frac{1}{k}\right) = H_A + \sum _{k = 1}^{A} \frac{k - \tilde{k}}{k\tilde{k}} , \end{aligned} $$$ (B.1)

where H_A is the harmonic number and the second term contains the deviation from the regular case. Examining this sum closely, we see that the deviations $χ_{k} = \frac{k - \tilde{k}}{\tilde{k}}$ $Mathematical equation: $ \chi_k=\frac{k - \tilde k}{\tilde k} $$ are expected to be randomly distributed with a null mean and modules smaller than unity. We may place a limit on the second term by taking the largest deviation, χ^max, and factoring it out from the sum. It is clear that all ratios satisfy |χ_k/χ^max| ≤ 1. Thus, if we replace all ratios by unity with the corresponding sign, we can assert that the module of this sum is strictly larger

$\begin{matrix} | χ^{\max} \sum_{k = 1}^{A} \frac{χ_{k} / χ^{\max}}{k} | < | χ^{\max} \sum_{k = 1}^{A} \frac{ε_{k}}{k} | < | χ^{\max} | | \sum_{k = 1}^{\infty} \frac{ε_{k}}{k} |, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \left|\chi ^\mathrm{max}\sum _{k = 1}^{A} \frac{\chi _k/\chi ^\mathrm{max}}{k}\right| < \left|\chi ^\mathrm{max}\sum _{k = 1}^{A} \frac{\varepsilon _k}{k}\right| < \left|\chi ^\mathrm{max}\right| \left|\sum _{k = 1}^{\infty } \frac{\varepsilon _k}{k} \right|, \end{aligned} $$$ (B.2)

where $| \sum_{k = 1}^{\infty} \frac{ε_{k}}{k} | ≲ 3$ $Mathematical equation: $ \left|\sum_{k = 1}^{\infty} \frac{\varepsilon_k}{k} \right| \lesssim 3 $$ is the random harmonic series (Schmuland 2003), which is bounded. Therefore, we can conclude that for ISeCs 𝔼[λ₁L] differs from the behavior of RSeCs by no more than 3|χ^max|.

The general expression, applicable to arbitrary CoCs, can be derived explicitly from the general expression (see Table 1), 𝔼[λ₁L]= − λ₁ϕΛ⁻¹1, where ϕ would have only injection of the heaviest nucleus. Hence, all entries are null except the heaviest species (first row), where it is 1. In such a case, the expression yields the sum of all the elements in the first row of Λ⁻¹, which can be determined by solving for the first row Λ^Tx = e₁ (where e₁ is a vector with all entries null except the first one being 1) and where each entry can be solved iteratively, yielding

$\begin{matrix} x_{1} & = \frac{1}{λ_{S_{1}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_1&= \frac{1}{\lambda _{S_1}},\end{aligned} $$$ (B.3)

$\begin{matrix} x_{2} & = \frac{λ_{S_{1} \to S_{2}}}{λ_{S_{1}} λ_{S_{2}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_2&= \frac{\lambda _{S_1 \rightarrow S_2}}{\lambda _{S_1}\lambda _{S_2}},\end{aligned} $$$ (B.4)

$\begin{matrix} x_{3} & = \frac{λ_{S_{1} \to S_{3}}}{λ_{S_{1}} λ_{S_{3}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{3}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_3&= \frac{\lambda _{S_1 \rightarrow S_3}}{\lambda _{S_1}\lambda _{S_3}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_3}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_3}},\end{aligned} $$$ (B.5)

$\begin{matrix} x_{4} & = \frac{λ_{S_{1} \to S_{4}}}{λ_{S_{1}} λ_{S_{4}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{4}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{4}}} + \frac{λ_{S_{1} \to S_{3}} λ_{S_{3} \to S_{4}}}{λ_{S_{1}} λ_{S_{3}} λ_{S_{4}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} λ_{S_{3} \to S_{4}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{3}} λ_{S_{4}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_4&= \frac{\lambda _{S_1 \rightarrow S_4}}{\lambda _{S_1}\lambda _{S_4}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_4}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_4}} + \frac{\lambda _{S_1 \rightarrow S_3}\lambda _{S_3 \rightarrow S_4}}{\lambda _{S_1}\lambda _{S_3}\lambda _{S_4}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_3}\lambda _{S_3 \rightarrow S_4}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_3}\lambda _{S_4}},\end{aligned} $$$ (B.6)

$\begin{matrix} x_{5} & = \frac{λ_{S_{1} \to S_{5}}}{λ_{S_{1}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{5}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{3}} λ_{S_{3} \to S_{5}}}{λ_{S_{1}} λ_{S_{3}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{4}} λ_{S_{4} \to S_{5}}}{λ_{S_{1}} λ_{S_{4}} λ_{S_{5}}} + . . . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_5&= \frac{\lambda _{S_1 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_3}\lambda _{S_3 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_3}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_4}\lambda _{S_4 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_4}\lambda _{S_5}} + ... \end{aligned} $$$ (B.7)

$\begin{matrix} \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} λ_{S_{3} \to S_{5}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{3}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{4}} λ_{S_{4} \to S_{5}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{4}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{3}} λ_{S_{3} \to S_{4}} λ_{S_{4} \to S_{5}}}{λ_{S_{1}} λ_{S_{3}} λ_{S_{4}} λ_{S_{5}}} + \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} λ_{S_{3} \to S_{4}} λ_{S_{4} \to S_{5}}}{λ_{S_{1}} λ_{S_{2}} λ_{S_{3}} λ_{S_{4}} λ_{S_{5}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned}&\frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_3}\lambda _{S_3 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_3}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_4}\lambda _{S_4 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_4}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_3}\lambda _{S_3 \rightarrow S_4}\lambda _{S_4 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_3}\lambda _{S_4}\lambda _{S_5}} + \frac{\lambda _{S_1 \rightarrow S_2}\lambda _{S_2 \rightarrow S_3}\lambda _{S_3 \rightarrow S_4}\lambda _{S_4 \rightarrow S_5}}{\lambda _{S_1}\lambda _{S_2}\lambda _{S_3}\lambda _{S_4}\lambda _{S_5}},\end{aligned} $$$ (B.8)

$\begin{matrix} x_{6} & = . . . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} x_6&= ... \end{aligned} $$$ (B.9)

By generalizing and adding the above expressions, we obtain the formula for the expectation of distance until full disintegration,

$\begin{matrix} E [λ_{1} L] = - λ_{1} ϕ Λ^{- 1} 1 = \frac{λ_{1}}{λ_{S_{1}}} \sum_{n = 1}^{| {S} |} \sum_{{α, β, . . .} \subseteq (\binom{[n]}{k})}^{(\binom{n}{k})} \frac{λ_{S_{1} \to S_{α}} λ_{S_{α} \to S_{β}} . . .}{λ_{S_{α}} λ_{S_{β}} . . .}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \mathbb{E} [\lambda _1L] = -\lambda _1\boldsymbol{\phi } \boldsymbol{\Lambda }^{-1}\boldsymbol{1} =\frac{\lambda _1}{\lambda _{S_1}} \sum _{n = 1}^{|\{S\}|} \sum _{\{\alpha , \beta , ...\} \subseteq {[n] \atopwithdelims ()k}}^{n \atopwithdelims ()k} \frac{\lambda _{S_1 \rightarrow S_{\alpha }} \lambda _{S_{\alpha } \rightarrow S_{\beta }} ...}{\lambda _{S_{\alpha }}\lambda _{S_{\beta }} ...}, \end{aligned} $$$ (B.10)

where $(\binom{[n]}{k})$ $Mathematical equation: $ [n] \choose k $$ denotes the set formed by all $(\binom{n}{k})$ $Mathematical equation: $ n \choose k $$ combinations of k indices chosen out of n possible indices. This expression yields the ISeC result when there is only one nucleon emission channel, since only the term,

$\begin{matrix} \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} . . . λ_{S_{n - 1} \to S_{n}}}{λ_{S_{1}} λ_{S_{2}} . . . λ_{S_{n}}} = \frac{1}{λ_{S_{n}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\lambda _{S_1 \rightarrow S_2} \lambda _{S_2 \rightarrow S_3}...\lambda _{S_{n-1} \rightarrow S_n}}{\lambda _{S_1}\lambda _{S_2} ...\lambda _{S_n}} = \frac{1}{\lambda _{S_n}}, \end{aligned} $$$ (B.11)

survives for each n, yielding the inverse rate for the n-th species. Thus, the sum of these n terms yields the expected result. Furthermore, it is evident that with the regularity condition this expression produces the result for RSeCs.

However, for CoCs this expression produces shorter values, as can be seen in the following example. Assuming that all species can undergo one-nucleon loss with a branching ratio χ, so that the two-nucleon loss channel has a branching ratio of 1 − χ, we have the term,

$\begin{matrix} \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} . . . λ_{S_{n - 1} \to S_{n}}}{λ_{S_{1}} λ_{S_{2}} . . . λ_{S_{n}}} = \frac{χ^{n - 1}}{λ_{S_{n}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\lambda _{S_1 \rightarrow S_2} \lambda _{S_2 \rightarrow S_3}...\lambda _{S_{n-1} \rightarrow S_n}}{\lambda _{S_1}\lambda _{S_2} ...\lambda _{S_n}} = \frac{\chi ^{n-1}}{\lambda _{S_n}}, \end{aligned} $$$ (B.12)

similar to the previous case for one-nucleon loss only, and we have n − 2 terms, where only one two-nucleon loss rate appears,

$\begin{matrix} \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} . . . λ_{S_{k} \to S_{k + 2}} . . . λ_{S_{n - 1} \to S_{n}}}{λ_{S_{1}} λ_{S_{2}} . . . λ_{k} . . . λ_{S_{n}}} = \frac{χ^{n - 2} (1 - χ)}{λ_{S_{n}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\lambda _{S_1 \rightarrow S_2} \lambda _{S_2 \rightarrow S_3}...\lambda _{S_k \rightarrow S_{k+2}}...\lambda _{S_{n-1} \rightarrow S_n}}{\lambda _{S_1}\lambda _{S_2} ...\lambda _k ...\lambda _{S_n}} = \frac{\chi ^{n-2}(1-\chi )}{\lambda _{S_n}}, \end{aligned} $$$ (B.13)

and we have $(\binom{n - 3}{2})$ $Mathematical equation: $ n-3 \choose 2 $$ terms where two-nucleon loss rates appear,

$\begin{matrix} \frac{λ_{S_{1} \to S_{2}} λ_{S_{2} \to S_{3}} . . . λ_{S_{k} \to S_{k + 2}} . . . λ_{S_{l} \to S_{l + 2}} . . . λ_{S_{n - 1} \to S_{n}}}{λ_{S_{1}} λ_{S_{2}} . . . λ_{k} . . . λ_{l} . . . λ_{S_{n}}} = \frac{χ^{n - 3} (1 - χ)}{λ_{S_{n}}^{2}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{\lambda _{S_1 \rightarrow S_2} \lambda _{S_2 \rightarrow S_3}...\lambda _{S_k \rightarrow S_{k+2}}...\lambda _{S_l \rightarrow S_{l+2}}...\lambda _{S_{n-1} \rightarrow S_n}}{\lambda _{S_1}\lambda _{S_2} ...\lambda _k ...\lambda _l ...\lambda _{S_n}} = \frac{\chi ^{n-3}(1-\chi )}{\lambda _{S_n}^2}, \end{aligned} $$$ (B.14)

and so on. Grouping terms ending in the same species, S_n, yields

$\begin{matrix} \frac{1}{λ_{S_{n}}} (\sum_{q = 0}^{n - 1} (\binom{n - 1 - q}{q}) χ^{n - 1 - q} {(1 - χ)}^{q}) < \frac{1}{λ_{S_{n}}}, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \frac{1}{\lambda _{S_n}} \left(\sum _{q = 0}^{n-1} {n-1-q \atopwithdelims ()q} \chi ^{n-1-q}(1-\chi )^q\right) < \frac{1}{\lambda _{S_n}}, \end{aligned} $$$ (B.15)

which is shorter than the corresponding inverse rate for the n-th species. Indeed, comparing the expression to (χ + (1 − χ))^n − 1 makes it clear that the coefficients in the expression are smaller than the binomial coefficients. This occurs for all n terms and explains why the expectation values for the full disintegration distance are reduced in CoCs compared to serial cascades.

Appendix C: Decoherence lengths for dispersive inhomogeneities during propagation

To estimate the impact of DIs caused by Bethe-Heitler pair production losses (BHL) during propagation over cosmological distances, we compare survival probability distributions, one including and one excluding BHL. This is reasonable because, in photodisintegration cascades, the shortest length scale is determined by the injected species with the largest mass, as its interaction rates are usually the highest. Additionally, DI effects are produced in chains of more than one species, so if the effect is small for the largest species in a cascade, it should be sufficiently small for all others.

The survival probability follows an exponential distribution, which, for sufficiently short distances, has a homogeneous interaction rate because the redshift does not change, namely,

$\begin{matrix} F^{H} (δ) = 1 - exp (- λ_{S_{0}} ({(1 + z)}^{2} γ_{c}) δ) . \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F^\mathrm{H}(\delta ) = 1 - \exp {\left(-\lambda _{S_0}((1+z)^2\gamma _c)\delta \right)} . \end{aligned} $$$ (C.1)

In the presence of DIs, it is described by an inhomogeneous exponential distribution

$\begin{matrix} F^{IH} (δ) = 1 - exp (- \int λ_{S_{0}} ({(1 + z)}^{2} γ_{c}) d δ), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} F^\mathrm{IH}(\delta ) = 1 - \exp {\left(-\int \lambda _{S_0}((1+z)^2\gamma _c)d\delta \right)} , \end{aligned} $$$ (C.2)

where changes in the rate with γ_c need to be included, as γ_c = γ_c(δ) varies with the cosmological thickness according to Eq. 36.

Figure C.1 shows the scales of cosmological thickness δ for different interactions experienced by an iron nucleus at two different redshifts. The total photodisintegration scale is shown by a black dotted line, which is the sum of the scales for interaction with the CMB (solid blue) and with the IRB (solid green, using the model by Gilmore et al. (2012)). The scales are expressed in terms of the cosmological thickness; thus, the CMB interaction rates remain constant with redshift, while the IRB rates change according to a(z) (see equations 26 and 27). The thickness scales for BHL are shown for multiple relative loss percentages. These scales do not change with redshift, as seen in Eq. 36. The photodisintegration scales are shorter, implying that iron likely photodisintegrates before experiencing a relative energy loss of 3 % due to BHL regardless of the Lorentz boost. As a consequence, photodisintegration rates practically do not change for the typical thickness scales required for this interaction. The same conclusion holds for larger cosmological scales, however, at larger redshifts (e.g. z ≥ 2) the photodisintegration distance scales for the IRB are comparable to the 10% relative energy BHL scales. This stems from the changes in IRB density with redshift, as reflected in a(z) (see Eq. 26), while for the CMB the relation to BHL does not change. Nevertheless, to understand the impact a relative energy loss has on the probability distribution, we need to quantify the change in the interaction rate and its corresponding effect on the distribution.

Fig. C.1.

Thickness scales for iron to undergo different processes as a function of the comoving boost. Photodisintegration scales are represented by solid lines: in blue for interactions with the CMB and in green for interactions with the IRB. The thickness scales for the Bethe-Heitler pair production interactions are shown by shaded regions for certain values of relative energy energy loss in terms of percentage.

In the first approximation of the Taylor expansion around δ = 0, the evolution of the disintegration rate yields, assuming the redshift can be considered constant,

$\begin{matrix} λ ({(1 + z)}^{2} γ_{c} (δ)) = λ_{0} + \frac{d λ}{d δ} |_{δ = 0} δ = λ_{0} + {(1 + z)}^{2} \frac{d λ}{d γ_{c}} \frac{d γ_{c}}{d δ} |_{δ = 0} δ = λ_{0} + {(1 + z)}^{2} \frac{d λ}{d γ_{c}} |_{δ = 0} γ_{c}^{0} \frac{Z^{2}}{A} \frac{β_{0} (γ_{c}^{0})}{c} δ, \end{matrix}$ $Mathematical equation: $$ \begin{aligned} \lambda ((1+z)^2\gamma _c(\delta )) = \lambda _0 + \frac{d\lambda }{d\delta }\bigg |_{\delta = 0}\delta = \lambda _0 + (1+z)^2\frac{d\lambda }{d\gamma _c}\frac{d\gamma _c}{d\delta }\bigg |_{\delta = 0}\delta = \lambda _0 + (1+z)^2\frac{d\lambda }{d\gamma _c}\bigg |_{\delta = 0}\gamma _c^0\frac{Z^2}{A}\frac{\beta _0(\gamma _c^0)}{c}\delta , \end{aligned} $$$ (C.3)

where the starting comoving boost is γ_c⁰. By using this expression in F^IH, we can evaluate the difference between the homogeneous and inhomogeneous descriptions,

$\begin{matrix} (F^{H} - F^{IH}) (δ) = e^{- λ_{0} δ} (exp ({(1 + z)}^{2} \frac{d λ (γ_{c})}{d γ_{c}} |_{δ = 0} γ_{c}^{0} \frac{Z^{2}}{A} \frac{β_{0} (γ_{c}^{0})}{c} \frac{δ^{2}}{2}) - 1), \end{matrix}$ $Mathematical equation: $$ \begin{aligned} (F^\mathrm{H}-F^\mathrm{IH})(\delta ) = e^{-\lambda _0\delta } \left( \exp {\left({(1+z)^2\frac{d\lambda (\gamma _c)}{d\gamma _c}\bigg |_{\delta = 0}\gamma _c^0\frac{Z^2}{A}\frac{\beta _0(\gamma _c^0)}{c}\frac{\delta ^2}{2}} \right)} - 1 \right), \end{aligned} $$$ (C.4)

which is a monotonic function of the cosmological thickness traversed δ. With this expression we can evaluate the relative error incurred when neglecting BHL for a given length scale.

Figure C.2 presents the relative error in neglecting the effect of BHL on a thickness scale comparable to three times the photodisintegration scale, which corresponds to a 95% chance of interaction. Different nuclei are shown for comparison, and error bands are provided across a broad range of comoving boosts and redshifts. The error values vary across the phase space, but remain below 3% for most of it, regardless of the nuclear species. The main differences concentrate around γ_c ≈ 3 − 4 ⋅ 10⁹ since this is the region of transition from IRB to CMB interactions, where BHL can become important, as shown in figure C.1. These error values are acceptable for most applications. However, when greater precision is needed, the distributions can be computed numerically in the reduced region of the phase space where the impact is greater.

Fig. C.2.

Percent error of neglecting DI for a length scale of three times the typical interaction thickness as function of the redshift and comoving boost.

Appendix D: Details of source examples

The source scenario considered in Fig. 7 is based on the GRB example in (Morejon et al. 2019) and is discussed in more detail as the “Optically Thick Case” in (Biehl et al. 2018). The GRB model is based on the “fireball” picture, in which cosmic rays are pre-accelerated and injected into the photon emission zone over a certain time interval, producing the emitted composition through photointeractions within this zone. The emission mechanism assumed is guided by the “internal shock model” in which relativistic shells collide, and a fraction of their kinetic energy powers the emission. The model conceives one such collision as a spherical shell expanding relativistically with a Lorentz factor Γ = 300, a radius of collision R_c = 2 ⋅ 10⁸ km, and a volume V_iso = 4πR_c²Δd′, where Δd′ is the shell’s thickness. The photon spectral number density is described by a broken power law between energies of 100 eV and 100 keV, with a power −1 below the break energy 1 keV and a power −2 above the break. The energy density is $u_{γ}' = \frac{L_{γ}}{4 π Γ^{2} R_{c}^{2}}$ $Mathematical equation: $ u_{\gamma}{\prime}=\frac{L_{\gamma}}{4\pi \Gamma^2 R_c^2} $$ , where L_γ = 10⁵³ ergs/s is the luminosity. The magnetic field intensity is estimated assuming its energy density is comparable to the photon luminosity, $B' = \sqrt{\frac{2 L_{γ}}{c Γ^{2} R^{2}}}$ $Mathematical equation: $ B{\prime}= \sqrt{\frac{2L_{\gamma}}{c \Gamma^2 R^2}} $$ . The injected cosmic ray species is ⁵⁶Fe and the spectrum is given by Q_⁵⁶Fe′(E′) = C′E′⁻²exp[−(E′/E′_max)²], where E′_max is the energy at which the acceleration rate is comparable to the sum of the rates for all energy losses, and C′ is determined by $\int_{0}^{10 E'_{\max}} \tilde{E} Q_{^{56} F e}' (\tilde{E}) d \tilde{E} = 10 u_{γ}' c / Δ d'$ $Mathematical equation: $ \int_0^{10E{\prime}_{\mathrm{max}}}\tilde E Q_{^{56}Fe}{\prime}(\tilde E) d\tilde E = 10u_{\gamma}{\prime}c / \Delta d{\prime} $$ which stems from assuming that the baryonic loading factor (the ratio of the cosmic ray luminosity to the photon luminosity) is 10.

Additional considerations for the source models in this work involve the variability in the time dependence of the injection and the inclusion of rigidity-dependent escape rates. The baseline case for the effect of changes in the temporal form of the injection is a constant injection of cosmic rays of $C' (t) = C'_{0} \approx 2.679 \cdot 10^{12} GeV {cm}^{- 3} s^{- 1}$ $Mathematical equation: $ C{\prime}(t) = C{\prime}_0 \approx 2.679 \cdot 10^{12}\, \rm{GeV\,cm^{-3}s^{-1}} $$ , and the variability is represented by the difference between the lower values of $C' (t) = 4 (1 - {(t - 1)}^{2}) \cdot 10^{12} GeV {cm}^{- 3} s^{- 1}$ $Mathematical equation: $ C{\prime}(t) = 4(1 - (t-1)^2) \cdot 10^{12}\, \rm{GeV\,cm^{-3}s^{-1}} $$ and the higher values of $C' (t) = (4.157 - 2.952 t) \cdot 10^{12} GeV {cm}^{- 3} s^{- 1}$ $Mathematical equation: $ C{\prime}(t) = (4.157 - 2.952 t) \cdot 10^{12} \,\rm{GeV\,cm^{-3}s^{-1}} $$ . All of these values require the same total injection over the source’s time scale, an interval of one second. Rigidity-dependent escape was implemented by taking the advective escape as the nominal case (whereby all species propagate one light-second until escape) and two other diffusion cases: Bohm and Kolmogorov. In both cases, the escape is exponentially distributed, F_esc = 1 − exp(−t/t_diff(R)), with diffusion time scales t_diff(R) = 3 ⋅ 10⁶/R for the Bohmian case (where the diffusion coefficient is proportional to the rigidity), and t_diff(R) = 2 ⋅ 10²/R^1/3 to model the Kolmogorov case. These choices of F_esc(t, R) illustrate the changes in the crossing time through the source medium, with a broadening at lower rigidities, as expected for diffusive propagation.

Appendix E: Efficiency over a Monte Carlo approach

The stochastic framework presented here offers a more efficient method for computing quantities related to UHECR interactions in astrophysical scenarios. Monte Carlo methods, commonly employed for UHECR propagation, have known drawbacks regarding computational efficiency in comparison to the evaluation of closed-form expressions. Nevertheless, for the sake of completeness, we compare the efficiency of this approach to that of a generic Monte Carlo.

Determining the computational cost or efficiency of Monte Carlo methods is not trivial because the conclusion depends on the aspects taken into consideration, e.g., the desired precision, the convergence of the employed algorithm(s), and the specific scenario being simulated. For simplicity, we focus on a few quantities that characterize the efficiency in terms of time and the number of computational operations required. We omit estimating energy costs in this comparison. However, it is clear that, in general, Monte Carlo methods are a poor choice when rare events are of interest and the region of interest in the input phase space is unknown. Additionally, costs are also incurred in training and testing simulations, which are often discarded due to errors. Many of these drawbacks can be addressed with specific sampling techniques or by introducing weights and biases. However, these techniques still require exploratory simulations to understand the underlying phase space.

The goal here is to establish an ideal limit based on two simple assumptions: a) All Monte Carlo trials have a similar computational cost; b) Sampling the input parameter space results in uniform sampling of the desired quantity. These ensure the probability of obtaining values of the distribution in the range [x_a, x_b] is F(x_b)−F(x_a), where F(x) is the distribution function, and for an infinitesimal range dx the probability is given by f(x)dx, where f(x) is the density function. Figure E.1 illustrates the computational costs for two different distributions of the probability density of producing nuclei with A = 28 for γ = 5 ⋅ 10⁹: one starting with ⁴⁰Ca (dashed blue) and one starting with ⁵⁶Fe (dotted-dashed green). The disintegration cross-sections used are from the default table provided in CRPropa 3.2 (Kampert et al. 2013; Alves Batista et al. 2022) (184 species). The number of trials, given on the right axis, is estimated by considering that, to limit the uncertainty to ∼10%, the number of successful events needed is N_s ∼ 100. This implies the number of trials is N_t = N_s/p, where the probability is p = f(x)dx, and f(x) the theoretical probability density illustrated with the lines. In terms of CPU time, the computational cost of a Monte Carlo approach can be estimated by assuming ∼10⁻⁷ CPU hours per trial. This is a conservative estimate based on the reported values for SimProp v2r4 (Aloisio et al. 2017). The shaded bands enclose the probability ranges that can be probed with the stated computational effort in CPU hours. A relatively good characterization of both curves can be achieved with a Monte Carlo approach for ∼1 CPU hour. However, obtaining the theoretical distributions with the desired 10% level of uncertainty would require ≳10⁸ CPU hours. Nevertheless, these distributions can be computed in a few seconds on a typical laptop using Python scripts and standard mathematical functions with the approach presented in this paper.

Fig. E.1.

Estimate of the computational effort with a Monte Carlo approach needed to access the probability distributions shown by the lines by evaluating the analytic expressions in this paper (a few seconds for one CPU). The shaded bands correspond to portions of the distributions that can be accessed at the stated computational cost in CPU hours.

A different estimate of the computational cost per trial can be made based on the number of propagation steps. In a Monte Carlo propagation of cosmic rays, the distribution over the propagation length is usually obtained through a sequence of propagation steps (of a user defined length), which involves repeatedly testing whether an interaction occurs after each iteration. The upper axis in Fig. E.1 presents the number of steps required to reach a given propagation length, assuming a step size of 100 kpc. The most likely events involve ∼100 steps, a number similar to the average number of steps per trial. Having an estimate of the CPU time for each step allows us to place constraints on the limits of different portions of the distribution and shows that the efficiency depends on whether the step size is much smaller than the length scales of the most relevant probable events. However, the resolution achievable with a given step size is limited since, in a Monte Carlo approach, the step size is generally chosen without knowledge of the form of the distribution. While a smaller step size would yield a better resolution of the entire distribution, it would increase the number of steps simulated with correspondingly more CPU time. Additionally, the asymmetry of these distributions implies that no single step size can describe both the small-scale rise and the large-scale tails with the same precision, without considerable CPU time. In scenarios of interest, it is often not possible to find an optimal step size that describes all underlying distributions with comparable level of precision, as the rise and decrease of different distributions can differ by orders of magnitude and peak at different distances. This occurs not only for different injected species, but also for different values of the Lorentz boost which are commonly simulated with the same steps. These limitations of Monte Carlo methods can be overcome when the underlying distributions are known in closed form.

E.1. Concrete examples: Comparison to CRPropa

Figure E.2 shows a comparison of some example serial distributions. The simulated nuclear chain is a serial cascade consisting of the following species {⁴⁰Ca→³⁹K→³⁸Ar→³⁷Ar→³⁶Ar→³⁵Cl} and their respective interaction lengths {8.98, 7.57, 11.48, 9.78, 10.64} Mpc for a boost of γ = 3.5 ⋅ 10⁹. The four top panels of the figure show the distributions of the propagation distances needed to disintegrate two, three, four, and five nucleons from the initial ⁴⁰Ca nuclei. Results of CRPropa simulations are shown with solid green bars, and results of the analytical approach are shown in hollow purple bars. In the Monte Carlo approach, 10⁵ nuclei were simulated for each case with a fixed propagation step of λ⁻¹/50 ≈ 180 kpc. Here, λ is the interaction rate for ⁴⁰Ca at the chosen boost 3.5 ⋅ 10⁹, which is the shortest for all species in the chain. The expected distance is larger the more nucleons are required to disintegrate, and the distributions become broader. Additionally, the number of recorded nuclei (N_ev) decreases for distributions with a larger nucleon loss because the number of interactions needed to reach the final state increases (one nucleon is lost per interaction) and the final state is increasingly less probable, consequently increasing the simulation time. The Monte Carlo simulations closely resemble the analytical distributions within the range of distances with the highest probabilities and are less accurate elsewhere. The bottom panel of Figure E.2 shows the relative differences between the CRPropa simulations and the analytic distributions. Each row shows the differences for a distribution identified by the final species denoted on the right side, including the one-nucleon-loss case. This demonstrates the good agreement in the range of the highest probabilities, while larger deviations concentrate in the tails of the distributions especially when the probabilities fall below 10⁻³. The simulation times (t_sim) are considerably longer than the time needed to evaluate the analytic expressions (ranging tens of milliseconds).

Fig. E.2.

Comparison of CRPropa simulations and analytic distributions for a serial cascade involving nuclear species {⁴⁰Ca→³⁹K→³⁸Ar→³⁷Ar→³⁶Ar→³⁵Cl}. The four top panels show the probability distributions as a function of distance for cases with 2-5 nucleon losses. The bottom panel shows the relative differences between CRPropa and the analytic distributions. There is a row for each case, including the one-nucleon-loss distributions. Each case simulated 10⁵ nuclei with γ = 3.5 ⋅ 10⁹. The interaction length for ⁴⁰Ca is the reference scale 1/λ.

Figure E.3 illustrates concurrent cascades with the distributions of heavier secondary species produced in the disintegration of ⁵⁶Fe nuclei after propagating for 30 Mpc with Lorentz boost γ = 3.5 ⋅ 10⁹. As in the previous figure, CRPropa simulations are shown with solid green bars and the analytical approach is shown in hollow purple bars. Although the distributions are closer for the most probable bins of mass (charge), the marginal distributions obtained with CRPropa peak at larger masses, indicating a less efficient disintegration. The relative differences between Monte Carlo and the analytical methods are shown in the bottom panel, for secondary species with probability larger than 1%. The most probable species are those with masses around 49-50, and the smallest differences between the distributions are found in that region, while above and below the range the distributions rapidly diverge. The origin of these discrepancies may reflect the choice of a fixed propagation step in CRPropa, which is a user-provided argument that affects the number of interaction samplings performed in propagating a nucleus and its secondaries. A proper choice of step size that is suitable for all intermediate species is not trivial as previously discussed, as each species has different interaction rates (approximately proportional to the mass). Alternatively, allowing CRPropa to make corrections to the step size during simulation can introduce biases in the computed distributions. A detailed study of these discrepancies would require more rigorous comparisons that systematically explore the effects of different parameters and simulation inputs. Such investigations will be pursued in future works.

Fig. E.3.

Distribution of heavier secondary species produced in the disintegration of Fe nuclei at boost γ = 3.5 ⋅ 10⁹ after propagating for 30 Mpc. Top: Marginal distributions of mass number (left) and charge number (right) obtained with Monte Carlo (using CRPropa, green solid) versus Analytical (using CRISP, purple lines). Down: Relative differences, in percentages, of Monte Carlo and Analytical probabilities for each secondary species with a probability larger than 1%.

All Tables

Table 1.

Characteristics of the distributions with a closed form.

In the text

All Figures

	Fig. 3. Density functions of distance until reaching different values of nuclear mass, the variation for the boost γ ∈ [4 ⋅ 10⁸, 3 ⋅ 10¹⁰] is represented by the shaded bands. The distributions are standardized and centered at the expected value, as they span different scales at different boosts.
In the text

	Fig. 6. Likelihood of the distance of origin of an observed 70 EeV ¹²C (left) and ¹⁶O nucleus assuming different initial nuclei. Even slight variations in the mass of the observed nuclei can lead to significant differences in their most likely distance of origin.
In the text

	Fig. C.2. Percent error of neglecting DI for a length scale of three times the typical interaction thickness as function of the redshift and comoving boost.
In the text

	Fig. E.1. Estimate of the computational effort with a Monte Carlo approach needed to access the probability distributions shown by the lines by evaluating the analytic expressions in this paper (a few seconds for one CPU). The shaded bands correspond to portions of the distributions that can be accessed at the stated computational cost in CPU hours.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[R1] Aab, A., Abreu, P., Aglietta, M., et al. 2015, ApJ, 802, 111 [Google Scholar]

[R2] Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2023, JCAP, 2023, 024 [CrossRef] [Google Scholar]

[R3] Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2024a, ApJ, 976, 48 [Google Scholar]

[R4] Abdul Halim, A., Abreu, P., Aglietta, M., et al. 2024b, JCAP, 2024, 022 [CrossRef] [Google Scholar]

[R5] Ahlers, M., & Taylor, A. M. 2010, Phys. Rev. D, 82, 123005 [Google Scholar]

[R6] Ahlers, M., Anchordoqui, L. A., & Taylor, A. M. 2013, PRD, 87, 023004 [Google Scholar]

[R7] Albrecher, H., & Bladt, M. 2019, J. Appl. Probab., 56, 1044 [Google Scholar]

[R8] Aloisio, R., Berezinsky, V., & Grigorieva, S. 2013a, Astropart. Phys., 41, 73 [Google Scholar]

[R9] Aloisio, R., Berezinsky, V., & Grigorieva, S. 2013b, Astropart. Phys., 41, 94 [Google Scholar]

[R10] Aloisio, R., Boncioli, D., di Matteo, A., et al. 2017, JCAP, 2017, 009 [CrossRef] [Google Scholar]

[R11] Alves Batista, R., Dundovic, A., Erdmann, M., et al. 2016, JCAP, 2016, 038 [Google Scholar]

[R12] Alves Batista, R., Becker Tjus, J., Dörner, J., et al. 2022, JCAP, 2022, 035 [CrossRef] [Google Scholar]

[R13] Arns, M., Buchholz, P., & Panchenko, A. 2010, INFORMS J. Comput., 22, 416 [Google Scholar]

[R14] Asmussen, S., & Glynn, P. W. 2007, in Stochastic Simulation: Algorithms and Analysis, (New York, NY: Springer), Stochastic Model. Appl. Probab., 57 [Google Scholar]

[R15] Astropy Collaboration (Robitaille, T. P., et al.) 2013, A&A, 558, A33 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R16] Astropy Collaboration (Price-Whelan, A. M., et al.) 2018, AJ, 156, 123 [Google Scholar]

[R17] Astropy Collaboration (Price-Whelan, A. M., et al.) 2022, ApJ, 935, 167 [NASA ADS] [CrossRef] [Google Scholar]

[R18] Bennett, C. L., Larson, D., Weiland, J. L., et al. 2013, ApJS, 208, 20 [Google Scholar]

[R19] Berezinskii, V. S., & Grigor’eva, S. I. 1988, A&A, 199, 1 [Google Scholar]

[R20] Berezinsky, V. S., Bulanov, S. V., Dogiel, V. A., & Ptuskin, V. S. 1990, in Astrophysics of Cosmic Rays, ed. V. L. Ginzburg [Google Scholar]

[R21] Biehl, D., Boncioli, D., Fedynitch, A., & Winter, W. 2018, A&A, 611, A101 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R22] Bladt, M., & Nielsen, B. F. 2017, Matrix-Exponential Distributions in Applied Probability (Springer), 81 [Google Scholar]

[R23] Blumenthal, G. R. 1970, PRD, 1, 1596 [Google Scholar]

[R24] Boncioli, D., Fedynitch, A., & Winter, W. 2017, Sci. Rep., 7, 4882 [Google Scholar]

[R25] Daw, A., & Pender, J. 2023, Adv. Appl. Probab., 55, 126 [Google Scholar]

[R26] Epele, L. N., & Roulet, E. 1998, J. High Energy Phys., 1998, 009 [Google Scholar]

[R27] Gilmore, R. C., Somerville, R. S., Primack, J. R., & Domínguez, A. 2012, MNRAS, 422, 3189 [NASA ADS] [CrossRef] [Google Scholar]

[R28] Globus, N., Allard, D., Mochkovitch, R., & Parizot, E. 2015, MNRAS, 451, 751 [NASA ADS] [CrossRef] [Google Scholar]

[R29] Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357 [NASA ADS] [CrossRef] [Google Scholar]

[R30] He, Q.-M. 2021, J. Appl. Prob, 58, 880 [Google Scholar]

[R31] He, Q.-M., Horváth, G., Horváth, I., & Telek, M. 2019, Adv. Appl. Probab., 51, 168 [Google Scholar]

[R32] Heinze, J., Fedynitch, A., Boncioli, D., & Winter, W. 2019, ApJ, 873, 88 [Google Scholar]

[R33] Hill, C. T., & Schramm, D. N. 1985, PRD, 31, 564 [Google Scholar]

[R34] Hoerbe, M. R., Morris, P. J., Cotter, G., & Becker-Tjus, J. 2020, MNRAS, 496, 2885 [NASA ADS] [CrossRef] [Google Scholar]

[R35] Hooper, D., Sarkar, S., & Taylor, A. M. 2008, Phys. Rev. D, 77, 103007 [Google Scholar]

[R36] Hümmer, S., Rüger, M., Spanier, F., & Winter, W. 2010, ApJ, 721, 630 [CrossRef] [Google Scholar]

[R37] Hunter, J. D. 2007, Comput. Sci. Eng., 9, 90 [NASA ADS] [CrossRef] [Google Scholar]

[R38] Kampert, K.-H., Kulbartz, J., Maccione, L., et al. 2013, Astropart. Phys., 42, 41 [Google Scholar]

[R39] Kawano, T., Cho, Y., Dimitriou, P., et al. 2020, Nucl. Data Sheets, 163, 109 [CrossRef] [Google Scholar]

[R40] Korochkin, A., Semikoz, D., & Tinyakov, P. 2025, A&A, 693, A284 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[R41] Lee, S., Olinto, A. V., & Sigl, G. 1995, ApJ, 455, L21 [Google Scholar]

[R42] Lia, V. D., & Tamborra, I. 2024, JCAP, 2024, 054 [Google Scholar]

[R43] Morejon, L. 2021, Ph.D. Thesis, Humboldt U., Berlin, Humboldt U., Berlin [Google Scholar]

[R44] Morejon, L. 2023, PoS, ICRC2023, 284 [Google Scholar]

[R45] Morejon, L. 2025, Cent. Eur. Astrophys. Bull., 47 [Google Scholar]

[R46] Morejon, L., & Rautenberg, J. 2025, in Proceedings of 7th International Symposium on Ultra High Energy Cosmic Rays - PoS(UHECR2024), 484, 110 [Google Scholar]

[R47] Morejon, L., Fedynitch, A., Boncioli, D., Biehl, D., & Winter, W. 2019, JCAP, 2019, 007 [Google Scholar]

[R48] Puget, J. L., Stecker, F. W., & Bredekamp, J. H. 1976, ApJ, 205, 638 [Google Scholar]

[R49] Rodrigues, X., Fedynitch, A., Gao, S., Boncioli, D., & Winter, W. 2018, ApJ, 854, 54 [NASA ADS] [CrossRef] [Google Scholar]

[R50] Schmuland, B. 2003, Am. Math. Mon., 110, 407 [Google Scholar]

[R51] TA Collaboration 2023, Science, 382, 903 [NASA ADS] [CrossRef] [Google Scholar]

[R52] Unger, M., & Farrar, G. R. 2024, ApJ, 970, 95 [NASA ADS] [CrossRef] [Google Scholar]

[R53] Virtanen, P., Gommers, R., Oliphant, T. E., et al. 2020, Nat. Methods, 17, 261 [Google Scholar]

[R54] Waxman, E., & Miralda-Escudé, J. 1996, ApJ, 472, L89 [Google Scholar]

[R55] Yoshida, S., & Teshima, M. 1993, Prog. Theor. Phys., 89, 833 [Google Scholar]

[R56] Zhang, B., Comput, V. R. J., & Graphical, 2021, J. Comput. Graphical Stat., 30, 25 [Google Scholar]