| Issue | A&A Volume 700, August 2025 | |
|---|---|---|
| Article Number | A84 | |
| Number of page(s) | 23 | |
| Section | Interstellar and circumstellar matter | |
| DOI | https://doi.org/10.1051/0004-6361/202452828 | |
| Published online | 06 August 2025 | |
BASIL: Fast broadband line-rich spectral-cube fitting and image visualization via Bayesian quadrature
1 Max-Planck-Institut für Extraterrestrische Physik, Giessenbachstr. 1, 85748 Garching bei München, Germany
2 Machine Learning Research Group, University of Oxford, Walton Well Road, Oxford OX2 6ED, UK
3 Max Planck Institute for Astrophysics, Karl-Schwarzschild-Strasse 1, 85748 Garching bei München, Germany
4 Lattice Lab, Toyota Motor Corporation, 1200 Mishuku, Susono, Shizuoka, Japan
★ Corresponding author: ylin@mpe.mpg.de
Received: 31 October 2024
Accepted: 21 May 2025
Context. Mapping the spatial distributions and abundances of complex organic molecules in hot cores and hot corinos surrounding nascent stars is crucial for understanding the astrochemical pathways and the inheritance of prebiotic material by nascent planetary systems. However, the line-rich spectra from these sources pose significant challenges for robustly fitting molecular parameters due to severe line blending and unidentified lines.
Aims. We present an efficient framework, Bayesian Active Spectral-cube Inference and Learning (BASIL), for estimating molecular parameter maps – excitation temperature, column density, centroid velocity, and line width – for hundreds of molecules based on the local thermodynamic equilibrium (LTE) model, applied to wideband spectral datacubes of line-rich sources. The main aim is to fit hundreds of molecules simultaneously, so as to disentangle blended lines and map the spatial distributions of molecular kinematics and abundances.
Methods. We adopted stochastic variational inference (SVI) to infer molecular parameters from spectra at individual positions, achieving a balance between fitting accuracy and computational speed. To obtain parameter maps without querying every location or pixel, we introduced an active learning framework based on Bayesian quadrature and its parallelization. Specifically, we trained a Gaussian process (GP) model to assess and select the locations or pixels whose spectra are most informative for estimating the entire set of parameter maps. By greedily selecting locations with maximum information gain, we achieved sublinear convergence: the estimation error of the GP model for the parameter maps drops rapidly in the early iterations and then stabilizes. At this point, the fitting process can be halted, providing a fast and reasonably accurate visualization of the molecular parameter maps; further accuracy can be obtained through additional training iterations that query more locations.
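The query-and-update loop described above can be illustrated with a minimal sketch. It is not the BASIL implementation: it replaces Bayesian quadrature with plain greedy maximum-variance selection under a fixed squared-exponential kernel, and it replaces the per-pixel SVI fit with a lookup into a synthetic, normalized parameter map. All function names, the kernel length scale, the grid size, and the batch size are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two sets of 2D pixel coordinates."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def select_batch(x_train, x_all, batch_size):
    """Greedily pick the pixels with the largest posterior variance.

    The GP variance does not depend on the observed values, so a whole
    batch can be chosen before any of its (expensive) fits has finished.
    """
    chosen = []
    x_cond = x_train.copy()
    for _ in range(batch_size):
        _, var = gp_posterior(x_cond, np.zeros(len(x_cond)), x_all)
        idx = int(np.argmax(var))
        chosen.append(idx)
        x_cond = np.vstack([x_cond, x_all[idx]])
    return chosen


def make_grid(n_side):
    """Pixel coordinates of an n_side x n_side map on the unit square."""
    axis = np.linspace(0.0, 1.0, n_side)
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])


def run_active_mapping(n_side=20, n_init=5, n_iter=4, batch_size=10, seed=0):
    """Return the map RMSE after the initial fit and after each iteration."""
    rng = np.random.default_rng(seed)
    x_all = make_grid(n_side)
    # Synthetic "true" parameter map (hypothetical stand-in for SVI results).
    truth = np.sin(np.pi * x_all[:, 0]) * np.cos(np.pi * x_all[:, 1])
    queried = list(rng.choice(len(x_all), size=n_init, replace=False))
    rmse = []
    for iteration in range(n_iter + 1):
        mean, _ = gp_posterior(x_all[queried], truth[queried], x_all)
        rmse.append(float(np.sqrt(np.mean((mean - truth) ** 2))))
        if iteration < n_iter:
            queried += select_batch(x_all[queried], x_all, batch_size)
    return rmse


if __name__ == "__main__":
    for step, err in enumerate(run_active_mapping()):
        print(f"iteration {step}: RMSE = {err:.3f}")
```

In this toy setting the loop can be stopped as soon as the error curve flattens, which mirrors the early-stopping use case described above; the real framework fits hundreds of parameter maps at once rather than a single scalar map.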
Results. We benchmarked our algorithm using a synthetic spectral cube of 40 000 (200 × 200) pixels, in which each pixel contains 138 016 frequency grid points, and fit an LTE model with SVI to obtain four spectral parameters for a list of 117 molecules (117 × 4 dimensions). Our algorithm is able to estimate 468 molecular parameter maps for 40 000 pixels in ~180 hours (18 iterations of 50 parallel fittings, ~10 hours per batch), achieving a comparable root mean square error across all data points. The full analysis was done on a high-memory server with multi-core CPUs. In contrast, a traditional MCMC fitting would take ~2 × 10⁶ hours to achieve the same level of accuracy, while requiring significant manual tuning. In particular, after two iterations (~20 hours of computational time), the GP model predicts parameter maps that are visually accurate. Additional training iterations provide progressively more accurate results. This quick visualization meets the demands of big data in modern astronomical surveys.
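The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above and adds two assumptions that the abstract does not spell out: that each of the 50 parallel fittings in a batch targets a single pixel, and that the MCMC estimate refers to fitting all 40 000 pixels. Variable names are illustrative.

```python
# Figures quoted in the abstract.
n_pixels = 200 * 200          # 40 000 pixels in the synthetic cube
n_molecules = 117
n_params_per_molecule = 4     # T_ex, column density, velocity, line width
n_iterations = 18
fits_per_batch = 50
hours_per_batch = 10.0        # approximate
mcmc_hours = 2e6              # approximate

n_maps = n_molecules * n_params_per_molecule          # 468 parameter maps
total_hours = n_iterations * hours_per_batch          # 180 hours

# Assumption: one pixel per parallel fit, so only a small subset is queried.
queried_pixels = n_iterations * fits_per_batch        # 900 pixels
queried_fraction = queried_pixels / n_pixels          # 0.0225, i.e. 2.25 %

# Assumption: the MCMC figure covers every pixel of the cube.
mcmc_hours_per_pixel = mcmc_hours / n_pixels          # 50 hours per pixel

# Ratio of the two quoted times, taken at face value.
speedup = mcmc_hours / total_hours                    # about 1.1e4

print(f"{n_maps} maps, {queried_pixels} queried pixels "
      f"({100 * queried_fraction:.2f} % of the cube)")
print(f"{total_hours:.0f} h vs {mcmc_hours:.0e} h: speed-up ~{speedup:.0f}x")
```

Under these assumptions, the GP interpolates the remaining ~98 % of the pixels from fits at 900 locations, which is where most of the saving relative to pixel-by-pixel fitting would come from.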
Key words: methods: data analysis / methods: observational / methods: statistical / techniques: image processing / ISM: abundances
© The Authors 2025
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.
Open Access funding provided by Max Planck Society.