| Issue | A&A Volume 704, December 2025 |
|---|---|
| Article Number | A324 |
| Number of page(s) | 21 |
| Section | Numerical methods and codes |
| DOI | https://doi.org/10.1051/0004-6361/202555629 |
| Published online | 06 January 2026 |
Identification of molecular line emission using convolutional neural networks
1 Laboratoire d’Astrophysique de Bordeaux, Univ. Bordeaux, CNRS, UMR 5804, 33615 Pessac, France
2 LERMA, Observatoire de Paris, PSL Research University, CNRS, Sorbonne Univ., UMR 8262, 75014 Paris, France
3 IRAM, 300 Rue de la Piscine, 38046 Saint Martin d’Hères, France
Received: 22 May 2025
Accepted: 20 October 2025
Context. Complex organic molecules (COMs) are abundant in various astrophysical environments, particularly in star-forming regions, where they are observed toward both protostellar envelopes and shocked regions. Their emission spectra, especially those of heavier COMs, can contain up to hundreds of lines, and line blending hinders the analysis. Nevertheless, identifying the molecular composition of the gas that produces the observed millimeter spectra is the first step toward a quantitative analysis.
Aims. We have developed a new method based on supervised machine learning to recognize spectroscopic features of the rotational spectrum of molecules in the 3 mm atmospheric transmission band for a list of species including COMs, with the aim of obtaining a detection probability.
Methods. We used local thermodynamic equilibrium (LTE) modeling to build a large set of synthetic spectra of 20 molecular species, including COMs, over a range of physical conditions typical of star-forming regions. We designed and trained a convolutional neural network (CNN) that provides detection probabilities for individual species in the spectra.
Results. We demonstrate that the CNN model we developed robustly detects the spectroscopic signatures of these species in synthetic spectra. We evaluated its ability to detect molecules as a function of noise level, frequency coverage, and line-richness, and tested its performance for incomplete frequency coverage. It yields high detection probabilities over the tested parameter space, with no false predictions. Finally, we applied the CNN model to observational data from the literature toward line-rich hot core-like sources, where the detection probabilities remain reasonable, with no false detections.
Conclusions. We demonstrate the use of CNNs to facilitate the analysis of complex millimeter spectra, both on synthetic spectra and in first tests on observational data. Further analysis of the model's explainability, as well as calibration using a larger observational dataset, will help improve the performance of our method for future applications.
Key words: line: identification / methods: data analysis / stars: formation / ISM: molecules
© The Authors 2026
Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This article is published in open access under the Subscribe to Open model.