Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.

Lisitsyna, Anna; Moritz, Franco; Liu, Youzhong; Al Sadat, Loubna; Hauner, Hans; Claussnitzer, Melina; Schmitt-Kopplin, Philippe; Forcisi, Sara

doi:10.1021/acs.analchem.1c03237

journal article

Title:: Feature Selection Pipelines with Classification for Non-targeted Metabolomics Combining the Neural Network and Genetic Algorithm.
Document type:: Journal Article; Research Support, Non-U.S. Gov't
Author(s):: Lisitsyna, Anna; Moritz, Franco; Liu, Youzhong; Al Sadat, Loubna; Hauner, Hans; Claussnitzer, Melina; Schmitt-Kopplin, Philippe; Forcisi, Sara
Abstract:: Non-targeted metabolomics via high-resolution mass spectrometry methods, such as direct infusion Fourier transform-ion cyclotron resonance mass spectrometry (DI-FT-ICR MS), produces data sets with thousands of features. By contrast, the number of samples is in general substantially lower. This disparity presents challenges when analyzing non-targeted metabolomics data sets and often requires custom methods to uncover information not always accessible via classical statistical techniques. In this work, we present a pipeline that combines a convolutional neural network with traditional statistical approaches and an adaptation of a genetic algorithm. The developed method was applied to a lifestyle intervention cohort data set, where subjects at risk of type 2 diabetes underwent an oral glucose tolerance test. Feature selection is the final result of the pipeline, achieved through classification of the data set via a neural network, with a precision-recall score of over 0.9 on the test set. The features most relevant for the described classification were then chosen via a genetic algorithm. The output of the developed pipeline encompasses approximately 200 features with high predictive scores, providing a fingerprint of the metabolic changes in the prediabetic class on the data set. Our framework presents a new approach that allows complex modeling based on convolutional neural networks to be applied to the analysis of high-resolution mass spectrometric data.
Journal title:: Anal Chem
Year:: 2022
Volume:: 94
Issue:: 14
Pages:: 5474-5482
Full text / DOI:: doi:10.1021/acs.analchem.1c03237
PubMed:: http://view.ncbi.nlm.nih.gov/pubmed/35344349
Print-ISSN:: 0003-2700
TUM institution:: Else Kröner-Fresenius-Zentrum für Ernährungsmedizin - Klinik für Ernährungsmedizin