Metabolites, such as sugars, amino acids, fatty acids, and vitamins, are the intermediate molecules of metabolic processes and can now be measured in a high-throughput manner. Since metabolites are strongly interconnected in a biochemical reaction network, the measured metabolite concentrations are not independent. In this thesis, we evaluated statistical methods to reconstruct metabolic pathway reactions using large-scale metabolomics data from population cohorts.
The main focus of this work was on Gaussian graphical models (GGMs). GGMs are network models based on partial correlation coefficients, a statistical measure that can distinguish direct from indirect associations in the data. In a first study, we showed that the networks reconstructed by GGMs indeed correspond to real metabolic pathways.
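The distinction between direct and indirect associations can be illustrated with a minimal sketch. It is not the thesis implementation: the function names and the simulated three-metabolite chain A -> B -> C are assumptions made purely for illustration. Full-order partial correlations are obtained from the inverse of the correlation matrix P as rho_ij = -P_ij / sqrt(P_ii * P_jj).

```python
import math
import random


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long variable columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in centered]
    p = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    p = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(p)]
        for i, row in enumerate(matrix)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def partial_correlations(columns):
    """Full-order partial correlations from the inverse correlation matrix."""
    precision = invert(correlation_matrix(columns))
    p = len(precision)
    return [
        [
            1.0 if i == j
            else -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def simulate_chain(n=2000, seed=0):
    """Hypothetical toy data: A feeds B, B feeds C; A and C are linked only via B."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [x + rng.gauss(0.0, 1.0) for x in a]
    c = [x + rng.gauss(0.0, 1.0) for x in b]
    return [a, b, c]


if __name__ == "__main__":
    data = simulate_chain()
    print("Pearson A-C:", round(correlation_matrix(data)[0][2], 3))
    print("Partial A-C given B:", round(partial_correlations(data)[0][2], 3))
```

In this toy chain, A and C show a clear pairwise correlation, whereas their partial correlation given B is close to zero, so a network built on partial correlations would retain only the A-B and B-C edges. With real metabolomics data, where the number of metabolites can approach the number of samples, a regularized estimator would typically be needed instead of direct matrix inversion.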
We then applied metabolomics GGMs to various biological questions. First, GGMs were used in combination with large-scale genotyping data to elucidate the identity of experimentally unidentified metabolites. Since a GGM automatically places unannotated substances into their respective metabolic network context, it provides hints about the metabolic reactions and pathways in which an unknown metabolite might be involved. With this approach, we successfully identified nine unknown metabolites, some of which had previously been reported to be associated with disease-related traits. Second, we integrated the GGMs with results from differential concentration analyses of gender-specific differences and of the influences of fat-free body mass and type D personality on the metabolome. By combining classical statistical methods with the network-based GGM approach, we pinpointed specific changes in the metabolic pathways for the respective phenotypic traits.
In the last part of the thesis, we extended the purely covariance-based analysis of metabolite dependencies by integrating higher-order statistical moments using independent component analysis (ICA). We showed that the reconstructed statistically independent metabolite profiles contain strong signatures of specific metabolic pathways, including amino acid metabolism, lipid metabolism, and energy metabolism. Furthermore, the strength of a specific independent component in the study participants appeared to represent a strong biomarker for blood HDL (high-density lipoprotein) levels.
In summary, this thesis establishes statistical methods for metabolomics data and provides various suggestions for the advanced biological analysis of large-scale datasets.