Statistical prediction of microbial metabolic traits from genomes

Li, Zeqian and Selim, Ahmed and Kuehn, Seppe and Ouzounis, Christos A. (2023) Statistical prediction of microbial metabolic traits from genomes. PLOS Computational Biology, 19 (12). e1011705. ISSN 1553-7358

journal.pcbi.1011705.pdf - Published Version

Abstract

The metabolic activity of microbial communities is central to their role in biogeochemical cycles, human health, and biotechnology. Despite the abundance of sequencing data characterizing these consortia, it remains a serious challenge to predict microbial metabolic traits from sequencing data alone. Here we culture 96 bacterial isolates individually and assay their ability to grow on 10 distinct compounds as a sole carbon source. Using these data as well as two existing datasets, we show that statistical approaches can accurately predict bacterial carbon utilization traits from genomes. First, we show that classifiers trained on gene content can accurately predict bacterial carbon utilization phenotypes by encoding phylogenetic information. These models substantially outperform predictions made by constraint-based metabolic models automatically constructed from genomes. This result solidifies our current knowledge about the strong connection between phylogeny and metabolic traits. However, phylogeny-based predictions fail for taxa that are phylogenetically distant from every strain in the training set. To overcome this, we train improved models that predict carbon utilization traits from gene presence/absence. We show that these models can generalize to taxa that are phylogenetically distant from the training set, either by exploiting biochemical information for feature selection or by having sufficiently large datasets. In the latter case, we provide evidence that a statistical approach can identify putatively mechanistic genes involved in metabolic traits. Our study demonstrates the potential of statistical approaches for predicting microbial phenotypes from genotypes.
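
The contrast drawn in the abstract, between predictions that lean on phylogenetic similarity and predictions restricted to biochemically relevant genes, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the four toy strains, the choice of a 1-nearest-neighbour classifier on Hamming distance, and the function names. The data are deliberately constructed so that accessory genes track growth in opposite directions in the two clades, which is the situation in which whole-genome similarity misleads a model evaluated on a held-out clade.

```python
# Toy data, entirely hypothetical: (strain, clade, gene presence/absence, grows).
# Genes 0-2 mark the clade, gene 3 is the assumed causal gene, and genes 4-5
# are accessory genes whose association with growth flips between clades.
STRAINS = [
    ("a1", "A", (1, 1, 1, 1, 1, 1), 1),
    ("a2", "A", (1, 1, 1, 0, 0, 0), 0),
    ("b1", "B", (0, 0, 0, 1, 0, 0), 1),
    ("b2", "B", (0, 0, 0, 0, 1, 1), 0),
]


def hamming(u, v):
    """Number of genes whose presence/absence differs between two genomes."""
    return sum(x != y for x, y in zip(u, v))


def predict_1nn(train, genes, columns):
    """Return the trait of the training strain closest on the chosen columns."""
    def dist(row):
        return hamming([row[2][c] for c in columns], [genes[c] for c in columns])
    return min(train, key=dist)[3]


def leave_one_clade_out_accuracy(strains, columns):
    """Hold out each clade in turn, train on the rest, and pool the accuracy."""
    correct = 0
    for clade in sorted({s[1] for s in strains}):
        train = [s for s in strains if s[1] != clade]
        test = [s for s in strains if s[1] == clade]
        correct += sum(predict_1nn(train, s[2], columns) == s[3] for s in test)
    return correct / len(strains)


ALL_GENES = range(6)
CAUSAL_ONLY = [3]  # stands in for biochemically informed feature selection

acc_all = leave_one_clade_out_accuracy(STRAINS, ALL_GENES)
acc_selected = leave_one_clade_out_accuracy(STRAINS, CAUSAL_ONLY)
print(f"all genes: {acc_all:.2f}, selected genes: {acc_selected:.2f}")
```

On this constructed example the whole-genome model misclassifies every held-out strain, while the model restricted to the assumed causal gene classifies all of them correctly. The toy says nothing about real accuracy; it only shows why holding out whole clades, rather than individual strains, is the evaluation that tests generalization to distant taxa.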

Item Type: Article
Subjects: Library Keep > Biological Science
Depositing User: Unnamed user with email support@librarykeep.com
Date Deposited: 10 Apr 2024 13:05
Last Modified: 10 Apr 2024 13:05
URI: http://archive.jibiology.com/id/eprint/2367
