Gene Expression Value Prediction Based on XGBoost Algorithm

Li, Wei and Yin, Yanbin and Quan, Xiongwen and Zhang, Han (2019) Gene Expression Value Prediction Based on XGBoost Algorithm. Frontiers in Genetics, 10. ISSN 1664-8021

Abstract

Gene expression profiling has been widely used to characterize cell status, to reflect the health of the body, and to diagnose genetic diseases. Although the cost of genome-wide expression profiling has been decreasing gradually in recent years, collecting expression profiles for thousands of genes is still very expensive. Because gene expression levels are usually highly correlated in humans, the expression values of the remaining target genes can be predicted from the values of 943 landmark genes. Hence, we designed an algorithm for predicting gene expression values based on XGBoost, which integrates multiple tree models and has stronger interpretability. We tested the performance of the XGBoost model on the GEO and RNA-seq datasets and compared the results with those of other existing models. Experiments showed that the XGBoost model achieved a significantly lower overall error than the existing D-GEX algorithm, linear regression, and KNN methods. In conclusion, the XGBoost algorithm outperforms existing models and will be a significant contribution to the toolbox for gene expression value prediction.
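To make the idea in the abstract concrete, the following is a minimal sketch of tree boosting for regression: several small tree models are fitted in sequence, each to the residuals of the ensemble so far, and their outputs are summed to predict one target value from a handful of input features. It is not the authors' implementation and it does not call the XGBoost library; it is a simplified stand-in using depth-1 trees (stumps) and squared loss, without XGBoost's regularization or second-order terms. The synthetic data, the five "landmark" features, the function names, and the use of mean absolute error as the error measure are all assumptions made for illustration.

```python
import random

def fit_stump(X, residuals):
    """Fit a depth-1 regression tree (one feature, one threshold) to the residuals."""
    n, d = len(X), len(X[0])
    mean = sum(residuals) / n
    best = (float("-inf"), 0, float("inf"), mean, mean)  # fallback: constant stump
    total = sum(residuals)
    for j in range(d):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            # Maximizing this quantity is equivalent to minimizing squared error.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best[1:]  # (feature index, threshold, left value, right value)

def predict_stump(stump, x):
    j, threshold, left_value, right_value = stump
    return left_value if x[j] <= threshold else right_value

def fit_boosted(X, y, n_rounds=50, learning_rate=0.3):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, x) for p, x in zip(pred, X)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

# Synthetic stand-in data (an assumption, not the GEO or RNA-seq data):
# 5 hypothetical "landmark" features, one "target" correlated with two of them.
random.seed(0)

def make_sample():
    x = [random.gauss(0, 1) for _ in range(5)]
    y = 2.0 * x[0] - 1.0 * x[1] + random.gauss(0, 0.1)
    return x, y

train = [make_sample() for _ in range(200)]
test = [make_sample() for _ in range(100)]
X_train, y_train = [s[0] for s in train], [s[1] for s in train]
X_test, y_test = [s[0] for s in test], [s[1] for s in test]

model = fit_boosted(X_train, y_train)
model_mae = mean_absolute_error(y_test, [predict(model, x) for x in X_test])

# Baseline: always predict the training mean of the target.
train_mean = sum(y_train) / len(y_train)
baseline_mae = mean_absolute_error(y_test, [train_mean] * len(y_test))

print(f"boosted stumps MAE: {model_mae:.3f}, mean baseline MAE: {baseline_mae:.3f}")
```

In this toy setting the boosted ensemble should give a lower held-out error than the mean baseline, because the target depends on two of the input features. A setup closer to the one the abstract describes would presumably train a separate regressor per target gene on all 943 landmark genes, with a full XGBoost implementation in place of these stumps.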

Item Type: Article
Subjects: Library Keep > Medical Science
Depositing User: Unnamed user with email support@librarykeep.com
Date Deposited: 17 Feb 2023 12:03
Last Modified: 17 Feb 2024 04:14
URI: http://archive.jibiology.com/id/eprint/149
