An Auxiliary Approach to Prediction of Binary Outcome with Bayesian Network Model: Exploration with Data for Recurrence of Breast Cancer

Ganapathy, Sachit and Harichandrakumar, KT and Tamilarasu, Kadhiravan and Penumadu, Prasanth and Nair, N Sreekumaran (2023) An Auxiliary Approach to Prediction of Binary Outcome with Bayesian Network Model: Exploration with Data for Recurrence of Breast Cancer. JOURNAL OF CLINICAL AND DIAGNOSTIC RESEARCH, 17 (3). YC06-YC10. ISSN 2249-782X

Abstract

Introduction: Logistic regression is the classical statistical model used to predict a binary outcome variable. It rests on theoretical assumptions of independence among the predictor variables and a linear association between the predictors and the outcome on the log-odds scale. Alternative models developed in the machine learning context, such as the Naïve Bayes model, which makes similar assumptions, and the Bayesian Network (BN) model, can also be used for binary prediction.

Aim: To compare the predictive performance of logistic regression, Naïve Bayes and BN models in predicting the recurrence of breast cancer.

Materials and Methods: The dataset on recurrence of breast cancer was obtained from the UCI Machine Learning Repository. The study was carried out on retrospective data from December 2021 to July 2022. The sample size was increased by bootstrapping with a logistic regression model. The dataset was split into training (70%) and testing (30%) datasets for internal validation. The effects of the potential prognostic variables were estimated using a multiple logistic regression model. Naïve Bayes and BN models were also learnt from the training dataset. The indices of predictive accuracy were estimated for all models in both the training and testing datasets.
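
The 70%/30% split and the Naïve Bayes step described in the Materials and Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names, the Laplace smoothing parameter `alpha`, the random seed, and the ten toy records in `toy_rows` are all hypothetical and chosen only for illustration. The features mimic two of the variables named in the abstract (degree of malignancy and side of affected breast).

```python
import math
import random
from collections import Counter, defaultdict


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_naive_bayes(rows, alpha=1.0):
    """Count classes and per-class feature values from (features, label) pairs."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    levels = defaultdict(set)            # feature index -> values seen in training
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            levels[i].add(value)
    return {"classes": class_counts, "values": value_counts,
            "levels": levels, "alpha": alpha}


def predict_naive_bayes(model, features):
    """Return the label with the highest log posterior, treating features as
    conditionally independent given the class (the Naive Bayes assumption)."""
    total = sum(model["classes"].values())
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_label in model["classes"].items():
        score = math.log(n_label / total)  # log prior
        for i, value in enumerate(features):
            count = model["values"][(i, label)][value]
            n_levels = len(model["levels"][i])
            # Laplace smoothing keeps unseen values from giving zero probability
            score += math.log((count + alpha) / (n_label + alpha * n_levels))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records: (degree of malignancy, side of breast) -> outcome.
# These are NOT taken from the UCI dataset.
toy_rows = [
    (("3", "left"), "recurrence"),
    (("3", "left"), "recurrence"),
    (("3", "right"), "recurrence"),
    (("2", "left"), "recurrence"),
    (("1", "right"), "no-recurrence"),
    (("1", "right"), "no-recurrence"),
    (("2", "right"), "no-recurrence"),
    (("1", "left"), "no-recurrence"),
    (("2", "right"), "no-recurrence"),
    (("1", "right"), "no-recurrence"),
]

train_rows, test_rows = train_test_split(toy_rows, train_fraction=0.7, seed=0)
model = fit_naive_bayes(train_rows)
test_predictions = [predict_naive_bayes(model, f) for f, _ in test_rows]
```

The sketch covers only the split and the Naïve Bayes classifier; learning a BN structure or fitting a logistic regression would normally rely on a dedicated statistical library.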

Results: Degree of malignancy and side of the affected breast were found to be significant predictors of recurrence of breast cancer. The BN model had the lowest misclassification rate and the highest sensitivity of the models compared, despite the imbalance in the outcome variable.
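
The two indices named in the Results have standard definitions: the misclassification rate is (FP + FN) / N, and sensitivity is TP / (TP + FN). A minimal sketch of how they could be computed from predicted and observed labels follows. The function names and the label strings are hypothetical, and the example values are illustrative rather than results from the paper.

```python
def confusion_counts(actual, predicted, positive="recurrence"):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def predictive_indices(actual, predicted, positive="recurrence"):
    """Return misclassification rate, sensitivity and specificity."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "misclassification_rate": (fp + fn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Illustrative imbalanced outcome: 3 recurrences among 10 hypothetical patients.
actual = ["recurrence"] * 3 + ["no-recurrence"] * 7
# A classifier that always predicts the majority class.
always_negative = ["no-recurrence"] * 10
indices = predictive_indices(actual, always_negative)
```

With an imbalanced outcome, a classifier that always predicts the majority class can still have a modest misclassification rate while detecting no recurrences at all, which is why sensitivity is reported alongside it.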

Conclusion: The BN model outperformed the logistic regression model when the assumptions of logistic regression were violated and the outcome classes were imbalanced.

Item Type: Article
Subjects: Library Keep > Medical Science
Depositing User: Unnamed user with email support@librarykeep.com
Date Deposited: 27 Jun 2023 06:59
Last Modified: 02 Dec 2023 05:57
URI: http://archive.jibiology.com/id/eprint/1228
