Clinical Factor Analysis and Comparison of Heart Failure Patient Prediction Models Using Logistic Regression and XGBoost
Abstract
Heart failure is a serious chronic condition and a leading cause of death globally. Early detection of mortality risk among heart failure patients remains a challenge due to the complexity of clinical data. This study aims to identify the most influential clinical factors associated with patient mortality and to compare the performance of two classification models, Logistic Regression and Extreme Gradient Boosting (XGBoost), in predicting death risk. The dataset includes clinical and demographic variables of heart failure patients. Key predictors identified include serum creatinine, ejection fraction, time, and age, which are clinically associated with kidney function, cardiac output, and treatment progression. These features were selected based on their relevance and contribution to the models’ predictive performance. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUC. Results indicate that XGBoost slightly outperformed Logistic Regression in accuracy (85% vs. 83%) and recall (63% vs. 58%). However, Logistic Regression achieved a higher AUC (0.88) and showed more consistent results between training and testing data. Its interpretability also makes it more appropriate for clinical applications. This study underscores the potential of data-driven approaches in enhancing risk stratification and guiding early interventions in heart failure management.
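As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, F1-score, and AUC for a binary death-risk classifier. It is not the study's code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the numbers it prints are unrelated to the reported results.

```python
import math


def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute common binary classification metrics.

    y_true:  list of 0/1 labels (1 = death event; hypothetical encoding).
    y_score: list of predicted probabilities for class 1.
    threshold: assumed cut-off used to turn scores into class predictions.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # AUC as the probability that a random positive case receives a higher
    # score than a random negative case (ties count as one half).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
                   for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = math.nan  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "auc": auc}


# Toy data for illustration only; not taken from the study's dataset.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.05]

metrics = classification_metrics(y_true, y_score)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike the other four metrics, AUC does not depend on the threshold, which is one reason a model can rank lower on accuracy and recall at a fixed cut-off yet still achieve the higher AUC.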
Copyright (c) 2026 Novri Suhermi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.