Clinical Factor Analysis and Comparison of Heart Failure Patient Prediction Models Using Logistic Regression and XGBoost

  • Novri Suhermi Institut Teknologi Sepuluh Nopember, Indonesia https://orcid.org/0000-0002-8016-5803
  • Rahida R. Aisy Institut Teknologi Sepuluh Nopember, Indonesia https://orcid.org/0009-0006-5379-8185
  • Auriga Wiradhiani Institut Teknologi Sepuluh Nopember, Indonesia
  • Edvina Kresnaningrum Institut Teknologi Sepuluh Nopember, Indonesia
  • Fathin Sahirah Institut Teknologi Sepuluh Nopember, Indonesia
  • Faza Inayatulloh Institut Teknologi Sepuluh Nopember, Indonesia
  • Grahsaro Y. Teduhati Institut Teknologi Sepuluh Nopember, Indonesia
  • Hollyviar R. P. Zalukhu Institut Teknologi Sepuluh Nopember, Indonesia
  • Muhammad R. Insan Institut Teknologi Sepuluh Nopember, Indonesia
  • Regytha P. Ayuningtyas Institut Teknologi Sepuluh Nopember, Indonesia
  • Yoel P. Simamora Institut Teknologi Sepuluh Nopember, Indonesia
  • Linda D. Rahmawati Institut Teknologi Sepuluh Nopember, Indonesia
  • Zelika A. Rachman Institut Teknologi Sepuluh Nopember, Indonesia
  • Syahwalia Asacha Institut Teknologi Sepuluh Nopember, Indonesia
Keywords: Heart failure, logistic regression, machine learning, XGBoost.

Abstract

Heart failure is a serious chronic condition and a leading cause of death globally. Early detection of mortality risk among heart failure patients remains a challenge due to the complexity of clinical data. This study aims to identify the most influential clinical factors associated with patient mortality and to compare the performance of two classification models, Logistic Regression and Extreme Gradient Boosting (XGBoost), in predicting death risk. The dataset includes clinical and demographic variables of heart failure patients. Key predictors identified include serum creatinine, ejection fraction, time, and age, which are clinically associated with kidney function, cardiac output, and treatment progression. These features were selected based on their relevance and contribution to the model’s predictive performance. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUC. Results indicate that XGBoost slightly outperformed Logistic Regression in accuracy (85% vs. 83%) and recall (63% vs. 58%). However, Logistic Regression achieved a higher AUC (0.88) and showed more consistent results between training and testing data. Its interpretability also makes it more appropriate for clinical applications. This study underscores the potential of data-driven approaches in enhancing risk stratification and guiding early interventions in heart failure management.
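
The five evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the function names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration and are not taken from the study's data. AUC is computed here with the pairwise (rank-based) definition, in which ties count as half.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = death event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1]  # predicted death probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank patients. This is why one model can lead on accuracy and recall while the other has the higher AUC.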

Published
2026-05-26