Evaluating the Performance of Ordinal Logistic Regression and XGBoost on Ordinal Classification Datasets

  • Jasmin Nur Hanifa, IPB University, Indonesia
  • Rizka Annisa Mingka, IPB University, Indonesia
  • Indahwati Indahwati, IPB University, Indonesia
  • Pika Silvianti, IPB University, Indonesia
Keywords: Classification, Ordinal Logistic Regression, Ordinal XGBoost.

Abstract

Choosing an appropriate classification model is crucial, especially when the dependent variable is ordinal. This study compares the performance of Ordinal Logistic Regression (OLR) and Ordinal XGBoost in classifying ordinal data, using ten datasets from the UCI Machine Learning Repository and Kaggle that vary in the number of observations and features. Each dataset undergoes multicollinearity detection, an 80%/20% training/testing split, and class balancing using SMOTE. Model performance is evaluated using accuracy, F1-score, AUC, MSE, precision, and recall. The results show that Ordinal XGBoost outperforms OLR on datasets with complex structures and a higher number of features, achieving a maximum accuracy of 0.953. In contrast, OLR demonstrates more stable performance on datasets with fewer features or balanced class distributions.
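
As an illustration of the evaluation step described above, the following is a minimal sketch in plain Python of an 80%/20% split and several of the listed metrics computed on ordinal labels. It is not the authors' code. The function names, the integer coding of the ordinal classes, the use of macro-averaging for precision, recall, and F1-score, and the toy labels are all assumptions made for illustration; the abstract does not state how these were handled in the study.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle rows and split them into 80% training and 20% testing parts."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Share of observations whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error over integer-coded ordinal classes (an assumption).

    Unlike accuracy, this penalizes a miss by two levels more than a miss by one.
    """
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision_recall_f1(y_true, y_pred):
    """Per-class precision, recall, and F1, averaged with equal class weights."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for three ordered classes (0 < 1 < 2); illustrative only,
# not data or results from the study.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

train, test = split_80_20(range(10))
print(len(train), len(test))
print(accuracy(y_true, y_pred), mse(y_true, y_pred))
print(macro_precision_recall_f1(y_true, y_pred))
```

Reporting MSE alongside accuracy is what makes the evaluation sensitive to the class ordering: the last toy prediction (class 0 for a true class 2) counts as a single error for accuracy but contributes a squared distance of 4 to the MSE.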

Published
2025-12-20