Evaluating the Performance of Ordinal Logistic Regression and XGBoost on Ordinal Classification Datasets
Abstract
Choosing an appropriate classification model is crucial, especially when the dependent variable is ordinal. This study compares the performance of Ordinal Logistic Regression (OLR) and Ordinal XGBoost in classifying ordinal data, using ten datasets from the UCI Machine Learning Repository and Kaggle that vary in the number of observations and features. Each dataset undergoes multicollinearity detection, an 80%/20% train/test split, and class balancing with SMOTE. Model performance is evaluated using accuracy, F1-score, AUC, MSE, precision, and recall. The results show that Ordinal XGBoost outperforms OLR on datasets with complex structures and a larger number of features, reaching a maximum accuracy of 0.953. In contrast, OLR performs more stably on datasets with fewer features or balanced class distributions.
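As a minimal illustration of why MSE can be informative alongside accuracy for ordinal targets, the sketch below assumes the classes are encoded as consecutive integers. The function names and toy labels are hypothetical and are not taken from the study; this is not the authors' evaluation code.

```python
# Sketch only: assumes ordinal classes are encoded as integers 0, 1, 2, ...
# Function names and the toy labels are illustrative, not from the study.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def ordinal_mse(y_true, y_pred):
    """Mean squared distance between predicted and true class indices."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [0, 1, 2, 3, 2, 1]
near_miss = [0, 1, 2, 2, 2, 1]  # one error, one class away from the truth
far_miss = [0, 1, 2, 0, 2, 1]   # one error, three classes away from the truth

for name, y_pred in [("near_miss", near_miss), ("far_miss", far_miss)]:
    print(name, accuracy(y_true, y_pred), ordinal_mse(y_true, y_pred))
```

Both prediction sets have the same accuracy, but the MSE is larger when the single error lands far from the true class, which is the ordering information that accuracy alone ignores.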
Copyright (c) 2025 Jasmin Nur Hanifa, Rizka Annisa Mingka, Indahwati Indahwati, Pika Silvianti

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.