Comparison of LASSO, Ridge, and Elastic Net Regularization with Balanced Bagging Classifier

Keywords: Balanced Bagging Classifier, Binary Classification, Drug-Induced Autoimmunity, Feature Selection

Abstract

Predicting Drug-Induced Autoimmunity (DIA) is crucial in pharmaceutical safety assessment, because early identification of compounds with autoimmune risk can prevent adverse drug reactions and improve patient outcomes. Classification becomes difficult when the number of predictor variables exceeds the number of observations or when strong correlations among predictors cause multicollinearity and overfitting. Regularization methods such as Ridge regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net stabilize parameter estimation and improve model interpretability. This study builds a binary classification model that predicts DIA risk from 196 molecular descriptors derived from chemical compound structures. To address class imbalance in the response variable, the Balanced Bagging Classifier (BBC) is combined with regularized logistic regression models. Elastic Net + BBC achieves the highest accuracy (0.825), followed closely by LASSO + BBC and Ridge + BBC (both 0.816). Combining BBC with regularization not only improves classification accuracy but also enhances generalization and the reliable detection of minority-class instances, supporting the early identification of autoimmune risks in drug discovery.
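
To make the combination of balanced bagging and regularized logistic regression concrete, the following is a minimal pure-Python sketch and not the authors' implementation, which the abstract does not describe. It assumes an elastic-net penalty of the form alpha * (l1_ratio * ||w||_1 + (1 - l1_ratio) / 2 * ||w||_2^2), fits each base model by proximal gradient descent, and balances each bootstrap sample by drawing the same number of cases from both classes with replacement. All function names, hyperparameter values, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_logistic(X, y, alpha=0.01, l1_ratio=0.5, lr=0.1, epochs=200):
    # Elastic-net logistic regression by proximal gradient descent.
    # l1_ratio=1.0 is a pure L1 (LASSO) penalty, 0.0 a pure L2 (Ridge) penalty.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        for j in range(p):
            # Gradient step on the log-loss plus the smooth L2 part ...
            step = w[j] - lr * (grad_w[j] + alpha * (1.0 - l1_ratio) * w[j])
            # ... followed by soft-thresholding for the L1 part.
            w[j] = soft_threshold(step, lr * alpha * l1_ratio)
    return w, b


def balanced_bootstrap(y, rng):
    # Draw equally many indices (with replacement) from each class, so the
    # majority class is undersampled to the size of the minority class.
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    k = min(len(pos), len(neg))
    return rng.choices(pos, k=k) + rng.choices(neg, k=k)


def fit_balanced_bagging(X, y, n_estimators=10, seed=0, **penalty):
    # Train one regularized logistic model per balanced bootstrap sample.
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = balanced_bootstrap(y, rng)
        models.append(fit_logistic([X[i] for i in idx], [y[i] for i in idx], **penalty))
    return models


def predict_proba(models, x):
    # Average the positive-class probabilities of all base models.
    probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for w, b in models]
    return sum(probs) / len(probs)


# Synthetic, imbalanced toy data (40 negatives, 8 positives, 3 features).
# Only the first feature is informative; this is NOT the study's data set.
data_rng = random.Random(42)
X = [[data_rng.gauss(-1.5, 0.5), data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
X += [[data_rng.gauss(1.5, 0.5), data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(8)]
y = [0] * 40 + [1] * 8

models = fit_balanced_bagging(X, y, n_estimators=10, seed=0, alpha=0.01, l1_ratio=0.5)
print(predict_proba(models, [2.0, 0.0, 0.0]))  # averaged positive-class probability
```

Setting `l1_ratio=1.0` or `l1_ratio=0.0` in the call above gives the LASSO + BBC and Ridge + BBC variants of the same sketch. In practice a library implementation of both the balanced bagging ensemble and the penalized logistic regression would normally be preferred over hand-written code like this.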

Published
2025-09-16