Comparison of LASSO, Ridge, and Elastic Net Regularization with Balanced Bagging Classifier
Abstract
Predicting drug-induced autoimmunity (DIA) is crucial in pharmaceutical safety assessment, because early identification of compounds that carry autoimmune risk can prevent adverse drug reactions and improve patient outcomes. Classification becomes difficult when the number of predictor variables exceeds the number of observations or when strong correlations among predictors cause multicollinearity and overfitting. Regularization methods such as Ridge Regression, the Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net stabilize parameter estimation and improve model interpretability. This study builds a binary classification model that predicts DIA risk from 196 molecular descriptors derived from chemical compound structures. To address class imbalance in the response variable, the Balanced Bagging Classifier (BBC) is combined with regularized logistic regression models. Elastic Net + BBC achieves the highest accuracy (0.825), followed closely by LASSO + BBC and Ridge + BBC (both 0.816). Combining BBC with regularization not only improves classification accuracy but also enhances generalization and the detection of minority-class instances, supporting the early identification of autoimmune risk in drug discovery.
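As an illustration only, the following Python sketch shows how balanced bagging can be combined with a regularized logistic regression base learner. It is not the authors' implementation: it uses only the standard library, and the function names, the hyperparameters (`lam`, `alpha`, `lr`, `n_iter`), the choice of undersampling the majority class without replacement, and the averaging of predicted probabilities are all assumptions made for the example. Setting `alpha=1` corresponds to a LASSO penalty, `alpha=0` to a Ridge penalty, and intermediate values to Elastic Net.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero by `threshold`."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5, lr=0.1, n_iter=300):
    """Logistic regression with penalty lam * (alpha*|w|_1 + (1-alpha)/2*|w|_2^2).

    Fitted by proximal gradient descent; the intercept b is not penalized.
    All default values are illustrative assumptions.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        for j in range(p):
            # Gradient step on the smooth part (log-loss plus ridge term) ...
            step = w[j] - lr * (grad_w[j] + lam * (1 - alpha) * w[j])
            # ... then the L1 proximal step, which can set weights exactly to zero.
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b


def balanced_bootstrap(y, rng):
    """Row indices with equal class counts, undersampling the majority class."""
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    k = min(len(pos), len(neg))
    return rng.sample(pos, k) + rng.sample(neg, k)


def fit_balanced_bagging(X, y, n_estimators=10, seed=0, **fit_kwargs):
    """Fit one regularized model per balanced resample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = balanced_bootstrap(y, rng)
        X_bal = [X[i] for i in idx]
        y_bal = [y[i] for i in idx]
        models.append(fit_elastic_net_logistic(X_bal, y_bal, **fit_kwargs))
    return models


def predict_proba(models, x):
    """Average the positive-class probabilities of the ensemble members."""
    probs = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for w, b in models]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Hypothetical toy data (not the DIA dataset): 20 positives, 180 negatives.
    rng = random.Random(42)
    y = [1] * 20 + [0] * 180
    X = [[rng.gauss(1.0 if label else -1.0, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)]
         for label in y]
    models = fit_balanced_bagging(X, y, n_estimators=10, lam=0.1, alpha=0.5)
    print("P(positive) for first row:", round(predict_proba(models, X[0]), 3))
```

Because every base model is trained on a resample with equal class counts, no single model is dominated by the majority class, while the penalty controls how strongly the coefficients are shrunk. In practice a maintained library implementation with cross-validated penalty strength would normally be preferred over a hand-written loop like this one.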
Copyright (c) 2025 Putri Nisrina Az-Zahra, Kusman Sadik, Cici Suhaeni, Agus Mohamad Soleh

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.