EVALUATION OF MULTIVARIATE ADAPTIVE REGRESSION SPLINES ON IMBALANCED DATASET FOR POVERTY CLASSIFICATION IN BENGKULU PROVINCE
Abstract
Classification is a statistical method that aims to predict the class of an object whose class label is unknown. Multivariate Adaptive Regression Splines (MARS) classification builds a model from several basis functions of the influential predictor variables. The MARS classification model is generally effective in classifying imbalanced data, including poverty data. In this study, the response variable is household poverty status (poor or non-poor), and the predictor variables are several poverty indicators. A problem that often arises in classification is class imbalance in the response variable. Because poverty status is class-imbalanced, the Bootstrap Aggregating (Bagging) and Synthetic Minority Over-sampling Technique (SMOTE) approaches are used to improve the classification accuracy of the MARS model. Bagging draws repeated bootstrap replicates of the data to stabilize classification accuracy, while SMOTE synthesizes new observations from the minority class. The evaluation shows that, for poverty classification in Bengkulu Province, SMOTE-MARS gives the best performance as measured by sensitivity: 85.36%, compared with 25.81% for MARS and 32.26% for Bagging-MARS.
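As an illustration only, the three ingredients named in the abstract (sensitivity, bootstrap replication, and SMOTE-style synthesis) can be sketched in a few lines of Python. This is not the authors' implementation: the function names `sensitivity`, `bootstrap_indices`, and `smote_like_oversample` are hypothetical, the MARS model itself is not fitted here, and the choice of `k = 5` neighbours is an assumption.

```python
import numpy as np


def sensitivity(y_true, y_pred, positive=1):
    """True positive rate: TP / (TP + FN) for the class labelled `positive`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    actual_pos = y_true == positive
    if not actual_pos.any():
        raise ValueError("no positive cases in y_true")
    return float(np.mean(y_pred[actual_pos] == positive))


def bootstrap_indices(n_rows, n_replicates, seed=0):
    """Row indices for bagging: each replicate samples n_rows rows with replacement."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_rows, size=(n_replicates, n_rows))


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a randomly
    chosen minority row and one of its k nearest minority neighbours."""
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority rows")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances among minority rows.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a row is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                    # starting rows
    nb = neighbours[base, rng.integers(0, k, size=n_new)]    # one neighbour each
    gap = rng.random((n_new, 1))                             # position along the segment
    return X_min[base] + gap * (X_min[nb] - X_min[base])
```

In this sketch, taking "poor" as the positive class means sensitivity is the share of truly poor households that the classifier labels as poor, which is why it is a natural criterion when the poor class is the minority.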
Copyright (c) 2025 Idhia Sriliana, Sigit Nugroho, Winalia Agwil, Esther Damayanti Sihombing

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work (e.g. acknowledgement of its initial publication in this journal).
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.