CLASSIFICATION ANALYSIS USING BOOTSTRAP AGGREGATING MULTIVARIATE ADAPTIVE REGRESSION SPLINE (BAGGING MARS)
Abstract
Classification analysis models the relationship between several predictor variables and a response variable in order to predict the class of an object whose label is unknown. A classification problem arises when observations belong to one or more categories that cannot be identified directly and must instead be inferred from measurements. Multivariate adaptive regression splines (MARS) is a classification method designed to handle high dimensionality and discontinuities in data. The classification accuracy of MARS can be improved with a resampling method, namely bootstrap aggregating (bagging). This study applies the MARS model to classify a person's diabetes status. The data are secondary data, the Pima Indians Diabetes Database, obtained from the Kaggle website at https://www.kaggle.com/uciml/pima-indians-diabetes-database and processed using R software. The MARS modeling results report a probability of 0 for a person having diabetes and a probability of 1 for a person not having diabetes, with a classification accuracy of 81.38%. In contrast, the best accuracy of the bagging MARS method among 200 replications is 75.23%, so in this study MARS without bagging is the more appropriate method for classifying diabetes status.
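The bagging step described in the abstract can be illustrated with a minimal sketch. It assumes a stand-in base learner built from a single MARS-style hinge function rather than a full MARS fit, and it uses hypothetical toy data instead of the Pima Indians Diabetes Database. The function names (`hinge`, `fit_stump`, `bagging_fit`, `bagging_predict`, `accuracy`) are illustrative and are not taken from the study, which used R; the default of 200 replications only loosely mirrors the number mentioned in the abstract.

```python
import random
from collections import Counter


def hinge(x, knot):
    """MARS-style hinge basis function max(0, x - knot)."""
    return max(0.0, x - knot)


def fit_stump(xs, ys):
    """Stand-in base learner (an assumption, NOT a full MARS fit).

    Tries every observed value as a knot, in both directions, and keeps the
    single-hinge rule with the best training accuracy.
    """
    best = None  # (accuracy, knot, sign)
    for knot in sorted(set(xs)):
        for sign in (1, -1):
            preds = [1 if hinge(sign * x, sign * knot) > 0 else 0 for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
            if best is None or acc > best[0]:
                best = (acc, knot, sign)
    _, knot, sign = best
    return lambda x: 1 if hinge(sign * x, sign * knot) > 0 else 0


def bagging_fit(xs, ys, n_replications=200, seed=0):
    """Fit one base learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_replications):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Aggregate the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(models, xs, ys):
    """Share of observations whose voted class matches the true label."""
    hits = sum(bagging_predict(models, x) == y for x, y in zip(xs, ys))
    return hits / len(ys)


# Hypothetical toy data: one predictor, binary label (1 = positive class).
toy_x = [1, 2, 3, 4, 5, 6, 7, 8]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(toy_x, toy_y, n_replications=200, seed=0)
print(bagging_predict(models, 1), bagging_predict(models, 8))
print(accuracy(models, toy_x, toy_y))
```

Classification accuracy here is the share of observations whose voted class matches the true label, which is the kind of figure the abstract reports as a percentage. A comparison like the one in the abstract would compute the same measure for a single model and for the bagged ensemble on the same data.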
Copyright (c) 2024 Rina Apriany Helen Wite Rupilu, Dedi Rosadi
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.