CLASSIFICATION ANALYSIS USING BOOTSTRAP AGGREGATING MULTIVARIATE ADAPTIVE REGRESSION SPLINE (BAGGING MARS)

  • Rina Apriany Helen Wite Rupilu Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Indonesia https://orcid.org/0009-0001-5199-9876
  • Dedi Rosadi Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Indonesia
Keywords: Bagging, Classification, Diabetes Status, MARS

Abstract

Classification analysis studies the relationship between several predictor variables and a response variable in order to predict the class of an object whose label is unknown. A classification problem arises when objects fall into one or more categories that cannot be observed directly and must instead be inferred from measurements. Multivariate Adaptive Regression Splines (MARS) is a classification method designed to handle high-dimensional data and discontinuities in the data. The classification accuracy of MARS can be improved with a resampling method, namely bootstrap aggregating (bagging). This study applies MARS and bagging MARS to build a model for classifying a person's diabetes status. The data are secondary data, the Pima Indians Diabetes Database, obtained from the Kaggle website at https://www.kaggle.com/uciml/pima-indians-diabetes-database and processed using R software. The MARS modeling results concluded that the probability of someone having diabetes is 0 and the probability of someone not having diabetes is 1, with a classification accuracy of 81.38%. In contrast, the best bagging MARS model among 200 replications achieved an accuracy of 75.23%, so in this study MARS without bagging is the more appropriate method for classifying diabetes status.
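The bagging step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' R workflow: a one-feature threshold rule stands in for the MARS base learner, the data are synthetic rather than the Pima Indians Diabetes Database, and 15 replications are used instead of 200 to keep the run short. The names `fit_stump`, `bagging_fit`, `bagging_predict`, and `accuracy` are hypothetical helpers introduced only for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-feature threshold rule (a simple stand-in for a MARS fit)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [1 if sign * (row[j] - t) > 0 else 0 for row in X]
                acc = sum(p == label for p, label in zip(preds, y)) / len(y)
                if best is None or acc > best[0]:
                    best = (acc, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0


def bagging_fit(X, y, n_replications, rng):
    """Fit one base model per bootstrap sample (drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_replications):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Aggregate the base models by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(models, X, y):
    """Share of observations whose voted class matches the true label."""
    hits = sum(bagging_predict(models, row) == label for row, label in zip(X, y))
    return hits / len(y)


# Synthetic two-class data (assumption: not the diabetes data set).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
y = [1 if row[0] + 0.3 * rng.gauss(0, 1) > 0 else 0 for row in X]

# An odd number of replications avoids tied votes in a two-class problem.
models = bagging_fit(X, y, n_replications=15, rng=rng)
acc = accuracy(models, X, y)
print(f"Bagged accuracy on the training data: {acc:.2%}")
```

Accuracy here is computed on the training data for brevity; comparing a single model against its bagged version, as the study does, would normally use held-out data.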

Published
2024-07-31
How to Cite
R. Rupilu and D. Rosadi, “CLASSIFICATION ANALYSIS USING BOOTSTRAP AGGREGATING MULTIVARIATE ADAPTIVE REGRESSION SPLINE (BAGGING MARS)”, BAREKENG: J. Math. & App., vol. 18, no. 3, pp. 1381-1390, Jul. 2024.