A COMPARISON OF RANDOM FOREST AND DOUBLE RANDOM FOREST: DROPOUT RATES OF MADRASAH STUDENTS IN INDONESIA
Abstract
The random forest algorithm improves on single CART models by combining many trees. However, the method is prone to underfitting, especially for small node sizes, and the double random forest method was developed to overcome this problem. This study uses Education Management Information System (EMIS) data on the incidence of school dropout. The data comprise two sets: dropout data for MTs and for MA. Each data set was first modelled with the random forest algorithm and then evaluated with the double random forest method. The results show that the double random forest algorithm handles the underfitting case well, while in the well-fitted case the goodness-of-fit values of the two methods are nearly the same. The results also indicate that school quality is a higher priority at the MTs level than at the MA level, whereas family factors are more important at the MA level. Although the total number of factors used is essentially the same, the two school levels differ in which variables are relevant. No forecasting was done in this study because the methodology used two different types of data.
Copyright (c) 2025 Arie Purwanto, Bagus Sartono, Khairil Anwar Notodiputro
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.