IMPACT OF FEATURE SELECTION ON DECISION TREE AND RANDOM FOREST FOR CLASSIFYING STUDENT STUDY SUCCESS
Abstract
Technological advances have had a profound effect on education. Higher education in particular contributes to quality of life, and student success is one of its key indicators. This study investigates how feature selection affects the performance of two machine learning models, Decision Tree and Random Forest, in classifying student academic success. Using a dataset of 19,061 students, the research aims to identify the variables that most influence classification outcomes. Feature selection was performed with LASSO regression, yielding a reduced set of critical predictors. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to improve the representation of underrepresented classes. Both models were trained on the balanced data and evaluated using accuracy, precision, recall, and F1-score. The Random Forest model achieved higher accuracy (96.41%) than the Decision Tree (67.15%), as well as higher AUC values. SHAP (SHapley Additive exPlanations) was used to enhance model interpretability. The study underscores the utility of machine learning techniques in educational analytics and supports data-driven decision-making to improve student achievement.
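The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the general idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy feature values are all hypothetical and are not taken from the study's data or code.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two numeric features per student (illustration only).
majority = [(3.5, 140.0), (3.2, 138.0), (3.8, 144.0),
            (3.1, 130.0), (3.6, 142.0), (3.4, 136.0)]
minority = [(2.1, 90.0), (2.4, 100.0), (1.9, 85.0)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into the data used to compute accuracy, precision, recall, and F1-score.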
Copyright (c) 2025 Firdaus Amruzain Satiranandi Wibowo, Heri Retnawati, Muhammad Lintang Damar Sakti, Asma Khoirunnisa, Angella Ananta Batubara, Miftah Okta Berlian, Zulfa Safina Ibrahim, Jailani Jailani, Sumaryanto Sumaryanto, Lantip Diat Prasojo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.