Comparison of Adaboost Application to C4.5 and C5.0 Algorithms in Student Graduation Classification
Abstract
Student outcomes are a benchmark used to assess the quality of a university and to evaluate its learning plans, so students who do not graduate on time can affect accreditation assessment. The characteristics of students who graduate on time or late can be analyzed using classification techniques in data mining, namely the C4.5 and C5.0 algorithms. The purpose of this study is to compare the application of the Adaboost algorithm to the C4.5 and C5.0 algorithms in the classification of student graduation. The data are the graduation records of students of the Statistics Study Program at Tanjungpura University (Untan) from Period I of the 2017/2018 academic year to Period II of the 2022/2023 academic year. The analysis begins by calculating the entropy, gain, and gain ratio values. Each observation is then given the same initial weight, and the boosting procedure is iterated 100 times. In the classification results of the C5.0 algorithm, the attribute with the highest gain ratio is school accreditation, meaning that this attribute has the most influence on the classification of student graduation. Applying Adaboost to the C5.0 algorithm gives better results than applying it to the C4.5 algorithm in classifying the graduation of students of the Untan Statistics Study Program: Adaboost increased the accuracy of the C5.0 algorithm by 12.14% and the accuracy of the C4.5 algorithm by 10.71%.
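The quantities named in the abstract (entropy, gain, gain ratio, and equal initial weights that are updated at each boosting iteration) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy attribute values, and the labels are hypothetical, and the weight update shown is the textbook binary Adaboost rule, which may differ in detail from the boosting built into C5.0 software.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (base 2) of a list of categorical values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of a categorical attribute divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    return gain / split_info if split_info > 0 else 0.0

def adaboost_reweight(weights, correct):
    """One textbook Adaboost round; assumes the weighted error is in (0, 0.5)."""
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this round's tree
    raised = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raised)
    return alpha, [w / total for w in raised]

# Hypothetical toy data: school accreditation grade vs. graduation status.
accreditation = ["A", "A", "B", "B"]
graduation = ["on time", "on time", "late", "late"]
ratio = gain_ratio(accreditation, graduation)

# Every observation starts with the same weight, as described in the abstract.
initial = [1 / 4] * 4
alpha, updated = adaboost_reweight(initial, [True, True, True, False])
```

In this toy case the attribute separates the classes perfectly, so the gain ratio is 1, and after one round the single misclassified observation carries half of the total weight. Repeating the reweighting step for 100 rounds, refitting a tree on the reweighted data each time, corresponds to the iteration count mentioned in the abstract.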
Copyright (c) 2023 Yuveinsiana Crismayella, Neva Satyahadewi, Hendra Perdana
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The author(s) hold the copyright of published articles without restriction. This policy means that the journal allows the author(s) to hold and retain publishing rights without restrictions. Journal editors are granted the right to publish articles in accordance with the agreement signed by the author(s), which also includes a statement of originality of the article.