A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND NAIVE BAYES CLASSIFICATION USING UNBALANCED DATA HANDLING
Abstract
Classification is a supervised learning method that predicts the class of objects whose labels are unknown. Machine learning classifiers generally perform best when the classes of the response variable are balanced, so class imbalance is a problem that must be addressed. This study handles unbalanced data using the Synthetic Minority Over-sampling Technique (SMOTE). Two popular classification methods are compared: the Naïve Bayes classifier (NB) and the artificial neural network trained with resilient backpropagation (Rprop-ANN). The data come from the Health Nutrition Research and Development Agency (Balitbangkes) and consist of 2499 observations. NB and Rprop-ANN, each combined with SMOTE, are used to classify the incidence of anemia in young women in Indonesia. Models are fitted on 80% of the data (the training set) and evaluated on the remaining 20% (the test set). The analysis shows that applying SMOTE yields better performance than leaving the class imbalance unhandled. Based on the results, the best method for predicting the incidence of anemia is Naïve Bayes, with a sensitivity of 82%.
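To make the two technical ingredients of the abstract concrete, the following is a minimal sketch of the idea behind SMOTE (new minority-class points are interpolated between an existing minority sample and one of its nearest minority neighbours) and of sensitivity (the share of true positive cases that a classifier detects). It is not the authors' implementation: the function names `smote` and `sensitivity`, the parameter defaults, and the toy data are assumptions made purely for illustration, and the sketch assumes at least two minority samples.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority points (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes len(minority) >= 2.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn)


# Hypothetical toy data: 4 minority samples, to be raised to 10 by oversampling.
minority_samples = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority_samples, n_new=6, k=3)
balanced_minority = minority_samples + new_points
```

In a workflow like the one described in the abstract, oversampling of this kind would normally be applied to the training portion only, so that the held-out test set keeps the original class distribution when sensitivity is computed.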
Copyright (c) 2023 Nila Lestari, Indahwati Indahwati, Erfiani Erfiani, Elisa D Julianti
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.