OPTIMIZING HEART ATTACK DIAGNOSIS USING RANDOM FOREST WITH BAT ALGORITHM AND GREEDY CROSSOVER TECHNIQUE
Abstract
Cardiovascular disease is one of the leading causes of death worldwide; the World Health Organization (WHO) reports approximately 17.9 million deaths annually. Swift and accurate diagnosis of heart attacks is crucial so that patients receive timely, specialized intervention. Random Forest is a machine learning algorithm that can be applied to this problem, but its performance depends strongly on the features selected during training. To address this, the Binary Bat Algorithm (BBA) with greedy crossover is used to improve feature selection for the model. This approach is particularly good at keeping the search from getting trapped in local minima. The optimal parameters for BBA with greedy crossover are determined to be , , , and . With these parameters, the proposed algorithm identifies the most relevant features, including age, gender, cp, chol, thalach, oldpeak, slope, and ca, achieving an accuracy of 94.19% on the training data and 91.8% on the test data. Precision and recall for both classes range from 0.87 to 0.96, giving an F1-score of approximately 0.92, which is 0.05 higher than that of the standard Random Forest model. These results underscore the effectiveness of the proposed algorithm in providing accurate and reliable predictions for heart disease diagnosis. Because the model needs only a small set of medical features and patient data, it makes heart attack diagnosis more convenient and effective. The results of this research are expected to help medical practitioners make better and more timely decisions in the diagnosis and treatment of heart attacks, and to assist in planning more effective public health programs for heart attack prevention.
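The wrapper-style feature selection summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter values (`n_bats`, `n_iter`, `f_min`, `f_max`, `alpha`, `gamma`, `rate0`) are placeholders, `greedy_crossover` is only one plausible reading of a greedy crossover step, and `toy_fitness` is a hypothetical stand-in for the cross-validated Random Forest score that a real wrapper would compute for each feature mask.

```python
import math
import random

def v_transfer(v):
    """V-shaped transfer function: maps a velocity to a bit-flip probability in [0, 1)."""
    return abs((2 / math.pi) * math.atan((math.pi / 2) * v))

def greedy_crossover(parent, best, fitness, rng):
    """Assumed operator: uniform crossover with the best mask, kept only if not worse."""
    child = [b if rng.random() < 0.5 else p for p, b in zip(parent, best)]
    return child if fitness(child) >= fitness(parent) else parent

def binary_bat_select(fitness, n_features, n_bats=10, n_iter=30,
                      f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9,
                      rate0=0.5, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (placeholder settings)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_bats)]
    vel = [[0.0] * n_features for _ in range(n_bats)]
    loud = [1.0] * n_bats   # loudness: shrinks as a bat accepts moves
    rate = [0.0] * n_bats   # pulse rate: grows as a bat accepts moves
    best = max(pos, key=fitness)[:]
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = pos[i][:]
            for d in range(n_features):
                # Velocity grows where the bat disagrees with the best mask.
                vel[i][d] += (pos[i][d] - best[d]) * freq
                if rng.random() < v_transfer(vel[i][d]):
                    cand[d] = 1 - cand[d]
            if rng.random() > rate[i]:
                # Local step: recombine with the best mask instead of a random walk.
                cand = greedy_crossover(cand, best, fitness, rng)
            if rng.random() < loud[i] and fitness(cand) >= fitness(pos[i]):
                pos[i] = cand
                loud[i] *= alpha
                rate[i] = rate0 * (1 - math.exp(-gamma * t))
            if fitness(pos[i]) > fitness(best):
                best = pos[i][:]
    return best

RELEVANT = (0, 2, 5)  # hypothetical "useful" feature indices for the toy problem

def toy_fitness(mask):
    """Stand-in score: reward relevant features, lightly penalize mask size."""
    return sum(mask[i] for i in RELEVANT) - 0.1 * sum(mask)

if __name__ == "__main__":
    mask = binary_bat_select(toy_fitness, n_features=8)
    print("selected mask:", mask, "score:", toy_fitness(mask))
```

In a real experiment, `toy_fitness` would be replaced by a function that trains a Random Forest on the columns selected by the mask and returns a validation score, so each fitness call would be far more expensive than in this toy setting.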
Copyright (c) 2024 Safrizal Ardana Ardiyansa, Natasha Clarissa Maharani, Syaiful Anam, Eric Julianto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.