OPTIMIZING HEART ATTACK DIAGNOSIS USING RANDOM FOREST WITH BAT ALGORITHM AND GREEDY CROSSOVER TECHNIQUE

  • Safrizal Ardana Ardiyansa, Department of Mathematics, Faculty of Mathematics and Natural Science, Brawijaya University, Indonesia
  • Natasha Clarissa Maharani, Department of Mathematics, Faculty of Mathematics and Natural Science, Brawijaya University, Indonesia
  • Syaiful Anam, Department of Mathematics, Faculty of Mathematics and Natural Science, Brawijaya University, Indonesia, https://orcid.org/0000-0002-6627-0084
  • Eric Julianto, Braincore Indonesia, Indonesia
Keywords: Heart attack, Bat algorithm, Random Forest, Greedy crossover

Abstract

Cardiovascular disease is one of the leading causes of death worldwide; the World Health Organization (WHO) reports approximately 17.9 million deaths annually. Swift and accurate diagnosis of heart attacks is crucial to ensure timely, specialized intervention for patients. Random Forest is a machine learning algorithm that can be applied to this problem. However, the performance of the model depends strongly on the features selected for training. To address this, the Binary Bat Algorithm (BBA) with greedy crossover is used to select the features for the model. This approach helps the search avoid premature convergence to local minima. The optimal parameters for BBA with greedy crossover are determined to be , , , and . With these parameters, the proposed algorithm identifies the most relevant features as age, gender, cp, chol, thalach, oldpeak, slope, and ca, achieving an accuracy of 94.19% on the training data and 91.8% on the test data. Furthermore, the precision and recall values for both classes range from 0.87 to 0.96, giving an F1-score of approximately 0.92. The proposed method improves the F1-score by 0.05 compared with the regular Random Forest model. These results underscore the effectiveness of the proposed algorithm in providing accurate and reliable predictions for heart disease diagnosis. The model therefore makes diagnosing heart attacks more convenient and effective, because it does not require a large number of medical features or extensive patient data. It is hoped that the results of this research will help medical practitioners make better and more timely decisions in the diagnosis and treatment of heart attacks, and assist in planning more effective public health programs for heart attack prevention.
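
The feature-selection idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified binary bat search with a V-shaped transfer function, fixed loudness and pulse rate, and an assumed "greedy crossover" variant (uniform crossover with the best bat, kept only if it is at least as fit as both parents). The function names, default parameter values, and the toy fitness function are all hypothetical; in the setting described above, the fitness of a feature mask would instead be the validation performance of a Random Forest trained on the selected columns.

```python
import math
import random


def v_transfer(v):
    """V-shaped transfer function: maps a velocity to a bit-flip probability in [0, 1)."""
    return abs((2.0 / math.pi) * math.atan((math.pi / 2.0) * v))


def greedy_crossover(parent_a, parent_b, fitness, rng):
    """Assumed variant: uniform crossover, then keep the fittest of both parents and the child."""
    child = [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]
    return list(max((parent_a, parent_b, child), key=fitness))


def binary_bat_select(fitness, n_features, n_bats=10, n_iter=30,
                      f_min=0.0, f_max=2.0, loudness=0.9, pulse_rate=0.5, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (simplified sketch)."""
    rng = random.Random(seed)
    bats = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_bats)]
    vel = [[0.0] * n_features for _ in range(n_bats)]
    best = list(max(bats, key=fitness))
    for _ in range(n_iter):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = list(bats[i])
            for d in range(n_features):
                # Velocity grows where the bat disagrees with the best mask.
                vel[i][d] += (bats[i][d] - best[d]) * freq
                if rng.random() < v_transfer(vel[i][d]):
                    cand[d] = 1 - cand[d]  # flip the bit
            if rng.random() > pulse_rate:
                # Local search step: recombine with the best bat, greedily.
                cand = greedy_crossover(cand, best, fitness, rng)
            if rng.random() < loudness and fitness(cand) >= fitness(bats[i]):
                bats[i] = cand
            if fitness(bats[i]) > fitness(best):
                best = list(bats[i])
    return best


def toy_fitness(mask, informative=(0, 2, 3, 7)):
    """Hypothetical stand-in for model accuracy: reward informative features, penalize mask size."""
    return sum(mask[i] for i in informative) - 0.1 * sum(mask)


if __name__ == "__main__":
    mask = binary_bat_select(toy_fitness, n_features=10, seed=1)
    print("selected feature indices:", [i for i, bit in enumerate(mask) if bit])
```

To adapt the sketch, one would replace `toy_fitness` with a function that trains a classifier on the columns where the mask is 1 and returns a score such as accuracy or F1 on held-out data. The small penalty on mask size reflects the abstract's stated goal of needing fewer medical features.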

Published
2024-05-25
How to Cite
[1] S. Ardiyansa, N. Maharani, S. Anam, and E. Julianto, “OPTIMIZING HEART ATTACK DIAGNOSIS USING RANDOM FOREST WITH BAT ALGORITHM AND GREEDY CROSSOVER TECHNIQUE”, BAREKENG: J. Math. & App., vol. 18, no. 2, pp. 1053-1066, May 2024.