IMPROVING SUPPORT VECTOR MACHINE PERFORMANCE WITH BINARY GAUSSIAN IMPROVED WHALE OPTIMIZATION ALGORITHM: A CASE STUDY ON DIABETES DATA

  • Haidar Ahmad Fajri, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia. ORCID: https://orcid.org/0009-0004-9787-662X
  • Safrizal Ardana Ardiyansa, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia. ORCID: https://orcid.org/0009-0007-8683-5568
  • Syaiful Anam, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia. ORCID: https://orcid.org/0000-0002-6627-0084
  • Natasha Clarrisa Maharani, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia. ORCID: https://orcid.org/0009-0002-0726-2807
  • Eric Julianto, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia. ORCID: https://orcid.org/0009-0000-2461-7521
Keywords: Diabetes Mellitus, Feature Selection, Gaussian Improved WOA, Support Vector Machine

Abstract

Diabetes mellitus is a chronic condition characterized by high blood sugar that can cause severe organ damage and affects people of all ages worldwide. Early diagnosis is crucial for improving patients' quality of life, and machine learning offers a promising approach. The Support Vector Machine (SVM) is effective for classification, but feature selection is essential to ensure that only relevant features are used. The Whale Optimization Algorithm (WOA) is a global optimization method suited to feature selection, but it suffers from premature convergence, which can lead to suboptimal results. This research addresses the issue by modifying the mutation operation, the convergence factor, and the population initialization, resulting in the Binary Gaussian Improved WOA (BGIWOA). The research focuses on feature selection using BGIWOA and compares it with the Variance Inflation Factor (VIF), with SVM as the classifier in both cases. The results show that BGIWOA outperforms VIF and that the best BGIWOA parameter configuration is obtained with a linear kernel. This configuration produces the best accuracy of 95.00%. BGIWOA-SVM achieves higher and more consistent accuracy than VIF-SVM. The best SVM model achieves an average accuracy of 95.62% on the training data and 95.58% on the validation data, with an accuracy of 93.85% on the test data. This model also yields an average precision of 94.00%, a recall of 91.00%, and an F1-score of 92.00%. It also outperforms SVM without optimization, which achieved a training accuracy of only 84.25% and a testing accuracy of 81.30%. The model can assist in diagnosing diabetes with accurate and consistent predictions for new data. The results are specific to the diabetes dataset used in this research, so further testing on other binary datasets is necessary to confirm the model's effectiveness and generalizability across different domains and types of data.
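
The accuracy, precision, recall, and F1-score reported in the abstract are standard confusion-matrix metrics. The following is a minimal Python sketch of how such values can be computed for a binary classifier. It assumes labels coded as 1 (diabetic) and 0 (non-diabetic). The function name `binary_metrics` and the toy label lists are hypothetical illustrations, not the authors' code or data, and the sketch does not reproduce the averaging scheme behind the figures quoted above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed, not from the diabetes dataset): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In this toy case there are three true positives, one false positive, one false negative, and five true negatives, so accuracy is 80% while precision, recall, and F1 are each 75%. Reporting precision and recall alongside accuracy matters for diagnostic data, where the two classes are often imbalanced.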

Published
2025-09-01
How to Cite
[1]
H. A. Fajri, S. A. Ardiyansa, S. Anam, N. C. Maharani, and E. Julianto, “IMPROVING SUPPORT VECTOR MACHINE PERFORMANCE WITH BINARY GAUSSIAN IMPROVED WHALE OPTIMIZATION ALGORITHM: A CASE STUDY ON DIABETES DATA”, BAREKENG: J. Math. & App., vol. 19, no. 4, pp. 2531-2542, Sep. 2025.