IMPLEMENTATION OF K-MEDOIDS AND K-PROTOTYPES CLUSTERING FOR EARLY DETECTION OF HYPERTENSION DISEASE

  • Hardianti Hafid, Department of Statistics, Faculty of Mathematics and Natural Science, Universitas Negeri Makassar, Indonesia. ORCID: https://orcid.org/0009-0006-2306-7453
  • Selvi Annisa, Department of Statistics, Faculty of Mathematics and Natural Science, Universitas Lambung Mangkurat, Indonesia. ORCID: https://orcid.org/0009-0005-7807-9200
Keywords: K-Medoids, K-Prototypes, Hypertension, Clustering

Abstract

Hypertension is a serious public health concern, particularly in the context of lifestyle changes and specific health conditions. Clustering is one approach to grouping patients based on complex clinical data. This quantitative study collected the required patient data and analyzed it using the K-Medoids and K-Prototypes methods. K-Medoids is more resistant to outliers and noise than K-Means, which makes it better suited to this research. K-Prototypes can handle mixed numerical and categorical data, so it can group hypertensive patients across variables of different types. Both methods were used to group patients into risk categories based on gender, age, family history of hypertension, smoking status, pulse rate, and increased systolic and diastolic blood pressure. The Elbow and Silhouette Coefficient methods were applied to determine the optimal number of clusters, and the analysis indicated that two clusters, corresponding to low-risk and high-risk hypertension groups, are optimal. K-Medoids achieved higher Silhouette Coefficient values than K-Prototypes, indicating that it grouped the data better. Overall, both algorithms can support early detection of hypertension risk by dividing patients into different risk groups. Although the resulting cluster structure is still weak, the two methods show potential for helping health institutions identify and treat hypertension risk in Indonesia.
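To make the workflow described in the abstract concrete, the following is a minimal sketch of clustering mixed-type patient records around medoids and choosing the number of clusters by the mean Silhouette Coefficient. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the eight toy records (two standardized numeric values plus gender and smoking status), the function names, the weight `gamma`, and the dissimilarity itself, which borrows the K-Prototypes idea of adding squared Euclidean distance on numeric fields to a weighted count of categorical mismatches. The medoid search is exhaustive rather than the usual iterative swap procedure, so it only suits very small data sets.

```python
from itertools import combinations


def mixed_distance(a, b, gamma=1.0):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches.

    Assumption: this K-Prototypes-style dissimilarity is chosen for illustration only.
    """
    (num_a, cat_a), (num_b, cat_b) = a, b
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    mismatches = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric + gamma * mismatches


def k_medoids(points, k, dist=mixed_distance):
    """Exhaustive k-medoids: try every set of k medoids and keep the cheapest."""
    best_cost, best_medoids = float("inf"), None
    for medoids in combinations(range(len(points)), k):
        cost = sum(min(dist(p, points[m]) for m in medoids) for p in points)
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Assign each point to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: dist(p, points[best_medoids[c]]))
              for p in points]
    return list(best_medoids), labels


def silhouette(points, labels, dist=mixed_distance):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical records: ((standardized age, standardized systolic), (gender, smoker)).
patients = [
    ((-1.0, -1.1), ("F", "no")),
    ((-0.9, -0.8), ("M", "no")),
    ((-1.2, -1.0), ("F", "no")),
    ((-0.8, -0.9), ("F", "yes")),
    ((1.0, 1.1), ("M", "yes")),
    ((0.9, 1.2), ("M", "yes")),
    ((1.1, 0.8), ("F", "yes")),
    ((1.2, 1.0), ("M", "no")),
]

# Compare candidate cluster counts by mean silhouette and keep the best one.
results = {}
for k in (2, 3, 4):
    medoids, labels = k_medoids(patients, k)
    results[k] = (labels, silhouette(patients, labels))
best_k = max(results, key=lambda k: results[k][1])
print(best_k, results[best_k][0])
```

The toy records are deliberately well separated, so the silhouette here is far higher than the weak structure reported in the abstract. On real clinical data, the choice of `gamma` and the scaling of the numeric variables would strongly affect the result.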

Published
2025-01-13
How to Cite
[1]
H. Hafid and S. Annisa, “IMPLEMENTATION OF K-MEDOIDS AND K-PROTOTYPES CLUSTERING FOR EARLY DETECTION OF HYPERTENSION DISEASE”, BAREKENG: J. Math. & App., vol. 19, no. 1, pp. 465-476, Jan. 2025.