IMPLEMENTATION OF K-MEDOIDS AND K-PROTOTYPES CLUSTERING FOR EARLY DETECTION OF HYPERTENSION DISEASE
Abstract
Hypertension is a serious public health concern, particularly in the context of lifestyle changes and specific health conditions. Clustering is one approach to grouping patients based on complex clinical data. This quantitative study collected the necessary data and analyzed it using the K-Medoids and K-Prototypes methods. K-Medoids is more resistant to outliers and noise than K-Means, which makes it better suited to this study. K-Prototypes handles mixed numerical and categorical data, so it can group hypertensive patients described by both types of variables. Both methods were used to assign patients to risk categories based on gender, age, family history of hypertension, smoking status, pulse rate, and elevated systolic and diastolic blood pressure. The Elbow method and the Silhouette Coefficient were applied to determine the optimal number of clusters for dividing patients into low-risk and high-risk hypertension groups, and the analysis showed that two clusters are optimal. K-Medoids produced higher Silhouette Coefficient values than K-Prototypes, indicating that it grouped the data better. Overall, both algorithms can support early detection of hypertension risk by dividing patients into distinct risk groups. Although the resulting cluster structure is still weak, the two methods show potential for helping health institutions in Indonesia identify and treat hypertension risk.
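The workflow summarized in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a K-Prototypes-style dissimilarity (squared Euclidean distance on min-max scaled numeric fields plus a weight `gamma` times the number of categorical mismatches), a simple alternating K-Medoids with a heuristic initialization, and the Silhouette Coefficient to compare candidate numbers of clusters. The patient records, the value of `gamma`, and all function names are hypothetical and chosen only for illustration.

```python
# Toy records invented for illustration (not the study's data):
# (age, systolic, diastolic, pulse, gender, smoker, family_history)
patients = [
    (25, 110, 70, 68, "F", "no", "no"),
    (30, 115, 75, 72, "M", "no", "no"),
    (28, 112, 72, 70, "F", "no", "no"),
    (35, 118, 76, 74, "M", "no", "yes"),
    (58, 160, 100, 88, "M", "yes", "yes"),
    (62, 165, 105, 90, "F", "yes", "yes"),
    (55, 158, 98, 86, "M", "yes", "no"),
    (65, 170, 110, 92, "M", "yes", "yes"),
]
NUM_IDX = (0, 1, 2, 3)  # numeric columns
CAT_IDX = (4, 5, 6)     # categorical columns


def min_max_scale(rows, num_idx):
    """Rescale each numeric column to [0, 1] so no single unit dominates."""
    scaled = [list(r) for r in rows]
    for i in num_idx:
        col = [r[i] for r in rows]
        lo, hi = min(col), max(col)
        for r in scaled:
            r[i] = (r[i] - lo) / (hi - lo) if hi > lo else 0.0
    return scaled


def mixed_dissimilarity(a, b, gamma=0.5):
    """Squared Euclidean on numeric fields + gamma * categorical mismatches."""
    numeric = sum((a[i] - b[i]) ** 2 for i in NUM_IDX)
    mismatches = sum(a[i] != b[i] for i in CAT_IDX)
    return numeric + gamma * mismatches


def k_medoids(dist, k, max_iter=100):
    """Alternating K-Medoids on a precomputed dissimilarity matrix."""
    n = len(dist)
    # Heuristic initialization (an assumption of this sketch): start from the
    # most central point, then add the point farthest from the chosen medoids.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds):
        # Label each point with the index of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # New medoid: the member with the smallest total dissimilarity
            # to the rest of its cluster (keep the old one if the cluster is empty).
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members))
                if members else medoids[c]
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


def silhouette(dist, labels):
    """Mean silhouette value; requires at least two non-empty clusters."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    scores = []
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in other) / len(other)
                for key, other in clusters.items() if key != c)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


rows = min_max_scale(patients, NUM_IDX)
dist = [[mixed_dissimilarity(a, b) for b in rows] for a in rows]

results = {}
for k in (2, 3, 4):
    medoids, labels = k_medoids(dist, k)
    results[k] = (medoids, labels, silhouette(dist, labels))
    print(f"k={k} silhouette={results[k][2]:.3f} labels={labels}")

best_k = max(results, key=lambda k: results[k][2])
print("number of clusters with the highest silhouette:", best_k)
```

On real clinical data the weight `gamma`, the scaling of numeric variables, and the initialization all influence the result, so they would need to be chosen and reported explicitly rather than fixed as they are here.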
Copyright (c) 2025 Hardianti Hafid, Selvi Annisa
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution license that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.