K-MEANS CLUSTER COUNT OPTIMIZATION WITH SILHOUETTE INDEX VALIDATION AND DAVIES BOULDIN INDEX (CASE STUDY: COVERAGE OF PREGNANT WOMEN, CHILDBIRTH, AND POSTPARTUM HEALTH SERVICES IN INDONESIA IN 2020)
Abstract
One cause of the rising maternal mortality rate in Indonesia is the declining performance of maternal health services across Indonesian provinces. To address this decline, the provinces that most need service improvements should be identified in advance by grouping Indonesia's 34 provinces. This study aims to obtain the best provincial grouping so that the right provinces can be prioritized. K-Means is a suitable method for grouping provinces because it is simple and easy to implement. Its disadvantage is that its results are sensitive to the number of clusters chosen at the outset, so Silhouette Index and Davies Bouldin Index validation is used to obtain the optimal number of clusters with stable and consistent results. This study used data on health service coverage for pregnant women, childbirth, and the postpartum period, with K = 2, 3, and 4 as the candidate numbers of clusters. K-Means groups objects by similarity, measured here with Euclidean and Manhattan distances. The optimal number of clusters was K = 2 using the Manhattan distance, which yielded the highest Silhouette Index (0.658685) and the lowest Davies Bouldin Index (0.3561214), meeting the criteria for the optimal cluster count.
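The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy data are made up and are not the 2020 provincial coverage figures, all function names are hypothetical, and the centroid is assumed to be the arithmetic mean of a cluster's members under both distance measures. The sketch runs K-Means for K = 2, 3, and 4 with each distance, then looks for the highest Silhouette Index and the lowest Davies Bouldin Index.

```python
import random

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def mean_point(points):
    # Coordinate-wise arithmetic mean (assumed centroid for both distances).
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(data, k, dist, seed=0, max_iter=100):
    # Plain Lloyd iterations; initial centroids are k distinct data points.
    rng = random.Random(seed)
    centroids = rng.sample(data, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in data]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(data, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean_point(members)
    return labels

def silhouette_index(data, labels, dist):
    # Mean of s(i) = (b - a) / max(a, b) over all objects.
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(data):
        own = [dist(p, q) for j, q in enumerate(data) if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: s(i) = 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, l in zip(data, labels) if l == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def davies_bouldin_index(data, labels, dist):
    # Mean over clusters of the worst (scatter_c + scatter_d) / centroid distance.
    clusters = sorted(set(labels))
    cents = {c: mean_point([p for p, l in zip(data, labels) if l == c]) for c in clusters}
    scatter = {
        c: sum(dist(p, cents[c]) for p, l in zip(data, labels) if l == c) / labels.count(c)
        for c in clusters
    }
    worst = [
        max((scatter[c] + scatter[d]) / dist(cents[c], cents[d]) for d in clusters if d != c)
        for c in clusters
    ]
    return sum(worst) / len(worst)

# Hypothetical toy data: two well-separated groups of 2-D points.
data = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [1.1, 1.3],
        [5.0, 5.2], [5.3, 4.9], [4.8, 5.1], [5.1, 4.7]]

results = []  # tuples of (k, distance name, silhouette, Davies Bouldin)
for k in (2, 3, 4):
    for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
        labels = kmeans(data, k, dist)
        results.append((k, name, silhouette_index(data, labels, dist),
                        davies_bouldin_index(data, labels, dist)))

best_by_silhouette = max(results, key=lambda r: r[2])  # higher is better
best_by_dbi = min(results, key=lambda r: r[3])         # lower is better
for row in results:
    print(row)
```

When both indices point to the same K and distance, as the abstract reports for K = 2 with Manhattan distance, that configuration is taken as the optimal clustering. On real data the two indices can disagree, in which case the choice needs a further criterion that this sketch does not cover.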
Copyright (c) 2023 Fahlevi Suryaningrum, Dwi Ispriyanti, Iut Tri Utami
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work (e.g. acknowledgement of its initial publication in this journal).
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.