K-MEANS CLUSTER COUNT OPTIMIZATION WITH SILHOUETTE INDEX VALIDATION AND DAVIES BOULDIN INDEX (CASE STUDY: COVERAGE OF PREGNANT WOMEN, CHILDBIRTH, AND POSTPARTUM HEALTH SERVICES IN INDONESIA IN 2020)

  • Iut Tri Utami, Department of Statistics, Faculty of Science and Mathematics, Diponegoro University, Indonesia
  • Fahlevi Suryaningrum, Department of Statistics, Faculty of Science and Mathematics, Diponegoro University, Indonesia
  • Dwi Ispriyanti, Department of Statistics, Faculty of Science and Mathematics, Diponegoro University, Indonesia
Keywords: Maternal Health Care, K-Means, Silhouette Index, Davies Bouldin Index

Abstract

One cause of the rising maternal mortality rate in Indonesia is the declining performance of maternal health services across the country's provinces. One way to address this decline is to group Indonesia's 34 provinces and so identify in advance which provinces should be prioritized for services. This study aims to obtain the best provincial grouping so that the right provinces can be prioritized. K-Means is a suitable grouping method because it is simple and easy to implement. Its disadvantage is that the result is sensitive to the number of clusters chosen in advance, so Silhouette Index and Davies Bouldin Index validation are used to obtain an optimal number of clusters with stable and consistent results. This study used 2020 data on health service coverage for pregnant women, childbirth, and the postpartum period, with K = 2, 3, and 4 as the candidate numbers of clusters. K-Means groups objects by similarity, measured here with Euclidean and Manhattan distances. The optimal number of clusters was K = 2 using the Manhattan distance, which gave the highest Silhouette Index (0.658685) and the lowest Davies Bouldin Index (0.3561214), meeting the criteria for the optimal clustering.
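The selection procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The eight data points are invented three-indicator coverage values, not the study's provincial data, and all function names are hypothetical. The sketch also assumes that centroids are updated as arithmetic means under both distances and that each validity index is computed with the same distance used for clustering; the abstract does not state either detail.

```python
import random
from statistics import mean

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def kmeans(points, k, dist, iters=100, seed=0):
    """Lloyd-style K-Means; assignment uses `dist`, centroids are means (assumption)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids

def silhouette(points, labels, dist):
    """Mean silhouette width; higher is better. Singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = mean(own)                                   # cohesion
        b = min(mean(d) for d in by_cluster.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

def davies_bouldin(points, labels, centroids, dist):
    """Davies Bouldin Index; lower is better."""
    ks = sorted(set(labels))
    scatter = {c: mean(dist(p, centroids[c]) for p, l in zip(points, labels) if l == c)
               for c in ks}
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in ks if j != i) for i in ks]
    return mean(worst)

# Invented toy data: three coverage percentages per "province" (not the study's data).
data = [(90.0, 88.0, 91.0), (92.0, 90.0, 89.0), (88.0, 91.0, 90.0), (91.0, 87.0, 92.0),
        (60.0, 55.0, 58.0), (62.0, 58.0, 61.0), (58.0, 57.0, 60.0), (61.0, 54.0, 59.0)]

results = {}
for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
    for k in (2, 3, 4):
        labels, centroids = kmeans(data, k, dist)
        results[(name, k)] = (silhouette(data, labels, dist),
                              davies_bouldin(data, labels, centroids, dist))

best_by_silhouette = max(results, key=lambda key: results[key][0])
best_by_db = min(results, key=lambda key: results[key][1])
print("highest Silhouette Index:", best_by_silhouette)
print("lowest Davies Bouldin Index:", best_by_db)
```

A configuration is a natural choice when both criteria agree on it, as the abstract reports for K = 2 with the Manhattan distance. On real data the two indices can disagree, and the result can depend on the random initialization, so repeating the run with several seeds is a sensible precaution.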

Published
2023-06-11
How to Cite
[1]
I. Utami, F. Suryaningrum, and D. Ispriyanti, “K-MEANS CLUSTER COUNT OPTIMIZATION WITH SILHOUETTE INDEX VALIDATION AND DAVIES BOULDIN INDEX (CASE STUDY: COVERAGE OF PREGNANT WOMEN, CHILDBIRTH, AND POSTPARTUM HEALTH SERVICES IN INDONESIA IN 2020)”, BAREKENG: J. Math. & App., vol. 17, no. 2, pp. 0707-0716, Jun. 2023.