COMPARISON OF K-MEANS AND GAUSSIAN MIXTURE MODEL IN PROFILING AREAS BY POVERTY INDICATORS

  • Zumrotul Wahidah, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia, Indonesia
  • Dina Tri Utari, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia, Indonesia
Keywords: Poverty, Clustering index, K-Means, Gaussian Mixture Model

Abstract

The Covid-19 pandemic has reduced the incomes of Indonesia's population, which can trigger poverty. According to the Indonesian Central Statistics Agency, Central Java Province is one of the areas most affected by Covid-19, especially economically. In 2020, the percentage of poor people increased by 0.6% from 2019. If this condition is ignored over the long term, it will hamper national development. As a first step in designing a strategy to mitigate the impact of poverty, the economically affected areas need to be profiled appropriately based on poverty indicators. This study compares K-Means clustering and the Gaussian Mixture Model (GMM) to determine which provides the best data grouping according to three clustering indexes: connectivity, Dunn, and silhouette. GMM generalizes K-Means clustering by incorporating information about the covariance structure of the data as well as the centers of the latent Gaussians. We used poverty indicator data from the Central Statistics Agency of Central Java, including the poverty line, percentage of poor population, poverty depth index, and poverty severity index. The results indicate that GMM gives the best grouping with 3 clusters, whose first, second, and third clusters contain 10, 19, and 6 members, respectively.
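
To illustrate how the three clustering indexes named above can rank competing groupings, here is a minimal Python sketch using only the standard library. It is not the authors' implementation: the toy points and the two label vectors are hypothetical stand-ins for the output of K-Means or GMM, and the neighbourhood size `n_neighbors` is an assumed parameter. The sketch assumes at least two clusters and no duplicate points. Lower connectivity is better, while higher Dunn and silhouette values are better.

```python
import math


def connectivity(points, labels, n_neighbors=2):
    """Penalty of 1/rank whenever a point's rank-th nearest neighbour
    lies in a different cluster. Lower is better."""
    n = len(points)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for rank, j in enumerate(others[:n_neighbors], start=1):
            if labels[j] != labels[i]:
                total += 1.0 / rank
    return total


def dunn(points, labels):
    """Smallest between-cluster distance divided by the largest
    within-cluster distance (cluster diameter). Higher is better."""
    min_between, max_within = math.inf, 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_within


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b is the mean distance to the
    nearest other cluster. Higher is better."""
    n = len(points)
    total = 0.0
    for i in range(n):
        mean_to = {}
        for c in set(labels):
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                dists = [math.dist(points[i], points[j]) for j in members]
                mean_to[c] = sum(dists) / len(dists)
        if labels[i] not in mean_to:
            continue  # singleton cluster contributes a silhouette of 0
        a = mean_to[labels[i]]
        b = min(v for c, v in mean_to.items() if c != labels[i])
        total += (b - a) / max(a, b)
    return total / n


def evaluate(points, labels):
    return {
        "connectivity": connectivity(points, labels),
        "dunn": dunn(points, labels),
        "silhouette": silhouette(points, labels),
    }


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels_a = [0, 0, 0, 1, 1, 1]  # candidate grouping A
labels_b = [0, 0, 1, 1, 1, 1]  # candidate grouping B, one point moved

scores_a = evaluate(points, labels_a)
scores_b = evaluate(points, labels_b)
print("A:", scores_a)
print("B:", scores_b)
```

In this toy case grouping A is preferred on all three indexes. On real data the indexes can disagree, so the comparison has to weigh them together.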

Published
2023-06-11
How to Cite
[1]
Z. Wahidah and D. Utari, “COMPARISON OF K-MEANS AND GAUSSIAN MIXTURE MODEL IN PROFILING AREAS BY POVERTY INDICATORS”, BAREKENG: J. Math. & App., vol. 17, no. 2, pp. 0717-0726, Jun. 2023.