A COMPARATIVE ANALYSIS OF DBSCAN AND GAUSSIAN MIXTURE MODEL FOR CLUSTERING INDONESIAN PROVINCES BASED ON SOCIOECONOMIC WELFARE INDICATORS

  • Sri Andayani, Department of Mathematics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia. ORCID: https://orcid.org/0000-0002-2121-9242
  • Namita Retnani, Department of Mathematics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia. ORCID: https://orcid.org/0000-0002-4844-2620
  • Thesa Adi Saputra Yusri, Department of Mathematics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia. ORCID: https://orcid.org/0000-0002-4844-2620
  • Bambang Sumarno Hadi Marwoto, Department of Mathematics Education, Faculty of Mathematics and Natural Sciences, Universitas Negeri Yogyakarta, Indonesia. ORCID: https://orcid.org/0009-0002-2676-1804
Keywords: Clustering Analysis, DBSCAN, Gaussian Mixture Model (GMM), Socioeconomic Welfare

Abstract

Public welfare refers to a condition in which people experience happiness, comfort, and prosperity and can adequately meet their basic needs. Indonesia's provinces differ in their levels of welfare, and equitable development requires that all regions reach comparable welfare standards. This study aims to cluster Indonesian provinces based on socioeconomic welfare indicators, so that the results can inform policy-making that accounts for regional potential and challenges. The data are secondary data on provincial welfare indicators from 2020 to 2023, obtained from the official website of BPS-Statistics Indonesia. The methodology comprises data collection, descriptive statistical analysis, determination of the optimal number of clusters, and a comparison of the clustering performance of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and the Gaussian Mixture Model (GMM), using the Silhouette Index, the Davies-Bouldin Index, and the Calinski-Harabasz Index as evaluation metrics. DBSCAN produced two clusters: high-welfare and low-welfare regions. GMM produced five clusters: moderate, fairly low, low, high, and fairly high welfare regions. Based on these cluster validity measures, GMM outperformed DBSCAN, achieving a Silhouette score of 0.28, a Davies-Bouldin Index of 1.12, and a Calinski-Harabasz Index of 10.9.
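
The three validity indices named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of each index and uses only the Python standard library. The toy points, the two candidate partitions, and all function names are hypothetical; they are not the study's data or implementation, which the abstract does not describe. Higher Silhouette and Calinski-Harabasz values indicate better-separated clusters, whereas a lower Davies-Bouldin value is better.

```python
import math
from collections import defaultdict

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def group(X, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for x, lab in zip(X, labels):
        clusters[lab].append(x)
    return clusters

def silhouette(X, labels):
    """Mean of (b - a) / max(a, b) over all points; higher is better."""
    clusters = group(X, labels)
    total = 0.0
    for x, lab in zip(X, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(x, y) for y in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(x, y) for y in pts) / len(pts)
                for k, pts in clusters.items() if k != lab)
        total += (b - a) / max(a, b)
    return total / len(X)

def davies_bouldin(X, labels):
    """Mean worst-case (scatter_i + scatter_j) / centroid distance; lower is better."""
    clusters = group(X, labels)
    cents = {k: centroid(pts) for k, pts in clusters.items()}
    scatter = {k: sum(math.dist(x, cents[k]) for x in pts) / len(pts)
               for k, pts in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)

def calinski_harabasz(X, labels):
    """Between-cluster over within-cluster dispersion, each scaled by its
    degrees of freedom; higher is better."""
    clusters = group(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    between = sum(len(pts) * math.dist(centroid(pts), overall) ** 2
                  for pts in clusters.values())
    within = sum(math.dist(x, centroid(pts)) ** 2
                 for pts in clusters.values() for x in pts)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy data: two well-separated squares of four points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
partition_a = [0, 0, 0, 0, 1, 1, 1, 1]  # one cluster per square
partition_b = [0, 0, 1, 1, 2, 2, 2, 2]  # first square split in two

for name, labels in [("A", partition_a), ("B", partition_b)]:
    print(name,
          round(silhouette(X, labels), 3),
          round(davies_bouldin(X, labels), 3),
          round(calinski_harabasz(X, labels), 3))
```

On this toy data, partition A scores better than partition B on all three indices, which is the kind of agreement across measures that the abstract reports in favour of GMM. In practice, scikit-learn offers equivalent functions (`silhouette_score`, `davies_bouldin_score`, `calinski_harabasz_score` in `sklearn.metrics`) that can be applied to the labels returned by any clustering method.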

Published
2025-07-01
How to Cite
[1] S. Andayani, N. Retnani, T. A. S. Yusri, and B. S. H. Marwoto, “A COMPARATIVE ANALYSIS OF DBSCAN AND GAUSSIAN MIXTURE MODEL FOR CLUSTERING INDONESIAN PROVINCES BASED ON SOCIOECONOMIC WELFARE INDICATORS,” BAREKENG: J. Math. & App., vol. 19, no. 3, pp. 2039–2056, Jul. 2025.