CLUSTERING ANALYSIS FOR GROUPING SUB-DISTRICTS IN BOJONEGORO DISTRICT WITH THE K-MEANS METHOD WITH A VARIETY OF APPROACHES

Denny Nurdiansyah; Mochamad Nizar Palefi Ma'ady; Yuana Sukmawaty; Muchammad Chandra Cahyo Utomo; Tia Mutiani

doi:10.30598/barekengvol18iss2pp1095-1104

Denny Nurdiansyah, Statistics, Universitas Nahdlatul Ulama Sunan Giri, Indonesia, https://orcid.org/0000-0002-9126-9616
Mochamad Nizar Palefi Ma'ady, Information System, Institut Teknologi Telkom Surabaya, Indonesia, https://orcid.org/0000-0002-7061-4660
Yuana Sukmawaty, Statistics, Universitas Lambung Mangkurat, Indonesia, https://orcid.org/0000-0003-3912-7925
Muchammad Chandra Cahyo Utomo, Informatics, Institut Teknologi Kalimantan, Indonesia, https://orcid.org/0000-0003-0024-8316
Tia Mutiani, Statistics, Universitas Nahdlatul Ulama Sunan Giri, Indonesia

Keywords: Fast K-Means, Kernel K-Means, Population Document

Abstract

Population data is an important source of information for regional planning and development. The condition of an area is easier to observe when its sub-districts are grouped. Data mining techniques can identify patterns and relationships in population data, and the K-Means algorithm is a clustering technique that divides data into groups, or clusters, based on similar characteristics. This research aims to apply the K-Means method with various approaches to cluster sub-districts in the Bojonegoro district according to population data. The research is a quantitative, exploratory study that applies the K-Means method with a variety of approaches. One is the Kernel K-Means method, which uses a mapping function to map the data to a higher-dimensional space before clustering. Another is the Fast K-Means method, which reduces model training time to address the problem of recalibrating cluster centers as the amount of data increases. The data are secondary population data consisting of birth, death, in-migration, and out-migration variables, obtained from the Satu Data Bojonegoro website developed by the Bojonegoro Regency Government. The performance of each clustering method is evaluated by measuring the average within-cluster distance. The best K-Means approach is found to be the Kernel K-Means method with 5 clusters. With Kernel K-Means, the pattern of data coordinates begins to show a smooth trend when the number of clusters is 5, so the clusters formed are clearly distinguishable. The study concludes that the best K-Means approach for grouping sub-districts in the Bojonegoro district is the Kernel K-Means approach.
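The clustering and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The six data rows are invented stand-ins for the four population variables (birth, death, migrant, moving). The RBF kernel, the `gamma` value, the seeding scheme, and the choice to measure the average within-cluster distance as the mean Euclidean distance to the cluster centroid in the original variable space are all assumptions. Fast K-Means is not shown.

```python
import math
import random

# Hypothetical rows, one per sub-district: four population variables
# (birth, death, migrant, moving). Invented values, not the study's data.
DATA = [
    (10, 5, 3, 2), (11, 6, 3, 2), (9, 5, 2, 3),
    (50, 30, 20, 15), (52, 31, 19, 16), (49, 29, 21, 14),
]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(rows):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-Means (Lloyd's algorithm); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = centroid(members)
    return labels

def kernel_kmeans(points, k, gamma=0.01, max_iter=100, seed=0):
    """Kernel K-Means with an RBF kernel (assumed); returns labels."""
    n = len(points)
    K = [[math.exp(-gamma * sq_dist(a, b)) for b in points] for a in points]
    rng = random.Random(seed)
    seeds = rng.sample(range(n), k)
    # Initial labels: the seed point with the highest kernel similarity.
    labels = [max(range(k), key=lambda c: K[i][seeds[c]]) for i in range(n)]
    for _ in range(max_iter):
        groups = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (0 if empty).
        within = [sum(K[a][b] for a in g for b in g) / len(g) ** 2
                  if g else 0.0 for g in groups]

        def feature_dist(i, c):
            # ||phi(x_i) - mean_c||^2 expanded using kernel values only.
            g = groups[c]
            return K[i][i] - 2 * sum(K[i][j] for j in g) / len(g) + within[c]

        new_labels = [min((c for c in range(k) if groups[c]),
                          key=lambda c: feature_dist(i, c))
                      for i in range(n)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

def avg_within_cluster_distance(points, labels):
    """Mean Euclidean distance from each point to its cluster centroid."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        center = centroid(members)
        total += sum(math.sqrt(sq_dist(p, center)) for p in members)
    return total / len(points)

plain_labels = kmeans(DATA, k=2)
kernel_labels = kernel_kmeans(DATA, k=2)
print("K-Means labels:", plain_labels)
print("Kernel K-Means labels:", kernel_labels)
print("Avg within-cluster distance (K-Means):",
      avg_within_cluster_distance(DATA, plain_labels))
```

To compare candidate numbers of clusters, the same functions can be called for several values of `k` and the resulting average within-cluster distances compared. A lower value indicates tighter clusters, though it always tends to fall as `k` grows.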
