INTEGRATION OF HIERARCHICAL CLUSTER, SELF-ORGANIZING MAPS, AND ENSEMBLE CLUSTER WITH NAÏVE BAYES CLASSIFIER FOR GROUPING CABBAGE PRODUCTION IN INDONESIA

  • Maulidya Maghfiro Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia https://orcid.org/0009-0004-6977-9611
  • Ni Wayan Surya Wardhani Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia https://orcid.org/0000-0002-9118-3302
  • Atiek Iriany Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Indonesia https://orcid.org/0000-0002-3464-0925
Keywords: Cabbage, Cluster Ensembles, Hierarchical Analysis, Naïve Bayes, SOM

Abstract

This study evaluates and compares three clustering techniques, namely hierarchical cluster analysis (with complete, average, and single linkage), Self-Organizing Maps (SOM) clustering, and ensemble clustering, within an integrated cluster analysis combined with Naïve Bayes, applied to cabbage production in Indonesia. The data cover cabbage production in 157 districts/cities in Indonesia and were obtained from the 2023 publications of the Central Statistics Agency (BPS). The variables are cabbage harvest, cabbage production, elevation of the area, and rainfall. The research is a quantitative analysis. Cluster analysis first assigns a class to each district/city, and the clustering methods are compared to determine the best approach for grouping districts by cabbage production. Naïve Bayes analysis is then used to classify cabbage production and to identify the optimal clusters. The comparison aims to find the clustering method that gives the most accurate grouping and the clearest picture of cabbage production patterns. The best method for classifying cabbage production in Indonesia is ensemble clustering integrated with Naïve Bayes, which yields three distinct clusters: high, medium, and low production.
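
The integration described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes Euclidean distance, a Gaussian Naïve Bayes classifier, and resubstitution accuracy as the comparison criterion, none of which the abstract specifies. The data points are invented toy values, the function names are hypothetical, and the SOM and ensemble steps are omitted for brevity.

```python
import math
from statistics import mean, pvariance


def euclid(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, k, linkage="complete"):
    """Merge the two closest clusters until k remain; return one label per point."""
    combine = {"single": min, "complete": max, "average": mean}[linkage]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = combine([euclid(points[i], points[j])
                             for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, lab in zip(X, y) if lab == c]
        # A small constant keeps the variance positive for tiny clusters.
        stats = [(mean(col), pvariance(col) + 1e-6) for col in zip(*rows)]
        model[c] = (len(rows) / len(X), stats)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for observation x."""
    def log_post(prior, stats):
        lp = math.log(prior)
        for v, (mu, var) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        return lp
    return max(model, key=lambda c: log_post(*model[c]))


# Invented toy data: two standardized features per district (NOT the BPS data).
data = [
    (1.0, 1.2), (1.1, 0.9), (0.9, 1.0),   # low production
    (5.0, 5.1), (5.2, 4.8), (4.9, 5.3),   # medium production
    (9.0, 9.2), (9.1, 8.8), (8.8, 9.0),   # high production
]

# Cluster with each linkage, then check how well Naive Bayes reproduces the labels.
results = {}
for linkage in ("single", "complete", "average"):
    labels = agglomerative(data, 3, linkage)
    model = fit_gaussian_nb(data, labels)
    predicted = [predict(model, x) for x in data]
    results[linkage] = sum(p == t for p, t in zip(predicted, labels)) / len(data)

print(results)
```

The cluster labels act as the class variable for Naïve Bayes, so a clustering that the classifier reproduces more accurately is treated as the better grouping. A SOM or ensemble labelling could be scored the same way by passing its labels to `fit_gaussian_nb`.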

Published
2025-04-01
How to Cite
M. Maghfiro, N. W. S. Wardhani, and A. Iriany, “INTEGRATION OF HIERARCHICAL CLUSTER, SELF-ORGANIZING MAPS, AND ENSEMBLE CLUSTER WITH NAÏVE BAYES CLASSIFIER FOR GROUPING CABBAGE PRODUCTION IN INDONESIA”, BAREKENG: J. Math. & App., vol. 19, no. 2, pp. 1057-1070, Apr. 2025.