Time Series Clustering of Rice Productivity in West Java Using Trimming Gaussian Mixture Models

Keywords: Adjusted Rand Index, rice productivity, Silhouette Score, time series clustering, Trimming Gaussian Mixture Model

Abstract

This study applies the Trimming Gaussian Mixture Model (TGMM) to cluster monthly rice productivity time series for West Java from 2018 to 2023. TGMM is a robust clustering approach that limits the influence of outliers by excluding a specified proportion of the data from parameter estimation. The dataset, sourced from Open Data Jabar, was analyzed using the Silhouette Score to select the most representative number of clusters. The best solution had two main clusters (k = 2) and a trimming proportion of 15%. It yields three distinct regional groups: two dominant clusters with moderate-stable and high-consistent productivity patterns, and a separate group of outliers with low and highly fluctuating productivity. Cluster stability was assessed using the Adjusted Rand Index (ARI), which was 0.41 under bootstrap resampling and 0.545 under subsampling, indicating a reasonably consistent clustering structure. These findings show that TGMM captures the underlying productivity patterns while accounting for noise and outliers, and suggest its potential as a robust decision-support tool for data-driven agricultural planning and policy formulation.
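As a rough illustration of the stability measure mentioned above, the following minimal sketch computes the Adjusted Rand Index between two labelings of the same regions using only the Python standard library. The function name `adjusted_rand_index`, the example labels, and the convention of coding outlier regions as `-1` and treating them as their own group are assumptions made for illustration; they are not taken from the study, which may handle trimmed observations differently.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on

    # Count item pairs that share a cell of the contingency table,
    # and pairs that share a cluster in each labeling separately.
    pairs_joint = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    max_index = 0.5 * (pairs_a + pairs_b)       # agreement of identical partitions
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single cluster
    return (pairs_joint - expected) / (max_index - expected)


# Hypothetical labels for eight regions; -1 marks an outlier region (assumption).
reference = [0, 0, 0, 1, 1, 1, -1, -1]
relabelled = [1, 1, 1, 0, 0, 0, -1, -1]  # same partition, cluster names swapped
perturbed = [0, 0, 1, 1, 1, 1, -1, 0]    # two regions assigned differently

ari_same = adjusted_rand_index(reference, relabelled)
ari_diff = adjusted_rand_index(reference, perturbed)
print(ari_same, ari_diff)
```

Because the index depends only on which regions are grouped together, swapping cluster names leaves it at 1, while reassigning regions lowers it. In a bootstrap or subsampling check, the same function would be applied to the labels from the full data and the labels from each resampled fit, restricted to the regions present in both.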

Published
2025-12-18