Improving Land Use Classification Accuracy Using Zonal Statistics and Supervised Machine Learning

Gede Awantara; Kusman Sadik; Agus Mohamad Soleh; Cici Suhaeni

doi:10.30598/parameterv5i1pp181-194

Gede Awantara IPB University, Indonesia https://orcid.org/0009-0008-1646-3356
Kusman Sadik IPB University, Indonesia https://orcid.org/0000-0001-8361-8057
Agus Mohamad Soleh IPB University, Indonesia https://orcid.org/0000-0002-2732-1985
Cici Suhaeni IPB University, Indonesia https://orcid.org/0009-0001-0347-3810

Keywords: geospatial analysis, land use classification, supervised machine learning, zonal statistics

Abstract

This study aims to improve land use classification accuracy by integrating zonal statistics with supervised machine learning on Sentinel-2 imagery. Two classification models were developed: Model A, based on single-pixel values, and Model B, based on zonal statistics aggregated over polygons from shapefile data. Two algorithms, Random Forest and Classification and Regression Trees (CART), were implemented and evaluated with 5-fold cross-validation. Model B consistently outperformed Model A; the best result came from Random Forest with Model B, which reached an overall accuracy of 73.74% and a kappa coefficient of 0.5999. Class-wise evaluation based on F1-score showed strong performance for dominant classes such as settlement, water bodies, and rice fields, while underrepresented classes such as cropland and shrubland were harder to classify because of class imbalance. These findings indicate that zonal statistics produce more representative training features and improve model stability and accuracy in land use classification tasks.
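The two ideas the abstract relies on, polygon-level aggregation and the accuracy and kappa metrics, can be illustrated with a minimal sketch. It is not the authors' pipeline: the choice of statistics (mean, standard deviation, minimum, maximum), the function names `zonal_statistics` and `accuracy_and_kappa`, and the sample values are all assumptions made for illustration, and the abstract does not state which statistics Model B used.

```python
from statistics import mean, pstdev


def zonal_statistics(pixels):
    """Aggregate per-pixel band values into one feature row per polygon.

    pixels: iterable of (zone_id, band_value) pairs, where zone_id names
    the polygon that contains the pixel (hypothetical input layout).
    """
    by_zone = {}
    for zone_id, value in pixels:
        by_zone.setdefault(zone_id, []).append(value)
    # Assumed statistic set; the source does not list the ones it used.
    return {
        zone_id: {
            "mean": mean(values),
            "std": pstdev(values),
            "min": min(values),
            "max": max(values),
        }
        for zone_id, values in by_zone.items()
    }


def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical reflectance values for pixels inside two polygons.
pixels = [
    ("rice_01", 0.30), ("rice_01", 0.34), ("rice_01", 0.32),
    ("water_01", 0.05), ("water_01", 0.07),
]
features = zonal_statistics(pixels)
accuracy, kappa = accuracy_and_kappa([[40, 10], [10, 40]])
```

In a single-pixel setup in the spirit of Model A, each `(zone_id, band_value)` pair would be its own training sample; in the aggregated setup, each entry of `features` is one sample, which smooths out within-polygon noise.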
