Prediksi Produktivitas Padi Menggunakan Algoritma Random Forest di Provinsi Sumatera Tahun 1993 – 2020

Prediction of Rice Productivity Using the Random Forest Algorithm in Sumatra Province from 1993 to 2020

  • Putri Aprilia de Fretes, Computer Science Study Program, Faculty of Science and Technology, Universitas Pattimura
  • Shinta Rante Mangaluk
Keywords: rice paddy production, data science, machine learning, Random Forest, harvest area, environmental factors, prediction

Abstract

This study analyzes the relationship between rice production and environmental factors in the Sumatra
region using data science approaches and machine learning algorithms. The dataset covers rice production,
harvest area, rainfall, humidity, and average temperature for provinces in Sumatra between 1993 and 2020.
The analysis comprised data exploration, a Pearson correlation test, feature engineering (an environmental
index and annual temperature fluctuation), and predictive modeling with Linear Regression, Decision Tree,
and Random Forest algorithms. The results showed that harvest area had the highest correlation with rice
production, while environmental factors also showed a significant influence. The Random Forest model was
selected as the best model based on the R², MAE, and RMSE evaluation metrics. In addition, parameter tuning
and cross-validation were conducted to improve model performance. This study emphasizes the importance of
data-driven quantitative approaches in supporting more precise agricultural planning and policies.
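
The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins generated on the spot, the column names (`harvest_area`, `rainfall`, `humidity`, `avg_temp`, `production`) are hypothetical, and the definitions of the environmental index (mean of z-scored environmental variables) and of the annual temperature fluctuation (absolute year-over-year change per province) are assumptions, since the abstract does not specify them. The parameter grid is likewise illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the study's dataset):
# 8 hypothetical provinces observed yearly from 1993 to 2020.
rng = np.random.default_rng(0)
n_provinces, years = 8, np.arange(1993, 2021)
df = pd.DataFrame({
    "province": np.repeat(np.arange(n_provinces), len(years)),
    "year": np.tile(years, n_provinces),
})
n = len(df)
df["harvest_area"] = rng.uniform(50_000, 900_000, n)
df["rainfall"] = rng.uniform(1_000, 4_000, n)
df["humidity"] = rng.uniform(70, 90, n)
df["avg_temp"] = rng.uniform(24, 29, n)
# Assumed toy relationship, used only so the example has something to learn.
df["production"] = (4.5 * df["harvest_area"] + 50 * df["rainfall"]
                    + rng.normal(0, 100_000, n))

# Pearson correlation of each raw variable with production.
raw_cols = ["harvest_area", "rainfall", "humidity", "avg_temp"]
corr = df[raw_cols + ["production"]].corr(method="pearson")["production"]
corr = corr.drop("production")

# Feature engineering (assumed definitions, see lead-in).
env_cols = ["rainfall", "humidity", "avg_temp"]
z = (df[env_cols] - df[env_cols].mean()) / df[env_cols].std()
df["env_index"] = z.mean(axis=1)
df = df.sort_values(["province", "year"])
df["temp_fluct"] = df.groupby("province")["avg_temp"].diff().abs().fillna(0.0)

features = raw_cols + ["env_index", "temp_fluct"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["production"], test_size=0.2, random_state=0
)

# Fit the three model families named in the abstract and score each one.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
results = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    results[name] = {
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }

# Parameter tuning with 5-fold cross-validation (illustrative grid).
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=5,
    scoring="r2",
)
search.fit(X_train, y_train)

print(corr.sort_values(ascending=False))
print(pd.DataFrame(results).T)
print("Best Random Forest parameters:", search.best_params_)
```

Because the data here are generated from a nearly linear toy relationship, the ranking of the three models in this sketch says nothing about the ranking reported in the study; only the sequence of steps is meant to carry over.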

Published
2025-07-29
How to Cite
de Fretes, P., & Mangaluk, S. (2025). Prediksi Produktivitas Padi Menggunakan Algoritma Random Forest di Provinsi Sumatera Tahun 1993 – 2020. ALGORITHM: Journal of Computer Science and Computational Intelligence, 1(1), 9-19. Retrieved from https://ojs3.unpatti.ac.id/index.php/algorithm/article/view/21018
Section
Articles