Prediction of Rice Productivity Using the Random Forest Algorithm in Sumatra Province from 1993 to 2020
Abstract
This study aims to analyze the relationship between rice production and environmental factors in the Sumatra
region using data science methods and machine learning algorithms. The dataset includes information
on rice production, harvest area, rainfall, humidity, and average temperature from various provinces in Sumatra
between 1993 and 2020. The analysis comprised data exploration, a Pearson correlation test, feature
engineering such as an environmental index and annual temperature fluctuation, and predictive modeling with
linear regression, decision tree, and Random Forest algorithms. The results showed that harvest area had the
highest correlation with rice production, while environmental factors also showed significant influence. The Random
Forest model was selected as the best model based on the R², MAE, and RMSE evaluation metrics. In addition,
hyperparameter tuning and cross-validation were conducted to improve model performance. This study emphasizes
the importance of utilizing data-driven quantitative approaches in supporting more precise agricultural planning
and policies.
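
The workflow summarized in the abstract (Pearson correlation, feature engineering, comparison of three regressors on R², MAE, and RMSE, then tuning with cross-validation) can be sketched roughly as follows. This is a minimal illustration and not the study's actual code: the data are synthetic, the column names are hypothetical, and the definitions of the environmental index (mean of standardized rainfall, humidity, and temperature) and of the annual temperature fluctuation (absolute year-over-year change within a province) are assumptions, since the abstract does not define them. The hyperparameter grid is likewise an arbitrary example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the study's dataset); all names are hypothetical.
rng = np.random.default_rng(0)
provinces = [f"P{i}" for i in range(8)]
index = pd.MultiIndex.from_product(
    [provinces, range(1993, 2021)], names=["province", "year"]
)
df = index.to_frame(index=False)
n = len(df)
df["harvest_area"] = rng.uniform(50_000, 500_000, n)
df["rainfall"] = rng.uniform(1_000, 4_000, n)
df["humidity"] = rng.uniform(70, 90, n)
df["avg_temperature"] = rng.uniform(24, 30, n)
# Assumed toy relationship so that the example has some signal to learn.
df["production"] = (
    4.5 * df["harvest_area"] + 20 * df["rainfall"] + rng.normal(0, 50_000, n)
)

# Feature engineering (assumed definitions, see the lead-in).
env_cols = ["rainfall", "humidity", "avg_temperature"]
z_scores = (df[env_cols] - df[env_cols].mean()) / df[env_cols].std()
df["env_index"] = z_scores.mean(axis=1)
df["temp_fluctuation"] = (
    df.groupby("province")["avg_temperature"].diff().abs().fillna(0.0)
)

# Pearson correlation of each feature with production.
features = ["harvest_area", *env_cols, "env_index", "temp_fluctuation"]
corr = (
    df[features + ["production"]]
    .corr(method="pearson")["production"]
    .drop("production")
)

# Fit the three model families and compare R², MAE, and RMSE on a held-out set.
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["production"], test_size=0.2, random_state=0
)
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
results = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    results[name] = {
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }

# Hyperparameter tuning of the Random Forest with 5-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=5,
    scoring="r2",
)
search.fit(X_train, y_train)

print(corr.sort_values(ascending=False))
print(pd.DataFrame(results).T)
print(search.best_params_)
```

Because the synthetic target here is linear by construction, the model ranking this sketch prints says nothing about the ranking reported in the study; it only shows how the comparison and tuning steps fit together.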