A PERFORMANCE COMPARISON OF DECISION TREE MODELS FOR PM10 PREDICTION IN JAKARTA

Khairummin Alfi Syahrin
Agung Hari Saputra

Abstract

PM10 is airborne particulate matter with a diameter of ≤10 μm. The potential hazards of PM10 are receiving growing attention from researchers. This research uses PyCaret, a library that accelerates modeling and experimentation in machine learning (ML) and data science, to compare the performance of three decision-tree-based models (Extra Trees, Random Forest, and XGBoost) in predicting PM10 levels, and presents data and visualizations of each model's predictions. The data are ISPU (Indeks Standar Pencemar Udara, Indonesia's air pollutant standard index) records from five air quality monitoring stations in Jakarta, with PM10 measurements from 2021 as the main dataset. The forecasts show an increasing trend, with larger fluctuations for XGBoost. The Extra Trees model performs best, with MASE 0.8808, RMSSE 0.8113, MAE 12.6173, RMSE 14.7436, MAPE 0.2433, SMAPE 0.207, and R² -1.2013.
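
The seven error metrics listed above can be illustrated with a minimal sketch in plain Python. It uses commonly cited definitions; the exact formulas applied by PyCaret in the study may differ in detail. The function name `forecast_metrics` and the toy PM10 values are hypothetical and are not taken from the article.

```python
import math


def forecast_metrics(y_train, y_true, y_pred):
    """Compute common forecast error metrics (illustrative definitions only)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE and SMAPE are returned as fractions, not percentages.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    smape = sum(
        2 * abs(e) / (abs(t) + abs(p))
        for e, t, p in zip(errors, y_true, y_pred)
    ) / n

    # MASE and RMSSE scale the error by the in-sample one-step naive forecast.
    naive_diffs = [b - a for a, b in zip(y_train, y_train[1:])]
    naive_mae = sum(abs(d) for d in naive_diffs) / len(naive_diffs)
    naive_rmse = math.sqrt(sum(d * d for d in naive_diffs) / len(naive_diffs))

    # R² compares the forecast with always predicting the test-set mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    return {
        "MASE": mae / naive_mae,
        "RMSSE": rmse / naive_rmse,
        "MAE": mae,
        "RMSE": rmse,
        "MAPE": mape,
        "SMAPE": smape,
        "R2": 1 - ss_res / ss_tot,
    }


# Hypothetical values for illustration only (not data from the study).
y_train = [50, 54, 52, 58]   # history used to scale MASE and RMSSE
y_true = [60, 62, 64]        # observed values in the forecast window
y_pred = [55, 56, 57]        # model forecast for the same window

metrics = forecast_metrics(y_train, y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case R² is negative because the squared forecast errors exceed the variance of the observed values around their mean. Under this definition, a negative R² such as the one reported above means the forecast fits the test window worse than a constant prediction at the test-set mean, while MASE and RMSSE below 1 indicate smaller errors than the in-sample naive forecast.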
