Classification of Computer Science Research Topics in the ASEAN Region Using the Random Forest Algorithm

  • Victoria Daniela Melatawun, Computer Science Study Program, Faculty of Science and Technology, Universitas Pattimura
  • Citra Fathia Palembang, Computer Science Study Program, Faculty of Science and Technology, Universitas Pattimura
  • Noval Febrian Pattiasina, Computer Science Study Program, Faculty of Science and Technology, Universitas Pattimura
Keywords: Random Forest, TF-IDF, Text Classification, Computer Science, ASEAN

Abstract

The growth of computer science publications across the member states of the Association of Southeast Asian Nations (ASEAN) over the past decade reflects the region's increasing research capacity. However, few studies have used machine learning to map systematically how computer science research topics are distributed in this region, and fewer still have identified under-researched sub-fields (research gaps). This study classifies computer science research topics in scientific publications from ASEAN countries for the period 2015–2025 and identifies open research gaps. Data were obtained from OpenAlex, an open-access bibliographic database covering more than 245 million scientific publications worldwide; a total of 43,263 papers were collected from 10 ASEAN countries. Each paper was represented using Term Frequency-Inverse Document Frequency (TF-IDF) with 10,000 features and classified into one of 22 computer science sub-fields using a Random Forest algorithm with 200 estimators and 5-fold cross-validation. Model evaluation yielded an accuracy of 93.87%, a weighted F1-score of 0.9320, and a cross-validation accuracy of 93.73% ± 0.63%. Artificial Intelligence dominated computer science research across ASEAN, peaking at 1,168 papers in 2020. Computer Vision (F1=0.00), Bioinformatics (F1=0.12), and Robotics (F1=0.18) were identified as the sub-fields with the largest research gaps. Because sub-field labeling was performed automatically, manual validation by domain experts remains necessary.
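The pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn as the library, uses a tiny synthetic corpus in place of the OpenAlex records, and picks a 75/25 hold-out split and a fixed random seed, none of which are stated in the abstract. Only the 10,000 TF-IDF features, the 200 estimators, and the 5 folds come from the text. The label "Computer Networks" is a hypothetical second class chosen for the toy data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus standing in for paper titles/abstracts from OpenAlex.
ai_terms = ["neural", "learning", "model", "classification", "agent"]
net_terms = ["network", "routing", "protocol", "latency", "packet"]
ai_docs = [f"{a} {b} study" for a in ai_terms for b in ai_terms if a != b]
net_docs = [f"{a} {b} study" for a in net_terms for b in net_terms if a != b]
docs = ai_docs + net_docs
labels = (["Artificial Intelligence"] * len(ai_docs)
          + ["Computer Networks"] * len(net_docs))


def build_pipeline(max_features=10_000, n_estimators=200, seed=42):
    """TF-IDF vectorizer followed by a Random Forest classifier."""
    return make_pipeline(
        TfidfVectorizer(max_features=max_features),
        RandomForestClassifier(n_estimators=n_estimators, random_state=seed),
    )


def evaluate(texts, targets, seed=42):
    """Return hold-out accuracy, weighted F1, and 5-fold CV accuracies."""
    # Assumed 75/25 stratified split; the abstract does not give the ratio.
    x_train, x_test, y_train, y_test = train_test_split(
        texts, targets, test_size=0.25, stratify=targets, random_state=seed
    )
    pipe = build_pipeline(seed=seed)
    pipe.fit(x_train, y_train)
    predicted = pipe.predict(x_test)

    # 5-fold cross-validation on a fresh, unfitted copy of the pipeline.
    cv_scores = cross_val_score(
        build_pipeline(seed=seed), texts, targets, cv=5, scoring="accuracy"
    )
    return {
        "accuracy": accuracy_score(y_test, predicted),
        "weighted_f1": f1_score(y_test, predicted, average="weighted"),
        "cv_mean": cv_scores.mean(),
        "cv_std": cv_scores.std(),
        "cv_scores": cv_scores,
    }


results = evaluate(docs, labels)
print(f"accuracy={results['accuracy']:.4f}  "
      f"weighted F1={results['weighted_f1']:.4f}  "
      f"CV={results['cv_mean']:.4f} ± {results['cv_std']:.4f}")
```

Weighted F1 averages the per-class F1 scores in proportion to class size, so a high weighted value can coexist with very low scores on small classes, which is the pattern the abstract reports for Computer Vision, Bioinformatics, and Robotics. Passing `average=None` to `f1_score` would return the per-class values instead.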

Published
2026-05-05
How to Cite
Melatawun, V. D., Palembang, C. F., & Pattiasina, N. F. (2026). Klasifikasi Topik Riset Ilmu Komputer di Kawasan ASEAN Menggunakan Algoritma Random Forest. ALGORITHM: Journal of Computer Science and Computational Intelligence, 2(1), 47-56. https://doi.org/10.30598/algorithm.v2i1.47-56
Section
Articles