APPLICATION OF SUPPORT VECTOR MACHINE FOR CLASS IMBALANCE LEARNING TO PREDICT ANTICANCER COMPOUNDS OF MEDICINAL PLANTS IN WEST SULAWESI
Abstract
Indonesian medicinal plants, such as turmeric and soursop, have shown promising anticancer properties through bioactive compounds such as curcumin and soursop extracts. Despite extensive studies on medicinal plants in Indonesia, research on the activity of natural products from West Sulawesi is still limited and focuses mainly on ethnobotany. In this work, we propose a machine-learning approach to predict the anticancer activity of compounds in medicinal plants from West Sulawesi by leveraging high-throughput screening data, especially molecular information from a public database. We applied a Support Vector Machine (SVM) with five sampling techniques to address data imbalance, and we evaluated the performance of each combination to select the best one for handling class imbalance in our dataset. The results show that the undersampling and ADASYN methods can improve the prediction of anticancer activity. Based on these two data-balancing methods, we identified ten potential anticancer compounds from three medicinal plants in West Sulawesi.
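To illustrate the general idea of pairing an SVM with a resampling step, the following is a minimal sketch and not the pipeline used in this study. It assumes scikit-learn and NumPy are available, uses a synthetic dataset in place of the screening data, and implements only random undersampling; the helper `random_undersample`, the RBF kernel, the 5% minority ratio, and balanced accuracy as the metric are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def random_undersample(X, y, seed=0):
    """Hypothetical helper: keep a random subset of each class equal in size
    to the smallest class, so that all classes end up equally represented."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]


# Synthetic stand-in for an imbalanced screening dataset (assumption):
# roughly 5% of samples carry the "active" label 1.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Resample the training split only; the test split keeps its original imbalance.
X_bal, y_bal = random_undersample(X_train, y_train, seed=0)

baseline = SVC(kernel="rbf").fit(X_train, y_train)
balanced = SVC(kernel="rbf").fit(X_bal, y_bal)

print("balanced accuracy, no resampling:",
      balanced_accuracy_score(y_test, baseline.predict(X_test)))
print("balanced accuracy, undersampling:",
      balanced_accuracy_score(y_test, balanced.predict(X_test)))
```

Resampling only the training split keeps the evaluation honest, because the held-out data retains the class ratio the model would face in practice. Oversampling methods such as SMOTE or ADASYN could be swapped in at the same step, for example through the imbalanced-learn package.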
Copyright (c) 2024 Hikmah Hikmah, Nur Hilal A Syahrir, Putri Indi Rahayu
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this Journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.