ENHANCING E-COMMERCE REVIEW SENTIMENT ANALYSIS WITH LINEAR SVM: FEATURE-EXTRACTION AND HYPERPARAMETER COMPARISONS

Keywords: E-commerce, Feature Extraction, Hyperparameter Tuning, Linear SVM, Sentiment Analysis

Abstract

Sentiment analysis of e-commerce reviews is essential for understanding customer perceptions and for supporting service and marketing decisions. However, previous SVM-based studies often report results for only one feature representation or one tuning approach, which offers limited guidance on which configuration works best in practice. This study addresses that gap by benchmarking a linear Support Vector Machine (SVM) on TF-IDF and Word2Vec representations and by comparing three hyperparameter tuning strategies (Grid Search, Random Search, and Optuna) on an Indonesian-language dataset of customer product reviews. The held-out test set contains 871 reviews. Class imbalance is handled by applying SMOTE to the training set only, which yields a balanced training set of 2902 samples. Under stratified validation, evaluated with accuracy, precision, recall, F1 score, and ROC AUC, the best configuration is TF-IDF with an Optuna-tuned linear SVM, which achieves 86.68 percent accuracy, an F1 score of 0.87, and an ROC AUC of about 0.93 to 0.94. For Word2Vec, the best result is obtained with Random Search: 84.38 percent accuracy, an F1 score of 0.84, and an ROC AUC of about 0.92. These findings indicate that TF-IDF is a better match for linear SVM in this setting and that Optuna provides the most consistent gains for TF-IDF. Limitations include the use of binary sentiment labels and an evaluation restricted to linear SVM with simple document-level Word2Vec aggregation, so performance may differ in other domains, platforms, and languages. Future research will examine richer document embeddings, nonlinear and contextual models, multi-class or aspect-level sentiment, and broader cross-platform validation to improve generalizability.
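The point that oversampling touches the training set only, while the held-out test set stays as collected, can be illustrated with a small sketch. The code below is not the authors' implementation: it is a simplified, standard-library-only version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours). The function name `smote_oversample`, the toy 2-D data, the 80/20 split, and the value `k=3` are all assumptions made for illustration.

```python
import math
import random


def smote_oversample(X, y, minority_label, k=3, seed=0):
    """Append synthetic minority rows until the two classes are balanced.

    Simplified SMOTE-style sketch: assumes binary labels and numeric
    feature vectors given as lists of floats.
    """
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_needed = n_majority - len(minority)
    if n_needed <= 0 or len(minority) < 2:
        return list(X), list(y)
    k = min(k, len(minority) - 1)

    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])

    return list(X) + synthetic, list(y) + [minority_label] * n_needed


# Toy 2-D vectors standing in for document features (hypothetical data).
data_rng = random.Random(42)
X_pos = [[data_rng.gauss(1.0, 0.5), data_rng.gauss(1.0, 0.5)] for _ in range(40)]
X_neg = [[data_rng.gauss(-1.0, 0.5), data_rng.gauss(-1.0, 0.5)] for _ in range(10)]

# Stratified hold-out: the last 20% of each class goes to the test set.
X_train = X_pos[:32] + X_neg[:8]
y_train = [1] * 32 + [0] * 8
X_test = X_pos[32:] + X_neg[8:]
y_test = [1] * 8 + [0] * 2

# Oversample the training split only; the test split is never modified.
X_bal, y_bal = smote_oversample(X_train, y_train, minority_label=0, k=3, seed=0)
print(len(X_train), "->", len(X_bal), "training rows;", len(X_test), "test rows")
```

Splitting first and oversampling afterwards keeps synthetic points out of the evaluation data, so the test metrics reflect the original class distribution. By the same logic, and assuming the two classes are exactly balanced, the 2902 training samples reported above would correspond to 1451 per class.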

References

S. Fernandes, R. Panda, V. G. Venkatesh, B. N. Swar, and Y. Shi, “MEASURING THE IMPACT OF ONLINE REVIEWS ON CONSUMER PURCHASE DECISIONS – A SCALE DEVELOPMENT STUDY,” Journal of Retailing and Consumer Services, vol. 68, pp. 1–11, Sep. 2022, doi: https://doi.org/10.1016/j.jretconser.2022.103066

D. M. Brown, S. Pattinson, C. Sutherland, and M. A. P. Davies, “INTERNAL MARKETING AND ORGANIZATIONAL PERFORMANCE: A SYSTEMATIC REVIEW AND FUTURE RESEARCH AGENDA,” J Bus Res, vol. 194, pp. 1–14, May 2025, doi: https://doi.org/10.1016/j.jbusres.2025.115384

M. K. Anam, T. P. Lestari, H. Yenni, T. Nasution, and M. B. Firdaus, “ENHANCEMENT OF MACHINE LEARNING ALGORITHM IN FINE-GRAINED SENTIMENT ANALYSIS USING THE ENSEMBLE,” ECTI Transactions on Computer and Information Technology (ECTI-CIT), vol. 19, no. 2, pp. 159–167, Mar. 2025, doi: https://doi.org/10.37936/ecti-cit.2025192.257815

M. K. Anam et al., “SARA DETECTION ON SOCIAL MEDIA USING DEEP LEARNING ALGORITHM DEVELOPMENT,” Journal of Applied Engineering and Technological Science, vol. 6, no. 1, pp. 225–237, Dec. 2024, doi: https://doi.org/10.37385/jaets.v6i1.5390

N. W. Susanto and H. Suparwito, “SVM-PSO ALGORITHM FOR TWEET SENTIMENT ANALYSIS #BESOKSENIN,” Indonesian Journal of Information Systems (IJIS), vol. 6, no. 1, pp. 36–47, 2023, doi: https://doi.org/10.24002/ijis.v6i1.7551

M. A. K. Raiaan et al., “A SYSTEMATIC REVIEW OF HYPERPARAMETER OPTIMIZATION TECHNIQUES IN CONVOLUTIONAL NEURAL NETWORKS,” Decision Analytics Journal, vol. 11, pp. 1–32, Jun. 2024, doi: https://doi.org/10.1016/j.dajour.2024.100470

M. Zainottah, R. Saputra, Y. Servanda, and I. Rosita, “CRITICAL SENTIMENT ANALYSIS OF TOKOPEDIA ELECTRONIC PRODUCTS USING SVM-LOGISTIC & TF-IDF ENSEMBLE METHODS,” vol. 4, no. 3, pp. 2808–4519, 2025, doi: https://doi.org/10.59934/jaiea.v4i3.1194

N. Z. B. Jannah and K. Kusnawi, “COMPARISON OF NAÏVE BAYES AND SVM IN SENTIMENT ANALYSIS OF PRODUCT REVIEWS ON MARKETPLACES,” Sinkron, vol. 8, no. 2, pp. 727–733, Mar. 2024, doi: https://doi.org/10.33395/sinkron.v8i2.13559

M. Hidayat and A. Wibowo, “SVM OPTIMIZATION WITH INFORMATION GAIN FEATURE SELECTION TO INCREASE THE ACCURACY OF SENTIMENT ANALYSIS OF INCREASING THE COST OF THE HAJJ,” Jurnal Teknik Informatika (Jutif), vol. 5, no. 4, pp. 579–591, Aug. 2024, doi: https://doi.org/10.52436/1.jutif.2024.5.4.2217

F. Farasalsabila et al., “SENTIMENT ANALYSIS FOR IMDB MOVIE REVIEW USING SUPPORT VECTOR MACHINE (SVM) METHOD,” Inform: Jurnal Ilmiah Bidang Teknologi Informasi dan Komunikasi, vol. 8, no. 2, pp. 90–95, Mar. 2023, doi: https://doi.org/10.25139/inform.v8i2.5700

F. Panjaitan, W. Ce, H. Oktafiandi, G. Kanugrahan, Y. Ramdhani, and V. H. C. Putra, “EVALUATION OF MACHINE LEARNING MODELS FOR SENTIMENT ANALYSIS IN THE SOUTH SUMATRA GOVERNOR ELECTION USING DATA BALANCING TECHNIQUES,” Journal of Information Systems and Informatics, vol. 7, no. 1, pp. 461–478, Mar. 2025, doi: https://doi.org/10.51519/journalisi.v7i1.1019

E. Elhosary and O. Moselhi, “EVALUATING NATURAL LANGUAGE PROCESSING ALGORITHMS FOR IMPROVED HAZARD AND OPERABILITY ANALYSIS,” Geodata and AI, vol. 4, pp. 1–15, Sep. 2025, doi: https://doi.org/10.1016/j.geoai.2025.100026

D. S. Asudani, N. K. Nagwani, and P. Singh, “IMPACT OF WORD EMBEDDING MODELS ON TEXT ANALYTICS IN DEEP LEARNING ENVIRONMENT: A REVIEW,” Artif Intell Rev, vol. 56, no. 9, pp. 10345–10425, Sep. 2023, doi: https://doi.org/10.1007/s10462-023-10419-1

A. S. Aribowo, N. H. Cahyana, and Y. Fauziah, “ENHANCING SEMI-SUPERVISED SENTIMENT ANALYSIS THROUGH HYPERPARAMETER TUNING WITHIN ITERATIONS: A COMPARATIVE STUDY USING GRID SEARCH AND RANDOM SEARCH,” in International Conference on Advanced Informatics and Intelligent Information Systems, pp. 248–260, 2024, doi: https://doi.org/10.2991/978-94-6463-366-5_23

A. A. R. Reza and M. S. Rohman, “PREDICTION STUNTING ANALYSIS USING RANDOM FOREST ALGORITHM AND RANDOM SEARCH OPTIMIZATION,” JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING, vol. 7, no. 2, pp. 534–544, Jan. 2024, doi: https://doi.org/10.31289/jite.v7i2.10628

Herianto, B. Kurniawan, Z. H. Hartomi, Y. Irawan, and M. K. Anam, “MACHINE LEARNING ALGORITHM OPTIMIZATION USING STACKING TECHNIQUE FOR GRADUATION PREDICTION,” Journal of Applied Data Sciences, vol. 5, no. 3, pp. 1272–1285, Sep. 2024, doi: https://doi.org/10.47738/jads.v5i3.316

M. K. Anam et al., “IMPROVED PERFORMANCE OF HYBRID GRU-BILSTM FOR DETECTION EMOTION ON TWITTER DATASET,” Journal of Applied Data Sciences, vol. 6, no. 1, pp. 354–365, Jan. 2025, doi: https://doi.org/10.47738/jads.v6i1.459

Y. D. Nugroho H, F. Zakiyabarsi, and A. J. Paramita, “IMPLEMENTASI SMOTE-ENN DAN BORDERLINE SMOTE TERHADAP PERFORMA LIGHTGBM PADA IMBALANCED CLASS,” Rabit: Jurnal Teknologi dan Sistem Informasi Univrab, vol. 10, no. 1, pp. 51–59, Jan. 2025, doi: https://doi.org/10.36341/rabit.v10i1.5436

M. A. Latief, L. R. Nabila, W. Miftakhurrahman, S. Ma’rufatullah, and H. Tantyoko, “HANDLING IMBALANCE DATA USING HYBRID SAMPLING SMOTE-ENN IN LUNG CANCER CLASSIFICATION,” International Journal of Engineering and Computer Science Applications (IJECSA), vol. 3, no. 1, pp. 11–18, Feb. 2024, doi: https://doi.org/10.30812/ijecsa.v3i1.3758

P. P. Putra, M. K. Anam, A. S. Chan, A. Hadi, N. Hendri, and A. Masnur, “OPTIMIZING SENTIMENT ANALYSIS ON IMBALANCED HOTEL REVIEW DATA USING SMOTE AND ENSEMBLE MACHINE LEARNING TECHNIQUES,” Journal of Applied Data Sciences, vol. 6, no. 2, pp. 936–951, May 2025, doi: https://doi.org/10.47738/jads.v6i2.618

F. Suandi et al., “ENHANCING SENTIMENT ANALYSIS PERFORMANCE USING SMOTE AND MAJORITY VOTING IN MACHINE LEARNING ALGORITHMS,” in International Conference on Applied Engineering, Atlantis Press, 2024, pp. 126–138, doi: https://doi.org/10.2991/978-94-6463-620-8_10

M. K. Anam, T. P. Lestari, L. Efrizoni, N. S. Handayani, and I. Andhika, “SENTIMENT ANALYSIS OPTIMIZATION USING ENSEMBLE OF MULTIPLE SVM KERNEL FUNCTIONS,” Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), vol. 9, no. 4, pp. 905–914, Aug. 2025, doi: https://doi.org/10.29207/resti.v9i4.6708

V. Talasila, M. V. Mohan, and N. M. R, “ENHANCING TEXT-TO-IMAGE SYNTHESIS WITH AN IMPROVED SEMI-SUPERVISED IMAGE GENERATION MODEL INCORPORATING N-GRAM, ENHANCED TF-IDF, AND BOW TECHNIQUES,” International Journal of Intelligent Systems and Applications in Engineering, vol. 11, no. 7s, pp. 381–397, 2023, [Online]. Available: www.ijisae.org

B. Kabra and C. Nagar, “CONVOLUTIONAL NEURAL NETWORK BASED SENTIMENT ANALYSIS WITH TF-IDF BASED VECTORIZATION,” J. Integr. Sci. Technol., vol. 11, no. 3, p. 503, 2023, [Online]. Available: http://pubs.thesciencein.org/jist

H. Zhou, “RESEARCH OF TEXT CLASSIFICATION BASED ON TF-IDF AND CNN-LSTM,” in Journal of Physics: Conference Series, IOP Publishing Ltd, pp. 1–8, Jan. 2022, doi: https://doi.org/10.1088/1742-6596/2171/1/012021

S. A. Savittri, A. Amalia, and M. A. Budiman, “A RELEVANT DOCUMENT SEARCH SYSTEM MODEL USING WORD2VEC APPROACHES,” in Journal of Physics: Conference Series, IOP Publishing Ltd, pp. 1–9, Jun. 2021, doi: https://doi.org/10.1088/1742-6596/1898/1/012008

Sugiyanti and A. Fricco, “SHOE REVIEW SENTIMENT ANALYSIS USING MACHINE LEARNING AND DEEP LEARNING WITH WORD2VEC,” Jurnal Elektronik Ilmu Komputer Udayana, vol. 12, no. 1, pp. 215–222, 2023, doi: https://doi.org/10.24843/JLK.2023.v12.i01.p25

M. Lestandy and Abdurrahim, “EFFECT OF WORD2VEC WEIGHTING WITH CNN-BILSTM MODEL ON EMOTION CLASSIFICATION,” Jurnal Nasional Pendidikan Teknik Informatika (JANAPATI), vol. 12, no. 1, pp. 99–107, Mar. 2023, doi: https://doi.org/10.23887/janapati.v12i1.58571

E. M. S. Rochman et al., “CLASSIFICATION OF THESIS TOPICS BASED ON INFORMATICS SCIENCE USING SVM,” in IOP Conference Series: Materials Science and Engineering, IOP Publishing, pp. 1–7, May 2021, doi: https://doi.org/10.1088/1757-899X/1125/1/012033

Published
2026-04-08
How to Cite
[1]
F. Hanum, R. Andrianto, A. S. R. Hutagaol, N. Harahap, and I. R. Munthe, “ENHANCING E-COMMERCE REVIEW SENTIMENT ANALYSIS WITH LINEAR SVM: FEATURE-EXTRACTION AND HYPERPARAMETER COMPARISONS”, BAREKENG: J. Math. & App., vol. 20, no. 3, pp. 2575-2586, Apr. 2026.