ENHANCING E-COMMERCE REVIEW SENTIMENT ANALYSIS WITH LINEAR SVM: FEATURE-EXTRACTION AND HYPERPARAMETER COMPARISONS
Abstract
Sentiment analysis of e-commerce reviews is essential for understanding customer perceptions and supporting service and marketing decisions. However, previous SVM-based studies often report results for only one feature representation or one tuning approach, which gives limited guidance on the most effective practical configuration. This study addresses that gap by benchmarking a linear Support Vector Machine (SVM) on TF-IDF and Word2Vec representations and comparing three hyperparameter tuning strategies (Grid Search, Random Search, and Optuna) on an Indonesian-language dataset of customer product reviews. The held-out test set contains 871 reviews. Class imbalance is handled by applying SMOTE to the training set only, which yields a balanced training set of 2902 samples. Under stratified validation with accuracy, precision, recall, F1 score, and ROC AUC, the best configuration is TF-IDF with an Optuna-tuned linear SVM, which achieves 86.68 percent accuracy, an F1 score of 0.87, and an ROC AUC of about 0.93 to 0.94. For Word2Vec, the best result is obtained with Random Search, reaching 84.38 percent accuracy, an F1 score of 0.84, and an ROC AUC of about 0.92. These findings indicate that TF-IDF is a stronger match for linear SVM in this setting and that Optuna provides the most consistent gains for TF-IDF. Limitations include the use of binary sentiment labels and an evaluation restricted to linear SVM with simple document-level Word2Vec aggregation, so performance may differ across other domains, platforms, and languages. Future research will examine richer document embeddings, nonlinear and contextual models, multi-class or aspect-level sentiment, and broader cross-platform validation to improve generalizability.
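The abstract states that SMOTE is applied to the training set only, leaving the held-out test set untouched. The following is a minimal, standard-library sketch of that idea, not the authors' implementation: the function names `smote_oversample` and `balance_training_set`, the toy 2-D vectors, and the choice of `k` are all illustrative assumptions. In practice a library implementation would normally be used on TF-IDF or Word2Vec document vectors. If the two classes are balanced equally, the reported 2902 training samples would correspond to 1451 per class.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (the SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance_training_set(X_train, y_train, k=5, seed=0):
    """Oversample the smaller class of a binary training set until it
    matches the size of the larger class. Only training data is passed in."""
    by_label = {}
    for x, y in zip(X_train, y_train):
        by_label.setdefault(y, []).append(x)
    minority_label = min(by_label, key=lambda lab: len(by_label[lab]))
    majority_size = max(len(points) for points in by_label.values())
    n_new = majority_size - len(by_label[minority_label])
    new_points = smote_oversample(by_label[minority_label], n_new, k, seed)
    return X_train + new_points, y_train + [minority_label] * n_new


# Toy 2-D vectors standing in for document features (made-up values).
X_train = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4),
           (0.4, 0.2), (1.0, 1.0), (0.9, 1.2)]
y_train = [1, 1, 1, 1, 1, 1, 0, 0]
X_test = [(0.2, 0.2), (1.1, 0.9)]  # held out; never resampled
y_test = [1, 0]

X_bal, y_bal = balance_training_set(X_train, y_train, k=1)
print(y_bal.count(0), y_bal.count(1))  # class counts after balancing
```

Resampling after the split keeps synthetic points, which are derived from training reviews, out of the evaluation data, so the test metrics reflect the original class distribution.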
Copyright (c) 2026 Fauziah Hanum, Richi Andrianto, Anita Sri Rejeki Hutagaol, Nurhanna Harahap, Ibnu Rasyid Munthe

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work, with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g. in institutional repositories or on their websites) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published works.






