ANALYSIS OF MULTILINGUAL OPINION POLARIZATION WITH CROSS-LINGUAL LANGUAGE MODEL-ROBUSTLY OPTIMIZED BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS APPROACH (XLM-ROBERTA)

  • Ghaitsa Shafa Cinta Kananta Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia https://orcid.org/0009-0001-1076-3103
  • Dewi Retno Sari Saputro Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia https://orcid.org/0000-0002-6569-394X
  • Sutanto Sutanto Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia https://orcid.org/0000-0002-8072-1216
Keywords: Multilingual, Opinion Polarization, Sentiment analysis, XLM-RoBERTa

Abstract

The rapid growth of digital communication has intensified opinion exchanges across languages and cultures on social media, enriching public discourse while also increasing the risk of polarization that deepens social divisions. Conventional sentiment analysis methods that rely on translation often distort meaning, overlook emotional nuances, and fail to capture rhetorical devices such as irony and sarcasm, thereby limiting their reliability in multilingual contexts. This study examines the capability of XLM-RoBERTa, a multilingual transformer model pretrained on more than 100 languages, to address these challenges by generating consistent semantic representations and accommodating linguistic and cultural diversity without translation. The research employs bibliometric analysis using VOSviewer on 357 Scopus-indexed publications from 2020 to 2025 to map research trends, combined with a literature review that evaluates XLM-RoBERTa in sentiment and opinion analysis. The findings reveal that although XLM-RoBERTa has been widely employed for sentiment classification, text categorization, and offensive language detection, research explicitly focused on multilingual opinion polarization remains limited. Benchmark evaluations further indicate that XLM-RoBERTa surpasses earlier multilingual models, achieving 79.6% accuracy on XNLI and an 81.2% F1-score on MLQA, confirming its robustness in capturing semantic nuances, cultural variations, and rhetorical complexity without translation. The novelty of this research lies in integrating trend-mapping with methodological evaluation, thereby establishing XLM-RoBERTa as a reliable framework for real-time monitoring of global public opinion, supporting evidence-based policymaking, and advancing scholarly understanding of multilingual communication dynamics in the digital era.
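The abstract's central methodological claim is that a multilingual encoder such as XLM-RoBERTa maps sentences from different languages into one shared semantic space, so opinions can be compared directly without translation. The following is a toy sketch of that idea only: the sentences and embedding vectors below are hand-made stand-ins, not outputs of XLM-RoBERTa or results from the paper.

```python
# Toy illustration of a shared cross-lingual embedding space: if an encoder
# is well aligned, translation-equivalent sentences land close together,
# while sentences with opposite sentiment land far apart. The 3-dimensional
# vectors here are hypothetical placeholders for real sentence embeddings.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical embeddings: the English and Indonesian sentences express the
# same positive opinion, so an aligned encoder should place them near each
# other; the negative English sentence should sit elsewhere in the space.
emb = {
    "The policy is great":        [0.90, 0.10, 0.30],   # English, positive
    "Kebijakan ini sangat bagus": [0.88, 0.14, 0.28],   # Indonesian, positive
    "The policy is terrible":     [-0.70, 0.60, 0.10],  # English, negative
}

parallel = cosine(emb["The policy is great"], emb["Kebijakan ini sangat bagus"])
opposite = cosine(emb["The policy is great"], emb["The policy is terrible"])
print(f"parallel pair similarity: {parallel:.3f}")
print(f"opposite pair similarity: {opposite:.3f}")
```

In practice the embeddings would come from a pretrained multilingual encoder rather than being hand-specified; the point of the sketch is only that sentiment comparison reduces to geometry in one space, with no translation step.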


Author Biography

Dewi Retno Sari Saputro, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Sebelas Maret, Indonesia

Lecturer in the mathematics undergraduate program


Published
2026-01-26
How to Cite
[1]
G. S. C. Kananta, D. R. S. Saputro, and S. Sutanto, “ANALYSIS OF MULTILINGUAL OPINION POLARIZATION WITH CROSS-LINGUAL LANGUAGE MODEL-ROBUSTLY OPTIMIZED BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS APPROACH (XLM-ROBERTA)”, BAREKENG: J. Math. & App., vol. 20, no. 2, pp. 1709–1718, Jan. 2026.
