Utilizing Additional Emotion Features for Hate Speech Detection on Indonesian-Language Social Media

Michael Joy Clement(1*), Hafiz Irsyad(2)
(1) Universitas Multi Data Palembang
(2) Universitas Multi Data Palembang
(*) Corresponding Author
DOI: 10.35889/progresif.v22i1.3338

Abstract

This study examines the importance of incorporating emotion features and of strengthening the temporal robustness of hate-speech detection models in order to improve classification accuracy. The research aims to analyze the impact of emotion features on an IndoBERT-based model and to evaluate the model’s adaptability using an unsupervised self-learning approach. The dataset consists of two corpora, a public dataset from 2019 and Twitter data from 2025, each divided into training, validation, and test sets with an 80%/10%/10% split. Model performance is evaluated using accuracy, precision, recall, and F1-score calculated from the confusion matrix. Experimental results show that adding emotion features increases accuracy by 1-2% across all scenarios. In cross-temporal testing, the supervised model’s performance declines due to linguistic shifts, whereas the self-learning method improves accuracy to as much as 77.67%. These findings indicate that emotion features and self-learning effectively enhance the model’s ability to adapt to evolving language and social context.

Keywords: Emotion; Hate speech detection; IndoBERT
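
The evaluation protocol summarized in the abstract (an 80%/10%/10% split and metrics derived from the confusion matrix) can be illustrated with a minimal sketch. The function names `split_80_10_10` and `binary_metrics`, the random seed, the shuffling step, and the binary label encoding (1 = hate speech, 0 = not hate speech) are assumptions made for illustration only; the abstract does not state how the authors implemented the split or the metric computation.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle a copy of `items` and split it 80/10/10 (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatability
    n_train = len(shuffled) * 8 // 10      # 80% for training
    n_val = len(shuffled) // 10            # 10% for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder (about 10%) for testing
    return train, val, test


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    train, val, test = split_80_10_10(range(100))
    print(len(train), len(val), len(test))

    # Toy labels, not data from the study.
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

The metrics here treat hate speech as the positive class; a multi-label or macro-averaged setup would need per-class counts, which this sketch does not cover.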

 


