Development of a Deep Learning Model with Slang-Aware Embeddings for Detecting Online Gambling Promotions
Abstract
Online gambling promotions on social media platforms such as YouTube often use slang or other non-standard language to evade conventional moderation systems, which increases the spread of illegal content in Indonesia. This study develops a hybrid deep learning model that combines BERT and LSTM to detect online gambling promotions accurately. Data were collected from YouTube comments by web scraping, then cleaned, labeled, and normalized with a semi-automatic slang dictionary. The model was trained with slang-aware embeddings to capture the context of informal language. Performance was evaluated with precision, recall, F1-score, and a confusion matrix. The model reached 96% accuracy and an F1-score of 0.96, indicating a strong balance between precision and recall. These findings indicate that the proposed hybrid approach is effective for automatically detecting online gambling promotional content.
Keywords: Online Gambling Detection; Deep Learning; NLP; Slang-Aware Embeddings; BERT-LSTM
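The cleaning and slang-normalization step described in the abstract can be illustrated with a minimal sketch. It assumes a simple token-level lookup; the function names `clean_text` and `normalize_slang` and every entry in `SLANG_MAP` are hypothetical and are not taken from the study's semi-automatic dictionary.

```python
import re

# Hypothetical slang dictionary: illustrative entries only, not the
# dictionary built in the study.
SLANG_MAP = {
    "depo": "deposit",
    "wd": "withdraw",
    "gcr": "gacor",
}


def clean_text(comment: str) -> str:
    """Lowercase the comment, strip URLs, and drop punctuation and symbols."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove links
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # keep letters, digits, spaces
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace


def normalize_slang(comment: str, slang_map=SLANG_MAP) -> str:
    """Replace each token found in the slang map with its normalized form."""
    tokens = clean_text(comment).split()
    return " ".join(slang_map.get(token, token) for token in tokens)


if __name__ == "__main__":
    print(normalize_slang("Depo 10rb langsung WD!!! https://example.com"))
    # prints: deposit 10rb langsung withdraw
```

A plain lookup like this only handles exact token matches; obfuscated spellings (digit substitutions, inserted separators) would need additional rules, which is presumably where slang-aware embeddings help.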
Abstract (translated from Indonesian)
Online gambling promotions on social media such as YouTube often use non-standard language or slang to avoid detection by moderation systems. This can increase the spread of illegal content in Indonesia. This study aims to develop a hybrid deep learning model that combines BERT and LSTM to detect online gambling promotions more accurately. Data were collected from YouTube comments by web scraping, then processed through text cleaning, labeling, and normalization with a semi-automatic slang dictionary. The model was trained with slang-aware embeddings to capture the context of informal language. Testing used precision, recall, F1-score, and a confusion matrix. The results show an accuracy of 96% with an F1-score of 0.96, indicating a high balance between the model's precision and sensitivity. These findings demonstrate the effectiveness of the hybrid approach for automatically detecting online gambling promotional content.
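As a reminder of how the reported metrics relate to a binary confusion matrix, the following minimal sketch computes precision, recall (sensitivity), F1-score, and accuracy from the four cell counts. The function name `classification_metrics` and the counts in the usage example are hypothetical; they are not the study's confusion matrix.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives for the "gambling promotion" class.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, an F1-score close to the accuracy suggests that neither error type dominates, which is the balance the abstract refers to.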