Music Genre Classification Using a CNN with the ResNet-50 Architecture and LightGBM Gradient Boosting

Roberto Alessandro(1*), Tinaliah Tinaliah(2)
(1) Universitas Multi Data Palembang
(2) Universitas Multi Data Palembang
(*) Corresponding Author
DOI: 10.35889/jutisi.v15i1.3458

Abstract

The rapid growth of digital music has driven the need for accurate and efficient automated music genre classification. This study evaluates a hybrid approach that uses the ResNet-50 architecture as a feature extractor via transfer learning and LightGBM as the classifier. Using the GTZAN dataset represented as Mel-spectrograms, the study compares hyperparameter optimization with Random Search and Grid Search. Among the hybrid configurations, the one optimized with Grid Search performed best, with an accuracy of 81.20%, outperforming Random Search. Nevertheless, the end-to-end ResNet-50 model still outperformed the hybrid approach overall. This suggests that the deep features learned by ResNet-50 already separate the genre classes well, so adding an external ensemble classifier yields no significant improvement. The hybrid approach nonetheless offers useful empirical insight as a stable alternative model.
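The two hyperparameter search strategies compared in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration: the search space values, the `mock_validation_accuracy` stand-in, and the helper names `grid_search` and `random_search` are hypothetical and are not taken from the study. The parameter names `num_leaves`, `learning_rate`, and `n_estimators` are common LightGBM settings, but no model is actually trained.

```python
import itertools
import random

# Hypothetical search space; the values are illustrative, not from the study.
PARAM_GRID = {
    "num_leaves": [15, 31, 63],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 300, 500],
}


def mock_validation_accuracy(params):
    """Stand-in for fitting a classifier on extracted features and scoring it.

    A real run would train a model and return validation accuracy.
    This toy surface simply peaks at (31, 0.05, 300).
    """
    penalty = (
        abs(params["num_leaves"] - 31) / 100
        + abs(params["learning_rate"] - 0.05)
        + abs(params["n_estimators"] - 300) / 2000
    )
    return 0.80 - penalty


def grid_search(grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    best = max(candidates, key=score_fn)
    return best, score_fn(best), len(candidates)


def random_search(grid, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_iter)
    ]
    best = max(candidates, key=score_fn)
    return best, score_fn(best), len(candidates)


grid_best, grid_score, grid_evals = grid_search(PARAM_GRID, mock_validation_accuracy)
rand_best, rand_score, rand_evals = random_search(
    PARAM_GRID, mock_validation_accuracy, n_iter=8
)
print("grid:", grid_best, round(grid_score, 4), "evaluations:", grid_evals)
print("random:", rand_best, round(rand_score, 4), "evaluations:", rand_evals)
```

On a finite grid like this one, the exhaustive search can never score below a random search drawn from the same grid, but it pays for that with more evaluations (27 here versus 8). This is only a property of the toy setup and is not offered as an explanation of the accuracy gap reported in the study.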

Keywords: Convolutional Neural Network; ResNet-50; LightGBM; Mel-spectrogram; Music Genre Classification


 



