Customer Creditworthiness Prediction Using Cost-Sensitive Random Forest

Jolyn Lucretia(1*), Dedy Hermanto(2)
(1) Universitas Multi Data Palembang
(2) Universitas Multi Data Palembang
(*) Corresponding Author
DOI: 10.35889/progresif.v22i1.3401

Abstract

The high risk of credit default and class imbalance in customer data pose major challenges in developing accurate credit scoring systems. Class imbalance biases predictive models toward the majority class, which reduces their ability to detect high-risk borrowers. This study develops a credit scoring model for imbalanced data using the Synthetic Minority Oversampling Technique (SMOTE) and a cost-sensitive Random Forest, with hyperparameters tuned via GridSearchCV. The dataset consists of 32,581 customer records. The best configuration found, with n_estimators = 200, achieves a cross-validation F1-score of 0.813750. On the test data, the model attains an accuracy of 0.927267, precision of 0.911458, recall of 0.738397, and an F1-score of 0.815851, indicating improved and more balanced detection of high-risk borrowers.
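
The reported test F1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below only re-derives that figure from the numbers in the abstract; the function name `f1_from` is illustrative and not taken from the study.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1-score)."""
    return 2 * precision * recall / (precision + recall)


# Test-set figures as reported in the abstract.
reported_precision = 0.911458
reported_recall = 0.738397
reported_f1 = 0.815851

computed_f1 = f1_from(reported_precision, reported_recall)
difference = abs(computed_f1 - reported_f1)

# The gap should be no larger than the rounding of the reported values.
print(f"computed F1 = {computed_f1:.6f}, |difference| = {difference:.1e}")
```

The recomputed value agrees with the reported F1-score to within rounding, so the three test metrics are mutually consistent.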

Keywords: Random Forest; SMOTE; Cost-sensitive learning; GridSearchCV

