Navigating Data Imbalances in Credit Risk Management: A One-Sided Selection Approach

Date

2024

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Credit scoring plays a vital role in mitigating the information asymmetry that is pervasive on peer-to-peer (P2P) lending platforms. A considerable challenge stems from the disparity in loan repayment outcomes: a significant minority of loan applicants default on their loans, while the majority fulfill their repayment obligations. This imbalance in the dataset can introduce bias into a predictive model and lower its performance. To address this issue, data balancing techniques are often employed to improve credit scoring models by generating more balanced datasets. This work constructs a robust credit scoring model capable of precisely assessing the creditworthiness of individuals seeking P2P lending, and in doing so mitigates the distortions that can result from unbalanced data distributions. Four distinct classifiers are employed: Logistic Regression, Random Forest, LightGBM, and Support Vector Machine (SVM). Data balance is achieved with the One-Sided Selection method, while information gain and Pearson correlation determine which features to include. The proposed model thus works on both balanced and unbalanced datasets. Experimental results show that accuracy, precision, recall, and F1-score reach up to 90.41%, 89.51%, 90.40%, and 89.96%, respectively. © 2024 IEEE.
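The abstract names One-Sided Selection as the balancing step but does not describe how it was implemented. As a rough illustration only, the sketch below follows the commonly cited form of the technique: keep every minority sample, keep only the majority samples that a 1-nearest-neighbour rule on a small seed set misclassifies, then drop majority samples that form Tomek links with a minority sample. The function names, the toy data, and the labelling (1 = default, 0 = repaid) are assumptions for this example and are not taken from the paper.

```python
import math
import random


def nearest(i, candidates, X):
    """Return the index in `candidates` (other than i) closest to X[i]."""
    return min((j for j in candidates if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def one_sided_selection(X, y, minority=1, seed=0):
    """Return sorted indices of the samples kept after under-sampling."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]

    # Step 1: seed set = all minority samples plus one random majority sample.
    seed_set = minority_idx + [rng.choice(majority_idx)]
    kept = list(seed_set)

    # Step 2: keep the majority samples that 1-NN on the seed set gets wrong;
    # correctly classified ones are treated as redundant and discarded.
    for i in majority_idx:
        if i not in seed_set and y[nearest(i, seed_set, X)] != y[i]:
            kept.append(i)

    # Step 3: drop majority samples in Tomek links, i.e. pairs of
    # opposite-class points that are each other's nearest neighbour.
    tomek_majority = set()
    for i in kept:
        if y[i] != minority:
            j = nearest(i, kept, X)
            if y[j] == minority and nearest(j, kept, X) == i:
                tomek_majority.add(i)

    return sorted(i for i in kept if i not in tomek_majority)


# Hypothetical toy data: five repaid loans near the origin, one repaid loan
# sitting on the class boundary, and two defaults.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.2, 0.2), (0.1, 0.3),
     (4.9, 5.0), (5.0, 5.0), (5.2, 5.1)]
y = [0, 0, 0, 0, 0, 0, 1, 1]

kept = one_sided_selection(X, y)
X_balanced = [X[i] for i in kept]
y_balanced = [y[i] for i in kept]
```

On real data, a maintained implementation such as `OneSidedSelection` in the imbalanced-learn package (`imblearn.under_sampling`) would normally be preferable to a hand-written loop like this one, which compares every pair of points and is intended only to make the three steps visible.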

Keywords

Credit Scoring, Data Imbalance, Information Gain, One-Sided Selection, Pearson Correlation

Citation

2024 Control Instrumentation System Conference: Guiding Tomorrow: Emerging Trends in Control, Instrumentation, and Systems Engineering, CISCON 2024, 2024
