Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 3 of 3
  • Item
    Navigating Data Imbalances in Credit Risk Management: A One-Sided Selection Approach
    (Institute of Electrical and Electronics Engineers Inc., 2024) Bennehalli, S.J.; Vakkund, S.; Anusha Hegde, H.; Bhowmik, B.
    Credit scoring plays a vital role in mitigating the information asymmetry that is pervasive on peer-to-peer (P2P) lending platforms. A considerable challenge stems from the disparity in loan repayment outcomes: a significant minority of loan applicants default on their loans, while the majority fulfill their repayment obligations. This imbalance in the dataset can introduce bias into the predictive model, which could lower its performance. To address this issue, data balancing techniques are often employed to enhance the performance of credit scoring models by generating more balanced datasets. This work constructs a robust credit scoring model capable of precisely assessing the creditworthiness of individuals seeking P2P lending. Four distinct classifiers are employed: Logistic Regression, Random Forest, LightGBM, and Support Vector Machine (SVM). In doing so, the work mitigates the distortions that can result from unbalanced data distributions. It balances the data with the One-Sided Selection method, while information gain and Pearson correlation mainly determine which features to include. The proposed model thus works on both balanced and unbalanced datasets. Experimental results show that accuracy, precision, recall, and F1-score reach up to 90.41%, 89.51%, 90.40%, and 89.96%, respectively. © 2024 IEEE.
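    The abstract above names One-Sided Selection but does not describe the authors' implementation. The following is a minimal sketch of the textbook procedure (a condensed nearest-neighbour pass followed by Tomek-link removal), assuming numeric feature tuples and binary labels. The function names, the deterministic choice of the first majority sample as the seed, and the toy data are illustrative assumptions, not taken from the paper.

```python
import math


def nearest(pool, point, X, exclude=None):
    """Return the index in `pool` whose sample is closest to `point` (Euclidean)."""
    best, best_dist = None, math.inf
    for i in pool:
        if i == exclude:
            continue
        dist = math.dist(X[i], point)
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def one_sided_selection(X, y, minority=1):
    """Return the sorted indices kept by a simplified One-Sided Selection."""
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_idx = [i for i, label in enumerate(y) if label != minority]

    # Step 1 (condensed nearest neighbour): seed with every minority sample
    # plus one majority sample (assumption: the first one, for determinism),
    # then keep only the majority samples that 1-NN on the seed misclassifies.
    seed = minority_idx + majority_idx[:1]
    kept = seed + [i for i in majority_idx[1:]
                   if y[nearest(seed, X[i], X)] != y[i]]

    # Step 2 (Tomek links): drop each kept majority sample that is a mutual
    # nearest neighbour of a minority sample, since it is borderline or noisy.
    drop = set()
    for i in kept:
        if y[i] == minority:
            continue
        j = nearest(kept, X[i], X, exclude=i)
        if j is not None and y[j] == minority \
                and nearest(kept, X[j], X, exclude=j) == i:
            drop.add(i)
    return sorted(i for i in kept if i not in drop)


# Toy data (hypothetical): label 1 = default (minority), 0 = repaid (majority).
X = [(0, 0), (0.1, 0), (4.8, 5), (5, 5), (8, 8), (8.2, 8)]
y = [0, 0, 0, 1, 1, 1]
print(one_sided_selection(X, y))  # [0, 3, 4, 5]
```

    In this toy run, sample 1 is dropped as redundant (it sits next to the seed majority sample) and sample 2 is dropped because it forms a Tomek link with minority sample 3. Only majority samples are ever removed, which is what makes the selection one-sided.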
  • Item
    Enhancing Big Data Security Through Anomaly Detection
    (Institute of Electrical and Electronics Engineers Inc., 2024) Vakkund, S.; Kumar, S.; Rao, S.; Anusha Hegde, H.; Bhowmik, B.
    Securing the massive and fast-moving data streams typical in Big Data environments presents unique challenges that traditional static security measures simply can't handle. To effectively protect these data flows, we need methods that can analyze traffic in real time and respond swiftly to potential threats. Anomaly detection is one such method, offering an automated way to identify unusual or suspicious activities within Big Data systems. In this study, we explore several widely used anomaly detection algorithms, evaluating their effectiveness in identifying anomalies within large datasets. Specifically, we assess these algorithms using the UNSW-NB15 dataset, aiming to pinpoint which algorithm, or combination of algorithms, is best suited to the demands of Big Data security. © 2024 IEEE.
  • Item
    Louvain community-based label assignment for reject inference in peer-to-peer lending
    (Springer Science and Business Media Deutschland GmbH, 2025) Hegde, A.; Bhowmik, B.; Bennehalli, S.; Vakkund, S.
    The digital transformation in the Financial Technology (FinTech) sector has significantly altered traditional banking and lending practices, giving rise to innovative models like peer-to-peer (P2P) lending. P2P lending platforms directly connect lenders and borrowers online, bypassing conventional financial intermediaries and democratizing access to finance. However, this innovation introduces new complexities in the risk assessment process, necessitating advanced analytical methods. This research presents the Accept-Reject-Net framework, a three-step modeling approach designed to capture and evaluate the complex relationships of loans within the accept and reject datasets, a crucial aspect of P2P lending. Initially, the datasets are separated using two outlier detection methods that efficiently manage extensive datasets by distinguishing inliers (data points adhering to a specific pattern) from outliers (data points deviating from the anticipated pattern). We then generate four distinct merged datasets by applying two different ratios of accept and reject data. In the second stage, borrowers are systematically represented as nodes, with their Euclidean distances as edges, allowing us to extract graph features that effectively capture the structural attributes and similarities of the loans. These graph features are used to classify entries in the reject dataset as either default or non-default. Two distinct approaches, Louvain mode and Louvain threshold, are introduced to facilitate label assignment within detected communities. The threshold is validated across multiple levels to assess its effectiveness in refining label assignment. In the third phase, these features serve as inputs for training five machine learning models, further enhanced with additional labeled data. To ensure the reliability and robustness of our findings, confidence intervals and permutation tests are used to assess the performance differences between different partitions. The 7:1 ratio of accept:reject with the threshold method of Louvain community detection for assigning labels to the rejected dataset improves the metrics, making the model much more effective for reject inference. This comprehensive approach addresses the biases inherent in traditional credit scoring models and enhances the predictive accuracy and fairness of loan evaluations. © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2025.
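    The abstract above does not spell out how the Louvain mode and Louvain threshold variants assign labels, so the sketch below is only one plausible reading. It assumes the communities have already been detected by some Louvain implementation and are passed in as sets of loan identifiers. Under "mode", a rejected loan receives the most common label among the accepted loans in its community. Under "threshold", it is labeled default when the share of defaults among those accepted loans reaches a cutoff. The function name, parameters, and sample identifiers are hypothetical.

```python
from collections import Counter


def assign_labels(communities, known, method="mode", threshold=0.5):
    """Infer labels for unlabeled (rejected) loans from their community.

    communities: iterable of sets of loan ids (e.g. Louvain output).
    known: dict mapping accepted loan id -> 1 (default) or 0 (non-default).
    Returns a dict mapping each newly labeled rejected loan id to 0 or 1.
    """
    assigned = {}
    for community in communities:
        labels = [known[node] for node in community if node in known]
        if not labels:
            # Assumption: a community with no accepted loans gives no
            # evidence, so its rejected loans stay unlabeled.
            continue
        if method == "mode":
            label = Counter(labels).most_common(1)[0][0]
        elif method == "threshold":
            label = int(sum(labels) / len(labels) >= threshold)
        else:
            raise ValueError(f"unknown method: {method}")
        for node in community:
            if node not in known:
                assigned[node] = label
    return assigned


# Hypothetical example: a* are accepted loans, r* are rejected loans.
communities = [{"a1", "a2", "a3", "r1"},
               {"a4", "a5", "a6", "a7", "r2"},
               {"r3"}]
known = {"a1": 0, "a2": 0, "a3": 1, "a4": 1, "a5": 0, "a6": 0, "a7": 0}

print(assign_labels(communities, known, method="mode"))
print(assign_labels(communities, known, method="threshold", threshold=0.3))
```

    With these inputs, the mode rule labels both r1 and r2 as non-default, while a 0.3 threshold labels r1 as default (one of three accepted neighbours defaulted) and leaves r2 as non-default. Lowering the threshold makes the assignment more sensitive to the minority default class, which is one reason to validate it at several levels.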