A Hybrid Weighted Loss Function for Enhanced Protein Interaction Site Prediction
Date
2025
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Accurately predicting protein interaction sites is crucial for applications such as protein design, drug discovery, and functional protein analysis. However, a significant challenge in this task arises from the inherent class imbalance between interacting and non-interacting sites in protein datasets. While data augmentation techniques are commonly used to mitigate this imbalance, they often introduce noise, potentially reducing prediction accuracy. In this study, we present a novel approach to improve protein interaction site prediction by developing a customized loss function that combines focal loss and cost-sensitive loss, specifically designed to address class imbalance without relying on data augmentation. Our model, which integrates graph convolutional networks (GCNs) to process evolutionary and structural features of proteins, is evaluated using robust performance metrics suited for imbalanced data: Matthews Correlation Coefficient (MCC) and Area Under Precision-Recall Curve (AUPRC). We evaluate the proposed method on the Test_60 dataset, achieving an MCC of 0.342 and an AUPRC of 0.425, providing a modest improvement over the standard cross-entropy loss function. These findings highlight the effectiveness of our tailored loss function in handling class imbalance and improving prediction performance in protein interaction site prediction. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
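The abstract states that the loss combines focal loss and cost-sensitive loss but does not give the formula. The sketch below shows one plausible way to combine the two for binary site labels. It is an assumption for illustration, not the authors' definition: the convex mixing weight `lam`, the focusing parameter `gamma`, the class costs `pos_weight` and `neg_weight`, and the function name are all hypothetical.

```python
import math


def hybrid_weighted_loss(probs, labels, gamma=2.0, pos_weight=5.0,
                         neg_weight=1.0, lam=0.5, eps=1e-7):
    """Mean of a focal term and a cost-sensitive term per residue.

    probs  : predicted probabilities of the interacting class (1)
    labels : ground-truth labels, 1 = interacting, 0 = non-interacting
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # Probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        ce = -math.log(p_t)
        # Focal term: down-weights residues that are already well classified.
        focal = (1.0 - p_t) ** gamma * ce
        # Cost-sensitive term: errors on the minority class cost more.
        cost = (pos_weight if y == 1 else neg_weight) * ce
        total += lam * focal + (1.0 - lam) * cost
    return total / len(probs)


# A missed interacting site is penalized more than a missed non-interacting one.
missed_positive = hybrid_weighted_loss([0.1], [1])
missed_negative = hybrid_weighted_loss([0.9], [0])
```

With `lam=0` and equal class costs of 1, the sketch reduces to standard cross-entropy, the baseline named in the abstract. In practice the same expression would be written with the tensor operations of whatever framework trains the GCN.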
Keywords
AUPRC, Class-imbalance, Cost-sensitive loss, Focal loss, MCC, Protein-interaction site
Citation
Lecture Notes in Networks and Systems, 2025, Vol. 1371 LNNS, pp. 111-123
