Tighter Clusters, Safer Code? Improving Vulnerability Detection with Enhanced Contrastive Loss

Date

2023

Publisher

Association for Computational Linguistics (ACL)

Abstract

Distinguishing vulnerable code from non-vulnerable code is challenging due to high inter-class similarity. Supervised contrastive learning (SCL) improves embedding separation but struggles with intra-class clustering, especially when variations within the same class are subtle. We propose Cluster-Enhanced Supervised Contrastive Loss (CESCL), an extension of SCL with a distance-based regularization term that tightens intra-class clustering while maintaining inter-class separation. In experiments on CodeBERT and GraphCodeBERT trained with Binary Cross Entropy (BCE), BCE + SCL, and BCE + CESCL, our method improves F1 score by 1.76% on CodeBERT and 4.1% on GraphCodeBERT, demonstrating its effectiveness in code vulnerability detection and its broader applicability to high-similarity classification tasks. © 2025 Association for Computational Linguistics.
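
The abstract does not give the exact form of the distance-based regularization term, so the following is only a minimal sketch under stated assumptions: the regularizer is taken to be the mean squared distance from each L2-normalized embedding to its class centroid, added to a standard supervised contrastive loss with a weight `lam`. The function names, the temperature `tau`, and the weight `lam` are hypothetical and are not taken from the paper.

```python
import math

def normalize(v):
    # L2-normalize a vector; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def supcon_loss(embs, labels, tau=0.1):
    # Standard supervised contrastive loss, averaged over anchors
    # that have at least one positive (same-label) sample.
    z = [normalize(e) for e in embs]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = [dot(z[i], z[j]) / tau for j in range(n) if j != i]
        m = max(logits)  # log-sum-exp trick for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        log_probs = [dot(z[i], z[p]) / tau - log_denom for p in positives]
        total += -sum(log_probs) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

def intra_class_distance(embs, labels):
    # ASSUMED regularizer: mean squared distance of each normalized
    # embedding to the centroid of its own class.
    z = [normalize(e) for e in embs]
    by_class = {}
    for v, y in zip(z, labels):
        by_class.setdefault(y, []).append(v)
    total = 0.0
    for members in by_class.values():
        dim = len(members[0])
        centroid = [sum(v[d] for v in members) / len(members) for d in range(dim)]
        total += sum(sum((v[d] - centroid[d]) ** 2 for d in range(dim)) for v in members)
    return total / len(z)

def cescl_sketch(embs, labels, tau=0.1, lam=0.5):
    # Contrastive term keeps classes apart; the weighted distance term
    # pulls same-class embeddings toward their centroid.
    return supcon_loss(embs, labels, tau) + lam * intra_class_distance(embs, labels)

labels = [1, 1, 0, 0]  # 1 = vulnerable, 0 = non-vulnerable (toy labels)
tight = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
loose = [[1.0, 0.0], [1.0, 0.5], [0.0, 1.0], [0.5, 1.0]]
print(cescl_sketch(tight, labels), cescl_sketch(loose, labels))
```

In a training setup like the one the abstract describes, a term of this kind would be added to the BCE classification loss; the loosely clustered toy batch above yields a larger value than the tightly clustered one, which is the behavior the regularizer is meant to penalize.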

Citation

Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies: Long Papers, NAACL-HLT 2025, 2023, Vol. 4, pp. 247-252
