Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506


Now showing 1 - 3 of 3
  • Item
    Tighter Clusters, Safer Code? Improving Vulnerability Detection with Enhanced Contrastive Loss
    (Association for Computational Linguistics (ACL), 2023) Kapparad, P.; Mohan, B.R.
Distinguishing vulnerable code from non-vulnerable code is challenging due to high inter-class similarity. Supervised contrastive learning (SCL) improves embedding separation but struggles with intra-class clustering, especially when variations within the same class are subtle. We propose CLUSTER-ENHANCED SUPERVISED CONTRASTIVE LOSS (CESCL), an extension of SCL with a distance-based regularization term that tightens intra-class clustering while maintaining inter-class separation. Evaluating CodeBERT and GraphCodeBERT with Binary Cross Entropy (BCE), BCE + SCL, and BCE + CESCL, we find that our method improves the F1 score by 1.76% on CodeBERT and 4.1% on GraphCodeBERT, demonstrating its effectiveness in code vulnerability detection and broader applicability to high-similarity classification tasks. © 2025 Association for Computational Linguistics.
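The abstract does not give the exact form of CESCL's distance-based regularizer, so the following is a minimal NumPy sketch, assuming the regularizer penalizes each embedding's squared distance to its class centroid on top of a standard supervised contrastive loss; the weight `lam` and temperature `tau` are illustrative, not the paper's values.

```python
import numpy as np

def supervised_contrastive_loss(emb, labels, tau=0.1):
    # emb: (N, d) L2-normalized embeddings; labels: (N,) integer class labels
    sim = emb @ emb.T / tau
    np.fill_diagonal(sim, -np.inf)  # exclude self-pairs from the softmax
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    loss, n = 0.0, len(labels)
    for i in range(n):
        pos = (labels == labels[i]) & (np.arange(n) != i)
        if pos.any():
            # average negative log-likelihood over positives of anchor i
            loss += -log_prob[i, pos].mean()
    return loss / n

def cescl(emb, labels, tau=0.1, lam=0.1):
    # Hypothetical distance-based term: mean squared distance of each
    # embedding to its class centroid, tightening intra-class clusters.
    classes = np.unique(labels)
    reg = 0.0
    for c in classes:
        cls = emb[labels == c]
        reg += ((cls - cls.mean(axis=0)) ** 2).sum(axis=1).mean()
    reg /= len(classes)
    return supervised_contrastive_loss(emb, labels, tau) + lam * reg
```

In practice the embeddings would come from CodeBERT or GraphCodeBERT and the loss would be combined with BCE, as in the paper's BCE + CESCL setting.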
  • Item
    Subjective Answer Evaluation Using Keyword Similarity and Regression Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2024) Kapparad, P.
This paper introduces a novel approach to automated grading of subjective answers using Natural Language Processing (NLP) techniques. The motivation for the project arises from the need to simplify subjective answer evaluation, which is a repetitive and time-consuming task when done manually. Since no dataset is available for the topic presented, we created our own dataset consisting of evaluated student answers to 1- and 3-mark questions on topics in Social Science. For 1-mark questions, we employed a keyword-similarity-based grading system. For the 3-mark questions, many techniques were explored, including BERT, DistilBERT, and RoBERTa, none of which achieved noteworthy results. An alternative approach combining keyword similarity and sentence-to-sentence similarity was created for the 3-mark questions, which slightly outperformed the previously mentioned techniques. The results for evaluation of 1-mark questions were promising, achieving 90% accuracy. However, there remains significant room for improvement in the evaluation of longer answers. A key insight from our study is that the scope for improvement is directly related to increasing the quantity and quality of the dataset. This research adds to the ongoing conversation about automation of subjective answer evaluation, aiming to make grading methods more efficient and hassle-free in the future. © 2024 IEEE.
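The abstract does not specify the matching rule behind the keyword-similarity grader for 1-mark questions, so the following is a minimal sketch assuming a simple token-overlap ratio against a reference keyword list, with a hypothetical pass threshold:

```python
import re

def keyword_score(student_answer, reference_keywords, threshold=0.5):
    # Hypothetical grader: tokenize the answer, count how many reference
    # keywords appear, and award the mark if the overlap ratio clears
    # the threshold. The paper's actual similarity measure may differ.
    tokens = set(re.findall(r"[a-z0-9]+", student_answer.lower()))
    keywords = {k.lower() for k in reference_keywords}
    overlap = len(tokens & keywords) / len(keywords)
    return 1 if overlap >= threshold else 0
```

A production grader would likely also handle stemming and synonyms, which a raw token match misses.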
  • Item
    Comparative Analysis of Religious Texts: NLP Approaches to the Bible, Quran, and Bhagavad Gita
    (Association for Computational Linguistics (ACL), 2025) Mahit Nandan, A.D.; Godbole, I.; Kapparad, P.; Bhattacharjee, S.
Religious texts have long influenced cultural, moral, and ethical systems, and have shaped societies for generations. Scriptures like the Bible, the Quran, and the Bhagavad Gita offer insights into fundamental human values and societal norms. Analyzing these texts with advanced methods can help improve our understanding of their significance and the similarities or differences between them. This study uses Natural Language Processing (NLP) techniques to examine these religious texts. Latent Dirichlet Allocation (LDA) is used for topic modeling to explore key themes, while GloVe embeddings and Sentence Transformers are used to compare topics between the texts. Sentiment analysis using Valence Aware Dictionary and sEntiment Reasoner (VADER) assesses the emotional tone of the verses, and corpus distance measurement is done to analyze semantic similarities and differences. The findings reveal unique and shared themes and sentiment patterns across the Bible, the Quran, and the Bhagavad Gita, offering new perspectives in computational religious studies. © 2025 Association for Computational Linguistics.