Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 5 of 5
  • Item
    NITK_NLP at CheckThat! 2021: Ensemble transformer model for fake news classification
    (CEUR-WS, 2021) LekshmiAmmal, R.L.; Anand Kumar, M.
    Social media has become an inevitable part of our lives, as we depend on it for much of the news around us. However, the volume of false information propagated through it far exceeds that of genuine news, making it a peril to society. In this paper, we propose a model for fake news classification as part of the CLEF2021 CheckThat! Lab shared task, which comprised multi-class fake news detection and topical domain classification of news articles. We used an ensemble of pre-trained transformer-based models, which helped us achieve 4th and 1st positions on the leaderboards of the two tasks. We achieved an F1-score of 0.4483 against a top score of 0.8376 in one task and a score of 0.8813 in the other. © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
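The abstract does not specify how the transformer outputs were combined, but a common ensembling scheme is soft voting: average each model's class probabilities and pick the highest-scoring class. A minimal sketch of that general idea, with made-up placeholder probabilities rather than outputs of the actual fine-tuned models:

```python
# Soft-voting ensemble sketch: average per-class probabilities from
# several models, then take the argmax of the averaged vector.

def soft_vote(prob_lists):
    """Average class-probability vectors from several models and
    return the index of the highest-scoring class."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Hypothetical outputs of three fine-tuned models for one article,
# over illustrative classes [false, partially-false, true, other]:
preds = [
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
print(soft_vote(preds))  # class 0 wins on average
```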
  • Item
    NITK-IT NLP at CheckThat! 2022: Window based approach for Fake News Detection using transformers
    (CEUR-WS, 2022) LekshmiAmmal, H.R.; Anand Kumar, A.M.
    Misinformation is a severe threat to society and spreads mainly through online social media; the amount of misinformation generated and propagated far exceeds that of authentic news. In this paper, we propose a model for the shared task on fake news classification at the CLEF2022 CheckThat! Lab, which comprised mono-lingual multi-class fake news detection in English and a cross-lingual English-German task. We employed a transformer-based model with overlapping window strides, which helped us achieve 7th and 2nd positions out of 25 and 8 participants, respectively, on the final leaderboards of the two tasks. We obtained F1 scores of 0.2980 and 0.2245 against top scores of 0.3391 and 0.2898. © 2022 Copyright for this paper by its authors.
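The overlapping-window idea is how transformer models with fixed input limits handle long articles: split the token sequence into fixed-size windows with a stride smaller than the window, so consecutive windows overlap and no context is lost at the boundaries. A minimal sketch; the window and stride sizes are illustrative, not the paper's settings, and in the real system each window would be fed to a transformer and the per-window predictions aggregated:

```python
# Split a token sequence into overlapping fixed-size windows.
# stride < size means consecutive windows share (size - stride) tokens.

def windows(tokens, size=8, stride=4):
    """Return overlapping windows of `tokens`."""
    out = []
    for start in range(0, max(len(tokens) - size, 0) + 1, stride):
        out.append(tokens[start:start + size])
    return out

toks = list(range(20))  # stand-in for a tokenized article
for w in windows(toks):
    print(w)  # each window overlaps the previous one by 4 tokens
```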
  • Item
    Subjective Answer Evaluation Using Keyword Similarity and Regression Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2024) Kapparad, P.
    This paper introduces a novel approach to the automated grading of subjective answers using Natural Language Processing (NLP) techniques. The motivation for the project arises from the need to simplify subjective answer evaluation, which is a repetitive and time-consuming task when done manually. Since no dataset is available for the topic presented, we created our own dataset of evaluated student answers to 1- and 3-mark questions on topics in Social Science. For 1-mark questions, we employed a keyword-similarity-based grading system. For the 3-mark questions, many techniques were explored, including BERT, DistilBERT, and RoBERTa, none of which achieved noteworthy results. An alternative approach combining keyword similarity and sentence-to-sentence similarity was then created for the 3-mark questions, which slightly outperformed the previously mentioned techniques. The results for the evaluation of 1-mark questions were promising, achieving 90% accuracy. However, there remains significant room for improvement in evaluating longer answers. A key insight from our study is that the scope for improvement is directly related to increasing the quantity and quality of the dataset. This research adds to the ongoing conversation about the automation of subjective answer evaluation, aiming to make grading methods more efficient and hassle-free in the future. © 2024 IEEE.
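Keyword-similarity grading for short answers can be sketched as scoring the fraction of expected keywords that appear in the student's answer and awarding the mark above a cutoff. The keyword list and threshold below are invented for illustration; the abstract does not specify the paper's exact matching scheme:

```python
# Keyword-similarity grading sketch for short (1-mark) answers:
# score = fraction of expected keywords present in the answer.

def keyword_score(answer, keywords):
    words = set(answer.lower().split())
    hits = sum(1 for k in keywords if k.lower() in words)
    return hits / len(keywords)

def grade(answer, keywords, threshold=0.5):
    """Award the mark when enough expected keywords appear."""
    return 1 if keyword_score(answer, keywords) >= threshold else 0

kw = ["monsoon", "rainfall", "agriculture"]  # hypothetical expected keywords
print(grade("The monsoon brings rainfall that supports farming", kw))  # 2/3 hit -> 1
```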
  • Item
    SCaLAR NITK at Touché: Comparative Analysis of Machine Learning Models for Human Value Identification
    (CEUR-WS, 2024) Praveen, K.; Darshan, R.K.; Reddy, C.T.; Anand Kumar, M.
    This study delves into the task of detecting human values in textual data using Natural Language Processing (NLP) techniques. With the increasing use of social media and other platforms, an abundance of data is generated. Identifying human values in these texts helps us understand and analyze human behavior better, because these values are the core principles that influence it. Analyzing human values is useful not only in research but also in practical applications such as sentiment evaluation, market analysis, and personalized recommendation systems. The study evaluates the performance of different existing models and proposes novel techniques. The models used range from simple machine learning models such as SVM, KNN, and Random Forest for classification on BERT embeddings, to transformer models such as BERT and RoBERTa for text classification, and Large Language Models such as Mistral-7B. The task to be performed is multilabel, multitask classification. The QLoRA quantization method is used to reduce the size of the model weights, making training computationally less expensive, and a Supervised Fine-Tuning (SFT) trainer is used to fine-tune the LLMs for this specific task. We found that the LLMs outperformed all other models. © 2024 Copyright for this paper by its authors.
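In a multilabel setup like this, each text can carry several human values at once, so classification reduces to an independent yes/no decision per label rather than picking a single class. A minimal sketch of that decision step; the value names, scores, and threshold are placeholders, not the competition's label set or real model outputs:

```python
# Multilabel prediction sketch: threshold each label's score independently
# and return every label that clears the cutoff (possibly none, possibly all).

VALUES = ["security", "achievement", "benevolence", "tradition"]  # illustrative

def multilabel_predict(scores, threshold=0.5):
    """Return every label whose score clears the threshold."""
    return [v for v, s in zip(VALUES, scores) if s >= threshold]

print(multilabel_predict([0.9, 0.2, 0.7, 0.1]))  # ['security', 'benevolence']
```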
  • Item
    Multimodal Propaganda Detection in Memes with Tolerance-Based Soft Computing Method
    (Springer Science and Business Media Deutschland GmbH, 2024) Kelkar, S.; Ravi, S.; Ramanna, S.; Anand Kumar, M.
    This paper presents a tolerance-based near sets classifier applied to a multimodal propaganda detection task using text and image data from memes. Internet memes consist of an image superimposed with text and are very popular on social media. They are often used as part of disinformation campaigns, in which social media users are influenced via a number of rhetorical and psychological techniques known as persuasion techniques. The focus of this paper is a subtask of the SemEval-2024 Multilingual Detection of Persuasion Techniques in Memes competition: detecting the presence or absence of a persuasion technique. We introduce a multimodal Tolerance Near Sets Classifier (MTNSC) trained on a combination of word embeddings (RoBERTa) and pre-trained image features (ResNet and ResNet-Memes) using the competition data. This work extends our earlier work in the Natural Language Processing domain, which introduced a tolerance-based near sets sentiment classifier. The proposed MTNSC achieves a macro-F1 score of 70.15% and a micro-F1 score of 75.33% on the test dataset, demonstrating satisfactory performance of TNS-based classifiers in a multimodal setting. Our findings point to the model's effectiveness when compared with several leading submissions based on deep learning techniques. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
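The core idea behind tolerance near sets is a tolerance relation: two feature vectors are treated as indistinguishable when their distance falls within a threshold epsilon. A minimal sketch in that spirit, classifying a test point by the majority class among training points tolerant with it; the toy vectors, epsilon, and labels are invented, not the paper's RoBERTa/ResNet features or the actual MTNSC:

```python
# Tolerance-relation classifier sketch: x ~ y iff dist(x, y) <= eps,
# and a test point takes the majority class of its tolerant neighbors.
import math

def tolerant(x, y, eps):
    return math.dist(x, y) <= eps

def tns_classify(x, train, eps=1.0, default=0):
    """Majority class over training points within the tolerance relation."""
    votes = {}
    for feats, label in train:
        if tolerant(x, feats, eps):
            votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get) if votes else default

# Two toy clusters standing in for "no persuasion" (0) and "persuasion" (1):
train = [([0.0, 0.0], 0), ([0.2, 0.1], 0), ([3.0, 3.0], 1), ([3.1, 2.9], 1)]
print(tns_classify([0.1, 0.1], train))  # near the class-0 cluster
```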