Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
HALE Lab NITK at Touché 2024: A Hybrid Approach for Identifying Political Ideology and Power in Multilingual Parliamentary Speeches (CEUR-WS, 2024)
Simhadri, S.; Patel, M.M.; Sowmya Kamath, S.
This article presents an approach for determining the political views and stances of speakers in parliamentary discussions, including whether they support or oppose the government. The work was carried out as part of Touché 2024 Task 2, “Ideology and Power Identification in Parliamentary Debates”. Two systems were developed: the first employs traditional machine learning methods with TF-IDF embeddings, while the second uses advanced NLP techniques with the LASER encoder for multilingual embeddings. Both systems incorporate standard preprocessing techniques and integrate a variety of models, after which a voting classifier combines the predictions from both approaches. Experiments revealed that this comprehensive framework effectively addresses the complexities and nuances of political discourse, providing valuable insights into speakers' ideologies and governing statuses within parliamentary debates. © 2024 Copyright for this paper by its authors.

A Comprehensive Analysis of Classification Techniques for Effective Multi-class Research Article Categorization on an Imbalanced Dataset (Springer Science and Business Media Deutschland GmbH, 2025)
Gowhar, S.; Kempaiah, P.; Sowmya Kamath, S.; Sugumaran, V.
Categorizing scientific articles into specific research fields is a challenging problem, affected by the volume and variety of literature published. However, existing classification systems often suffer from limitations regarding taxonomy or the models used for classification. This article presents a comprehensive analysis of approaches built on Sentence Transformer embeddings combined with machine learning algorithms, neural networks, and Transformers to classify articles into 123 predefined classes on a heavily imbalanced dataset. The effectiveness of Large Language Models (LLMs) for generating synthetic data is also examined, along with synonym augmentation, SMOTE, and 1D CNNs for text classification. The best-performing model is a hierarchical classification model trained on MP-Net sentence embeddings, which achieved an accuracy of 78%, outperforming all other models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
