A Comprehensive Analysis of Classification Techniques for Effective Multi-class Research Article Categorization on an Imbalanced Dataset

Gowhar, S.; Kempaiah, P.; Sowmya Kamath, S.; Sugumaran, V.

dc.contributor.author	Gowhar, S.
dc.contributor.author	Kempaiah, P.
dc.contributor.author	Sowmya Kamath, S.
dc.contributor.author	Sugumaran, V.
dc.date.accessioned	2026-02-06T06:33:26Z
dc.date.issued	2025
dc.description.abstract	Categorizing scientific articles into specific research fields is a challenging problem, made harder by the volume and variety of published literature. Existing classification systems often suffer from limitations in their taxonomy or in the models used for classification. This article presents a comprehensive analysis of approaches built on Sentence Transformer embeddings combined with Machine Learning algorithms, Neural Networks, and Transformers to classify articles into 123 predefined classes on a heavily imbalanced dataset. The effectiveness of Large Language Models (LLMs) for generating synthetic data is also evaluated, along with synonym augmentation, SMOTE, and 1D CNNs for text classification. The best-performing model is a hierarchical classification model trained on MP-Net sentence embeddings that achieved an accuracy of 78%, outperforming all other models. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
dc.identifier.citation	Communications in Computer and Information Science, 2025, Vol. 2461 CCIS, p. 106-118
dc.identifier.issn	18650929
dc.identifier.uri	https://doi.org/10.1007/978-3-031-96473-2_8
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/28640
dc.publisher	Springer Science and Business Media Deutschland GmbH
dc.subject	Document Classification
dc.subject	Explainable AI
dc.subject	Large Language Models
dc.subject	Natural Language Processing
dc.subject	Transformers
dc.title	A Comprehensive Analysis of Classification Techniques for Effective Multi-class Research Article Categorization on an Imbalanced Dataset

Collections

Conference Papers
