Text Augmentation for Enhancing the Text Classification for Low Resource Language

Kumar, K.; Rudra, B.

Date

2025

Authors

Kumar, K.

Rudra, B.

Publisher

Springer

Abstract

Text augmentation is the technique of producing additional data from a small corpus to improve the performance of prediction models. This work focuses on the pivotal role of text augmentation in Natural Language Processing (NLP). It tackles two significant challenges within the field: first, the adaptation of augmentation techniques for low-resource languages, where labeled data is scarce, and second, the enhancement of text classification across diverse domains, including sentiment analysis, topic classification, and spam detection. This research leverages state-of-the-art transformer-based models such as BERT and GPT-2 to ensure the adaptability and effectiveness of these augmentation techniques. The goal is to make NLP more accessible and impactful for low-resource languages by overcoming the challenges of data scarcity, and to improve the accuracy and applicability of text classification models across a wide range of applications. Using two Swedish datasets as a paradigm for low-resource languages, we demonstrate the effectiveness of our techniques through thorough empirical testing, as measured by F1 scores. Our findings highlight how enhanced data improves classification performance in situations with limited resources. By exploring various augmentation methods and their applications, this research contributes to advancing NLP solutions for both language-specific and classification-related challenges, pushing the boundaries of text augmentation’s capabilities in the field of NLP. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.

Keywords

Back translation, BERT, GPT, NLP, T5, Text augmentation, Text classification

Citation

SN Computer Science, 2025, 6, 6, pp. -

URI

https://doi.org/10.1007/s42979-025-04120-z
https://idr.nitk.ac.in/handle/123456789/20155

Collections

Journal Articles
