Text Augmentation for Enhancing the Text Classification for Low Resource Language
| dc.contributor.author | Kumar, K. | |
| dc.contributor.author | Rudra, B. | |
| dc.date.accessioned | 2026-02-03T13:19:35Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Text augmentation is the technique of producing more data from a small corpus to improve the performance of prediction models. This work focuses on the pivotal role of text augmentation in Natural Language Processing (NLP). It tackles two significant challenges within the field: first, the adaptation of augmentation techniques for low-resource languages, where labeled data is scarce, and second, the enhancement of text classification across diverse domains, including sentiment analysis, topic classification, and spam detection. This research leverages state-of-the-art transformer-based models such as BERT and GPT-2 to ensure the adaptability and effectiveness of these augmentation techniques. The goal is to make NLP more accessible and impactful for low-resource languages by overcoming the challenges of data scarcity, and to improve the accuracy and applicability of text classification models across a wide range of applications. Using two Swedish datasets as a representative case of low-resource languages, we demonstrate the effectiveness of our techniques through thorough empirical testing, as measured by F1 scores. Our findings highlight how augmented data improves classification performance in situations with limited resources. By exploring various augmentation methods and their applications, this research contributes to advancing NLP solutions for both language-specific and classification-related challenges, pushing the boundaries of text augmentation's capabilities in the field of NLP. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025. | |
| dc.identifier.citation | SN Computer Science, 2025, 6, 6, pp. - | |
| dc.identifier.issn | 2662995X | |
| dc.identifier.uri | https://doi.org/10.1007/s42979-025-04120-z | |
| dc.identifier.uri | https://idr.nitk.ac.in/handle/123456789/20155 | |
| dc.publisher | Springer | |
| dc.subject | Back translation | |
| dc.subject | BERT | |
| dc.subject | GPT | |
| dc.subject | NLP | |
| dc.subject | T5 | |
| dc.subject | Text augmentation | |
| dc.subject | Text classification | |
| dc.title | Text Augmentation for Enhancing the Text Classification for Low Resource Language |
