Legal Text Analysis Using Pre-trained Transformers

dc.contributor.author: Prajwal, M.P.
dc.contributor.author: Anand Kumar, A.M.
dc.date.accessioned: 2026-02-06T06:35:36Z
dc.date.issued: 2022
dc.description.abstract: In this paper, we investigate the application of pre-trained transformers to text classification and similarity identification in the legal domain. We conduct several experiments applying various pre-trained transformer models to predict the descriptor of a law or case from its text and to identify similar cases. We consider an Indian Supreme Court judicial cases dataset containing cases and statutes, and the EURLEX dataset containing approximately 57,000 documents and 4,000 labels. EURLEX is a collection of treaties and laws related to the European Union. We preprocess the texts in the datasets and obtain embeddings from pre-trained transformers. We then use these embeddings as input to an LSTM/BiLSTM layer to classify texts or predict similarity. Our results show that pre-trained transformers perform sufficiently well when the text to be classified or compared is short rather than long. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
dc.identifier.citation: Lecture Notes in Electrical Engineering, 2022, Vol. 858, p. 493-504
dc.identifier.issn: 18761100
dc.identifier.uri: https://doi.org/10.1007/978-981-19-0840-8_37
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/29945
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.subject: BERT
dc.subject: Deep learning
dc.subject: Legal analytics
dc.title: Legal Text Analysis Using Pre-trained Transformers
