Legal Text Analysis Using Pre-trained Transformers
Date
2022
Authors
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
In this paper, we investigate the application of pre-trained transformers to text classification and similarity identification in the legal domain. We conduct several experiments applying various pre-trained transformer models to predict the descriptor of a law or case from its text and to identify similar cases. We consider an Indian Supreme Court judicial cases dataset containing cases and statutes, and the EURLEX dataset containing approximately 57,000 documents and 4,000 labels. EURLEX is a collection of treaties and laws related to the European Union. We preprocess the texts in each dataset and obtain embeddings from pre-trained transformers. We then use these embeddings as input to an LSTM/BiLSTM layer for classification or similarity prediction. Our results show that pre-trained transformers perform well on short texts but are less effective on long ones. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
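The pipeline the abstract describes (pre-trained transformer embeddings fed into an LSTM/BiLSTM layer for classification) could be sketched as below. This is a minimal illustration, not the authors' exact setup: the embedding dimension, hidden size, and label count are assumptions, and a random tensor stands in for the transformer output (in practice, e.g., `BertModel(...).last_hidden_state`).

```python
# Sketch: BiLSTM classification head over transformer token embeddings.
# All dimensions below are illustrative assumptions, not the paper's config.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, embed_dim=768, hidden_dim=256, num_labels=4000):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, embeddings):
        # embeddings: (batch, seq_len, embed_dim), e.g. BERT token embeddings
        _, (h_n, _) = self.lstm(embeddings)
        # concatenate final forward and backward hidden states
        h = torch.cat([h_n[-2], h_n[-1]], dim=-1)
        return self.fc(h)  # (batch, num_labels) logits

# Stand-in for transformer output; a real run would use a frozen
# pre-trained model's last hidden states here.
dummy_embeddings = torch.randn(2, 128, 768)
logits = BiLSTMClassifier()(dummy_embeddings)
print(tuple(logits.shape))  # (2, 4000)
```

Freezing the transformer and training only the LSTM head keeps the experiment cheap; the abstract's observation about text length fits this design, since long documents must be truncated or pooled before the recurrent layer.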
Description
Keywords
BERT, Deep learning, Legal analytics
Citation
Lecture Notes in Electrical Engineering, 2022, Vol. 858, p. 493-504
