Legal Text Analysis Using Pre-trained Transformers

Date

2022

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

In this paper, we investigate the application of pre-trained transformers to text classification and similarity identification in the legal domain. We run several experiments applying various pre-trained transformer models to predict the descriptor of a law or case from its text and to identify similar cases. We consider an Indian Supreme Court dataset containing judicial cases and statutes, and the EURLEX dataset, a collection of treaties and laws related to the European Union containing approximately 57,000 documents and 4,000 labels. We preprocess the texts in each dataset and obtain embeddings from pre-trained transformers. We then feed these embeddings to an LSTM/BiLSTM layer to classify documents or predict similarity. Our results show that pre-trained transformers perform well when the text to be classified or compared is short, but less well on long texts. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
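The pipeline the abstract describes (transformer embeddings fed into a recurrent layer, whose final state drives a label prediction) can be sketched as follows. This is a minimal illustration, not the authors' implementation: a single-direction LSTM cell in NumPy (the paper also uses BiLSTM), with random vectors standing in for the per-token transformer embeddings and untrained random weights; all dimensions and names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step. Gates stacked in z as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.shape[0]
    i = 1.0 / (1.0 + np.exp(-z[:H]))          # input gate
    f = 1.0 / (1.0 + np.exp(-z[H:2 * H]))     # forget gate
    g = np.tanh(z[2 * H:3 * H])               # candidate cell state
    o = 1.0 / (1.0 + np.exp(-z[3 * H:]))      # output gate
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def classify(embeddings, W, U, b, W_out):
    """Run the LSTM over token embeddings; map final hidden state to label probs."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x in embeddings:                      # one step per token embedding
        h, c = lstm_step(x, h, c, W, U, b)
    logits = W_out @ h
    e = np.exp(logits - logits.max())         # stable softmax over labels
    return e / e.sum()

D, H, L = 8, 16, 4                            # toy embedding dim, hidden size, label count
W = rng.normal(size=(4 * H, D)) * 0.1
U = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
W_out = rng.normal(size=(L, H)) * 0.1

# Stand-in for the per-token embeddings a pre-trained transformer (e.g. BERT)
# would produce for a short legal text; in the paper these come from the model.
emb = rng.normal(size=(10, D))
probs = classify(emb, W, U, b, W_out)
```

For the similarity task, the same encoder states can instead be pooled into a fixed-size vector per document and compared (e.g. by cosine similarity); the paper's finding that performance drops on long texts is consistent with both uses.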

Keywords

BERT, Deep learning, Legal analytics

Citation

Lecture Notes in Electrical Engineering, 2022, Vol. 858, pp. 493-504
