Large Language Models for Indian Legal Text Summarisation
Date
2024
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Summarising legal case judgments is a complex task in legal Natural Language Processing (NLP), and there is a gap in understanding how various summarisation models, both extractive and abstractive, perform within the domain of legal documents. Since there are around 4 crore (40 million) pending cases in the Indian court system, this study addresses the laborious task of manually summarising legal documents. It introduces both supervised and unsupervised models for extractive and abstractive summarisation and demonstrates their performance through evaluations using ROUGE metrics and BERTScore. The BART, T5, PEGASUS, RoBERTa, Legal-PEGASUS, and Legal-BERT models are used for abstractive summarisation. TextRank, LexRank, LSA, Summarizer BERT, and KL-Summ are used for extractive summarisation. Longformer and BERT-Legal-PEGASUS are also considered for the summarisation task. We also used GPT-4 and LLAMA-2 for legal document summarisation, employing prompt engineering with both zero-shot and one-shot prompts to generate summaries. To the best of our knowledge, this is the first paper to use Large Language Models such as GPT-4 and LLAMA-2 for the task of legal text summarisation. In addition, a user-friendly chatbot has been developed using the Llama model, specifically designed to respond to queries related to legal texts. A web application has also been created that allows users to upload legal documents for summarisation. Users can select from several languages, including Telugu, Tamil, Kannada, Malayalam, and Hindi, and the summarised text is then translated into the selected language. © 2024 IEEE.
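As an illustration of two ideas named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a simplified ROUGE-N computed over lowercase whitespace tokens (published ROUGE toolkits add stemming and other preprocessing), and a hypothetical prompt wording for the zero-shot and one-shot cases. The function names `ngrams`, `rouge_n`, and `build_prompt` are invented for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: precision, recall and F1 over whitespace tokens."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def build_prompt(judgment, example=None):
    """Build a zero-shot prompt, or a one-shot prompt if `example` is given.

    `example` is a (judgment_text, reference_summary) pair. The wording is
    hypothetical and is not the prompt used in the paper.
    """
    parts = ["Summarise the following Indian court judgment."]
    if example is not None:
        example_text, example_summary = example
        parts.append(f"Judgment: {example_text}\nSummary: {example_summary}")
    parts.append(f"Judgment: {judgment}\nSummary:")
    return "\n\n".join(parts)
```

A generated summary could then be scored with `rouge_n(generated, reference, n=1)`, and the string returned by `build_prompt` would be sent to whichever model is being evaluated.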
Keywords
GPT-4, Large language models, Legal, LLAMA-2, NLP, Summarisation
Citation
Proceedings of CONECCT 2024 - 10th IEEE International Conference on Electronics, Computing and Communication Technologies, 2024
