Large Language Models for Indian Legal Text Summarisation

Hemanth Kumar, M.; Jayanth, P.; Anand Kumar, M.

dc.contributor.author	Hemanth Kumar, M.
dc.contributor.author	Jayanth, P.
dc.contributor.author	Anand Kumar, M.
dc.date.accessioned	2026-02-06T06:33:49Z
dc.date.issued	2024
dc.description.abstract	Summarizing legal case judgments is a complex task in Legal Natural Language Processing (NLP), and there is a gap in understanding how various summarization models, both extractive and abstractive, perform within the domain of legal documents. With around 4 crore cases pending in the Indian court system, this study addresses the laborious task of manually summarizing legal documents. It introduces both supervised and unsupervised models for extractive and abstractive summarization, and demonstrates their performance through evaluation with ROUGE metrics and BERTScore. The BART, T5, PEGASUS, RoBERTa, Legal-PEGASUS, and Legal-BERT models are used for abstractive summarisation. TextRank, LexRank, LSA, Summarizer BERT, and KL-Summ are used for extractive summarisation. Longformer and BERT - Legal-PEGASUS are also considered for the summarisation task. We also used GPT-4 and LLAMA-2 for legal document summarization, employing prompt engineering with both zero-shot and one-shot prompts to generate summaries. To the best of our knowledge, this is the first paper to use Large Language Models such as GPT-4 and LLAMA-2 for legal text summarisation. In addition, a user-friendly chatbot has been developed using the Llama model, designed specifically to respond to queries about legal texts. A web application has also been created that allows users to upload legal documents for summarization. Users can select from several languages, including Telugu, Tamil, Kannada, Malayalam, and Hindi, and the summarised text is then translated into the selected language. © 2024 IEEE.
dc.identifier.citation	Proceedings of CONECCT 2024 - 10th IEEE International Conference on Electronics, Computing and Communication Technologies, 2024
dc.identifier.uri	https://doi.org/10.1109/CONECCT62155.2024.10677065
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/28890
dc.publisher	Institute of Electrical and Electronics Engineers Inc.
dc.subject	GPT-4
dc.subject	Large language models
dc.subject	Legal
dc.subject	LLAMA-2
dc.subject	NLP
dc.subject	Summarisation
dc.title	Large Language Models for Indian Legal Text Summarisation

Collections

Conference Papers
