Large Language Models for Indian Legal Text Summarisation

dc.contributor.author: Hemanth Kumar, M.
dc.contributor.author: Jayanth, P.
dc.contributor.author: Anand Kumar, M.
dc.date.accessioned: 2026-02-06T06:33:49Z
dc.date.issued: 2024
dc.description.abstract: Summarizing legal case judgments is a complex task in Legal Natural Language Processing (NLP), and there is a gap in understanding how various summarization models, including extractive and abstractive approaches, perform within the domain of legal documents. With around 4 crore (40 million) cases pending in the Indian court system, this study addresses the laborious task of manually summarizing legal documents. It introduces both supervised and unsupervised models for extractive and abstractive summarization, and demonstrates their effectiveness through evaluation with ROUGE metrics and BERTScore. BART, T5, PEGASUS, RoBERTa, Legal-PEGASUS, and Legal-BERT models are used for abstractive summarisation. TextRank, LexRank, LSA, Summarizer BERT, and KL-Summ are used for extractive summarisation. Longformer and Bert - Legal Pegasus are also considered for the summarisation task. We also used GPT-4 and LLAMA-2 for legal document summarization, employing prompt engineering with both zero-shot and one-shot prompts to generate summaries. To the best of our knowledge, this is the first paper to use Large Language Models such as GPT-4 and LLAMA-2 for legal text summarisation. In addition, a user-friendly chatbot has been developed using the Llama model, specifically designed to respond to queries related to legal texts. A web application has also been created that allows users to upload legal documents for summarization. Users can select from various languages, including Telugu, Tamil, Kannada, Malayalam, and Hindi, and the summarised text is then converted into the selected language. © 2024 IEEE.
dc.identifier.citation: Proceedings of CONECCT 2024 - 10th IEEE International Conference on Electronics, Computing and Communication Technologies, 2024
dc.identifier.uri: https://doi.org/10.1109/CONECCT62155.2024.10677065
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/28890
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.subject: GPT-4
dc.subject: Large language models
dc.subject: Legal
dc.subject: LLAMA-2
dc.subject: NLP
dc.subject: Summarisation
dc.title: Large Language Models for Indian Legal Text Summarisation
