Refining LLMs with Reinforcement Learning for Human-Like Text Generation

dc.contributor.author: Harish, A.
dc.contributor.author: Prakash, G.
dc.contributor.author: Nair, R.R.
dc.contributor.author: Iyer, V.B.
dc.contributor.author: Anand Kumar, M.
dc.date.accessioned: 2026-02-06T06:33:50Z
dc.date.issued: 2024
dc.description.abstract: Large Language Models (LLMs) are widely used for text-generation tasks such as dialogue summarization and creative writing. However, the generated text often appears unnatural and can easily be distinguished from human writing. In this paper, we leverage Reinforcement Learning to fine-tune LLMs to produce text that more closely resembles human language. Specifically, we apply the Proximal Policy Optimization (PPO) algorithm to fine-tune a FLAN-T5 model for a dialogue summarization task. © 2024 IEEE.
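The abstract describes fine-tuning with Proximal Policy Optimization. As a rough illustration only (this is not code from the paper), the PPO-Clip surrogate objective that such fine-tuning maximizes can be sketched for a single token/action; the numeric values and function name below are illustrative assumptions:

```python
def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO-Clip surrogate objective for one action.

    ratio     -- pi_new(a|s) / pi_old(a|s), the policy probability ratio
    advantage -- estimated advantage (e.g. reward-model score minus a baseline)
    eps       -- clip range; ratios outside [1-eps, 1+eps] earn no extra gain
    """
    # Clamp the ratio into [1 - eps, 1 + eps].
    clipped = max(min(ratio, 1 + eps), 1 - eps)
    # PPO takes the minimum of the unclipped and clipped terms,
    # which caps how far a single update can move the policy.
    return min(ratio * advantage, clipped * advantage)
```

With a positive advantage and ratio 1.5, the objective is clipped to 1.2 * advantage, so the policy gets no incentive to drift further from the reference model than the clip range allows.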
dc.identifier.citation: Proceedings of CONECCT 2024 - 10th IEEE International Conference on Electronics, Computing and Communication Technologies, 2024
dc.identifier.uri: https://doi.org/10.1109/CONECCT62155.2024.10677038
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/28902
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.subject: AI detection
dc.subject: Large Language Models (LLMs)
dc.subject: Low Rank Adaptation (LoRA)
dc.subject: Proximal Policy Optimization (PPO)
dc.title: Refining LLMs with Reinforcement Learning for Human-Like Text Generation
