Refining LLMs with Reinforcement Learning for Human-Like Text Generation

Authors

Harish, A.; Prakash, G.; Nair, R.R.; Iyer, V.B.; Anand Kumar, M.

Date

2024

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Large Language Models (LLMs) are widely used for text generation tasks such as dialogue summarization and creative writing. However, the generated text often appears unnatural and can easily be distinguished from human-written language. In this paper, we leverage Reinforcement Learning to fine-tune LLMs so that they produce text that resembles human language. We apply the Proximal Policy Optimization (PPO) algorithm to fine-tune a FLAN-T5 LLM for a dialogue summarization task. © 2024 IEEE.
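
As an illustration of the optimization step named in the abstract, the sketch below computes the standard PPO clipped surrogate objective for a batch of sampled tokens. It is not the authors' implementation: the abstract does not state the reward signal or hyperparameters, so the function name `ppo_clipped_objective`, the argument names, and the clipping value `clip_eps=0.2` are assumptions chosen for illustration only.

```python
import math


def ppo_clipped_objective(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (to be maximized).

    new_logprobs / old_logprobs: log-probabilities of the sampled tokens
    under the current policy and under the policy that generated them.
    advantages: advantage estimates for those tokens (hypothetical inputs;
    in an RL fine-tuning setup they would be derived from a reward model).
    clip_eps: clipping range; 0.2 is an assumed, commonly used value.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that produced the samples in a single update. In a language-model setting, that helps keep the fine-tuned model close to its starting behaviour while it is pushed toward higher reward.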

Keywords

AI detection, Large Language Models (LLMs), Low Rank Adaptation (LoRA), Proximal Policy Optimization (PPO)

Citation

Proceedings of CONECCT 2024 - 10th IEEE International Conference on Electronics, Computing and Communication Technologies, 2024

URI

https://doi.org/10.1109/CONECCT62155.2024.10677038
https://idr.nitk.ac.in/handle/123456789/28902

Collections

Conference Papers
