NeuralDoc-Automating Code Translation Using Machine Learning

Date

2022

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

Source code documentation is the process of writing concise, natural language descriptions of how the source code behaves during run time. In this work, we propose NeuralDoc, a novel approach for automating source code documentation using machine learning techniques. We model automatic code documentation as a language translation task: the source code serves as the input sequence, which the machine learning model translates into natural language sentences describing the functionality of the program. The model we use is the Transformer, which leverages self-attention and multi-head attention to capture long-range dependencies effectively and has been shown to perform well on a range of natural language processing tasks. We integrate the copy attention mechanism and BERT, a pre-training technique, into the basic Transformer architecture to create a novel approach for automating code documentation. We build an intuitive interface for users to interact with our models and deploy our system as a web application. We carry out experiments on two datasets consisting of Java and Python source programs and their documentation to demonstrate the effectiveness of our proposed method. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
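
The abstract names a copy attention mechanism but does not give its formulation. As a rough illustration only, the sketch below shows one common pointer-generator style of copying, in which the decoder's vocabulary distribution is blended with the attention weights over source tokens. This lets rare identifiers from the code appear in the generated description. The function name `copy_mixture`, the gate `p_gen`, and the toy inputs are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict


def copy_mixture(vocab_probs, attention, source_tokens, p_gen):
    """Blend a generation distribution with a copy distribution.

    vocab_probs:   dict mapping token -> probability from the decoder softmax
    attention:     list of attention weights, one per source token (sums to 1)
    source_tokens: list of source-code tokens aligned with `attention`
    p_gen:         probability in [0, 1] of generating rather than copying

    All names here are hypothetical; this is not the paper's implementation.
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")

    final = defaultdict(float)
    # Generation path: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy path: route the remaining mass to source tokens via attention.
    # A token that occurs several times in the source accumulates its weights.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example: the identifier "get_user_id" is absent from the vocabulary,
# yet it receives probability mass through the copy path.
source = ["def", "get_user_id", "(", ")"]
attention = [0.1, 0.7, 0.1, 0.1]
vocab = {"returns": 0.6, "the": 0.3, "<unk>": 0.1}
mixed = copy_mixture(vocab, attention, source, p_gen=0.5)
```

In a trained model, the gate and the attention weights would be predicted by the network at every decoding step. Here they are fixed by hand so that the blending arithmetic is easy to follow.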

Keywords

Automatic documentation, BERT, Neural machine translation, Program comprehension, Software engineering, Transformer

Citation

Lecture Notes in Electrical Engineering, 2022, Vol. 811, pp. 125-138
