Semantic Similarity and Paraphrase Identification for Malayalam Using Deep Autoencoders

Praveena, R.; Anand Kumar, M.; Padannayil, K.P.

dc.contributor.author	Praveena, R.
dc.contributor.author	Anand Kumar, M.
dc.contributor.author	Padannayil, K.P.
dc.date.accessioned	2026-02-08T16:50:18Z
dc.date.issued	2021
dc.description.abstract	This chapter addresses sentence-level paraphrase identification for the Malayalam language. We use a recursive autoencoder architecture for unsupervised learning of phrase representations, from which features for paraphrase identification are extracted. Because sentences vary in length, their features are converted to a fixed-size representation using dynamic pooling. The Malayalam paraphrase identification system was initially designed to distinguish only paraphrases from non-paraphrases and was later extended to identify semi-equivalent paraphrases as well. Combining the semantic features with conventional statistical features improved system performance. The proposed system was implemented using word2vec embeddings and obtained 77.67% accuracy for the two-class system and 66.07% for the three-class system. The chapter also discusses the experiments carried out to choose the best parameters and embedding models. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
dc.identifier.citation	Signals and Communication Technology, 2021, p. 81-96
dc.identifier.issn	18604862
dc.identifier.uri	https://doi.org/10.1007/s00202-025-03284-4
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/33750
dc.publisher	Springer Science and Business Media Deutschland GmbH
dc.subject	Deep learning
dc.subject	GloVe
dc.subject	Malayalam paraphrase identification
dc.subject	Recursive autoencoders
dc.subject	Word2vec
dc.title	Semantic Similarity and Paraphrase Identification for Malayalam Using Deep Autoencoders
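
The abstract mentions dynamic pooling as the step that turns variable-length sentence features into a fixed-size representation. The following is a minimal sketch of that general idea, not the chapter's implementation. It assumes the input is a matrix of pairwise Euclidean distances between the feature vectors of two sentences, that pooling takes the minimum of each region, and that the output grid is 4 x 4. The function name dynamic_pool, the parameter out_size, and the random feature vectors are hypothetical and chosen only for illustration.

```python
import numpy as np


def dynamic_pool(dist, out_size=4):
    """Min-pool a variable-size distance matrix into an out_size x out_size grid.

    Assumption: min pooling over roughly equal regions; the chapter's exact
    pooling rule and grid size are not given in the abstract.
    """
    dist = np.asarray(dist, dtype=float)
    rows, cols = dist.shape

    # If a dimension is smaller than the grid, repeat rows/columns so that
    # every pooling region contains at least one entry.
    if rows < out_size:
        dist = np.repeat(dist, int(np.ceil(out_size / rows)), axis=0)
    if cols < out_size:
        dist = np.repeat(dist, int(np.ceil(out_size / cols)), axis=1)
    rows, cols = dist.shape

    # Split row and column indices into out_size nearly equal groups.
    row_groups = np.array_split(np.arange(rows), out_size)
    col_groups = np.array_split(np.arange(cols), out_size)

    pooled = np.empty((out_size, out_size))
    for i, r in enumerate(row_groups):
        for j, c in enumerate(col_groups):
            pooled[i, j] = dist[np.ix_(r, c)].min()
    return pooled


# Hypothetical feature vectors for two sentences of different lengths.
rng = np.random.default_rng(0)
sent_a = rng.normal(size=(9, 50))   # 9 phrase/word vectors of dimension 50
sent_b = rng.normal(size=(13, 50))  # 13 phrase/word vectors of dimension 50

# Pairwise Euclidean distances, shape (9, 13).
distances = np.linalg.norm(sent_a[:, None, :] - sent_b[None, :, :], axis=-1)

fixed = dynamic_pool(distances, out_size=4)
print(fixed.shape)
```

Whatever the lengths of the two sentences, the pooled matrix has the same shape, so it can be flattened and passed to a classifier that expects a fixed number of input features.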

Collections

Book Chapters
