Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Molecular-InChI: Automated Recognition of Optical Chemical Structure
    (Institute of Electrical and Electronics Engineers Inc., 2022) Kumar, N.; Rashmi, M.; Ramu, S.; Reddy Guddeti, R.M.
    With the rise of digital media and publications in recent years, the importance of balancing traditional and new modes of operation has become increasingly apparent. For decades, it has been standard practice in chemistry to express chemical compounds by their structural forms, referred to as skeletal formulas. In this research, we interpret chemical structure images extracted from old literature and convert them back to the underlying chemical structure, labeled as InChI text, using a large set of synthetic image data produced by Bristol-Myers Squibb. We propose an improved synthetic dataset and an encoder-decoder deep learning model that automatically converts these molecular images into their underlying InChI representation. © 2022 IEEE.
  • Item
    fastText-Based Siamese Network for Hindi Semantic Textual Similarity
    (Springer Science and Business Media Deutschland GmbH, 2025) Chandrashekar, A.; Rushad, M.; Nambiar, A.; Rashmi, V.; Koolagudi, S.G.
    Semantic textual similarity measures the degree of semantic similarity or equivalence between two sentences. Semantically equivalent sentences can substitute for each other and retain their meaning. Various rule-based and machine learning models have quickly gained prominence in the field, especially for languages like English, where lexical tools and resources are abundant. However, languages like Hindi have not seen much improvement in state-of-the-art methods and models for evaluating the semantic similarity of text data. This paper proposes a fastText-based Siamese neural network architecture to evaluate the semantic equivalence of a Hindi sentence pair. Each pair is scored on a scale of 0–5, where 0 indicates least similar and 5 indicates most similar. The corpus combines two datasets of manually scored sentence pairs. The approach is evaluated using model accuracy and model loss over multiple training epochs. The proposed architecture incorporates a fastText-based embedding layer and a bidirectional Long Short-Term Memory layer, and it extracts semantic and various global features of the text to determine a similarity score. The model achieves an accuracy of 85.5% on a compiled Hindi-Hindi sentence pair dataset, a considerable improvement over existing rule-based and supervised systems. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.