Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

  • Item
    Dialect Identification using Chroma-Spectral Shape Features with Ensemble Technique
    (Academic Press, 2021) Chittaragi, N.B.; Koolagudi, S.G.
    The present work proposes a text-independent dialect identification system. Dialects of a language generally exhibit distinct pronunciation styles tied to a specific geographical region. In this paper, chroma features, familiar from music-related systems, are employed for the identification of dialects. In addition, eight significant spectral-shape features computed from short-term spectra are combined with the chroma features; the combined set is named chroma-spectral shape features. Chroma features aggregate spectral information and attempt to capture the variations in timbre, correlated melody, rhythm, and intonation patterns that are prominent among the dialects of a few languages. The effectiveness of the proposed features and approach is evaluated on five prominent Kannada dialects spoken in Karnataka, India, and on the internationally known standard Intonation Variation in English (IViE) dataset with nine British English dialects. Discriminative models, namely a single-classifier Support Vector Machine (SVM) and an ensemble of support vector machines (ESVM), are employed for classification. The proposed features outperform state-of-the-art i-vector features on both datasets. The highest recognition performance, 95.6% and 97.52%, is achieved on the Kannada and IViE dialect datasets respectively using ESVM. The proposed features have also demonstrated robust performance on short (limited-data) audio clips, even in noisy conditions. © 2021 Elsevier Ltd
  • Item
    End-to-end latent fingerprint enhancement using multi-scale Generative Adversarial Network
    (Elsevier B.V., 2024) Pramukha, R.N.; Akhila, P.; Koolagudi, S.G.
    Latent fingerprint enhancement is critical because it strongly influences matching accuracy. The process is often challenging due to varying structured noise and background patterns. The prints may be of arbitrary sizes and scales, with a high degree of occlusion. An end-to-end system that handles these conditions reliably is needed to streamline this often lengthy and tricky process. In this work, we propose a Generative Adversarial Network (GAN) based architecture that effectively captures multi-scale context using Atrous Spatial Pyramid Pooling (ASPP). We have trained the network on a synthetically generated dataset, carefully designed to represent real-world latent prints. By enhancing only valid ridges and not reconstructing spurious ones, we avoid generating false minutiae, which leads to better matching performance. We obtained state-of-the-art results in sensor-to-latent matching on the IIITD MOLF dataset and in latent-to-latent matching on the IIITD Latent dataset. © 2024 Elsevier B.V.
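
The second abstract relies on Atrous Spatial Pyramid Pooling to capture multi-scale context. As a rough illustration of that idea only, and not the authors' network, the sketch below applies one 3x3 kernel at several dilation rates and stacks the outputs as channels. The function names `dilated_conv3x3` and `aspp`, the shared kernel, and the rates `(1, 2, 4)` are assumptions made for this example; a real ASPP module typically uses separately learned kernels per branch and further layers that are omitted here.

```python
from typing import List, Sequence

Grid = List[List[float]]


def dilated_conv3x3(image: Grid, kernel: Grid, rate: int) -> Grid:
    """3x3 convolution whose taps are spaced `rate` pixels apart.

    Positions that fall outside the image are treated as zero (zero padding),
    so the output has the same height and width as the input.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Offset of this tap from the centre, scaled by the rate.
                    sy = y + (ky - 1) * rate
                    sx = x + (kx - 1) * rate
                    if 0 <= sy < height and 0 <= sx < width:
                        acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc
    return out


def aspp(image: Grid, kernel: Grid, rates: Sequence[int] = (1, 2, 4)) -> List[Grid]:
    """Illustrative ASPP-style pooling: one feature map per dilation rate.

    Assumption for this sketch: every branch shares the same fixed kernel.
    """
    return [dilated_conv3x3(image, kernel, rate) for rate in rates]


if __name__ == "__main__":
    # A 5x5 image with a single bright pixel in the centre.
    image = [[0.0] * 5 for _ in range(5)]
    image[2][2] = 1.0
    ones = [[1.0] * 3 for _ in range(3)]
    for rate, fmap in zip((1, 2), aspp(image, ones, rates=(1, 2))):
        print("rate", rate, "corner response", fmap[0][0])
```

Larger rates widen the receptive field without adding kernel weights: in the small demo, the corner pixel responds to the centre pixel only in the branch whose taps are two pixels apart.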