Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
Item: A Transfer Learning Approach for Diabetic Retinopathy Classification Using Deep Convolutional Neural Networks (Institute of Electrical and Electronics Engineers Inc., 2018)
Krishnan, A.S.; Clive, D.R.; Bhat, V.; Ramteke, P.B.; Koolagudi, S.G.
Diabetic Retinopathy is a disease in which the retina is damaged due to diabetes mellitus. It is a leading cause of blindness today. Detection and quantification of such damage from retinal images is tedious and requires expertise. In this paper, automatic identification of the severity of Diabetic Retinopathy using Convolutional Neural Networks (CNNs) with a transfer learning approach is proposed to aid the diagnostic process. Different CNN architectures, such as ResNet and Inception-ResNet-v2, are compared using the quadratic weighted kappa metric. The qualitative and quantitative evaluation of the proposed approach is carried out on the Diabetic Retinopathy detection dataset from Kaggle. From the results, we observe that the proposed model achieves a kappa score of 0.76. © 2018 IEEE.

Item: Retinal-Layer Segmentation Using Dilated Convolutions (Springer Science and Business Media Deutschland GmbH, 2020)
Guru Pradeep Reddy, T.; Ashritha, K.S.; Prajwala, T.M.; Girish, G.N.; Kothari, A.R.; Koolagudi, S.G.; Rajan, J.
Visualization and analysis of Spectral Domain Optical Coherence Tomography (SD-OCT) cross-sectional scans have gained considerable importance in the diagnosis of several retinal abnormalities. Quantitative analytic techniques such as retinal thickness and volumetric analysis are performed on cross-sectional images of the retina for early diagnosis and prognosis of retinal diseases. However, segmentation of retinal layers from OCT images is a complicated task on account of factors such as speckle noise, low image contrast, and low signal-to-noise ratio, among others.
Owing to the importance of retinal layer segmentation in diagnosing ophthalmic diseases, manual segmentation techniques have been proposed and adopted in clinical practice. Nonetheless, manual segmentations are prone to boundary-detection errors. This paper thus proposes a fully automated semantic segmentation technique that uses an encoder–decoder architecture to accurately segment the prominent retinal layers. © 2020, Springer Nature Singapore Pte Ltd.

Item: A Transpose-SELDNet for Polyphonic Sound Event Localization and Detection (Institute of Electrical and Electronics Engineers Inc., 2023)
Spoorthy, V.; Koolagudi, S.G.
Human beings can identify a particular event occurring in their surroundings from sound cues even when no visual scene is presented. Sound events are the auditory cues present in the surroundings. Sound event detection (SED) is the process of determining the beginning and end of sound events as well as a textual label for each event. The term sound source localization (SSL) refers to the process of identifying the spatial location of a sound occurrence in addition to the SED. The integrated task of SED and SSL is known as Sound Event Localization and Detection (SELD). In this work, three different deep learning architectures are explored to perform SELD: SELDNet, D-SELDNet (Depthwise Convolution), and T-SELDNet (Transpose Convolution). Two sets of features are used to perform the SED and Direction-of-Arrival (DOA) estimation tasks. D-SELDNet uses a depthwise convolution layer, which helps reduce the model's complexity in terms of computation time. T-SELDNet uses transpose convolution, which helps in learning better discriminative features by retaining the input size and not losing necessary information from the input. The proposed method is evaluated on the First-order Ambisonic (FOA) array format of the TAU-NIGENS Spatial Sound Events 2020 dataset.
The proposed T-SELDNet shows an improvement over existing SELD systems. © 2023 IEEE.

Item: Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network (Springer Science and Business Media Deutschland GmbH, 2024)
Venkatesh, S.; Koolagudi, S.G.
In this paper, a novel approach to performing polyphonic Sound Event Detection (SED) is presented. A new deep learning architecture named “Modified Recurrent Temporal Pyramid Neural Network (MR-TPNN)” is introduced. The input features fed to the network are spectrograms generated from the Constant Q-Transform (CQT). CQT spectrograms provided better sound event information in the audio recording than the Short Time Fourier Transform (STFT) and Fast Fourier Transform (FFT) methods. Temporal information is an essential factor for detecting the onset and offset of events in an audio recording. Capturing the temporal information is ensured by fusing temporal pyramids and bi-directional long short-term memory (LSTM) recurrent layers in the deep learning architecture. Extensive experiments are carried out on three benchmark datasets, and the results of the proposed method are superior to those of the existing polyphonic SED systems. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
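
Note: the two sound event detection abstracts describe SED as finding the beginning (onset) and end (offset) of each event together with a label. The sketch below illustrates only that output format: it converts per-class, frame-level binary activity into labeled onset/offset events. It is not the method of either paper. The function name `frames_to_events`, the class labels, and the 0.5-second hop are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(activity: Dict[str, List[int]], hop_seconds: float) -> List[Event]:
    """Convert per-class binary frame activity into (label, onset, offset) events.

    Classes are processed independently, so events from different classes
    may overlap in time, which is what makes the output polyphonic.
    """
    events: List[Event] = []
    for label, frames in activity.items():
        start = None  # index of the first frame of the currently open event
        for i, active in enumerate(frames):
            if active and start is None:
                start = i  # event begins at this frame
            elif not active and start is not None:
                events.append((label, start * hop_seconds, i * hop_seconds))
                start = None  # event ended just before this frame
        if start is not None:
            # the event is still active when the recording ends
            events.append((label, start * hop_seconds, len(frames) * hop_seconds))
    # order by onset time, then by label for a stable listing
    return sorted(events, key=lambda e: (e[1], e[0]))


# Hypothetical frame-level activity for two overlapping classes.
activity = {
    "speech": [0, 1, 1, 0, 0, 1],
    "dog_bark": [0, 0, 1, 1, 1, 0],
}
events = frames_to_events(activity, hop_seconds=0.5)
for label, onset, offset in events:
    print(f"{label}: {onset:.1f}s to {offset:.1f}s")
```

In a real system the binary activity would typically come from thresholding a network's per-frame class probabilities; that step is assumed here and not shown.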
