2. Thesis and Dissertations

Permanent URI for this community: https://idr.nitk.ac.in/handle/1/10

Search Results

Now showing 1 - 3 of 3
  • Development Of Limited Supervised Deep Learning Methods For Biomedical Image Analysis
    (National Institute Of Technology Karnataka Surathkal, 2023) S J, Pawan; Rajan, Jeny
    Over the past few years, the computer vision domain has evolved and made a revolutionary transition from human-engineered features to automated features to address challenging tasks. Computer vision is an ever-evolving domain with roots deeply connected to neuroscience. Any new finding that yields a more intuitive understanding of how the human brain works generally impacts the design of computer vision algorithms. The convolutional neural network (CNN) is one such algorithm that has become the de facto standard for most computer vision tasks, such as image classification, object detection, and image segmentation. However, the performance of CNNs is highly dependent on labeled data, which limits their practicality in scenarios lacking sufficient labeled data, especially in medical applications. Therefore, it is imperative to develop deep learning methods with limited supervision. In light of this, we explore the dimensions of deep learning with limited supervision through capsule networks and semi-supervised learning for biomedical image analysis, with a primary focus on segmentation. In this thesis, we have systematically reviewed various techniques for handling deep learning with limited labeled data, focusing on capsule networks and consistency regularization-driven semi-supervised learning. Capsule networks have shown immense potential for image classification tasks. However, extending them to pixel-level classification or segmentation is difficult and poses numerous challenges, including the exponential growth of trainable parameters, expensive computation, and extensive memory overhead. In this regard, we propose DRIP-Caps, a Dilated Residual Inception and Capsule Pooling framework that makes the capsule network lightweight by reducing the computational complexity without compromising performance on the CSCR (Central Serous Chorioretinopathy) dataset.
    Semi-supervised learning is a major discipline that alleviates the requirement for labeled data by incorporating both labeled and unlabeled data to learn pertinent information. We present a semi-supervised framework based on a mixup operation-driven consistency constraint for medical image segmentation by incorporating geometric constraints that regress over the signed distance map (SDM) of the object of interest, achieving superior performance on the publicly available ACDC and LA datasets. We also propose a novel semi-supervised framework for enforcing dual consistency (data level and network level) with a two-stage pre-training approach through networks of different learning paradigms, enforcing both local and global semantic affinities and improving the overall performance. We envision these methods playing a major role in alleviating the tedious labeling process as far as the segmentation task is concerned.
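The idea of a mixup-driven consistency constraint mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the common interpolation-consistency formulation, in which the prediction on a mixed input is pushed toward the mix of the individual predictions. The names `mixup`, `mixup_consistency_loss`, `affine_model`, and `squaring_model` are hypothetical, and the "models" are toy functions on plain lists rather than segmentation networks.

```python
from typing import Callable, List

Vector = List[float]


def mixup(a: Vector, b: Vector, lam: float) -> Vector:
    """Convex combination lam * a + (1 - lam) * b, element by element."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def mixup_consistency_loss(
    model: Callable[[Vector], Vector], x1: Vector, x2: Vector, lam: float
) -> float:
    """Mean squared gap between model(mix(x1, x2)) and mix(model(x1), model(x2)).

    Assumed formulation (interpolation consistency); no labels are needed,
    so the term can be computed on unlabeled samples.
    """
    pred_of_mix = model(mixup(x1, x2, lam))
    mix_of_preds = mixup(model(x1), model(x2), lam)
    squared_gaps = [(p - q) ** 2 for p, q in zip(pred_of_mix, mix_of_preds)]
    return sum(squared_gaps) / len(squared_gaps)


def affine_model(x: Vector) -> Vector:
    """Toy model that commutes with mixing, so its consistency loss is zero."""
    return [2.0 * v + 1.0 for v in x]


def squaring_model(x: Vector) -> Vector:
    """Toy nonlinear model that violates the constraint."""
    return [v * v for v in x]


loss_affine = mixup_consistency_loss(affine_model, [0.0, 1.0], [1.0, 0.0], 0.5)
loss_squaring = mixup_consistency_loss(squaring_model, [0.0, 1.0], [1.0, 0.0], 0.5)
```

In a real training loop this term would be added to the supervised loss with some weighting; how the thesis combines it with the SDM regression is not specified here and is left out of the sketch.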
  • Content-based Music Information Retrieval (CB-MIR) and its Applications Towards Music Recommender System
    (National Institute of Technology Karnataka, Surathkal, 2019) Murthy, Y V Srinivasa.; Koolagudi, Shashidhar G.
    Music is a pervasive element of people's day-to-day activities. Most people love to listen to music to handle stress and tension, and some are capable of creating it. The importance of music to human beings, combined with advances in technology, has resulted in an enormous number of digital tracks. However, a majority of tracks are available with inadequate meta-information, limited to the song title, album name, singer name and composer. The question is how to organize them effectively so that relevant clips can be retrieved quickly without proper meta-information such as genre, lyrics, raga, mood, and instrument names. Manually labelling the meta-information for millions of tracks in the digital cloud is not practical. Hence, an area of research known as music information retrieval (MIR) was introduced in the early years of the 21st century. It has attracted much attention from researchers since 2005 with the support of the Music Information Retrieval Evaluation eXchange (MIREX) competition. Several works have been proposed for various MIR tasks such as singing voice detection, singer identification, genre classification, instrument identification, music mood estimation, lyrics generation, and music annotation. However, the main focus is on Western music, and only a few works on Indian songs are reported in the literature. Since Indian popular songs contribute a major portion of the global digital cloud, this thesis attempts to develop a few useful MIR tasks for the Indian scenario: vocal and non-vocal segmentation, singer identification, music mood estimation, and the development of a music recommender system. Efforts have been made to construct relevant databases with a possible coherence for all the tasks mentioned above. Results, including a comparative analysis with standard datasets such as MIR-1K and artist20, are given.
    For each of the four tasks, a novel approach is presented in this thesis. First, the task of vocal and non-vocal segmentation has been chosen to locate the onset and offset points of singing voice regions. A set of novel features, namely formant attack slope (FAS), formant heights from base-to-peak (FH1), formant angle values at peak (FA1), formant angle values of valley (FA2), and singer formant (F5), have been computed and used for discriminating vocal and non-vocal segments. An attempt has also been made to develop a feature selection algorithm based on the concepts of genetics, known as genetic algorithm based feature selection (GAFS). The observations made from this experimentation using the selected features on the Indian and Western databases are reported. Second, the task of singer identification (SID) has been considered. A database with the songs of 10 male and 10 female singers has been constructed. The songs are taken from two popular film industries of the Indian subcontinent, Tollywood (Telugu) and Bollywood (Hindi). Various timbral and temporal features have been computed to analyze their effect on singer identification with different classifiers. However, the feature-based systems are found to be less effective, and hence convolutional neural networks (CNNs) have been used with spectrograms of song clips as inputs. Third, identifying the mood of a song has been considered. Six different moods are identified based on an analysis of the combination of Russell’s and Thayer’s models (Saari and Eerola, 2014). We have developed a two-level classification model for music mood detection. In the first stage, songs are categorized as energetic or non-energetic. The actual class label is predicted in the second stage.
    The performance of the system is found to be better in this case than with single-phase classification. Finally, the development of a recommender system has been taken up using labels such as the title of a track, singer name(s), mood of a song, and duration. A graph structure based recommendation system is proposed in this work to estimate the similarity in the listening patterns of listeners. A graph is constructed for every user by considering songs as nodes. The similarities are then estimated using the adjacency matrices obtained from the listening patterns. This approach could be more appropriate for improving the performance of song recommender systems.
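The graph-based comparison of listening patterns described above can be sketched as follows. The abstract does not say how edges are defined or which similarity measure is used, so this sketch makes two explicit assumptions: an undirected edge links songs played consecutively by a user, and two users are compared by the cosine similarity of their flattened adjacency matrices. The function names and song identifiers are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Matrix = List[List[int]]


def adjacency_matrix(history: Sequence[str], index: Dict[str, int]) -> Matrix:
    """Per-user graph with songs as nodes.

    Assumption: songs played one after the other are joined by an
    undirected edge; repeated plays of the same song add no self-loop.
    """
    n = len(index)
    matrix = [[0] * n for _ in range(n)]
    for current, following in zip(history, history[1:]):
        i, j = index[current], index[following]
        if i != j:
            matrix[i][j] = 1
            matrix[j][i] = 1
    return matrix


def cosine_similarity(m1: Matrix, m2: Matrix) -> float:
    """Cosine similarity of two flattened adjacency matrices (assumed measure)."""
    flat1 = [v for row in m1 for v in row]
    flat2 = [v for row in m2 for v in row]
    dot = sum(a * b for a, b in zip(flat1, flat2))
    norm1 = math.sqrt(sum(a * a for a in flat1))
    norm2 = math.sqrt(sum(b * b for b in flat2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # a user with no edges is treated as dissimilar to everyone
    return dot / (norm1 * norm2)


# Shared song index so every user's matrix has the same shape.
song_index = {"s1": 0, "s2": 1, "s3": 2, "s4": 3}

user_a = adjacency_matrix(["s1", "s2", "s3"], song_index)
user_b = adjacency_matrix(["s1", "s2", "s4"], song_index)
user_c = adjacency_matrix(["s3", "s4"], song_index)

sim_ab = cosine_similarity(user_a, user_b)  # one shared edge out of two each
sim_ac = cosine_similarity(user_a, user_c)  # no shared edges
```

Users with a high similarity score could then exchange recommendations; weighting edges by play counts would be a natural refinement, but that is beyond what the abstract states.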
  • Automatic Segmentation of Intra-Retinal Cysts from Optical Coherence Tomography Scans
    (National Institute of Technology Karnataka, Surathkal, 2018) G. N., Girish; Rajan, Jeny
    Retinal cysts are formed by the accumulation of fluid in the retina, caused by leakage due to blood-retinal barrier breakdown from inflammation or vascular disorders. Analysis of retinal cystic spaces is significant in the detection, treatment and prognostication of several ocular diseases such as age-related macular degeneration and diabetic macular edema. Segmentation of intra-retinal cysts (IRCs) and their quantification is important for characterizing retinal pathology and severity. In recent years, automated segmentation of intra-retinal cysts from optical coherence tomography (OCT) B-scans has gained significance in the field of retinal image analysis. In this thesis, a benchmark study is conducted to compare existing methods and identify the factors affecting IRC segmentation from OCT scans. A modular approach is employed to standardize the different IRC segmentation algorithms, followed by an analysis of variations in automated cyst segmentation performance and method scalability across image acquisition systems using the publicly available OPTIMA cyst segmentation challenge dataset. Such an exhaustive analysis of the scalability of OCT cyst segmentation methods in terms of methodological and input data variations has not been done before. An efficient cyst segmentation technique must be capable of performing cyst identification and delineation with minimum errors. Several methods proposed in the literature fail to delineate cysts up to their true boundary. To address this problem, an unsupervised, vendor-dependent method using marker-controlled watershed transformation is proposed in this thesis. The method has two stages: k-means clustering is used to identify cysts in the form of markers, followed by a topography-based watershed transform for the final segmentation.
    Qualitative and quantitative evaluation of the proposed method is carried out against ground truth obtained from two graders on the OPTIMA cyst segmentation challenge dataset (Spectralis vendor OCT scans). The results show that the proposed method outperforms the other unsupervised methods considered. Several segmentation methods have been proposed in the literature for IRC segmentation on vendor-specific OCT images, but these lack generalizability across imaging systems. To address this issue, a fully convolutional network (FCN) model for vendor-independent IRC segmentation is proposed in this thesis. The proposed FCN was trained using the OPTIMA cyst segmentation challenge dataset (with images from four different vendors, namely Cirrus, Nidek, Spectralis and Topcon). This method counteracts image noise variability and model over-fitting through data augmentation and hyper-parameter optimization. Additionally, a sensitivity analysis of the model hyper-parameters (depth and receptive field size) is performed to optimize the proposed FCN. In terms of the Dice Correlation Coefficient, the proposed method outperforms the algorithms published in the OPTIMA cyst segmentation challenge. Deeper FCNs exhibit better feature learning capabilities than shallower networks, but they are computationally intensive due to their large number of parameters and may be prone to the vanishing gradient problem. To address this issue, an end-to-end convolutional neural network architecture based on depthwise separable convolutional filters with swish activation functions is proposed in this thesis. The OPTIMA cyst segmentation challenge dataset, with scans from four different vendors, was used to evaluate the proposed architecture on the vendor-independent IRC segmentation task. The experimental results show that the proposed method significantly reduces the number of parameters compared to a regular convolution based FCN.
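The parameter saving from depthwise separable convolutions noted above follows from simple counting, shown in the sketch below. It uses the standard per-layer weight counts for a square kernel and ignores bias terms (an assumption). The layer sizes are illustrative only and are not taken from the thesis; the function names are hypothetical.

```python
def regular_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution.

    Depthwise step: one kernel x kernel filter per input channel.
    Pointwise step: a 1 x 1 convolution mapping c_in channels to c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer (sizes are assumptions, not from the thesis).
regular = regular_conv_params(kernel=3, c_in=64, c_out=128)
separable = depthwise_separable_params(kernel=3, c_in=64, c_out=128)
ratio = separable / regular  # equals 1 / c_out + 1 / kernel**2
```

For this example layer the separable form needs 8,768 weights against 73,728 for the regular convolution, roughly 12% of the original count.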