2. Thesis and Dissertations

Permanent URI for this community: https://idr.nitk.ac.in/handle/1/10


Search Results

Now showing 1 - 6 of 6
  • Advanced Spectral Spatial Approaches for Dimensionality Reduction of Hyperspectral Data
    (National Institute of Technology Karnataka, Surathkal, 2024) C, DEEPA; SHETTY, AMBA; NARASIMHADHAN, A.V.
    Recent advances in sensor technology have enabled the collection of large data in hyperspectral remote sensing. Although rich spectral information is captured in hundreds of narrow contiguous bands, the hyperspectral data possess several limitations such as mixed pixels, high intraclass variability, interclass similarity, and the curse of dimensionality which restricts the potential of conventional machine learning classifiers. Dimensionality reduction (DR) and incorporation of spatial information can be taken into account to increase the interpretability of hyperspectral data. The thesis mainly focuses on the implementation of different approaches for DR of hyperspectral data to address the curse of dimensionality, limited samples and labelled data issues inherent in hyperspectral data. First, a quality measure based on the co-ranking matrix has been proposed for the performance evaluation of 15 DR techniques for mineral exploration. The selection of appropriate techniques for a particular task is challenging due to the diversity and ever-increasing number of DR techniques. A few important aspects in this regard have been explored in detail. Clustering is performed using the K-means algorithm and the relationship between the quality index and clustering accuracy has been examined concurrently for the first time in hyperspectral remote sensing. Furthermore, the loss of quality in the process of DR has also been analyzed which provides sufficient input for the end-user to select an appropriate DR technique. Second, the ability of the Convolutional Neural Network (CNN) for supervised learning of hyperspectral data is explored. A fast and compact hybrid CNN which combines the strengths of 3D and 2D convolutions to extract joint spectral-spatial information has been proposed to analyze the impact of different feature extraction techniques on classification performance. The effect of input patch size on final results has been well demonstrated. 
A detailed investigation of classification accuracy and execution time, together with a comparison against nine state-of-the-art approaches, has been presented. Next, a novel deep feature selection strategy using autoencoders inspired by knowledge distillation has been implemented for model compression and the selection of informative bands. The potential of convolutional autoencoders in selecting discriminative bands has been explored. Sensitivity analysis tests and different applications have been considered to verify the generalization capability of the proposed model. The potential of unsupervised learning schemes has been discussed in detail. Finally, a generator model based on Generative Adversarial Networks (GANs) has been proposed for virtual sample generation and compact representation of hyperspectral data. The training instability issue in the vanilla GAN has been addressed through the implementation of deep convolutional GANs. The quality of the generated hyperspectral images is assessed by comparing their spectra with those of the corresponding real images. The potential of augmented data for improving classification accuracy has also been investigated.
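    Illustrative sketch (not taken from the thesis): quality measures built on the co-ranking matrix compare each point's neighbour ranks before and after DR. The snippet below computes the standard neighbourhood-preservation score Q_NX(K), which can be derived from the co-ranking matrix, as the average overlap between K-nearest-neighbour sets in the original and reduced spaces. It is not the quality measure proposed in the thesis; the function names, the random data, and the "DR by dropping columns" step are assumptions made only for illustration.

```python
import numpy as np


def knn_indices(points, k):
    """Return the indices of the k nearest neighbours of every row in `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def q_nx(high, low, k):
    """Average fraction of K-nearest neighbours preserved by the embedding."""
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    overlap = sum(len(set(a) & set(b)) for a, b in zip(nn_high, nn_low))
    return overlap / (k * len(high))


# Hypothetical data: 40 "pixels" with 10 "bands"; the reduced version keeps 2 columns.
rng = np.random.default_rng(0)
pixels = rng.normal(size=(40, 10))
reduced = pixels[:, :2]
print("Q_NX(5) for the toy reduction:", q_nx(pixels, reduced, 5))
```

    A score of 1 means every K-neighbourhood is preserved; lower values indicate that the reduction has rearranged local neighbourhoods.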
  • Soil Fertility Classification Using Machine Learning-Based Approach
    (National Institute of Technology Karnataka, Surathkal, 2024) M. Sujatha; D.Jaidhar C.
    Agriculture is the mainstay of the economy and of livelihoods in many countries. To ensure sustainable agricultural development, it is crucial to assess soil fertility promptly and apply the right fertilizers. However, traditional laboratory methods for analyzing soil samples make it challenging to estimate soil fertility. Therefore, this research aims to develop a reliable Machine Learning (ML)-based classifier that classifies soil fertility as LOW, MEDIUM, or HIGH and prescribes fertilizers based on the classification results. A soil fertility classification approach based on laboratory chemical parameters, namely Electrical Conductivity (EC), Organic Carbon (OC), potential of hydrogen (pH), boron (B), copper (Cu), iron (Fe), manganese (Mn), phosphorus (P), potassium (K), sulphur (S), and zinc (Zn), has been proposed using ML approaches. The classifiers used in this study were Random Forest (RF), bagging, Boosted Regression Tree (BRT), J48 Decision Tree (J48), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM). The experiments were conducted with a split dataset (75% of the data for training and 25% for testing) and with 10-fold cross-validation. The tree-based classifier RF outperformed the other classifiers, producing an accuracy of 99.99% with both the 10-fold cross-validation test and the split dataset. To avoid the need for laboratory analysis and to obtain site-specific soil parameters, this research relied on Sentinel-2 spectral data to determine EC, pH, OC, and N. The generated dataset was labeled using various clustering methods (canopy, density-based, expectation-maximization, farthest-first, fuzzy C-means, and k-means), and the results were compared with manual labeling. Among these, the canopy clustering approach achieved the highest accuracy, 75.99%, in labeling the dataset. Therefore, the proposed method for labeling the dataset uses canopy-centered fuzzy C-means clustering. 
The proposed canopy-centered fuzzy C-means clustering method achieved the highest accuracy, 78.42%, in labeling the dataset. Furthermore, the performance of several ML-based classifiers, such as NB, SVM, J48, and RF, was compared using datasets labeled with different clustering approaches. The RF classifier achieved the highest classification accuracy, 99.69%, using the proposed approach with 10-fold cross-validation. To determine the best fertilizer for a given soil, a new fertilizer prescription approach was proposed. It uses an ensemble filter-based feature selection to classify soil fertility and prescribe the appropriate fertilizer. It was tested on two datasets from regions with varying climate conditions. Several classifiers, including the tree-based classification and regression tree, extra tree, reduced error pruning tree, and RF, as well as NB and SVM, were compared using the first dataset with relevant soil parameters. The results showed that the RF classifier with relevant soil parameters was the most accurate, achieving 99.96% accuracy with dataset-1 and 99.90% accuracy with dataset-2. A soil fertility classifier and fertilizer prescription approach based on 2D Convolutional Neural Networks (CNNs) was also proposed. The experiments were conducted on a split dataset with kernel sizes from 3×3 to 7×7 and input grid sizes from 11×11 to 13×13. The classifier achieved an accuracy of 97.24% and a kappa statistic of 0.0938 with an input grid size of 11×11 and a kernel size of 3×3. To further improve the accuracy, the training data was oversampled using the Synthetic Minority Oversampling Technique (SMOTE). The proposed approach with oversampling achieved an accuracy of 97.52% and a kappa statistic of 0.1397, with an input grid size of 12×12 and a kernel size of 3×3. A 1D-CNN-based soil fertility classification approach was developed to simplify the 2D CNN-based classifier. 
To improve the performance of the model, the dataset was normalized using Min-Max normalization, and the training data was oversampled using SMOTE. The proposed approach was compared with soil fertility classifiers based on Extreme Learning Machine (ELM) and Multi-Layer Perceptron (MLP). The proposed approach, with normalization and SMOTE, achieved an accuracy of 97.90% and a kappa statistic of 0.2358. A new method to classify soil fertility and prescribe fertilizers using symbolic deterministic finite automata was proposed to overcome the limitations of traditional ML-based classifiers, which require large, unbiased datasets and are prone to errors. The proposed method was compared with ML-based classifiers using data from Sentinel-2 satellite imagery and laboratory-measured soil health data of Belgaum district. The data consisted of two sets: one with four soil parameters (Soil-health-1 dataset) and the other with twelve soil parameters (Soil-health-2 dataset). The results showed that the new approach classified soil fertility with 100% accuracy using the Sentinel-2 and Soil-health-1 datasets, and with 98.37% accuracy using the Soil-health-2 dataset. Because satellite revisits to a specific site are infrequent, soil sensors were used in this study to collect real-time values of EC, pH, N, P, and K. The collected real-time data was tested using trained and saved ML-based classifiers, namely Classification and Regression Tree (CART), J48, RF, Reduced Error Pruning (REP), NB, and SVM, which were trained using the Soil-health dataset of Belgaum district. For the real-time test data, the RF and REP classifiers achieved the highest test accuracy, 100%.
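    Illustrative sketch (not taken from the thesis): the abstract reports CNN accuracies near 97% alongside kappa statistics well below 0.3. Assuming the reported statistic is Cohen's kappa, the snippet below shows how accuracy and kappa are computed from a confusion matrix and how a heavily imbalanced class distribution can produce high accuracy with low kappa. The confusion-matrix counts are made up for illustration and are not the thesis's data.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    `confusion[i][j]` is the number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal class frequencies.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example: 980 majority-class and 20 minority-class samples.
imbalanced = [[970, 10],
              [18, 2]]
acc, kappa = accuracy_and_kappa(imbalanced)
print(f"accuracy={acc:.4f}, kappa={kappa:.4f}")
```

    In this toy case most correct predictions come from the majority class, so chance agreement is already high and kappa stays low even though accuracy looks strong.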
  • 3D Convolutional Neural Network Architectures for Volumetric Medical Image Segmentation
    (National Institute of Technology Karnataka, Surathkal, 2024) S. Niyas
    Computer-aided medical image analysis plays a critical role in supporting medical practitioners with expert clinical diagnoses and determining optimal treatment plans. Currently, convolutional neural networks (CNNs) are widely regarded as the preferred method for automated medical image analysis due to their ability to autonomously learn relevant features from training data. However, most cutting-edge semantic image segmentation techniques rely on two-dimensional (2D) CNN models, which do not fully exploit the inter-slice information available in cross-sectional imaging modalities, such as MRI volumes. This limitation underscores the need for more advanced approaches to better utilize the three-dimensional (3D) data inherent in these imaging techniques. In this thesis, we present a comprehensive evaluation of various techniques employed in 3D deep learning for medical image segmentation. With the rapid advancements in 3D imaging systems and excellent hardware and software support to process large volumes of data, 3D deep learning methods are gaining popularity in medical image segmentation. However, traditional 3D CNN-based segmentation models require substantial computational resources, extensive memory, and typically larger datasets than 2D CNN approaches. To address these challenges, we propose a 3D CNN segmentation model that efficiently extracts information across slices and mitigates several limitations associated with traditional 3D CNN techniques. The method aims to retain the advantages of both 2D CNN and 3D CNN methods by effectively designing input data slices and the CNN architecture. In this study, we propose a shallow sliced stacking approach that reduces the depth of the input 3D data to maintain good segmentation accuracy with minimal computational overhead and model complexity. Incorporating residual connections in the encoder path also facilitates the extraction of multi-scale features without significantly increasing the model complexity. 
Accurate diagnosis of various medical conditions often requires the simultaneous analysis of multiple image characteristics. For instance, Focal Cortical Dysplasia (FCD) lesion detection can be significantly enhanced by incorporating data on cortical thickness maps along with Fluid-Attenuated Inversion Recovery (FLAIR) Magnetic Resonance Imaging (MRI) scans. Additionally, employing multi-axis analysis of 3D cross-sectional imaging can substantially improve diagnostic performance. Inspired by these concepts, we propose a 3D deep learning model employing a multi-view, dual encoder-decoder architecture. The model also incorporates various architecture-wise enhancements, including an end-to-end cascaded approach for transitioning from coarse to fine segmentation, 3D Attention modules for maintaining consistency between encoder and decoder pairs, and dual-task learning. In our study, we apply this model to process FLAIR MRI volumes alongside corresponding cortical thickness maps, aiming to effectively detect FCD lesions. Generative Adversarial Networks (GANs) have significantly impacted the field of image analysis, and they have been successfully employed for tasks such as image segmentation. Hence, this study also proposes a 3D attention-driven Vox2Vox CNN network that leverages the power of a 3D GAN to accurately segment acute stroke lesion cores in Computed Tomography Perfusion (CTP) scans. This methodology also incorporates valuable insights derived from our prior models relevant to this research. The segmentation framework incorporates two supervised GAN components: a generator and a discriminator. The generator module is designed to process 3D slices from CTP maps and learn to generate 3D binary prediction masks that closely match the ground truth for stroke lesions. Concurrently, the discriminator module is trained to distinguish between the outputs generated by the generator and the actual ground truth. 
Overall, this thesis demonstrates the efficacy of 3D deep learning in identifying malignancies from cross-sectional imaging modalities, including CT and MRI, thereby enhancing the capabilities of automated Computer-Aided Detection (CAD) systems.
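    Illustrative sketch (not taken from the thesis): the abstract describes a shallow sliced stacking approach that reduces the depth of the 3D input. The exact scheme is not given in the abstract, so the snippet below only shows one plausible reading, in which a volume is cut into overlapping shallow stacks of consecutive slices. The function name, stack depth, stride, and volume size are all assumptions.

```python
import numpy as np


def shallow_stacks(volume, depth, stride):
    """Split a (slices, height, width) volume into stacks of `depth` consecutive slices.

    Consecutive stacks start `stride` slices apart, so they overlap when stride < depth.
    """
    starts = range(0, volume.shape[0] - depth + 1, stride)
    return np.stack([volume[s:s + depth] for s in starts])


# Hypothetical MRI-like volume: 32 slices of 64x64 pixels.
volume = np.zeros((32, 64, 64), dtype=np.float32)
stacks = shallow_stacks(volume, depth=4, stride=2)
print(stacks.shape)  # (number of stacks, depth, height, width)
```

    Each stack is a small 3D input, so a network processing it sees some inter-slice context while handling far fewer slices at once than the full volume.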
  • Development of Limited Supervised Deep Learning Methods for Biomedical Image Analysis
    (National Institute Of Technology Karnataka Surathkal, 2023) S J, Pawan; Rajan, Jeny
    Over the past few years, the computer vision domain has evolved and made a revolutionary transition from human-engineered features to automated features to address challenging tasks. Computer vision is an ever-evolving domain, having its roots deeply correlated with neuroscience. Any new findings that trigger a more intuitive understanding and working of the human brain generally impact the design of computer vision algorithms. The convolutional neural network (CNN) is one such algorithm that has become the de facto standard for most computer vision tasks, such as image classification, object detection, and image segmentation. However, the performance of CNNs is highly dependent on labeled data, making them difficult to apply in scenarios lacking sufficient labeled data, especially in medical applications. Therefore, it is imperative to develop deep learning methods with limited supervision. In light of this, we explore the dimensions of deep learning with limited supervision through capsule networks and semi-supervised learning for biomedical image analysis, with a primary focus on segmentation. In this thesis, we have systematically reviewed various techniques for handling deep learning with limited labeled data, focusing on capsule networks and consistency regularization-driven semi-supervised learning. Capsule networks have shown immense potential for image classification tasks. However, extending them to pixel-level classification or segmentation is difficult. It poses numerous challenges, including the exponential growth of trainable parameters, expensive computation, and extensive memory overhead. In this regard, we propose DRIP-Caps, a Dilated Residual Inception and Capsule Pooling framework that makes the capsule network lightweight by reducing the computation complexity without compromising performance on the CSCR (Central Serous Chorioretinopathy) dataset. 
Semi-supervised learning is a major discipline that alleviates the requirement for labeled data by incorporating labeled and unlabeled data to formulate pertinent information. We present a semi-supervised framework based on a mixup operation-driven consistency constraint for medical image segmentation by incorporating geometric constraints regressing over the signed distance map (SDM) of the object of interest, achieving superior performance on the publicly available ACDC and LA datasets. We also propose a novel semi-supervised framework for enforcing dual consistency (data level and network level) with the two-stage pre-training approach through networks of different learning paradigms enforcing both local and global semantic affinities, improving the overall performance. We envision these methods serving a major role in alleviating the tedious labeling process as far as the segmentation task is concerned.
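    Illustrative sketch (not taken from the thesis): a mixup-driven consistency constraint can be read as asking a model's prediction on a blend of two unlabeled inputs to match the same blend of its predictions on each input. The snippet below shows that idea with a toy element-wise "model"; the thesis's network, loss weighting, and signed-distance-map branch are not reproduced, and all names and values are assumptions.

```python
import numpy as np


def mixup(a, b, lam):
    """Convex combination of two arrays with mixing weight `lam` in [0, 1]."""
    return lam * a + (1.0 - lam) * b


def consistency_loss(model, x1, x2, lam):
    """Mean squared gap between predicting on a mix and mixing the predictions."""
    pred_of_mix = model(mixup(x1, x2, lam))
    mix_of_pred = mixup(model(x1), model(x2), lam)
    return float(np.mean((pred_of_mix - mix_of_pred) ** 2))


def toy_model(x):
    """Stand-in for a segmentation network: an element-wise sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


# Two hypothetical unlabeled "images".
image_a = np.zeros((8, 8))
image_b = np.full((8, 8), 4.0)
print("consistency loss:", consistency_loss(toy_model, image_a, image_b, lam=0.5))
```

    No labels are needed to evaluate this term, which is why consistency constraints of this kind are useful when most of the data is unlabeled.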
  • Content-based Music Information Retrieval (CB-MIR) and its Applications Towards Music Recommender System
    (National Institute of Technology Karnataka, Surathkal, 2019) Murthy, Y V Srinivasa.; Koolagudi, Shashidhar G.
    Music is a pervasive element of people's day-to-day activities. Many people love to listen to music to handle stress and tension, and some are capable of creating music. The importance of music to people, combined with advances in technology, has resulted in an enormous number of digital tracks. However, a majority of tracks are available with inadequate meta-information, which is limited to the song title, album name, singer name and composer. The question is how to organize them effectively so that relevant clips can be retrieved quickly without proper meta-information such as genre, lyrics, raga, mood, and instrument names. Labelling the meta-information manually for the millions of tracks in the digital cloud is not practical. Hence, an area of research known as music information retrieval (MIR) was introduced in the early years of the 21st century. It has attracted much attention from researchers since 2005 with the support of the Music Information Retrieval Evaluation eXchange (MIREX) competition. Several works have been proposed for various MIR tasks such as singing voice detection, singer identification, genre classification, instrument identification, music mood estimation, lyrics generation, and music annotation. However, the main focus is on Western music, and only a few works on Indian songs are reported in the literature. Since Indian popular songs contribute a major portion of the global digital cloud, in this thesis an attempt has been made to develop a few useful MIR tasks in the Indian scenario, namely vocal and non-vocal segmentation, singer identification, music mood estimation, and the development of a music recommender system. Efforts have been made to construct relevant databases with possible coherence for all the tasks mentioned above. Results, including a comparative analysis with standard datasets such as MIR-1K and artist20, are given. 
For each of the four tasks, a novel approach has been presented in this thesis. First, the task of vocal and non-vocal segmentation has been chosen to locate the onset and offset points of singing voice regions. A set of novel features, namely formant attack slope (FAS), formant heights from base-to-peak (FH1), formant angle values at peak (FA1), formant angle values of valley (FA2), and singer formant (F5), have been computed and used for discriminating vocal and non-vocal segments. Also, an attempt has been made to develop a feature selection algorithm based on the concepts of genetics, known as genetic algorithm based feature selection (GAFS). The observations made from this experimentation using selected features on the Indian and Western databases have been reported. Second, the task of singer identification (SID) has been considered. A database with the songs of 10 male and 10 female singers has been constructed. The songs are taken from two popular film industries of the Indian subcontinent, Tollywood (Telugu) and Bollywood (Hindi). Various timbral and temporal features have been computed to analyze their effect on singer identification with different classifiers. However, the feature-based systems are found to be less effective, and hence convolutional neural networks (CNNs) have been used with spectrograms of song clips as inputs. Identifying the mood of a song has been considered as the third objective of this thesis. Six different moods are identified based on an analysis of the combination of Russell's and Thayer's models (Saari and Eerola, 2014). We have developed a two-level classification model for music mood detection. In the first stage, songs are categorized as energetic or non-energetic. The actual class label is predicted in the second stage. 
The performance of the system is found to be better in this case compared to single-phase classification. The development of a recommender system has been taken up using labels such as the title of a track, singer name(s), mood of a song, and duration. A graph structure based recommendation system has been proposed in this work to estimate the similarity in the listening patterns of listeners. A graph has been constructed for every user by considering songs as nodes. Further, the similarities are estimated using the adjacency matrices obtained from listening patterns. This approach could be more appropriate for improving the performance of song recommender systems.
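    Illustrative sketch (not taken from the thesis): the abstract describes a per-user graph with songs as nodes and similarities estimated from adjacency matrices. How edges are defined and how matrices are compared is not stated, so the snippet below assumes an edge from each song to the next one played and uses cosine similarity between flattened adjacency matrices. The song names, histories, and function names are hypothetical.

```python
import numpy as np


def adjacency_from_history(history, song_index):
    """Count transitions between consecutively played songs for one user."""
    size = len(song_index)
    adjacency = np.zeros((size, size))
    for previous, following in zip(history, history[1:]):
        adjacency[song_index[previous], song_index[following]] += 1
    return adjacency


def listening_similarity(adj_a, adj_b):
    """Cosine similarity between two users' flattened adjacency matrices."""
    a, b = adj_a.ravel(), adj_b.ravel()
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denominator) if denominator else 0.0


# Hypothetical catalogue and listening histories.
songs = {"song_a": 0, "song_b": 1, "song_c": 2, "song_d": 3}
user_1 = adjacency_from_history(["song_a", "song_b", "song_c", "song_a"], songs)
user_2 = adjacency_from_history(["song_a", "song_b", "song_d"], songs)
print("similarity:", listening_similarity(user_1, user_2))
```

    Users whose graphs share many transitions score close to 1, and a recommender could then suggest songs that appear in a similar user's graph but not in the target user's.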
  • Automatic Segmentation of Intra-Retinal Cysts from Optical Coherence Tomography Scans
    (National Institute of Technology Karnataka, Surathkal, 2018) G. N., Girish; Rajan, Jeny
    Retinal cysts are formed by the accumulation of fluid in the retina caused by leakage due to blood-retinal barrier breakdown from inflammation or vascular disorders. Analysis of retinal cystic spaces holds significance in the detection, treatment and prognostication of several ocular diseases such as age-related macular degeneration and diabetic macular edema. Segmentation of intra-retinal cysts (IRCs) and their quantification is important for retinal pathology and severity characterization. In recent years, automated segmentation of intra-retinal cysts from optical coherence tomography (OCT) B-scans has gained significance in the field of retinal image analysis. In this thesis, a benchmark study is conducted to compare existing methods and to identify the factors affecting IRC segmentation from OCT scans. A modular approach is employed to standardize the different IRC segmentation algorithms, followed by an analysis of variations in automated cyst segmentation performance and of method scalability across image acquisition systems, using the publicly available OPTIMA cyst segmentation challenge dataset. Such an exhaustive analysis of the scalability of OCT cyst segmentation methods in terms of methodological and input data variations has not been done before. An efficient cyst segmentation technique must be capable of performing cyst identification and delineation with minimum errors. Several methods proposed in the literature fail to delineate cysts up to their true boundary. To address this problem, an unsupervised vendor-dependent method using marker-controlled watershed transformation is proposed in this thesis. The method has two stages: k-means clustering is used to identify cysts in the form of markers, followed by a topographical watershed transform for the final segmentation. 
Qualitative and quantitative evaluation of the proposed method is carried out against ground truth obtained from two graders on the OPTIMA cyst segmentation challenge dataset (Spectralis vendor OCT scans). The results show that the proposed method outperformed the other unsupervised methods considered. Several segmentation methods have been proposed in the literature for IRC segmentation on vendor-specific OCT images, but these lack generalizability across imaging systems. To address this issue, a fully convolutional network (FCN) model for vendor-independent IRC segmentation is proposed in this thesis. The proposed FCN was trained using the OPTIMA cyst segmentation challenge dataset (with images from four different vendors, namely Cirrus, Nidek, Spectralis and Topcon). This method counteracts image noise variability and model over-fitting through data augmentation and hyper-parameter optimization. Additionally, a sensitivity analysis of the model hyperparameters (depth and receptive field size) is performed to optimize the proposed FCN. In terms of the Dice Correlation Coefficient, the proposed method outperforms the algorithms published in the OPTIMA cyst segmentation challenge. Deeper FCNs exhibit better feature learning capabilities than shallower networks, but they are computationally intensive due to their large number of parameters and may be prone to the vanishing gradient problem. To address this issue, an end-to-end convolutional neural network architecture based on depthwise separable convolutional filters with swish activation functions is proposed in this thesis. The OPTIMA cyst segmentation challenge dataset, with scans from four different vendors, was used to evaluate the proposed architecture for the vendor-independent IRC segmentation task. The experimental results show that the proposed method significantly reduced the number of parameters compared to a regular convolution based FCN.
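    Illustrative sketch (not taken from the thesis): the parameter saving from depthwise separable convolutions follows from simple counting. A regular convolution learns one k×k filter per input-output channel pair, while a depthwise separable one learns a k×k filter per input channel followed by a 1×1 pointwise convolution. The layer sizes below are assumptions chosen only to show the arithmetic, and bias terms are ignored.

```python
def regular_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard 2D convolution (bias terms ignored)."""
    return kernel * kernel * in_channels * out_channels


def separable_conv_params(kernel, in_channels, out_channels):
    """Weights in a depthwise separable 2D convolution (bias terms ignored)."""
    depthwise = kernel * kernel * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels     # 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
regular = regular_conv_params(3, 64, 128)
separable = separable_conv_params(3, 64, 128)
print(regular, separable, round(regular / separable, 2))
```

    The ratio between the two counts grows with the kernel size and the number of output channels, which is why the saving becomes more pronounced in wider and deeper networks.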