2. Theses and Dissertations

Permanent URI for this community: https://idr.nitk.ac.in/handle/1/10

Search Results

Now showing 1 - 4 of 4
  • Item
    A Framework for Human Activity and Behavioural Pattern Recognition in Multimodal Sensor Smart Home Environment
    (National Institute of Technology Karnataka, Surathkal, 2024) Kolkar, Ranjit; Geetha V.
    Human Activity Recognition (HAR) has become a subject of significant interest due to its potential applications in various fields, including healthcare, sports, and user profiling. There are four main types of sensor-based HAR: wearable, ambient, camera, and hybrid sensor-based recognition. Smartphones, with their built-in sensors, have emerged as valuable tools for HAR, and other sensors such as Passive Infrared (PIR) sensors, load sensors, smart switches, and smartwatches are extensively used in HAR systems along with vision-based sensors. Despite advancements, accurately recognizing human activities remains challenging due to the complexity and diversity of the sensors used and the intricate nature of human activities. Each sensor type has advantages and limitations, so selecting appropriate sensors is a challenging task that requires a comprehensive understanding of their characteristics. While there are existing applications of HAR, significant opportunities remain to address various challenges. This work addresses several of them: improving recognition efficiency, integrating multimodal sensors, achieving synchronization between heterogeneous sensors, collecting long-hour data with these sensors, and developing a cost-effective framework for recognizing human activities and behavioural patterns in the daily life of an elderly person. The thesis addresses these challenges and develops a framework for HAR and behavioural pattern recognition using multimodal sensors in a smart home environment. First, we design and develop a deep learning-based solution to recognize activities based on the sensors present in a smartphone. Later, we create and curate a dataset of long-hour human activities in a multimodal sensor-equipped smart home environment, and then design and develop a human behavioural pattern recognition system for that environment.
The first work compares the performance of various deep learning models, namely Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) networks, for HAR using smartphone-based sensors. The study explored various datasets and recognition models, providing valuable insights into the overall HAR architecture. The primary objective of this research is to accurately recognize basic human activities such as walking, sitting, standing, going upstairs, going downstairs, and lying down. The models were trained and evaluated on well-known datasets such as Wireless Sensor Data Mining (WISDM) and University of California, Irvine, Human Activity Recognition (UCI-HAR). Through rigorous experimentation, the performance on these datasets was significantly improved using the GRU model, laying the foundation for the subsequent research objectives. Additionally, the thesis proposes a novel approach, a Spider Monkey Optimization (SMO)-based deep neural network, to further enhance HAR accuracy and precision. The proposed system was evaluated on various datasets involving similar activities, including UCI-HAR, WISDM, Royal Institute of Technology (KTH) action, and Physical Activity Monitoring using Accelerometers, Gyroscopes, and Magnetometers (PAMAP2). The optimization improved performance and reduced training time, making the approach practical for real-world applications. The second work in the thesis involves the collection of long-hour datasets using a multimodal approach. It has been observed from the literature and our previous work that understanding human behaviour patterns solely from basic activities and smartphone sensors is challenging. Therefore, in this work, we combined smartphone sensor data with ambient sensors to better understand the user's context.
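The gating mechanism that makes the GRU effective on such sensor streams can be sketched in plain NumPy; the weight shapes and the 3-axis accelerometer input below are illustrative, not taken from the thesis:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate hidden state
    return (1 - z) * h + z * h_tilde           # interpolated new state

# Tiny demo: 3-axis accelerometer samples, 4 hidden units
rng = np.random.default_rng(0)
params = [rng.normal(scale=0.1, size=s) for s in
          [(3, 4), (4, 4), (3, 4), (4, 4), (3, 4), (4, 4)]]
h = np.zeros(4)
for x in rng.normal(size=(10, 3)):   # 10 time steps of sensor readings
    h = gru_step(x, h, params)
print(h.shape)  # (4,)
```

In a full HAR model the final hidden state would feed a softmax classifier over the activity classes.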
The context includes room occupancy detection using PIR sensors, water bottle level indication using load sensors, and monitoring the status of the TV, bathroom lights, and mirror bulb lights using smart switches. By combining these sources and proposing a hybrid sensor-based data collection approach for two individuals over an extended period, we derived a broader range of activities beyond the basic ones. The third work in the thesis also proposes a novel priority-based labelling technique for data segmentation that retains user context while labelling. This enhanced dataset enables us to gain valuable insights into human behaviour patterns in day-to-day life. Additionally, through a comprehensive analysis of user data, we can derive the user's personality and provide feedback on their behaviour patterns to improve or analyze activities performed over time. The research identifies various applications, such as elderly monitoring systems, personality identification, and behaviour analysis, all aimed at improving health and well-being. KEYWORDS: HAR, SMO, Wearable sensors, Smartphone sensors, Deep learning, Ambient sensors, Internet of Things (IoT), PIR, Elderly monitoring, User profiling, Behaviour patterns.
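The priority-based labelling idea can be illustrated with a small sketch: when several sensor events overlap a time window, the highest-priority event supplies the label. The sensor names and priority values here are hypothetical, since the abstract does not specify them:

```python
# Hypothetical priority table: higher number wins when events overlap a window.
PRIORITY = {"bathroom_light": 3, "tv_on": 2, "pir_living_room": 1}

def label_window(window_start, window_end, events):
    """events: list of (sensor, start, end); returns the winning sensor label."""
    active = [(PRIORITY[s], s) for s, a, b in events
              if a < window_end and b > window_start]   # interval-overlap test
    return max(active)[1] if active else "idle"

events = [("pir_living_room", 0, 100), ("tv_on", 20, 60)]
print(label_window(10, 30, events))  # tv_on outranks the PIR event
```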
  • Item
    Unobtrusive Context-Aware Human Identification and Action Recognition System for Smart Environments
    (National Institute of Technology Karnataka, Surathkal, 2023) M, Rashmi; Reddy Guddeti, Ram Mohana
    A smart environment has the ability to securely integrate multiple technological solutions to manage its assets, such as the information systems of local government departments, schools, transportation networks, hospitals, and other community services. Such environments utilize low-power sensors, cameras, and software with Artificial Intelligence to continuously monitor the system's operation. Smart environments require appropriate monitoring technologies for a secure living environment and efficient management. Global security threats have produced a considerable demand for intelligent surveillance systems in smart environments. Consequently, the number of cameras deployed in smart environments to record the happenings in the vicinity is increasing rapidly. In recent years, the proliferation of cameras such as Closed Circuit Television (CCTV), depth sensors, and mobile phones used to monitor human activities has led to an explosion of visual data. It requires considerable effort to interpret and store all of this visual data. Numerous applications of intelligent environments rely on the content of captured videos, including smart video surveillance to monitor human activities, crime detection, intelligent traffic management, human identification, etc. Intelligent surveillance systems must perform unobtrusive human identification and human action recognition to ensure a secure and pleasant life in a smart environment. This research thesis presents various approaches using advanced deep learning technology for unobtrusive human identification and human action recognition based on visual data in various data modalities. It explores the unobtrusive identification of humans based on skeleton and depth data. Also, several methods for recognizing human actions using RGB, depth, and skeleton data are presented.
Initially, a domain-specific human action recognition system employing RGB data for a computer laboratory in a college environment is introduced. A dataset of human actions particular to the computer laboratory environment is generated using spontaneous video data captured by cameras installed in laboratories. The dataset contains several instances of five distinct human actions in college computer laboratories. Also, a human action recognition system based on transfer learning is presented for locating and recognizing multiple human actions in an RGB image. Human action recognition systems based on skeleton data are developed and evaluated on publicly available datasets using benchmark evaluation protocols and metrics. Skeleton data-based action recognition mainly concentrates on the 3D coordinates of various skeleton joints of the human body. This research thesis presents several efficient action representation methods derived from the data sequence in skeleton frames. A skeleton data-based human action recognition system places the skeleton joints in a specific order, and the distances between joints are extracted as features. A multi-layer deep learning model is proposed to learn the features and recognize human actions. Human gait is one of the most useful biometric features for human identification. Vision-based gait data allows human identification unobtrusively. This research thesis presents deep learning-based human identification systems using gait data in skeleton format. We present an efficient feature extraction method that captures the spatial and temporal features of human skeleton joints during walking, focusing specifically on the features of different gait events in the entire gait cycle. Also, deep learning models are developed to learn these features for accurate human identification. The developed models are evaluated on publicly available single- and multi-view gait datasets using various evaluation protocols and performance metrics.
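The inter-joint distance features described above can be sketched as follows; the frame and joint counts are illustrative (25 joints matches common Kinect-style skeletons, but the thesis' exact configuration is not stated):

```python
import numpy as np

def joint_distance_features(skeleton):
    """skeleton: (T, J, 3) array of 3D joint coordinates over T frames.
    Returns a (T, J*(J-1)/2) matrix of pairwise inter-joint distances."""
    T, J, _ = skeleton.shape
    diffs = skeleton[:, :, None, :] - skeleton[:, None, :, :]  # (T, J, J, 3)
    dists = np.linalg.norm(diffs, axis=-1)                     # (T, J, J)
    iu = np.triu_indices(J, k=1)                               # upper triangle
    return dists[:, iu[0], iu[1]]

seq = np.random.default_rng(1).normal(size=(30, 25, 3))  # 30 frames, 25 joints
feats = joint_distance_features(seq)
print(feats.shape)  # (30, 300)
```

Each frame yields 25·24/2 = 300 distances, which a multi-layer model can then learn from.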
In addition, multi-modal human action recognition and human identification systems are developed using skeleton and depth data. This work presents efficient image representations of human actions from the sequence of frames in skeleton and depth data formats. Various deep learning models using CNN, LSTM, and advanced techniques such as Attention are presented to extract and learn the features from image representations of the actions. Another work presents a method focusing on overlapping sub-actions of an action in depth and skeleton format for action representation and feature extraction. In addition, an image representation of the gait cycle in skeleton and depth data, along with a deep learning model, is proposed. Multi-stream deep learning models are proposed to learn features from multi-modal data for human action recognition and human identification, and various score fusion operations are proposed to merge the results from multiple streams of deep learning models to ensure efficient performance. The developed systems are evaluated on publicly available multi-modal datasets for human actions and human gait using standard evaluation protocols.
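The score-fusion step can be illustrated with a minimal late-fusion sketch over per-stream softmax outputs; the fusion modes and example class scores are illustrative, not the thesis' exact operations:

```python
import numpy as np

def fuse_scores(score_list, mode="average"):
    """Late fusion of per-stream softmax scores, each of shape (num_classes,)."""
    scores = np.stack(score_list)
    if mode == "average":
        fused = scores.mean(axis=0)
    elif mode == "max":
        fused = scores.max(axis=0)
    elif mode == "product":
        fused = scores.prod(axis=0)
    else:
        raise ValueError(mode)
    return int(np.argmax(fused))     # predicted class index

skeleton_stream = np.array([0.6, 0.3, 0.1])  # hypothetical softmax outputs
depth_stream    = np.array([0.1, 0.6, 0.3])
print(fuse_scores([skeleton_stream, depth_stream], "average"))  # 1
```

Averaging lets a confident stream correct a weaker one, which is why simple score fusion often outperforms any single modality.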
  • Item
    Deep Learning For Nuclei Segmentation and Classification of Histopathology Images
    (National Institute of Technology Karnataka, Surathkal, 2023) Chanchal, Amit Kumar; Lal, Shyam
    To improve the process of diagnosis and treatment of cancer, automatic segmentation and classification of haematoxylin and eosin (H & E) stained histopathology images are important steps in digital pathology. With the advent of new computation systems such as GPUs and fast digital scanners, and the availability of large amounts of data, Deep Learning (DL) techniques have shown superior performance in different applications of medical image analysis. The potential and applicability of deep learning models for the analysis of histopathology images have been demonstrated by many researchers. Due to variations in the appearance and complex clinical structure of histopathology slides, reported results still need to be improved for accurate diagnosis of disease. An accurate and efficient classification algorithm that exactly resembles the clinical features of cancer is still open-ended research. This thesis investigates a detailed methodology for the design and implementation of deep learning architectures covering nuclei detection and segmentation, characterization of subtypes of cancer, and grading of histopathological tissues. In the first part of the thesis, the analysis of histopathology images using efficient segmentation algorithms is presented. In this study, an effective encoder-decoder architecture with a separable convolution pyramid pooling network (SCPP-Net) is designed and implemented for automatically segmenting complex nuclei present in digital histopathology images. The SCPP unit focuses on two aspects: first, it increases the receptive field by varying four different dilation rates while keeping the kernel size fixed, and second, it reduces the trainable parameters by using depth-wise separable convolution. For multi-organ histopathology analysis, a new deep learning framework is proposed that consists of a high-resolution encoder path, an atrous spatial pyramid pooling (ASPP) bottleneck module, and a powerful decoder.
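The parameter reduction from depth-wise separable convolution can be checked with simple arithmetic; the kernel size and channel counts below are illustrative, not the SCPP-Net configuration:

```python
def conv_params(k, c_in, c_out):
    """Trainable weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depth-wise k x k filter per input channel, then 1x1 pointwise mixing."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 128
std = conv_params(k, c_in, c_out)             # 147456 weights
sep = separable_conv_params(k, c_in, c_out)   # 17536 weights
print(std, sep, round(std / sep, 1))          # roughly 8.4x fewer parameters
```

Dilation changes the receptive field but not the weight count, which is why varying dilation rates with a fixed kernel size enlarges context at no parameter cost.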
The proposed network is wide and deep and effectively leverages the strengths of residual learning as well as encoder-decoder architecture. The problem of the vanished boundary of detected nuclei is addressed by proposing an efficient loss function that better trains the proposed deep structured residual encoder-decoder network (DSREDN) and reduces false predictions. The obtained nuclei segmentation scores indicate that the proposed architectures achieved a considerable margin over state-of-the-art deep learning models on three different publicly available histopathology image datasets. Next, the thesis proposes a novel dataset and an efficient deep-learning framework for the classification of subtypes of renal cell carcinoma (RCC) from kidney histopathological images. The proposed RenalNet is intended to capture cross-channel and inter-spatial features at three different scales in parallel and hold them together. The proposed model contains a new convolutional neural network (CNN) block called multiple channel residual transformation (MCRT), which focuses on the most relevant morphological features of RCC by fusing the information of multiple paths. Further, to improve the network's representation power, a novel block called group convolution deep localization (GCDL) is introduced that effectively integrates three different feature descriptors. A new benchmark dataset for the classification of subtypes of RCC from kidney histopathology images is also introduced as a part of this study. The results of the proposed model are compared with existing DL models trained from scratch as well as networks leveraging transfer learning from pre-trained weights. During the experimentation, the proposed network achieved an accuracy of 91.67% and an F1-score of 91.65% on the proposed kidney dataset, the highest among all competitive models.
The experimental results show that the proposed RenalNet architecture is best in terms of training and prediction time, classification accuracy, F1-score, and computational complexity. A pathologist's report affirmed that the stage and grade of diagnosis are the most important prognostic factors. In such cases, continuous staging and grading evaluation is extremely important for the clinical management of patients. This study proposed a robust and computationally efficient, fully automated Renal Cell Carcinoma Grading Network (RCCGNet) for kidney histopathology images. The proposed shared channel residual (SCR) block shares information between two different layers and operates on the shared data separately, providing beneficial supplements to each other. As a part of this study, a new dataset has also been introduced for the grading of RCC with five different grades. The simulation results include deep learning models trained from scratch as well as transfer learning techniques using pre-trained ImageNet weights. The performance of the proposed RCCGNet is evaluated using the most preferred quality metrics, achieving 90.14% accuracy and an 89.06% F1-score on the introduced kidney dataset. Another proposed architecture, called Robust CNN (RoCNN), targets grading (Normal, Grade-1, Grade-2, Grade-3, and Grade-4) and classification (Normal, KIRC, KIRP, KICH) in kidney cancer tissue. To demonstrate that the proposed model is generalized and independent of the dataset, it was evaluated on two well-known datasets: the KMC kidney dataset of five different grades and the TCGA dataset of four classes. RoCNN is capable of learning features at varying convolutional filter sizes because of the inception modules employed in it. Squeeze-and-Excitation (SE) blocks are used to remove unnecessary contributions from noisy channels and improve model accuracy.
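The channel-gating idea behind SE blocks can be sketched in NumPy; the feature-map size, weight shapes, and reduction ratio are illustrative, not the RoCNN configuration:

```python
import numpy as np

def se_block(x, W1, W2):
    """Squeeze-and-Excitation gating on an (H, W, C) feature map.
    Squeeze: global average pool; excite: two FC layers; then scale channels."""
    z = x.mean(axis=(0, 1))                  # squeeze -> (C,)
    s = np.maximum(z @ W1, 0.0)              # reduction FC + ReLU
    g = 1.0 / (1.0 + np.exp(-(s @ W2)))      # restoration FC + sigmoid gate in (0,1)
    return x * g                             # reweight each channel

rng = np.random.default_rng(2)
C, r = 8, 2                                  # channels and reduction ratio
x = rng.normal(size=(4, 4, C))
W1 = rng.normal(scale=0.1, size=(C, C // r))
W2 = rng.normal(scale=0.1, size=(C // r, C))
y = se_block(x, W1, W2)
print(y.shape)  # (4, 4, 8)
```

Because the gate lies in (0, 1), noisy channels can only be attenuated, never amplified, which is the mechanism for suppressing their contribution.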
Regarding computational complexity, the proposed RoCNN is extremely efficient compared to the reference models. Due to the substantial reduction in computational complexity, incorporation of the proposed method into FPGA-board processing for next-generation histopathological image analysis is a significant step in the right direction. Compared to the best-performing state-of-the-art model, the accuracy of RoCNN shows a significant improvement of about 4.22% and 3.01% on the two datasets. All the proposed deep learning algorithms proved to be promising, stable, and computationally efficient for the analysis of histopathological images.
  • Item
    Computational Analysis of Protein Structure and its Subcellular Localization using Amino Acid Sequences
    (National Institute of Technology Karnataka, Surathkal, 2021) Bankapur, Sanjay S.; Patil, Nagamma.
    A cell is the basic unit of all organisms. In a cellular life cycle, various complex metabolic activities are carried out in different cell compartments. Protein plays an important role in many of these complex metabolic activities. Proteins are generated in the post-transcriptional modification activity of a cell. Initially, a generated protein has a linear structure, called the protein primary structure. Within the cell, proteins tend to move from one compartment (subcellular location) to other compartments, and based on the environment in which the primary-structured proteins reside, they transform into secondary and tertiary structures. Tertiary-structured proteins interact with nearby structured proteins to form a quaternary structure. A protein performs its biological functions when it attains its respective tertiary structure. Identification of a protein's structure and its subcellular locations are challenging and important tasks in the field of medical science. Various health issues are identified and solved via novel drug discoveries, and prior, accurate knowledge of protein structure and its subcellular location helps in developing the respective drug. To identify protein structure and subcellular locations, various biological methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, cell fractionation, fluorescence microscopy, and electron microscopy are used. The main advantage of biological methods is that they are accurate in identifying protein structures and their subcellular locations. Their disadvantages are that they are time-consuming and very expensive. In this post-genomic era, high volumes of protein primary structures are decoded by various research communities and added to protein data banks. Identifying protein structure and subcellular locations using biological methods is not a feasible option for such high volumes of proteins.
Over the decades, various computational methods have been proposed to identify protein structure and its locations; however, the existing computational methods exhibit limited accuracy and hence are less effective. The main objective of this thesis is to propose effective computational models that contribute to the prediction of protein structure and its subcellular locations. In this regard, four important and specific problems have been solved: (i) multiple sequence alignment, (ii) protein secondary structural class prediction, (iii) protein fold recognition, and (iv) protein subcellular localization prediction. The importance of multiple sequence alignment is that vital and consistent homologous patterns of proteins can be captured, and these patterns further help in determining protein structure and its subcellular locations. The proposed alignment method includes three main modules: a) an effective scoring system to score the quality of the aligned sequences, b) a progressive alignment approach, adopted and modified to align multiple sequences, and c) a refinement step in which the aligned sequences are refined using the proposed polynomial-time single-iterative optimization framework. The proposed method has been assessed on publicly available benchmark datasets and recorded a 17.7% improvement over the CLUSTAL X model on the BAliBASE dataset. Identification of the protein secondary structural class is one of the important tasks that further helps in the prediction of protein tertiary structure. Protein secondary structural class prediction is a supervised problem that falls under the multi-class category. The proposed prediction model contains a novel feature modelling strategy that extracts global and local features, followed by a novel ensemble of classifiers to predict the structural class.
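A minimal sum-of-pairs scorer illustrates what a scoring system for alignment quality looks like; the match/mismatch/gap values are placeholders, not the thesis' scheme:

```python
# Hypothetical sum-of-pairs scorer for a multiple sequence alignment:
# every column contributes match/mismatch/gap scores over all sequence pairs.
from itertools import combinations

def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """alignment: equal-length aligned sequences, with '-' as the gap symbol."""
    score = 0
    for col in zip(*alignment):
        for a, b in combinations(col, 2):
            if a == "-" and b == "-":
                continue                  # gap-gap pairs are ignored
            elif a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score

print(sum_of_pairs(["AC-T", "ACGT", "A-GT"]))  # 0
```

A progressive aligner builds the alignment pairwise in guide-tree order and can use such a scorer to compare candidate refinements.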
The proposed model has been assessed on both publicly available benchmark datasets and newly derived high-volume datasets. Its performance recorded an improvement of 5.3% on the 25PDB dataset over one of the best predictors from the literature. Protein fold recognition is the categorization of the various folds a protein exhibits in its tertiary structure. It is a supervised problem that falls under the multi-class category. The proposed fold recognition model contains a novel and effective feature modelling approach that includes Convolutional and SkipXGram bi-gram techniques to extract global and local features, followed by an effective deep learning framework for fold recognition. The proposed model has been assessed on both publicly available benchmark datasets and newly derived high-volume datasets. Its performance recorded a relative improvement of 5% on the DD dataset over one of the best predictors from the literature. Finally, an effective protein sub-chloroplast localization prediction model is proposed to solve a more fine-grained version of the subcellular localization problem. Protein sub-chloroplast localization is a supervised problem that falls under the multi-class and multi-label category. The proposed model contains a novel feature extraction technique, SkipXGram bi-gram, followed by a deep learning framework for multi-label classification. It has been assessed on publicly available benchmark datasets and recorded an absolute improvement of 30.39% on the Novel dataset over the best predictor from the literature.
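The skip-gram bi-gram idea can be sketched as follows; the function and its keying scheme are an assumption, since the abstract does not define SkipXGram precisely:

```python
# Hypothetical skip-gram bi-gram extractor: counts residue pairs
# (seq[i], seq[i+k+1]) for every skip k in 0..max_skip, so local features
# (k = 0, contiguous bi-grams) and longer-range pairs are captured together.
from collections import Counter

def skip_bigram_features(seq, max_skip=2):
    """Returns Counter keyed by (skip, residue_pair) for an amino acid sequence."""
    counts = Counter()
    for k in range(max_skip + 1):
        for i in range(len(seq) - k - 1):
            counts[(k, seq[i] + seq[i + k + 1])] += 1
    return counts

feats = skip_bigram_features("MKVLAA", max_skip=1)
print(feats[(0, "AA")], feats[(1, "LA")])  # 1 1
```

Such counts are typically normalized by sequence length and concatenated across skips into a fixed-size feature vector for the downstream classifier.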