2. Thesis and Dissertations
Permanent URI for this community: https://idr.nitk.ac.in/handle/1/10
2 results
Search Results
Item: Automatic Detection of Malignancy in Low-Magnification Effusion Cytology Images (National Institute of Technology Karnataka, Surathkal, 2024)
Aboobacker, Shajahan; Deepu, Vijayasenan; Sumam David S.

This thesis focuses on developing an integrated system for automatically detecting malignancy in effusion cytology images. Effusion cytology plays a crucial role in diagnosing various diseases, including cancer, by analyzing the cells present in body fluids. This study aims to develop an integrated system that can handle different resolutions of images and accurately detect malignant cases. Effusion cytology greatly benefits from the automatic detection of malignant cells, which provides significant assistance to cytopathologists. However, conventional automation algorithms often rely on high-magnification images for analysis. In contrast, cytopathologists consider multiple magnifications when evaluating cytology images for malignancy. Lower-magnification images capture a larger area in a single frame than high-magnification images. This allows cytopathologists to identify regions of interest (ROI) using textural and morphological characteristics of cell clusters. Once identified, the ROIs are examined at a higher magnification for a closer evaluation at the cell level. Initially, we trained the existing state-of-the-art models with high-magnification images for the semantic segmentation and classification of effusion cytology images. We obtained state-of-the-art results for the semantic segmentation task with a mean F-score of 0.82, and classification performance with a sensitivity of 1, a specificity of 0.85, and an area under the curve (AUC) of 0.98. Using lower-magnification images to identify malignant areas can nevertheless be beneficial, as it reduces memory requirements and scanning time by restricting higher-magnification scanning to the ROI.
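The sensitivity, specificity, and F-score quoted above are standard confusion-matrix metrics. The following is a minimal sketch of how they are computed, not the thesis code. The `Confusion` class and the example counts are hypothetical, chosen only so that a sensitivity of 1 and a specificity of 0.85 appear; they are not data from the thesis. For a segmentation task, a mean F-score would typically average such per-class values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts, with 'malignant' as the positive class."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a class is absent.
    return num / den if den else 0.0


def sensitivity(c: Confusion) -> float:
    """Fraction of malignant cases that were detected."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Fraction of benign cases that were correctly cleared."""
    return _ratio(c.tn, c.tn + c.fp)


def f_score(c: Confusion) -> float:
    """Harmonic mean of precision and recall (F1)."""
    return _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn)


# Hypothetical counts for illustration only (not thesis data).
example = Confusion(tp=10, fp=3, tn=17, fn=0)
print(sensitivity(example), specificity(example), f_score(example))
```

A sensitivity of 1 means no malignant case is missed, which matters most in a screening setting; the cost shows up as false positives, reflected in the lower specificity.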
However, detecting malignancy in low-magnification images (4X) is challenging due to the blurring of features such as texture and nuclei. This blurring also makes it difficult to label the images accurately at low magnification levels. Therefore, an alternative method is needed that doesn’t rely on labels for the lowest magnification. We propose two methods for the semantic segmentation of low-magnification images. The first method is based on semi-supervised learning, and the second uses a combination of unsupervised, few-shot and weakly supervised learning. In the semi-supervised approach, we extend the MixMatch and FixMatch algorithms from the classification task to semantic segmentation. To achieve this, we apply augmentation to the images and reverse augmentation to the predicted labels. The proposed methods allow the unlabelled 4X images to be used along with the labelled 10X images to train the semantic segmentation model. With Extended FixMatch and Extended MixMatch, the average F-score of benign and malignant pixels on the 4X images improves by approximately 9% compared with the predictions for the 4X data from the 10X semantic segmentation model. Extended MixMatch reduces the area to be scanned at a higher magnification by approximately 62%: only 38% of the sub-regions of low-magnification images have to be scanned at a higher magnification, thereby saving scanning time. While the semi-supervised approach reduces the reliance on pixel-wise labels for the 4X data, it still requires labelled data at a higher magnification level, specifically at 10X. We propose WeakSegNet, a novel approach that combines unsupervised, few-shot and weakly supervised methods for the semantic segmentation of low-magnification effusion cytology images.
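The idea of augmenting an image and then reverse-augmenting the predicted label, mentioned above for the semi-supervised approach, can be pictured with a small sketch. This is an illustration under stated assumptions, not the thesis implementation: `toy_model` is a hypothetical stand-in for a segmentation network, a horizontal flip is the only augmentation shown, and the 0.95 confidence threshold is an assumed value.

```python
from typing import Callable, List, Optional

Grid = List[List[float]]
Labels = List[List[Optional[int]]]


def hflip(grid: Grid) -> Grid:
    """Horizontal flip; applying it twice restores the original grid."""
    return [row[::-1] for row in grid]


def aligned_prediction(model: Callable[[Grid], Grid], image: Grid,
                       augment: Callable[[Grid], Grid],
                       reverse: Callable[[Grid], Grid]) -> Grid:
    """Predict on the augmented image, then map the per-pixel prediction
    back onto the original pixel grid with the reverse augmentation."""
    return reverse(model(augment(image)))


def pseudo_labels(probs: Grid, threshold: float = 0.95) -> Labels:
    """Keep only confident pixels: 1 = malignant, 0 = not malignant,
    None = too uncertain to be used as a training target."""
    return [[1 if p >= threshold else 0 if p <= 1 - threshold else None
             for p in row] for row in probs]


def toy_model(image: Grid) -> Grid:
    # Hypothetical stand-in for a network: the "malignancy probability"
    # of each pixel is simply its clipped intensity.
    return [[min(1.0, max(0.0, px)) for px in row] for row in image]


image = [[0.0, 0.99], [0.5, 0.01]]
probs = aligned_prediction(toy_model, image, hflip, hflip)
print(pseudo_labels(probs))
```

Unlike classification, where a flipped image keeps the same class label, a segmentation output moves with the pixels, so predictions from differently augmented views can only be compared or averaged after they are brought back to a common grid.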
By leveraging image-level labels and a small number of images with pixel-wise labels, our model achieves accurate and efficient detection of malignancy. Our approach utilizes unlabelled low-magnification images for training, reducing the need for manual annotations. The significant reduction in higher-magnification scanning achieved by our model (approximately 47%) demonstrates its potential for time and resource savings. Overall, our approaches offer an effective solution for automating malignancy detection in low-magnification images, improving efficiency in cytology analysis.

Item: A Region Based Semantic Composition Framework to Visual Image and Video Event Specification (National Institute Of Technology Karnataka Surathkal, 2023)
Naik, Dinesh; C D, Jaidhar

A long-standing goal of artificial intelligence in Computer Vision has been to develop models capable of perceiving and comprehending the complex visual environment around us and communicating with us in natural language about it. Significant progress has been achieved toward this goal over the last few years as a result of parallel advancements in computing systems, data collection, and algorithms. Visual recognition has advanced at a breakneck pace, with computers now capable of classifying images, recognising them, and describing them in even longer words. They exceed humans in various categories, even surpassing them in some instances. Despite tremendous progress, the majority of improvements in visual recognition continue to occur when an image is labelled with one or a few different labels and swiftly explained in natural language. The majority of people find it straightforward to watch a brief video and describe what occurred (in words). Machines have a difficult time extracting meaning from video frames and generating a sentence description. Computer vision research has long been focused on comprehending visual media, such as images and videos.
Additionally, a new issue within the scope of this study area, dynamic image and video transcription, has sparked the interest of a large number of people. This research presents models and methods for associating visual data with semantic labels and visual data with natural language utterances, thereby simplifying translation between domain constituents.

Semantic segmentation is a fundamental component of object recognition models, as it aims to classify things on a pixel-by-pixel basis. The primary goal of this research is to classify an individual object within an image pixel by pixel. The provided image is evaluated to ascertain the pixel-level properties that are present. Second, we propose an encoder-decoder architecture with a hybrid loss function that employs a layered LSTM as the encoder and an LSTM model combined with an attention mechanism as the decoder. Third, we propose a unique framework for video captioning that combines a bidirectional multi-layer LSTM encoder and a unidirectional decoder with a temporal attention technique to produce superior global representations for videos. Finally, we propose an efficient method for captioning videos using a CNN in conjunction with a short-connected LSTM-based encoder-decoder model and a phrase context vector.
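The temporal attention step in the video-captioning framework can be illustrated with a small sketch. It is a generic illustration, not the thesis model: dot-product scoring is assumed here (the thesis may use a different scoring function), and the function names and toy feature vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frame_feats: Sequence[Sequence[float]],
                       decoder_state: Sequence[float]
                       ) -> Tuple[List[float], List[float]]:
    """Weight each frame by its relevance to the current decoder state
    and return (attention weights, weighted-sum context vector)."""
    # Assumed scoring function: dot product of frame feature and state.
    scores = [sum(f * h for f, h in zip(frame, decoder_state))
              for frame in frame_feats]
    weights = softmax(scores)
    dim = len(frame_feats[0])
    context = [sum(w * frame[d] for w, frame in zip(weights, frame_feats))
               for d in range(dim)]
    return weights, context


# Toy per-frame encoder outputs (three frames, two features each).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = temporal_attention(frames, state)
print(weights, context)
```

At each decoding step the weights are recomputed from the current decoder state, so different words in the caption can draw on different frames of the video.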
