Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
14 results
Search Results
Item BEV Detection and Localisation using Semantic Segmentation in Autonomous Car Driving Systems (Institute of Electrical and Electronics Engineers Inc., 2021) Ashwin Nayak, U.; Naganure, N.; Kamath S., S.
In autonomous vehicles, the perception system plays an important role in environment modeling and object detection in 3D space. Existing perception systems use various sensors to localize and track the surrounding obstacles, but have some limitations. Most existing end-to-end autonomous systems are computationally heavy, as they are built on multiple deep networks trained to detect and localize objects, thus requiring custom, high-end computation devices with high compute power. To address this issue, we propose and experiment with different semantic segmentation-based models for Bird's Eye View (BEV) detection and localization of surrounding objects such as vehicles and pedestrians from LiDAR (light detection and ranging) point clouds. Voxelisation techniques are used to transform 3D LiDAR point clouds into 2D RGB images. The semantic segmentation models are trained from the ground up on the Lyft Level 5 dataset. During experimental evaluation, the proposed approach achieved a mean average precision score of 0.044 for UNet, 0.041 for SegNet and 0.033 for FCN, while being significantly less compute-intensive than state-of-the-art approaches. © 2021 IEEE.

Item M-CAD: Towards Multi-Categorical Auto Diagnosis of Varied Diseases using Deep Learning (Institute of Electrical and Electronics Engineers Inc., 2021) Praveen, K.; Patil, N.; Srikanth, C.S.; Nayaka, J.
The economic burden and the number of lives lost due to diagnostic errors are higher than ever due to the onset of pandemics and new viruses. Medium- and low-income nations, including India, are affected especially heavily in terms of capital and human resources.
Due to limited expertise in diagnostic technologies in remote parts of India and many low-income nations of Africa, autonomous diagnostics can save millions of lives and lower costs. To accomplish this goal, we propose a method that uses modern developments in deep learning for semantic segmentation and classification to predict multiple diseases from multiple medical images. To conduct the study, we test the model with dermoscopy images and CT scans to predict eight classes relating to melanoma, the COVID-19 virus and different types of carcinoma. The setup is tested on the largest publicly available ISIC dermoscopy dataset combined with 1061 CT-scan images for classification, with segmentation performed for melanoma only. The classification model (M-CAD) is progressively tested by increasing the number of classes and the amount of data it trains on. This pilot study is conducted on a small subset of the complete data. Segmentation of melanoma images obtained an accuracy of 96.6%, compared to a human expert agreement of 90.9%. We were able to produce an average accuracy of 81.5% and an AUC of 0.94 for six classes using CT scans, whereas the accuracy and AUC for all eight classes are 80.2% and 0.97, respectively. These results are quite promising for a model that classifies different images with no apparent relation at all. © 2021 IEEE.

Item Semantic Segmentation on Low Resolution Cytology Images of Pleural and Peritoneal Effusion (Institute of Electrical and Electronics Engineers Inc., 2022) Aboobacker, S.; Verma, A.; Vijayasenan, D.; Sumam David, S.; Suresh, P.K.; Sreeram, S.
Automation in the detection of malignancy in effusion cytology helps to save time and workload for cytopathologists. Cytopathologists typically consider a low-resolution image to identify the malignant regions. The identified regions are then scanned at a higher resolution to confirm malignancy by investigating cell-level behaviour.
Scanning and processing time can be saved by zooming in on only the identified malignant regions instead of entire low-resolution images. This work predicts malignancy in cytology images at a very low resolution (4X). Annotating cytology images at a very low resolution is challenging due to the blurring of features such as nuclei and texture. We address this issue by upsampling the very low-resolution images using adversarial training. This work develops a semantic segmentation model trained on 10X images and reuses the network to utilize the 4X images. The prediction results on low-resolution images improved by 15% in average F-score for adversarial-based upsampling compared to a bicubic filter. The high-resolution model gives a 95% average F-score on high-resolution images. In addition, the sub-area of the whole slide that needs to be scanned at high magnification is reduced by approximately 61% when using adversarial-based upsampling compared to a bicubic filter. © 2022 IEEE.

Item ACR2UNet: Semantic Segmentation of Remotely Sensed Images using Residual-Recurrent UNet and Asymmetric Convolutions (Institute of Electrical and Electronics Engineers Inc., 2023) Putty, A.; Annappa, B.
Land-use and land-cover (LULC) mapping is one of the significant components of environmental monitoring. LULC mapping, necessary to manage the vital resource of land, has in recent years been achieved by segmenting remotely sensed images (RSIs). A standard paradigm for segmentation is UNet, and this paper proposes a novel asymmetric convolutional residual-recurrent UNet architecture, which utilizes the power of asymmetric convolutions as well as residual and recurrent techniques for mapping RSIs. The proposed methodology has a couple of additional advantages. First, asymmetric convolution operations strengthen the square kernels and enhance the semantic feature space. Further, a recurrent network assists in providing rich local contextual information with the help of residual inputs.
The presented model is evaluated on the WHDLD dataset for LULC segmentation and is found to achieve an improvement of 1-2% in the mIoU score compared to state-of-the-art methods. © 2023 IEEE.

Item Semantic Segmentation of Remotely Sensed Images using Multisource Data: An Experimental Analysis (Institute of Electrical and Electronics Engineers Inc., 2024) Putty, A.; Annappa, B.; Prajwal, R.; Pariserum Perumal, S.P.
Remotely sensed data obtained from diverse sensors provide rich information for a wide range of applications in remote sensing, such as land-use and land-cover mapping. Due to the availability of a large amount of data, advanced deep-learning techniques have been incorporated into this domain. However, these techniques require a significant amount of annotated data, which can be challenging to obtain for land-use and land-cover mapping. Multisource data fusion has become crucial in remotely sensed image analysis to overcome this challenge, providing significant benefits across various applications. This paper analyzes the fusion of multisource data tailored for land-use and land-cover mapping. The analysis shows that incorporating the novel knowledge-transfer approach from multisource data helped to achieve a 1-6% improvement in mIoU on the Kaggle Aerial Image dataset. © 2024 IEEE.

Item DIResUNet: Architecture for multiclass semantic segmentation of high resolution remote sensing imagery data (Springer, 2022) Priyanka; Sravya, N.; Lal, S.; Nalini, J.; Chintala, C.S.; Dell’Acqua, F.
Scene understanding is an important task in information extraction from high-resolution aerial images, an operation often involved in remote sensing applications. Recently, semantic segmentation using deep learning has become an important method for achieving state-of-the-art performance in pixel-level classification of objects. The latter is still a challenging task due to large pixel variance within classes, possibly coupled with small pixel variance between classes.
This paper proposes an artificial intelligence (AI)-based approach to this problem by designing the DIResUNet deep learning model. The model is built by integrating the inception module, a modified residual block, and a dense global spatial pyramid pooling (DGSPP) module, in combination with the well-known U-Net scheme. The modified residual blocks and the inception module extract multi-level features, whereas DGSPP extracts contextual intelligence. In this way, both local and global information about the scene is extracted in parallel using dedicated processing structures, resulting in a more effective overall approach. The performance of the proposed DIResUNet model is evaluated on the Landcover and WHDLD high-resolution remote sensing (HRRS) datasets. We compared DIResUNet's performance with recent benchmark models such as U-Net, UNet++, Attention UNet, FPN, UNet+SPP, and DGRNet to prove the effectiveness of our proposed model. Results show that the proposed DIResUNet model outperforms the benchmark models on the two HRRS datasets. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Item Semantic segmentation of low magnification effusion cytology images: A semi-supervised approach (Elsevier Ltd, 2022) Aboobacker, S.; Vijayasenan, D.; Sumam David, S.; Suresh, P.K.; Sreeram, S.
Cytopathologists examine microscopic images obtained at various magnifications to identify malignancy in effusions. They locate the malignant cell clusters at a low magnification and then zoom in to investigate cell-level features at a high magnification. This study predicts malignancy at low magnification levels such as 4X and 10X in effusion cytology images to reduce scanning time. However, the most challenging problem is annotating the low magnification images, particularly the 4X images. This paper extends two semi-supervised learning (SSL) models, MixMatch and FixMatch, for semantic segmentation.
The original FixMatch and MixMatch algorithms are designed for classification tasks. When image augmentation is performed, the generated pseudo-labels are spatially altered. We introduce reverse augmentation to compensate for the effect of these spatial alterations. The extended models are trained using labelled 10X and unlabelled 4X images. The average F-score of benign and malignant pixels on the predictions of 4X images improved by approximately 9% for both Extended MixMatch and Extended FixMatch compared with the baseline model. In Extended MixMatch, 62% of the sub-regions of low magnification images are eliminated from scanning at a higher magnification, thereby saving scanning time. © 2022 Elsevier Ltd

Item Semantic context driven language descriptions of videos using deep neural network (Springer Science and Business Media Deutschland GmbH, 2022) Naik, D.; Jaidhar, C.D.
The massive addition of text, image, and video data to the internet has made computer vision-based tasks challenging in the big data domain. Despite recent exploration of video data and progress in visual information captioning, it remains an arduous task in computer vision. Visual captioning involves integrating visual information with natural language descriptions. This paper proposes an encoder-decoder framework with a 2D Convolutional Neural Network (CNN) model and a layered Long Short-Term Memory (LSTM) network as the encoder, and an LSTM model integrated with an attention mechanism as the decoder, with a hybrid loss function. Visual feature vectors extracted from the video frames using the 2D-CNN model capture spatial features. Specifically, the visual feature vectors are fed into the layered LSTM to capture temporal information. The attention mechanism enables the decoder to perceive and focus on relevant objects and correlate the visual context and language content to produce semantically correct captions.
The visual features and GloVe word embeddings are input into the decoder to generate natural semantic descriptions for the videos. The performance of the proposed framework is evaluated on the video captioning benchmark dataset Microsoft Video Description (MSVD) using various well-known evaluation metrics. The experimental findings indicate that the suggested framework outperforms state-of-the-art techniques. Compared to state-of-the-art research methods, the proposed model significantly improved all measures, B@1, B@2, B@3, B@4, METEOR, and CIDEr, with scores of 78.4, 64.8, 54.2, 43.7, 32.3, and 70.7, respectively. The improvement in all scores indicates a better grasp of the context of the inputs, which results in more accurate caption prediction. © 2022, The Author(s).

Item Automated Molecular Subtyping of Breast Carcinoma Using Deep Learning Techniques (Institute of Electrical and Electronics Engineers Inc., 2023) Niyas, S.; Bygari, R.; Naik, R.; Viswanath, B.; Ugwekar, D.; Mathew, T.; Kavya, J.; Kini, J.R.; Rajan, J.
Objective: Molecular subtyping is an important procedure for the prognosis and targeted therapy of breast carcinoma, the most common type of malignancy affecting women. Immunohistochemistry (IHC) analysis is the widely accepted method for molecular subtyping. It involves the assessment of four molecular biomarkers, namely estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and antigen Ki67, using appropriate antibody reagents. Conventionally, these biomarkers are assessed manually by a pathologist, who finally combines the individual results to identify the molecular subtype. Molecular subtyping necessitates the status of all four biomarkers together, and to the best of our knowledge, no such automated method exists. This paper proposes a novel deep learning framework for automatic molecular subtyping of breast cancer from IHC images.
Methods and procedures: A modified LadderNet architecture is proposed to segment the immunopositive elements from ER, PR, HER2, and Ki67 biomarker slides. This architecture uses long skip connections to pass encoder feature space from different semantic levels to the decoder layers, allowing concurrent learning with multi-scale features. The entire architecture is an ensemble of multiple fully convolutional neural networks, and learning pathways are chosen adaptively based on the input data. The segmentation stage is followed by a post-processing stage that quantifies the extent of immunopositive elements to predict the final status of each biomarker. Results: The performance of the segmentation models for each IHC biomarker is evaluated qualitatively and quantitatively. Furthermore, the biomarker prediction results are also evaluated. The results obtained by our method are in high concordance with manual assessment by pathologists. Clinical impact: Accurate automated molecular subtyping can speed up this pathology procedure, reduce pathologists' workload and associated costs, and facilitate targeted treatment to obtain better outcomes. © 2013 IEEE.

Item DPPNet: An Efficient and Robust Deep Learning Network for Land Cover Segmentation From High-Resolution Satellite Images (Institute of Electrical and Electronics Engineers Inc., 2023) Sravya, N.; Priyanka; Lal, S.; Nalini, J.; Chintala, C.S.; Dell’Acqua, F.
Visual understanding of land cover is an important task in information extraction from high-resolution satellite images, an operation often involved in remote sensing applications. Multi-class semantic segmentation of high-resolution satellite images has become an important research topic because of its wide range of real-life applications. Although the scientific literature reports several deep learning methods that provide good results in segmenting remotely sensed images, these are generally computationally expensive.
An open challenge remains in developing a robust deep learning model capable of improving performance while requiring lower computational complexity. In this article, we propose a new model termed DPPNet (Depth-wise Pyramid Pooling Network), which uses a newly designed Depth-wise Pyramid Pooling (DPP) block and a dense block with multi-dilated depth-wise residual connections. The proposed DPPNet model is evaluated and compared with benchmark semantic segmentation models on the Land-cover and WHDLD high-resolution space-borne sensor (HRS) datasets. The proposed model achieves DC, IoU, OA, and Ka scores of (88.81%, 78.29%), (76.35%, 60.92%), (87.15%, 81.02%), and (77.86%, 72.73%) on the Land-cover and WHDLD HRS datasets, respectively. Results show that the proposed DPPNet model provides better performance, in both quantitative and qualitative terms, on these standard benchmark datasets than current state-of-the-art methods. © 2017 IEEE.
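Several of the entries above report segmentation quality as mIoU or IoU scores. As a minimal illustration of how that metric is defined (this sketch is not taken from any of the listed papers; the label arrays and class count below are purely hypothetical), mean IoU averages per-class intersection-over-union over the classes that actually occur:

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred, target: flat sequences of integer class labels, same length.
    Classes absent from both are skipped so they do not distort the mean.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Hypothetical 2-class example on a flattened 2x2 label map:
pred = [0, 1, 1, 1]
target = [0, 1, 0, 1]
# class 0: inter 1 / union 2 = 0.5; class 1: inter 2 / union 3 = 0.667
print(round(mean_iou(pred, target, 2), 3))  # -> 0.583
```

A 1-2% mIoU gain, as reported by ACR2UNet, therefore reflects an average improvement spread across all land-cover classes rather than any single class.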

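The first entry in the list transforms 3D LiDAR point clouds into 2D BEV images via voxelisation before segmentation. A minimal occupancy-style sketch of that idea follows; the grid extents, cell size, and point cloud are illustrative assumptions, not values from the paper, and a real pipeline would add height and intensity channels to build the RGB-like BEV image the abstract describes:

```python
def points_to_bev(points, x_range=(0.0, 4.0), y_range=(0.0, 4.0), cell=1.0):
    """Bin (x, y, z) LiDAR points into a top-down occupancy grid.

    Each grid cell counts the points whose (x, y) fall inside it;
    points outside the configured ranges are dropped.
    """
    w = int((x_range[1] - x_range[0]) / cell)
    h = int((y_range[1] - y_range[0]) / cell)
    grid = [[0] * w for _ in range(h)]
    for x, y, z in points:
        if x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]:
            col = int((x - x_range[0]) / cell)
            row = int((y - y_range[0]) / cell)
            grid[row][col] += 1
    return grid

# Hypothetical cloud: two points share cell (0, 0), one lands in (2, 3),
# and one lies outside the grid and is dropped.
cloud = [(0.5, 0.5, 0.2), (0.6, 0.4, 1.1), (3.2, 2.8, 0.0), (9.0, 9.0, 0.0)]
bev = points_to_bev(cloud)
print(bev[0][0], bev[2][3])  # -> 2 1
```

Once flattened to a 2D grid like this, ordinary image segmentation networks (UNet, SegNet, FCN, as in the abstract) can be applied directly to the BEV representation.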