Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 9 of 9
  • Item
    A cascaded convolutional neural network architecture for despeckling OCT images
    (Elsevier Ltd, 2021) Anoop, B.N.; Kalmady, K.S.; Udathu, A.; Siddharth, V.; Girish, G.N.; Kothari, A.R.; Rajan, J.
    Optical Coherence Tomography (OCT) is an imaging technique widely used in medical imaging. Noise in an OCT image generally degrades its quality, obscuring clinical features and making automated segmentation suboptimal. Obtaining higher-quality images requires sophisticated equipment and technology, available only in selected research settings and expensive to acquire. Developing effective denoising methods to improve the quality of images acquired on systems currently in use therefore has the potential to vastly improve image quality and automated quantitative analysis. Noise characteristics in images acquired from machines of different makes and models may vary. Our experiments show that no single state-of-the-art noise-reduction method performs equally well on images from various sources. Therefore, detailed analysis is required to determine the exact noise type in images acquired with different OCT machines. In this work, we studied the noise characteristics of the publicly available DUKE and OPTIMA datasets, which contain OCT images acquired with machines from different manufacturers, to build a more efficient model for noise reduction. We further propose a patch-wise training methodology to build a system that effectively denoises OCT images. An extensive range of experiments shows that the proposed method performs better than other state-of-the-art methods. © 2021 Elsevier Ltd
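The patch-wise training idea described above can be sketched as follows: the denoising network is trained on small overlapping windows cut from each B-scan rather than on full images. This is a minimal illustrative sketch; the patch size and stride here are assumptions, not the paper's values.

```python
import numpy as np

def extract_patches(image, patch_size=64, stride=32):
    """Slide a window over a 2D scan and collect overlapping patches.

    Illustrative sketch of patch-wise training data preparation;
    patch_size and stride are assumed values, not the paper's.
    """
    h, w = image.shape
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    return np.stack(patches)

# A 128x128 scan yields a 3x3 grid of 64x64 patches at stride 32.
scan = np.random.rand(128, 128)
patches = extract_patches(scan)
```

Training on patches multiplies the number of samples per scan and keeps the network input small, which is one common motivation for patch-wise schemes.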
  • Item
    Capsule Network–based architectures for the segmentation of sub-retinal serous fluid in optical coherence tomography images of central serous chorioretinopathy
    (Springer Science and Business Media Deutschland GmbH, 2021) Pawan, S.J.; Sankar, R.; Jain, A.; Jain, M.; Darshan, D.V.; Anoop, B.N.; Kothari, A.R.; Venkatesan, M.; Rajan, J.
    Central serous chorioretinopathy (CSCR) is a chorioretinal disorder of the eye characterized by serous detachment of the neurosensory retina at the posterior pole of the eye. CSCR results from the accumulation of subretinal fluid (SRF) due to idiopathic defects at the level of the retinal pigment epithelium (RPE) that allow serous fluid from the choriocapillaris to diffuse into the subretinal space between the RPE and neurosensory retinal layers. This condition is presently investigated by clinicians using invasive angiography or non-invasive optical coherence tomography (OCT) imaging. OCT images provide a representation of the fluid underlying the retina, and in the absence of automated segmentation tools, currently only a qualitative assessment is used to follow the progression of the disease. Automated segmentation of the SRF can prove extremely useful for assessing progression and for the timely management of CSCR. In this paper, we adopt an existing architecture called SegCaps, based on the recently introduced Capsule Networks concept, for the segmentation of SRF from CSCR OCT images. Furthermore, we propose an enhancement to SegCaps, termed DRIP-Caps, that utilizes the concepts of Dilation, Residual Connections, Inception Blocks, and Capsule Pooling to address the defined problem. The proposed model outperforms the benchmark UNet architecture while reducing the number of trainable parameters by 54.21%. Moreover, it reduces the computational complexity of SegCaps by reducing the number of trainable parameters by 37.85%, with competitive performance. The experiments demonstrate the generalizability of the proposed model, as evidenced by its remarkable performance even with a limited number of training samples. © 2021, International Federation for Medical and Biological Engineering.
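The capsule pooling concept mentioned above can be sketched as pooling that acts only on the spatial grid while leaving each capsule's pose vector intact, so resolution drops without discarding part-whole pose information. This is a minimal numpy sketch under assumed shapes, not the paper's implementation.

```python
import numpy as np

def capsule_pool(caps, window=2):
    """Average-pool capsule feature maps over spatial windows.

    Illustrative sketch: pooling reduces the (H, W) grid while the
    capsule pose vectors (last axis) are preserved. Shapes are assumed.
    caps: array of shape (H, W, num_capsule_types, pose_dim).
    """
    h, w, t, d = caps.shape
    out = caps.reshape(h // window, window, w // window, window, t, d)
    return out.mean(axis=(1, 3))  # average within each spatial window

caps = np.random.rand(8, 8, 4, 16)  # 8x8 grid, 4 capsule types, 16-D poses
pooled = capsule_pool(caps)         # -> shape (4, 4, 4, 16)
```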
  • Item
    An empirical study of the impact of masks on face recognition
    (Elsevier Ltd, 2022) Jeevan, G.; Zacharias, G.C.; Nair, M.S.; Rajan, J.
    Face recognition has a wide range of applications, such as video surveillance, security, and access control. Over the past decade, the field of face recognition has matured and grown in step with the latest advancements in technology, particularly deep learning. Convolutional Neural Networks have surpassed human accuracy in face recognition on popular evaluation tests such as LFW. However, most existing models evaluate their performance under the assumption that full facial information is available. The COVID-19 pandemic, with its explosive increase in the number of people wearing face masks, has challenged this assumption and the performance of existing methods and leading-edge algorithms in the field of face recognition. The reduced amount of facial information available to a recognition system from a masked face impacts its discrimination ability. In this context, we design and conduct a series of experiments comparing the masked face recognition performance of CNN architectures available in the literature and exploring possible alterations in loss functions, architectures, and training methods that can enable existing methods to fully extract and leverage the limited facial information available in a masked face. We evaluate existing CNN-based face recognition systems against datasets composed entirely of masked faces, in contrast to existing standard evaluations, where masked or occluded faces are a rare occurrence. The study also presents evidence of an increased impact of network depth on performance compared with standard face recognition. Our observations indicate that substantial performance gains can be achieved by introducing masked faces into the training set. The study also found that various parameter settings deemed suitable for standard face recognition are not ideal for masked face recognition. Through empirical analysis, we derived new recommended values for these parameters and settings. © 2021 Elsevier Ltd
  • Item
    Crossover based technique for data augmentation
    (Elsevier Ireland Ltd, 2022) Raj, R.; Mathew, J.; Kannath, S.K.; Rajan, J.
    Background and Objective: Medical image classification problems are frequently constrained by the availability of datasets. “Data augmentation” has emerged as a data enhancement and enrichment solution to the challenge of limited data. Traditionally, data augmentation techniques are based on linear, label-preserving transformations; however, recent works have demonstrated that even non-linear, non-label-preserving techniques can be unexpectedly effective. This paper proposes a non-linear data augmentation technique for the medical domain and explores its results. Methods: This paper introduces the “Crossover technique”, a new data augmentation technique for Convolutional Neural Networks in medical image classification problems. Our technique synthesizes a pair of samples by applying two-point crossover to the already available training dataset, creating N new samples from N training samples. The proposed crossover-based data augmentation technique, although non-label-preserving, performed significantly better in terms of increased accuracy and reduced loss for all the tested datasets over varied architectures. Results: The proposed method was tested on three publicly available medical datasets with various network architectures. For the mini-MIAS database of mammograms, our method improved the accuracy by 1.47%, achieving 80.15% with the VGG-16 architecture. The method works for both grayscale and RGB images: on the PH2 database for skin cancer, it improved the accuracy by 3.57%, achieving 85.71% with the VGG-19 architecture. In addition, our technique improved accuracy on the brain tumor dataset by 0.40%, achieving 97.97% with the VGG-16 architecture. Conclusion: The proposed novel crossover technique for training Convolutional Neural Networks (CNNs) is simple to implement, applying two-point crossover to two images to form new images. The method goes a long way toward tackling the challenges of limited datasets and class imbalance in medical image analysis. Our code is available at https://github.com/rishiraj-cs/Crossover-augmentation © 2022
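The two-point crossover operation described above can be sketched in a few lines: a contiguous segment of pixels is swapped between two parent images, as in a genetic-algorithm crossover. This sketch assumes the crossover acts on flattened pixel vectors with random cut points; the authors' released code may differ in detail.

```python
import numpy as np

def two_point_crossover(img_a, img_b, rng=None):
    """Create two new samples by swapping a contiguous pixel segment.

    Illustrative sketch of two-point crossover on flattened images;
    the choice of cut points and flattening order are assumptions.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    a, b = img_a.ravel().copy(), img_b.ravel().copy()
    p1, p2 = sorted(rng.choice(a.size, size=2, replace=False))
    a[p1:p2], b[p1:p2] = b[p1:p2].copy(), a[p1:p2].copy()
    return a.reshape(img_a.shape), b.reshape(img_b.shape)

x = np.zeros((4, 4))
y = np.ones((4, 4))
c1, c2 = two_point_crossover(x, y)
# Each pixel of c1/c2 comes from one parent, and the swapped
# segments are complementary, so c1 + c2 == x + y elementwise.
```

Because every output pixel is taken verbatim from one of the two parents, the operation is cheap and needs no extra data, which matches the "N new samples from N training samples" claim.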
  • Item
    Stroke classification from computed tomography scans using 3D convolutional neural network
    (Elsevier Ltd, 2022) Neethi, A.S.; Niyas, S.; Kannath, S.K.; Mathew, J.; Anzar, A.M.; Rajan, J.
    Stroke is a cerebrovascular condition with a significant morbidity and mortality rate that causes physical disabilities for survivors. Once symptoms are identified, it requires a time-critical diagnosis with the help of the most commonly available imaging techniques. Computed tomography (CT) scans are used worldwide for preliminary stroke diagnosis. Identifying the stroke type, which is critical for initiating treatment, demands the expertise and experience of a radiologist. This work attempts to capture those domain skills and build a model that diagnoses stroke from CT scans. A non-contrast computed tomography (NCCT) scan of the brain comprises volumetric images, i.e., a 3D stack of image slices, so a model that targets a single 2D slice may fail to address this volumetric nature. We propose a 3D fully convolutional classification model that identifies stroke cases from CT images while taking into account the contextual longitudinal composition of volumetric data. We formulate a custom pre-processing module to enhance the scans and aid in improving classification performance. Significant challenges for 3D CNNs are the small number of training samples and the fact that the available scans are heavily biased toward normal patients. In this work, the limitations of insufficient training data and class imbalance have been addressed with the help of a strided slicing approach. A block-wise design was used to formulate the proposed network, with the initial part focusing on adjusting the dimensionality while retaining the features. The accumulated feature maps were then effectively learned using bundled convolutions and skip connections. The results of the proposed method were compared against 3D CNN stroke classification models on NCCT, various 3D CNN architectures on other brain imaging modalities, and 3D extensions of some classical CNN architectures. The proposed method achieved an improvement of 14.28% in the F1-score over the state-of-the-art 3D CNN stroke classification model. © 2022 Elsevier Ltd
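The strided slicing idea mentioned above can be sketched as cutting each volume into overlapping sub-volumes along the slice axis, so one scan yields several training samples and minority-class scans can be over-sampled. Depth and stride values below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def strided_subvolumes(volume, depth=16, stride=8):
    """Split a CT volume into overlapping sub-volumes along the slice axis.

    Illustrative sketch of strided slicing for 3D CNN training;
    depth and stride are assumed values.
    volume: array of shape (num_slices, H, W).
    """
    n = volume.shape[0]
    return [volume[s:s + depth] for s in range(0, n - depth + 1, stride)]

vol = np.random.rand(40, 64, 64)   # 40 axial slices
chunks = strided_subvolumes(vol)   # four overlapping 16-slice sub-volumes
```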
  • Item
    Automated Molecular Subtyping of Breast Carcinoma Using Deep Learning Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2023) Niyas, S.; Bygari, R.; Naik, R.; Viswanath, B.; Ugwekar, D.; Mathew, T.; Kavya, J.; Kini, J.R.; Rajan, J.
    Objective: Molecular subtyping is an important procedure for the prognosis and targeted therapy of breast carcinoma, the most common type of malignancy affecting women. Immunohistochemistry (IHC) analysis is the widely accepted method for molecular subtyping. It involves the assessment of four molecular biomarkers, namely the estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and antigen Ki67, using appropriate antibody reagents. Conventionally, these biomarkers are assessed manually by a pathologist, who combines the individual results to identify the molecular subtype. Molecular subtyping requires the status of all four biomarkers together, and to the best of our knowledge, no such automated method exists. This paper proposes a novel deep learning framework for automatic molecular subtyping of breast cancer from IHC images. Methods and procedures: A modified LadderNet architecture is proposed to segment the immunopositive elements from ER, PR, HER2, and Ki67 biomarker slides. This architecture uses long skip connections to pass encoder feature space from different semantic levels to the decoder layers, allowing concurrent learning with multi-scale features. The entire architecture is an ensemble of multiple fully convolutional neural networks, and learning pathways are chosen adaptively based on the input data. The segmentation stage is followed by a post-processing stage that quantifies the extent of immunopositive elements to predict the final status of each biomarker. Results: The performance of the segmentation models for each IHC biomarker is evaluated qualitatively and quantitatively, and the biomarker prediction results are also evaluated. The results obtained by our method are in close concordance with manual assessment by pathologists. Clinical impact: Accurate automated molecular subtyping can speed up this pathology procedure, reduce pathologists' workload and associated costs, and facilitate targeted treatment to obtain better outcomes. © 2013 IEEE.
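The post-processing stage described above, quantifying immunopositive elements to call a biomarker status, can be sketched as thresholding the positive-pixel fraction of a segmentation mask. The 1% cut-off below is purely illustrative, not a clinically validated value, and the real pipeline's scoring rules (e.g. per-biomarker criteria) would differ.

```python
import numpy as np

def biomarker_status(mask, threshold=0.01):
    """Turn a segmentation mask into a positive/negative biomarker call.

    Illustrative sketch: the fraction of pixels segmented as
    immunopositive is compared with an assumed cut-off.
    mask: binary array, 1 = immunopositive pixel.
    """
    fraction = mask.mean()
    return ("positive" if fraction >= threshold else "negative"), fraction

mask = np.zeros((100, 100), dtype=np.uint8)
mask[:5, :] = 1                      # 5% of pixels immunopositive
status, frac = biomarker_status(mask)
```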
  • Item
    A Deep Ensemble Learning-Based CNN Architecture for Multiclass Retinal Fluid Segmentation in OCT Images
    (Institute of Electrical and Electronics Engineers Inc., 2023) Rahil, M.; Anoop, B.N.; Girish, G.N.; Kothari, A.R.; Koolagudi, S.G.; Rajan, J.
    Retinal fluid collections develop when fluid accumulates in the retina, which may be caused by several retinal disorders and can lead to loss of vision. Optical coherence tomography (OCT) provides non-invasive cross-sectional images of the retina and enables the visualization of different retinal abnormalities. The identification and segmentation of retinal cysts from OCT scans is gaining immense attention, since the manual analysis of OCT data is time-consuming and requires an experienced ophthalmologist. Identification and categorization of retinal cysts aid in establishing the pathophysiology of various retinal diseases, such as macular edema, diabetic macular edema, and age-related macular degeneration. Hence, an automatic algorithm for the segmentation and detection of retinal cysts would be of great value to ophthalmologists. In this study, we propose a convolutional neural network-based deep ensemble architecture that can segment the three different types of retinal cysts from retinal OCT images. The quantitative and qualitative performance of the model was evaluated on the publicly available RETOUCH challenge dataset. The proposed model outperformed the state-of-the-art methods, with an overall improvement of 1.8%. © 2013 IEEE.
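One common way to fuse a deep ensemble's segmentation outputs, averaging the members' per-class probability maps and taking a per-pixel argmax, can be sketched as below. This is an assumed fusion rule for illustration; the paper's exact combination scheme may differ.

```python
import numpy as np

def ensemble_segment(prob_maps):
    """Fuse per-model class probabilities into one segmentation map.

    Illustrative sketch of probability averaging for a deep ensemble.
    prob_maps: array of shape (num_models, num_classes, H, W).
    """
    mean_probs = prob_maps.mean(axis=0)  # average over ensemble members
    return mean_probs.argmax(axis=0)     # per-pixel class label

# Three members, four classes (background + three fluid types), 2x2 image.
probs = np.random.default_rng(0).random((3, 4, 2, 2))
labels = ensemble_segment(probs)
```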
  • Item
    StrokeViT with AutoML for brain stroke classification
    (Elsevier Ltd, 2023) Raj, R.; Mathew, J.; Kannath, S.K.; Rajan, J.
    Stroke, categorized under cardiovascular and circulatory diseases, is considered the second leading cause of death worldwide, responsible for approximately 11% of deaths annually. Stroke diagnosis using a computed tomography (CT) scan is considered ideal for identifying whether a stroke is hemorrhagic or ischemic. However, most methods for stroke classification are based on a single slice-level prediction mechanism, meaning that the most relevant CT slice has to be manually selected by the radiologist from the original CT volume. This paper proposes an integration of a Convolutional Neural Network (CNN), Vision Transformers (ViT), and AutoML to obtain slice-level as well as patient-wise prediction results. While the CNN, with its inductive bias, captures local features, the transformer captures long-range dependencies between sequences. This collaborative local-global feature extractor improves upon the slice-wise predictions of the CT volume. We propose stroke-specific feature extraction from each slice-wise prediction to obtain the patient-wise prediction using AutoML. While the slice-wise predictions help the radiologist verify borderline and corner cases, the patient-wise predictions make the outcome clinically relevant and closer to a real-world scenario. The proposed architecture achieved an accuracy of 87% for single slice-level prediction and an accuracy of 92% for patient-wise prediction. For comparative analysis of slice-level predictions, standalone VGG-16, VGG-19, ResNet50, and ViT architectures were considered; the proposed architecture outperformed these standalone architectures by 9% in terms of accuracy. For patient-wise predictions, AutoML considers 13 different ML algorithms, of which 3 achieve an accuracy of more than 90%. The proposed architecture reduces the radiologist's manual effort in selecting the most relevant CT slice from the original CT volume and shows improvement over other standalone architectures for classification tasks. It can be generalized to volumetric scans, aiding patient diagnosis for the head and neck, lungs, diseases of the hepatobiliary tract, genitourinary diseases, women's imaging including breast cancer, and various musculoskeletal diseases. Code for the proposed stroke-specific feature extraction, with the pre-trained weights of the trained model, is available at: https://github.com/rishiraj-cs/StrokeViT_With_AutoML. © 2022 Elsevier Ltd
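The idea of deriving a patient-wise prediction from slice-wise predictions can be sketched as summarizing the per-slice probabilities into a small feature vector that a downstream (e.g. AutoML-selected) classifier consumes. The statistics below (mean, max, fraction of positive slices) are illustrative assumptions, not the paper's actual stroke-specific features.

```python
import numpy as np

def patient_features(slice_probs):
    """Aggregate slice-level stroke probabilities into patient-level features.

    Illustrative sketch; the chosen summary statistics are assumptions.
    slice_probs: 1-D array of per-slice stroke probabilities.
    """
    return np.array([
        slice_probs.mean(),          # overall evidence across the volume
        slice_probs.max(),           # strongest single slice
        (slice_probs > 0.5).mean(),  # fraction of positive slices
    ])

probs = np.array([0.1, 0.2, 0.9, 0.8, 0.3])
feats = patient_features(probs)
```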
  • Item
    WideCaps: a wide attention-based capsule network for image classification
    (Springer Science and Business Media Deutschland GmbH, 2023) Pawan, S.J.; Sharma, R.; Reddy, H.; Vani, M.; Rajan, J.
    The capsule network is a distinct and promising branch of the neural network family that has drawn attention for its unique ability to maintain equivariance by preserving spatial relationships among features. It has attained unprecedented success in image classification on datasets such as MNIST and affNIST by encoding characteristic features into capsules and building a parse-tree structure. However, on datasets involving complex foreground and background regions, such as CIFAR-10 and CIFAR-100, the performance of the capsule network is suboptimal due to its naive data routing policy and its inability to extract complex features. This paper proposes a new design strategy for capsule network architectures that deals efficiently with complex images. The proposed method incorporates the optimal placement of the novel wide bottleneck residual block and squeeze-and-excitation attention blocks into the capsule network, supported by the modified factorization machines routing algorithm, to address the defined problem. This setup allows channel interdependencies at almost no computational cost, thereby enhancing the representation ability of capsules on complex images. We extensively evaluate the performance of the proposed model on five publicly available datasets: CIFAR-10, Fashion MNIST, Brain Tumor, SVHN, and CIFAR-100. The proposed method outperformed the top-5 capsule network-based methods on Fashion MNIST, CIFAR-10, SVHN, and Brain Tumor, and gave a highly competitive performance on CIFAR-100. © 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
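The squeeze-and-excitation attention mentioned above follows a standard pattern: global average pooling ("squeeze"), a small two-layer bottleneck with ReLU and sigmoid ("excitation"), then channel-wise rescaling. Below is a minimal numpy sketch of that standard block; the weight shapes and reduction ratio are illustrative, not the paper's configuration.

```python
import numpy as np

def squeeze_excite(feats, w1, w2):
    """Recalibrate channels with a squeeze-and-excitation block.

    Illustrative sketch of the standard SE block (Hu et al. style);
    weight shapes are assumptions.
    feats: (C, H, W); w1: (C, C // r); w2: (C // r, C).
    """
    z = feats.mean(axis=(1, 2))              # squeeze: global avg pool -> (C,)
    s = np.maximum(z @ w1, 0.0)              # excitation bottleneck + ReLU
    s = 1.0 / (1.0 + np.exp(-(s @ w2)))      # sigmoid gate per channel: (C,)
    return feats * s[:, None, None]          # channel-wise rescale

rng = np.random.default_rng(0)
x = rng.random((8, 4, 4))                    # 8 channels, 4x4 spatial grid
out = squeeze_excite(x, rng.random((8, 2)), rng.random((2, 8)))
```

Because the gate is a per-channel sigmoid in (0, 1), the block only rescales channels; it adds very few parameters, which is why SE attention comes at almost no computational cost.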