Journal Articles
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/19884
Search Results
Item
TransSARNet: a deep learning framework for despeckling of SAR images (Institute of Physics, 2025)
Kevala, V.D.; Sravya, N.; Lal, S.; Suresh, S.; Dell’Acqua, F.
Synthetic Aperture Radar (SAR) images are extensively used for Earth observation because of their all-weather, day-and-night imaging capabilities. However, speckle noise in SAR images significantly reduces their usability in a variety of applications. Deep learning models developed for SAR despeckling exhibit promising noise reduction capabilities, but striking a balance between reducing graininess and preserving texture details is challenging. In addition, supervised training of a robust deep learning model requires noisy images that capture the SAR speckle dynamics and the corresponding speckle-free ground truth, which is generally not available. This study proposes the first hybrid CNN-Halo attention-based transformer model for SAR despeckling. CNN-based feature extraction modules provide multiscale, multidirectional, and large-scale feature maps. A halo-attention transformer block is used in the skip connection, which aids in better preservation of radiometric information in the despeckled SAR images. TransSARNet is trained in a supervised manner using a new synthetic SAR dataset, which is a combination of the Kylberg and UCMerced land-use datasets. This study also analyzes the effect of combining the Kylberg and UCMerced datasets on texture preservation in despeckled SAR images. The visual and qualitative metrics evaluated on Sentinel-1 Single Look Complex SAR data showed that the proposed TransSARNet approach outperformed the other models under consideration. TransSARNet achieves a harmonious balance between model complexity, despeckling ability, edge preservation, radiometric information preservation, and smoothing in homogeneous regions. © 2025 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.

Item
CAR-BRAINet: Sub-6 GHz aided spatial adaptive beam prediction with multi head attention for heterogeneous vehicular networks (Institute of Physics, 2025)
Menon, A.G.; Krishnan, P.; Lal, S.
Heterogeneous Vehicular Networks (HetVNets) play a crucial role by integrating different communication technologies, such as sub-6 GHz, mm-wave, and DSRC, to meet the diverse connectivity requirements of 5G/B5G vehicular networks. HetVNets help address very large user demands, but maintaining a steady connection in highly mobile, real-world conditions remains challenging. Although ample studies have been conducted on beam prediction models, a dedicated solution for HetVNets has been sparsely explored. Hence, developing a reliable beam prediction model specifically for HetVNets is necessary. This paper introduces a lightweight deep learning-based model termed ‘CAR-BRAINet’, which consists of convolutional neural networks with a powerful multi-head attention (MHA) mechanism. Existing literature on beam prediction primarily studies a limited, idealised vehicular scenario, often overlooking the real-time complexities and intricacies of vehicular networks. Therefore, this study aims to mimic the complexities of a real-time driving scenario by incorporating key factors, such as prominent MAC protocols (3GPP-C-V2X and IEEE 802.11bd), the effect of Doppler shifts under high velocity and varying distance, and SNR levels, into three high-quality dynamic data sets for urban, rural, and highway vehicular networks. CAR-BRAINet achieves a steady improvement of 11.6467% in spectral efficiency, with a 93.1638% lighter architecture compared to existing methods, resulting in a 94.7103% reduction in prediction time. It therefore demonstrates precise beam prediction across all vehicular scenarios, with minimal beam overhead. Thus, this study justifies the effectiveness of CAR-BRAINet in complex HetVNets, offering promising performance without relying on mobile users’ location, angle, and antenna dimensions, thereby reducing redundant sensor latency. © 2025 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.

Item
Transformer assisted framework for automated multi-class abnormality classification for video capsule endoscopy (Institute of Physics, 2025)
Prabhu, M.M.; Kaliki, V.S.; Lal, S.
Video Capsule Endoscopy (VCE) is a minimally invasive imaging technique used for diagnosing gastrointestinal (GI) disorders, enabling detailed visualization of the digestive tract. This study introduces CASCRNet, a novel and parameter-efficient deep learning architecture designed to enhance interpretability and computational efficiency in multi-class abnormality classification for VCE. CASCRNet integrates focal loss, Atrous Spatial Pyramid Pooling, and Shared Channel Residual blocks to improve feature extraction and address class imbalance. In addition to CASCRNet, this study conducts a comprehensive evaluation of several deep learning models, including ResNet50, DenseNet121, RCCGNet, Hiera, and AIMv2. Among these, AIMv2, a fine-tuned transformer-based model, achieved the highest overall performance, serving as a new benchmark for accuracy. The proposed framework demonstrates robust results on the Capsule Vision 2024 dataset and highlights the potential of both lightweight and transformer-based solutions to improve diagnostic efficiency and clinical workflow in gastrointestinal imaging. © 2025 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
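
Note on focal loss: the video capsule endoscopy abstract above names focal loss as one of the components used to address class imbalance. As a rough illustration only, the sketch below shows the commonly used formulation of focal loss for a single multi-class sample, FL = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The function names and the default values gamma = 2.0 and alpha = 1.0 are assumptions made for this example; the abstract does not state the settings or the implementation used in CASCRNet.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample (hypothetical helper, not the paper's code).

    logits: list of raw class scores
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 reduces to plain cross-entropy
    alpha:  scalar weight (assumed value; often set per class in practice)
    """
    p_t = softmax(logits)[target]
    p_t = max(p_t, 1e-12)  # guard against log(0) on extreme logits
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = focal_loss([5.0, 0.0, 0.0], target=0)  # confident and correct
    hard = focal_loss([0.0, 0.0, 5.0], target=0)  # confident and wrong
    print(f"easy sample loss: {easy:.6f}")
    print(f"hard sample loss: {hard:.6f}")
```

The factor (1 - p_t)^gamma shrinks the loss of samples the model already classifies confidently, so rare or difficult classes contribute relatively more to training than they would under plain cross-entropy.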
