Transformer assisted framework for automated multi-class abnormality classification for video capsule endoscopy

dc.contributor.author: Prabhu, M.M.
dc.contributor.author: Kaliki, V.S.
dc.contributor.author: Lal, S.
dc.date.accessioned: 2026-02-03T13:04:17Z
dc.date.issued: 2025
dc.description.abstract: Video Capsule Endoscopy (VCE) is a minimally invasive imaging technique used for diagnosing gastrointestinal (GI) disorders, enabling detailed visualization of the digestive tract. This study introduces CASCRNet, a novel and parameter-efficient deep learning architecture designed to enhance interpretability and computational efficiency in multi-class abnormality classification for VCE. CASCRNet integrates focal loss, Atrous Spatial Pyramid Pooling, and Shared Channel Residual blocks to improve feature extraction and address class imbalance. In addition to CASCRNet, this study conducts a comprehensive evaluation of several deep learning models, including ResNet50, DenseNet121, RCCGNet, Hiera, and AIMv2. Among these, AIMv2, a fine-tuned transformer-based model, achieved the highest overall performance, serving as a new benchmark for accuracy. The proposed framework demonstrates robust results on the Capsule Vision 2024 dataset and highlights the potential of both lightweight and transformer-based solutions to improve diagnostic efficiency and clinical workflow in gastrointestinal imaging. © 2025 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
dc.identifier.citation: Engineering Research Express, 2025, 7, 4, pp. -
dc.identifier.uri: https://doi.org/10.1088/2631-8695/ae2426
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/19888
dc.publisher: Institute of Physics
dc.subject: Benchmarking
dc.subject: Classification (of information)
dc.subject: Computational efficiency
dc.subject: Computer aided diagnosis
dc.subject: Data mining
dc.subject: Deep learning
dc.subject: Learning systems
dc.subject: Medical imaging
dc.subject: Automated medical diagnose
dc.subject: Digestive tract
dc.subject: Gastrointestinal disorders
dc.subject: Gastrointestinal imaging
dc.subject: Interpretability
dc.subject: Learning architectures
dc.subject: Minimally invasive imaging
dc.subject: Multi-class abnormality classification
dc.subject: Video capsule endoscopies
dc.subject: Video capsule endoscopy
dc.subject: Endoscopy
dc.title: Transformer assisted framework for automated multi-class abnormality classification for video capsule endoscopy
