Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty


Showing 6 of 6 publications
  • Item
    Cardamom Plant Disease Detection Approach Using EfficientNetV2
    (Institute of Electrical and Electronics Engineers Inc., 2022) Sunil, C.K.; Jaidhar, C.D.; Patil, N.
    Cardamom, known as the queen of spices, is indigenously grown in the evergreen forests of Karnataka, Kerala, Tamil Nadu, and the northeastern states of India; India is the third-largest producer of cardamom. Plant diseases have a catastrophic impact on food production and safety, reducing the quality and quantity of agricultural products, and in the worst cases can cause severe losses or total crop failure. Various diseases and pests affect cardamom plants at different growth stages and reduce crop yields. This study concentrated on two diseases of cardamom, Colletotrichum Blight and Phyllosticta Leaf Spot, and three diseases of grape, Black Rot, ESCA, and Isariopsis Leaf Spot. Various methods have been proposed for plant disease detection, and deep learning has become the preferred method because of its strong performance. In this study, U2-Net was used to remove the unwanted background of an input image by selecting multiscale features. This work proposes a cardamom plant disease detection approach using the EfficientNetV2 model. A comprehensive set of experiments was carried out to ascertain the performance of the proposed approach and to compare it with other models such as EfficientNet and a Convolutional Neural Network (CNN). The experimental results showed that the proposed approach achieved a detection accuracy of 98.26%. © 2013 IEEE.
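The background-removal step described above can be illustrated independently of any particular network: a saliency model such as U2-Net produces a per-pixel foreground map, which is then thresholded and used to zero out background pixels before classification. A minimal numpy sketch (not the authors' implementation; array shapes and the 0.5 threshold are illustrative assumptions):

```python
import numpy as np

def apply_saliency_mask(image, saliency, threshold=0.5):
    """Zero out background pixels using a saliency map (e.g. a U2-Net output).

    image:    (H, W, 3) float array
    saliency: (H, W) values in [0, 1]; high values mark the foreground plant
    """
    foreground = (saliency >= threshold).astype(image.dtype)
    # Broadcast the (H, W, 1) mask across the three color channels.
    return image * foreground[..., None]

# Toy example: a 4x4 "leaf image" whose salient region is the top-left 2x2 block.
img = np.ones((4, 4, 3))
sal = np.zeros((4, 4))
sal[:2, :2] = 0.9
masked = apply_saliency_mask(img, sal)
# Foreground pixels survive; everything below the threshold is zeroed.
```

The masked image, with distracting background suppressed, is what would then be fed to the classifier.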
  • Item
    Semantic context driven language descriptions of videos using deep neural network
    (Springer Science and Business Media Deutschland GmbH, 2022) Naik, D.; Jaidhar, C.D.
    The massive influx of text, image, and video data on the internet has made computer vision tasks challenging in the big data domain. Generating natural language descriptions of videos remains an arduous task in computer vision: visual captioning requires integrating visual information with natural language descriptions. This paper proposes an encoder-decoder framework with a 2D-Convolutional Neural Network (CNN) model and a layered Long Short-Term Memory (LSTM) network as the encoder, and an LSTM model integrated with an attention mechanism as the decoder, trained with a hybrid loss function. Visual feature vectors extracted from the video frames using the 2D-CNN model capture spatial features, and these feature vectors are fed into the layered LSTM to capture temporal information. The attention mechanism enables the decoder to perceive and focus on relevant objects and to correlate the visual context with the language content, producing semantically correct captions. The visual features and GloVe word embeddings are input to the decoder to generate natural semantic descriptions of the videos. The performance of the proposed framework is evaluated on the benchmark video captioning dataset Microsoft Video Description (MSVD) using well-known evaluation metrics. The experimental findings indicate that the proposed framework outperforms state-of-the-art techniques: it improved all measures, B@1, B@2, B@3, B@4, METEOR, and CIDEr, with scores of 78.4, 64.8, 54.2, 43.7, 32.3, and 70.7, respectively. The improvement across all scores indicates a better grasp of the input context, which results in more accurate caption prediction. © 2022, The Author(s).
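The attention mechanism the abstract describes scores each encoded frame feature against the decoder's current state, then forms a weighted context vector that the decoder conditions on. A minimal additive-attention sketch in numpy (illustrative dimensions and random weights; not the paper's trained model):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(frame_feats, decoder_state, W_f, W_s, v):
    """Additive attention: score each frame feature against the decoder state,
    then return the weighted context vector and the attention weights."""
    scores = np.array([v @ np.tanh(W_f @ f + W_s @ decoder_state)
                       for f in frame_feats])
    weights = softmax(scores)        # (T,) non-negative, sums to 1
    context = weights @ frame_feats  # (T,) @ (T, D) -> (D,)
    return context, weights

rng = np.random.default_rng(0)
T, D, H, A = 5, 8, 6, 4  # frames, feature dim, decoder state dim, attention dim
feats = rng.normal(size=(T, D))
state = rng.normal(size=H)
ctx, w = attend(feats, state,
                rng.normal(size=(A, D)), rng.normal(size=(A, H)),
                rng.normal(size=A))
```

At each decoding step the weights shift toward the frames most relevant to the word being generated, which is how visual context and language content stay correlated.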
  • Item
    A novel Multi-Layer Attention Framework for visual description prediction using bidirectional LSTM
    (Springer Science and Business Media Deutschland GmbH, 2022) Naik, D.; Jaidhar, C.D.
    The massive influx of text, images, and videos on the internet has increased the challenge of computer vision tasks in big data. Integrating visual data with natural language to generate video descriptions has been a challenge for decades, and recent experiments on image/video captioning that employ Long Short-Term Memory (LSTM) networks have piqued researchers' interest in their application to video captioning. The proposed video captioning architecture combines a bidirectional multilayer LSTM (BiLSTM) encoder with a unidirectional decoder, and also considers temporal relations when creating superior global video representations. In contrast to the majority of prior work, the most relevant features of a video are selected and utilized specifically for captioning. Existing methods use a single-layer attention mechanism to link visual input with phrase meaning; this approach employs LSTMs with a multilayer attention mechanism to extract features from videos, construct links between multimodal (word and visual) representations, and generate sentences with rich semantic coherence. The performance of the proposed system was evaluated on a benchmark video captioning dataset. The results reveal superior performance relative to state-of-the-art works in METEOR and promising performance on the BLEU score; in terms of quantitative performance, the proposed approach outperforms most existing methodologies. © 2022, The Author(s).
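The key property of the BiLSTM encoder above is that each time step's representation combines a forward pass (past context) and a backward pass (future context). A minimal sketch using a simple tanh recurrent cell as an LSTM stand-in (all dimensions and weights are illustrative, not the paper's configuration):

```python
import numpy as np

def rnn_pass(inputs, W_x, W_h, reverse=False):
    """One directional pass of a simple tanh RNN cell (an LSTM stand-in)."""
    h = np.zeros(W_h.shape[0])
    states = []
    seq = inputs[::-1] if reverse else inputs
    for x in seq:
        h = np.tanh(W_x @ x + W_h @ h)
        states.append(h)
    if reverse:
        states.reverse()  # realign backward states with the input order
    return np.stack(states)

def bilstm_encode(inputs, W_x_f, W_h_f, W_x_b, W_h_b):
    """Concatenate forward and backward hidden states at every time step."""
    fwd = rnn_pass(inputs, W_x_f, W_h_f)
    bwd = rnn_pass(inputs, W_x_b, W_h_b, reverse=True)
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(1)
T, D, H = 6, 4, 3  # time steps, input dim, hidden dim per direction
seq = rng.normal(size=(T, D))
enc = bilstm_encode(seq,
                    rng.normal(size=(H, D)), rng.normal(size=(H, H)),
                    rng.normal(size=(H, D)), rng.normal(size=(H, H)))
# enc has shape (T, 2*H): each step sees both past and future context.
```

This doubled, context-aware representation is what gives the encoder a more global view of the video than a unidirectional pass alone.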
  • Item
    Systematic study on deep learning-based plant disease detection or classification
    (Springer Nature, 2023) Sunil, C.K.; Jaidhar, C.D.; Patil, N.
    Plant diseases have an extensive impact on agricultural production, driving up the prices of food grains and vegetables. To reduce economic loss and to predict yield loss, early detection of plant disease is essential. Current plant disease detection relies on the physical presence of domain experts to ascertain the disease, which has significant limitations: experts must travel from place to place, incurring transportation costs and travel time; high travel costs discourage experts from covering long distances; experts may not always be available; and even when they are, consultation charges may be unaffordable for many farmers. Thus, there is a need for a cost-effective, robust, automated plant disease detection or classification approach, and various such approaches have been proposed in the literature. This systematic study surveys Deep Learning-based and Machine Learning-based plant disease detection or classification approaches; 160 diverse research works are considered, comprising single-network models, hybrid models, and real-time detection approaches. Around 57 studies considered multiple plants, and 103 considered a single plant. Fifty different plant leaf disease datasets are discussed, including both publicly available and publicly unavailable datasets. The study also discusses the various challenges and research gaps in plant disease detection, and highlights the importance of hyperparameters in deep learning. © 2023, The Author(s), under exclusive licence to Springer Nature B.V.
  • Item
    Video Captioning using Sentence Vector-enabled Convolutional Framework with Short-Connected LSTM
    (Springer, 2024) Naik, D.; Jaidhar, C.D.
    The principal objective of video/image captioning is to portray the dynamics of a video clip in plain natural language. Captioning is motivated by its ability to make video more accessible to deaf and hard-of-hearing individuals, to help people focus on and recall information more readily, and to allow viewing in sound-sensitive locations. The most frequently utilized design paradigm is the structurally enhanced encoder-decoder configuration, and recent developments emphasize creative structural modifications that maximize efficiency while demonstrating viability in real-world applications. Well-researched advances such as deep Convolutional Neural Networks (CNNs) and Sentence Transformers are trending in encoder-decoders. This paper proposes an approach for efficiently captioning videos using a CNN and a short-connected LSTM-based encoder-decoder model blended with a sentence context vector; this sentence context vector emphasizes the relationship between the video and text spaces. Inspired by the human visual system, an attention mechanism is utilized to selectively concentrate on the context of the important frames. A contextual hybrid embedding block is also presented for connecting the two vector spaces generated during the encoding and decoding stages. The proposed architecture is investigated with well-known CNN architectures and various word embeddings, and is assessed on two benchmark video captioning datasets, MSVD and MSR-VTT, using standard evaluation metrics such as BLEU, METEOR, ROUGE, and CIDEr. Experimentally, when the proposed model with NASNet-Large is viewed across all three embeddings, the BERT results on the MSVD dataset outperformed those obtained with the other two embeddings. Inception-v4 outperformed VGG-16, ResNet-152, and NASNet-Large for feature extraction. Considering word embeddings, BERT is far superior to ELMo and GloVe on the MSR-VTT dataset. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
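Several of the abstracts above report BLEU scores. The core of BLEU is clipped n-gram precision with a brevity penalty; a simplified single-reference BLEU-1 (unigram) sketch is below. The published scores use the full multi-reference, multi-n-gram metric, so this is illustrative only:

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """Simplified BLEU-1: clipped unigram precision times a brevity penalty.
    Single reference only; real evaluations average over n-grams and references."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clip each candidate word's count by its count in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / len(cand)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

score = bleu1("a man is riding a horse", "a man rides a horse")
# 4 of 6 candidate unigrams match after clipping, no brevity penalty -> 4/6
```

Metrics like METEOR and CIDEr refine this idea with synonym matching and consensus weighting, which is why papers typically report several metrics together.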
  • Item
    Anomalous Electrical Power Consumption Detection in Household Appliances via Micro-Moment Classification
    (Institute of Electrical and Electronics Engineers Inc., 2025) Nayak, R.; Jaidhar, C.D.
    The detection of anomalous power consumption is critical for improving energy efficiency, particularly given the increasing energy demand in buildings. This study explores Convolutional Neural Network-based models by transforming 1-dimensional micro-moment labeled data into 2-dimensional matrices to capture both temporal and spatial consumption patterns. Three architectural variants are investigated: a conventional Deep Convolutional Neural Network (DCNN), a Depthwise Separable Convolutional Neural Network (DS-CNN), and a Depthwise Separable Residual Convolutional Neural Network (DSR-CNN). Unlike earlier studies, this work incorporates hyperparameter tuning, statistical validation, and cross-validation, resulting in the evaluation of over 450 model configurations. The results indicate that while the DCNN consistently achieves the highest accuracy, the DS-CNN achieves comparable performance with significantly reduced parameters and computational cost, making it suitable for real-time and resource-constrained environments. Model complexity analysis and statistical tests confirm the robustness of the findings. Finally, a systematic model selection strategy is presented, identifying the DS-CNN as the most balanced solution for effective and efficient anomaly detection in smart grid applications. © 2020 IEEE.
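The parameter savings the abstract attributes to the DS-CNN follow from the structure of a depthwise separable convolution: a standard k x k convolution needs k*k*C_in*C_out weights, while the separable version uses one k x k depthwise filter per input channel plus a 1x1 pointwise convolution. A minimal sketch with illustrative channel sizes (not the paper's actual layer configuration):

```python
def std_conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution layer (biases ignored)."""
    return k * k * c_in * c_out

def ds_conv_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = std_conv_params(3, 64, 128)   # 3*3*64*128 = 73728
separable = ds_conv_params(3, 64, 128)   # 3*3*64 + 64*128 = 8768
reduction = standard / separable         # roughly 8.4x fewer weights
```

This roughly order-of-magnitude reduction per layer, repeated across a network, is what makes the DS-CNN attractive for the real-time and resource-constrained settings the study targets.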