Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 10 of 42
  • Item
    Deep Learning for COVID-19
    (Springer Science and Business Media Deutschland GmbH, 2022) Bs, B.S.; Manoj Kumar, M.V.; Thomas, L.; Ajay Kumar, M.A.; Wu, D.; Annappa, B.; Hebbar, A.; Vishnu Srinivasa Murthy, Y.V.S.
    Ever since the outbreak in Wuhan, China, a variant of Coronavirus named “COVID 19” has taken millions of human lives all around the world. Detecting the infection is quite tedious, since it takes 3–14 days for the symptoms to surface in patients. Early detection and prevention of the infection would limit the spread to local transmission only. Deep learning techniques can be used to gain insights into the early detection of infection from medical image data such as Computed Tomography (CT) images, Magnetic Resonance Imaging (MRI) images, and X-ray images collected from infected patients, provided by medical institutions or taken from publicly available databases. The same techniques can be applied to analyze infection rates and make predictions for the coming days. A wide range of open-source pre-trained models, trained for general classification or segmentation, are available for the proposed study. When these models are adapted through transfer learning and applied to medical image datasets, the resulting models would yield much more insight into the COVID-19 detection and prediction process. Researchers all over the world have carried out numerous studies on the publicly available COVID-19 datasets and have obtained good results. Visualizing the results and presenting the summarized prediction data to doctors in a cleaner, unambiguous way would also facilitate the early detection and prevention of COVID-19 infection. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  • Item
    Convolutional Neural Network-Enabling Speech Command Recognition
    (Springer Science and Business Media Deutschland GmbH, 2023) Patra, A.; Pandey, C.; Palaniappan, K.; Sethy, P.K.
    A speech command recognition system based on deep image classification promises to revolutionize research and development by overcoming the communication barrier between humans and machines. Identifying voice commands is challenging in the presence of noise and of variability in speed, pitch, and projection. This paper develops an efficient and highly accurate speech command recognition system for smart and effective speech processing applications such as modern telecommunication. In particular, a novel convolutional neural network (CNN) is presented that works with a one-second audio clip containing one specific word, covering ten speech commands and other words labeled as “unknown,” and the model implementations were operated in a noisy environment. The CNNs are structured to recognize the speech commands using deep learning (DL) concepts for image classification. Thus, this research used the concept of DL for image classification to translate the problem of speech command recognition into the image domain. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
  • Item
    What makes a video memorable?
    (Institute of Electrical and Electronics Engineers Inc., 2017) Kar, A.; Prashasthi, P.; Ghaturle, Y.; Vani, M.
    Humans are exposed to many pictures and videos on a daily basis, but they have an exceptional ability to remember the details, even though many of them look very similar. This Video Memorability (VM) is mainly due to a distinguishable and fine representation of the frames in the human mind that people tend to remember. Videos contain an abundance of data in their frames, which can be used for feature extraction. Each feature from each frame has to be carefully considered to determine the intrinsic property of the video, i.e., memorability. Using a Convolutional Neural Network (CNN), we propose a solution to the problem of predicting VM by estimating a video's memorability. A model has been developed to predict VM using algorithmically extracted features. Two types of features, (i) semantic features and (ii) visual features, have been considered. The effectiveness of the model has been tested using publicly available image and video data. The results confirm that the CNN model can predict memorability with an acceptable performance. © 2017 IEEE.
  • Item
    SolveIt: An Application for Automated Recognition and Processing of Handwritten Mathematical Equations
    (Institute of Electrical and Electronics Engineers Inc., 2018) Sagar Bharadwaj, K.S.; Bhat, V.; Krishnan, A.S.
    Solving mathematical equations is an integral part of most, if not all, forms of scientific study. Researchers usually go through an arduous process of learning the nuances and syntactic complexities of a mathematical tool in order to solve or process mathematical equations. In this paper, we present a mobile application that can process an image of a handwritten mathematical equation captured using the device's camera, recognise the equation, form the corresponding string that can be parsed by a computer algebra system, and display all possible solutions. We aim to make the whole experience of experimenting with equations very user friendly and to remove the hassle of learning a mathematical tool just for mathematical experimentation. We propose a novel machine learning approach to recognise handwritten mathematical symbols, achieving a 99.2% cross-validation accuracy on the Kaggle math symbol dataset with reduced symbols. The application covers useful features like simultaneous equation solving, graph plotting, and simple arithmetic computations from images. Overall, it is a very user-friendly equation solver that can leverage the power of existing math packages. © 2018 IEEE.
  • Item
    Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks
    (Springer, 2019) Kemanth, P.J.; Supanekar, S.; Koolagudi, G.K.
    An audio replay attack is one of the most popular spoofing attacks on speaker verification systems because it is very economical and does not require much knowledge of signal processing. In this paper, we investigate the significance of non-voiced audio segments and deep learning models like Convolutional Neural Networks (CNN) for audio replay attack detection. The non-voiced segments of the audio can be used to detect reverberation and channel noise. FFT spectrograms are generated and given as input to a CNN to classify the audio as genuine or replay. The advantage of the proposed approach is that, because the voiced speech is removed, the feature vector size is reduced without compromising the necessary features. This leads to a significant reduction in the training time of the networks. The ASVspoof 2017 dataset is used to train and evaluate the model. The Equal Error Rate (EER) is computed and used as a metric to evaluate model performance. The proposed system has achieved an EER of 5.62% on the development dataset and 12.47% on the evaluation dataset. © 2019, Springer Nature Switzerland AG.
  • Item
    Brain tumor segmentation based on 3D residual U-Net
    (Springer, 2020) Bhalerao, M.; Thakur, S.
    We propose a deep-learning-based approach for automatic brain tumor segmentation utilizing a three-dimensional U-Net extended by residual connections. In this work, we did not incorporate architectural modifications to the existing 3D U-Net, but rather evaluated different training strategies for potential improvement of performance. Our model was trained on the dataset of the International Brain Tumor Segmentation (BraTS) challenge 2019, which comprises multi-parametric magnetic resonance imaging (mpMRI) scans from 335 patients diagnosed with a glial tumor. Furthermore, our model was evaluated on the BraTS 2019 independent validation data, which consisted of another 125 brain tumor mpMRI scans. The results that our 3D Residual U-Net obtained on the BraTS 2019 test data are mean Dice scores of 0.697, 0.828, 0.772 and Hausdorff95 distances of 25.56, 14.64, 26.69 for enhancing tumor, whole tumor, and tumor core, respectively. © Springer Nature Switzerland AG 2020.
  • Item
    Vocal Tract Articulatory Contour Detection in Real-Time Magnetic Resonance Images Using Spatio-Temporal Context
    (Institute of Electrical and Electronics Engineers Inc., 2020) Hebbar, S.A.; Sharma, R.; Somandepalli, K.; Toutios, A.; Narayanan, S.
    Due to its ability to visualize and measure the dynamics of vocal tract shaping during speech production, real-time magnetic resonance imaging (rtMRI) has emerged as one of the prominent research tools. The ability to track different articulators such as the tongue, lips, velum, and the pharynx is a crucial step toward automating further scientific and clinical analysis. Recently, various researchers have addressed the problem of detecting articulatory boundaries, but those are primarily limited to static-image based methods. In this work, we propose to use information from temporal dynamics together with the spatial structure to detect the articulatory boundaries in rtMRI videos. We train a convolutional LSTM network to detect and label the articulatory contours. We compare the produced contours against reference labels generated by iteratively fitting a manually created subject-specific template. We observe that the proposed method outperforms solely image-based methods, especially for the difficult-to-track articulators involved in airway constriction formation during speech. © 2020 IEEE.
  • Item
    Human Activity Recognition in Smart Home using Deep Learning Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2021) Kolkar, R.; Geetha, V.
    To understand human activities and anticipate human intentions, Human Activity Recognition (HAR) research is rapidly developing in tandem with the widespread availability of sensors. Various applications, like elderly care and health monitoring systems in smart homes, use smartphones and wearable devices. This paper proposes an effective HAR framework that uses deep learning methods such as Convolutional Neural Networks (CNN), variants of Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) networks to recognize activities based on smartphone sensors. The hybrid use of CNN-LSTM eliminates handcrafted feature engineering and exploits both the spatial and the temporal structure of the data. The experiments are carried out on the UCI HAR and WISDM datasets, and comparison results are obtained. The results are better, at 96.83% and 98.00% for the UCI-HAR and WISDM datasets, respectively. © 2021 IEEE.
  • Item
    COVID-19 Prediction Using Chest X-rays Images
    (Institute of Electrical and Electronics Engineers Inc., 2021) Kumar, A.; Sharma, N.; Naik, D.
    Understanding COVID-19 became very important since large-scale vaccination against it was not possible. Chest X-ray is the first imaging technique that plays an important role in the diagnosis of COVID-19 disease. Convolutional neural networks (CNNs) have so far achieved great success in image recognition and classification across various fields. However, due to the limited availability of annotated medical images, the classification of medical images remains the biggest challenge in medical diagnosis. The proposed research work performs transfer learning using deep learning models like ResNet50 and VGG16 and compares their performance with a newly developed CNN-based model. ResNet50 and VGG16 are state-of-the-art models and have been used extensively. A comparative analysis with them will give us an idea of how good our model is. This research work also develops a CNN model, as it is expected to perform well on image classification problems. The proposed research work has used the Kaggle radiography dataset for training, validating, and testing. Moreover, this research work has used another X-ray image dataset, which was created from two different sources. The results show that the CNN model developed by us outperforms the VGG16 and ResNet50 models. © 2021 IEEE.
  • Item
    Covid-19 Fake News Detector using Hybrid Convolutional and Bi-LSTM Model
    (Institute of Electrical and Electronics Engineers Inc., 2021) Surendran, P.; Balamuralidhar, B.; Kambham, H.; Anand Kumar, M.
    Fake news is essentially incorrect and deceiving information presented to the public as news with the motive of tarnishing the reputations of individuals and organizations. In today's world, where we are so closely connected through the internet, we see a boom in the development of social networking platforms and, thus, in the amount of news circulated over the internet. We must keep in mind that fake news circulated on social media and other platforms can cause problems and false alarms in society. In some cases, false information can cause panic and have a dangerous effect on society and the people who believe it to be true. Along with the virus, the Covid-19 pandemic has also brought on the spread of misinformation. Claims of fake cures, wrong interpretations of government policies, false statistics, etc., bring about a need for a fact-checking system that keeps the circulating news in check. This work examines multiple models and builds an Artificial Intelligence system to detect Covid-19 fake news using a deep neural network. © 2021 IEEE.
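
Two of the abstracts in this listing report standard evaluation metrics: the Equal Error Rate (EER) for replay attack detection and the Dice score for brain tumor segmentation. The following is a minimal sketch of how these metrics are commonly computed, not the authors' code. The function names, the flat 0/1 mask representation, the "higher score means genuine" convention, and the toy inputs are all assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are assumed to be equal-length flat sequences of 0/1 values
    (a hypothetical representation chosen to keep the sketch small).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    Assumes a higher score means "more likely genuine" and that a trial is
    accepted when score >= threshold. Returns the mean of the false
    acceptance and false rejection rates at the threshold where the two
    rates are closest.
    """
    if not genuine_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = None, None
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        # False acceptance: a spoofed trial scored at or above the threshold.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
    print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))
```

The threshold sweep is a coarse approximation. Evaluation toolkits typically interpolate between operating points, so reported EER values may differ slightly from what this sketch would produce on the same scores.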