Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    What makes a video memorable?
    (Institute of Electrical and Electronics Engineers Inc., 2017) Kar, A.; Prashasthi, P.; Ghaturle, Y.; Vani, M.
    Humans are exposed to many pictures and videos on a daily basis, yet they have an exceptional ability to remember the details, even though many of them look very similar. This Video Memorability (VM) is mainly due to a distinguishable and fine representation of the frames in the human mind, which people tend to remember. Videos contain an abundance of data in their frames, which can be used for feature extraction. Each feature from each frame has to be carefully considered to determine the intrinsic property of the video, i.e., memorability. Using a Convolutional Neural Network (CNN), we propose a solution to the problem of predicting VM. A model has been developed to predict VM using algorithmically extracted features. Two types of features have been considered: (i) semantic features and (ii) visual features. The effectiveness of the model has been tested using publicly available image and video data. The results confirm that the CNN model can predict memorability with acceptable performance. © 2017 IEEE.
  • Item
    SolveIt: An Application for Automated Recognition and Processing of Handwritten Mathematical Equations
    (Institute of Electrical and Electronics Engineers Inc., 2018) Sagar Bharadwaj, K.S.; Bhat, V.; Krishnan, A.S.
    Solving mathematical equations is an integral part of most, if not all, forms of scientific study. Researchers usually go through an arduous process of learning the nuances and syntactic complexities of a mathematical tool in order to solve or process mathematical equations. In this paper, we present a mobile application that can process an image of a handwritten mathematical equation captured using the device's camera, recognise the equation, form the corresponding string that can be parsed by a computer algebra system, and display all possible solutions. We aim to make the whole experience of experimenting with equations very user-friendly and to remove the hassle of learning a mathematical tool just for mathematical experimentation. We propose a novel machine learning approach to recognise handwritten mathematical symbols, achieving 99.2% cross-validation accuracy on the Kaggle math symbol dataset with reduced symbols. The application covers useful features like simultaneous equation solving, graph plotting, and simple arithmetic computations from images. Overall, it is a very user-friendly equation solver that can leverage the power of existing math packages. © 2018 IEEE.
  • Item
    Audio Replay Attack Detection for Speaker Verification System Using Convolutional Neural Networks
    (Springer, 2019) Kemanth, P.J.; Supanekar, S.; Koolagudi, G.K.
    An audio replay attack is one of the most popular spoofing attacks on speaker verification systems because it is very economical and does not require much knowledge of signal processing. In this paper, we investigate the significance of non-voiced audio segments and deep learning models like Convolutional Neural Networks (CNN) for audio replay attack detection. The non-voiced segments of the audio can be used to detect reverberation and channel noise. FFT spectrograms are generated and given as input to a CNN to classify the audio as genuine or replay. The advantage of the proposed approach is that, because the voiced speech is removed, the feature vector size is reduced without compromising the necessary features. This leads to a significant reduction in the training time of the networks. The ASVspoof 2017 dataset is used to train and evaluate the model. The Equal Error Rate (EER) is computed and used as a metric to evaluate model performance. The proposed system has achieved an EER of 5.62% on the development dataset and 12.47% on the evaluation dataset. © 2019, Springer Nature Switzerland AG.
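    The Equal Error Rate named in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `equal_error_rate`, the convention that higher scores mean "more likely genuine", and the toy scores are all assumptions made for illustration. The sketch scans every observed score as a threshold and reports the point where the false acceptance and false rejection rates are closest.

```python
def equal_error_rate(genuine_scores, replay_scores):
    """Approximate the EER by scanning candidate thresholds.

    Assumption: a higher score means the system is more confident
    that the utterance is genuine, and a trial is accepted when its
    score is >= the threshold.
    """
    if not genuine_scores or not replay_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(genuine_scores) | set(replay_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: replayed audio accepted as genuine.
        far = sum(s >= t for s in replay_scores) / len(replay_scores)
        # False rejection rate: genuine audio rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy scores, not taken from ASVspoof 2017.
genuine = [0.9, 0.8, 0.4, 0.7]
replay = [0.1, 0.5, 0.2, 0.3]
print(f"EER = {equal_error_rate(genuine, replay):.2%}")
```

    With few trials the FAR and FRR curves may not cross exactly, so the result is only an approximation; evaluation toolkits typically interpolate between thresholds.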
  • Item
    Brain tumor segmentation based on 3D residual U-Net
    (Springer, 2020) Bhalerao, M.; Thakur, S.
    We propose a deep-learning-based approach for automatic brain tumor segmentation utilizing a three-dimensional U-Net extended by residual connections. In this work, we did not incorporate architectural modifications to the existing 3D U-Net, but rather evaluated different training strategies for potential improvement of performance. Our model was trained on the dataset of the International Brain Tumor Segmentation (BraTS) challenge 2019, which comprises multi-parametric magnetic resonance imaging (mpMRI) scans from 335 patients diagnosed with a glial tumor. Furthermore, our model was evaluated on the BraTS 2019 independent validation data, which consisted of another 125 brain tumor mpMRI scans. The results that our 3D Residual U-Net obtained on the BraTS 2019 test data are mean Dice scores of 0.697, 0.828, 0.772 and Hausdorff95 distances of 25.56, 14.64, 26.69 for enhancing tumor, whole tumor, and tumor core, respectively. © Springer Nature Switzerland AG 2020.
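    The Dice score reported in the abstract above measures the overlap between a predicted and a reference segmentation, 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch, not the BraTS evaluation code: the function name `dice_score`, the representation of a mask as a set of (x, y, z) voxel coordinates, and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(predicted_voxels, reference_voxels):
    """Dice coefficient between two masks given as voxel coordinates.

    Each argument is an iterable of (x, y, z) tuples marking the
    voxels that belong to the region (for example, whole tumor).
    """
    predicted = set(predicted_voxels)
    reference = set(reference_voxels)
    if not predicted and not reference:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    overlap = len(predicted & reference)
    return 2 * overlap / (len(predicted) + len(reference))


# Hypothetical toy masks: two of the three voxels in each mask overlap.
prediction = [(0, 0, 0), (0, 0, 1), (0, 1, 1)]
reference = [(0, 0, 1), (0, 1, 1), (1, 1, 1)]
print(f"Dice = {dice_score(prediction, reference):.3f}")
```

    For full-size mpMRI volumes, an array-based implementation would be far more efficient; the set-based form is used here only to keep the definition visible.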
  • Item
    Vocal Tract Articulatory Contour Detection in Real-Time Magnetic Resonance Images Using Spatio-Temporal Context
    (Institute of Electrical and Electronics Engineers Inc., 2020) Hebbar, S.A.; Sharma, R.; Somandepalli, K.; Toutios, A.; Narayanan, S.
    Due to its ability to visualize and measure the dynamics of vocal tract shaping during speech production, real-time magnetic resonance imaging (rtMRI) has emerged as one of the prominent research tools. The ability to track different articulators such as the tongue, lips, velum, and the pharynx is a crucial step toward automating further scientific and clinical analysis. Recently, various researchers have addressed the problem of detecting articulatory boundaries, but those are primarily limited to static-image based methods. In this work, we propose to use information from temporal dynamics together with the spatial structure to detect the articulatory boundaries in rtMRI videos. We train a convolutional LSTM network to detect and label the articulatory contours. We compare the produced contours against reference labels generated by iteratively fitting a manually created subject-specific template. We observe that the proposed method outperforms solely image-based methods, especially for the difficult-to-track articulators involved in airway constriction formation during speech. © 2020 IEEE.
  • Item
    Human Activity Recognition in Smart Home using Deep Learning Techniques
    (Institute of Electrical and Electronics Engineers Inc., 2021) Kolkar, R.; Geetha, V.
    Human Activity Recognition (HAR) research, which aims to understand human activities and anticipate a person's intentions, is rapidly developing in tandem with the widespread availability of sensors. Various applications, such as elderly care and health monitoring systems in smart homes, use smartphones and wearable devices. This paper proposes an effective HAR framework that uses deep learning methods, namely Convolutional Neural Networks (CNN), variations of Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) networks, to recognize activities based on smartphone sensors. The hybrid use of CNN-LSTM eliminates handcrafted feature engineering and makes deep use of spatial and temporal data. The experiments are carried out on the UCI HAR and WISDM datasets, and the comparison results are obtained. The results show improved scores of 96.83% and 98.00% for the UCI HAR and WISDM datasets, respectively. © 2021 IEEE.
  • Item
    COVID-19 Prediction Using Chest X-rays Images
    (Institute of Electrical and Electronics Engineers Inc., 2021) Kumar, A.; Sharma, N.; Naik, D.
    Understanding COVID-19 became very important since large-scale vaccination against it was not possible. Chest X-ray is the first imaging technique that plays an important role in the diagnosis of COVID-19. Until now, great success has been achieved in various fields using convolutional neural networks (CNNs) for image recognition and classification. However, due to the limited availability of annotated medical images, the classification of medical images remains the biggest challenge in medical diagnosis. The proposed research work has performed transfer learning using deep learning models like ResNet50 and VGG16 and compared their performance with a newly developed CNN-based model. ResNet50 and VGG16 are state-of-the-art models and have been used extensively. A comparative analysis with them will give us an idea of how good our model is. Also, this research work develops a CNN model, as it is expected to perform well on image classification problems. The proposed research work has used the Kaggle radiography dataset for training, validation, and testing. Moreover, this research work has used another X-ray image dataset, which has been created from two different sources. The results show that the CNN model developed by us outperforms the VGG16 and ResNet50 models. © 2021 IEEE.
  • Item
    Covid-19 Fake News Detector using Hybrid Convolutional and Bi-LSTM Model
    (Institute of Electrical and Electronics Engineers Inc., 2021) Surendran, P.; Balamuralidhar, B.; Kambham, H.; Anand Kumar, M.
    Fake news is essentially incorrect and deceiving information presented to the public as news with the motive of tarnishing the reputations of individuals and organizations. In today's world, where we are so closely connected due to the internet, we see a boom in the development of social networking platforms and, thus, in the amount of news circulated over the internet. We must keep in mind that fake news circulated on social media and other platforms can cause problems and false alarms in society. In some cases, false information can cause panic and have a dangerous effect on society and the people who believe it to be true. Along with the virus, the Covid-19 pandemic has also brought about the spread of misinformation. Claims of fake cures, wrong interpretations of government policies, false statistics, etc., bring about a need for a fact-checking system that keeps the circulating news in control. This work examines multiple models and builds an Artificial Intelligence system to detect Covid-19 fake news using a deep neural network. © 2021 IEEE.
  • Item
    Face Parts Recognition Using Deep Neural Networks
    (Institute of Electrical and Electronics Engineers Inc., 2021) Krishna, M.S.; Nali, A.; Aggarwal, N.; Krishna, T.; Ramesh, R.
    This paper describes the overall procedure of facial recognition, along with its importance and key benefits. CNN and ML methods are used to determine the accuracy of the model, for which train/test data splitting and feature extraction have been performed. The output accuracy is observed to be 91.8%. The use of optimizers, batch normalization, and dropout proved advantageous in the proposed CNN model. © 2021 IEEE.
  • Item
    Handwritten Text Recognition from an Image with Android Application
    (Institute of Electrical and Electronics Engineers Inc., 2022) Mule, H.; Kadam, N.; Naik, D.
    Nowadays, storing information from handwritten documents for future use is becoming necessary. An easy way to store information is to capture handwritten documents and save them in image format. Recognizing the text or characters present in the image is called Optical Character Recognition. In recent research, text extraction from images remains challenging due to stroke variation, inconsistent writing style, cursive handwriting, etc. We have proposed CNN and BiLSTM models for text recognition in this work. This model is evaluated on the IAM dataset and achieved 92% character recognition accuracy. This model is deployed to Firebase as a custom model to increase usability. We have developed an Android application that allows the user to capture or browse for an image and extract the text from the picture by calling the Firebase model and saving the text in a file. To store the text file, the user can browse for an appropriate location. The proposed model works on both printed and handwritten text. © 2022 IEEE.
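    The abstract above reports character recognition accuracy without defining it. One common way to score recognised text, shown here purely as an assumption and not as the authors' metric, is one minus the character error rate, where errors are counted with the Levenshtein edit distance. The names `edit_distance` and `character_accuracy` and the sample strings are hypothetical.

```python
def edit_distance(predicted, reference):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(reference) + 1))
    for i, p_char in enumerate(predicted, start=1):
        current = [i]
        for j, r_char in enumerate(reference, start=1):
            current.append(min(
                previous[j] + 1,                       # drop p_char
                current[j - 1] + 1,                    # add r_char
                previous[j - 1] + (p_char != r_char),  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(predictions, references):
    """1 - (total edit distance / total reference characters)."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_errors = sum(
        edit_distance(p, r) for p, r in zip(predictions, references)
    )
    return 1 - total_errors / total_chars


# Hypothetical recogniser output with one missing character.
print(f"{character_accuracy(['helo world'], ['hello world']):.1%}")
```

    Note that under this definition the value can drop below zero when the prediction is much longer than the reference, which is one reason papers sometimes report the error rate instead.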