Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
12 results
Search Results
Item: Automatic shadow removal algorithm for VOP, DWT based watermarking algorithm for VOP and generation of super resolved VOP (2011)
Pais, A.R.; D'Souza, J.; Reddy, R.M.; Hari Krishna, P.
Removal of shadows from Video Object Planes (VOPs) assists surveillance applications in the comprehensive detection of activities. We propose a method for removing shadows from the VOP; noise is also removed from the VOP using existing methods. To authenticate the surveillance VOP, digital watermarking is used, and we propose a digital watermarking scheme for the VOP based on localized biorthogonal wavelets. A super-resolved VOP is generated using a multi-frame method, and an edge-model-based super-resolution method is used to obtain better results. The effect of digital watermarking on the super-resolved VOP is also studied. A number of test cases were evaluated to identify the best method for video surveillance applications. Our proposed super-resolution (SR) method gives better results than the bilinear and bi-cubic methods.

Item: Super-resolution video generation algorithm for surveillance applications (Maney Publishing, 2014)
Pais, A.R.; D'Souza, J.; Reddy, R.M.
Video surveillance is one of the major applications where high-resolution (HR) images are crucial. Since video cameras have limited spatial and temporal resolution, there is a need for super-resolution video generation algorithms. In this paper, we present a novel technique for activity detection in surveillance video. To achieve this goal, we propose and investigate efficient algorithms for Video Object Plane (VOP) generation, shadow removal from the VOP and super-resolved VOP generation. The proposed VOP generation algorithm is computationally efficient and works for both dynamic and static backgrounds.
The novel texture-based shadow removal algorithm for the VOP has been evaluated in terms of average shadow detection and discrimination rates. The proposed super-resolution video generation algorithm is designed using edge models; its performance has been evaluated using a numerical analysis technique and is found to be better than bi-cubic and bi-linear interpolation techniques. © 2014 RPS.

Item: Dynamic video anomaly detection and localization using sparse denoising autoencoders (Springer New York LLC, 2018)
Narasimhan, M.G.; Kamath S., S.
The emergence of novel techniques for automatic anomaly detection in surveillance videos has significantly reduced the burden of manually processing large, continuous video streams. However, existing anomaly detection systems suffer from a high false-positive rate and are not real-time, which makes them practically unusable. Furthermore, their predefined feature selection techniques limit their application to specific cases. To overcome these shortcomings, a dynamic anomaly detection and localization system is proposed that uses deep learning to automatically learn relevant features. In this technique, each video is represented as a group of cubic patches for identifying local and global anomalies. A unique sparse denoising autoencoder architecture is used, which significantly reduces the computation time and cuts the number of false positives in frame-level anomaly detection by more than 2.5%. Experimental analysis on two benchmark datasets, the UMN dataset and the UCSD Pedestrian dataset, shows that our algorithm outperforms state-of-the-art models in terms of false-positive rate, while also showing a significant reduction in computation time.
© 2017, Springer Science+Business Media, LLC.

Item: Gradient-oriented directional predictor for HEVC planar and angular intra prediction modes to enhance lossless compression (Elsevier GmbH, 2018)
Shilpa Kamath, S.; Aparna, P.; Antony, A.
Recent advancements in capture and display technologies motivated the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group to jointly develop High-Efficiency Video Coding (HEVC), a state-of-the-art video coding standard for efficient compression. Applications that essentially require lossless compression include medical imaging, video analytics, video surveillance and video streaming, where content reconstruction must be flawless. In the proposed work, we present a gradient-oriented directional prediction (GDP) strategy at the pixel level to enhance the compression efficiency of the conventional block-based planar and angular intra prediction in the HEVC lossless mode. Detailed experimental analysis demonstrates that the proposed method outperforms the lossless mode of the HEVC anchor in terms of bit-rate savings by 8.29%, 1.65%, 1.94% and 2.21% for the Main-AI, LD, LDP and RA configurations, respectively, without increasing the computational complexity. The experimental results also illustrate that the proposed predictor is superior to existing state-of-the-art techniques in the literature. © 2018 Elsevier GmbH

Item: UAV based cost-effective real-time abnormal event detection using edge computing (Springer, 2019)
Shahzad Alam, M.S.; Natesha, B.V.; Ashwin, T.S.; Guddeti, R.M.R.
Recent advancements in computer vision have led to the development of real-time surveillance systems that ensure the safety and security of people in public places. An aerial surveillance system is advantageous in this scenario; a platform such as an Unmanned Aerial Vehicle (UAV) is very reliable and can be considered a cost-effective option for this task.
To make the system fully autonomous, we require real-time abnormal event detection. However, this is computationally complex and time-consuming because the UAV affords only limited processing and payload capacity. In this paper, we propose a cost-effective approach for aerial surveillance in which large computation tasks are moved to the cloud while limited computation is kept on board the UAV using an edge computing technique. Further, our proposed system maintains minimal communication between the UAV and the cloud, which reduces both the network traffic and the end-to-end delay. The proposed method is based on the state-of-the-art YOLO (You Only Look Once) technique for real-time object detection, deployed on an edge computing device using the Intel Movidius Neural Compute Stick VPU (Vision Processing Unit), with abnormal event detection applied on the cloud using a motion influence map. Experimental results demonstrate that the proposed system reduces the end-to-end delay; further, Tiny YOLO processes frames per second (fps) six times faster than other state-of-the-art methods. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.

Item: Multimodal behavior analysis in computer-enabled laboratories using nonverbal cues (Springer Science and Business Media Deutschland GmbH, 2020)
Banerjee, S.; Ashwin, T.S.; Guddeti, R.M.R.
In the modern era, there is a growing need for surveillance to ensure the safety and security of people. Real-time object detection is crucial for many applications such as traffic monitoring, security, search and rescue, vehicle counting, and classroom monitoring. Computer-enabled laboratories in a smart campus are generally equipped with video surveillance cameras. However, the existing literature shows that video surveillance data obtained from a smart campus is seldom used for unobtrusive behavioral analysis.
Though there are several works on students' and teachers' behavior recognition using devices such as the Kinect and handheld cameras, no existing work extracts video surveillance data and predicts the behavioral patterns of both students and teachers in real time. Hence, in this study, we unobtrusively analyze students' and teachers' behavioral patterns inside a teaching laboratory (considered an indoor scenario of a smart campus). We propose a deep convolutional network architecture to classify and recognize objects in this indoor scenario, i.e., the teaching laboratory environment of the smart campus, using a modified Single-Shot MultiBox Detector (SSD) approach. We created our own dataset with six different class labels, used to train the deep learning architecture, for predicting the behavioral patterns of both students and teachers. The performance evaluation demonstrates that the proposed method achieves an accuracy of 0.765 for classification and localization. © 2020, Springer-Verlag London Ltd., part of Springer Nature.

Item: Surveillance video analysis for student action recognition and localization inside computer laboratories of a smart campus (Springer, 2021)
Rashmi, M.; Ashwin, T.S.; Guddeti, G.R.M.
In the era of the smart campus, unobtrusive student monitoring is a challenging task. The monitoring system must be able to recognize and detect the actions performed by students. Recently, many deep neural network based approaches have been proposed to automate Human Action Recognition (HAR) in different domains, but these have not been explored in learning environments. HAR can be used in classrooms, laboratories, and libraries to make the teaching-learning process more effective.
To make the learning process in computer laboratories more effective, in this study we propose a system for the recognition and localization of student actions from still images extracted from Closed-Circuit Television (CCTV) videos. The proposed method uses YOLOv3 (You Only Look Once), a state-of-the-art real-time object detection technique, for the localization and recognition of students' actions. Further, an image template matching method is used to decrease the number of image frames and thus process the video more quickly. Since actions performed by humans are domain-specific and no standard dataset is available for students' action recognition in smart computer laboratories, we created the STUDENT ACTION dataset using image frames obtained from the CCTV cameras placed in the computer laboratory of a university campus. The proposed method recognizes various actions performed by students at different locations within an image frame. It shows excellent performance in identifying actions with more samples compared to actions with fewer samples. © 2020, Springer Science+Business Media, LLC, part of Springer Nature.

Item: An empirical study of the impact of masks on face recognition (Elsevier Ltd, 2022)
Jeevan, G.; Zacharias, G.C.; Nair, M.S.; Rajan, J.
Face recognition has a wide range of applications, such as video surveillance, security, and access control. Over the past decade, the field of face recognition has matured and grown at par with the latest advancements in technology, particularly deep learning. Convolutional neural networks have surpassed human accuracy in face recognition on popular evaluation tests such as LFW. However, most existing models evaluate their performance under the assumption that full facial information is available. The COVID-19 pandemic has posed challenges to this assumption and to the performance of existing methods and leading-edge algorithms in the field of face recognition.
This is in the wake of an explosive increase in the number of people wearing face masks. The reduced amount of facial information available to a recognition system from a masked face impacts its discrimination ability. In this context, we design and conduct a series of experiments comparing the masked face recognition performance of CNN architectures available in the literature, and explore possible alterations in loss functions, architectures, and training methods that can enable existing methods to fully extract and leverage the limited facial information available in a masked face. We evaluate existing CNN-based face recognition systems against datasets composed entirely of masked faces, in contrast to standard evaluations where masked or occluded faces are a rare occurrence. The study also presents evidence of an increased impact of network depth on performance compared to standard face recognition. Our observations indicate that substantial performance gains can be achieved by introducing masked faces into the training set. The study also found that various parameter settings determined to be suitable for standard face recognition are not ideal for masked face recognition; through empirical analysis, we derive new value recommendations for these parameters and settings. © 2021 Elsevier Ltd

Item: Deep learning-based multi-view 3D-human action recognition using skeleton and depth data (Springer, 2023)
Ghosh, S.K.; Rashmi, M.; Mohan, B.R.; Guddeti, R.M.R.
Human Action Recognition (HAR) is a fundamental challenge that smart surveillance systems must overcome. With the rising affordability of capturing human actions with advanced depth cameras, HAR has garnered increased interest over the years; however, the majority of these efforts have focused on single-view HAR. Recognizing human actions from arbitrary viewpoints is more challenging, as the same action is observed differently from different angles.
This paper proposes a multi-stream Convolutional Neural Network (CNN) model for multi-view HAR using depth and skeleton data. We also propose a novel and efficient depth descriptor, Edge Detected-Motion History Image (ED-MHI), based on Canny edge detection and the Motion History Image. In addition, the proposed skeleton descriptor, Motion and Orientation of Joints (MOJ), represents an action using joint motion and orientation. Experimental results on two human action datasets, NUCLA Multiview Action3D and NTU RGB-D, using a cross-subject evaluation protocol demonstrate that the proposed system outperforms state-of-the-art works, with 93.87% and 85.61% accuracy, respectively. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Item: RMDNet-Deep Learning Paradigms for Effective Malware Detection and Classification (Institute of Electrical and Electronics Engineers Inc., 2024)
S, S.; Lal, S.; Pratap Singh, M.; Raghavendra, B.S.
Malware analysis and detection remain essential for maintaining the security of networks and computer systems, even as the threat landscape shifts. Traditional approaches are insufficient to keep pace with the rapidly evolving nature of malware, and Artificial Intelligence (AI) plays a significant role in advancing malware detection. Various Machine Learning (ML) based malware detection systems have been developed to combat the ever-changing characteristics of malware. Consequently, there is growing interest in exploring advanced techniques that leverage the power of Deep Learning (DL) to effectively analyze and detect malicious software; DL models demonstrate enhanced capabilities for analyzing extensive sequences of system calls. This paper proposes a Robust Malware Detection Network (RMDNet) for effective malware detection and classification.
The proposed RMDNet model branches the input and performs depth-wise convolution and concatenation operations. The proposed RMDNet and existing DL models are evaluated on a dataset of 48,240 malware binary-visualization images in RGB format, as well as on the multi-class Malimg and Dumpware-10 datasets in grayscale format. The experimental results on each of these datasets demonstrate that the proposed RMDNet model can effectively and accurately categorize malware, outperforming recent benchmark DL algorithms. © 2013 IEEE.
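The depth-wise convolution and channel-concatenation operations named in the RMDNet abstract above are standard CNN building blocks. As a minimal illustration (a pure-Python sketch with hypothetical toy data and kernels, not the authors' implementation):

```python
def depthwise_conv2d(channels, kernels):
    """Depth-wise convolution: apply one 2D kernel per input channel ('valid' padding)."""
    out = []
    for ch, k in zip(channels, kernels):
        kh, kw = len(k), len(k[0])
        h, w = len(ch), len(ch[0])
        # Slide the kernel over the channel; output shrinks by kernel size - 1.
        res = [[sum(ch[i + u][j + v] * k[u][v]
                    for u in range(kh) for v in range(kw))
                for j in range(w - kw + 1)]
               for i in range(h - kh + 1)]
        out.append(res)
    return out

def concat_channels(*branches):
    """Concatenate branch outputs along the channel axis."""
    merged = []
    for b in branches:
        merged.extend(b)
    return merged

# Toy 2-channel 3x3 input (hypothetical values).
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]

# Branch A: one 2x2 kernel per channel; Branch B: 1x1 identity kernels.
branch_a = depthwise_conv2d(x, [[[0, 0], [0, 1]], [[1, 0], [0, 0]]])
branch_b = depthwise_conv2d(x, [[[1]], [[1]]])

# Branching then concatenating doubles the channel count here: 2 + 2 = 4.
y = concat_channels(branch_a, branch_b)
print(len(y))  # → 4
```

Each branch keeps one kernel per channel (unlike a full convolution, which mixes channels), and the concatenation step merges the branch outputs channel-wise, which is the general pattern the abstract describes.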
