Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
12 results
Search Results
Item An efficient dynamic switching algorithm for mining colossal closed itemsets from high dimensional datasets (Elsevier B.V., 2019) Vanahalli, M.K.; Patil, N.
The abundance of data across a variety of domains, including bioinformatics, has led to the formation of datasets with high dimensionality. Conventional algorithms expend most of their time mining a large number of small and mid-sized itemsets, which do not contain complete and valuable information for decision making. Recent research is focused on Frequent Colossal Closed Itemsets (FCCI), which play a significant role in decision making for many applications, especially in the field of bioinformatics. The state-of-the-art algorithms for mining FCCI from datasets consisting of a large number of rows and a large number of features are computationally expensive, as they are either pure row enumeration or pure feature enumeration based algorithms. Moreover, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. The proposed work presents an Effective Improvised Preprocessing (EIP) technique to prune the complete set of irrelevant features and irrelevant rows, and a novel, efficient Dynamic Switching Frequent Colossal Closed Itemset Mining (DSFCCIM) algorithm. The proposed DSFCCIM algorithm efficiently switches between row and feature enumeration methods based on data characteristics during the mining process. Further, the DSFCCIM algorithm is integrated with a novel Rowset Cardinality Table, an Itemset Support Table, two efficient methods to check the closeness of rowset and itemset, and two efficient pruning strategies to cut down the search space. The proposed DSFCCIM algorithm is the first dynamic switching algorithm to mine FCCI from datasets consisting of a large number of rows and a large number of features.
The performance study shows the improved effectiveness of the proposed EIP technique over the existing preprocessing techniques and the improved efficiency of the proposed DSFCCIM algorithm over the existing algorithms. © 2019 Elsevier B.V.

Item Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment (John Wiley and Sons Ltd, 2020) Jeyaraj, R.; Ananthanarayana, V.S.; Paul, A.
Big data is largely influencing business entities and research sectors to be more data-driven. Hadoop MapReduce is one of the cost-effective ways to process large-scale datasets and is offered as a service over the Internet. Even though cloud service providers promise an infinite amount of resources available on demand, it is inevitable that some of the hired virtual resources for MapReduce are left unutilized and that makespan is limited due to the various heterogeneities that exist while offering MapReduce as a service. As MapReduce v2 allows users to define the size of containers for the map and reduce tasks, jobs in a batch become heterogeneous and behave differently. Also, virtual machines of different capacities in the MapReduce virtual cluster accommodate a varying number of map/reduce tasks. These factors strongly affect resource utilization in the virtual cluster and the makespan for a batch of MapReduce jobs. Default MapReduce job schedulers do not consider these heterogeneities that exist in a cloud environment. Moreover, virtual machines in a MapReduce virtual cluster process an equal number of blocks regardless of their capacity, which affects the makespan. Therefore, we devised a heuristic-based MapReduce job scheduler that exploits virtual machine and MapReduce workload level heterogeneities to improve resource utilization and makespan. We proposed two methods to achieve this: (i) roulette wheel scheme based data block placement in heterogeneous virtual machines, and (ii) a constrained 2-dimensional bin packing to place heterogeneous map/reduce tasks.
We compared the heuristic-based MapReduce job scheduler against the classical fair scheduler in MapReduce v2. Experimental results showed that our proposed scheduler improved makespan and resource utilization by 45.6% and 47.9%, respectively, over the classical fair scheduler. © 2019 John Wiley & Sons, Ltd.

Item Comparative study on tool fault diagnosis methods using vibration signals and cutting force signals by machine learning technique (Tech Science Press, 2020) Aralikatti, S.S.; Ravikumar, K.N.; Kumar, H.; Shivananda Nayaka, H.; Sugumaran, V.
The state of the cutting tool determines the quality of the surface produced on machined parts. A faulty tool produces a poor surface, inaccurate geometry and uneconomical production. Thus, it is necessary to monitor tool condition for a machining process to achieve superior quality and economical production. In the present study, fault classification of a single point cutting tool for hard turning has been carried out by employing a machine learning technique. Cutting force and vibration signals were acquired to monitor tool condition during machining. A set of four tooling conditions, namely healthy, worn flank, broken insert and extended tool overhang, has been considered for the study. The machine learning technique was applied to both vibration and cutting force signals. Discrete wavelet features of the signals have been extracted using the discrete wavelet transformation (DWT). This transformation represents a large dataset as approximation coefficients, which contain the most useful information of the dataset. Significant features among those extracted were selected using the J48 decision tree technique. Classification of tool conditions was carried out using the Naïve Bayes algorithm. A 10-fold cross-validation was used to test the validity of the classifier.
The performance of the classifier on cutting force and vibration signals was compared to choose the best signal acquisition method for classifying tool fault conditions using the machine learning technique. © 2020 Tech Science Press. All rights reserved.

Item Enhanced protein structural class prediction using effective feature modeling and ensemble of classifiers (Institute of Electrical and Electronics Engineers Inc., 2021) Bankapur, S.; Patil, N.
Protein Secondary Structural Class (PSSC) information is important in investigating further challenges of protein sequences such as protein fold recognition, protein tertiary structure prediction, and analysis of protein functions for drug discovery. Identification of PSSC using biological methods is time-consuming and cost-intensive. Several computational models have been developed to predict the structural class; however, they generalize poorly. Hence, predicting PSSC based on protein sequences is still proving to be an uphill task. In this article, we proposed an effective, novel and generalized prediction model consisting of feature modeling and an ensemble of classifiers. The proposed feature modeling extracts discriminating information (features) by leveraging three techniques: (i) Embedding: features are extracted on the basis of spatial residue arrangements of the sequences using word embedding approaches; (ii) SkipXGram Bi-gram: various sets of skipped bi-gram features are extracted from the sequences; and (iii) General Statistical (GS): features are extracted which cover the global information of structural sequences. The combined effective sets of features are trained and classified using an ensemble of three classifiers: Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machines (GBM). The proposed model, when assessed on five benchmark datasets (high and low sequence similarity), viz.
z277, z498, 25PDB, 1189, and FC699, reported an overall accuracy of 93.55, 97.58, 81.82, 81.11, and 93.93 percent respectively. The proposed model is further validated on a large-scale updated low similarity (≤25%) dataset, where it achieved an overall accuracy of 81.11 percent. The proposed generalized model is robust and consistently outperformed several state-of-the-art models on all five benchmark datasets. © 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.

Item A deep neural network model for content-based medical image retrieval with multi-view classification (Springer Science and Business Media Deutschland GmbH, 2021) Karthik, K.; Kamath S., S.S.
In medical applications, retrieving similar images from repositories is essential for supporting diagnostic imaging-based clinical analysis and decision support systems. However, this is a challenging task due to the multi-modal and multi-dimensional nature of medical images. In practical scenarios, the availability of large and balanced datasets that can be used for developing intelligent systems for efficient medical image management is quite limited. Traditional models often fail to capture the latent characteristics of images and have achieved limited accuracy when applied to medical images. To address these issues, a deep neural network-based approach for view classification and content-based image retrieval is proposed, and its application to efficient medical image retrieval is demonstrated. We also designed an approach for body part orientation view classification labels, intending to reduce the variance that occurs in different types of scans. The learned features are used first to predict class labels and later to model the feature space for similarity computation for the retrieval task. The outcome of this approach is measured in terms of error score.
When benchmarked against 12 state-of-the-art works, the model achieved the lowest error score of 132.45, with 9.62–63.14% improvement over other works, thus highlighting its suitability for real-world applications. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.

Item Segmentation of focal cortical dysplasia lesions from magnetic resonance images using 3D convolutional neural networks (Elsevier Ltd, 2021) Niyas, S.; Chethana Vaisali, S.; Show, I.; Chandrika, T.G.; Vinayagamani, S.; Kesavadas, C.; Rajan, J.
Computer-aided diagnosis using advanced Artificial Intelligence (AI) techniques has become very popular over the last few years. This work automates the segmentation of Focal Cortical Dysplasia (FCD) lesions from three-dimensional (3D) Magnetic Resonance (MR) images. FCD is a type of neuronal malformation in the brain cortex and is the leading cause of intractable epilepsy, irrespective of gender or age differences. Since these neuron-related abnormalities are usually resistant to drug therapy, surgical resection has been the main treatment approach for patients with intractable epilepsy. Automating the identification and segmentation of FCD is useful for neuroradiologists in pre-surgical evaluations. Convolutional Neural Networks (CNNs) have the ability to learn appropriate features from the training data without any human intervention. However, most of the state-of-the-art FCD segmentation approaches use two-dimensional (2D) CNN models despite the availability of 3D magnetic resonance imaging (MRI) volumes, and hence fail to leverage the inter-slice information present in the MRI volumes. The major hurdles in considering a 3D CNN model are the need for a large 3D dataset, large memory, and high computation cost. A deep 3D CNN segmentation model, which can extract inter-slice information and overcomes the drawbacks of conventional 3D CNN methods to an extent, is proposed in this paper.
The model uses a 3D version of U-Net with residual blocks that works on shallow-depth 3D sub-volumes generated from MRI volumes. The proposed method shows superior performance over the state-of-the-art FCD segmentation methods in both qualitative and quantitative analysis. © 2021 Elsevier Ltd

Item Deep learning-based automated mitosis detection in histopathology images for breast cancer grading (John Wiley and Sons Inc, 2022) Mathew, T.; Ajith, B.; Kini, J.; Rajan, J.
Cancer grade is an indicator of the aggressiveness of cancer. It is used for prognosis and treatment decisions. Conventionally, cancer grading is performed manually by experienced pathologists via microscopic examination of pathology slides. Among the three factors involved in breast cancer grading (mitosis count, nuclear atypia, and tubule formation), mitotic cell counting is the most challenging task for pathologists. It is possible to automate this task by applying computational algorithms to pathology slide images. The lack of sufficiently large datasets and the class imbalance between mitotic and non-mitotic cells in slide images are the two major challenges in developing effective deep learning-based methods for mitosis detection. In this paper, we propose a new approach, and a method based on it, to address these challenges. The high training data requirement of the advanced deep neural network is met by combining two datasets from different sources after a color-normalization process. Class imbalance is addressed by the augmentation of the mitotic samples in a context-preserving manner. Finally, a customized convolutional neural network classifier is used to classify the candidate cells into the target classes. We have used the publicly available datasets MITOS-ATYPIA and MITOS for the experiments. Our method outperforms most of the recent methods that are based on independent datasets and at the same time offers adaptability to the combination of datasets from different sources.
© 2022 Wiley Periodicals LLC.

Item A deep learning based classifier framework for automated nuclear atypia scoring of breast carcinoma (Elsevier Ltd, 2023) Mathew, T.; Johnpaul, C.I.; Ajith, B.; Kini, J.R.; Rajan, J.
Nuclear atypia scoring is an essential procedure in the grading of breast carcinoma. The manual procedure of nuclear atypia scoring is error-prone and marked by pathologists' disagreement and low reproducibility. Researchers are actively pursuing automated methods to solve the problems of manual scoring. In this work, we propose a novel deep learning-based framework for automated nuclear atypia scoring of breast cancer from histopathology slide images. The framework consists of three major phases, namely preprocessing, deep learning, and postprocessing. The original three-class problem of atypia scoring at slide level is not suitable for direct application of deep learning algorithms. This is due to the large dimensions and structural complexity of slide images, compounded by the small sample size of the available dataset. The novel aspects of the proposed work are the redesign of this problem into a six-class nuclei classification problem through a set of preprocessing steps, which facilitates effective use of deep learning algorithms, and the flexibility of the proposed three-phase framework to use different algorithms in each phase. We used the publicly available slide image dataset MITOS-ATYPIA, which contains 600 slide images of high spatial dimension, for the experiments. A five-fold cross-validation with a train-test sample ratio of 80:20 in each fold is used for the performance evaluation. The performance of the method based on this framework exceeds the state-of-the-art, with results of 0.8766, 0.8760, and 0.8745 for the metrics precision, recall, and F1 score respectively.
© 2023 Elsevier Ltd

Item Affective Feedback Synthesis Towards Multimodal Text and Image Data (Association for Computing Machinery, 2023) Kumar, P.; Bhatt, G.; Ingle, O.; Goyal, D.; Raman, B.
In this article, we have defined a novel task of affective feedback synthesis that generates feedback for input text and corresponding images in a way similar to humans responding to multimodal data. A feedback synthesis system has been proposed and trained using ground-truth human comments along with image-text input. We have also constructed a large-scale dataset consisting of images, text, Twitter user comments, and the number of likes for the comments by crawling news articles through Twitter feeds. The proposed system extracts textual features using a transformer-based textual encoder. The visual features have been extracted using a Faster region-based convolutional neural network (Faster R-CNN) model. The textual and visual features have been concatenated to construct multimodal features that the decoder uses to synthesize the feedback. We have compared the results of the proposed system with baseline models using quantitative and qualitative measures. The synthesized feedback has been analyzed using automatic and human evaluation. It has been found to be semantically similar to the ground-truth comments and relevant to the given text-image input. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Item Evolution of LiverNet 2.x: Architectures for automated liver cancer grade classification from H&E stained liver histopathological images (Springer, 2024) Chanchal, A.K.; Lal, S.; Barnwal, D.; Sinha, P.; Arvavasu, S.; Kini, J.
Recently, the automation of disease identification has been quite popular in the field of medical diagnosis. The rise of Convolutional Neural Networks (CNNs) for training and generalizing medical image data has proven to be quite efficient in detecting and identifying the types and sub-types of various diseases.
Since the classification of large datasets of Hematoxylin & Eosin (H&E) stained histopathology images by experts can be expensive and time-consuming, automated processes using deep learning have been encouraged for the past decade. This paper introduces the LiverNet 2.x models, obtained by modifying the earlier LiverNet architecture. The proposed models use two different improvements of the Atrous Spatial Pyramid Pooling (ASPP) block to extract the clinically defined features of hepatocellular carcinoma (HCC) from liver histopathology images. LiverNet 2.0 uses a modified form of the ASPP block known as DenseASPP, where all the atrous convolution outputs are densely connected. LiverNet 2.1, in contrast, uses fewer concatenations while maintaining a large receptive field by stacking the dilated convolutional blocks in a tree-like fashion. This paper also discusses the trade-off between LiverNet 2.0 and LiverNet 2.1 in terms of accuracy and computational complexity. All comparison models and the proposed models are trained and tested on patches from two different histopathological datasets. The experimental results show that the proposed models perform better than the reference models. For the KMC Liver dataset, LiverNet 2.0 and LiverNet 2.1 achieved an accuracy of 97.50% and 97.14% respectively. For the TCGA Liver dataset, accuracies of 94.37% and 97.14% were achieved. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
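
The LiverNet 2.x abstract above contrasts parallel atrous branches (ASPP-style) with stacked dilated convolutions that keep a large receptive field. As a rough illustration only, the sketch below computes the one-axis receptive field for both arrangements using the standard stride-1 formula, in which each layer adds `(kernel_size - 1) * dilation` pixels. The kernel size of 3 and the dilation rates `(3, 6, 12, 18)` are assumptions chosen for the example and are not taken from the paper; the function names are hypothetical, and this is not the LiverNet architecture itself.

```python
from typing import Sequence

# Hypothetical dilation rates for illustration; not taken from the LiverNet 2.x paper.
EXAMPLE_RATES = (3, 6, 12, 18)


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field along one axis of stride-1 dilated convolutions applied in sequence.

    Each layer widens the field by (kernel_size - 1) * dilation pixels.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def parallel_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Largest receptive field among parallel branches, each a single dilated convolution."""
    return max(1 + (kernel_size - 1) * dilation for dilation in dilations)


if __name__ == "__main__":
    print("stacked :", stacked_receptive_field(3, EXAMPLE_RATES))
    print("parallel:", parallel_receptive_field(3, EXAMPLE_RATES))
```

Under these assumed rates, the stacked arrangement sees a wider context than any single parallel branch, which is the general motivation for cascading dilated layers; the accuracy and cost trade-off reported in the abstract depends on details this sketch does not model.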
