Faculty Publications

Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Search Results

Now showing 1 - 10 of 15
  • Item
    Classifying behavioural traits of small-scale farmers: Use of a novel artificial neural network (ANN) classifier
    (Institute of Electrical and Electronics Engineers Inc., 2016) Jena, P.R.; Majhi, R.
    This paper develops and employs a novel artificial neural network (ANN) model to study farmers' behaviour towards decision making on maize production in Kenya. The paper compares the accuracy of the ANN-based model with that of a statistical model and finds that the ANN model achieves higher accuracy and efficiency. The findings reveal that farmers' decision making is mostly influenced by their demographic and food security characteristics. Further, to examine the relative importance of different demographic and food security characteristics, an ANOVA test is undertaken. The results show that education and food security indices are instrumental in influencing farmers' decision making. © 2016 IEEE.
  • Item
    QSAR Classification Models for Predicting 3CLPro-protease Inhibitor Activity
    (Institute of Electrical and Electronics Engineers Inc., 2021) Mondal, K.; Kamath S., S.
    The ongoing COVID-19 pandemic rapidly reached a worldwide scale and has caused devastating casualties, both in human lives and in damage to the world economy. Several efforts to design drugs and vaccines are underway across the globe. One of the potential early breakthroughs came from the possibility of repurposing existing drugs for COVID-19, specifically through drug modeling that exploits available computing power. Prediction of inhibition activity is a major step in such a computation-based drug discovery process. It is one of the virtual screening processes that throws light on particular molecules that may be potential drug candidates. The subsequent stages in drug discovery are highly resource-intensive, during which a streamlined analysis of potential candidates can help in optimal design. Thus, the problem of predicting the inhibition activity of compounds on proteins has attracted significant research interest. In this paper, an approach that employs quantitative structure-activity relationship (QSAR) modelling of SARS-CoV-3CLpro enzyme inhibitors for the development of an activity classification model is proposed. The classification models predict SARS-CoV-3CLpro inhibitory activity for query compounds in the screening process using labels. Moreover, molecular docking analysis is performed using 3 FDA approved drugs that are being used as repurposed drugs for COVID-19 treatment. The best performing model was validated with the docking data (RMSD and binding energy) of these 3 drugs, and the results obtained were promising. © 2021 IEEE.
  • Item
    Machine Learning based COVID-19 Mortality Prediction using Common Patient Data
    (Institute of Electrical and Electronics Engineers Inc., 2022) Agrawal, S.; Patil, N.
    COVID-19 was declared a pandemic in 2020, and it caused havoc worldwide. The fact that it is unpredictable adds to its lethality. The world has already seen various COVID-19 infection waves, subsequent waves being even more deadly. Many patients are asymptomatic initially but suddenly develop breathing problems. More than four million people have died due to COVID-19. It is necessary to forecast a patient's likelihood of dying so that appropriate precautions can be implemented. In this study, a COVID-19 mortality prediction model which uses machine learning is proposed. Most of the current research work requires several patient features and lab test results to predict mortality. However, we suggest a simpler and more efficient technique that relies solely on X-rays and basic patient information such as age and gender. Several ensemble-based models were evaluated and compared using a variety of metrics, and the best method was able to achieve a classification accuracy of 92.6% and AUPRC of 0.95. © 2022 IEEE.
  • Item
    Citation Intent Classification Using Transformers
    (Institute of Electrical and Electronics Engineers Inc., 2024) Rakshith Gowda, H.C.; Raj, K.S.; Anand Kumar, M.
    As the world of scholarly research continues to grow, the intricate network of citations serves as the foundation of academic discussion, symbolizing the interweaving of concepts and the dissemination of information. The study of citations in scientific literature is important for discovering new knowledge, retrieving information, and analyzing discourse. However, manually categorizing citation functions is a slow and biased process. To address this, we conducted research on automated citation function classification in astrophysics literature by creating and evaluating deep learning models. We also introduce the FOCAL (Functions of Citations in Astrophysics Literature) dataset, which includes astrophysics articles with manually labelled citation functions. Our approach uses language features, citation contexts, and domain knowledge to classify citation functions. Results show that our method accurately identifies citation functions, indicating its potential for improving citation analysis. © 2024 IEEE.
  • Item
    Two Automated Techniques for Carotid Lumen Diameter Measurement: Regional versus Boundary Approaches
    (Springer New York LLC, 2016) Araki, T.; Kumar, P.K.; Suri, H.S.; Ikeda, N.; Gupta, A.; Saba, L.; Rajan, J.; Lavra, F.; Sharma, A.M.; Shafique, S.; Nicolaïdes, A.; Laird, J.R.; Suri, J.S.
    The degree of stenosis in the carotid artery can be predicted using automated carotid lumen diameter (LD) measured from B-mode ultrasound images. Systolic velocity-based methods for measurement of LD are subjective. With the advancement of high resolution imaging, image-based methods have started to emerge. However, they require robust image analysis for accurate LD measurement. This paper presents two different algorithms for automated segmentation of the lumen borders in carotid ultrasound images. Both algorithms are modeled as a two-stage process. Stage one consists of a global-based model using a scale-space framework for the extraction of the region of interest. This stage is common to both algorithms. Stage two is modeled using a local-based strategy that extracts the lumen interfaces. At this stage, algorithm-1 is modeled as a region-based strategy using a classification framework, whereas algorithm-2 is modeled as a boundary-based approach that uses the level set framework. Two databases (DB), Japan DB (JDB) (202 patients, 404 images) and Hong Kong DB (HKDB) (50 patients, 300 images), were used in this study. Two trained neuroradiologists performed manual LD tracings. The mean automated LD measured was 6.35 ± 0.95 mm for JDB and 6.20 ± 1.35 mm for HKDB. The precision-of-merit was 97.4% and 98.0% with respect to the two manual tracings for JDB, and 99.7% and 97.9% with respect to the two manual tracings for HKDB. Statistical tests such as ANOVA, Chi-Squared, T-test, and Mann-Whitney test were conducted to show the stability and reliability of the automated techniques. © 2016, Springer Science+Business Media New York.
  • Item
    Gene essentiality, conservation index and coevolution of genes in cyanobacteria
    (Public Library of Science, 2017) Tiruveedula, G.S.S.; Wangikar, P.P.
    Cyanobacteria, a group of photosynthetic prokaryotes, dominate the earth with ~10^15 g wet biomass. Despite diversity in habitats and an ancient origin, the cyanobacterial phylum has retained a significant core genome. Cyanobacteria are being explored for direct conversion of solar energy and carbon dioxide into biofuels. For this, efficient cyanobacterial strains will need to be designed via metabolic engineering. This will require identification of target knockouts to channel the flow of carbon toward the product of interest while minimizing deletions of essential genes. We propose "Gene Conservation Index" (GCI) as a quick measure to predict gene essentiality in cyanobacteria. GCI is based on the phylogenetic profile of a gene constructed with a reduced dataset of cyanobacterial genomes. GCI is the percentage of organism clusters in which the query gene is present in the reduced dataset. Of the 750 genes deemed to be essential in the experimental study on S. elongatus PCC 7942, we found 494 to be conserved across the phylum; these largely comprise the essential metabolic pathways. In contrast, the conserved but non-essential genes broadly comprise genes required under stress conditions. Exceptions to this rule include genes such as the glycogen synthesis and degradation enzymes, deoxyribose-phosphate aldolase (DERA), glucose-6-phosphate 1-dehydrogenase (zwf) and fructose-1,6-bisphosphatase class 1, which are conserved but non-essential. While the essential genes are to be avoided during gene knockout studies as potentially lethal deletions, the non-essential but conserved set of genes could be interesting targets for metabolic engineering. Further, we identify clusters of co-evolving genes (CCG), which provide insights that may be useful in annotation. Principal component analysis (PCA) plots of the CCGs are demonstrated as data visualization tools that are complementary to the conventional heatmaps. Our dataset consists of phylogenetic profiles for 23,643 non-redundant cyanobacterial genes. We believe that the data and the analysis presented here will be a great resource to the scientific community interested in cyanobacteria. © 2017 Tiruveedula, Wangikar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
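The GCI definition in the abstract above (the percentage of organism clusters in which the query gene is present) can be illustrated with a minimal sketch. The function name and the boolean-list input format are assumptions made for illustration only; the abstract does not describe how the authors store phylogenetic profiles or build the reduced dataset.

```python
def gene_conservation_index(presence_by_cluster):
    """Return the percentage of organism clusters that contain the query gene.

    presence_by_cluster: one boolean per organism cluster
    (a hypothetical input format, not taken from the paper).
    """
    flags = list(presence_by_cluster)
    if not flags:
        raise ValueError("at least one organism cluster is required")
    present = sum(1 for flag in flags if flag)
    return 100.0 * present / len(flags)


# Hypothetical profile: the gene is found in 3 of 4 organism clusters.
print(gene_conservation_index([True, True, False, True]))  # 75.0
```

A gene present in every cluster scores 100, and one present in none scores 0.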
  • Item
    A novel feature extraction technique for pulmonary sound analysis based on EMD
    (Elsevier Ireland Ltd, 2018) Mondal, A.; Banerjee, P.; Tang, H.
    Background and objective: The stethoscope-based auscultation technique is a primary diagnostic tool for chest sound analysis. However, the performance of this method is limited by its dependency on the physician's experience and knowledge and on the clarity of the signal. To overcome this problem, we need an automated computer-aided diagnostic system that is competent in noisy environments. In this paper, a novel feature extraction technique is introduced for discriminating various pulmonary dysfunctions in an automated way based on pattern recognition algorithms. Method: In this work, the disease-correlated characteristics of lung sound signals are identified in terms of statistical distribution parameters: mean, variance, skewness, and kurtosis. These features are extracted from selective morphological components of the mapped signal in the empirical mode decomposition domain. The feature set is fed to the classifier model to differentiate the corresponding classes. Results: The significance of the developed features is validated by conducting several experiments using supervised and unsupervised classifiers. Furthermore, the discriminating power of the proposed features is compared with three types of baseline features. The experimental result is evaluated by statistical analysis and also validated against physicians' inference. Conclusions: It is found that the proposed feature extraction technique is superior to the baseline methods in terms of classification accuracy, sensitivity and specificity. The developed method gives better results than the baseline methods in all tested circumstances. The proposed method gives a higher accuracy of 94.16, sensitivity of 100 and specificity of 93.75 for an artificial neural network classifier. © 2018 Elsevier B.V.
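The four statistical distribution parameters named in the abstract above (mean, variance, skewness, kurtosis) can be sketched as follows. This is only an illustration under stated assumptions: it uses population moments and non-excess kurtosis, and it operates on a plain list of samples. The abstract does not say which estimators the authors use, and the empirical mode decomposition step is not shown. The function name is hypothetical.

```python
def distribution_features(samples):
    """Compute mean, variance, skewness and kurtosis of one signal component.

    Assumes population (1/n) moments and non-excess kurtosis; these
    conventions are an assumption, not taken from the paper.
    """
    values = [float(v) for v in samples]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("skewness and kurtosis are undefined for a constant signal")
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical component values, standing in for one decomposed mode.
features = distribution_features([1, 2, 3, 4, 5])
```

In a pipeline like the one described, such a feature dictionary would be computed per selected component and the values concatenated into the feature vector passed to a classifier.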
  • Item
    T4-like Escherichia coli phages from the environment carry blaCTX-M
    (John Wiley and Sons Inc, 2018) Mohan Raj, J.R.; Vittal, R.; Huilgol, P.; Bhat, U.; Karunasagar, I.
    The resistance determinant blaCTX-M has many variants and has been the most commonly reported gene in clinical isolates of extended spectrum beta-lactamase producing Escherichia coli. Phages have been speculated to be potential reservoirs of resistance genes and efficient vehicles for horizontal gene transfer. The objective of the study was to determine the prevalence of, and to characterize, bacteriophages that harbour the resistance determinant blaCTX-M. Escherichia coli specific bacteriophages were isolated from 15 samples, including soil and water, across Mangaluru, India using bacterial hosts that were sensitive to β-lactams. Phenotypic and genotypic characterization based on plaque morphology, host range, restriction fragment length polymorphism (RFLP), presence of blaCTX-M and electron microscopy was performed. Of 36 phages isolated, seven were positive for Group 1 of blaCTX-M. Based on host range and RFLP pattern, the seven phages were classified into four distinct groups, each harbouring a variant of blaCTX-M. Five phages were T4-like Myoviridae by electron microscopy, which was further confirmed by polymerase chain reaction (PCR) for T4 specific gp14. Generalized transduction of the CTX-M gene from these phages was also observed. The high prevalence (20%) of the blaCTX-M gene in the phage pool confirms the significant role of Myoviridae members, specifically T4-like phages, in the dissemination of this resistance gene. Significance and Impact of the Study: The CTX-M gene that confers resistance to the beta-lactam class of drugs is widespread and diverse. Understanding mechanisms of antimicrobial resistance transfer is key to devising methods for controlling it. A few studies indicate that bacteriophages are involved in the transfer of this gene, but the type of phages involved and the degree of involvement remain to be explored. Our work has been able to identify the class of phages and the magnitude of involvement in the dissemination of this gene. © 2018 The Society for Applied Microbiology
  • Item
    GPUPeP: Parallel Enzymatic Numerical P System simulator with a Python-based interface
    (Elsevier Ireland Ltd, 2020) Raghavan, S.; Rai, S.S.; Rohit, M.P.; Chandrasekaran, K.
    Membrane computing is a computational paradigm inspired by the structure and behavior of a living cell. P Systems are the computing devices that are used to realize membrane computing models. Numerous theoretical studies on many variants of P Systems have shown them to be computationally universal. P Systems have a wide range of applications, from modeling of biological processes to image processing. Among the many variants of P Systems, one of the most important is the Enzymatic Numerical P System (ENPS). ENPS is a class of P System in which membranes operate on numerical values. A few simulators have been developed to realize the power of ENPS, each with its own advantages and disadvantages. Here, a GPU-based simulator using Python as the user interaction language is developed. This tool is a completely parallel variant, compatible with a Python-based sequential simulator (PeP), which was the first Python-based work for ENPS. The developed simulator uses CUDA to interact with the GPU and gives the desired speed-up while processing the membranes. Two important case studies show the performance of the developed tool to be far better than that of the other serial simulators. © 2020 Elsevier B.V.
  • Item
    Enhanced protein structural class prediction using effective feature modeling and ensemble of classifiers
    (Institute of Electrical and Electronics Engineers Inc., 2021) Bankapur, S.; Patil, N.
    Protein Secondary Structural Class (PSSC) information is important in investigating further challenges of protein sequences like protein fold recognition, protein tertiary structure prediction, and analysis of protein functions for drug discovery. Identification of PSSC using biological methods is time-consuming and cost-intensive. Several computational models have been developed to predict the structural class; however, they lack generalization. Hence, predicting PSSC based on protein sequences is still proving to be an uphill task. In this article, we propose an effective, novel and generalized prediction model consisting of feature modeling and an ensemble of classifiers. The proposed feature modeling extracts discriminating information (features) by leveraging three techniques: (i) Embedding: features are extracted on the basis of spatial residue arrangements of the sequences using word embedding approaches; (ii) SkipXGram Bi-gram: various sets of skipped bi-gram features are extracted from the sequences; and (iii) General Statistical (GS) based features, which cover the global information of structural sequences. The combined effective sets of features are trained and classified using an ensemble of three classifiers: Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machines (GBM). The proposed model, when assessed on five benchmark datasets (high and low sequence similarity), viz. z277, z498, 25PDB, 1189, and FC699, reported an overall accuracy of 93.55, 97.58, 81.82, 81.11, and 93.93 percent respectively. The proposed model is further validated on a large-scale updated low similarity (≤25%) dataset, where it achieved an overall accuracy of 81.11 percent. The proposed generalized model is robust and consistently outperformed several state-of-the-art models on all five benchmark datasets. © 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
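The skipped bi-gram features mentioned in the abstract above can be illustrated with a short sketch. The abstract does not define SkipXGram precisely, so this shows one common reading, stated here as an assumption: count pairs of residues separated by a fixed number of skipped positions. The function name and the example sequence are hypothetical.

```python
from collections import Counter


def skip_bigram_counts(sequence, skip):
    """Count residue pairs separated by `skip` intervening positions.

    skip=0 gives ordinary adjacent bi-grams. This interpretation of
    "skipped bi-gram" is an assumption, not the paper's definition.
    """
    if skip < 0:
        raise ValueError("skip must be non-negative")
    step = skip + 1
    return Counter(
        sequence[i] + sequence[i + step]
        for i in range(len(sequence) - step)
    )


# Hypothetical four-residue sequence with one skipped position.
counts = skip_bigram_counts("ACDA", skip=1)
```

Running the function for several `skip` values and concatenating the resulting counts would yield "various sets" of such features, which is how the abstract's wording is read here.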