Journal Articles

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/19884

  • Item
    Antibiofouling hollow-fiber membranes for dye rejection by embedding chitosan and silver-loaded chitosan nanoparticles
    (Springer Verlag, 2019) Kolangare, I.M.; Isloor, A.M.; Zulhairun, Z.A.; Kulal, A.; A.F., A.F.; Siddique, I.; Asiri, A.M.
    The removal of toxic dyes from wastewater and industrial effluents is a major environmental challenge. Various techniques have been employed for the removal of dyes, including the application of nano-sized adsorbents, nanocomposite membranes and photodegradation. Membrane filtration is an alternative but suffers from drawbacks such as fouling. Here we present a simple approach for the development of antibiofouling membranes based on chitosan. The application of chitosan-based nanoparticles as additives for wastewater treatment is poorly explored. The chitosan and silver-loaded chitosan nanoparticles were synthesized by the ionic gelation method and incorporated to fabricate hollow-fiber membranes by the dry–wet spinning technique. The prepared membranes were characterized by morphological study, permeability test, antibiofouling study and dye rejection study. The nanocomposite hollow-fiber membranes displayed better performance than their pristine form. The incorporation of 0.30 weight percent of the chitosan and silver-loaded chitosan nanoparticles into the hollow-fiber membranes enhanced the antifouling property, with flux recovery ratios of 81.21 and 86.13%, respectively. The dye rejection results showed maximum rejection of 89.27 and 86.04% for Reactive Black 5 and Reactive Orange 16, respectively. Hence, it can be concluded that hollow-fiber membranes with silver-loaded chitosan nanoparticles are pertinent in developing antibiofouling membranes for the treatment of industrial dye effluents. © 2018, Springer Nature Switzerland AG.
  • Item
    Enhanced protein structural class prediction using effective feature modeling and ensemble of classifiers
    (Institute of Electrical and Electronics Engineers Inc., 2021) Bankapur, S.; Patil, N.
    Protein Secondary Structural Class (PSSC) information is important in investigating further challenges of protein sequences like protein fold recognition, protein tertiary structure prediction, and analysis of protein functions for drug discovery. Identification of PSSC using biological methods is time-consuming and cost-intensive. Several computational models have been developed to predict the structural class; however, they lack generalization. Hence, predicting PSSC based on protein sequences is still proving to be an uphill task. In this article, we proposed an effective, novel and generalized prediction model consisting of feature modeling and an ensemble of classifiers. The proposed feature modeling extracts discriminating information (features) by leveraging three techniques: (i) Embedding – features are extracted on the basis of spatial residue arrangements of the sequences using word embedding approaches; (ii) SkipXGram Bi-gram – various sets of skipped bi-gram features are extracted from the sequences; and (iii) General Statistical (GS) based features are extracted, which cover the global information of structural sequences. The combined effective sets of features are trained and classified using an ensemble of three classifiers: Support Vector Machine (SVM), Random Forest (RF), and Gradient Boosting Machines (GBM). The proposed model, when assessed on five benchmark datasets (high and low sequence similarity), viz. z277, z498, 25PDB, 1189, and FC699, reported an overall accuracy of 93.55, 97.58, 81.82, 81.11, and 93.93 percent, respectively. The proposed model is further validated on a large-scale updated low similarity (≤25%) dataset, where it achieved an overall accuracy of 81.11 percent. The proposed generalized model is robust and consistently outperformed several state-of-the-art models on all five benchmark datasets. © 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
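As a rough illustration of the skipped bi-gram idea mentioned in the abstract above, the following minimal sketch counts residue pairs separated by a fixed number of intervening positions and normalizes them into a 400-dimensional vector. The function names, the 20-letter alphabet constant, and the normalization are assumptions made for illustration; the paper's actual SkipXGram feature definition may differ.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def skip_bigram_counts(sequence, skip):
    """Count residue pairs that have `skip` residues between them.

    skip=0 gives ordinary adjacent bi-grams.
    """
    step = skip + 1
    return Counter(
        sequence[i] + sequence[i + step]
        for i in range(len(sequence) - step)
    )


def skip_bigram_vector(sequence, skip):
    """Turn the counts into a fixed-length (20 x 20 = 400) frequency vector."""
    counts = skip_bigram_counts(sequence, skip)
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[a + b] / total for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    seq = "ACDA"  # hypothetical toy sequence
    print(skip_bigram_counts(seq, 0))
    print(skip_bigram_counts(seq, 1))
    print(len(skip_bigram_vector(seq, 1)))
```

Vectors for several skip values could be concatenated before being passed to a classifier; that choice is also an assumption here rather than something the abstract specifies.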
  • Item
    Application of word embedding and machine learning in detecting phishing websites
    (Springer, 2022) Rao, R.S.; Umarekar, A.; Pais, A.R.
    Phishing is an attack whose aim is to gain personal information such as passwords, credit card details, etc. from online users by deceiving them through fake websites, emails or any legitimate internet service. There exist many techniques to detect phishing sites, such as third-party based techniques, source code based methods and URL based methods, but users are still getting trapped into revealing their sensitive information. In this paper, we propose a new technique which detects phishing sites with word embeddings using plain text and domain specific text extracted from the source code. We applied various word embeddings for the evaluation of our model using ensemble and multimodal approaches. From the experimental evaluation, we observed that multimodal with domain specific text achieved a significant accuracy of 99.34% with TPR of 99.59%, FPR of 0.93%, and MCC of 98.68%. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
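For readers unfamiliar with the metrics quoted in the abstract above (accuracy, TPR, FPR, MCC), the sketch below computes them from a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate (recall)
    fpr = fp / (fp + tn)  # false positive rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts: 100 phishing pages and 100 legitimate pages.
    print(binary_metrics(tp=90, fp=5, tn=95, fn=10))
```

MCC ranges from -1 to 1; reporting it as a percentage, as the abstract does, amounts to multiplying by 100.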
  • Item
    Classification of Phishing Email Using Word Embedding and Machine Learning Techniques
    (River Publishers, 2022) Somesha, M.; Pais, A.R.
    Email phishing is a cyber-attack, bringing substantial financial damage to corporate and commercial organizations. A phishing email is a special type of spamming, used to trick the user into disclosing personal information to access his digital assets. A phishing attack is generally triggered by emailing links to spoofed websites that collect sensitive information. The APWG survey suggests that the existing countermeasures remain ineffective and insufficient for detecting phishing attacks. Hence there is a need for an efficient mechanism to detect phishing emails to provide better security against such attacks to the common user. The existing open-source data sets are limited in diversity, hence they do not capture the real picture of the attack. Hence there is a need for a real-time input data set to design accurate email anti-phishing solutions. In the current work, we created a real-time in-house corpus of phishing and legitimate emails and propose efficient techniques to detect phishing emails using word embedding and machine learning algorithms. The proposed system uses only four email header-based heuristics for the classification of emails. The proposed word embedding cum machine learning framework comprises six word embedding techniques with five machine learning classifiers to evaluate the best performing combination. Among all six combinations, Random Forest consistently performed the best with FastText (CBOW) by achieving an accuracy of 99.50% with a false positive rate of 0.053%, TF-IDF achieved an accuracy of 99.39% with a false positive rate of 0.4% and Count Vectorizer achieved an accuracy of 99.18% with a false positive rate of 0.98%, respectively, for the three datasets used. © 2022 River Publishers.
  • Item
    Multi-layer perceptron based fake news classification using knowledge base triples
    (Springer, 2023) Srinivasa, K.; Santhi Thilagam, P.S.
    Recent attempts to detect fake news have relied on the implementation of machine or deep learning models that have been trained on text. These models, on the other hand, are insufficient for classifying knowledge base facts or triples as fake or true. However, it is critical to assess the credibility of facts before they are included in the knowledge base. Hence, this paper suggests using a Multi-layer Perceptron to categorize a given triple as fake or true. Furthermore, extant works embed the features using either frequency or prediction based word embedding models, and thus both document and word level features are not captured. To address this issue, a data modeling approach is proposed that vectorizes the triples using two cutting-edge word embedding models, Word2Vec and GloVe, as well as TF-IDF and Count Vectorizer. Empirical results show that the Multi-layer Perceptron with GloVe and Count Vectorizer outperforms the baseline model in terms of accuracy. Moreover, named entity tags associated with the entities, such as PERSON, add an extra feature for training the models. As a result, an algorithm that jointly extracts the triples along with named entity tags is also proposed. Experiments demonstrated that models trained on triples with named entity tags produce high accuracy. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
  • Item
    NORD: NOde Ranking-based efficient virtual network embedding over single Domain substrate networks
    (Elsevier B.V., 2023) Keerthan Kumar, T.G.; Addya, S.K.; Satpathy, A.; Koolagudi, S.G.
    Network virtualization (NV) allows the service providers (SPs) to partition the substrate resources in the form of isolated virtual networks (VNs) comprising multiple correlated virtual machines (VMs) and virtual links (VLs), capturing the dependencies. Though NV brought about multiple benefits, such as service isolation, improved quality-of-service (QoS), secure communication, and better utilization of substrate resources, it also introduced numerous research challenges. In this regard, one of the predominant challenges is assigning resources to the virtual components, i.e., VMs and VLs, also termed virtual network embedding (VNE). VNE comprises two closely related sub-problems, (i.) VM embedding and (ii.) VL embedding, and both problems have been demonstrated to be NP-Hard. In the context of VNE, maximizing the revenue to cost ratio remains the focal point for the SPs as it not only boosts acceptance of VNRs but also effectively utilizes the substrate resources. However, the existing literature on VNE suffers from the following pitfalls: (i.) They only consider system resources or (ii.) limited topological attributes. However, both attributes are quintessential in accurately capturing the VNRs and the substrate network dependencies, thereby augmenting the revenue to cost ratio. This paper proposes an efficient VNE strategy called NOde Ranking-based efficient virtual network embedding over single Domain substrate networks (NORD) to maximize the revenue to cost ratio. To address the problem of VM embedding, NORD utilizes a hybrid entropy and the technique for order of preference by similarity to ideal solution (TOPSIS) based ranking strategy for VMs and servers considering both system and topological attributes that effectively capture the dependencies. Once the ranking is generated, a greedy VM embedding followed by shortest path VL embedding completes the assignment. Simulation results confirm that NORD attains a 40% and 61% increment in average acceptance and revenue-to-cost ratios compared to the baselines. © 2023 Elsevier B.V.
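The entropy-plus-TOPSIS ranking named in the abstract above can be sketched generically as follows. This is a textbook version of entropy weighting and TOPSIS, not NORD's actual formulation: the attribute set (CPU, bandwidth, node degree), the server names, and the treatment of every attribute as a "larger is better" criterion are all assumptions for illustration.

```python
import math


def entropy_weights(matrix):
    """Entropy weighting: attributes that vary more across nodes get more weight."""
    n, n_criteria = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n_criteria):
        total = sum(row[j] for row in matrix)
        probs = [row[j] / total for row in matrix]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(1.0 - entropy)
    s = sum(diversities)
    if s <= 1e-12:  # all attributes uniform: fall back to equal weights
        return [1.0 / n_criteria] * n_criteria
    return [d / s for d in diversities]


def topsis_scores(matrix, weights):
    """Closeness of each row to the ideal solution (all criteria are benefits)."""
    k = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(k)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(k)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(k)]
    worst = [min(row[j] for row in weighted) for j in range(k)]
    scores = []
    for row in weighted:
        d_best, d_worst = math.dist(row, best), math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst) if d_best + d_worst else 0.0)
    return scores


def rank_nodes(attributes):
    """Return node names ordered from highest to lowest TOPSIS score."""
    names = list(attributes)
    matrix = [attributes[name] for name in names]
    scores = topsis_scores(matrix, entropy_weights(matrix))
    return [name for _, name in sorted(zip(scores, names), reverse=True)]


if __name__ == "__main__":
    # Hypothetical servers: [CPU units, bandwidth, node degree]
    servers = {"s1": [8, 100, 3], "s2": [4, 50, 2], "s3": [16, 200, 4]}
    print(rank_nodes(servers))
```

A greedy embedding step could then pair the highest-ranked VM with the highest-ranked feasible server; that pairing logic is not shown here and is left as described in the abstract.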
  • Item
    Transfer learning based code-mixed part-of-speech tagging using character level representations for Indian languages
    (Springer Science and Business Media Deutschland GmbH, 2023) Anand Kumar, A.K.; Padannayil, S.K.
    Massive amounts of unstructured content are generated daily on social media platforms like Facebook, Twitter and blogs. Analyzing and extracting useful information from this vast amount of text content is a challenging process. Social media have currently provided extensive opportunities for researchers and practitioners to do adequate research in this area. Most of the text content in social media tends to be either in English or code-mixed regional languages. In a multilingual country like India, code-mixing is the usual fashion witnessed in social media discussions. Multilingual users frequently use Roman script, a convenient mode of expression, instead of the regional language script for posting messages on social media and often mix English into their native languages. Stylistic and grammatical irregularities are significant challenges in processing the code-mixed text using conventional methods. This paper explains the new word embedding via character level representation as features for POS tagging the code-mixed text in Indian languages using the ICON-2015 and ICON-2016 NLP tools contest data sets. The proposed word embedding features are context-appended, and the well-known Support Vector Machine (SVM) classifier has been used to train the system. We have combined the Facebook, Twitter, and WhatsApp code-mixed data of three Indian languages to train the transfer learning based language-independent and source-independent POS tagging. The experimental results demonstrated that the proposed transfer method achieved state-of-the-art accuracy in 12 systems out of 18 systems for the ICON data set. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
  • Item
    Overlapping word removal is all you need: revisiting data imbalance in hope speech detection
    (Taylor and Francis Ltd., 2024) RamakrishnaIyer LekshmiAmmal, H.; Ravikiran, M.; Nisha, G.; Balamuralidhar, N.; Madhusoodanan, A.; Anand Kumar, A.K.; Chakravarthi, B.R.
    Hope speech detection is a new task for finding and highlighting positive comments or supporting content from user-generated social media comments. For this task, we have used a Shared Task multilingual dataset on Hope Speech Detection for Equality, Diversity, and Inclusion (HopeEDI) for three languages: English, code-switched Tamil and Malayalam. In this paper, we present deep learning techniques using context-aware string embeddings for word representations and Recurrent Neural Network (RNN) and pooled document embeddings for text representation. We have evaluated and compared the three models for each language with different approaches. Our proposed methodology works fine and achieved higher performance than baselines. The highest weighted average F-scores of 0.93, 0.58, and 0.84 are obtained on the task organisers' final evaluation test set. The proposed models are outperforming the baselines by 3%, 2% and 11% in absolute terms for English, Tamil and Malayalam respectively. © 2023 Informa UK Limited, trading as Taylor & Francis Group.
  • Item
    Video Captioning using Sentence Vector-enabled Convolutional Framework with Short-Connected LSTM
    (Springer, 2024) Naik, D.; Jaidhar, C.D.
    The principal objective of video/image captioning is to portray the dynamics of a video clip in plain natural language. Captioning is motivated by its ability to make the video more accessible to deaf and hard-of-hearing individuals, to help people focus on and recall information more readily, and to watch it in sound-sensitive locations. The most frequently utilized design paradigm is the revolutionary structurally improved encoder-decoder configuration. Recent developments emphasize the utilization of various creative structural modifications to maximize efficiency while demonstrating their viability in real-world applications. The utilization of well-known and well-researched technological advancements such as deep Convolutional Neural Networks (CNNs) and Sentence Transformers is trending in encoder-decoders. This paper proposes an approach for efficiently captioning videos using CNN and a short-connected LSTM-based encoder-decoder model blended with a sentence context vector. This sentence context vector emphasizes the relationship between the video and text spaces. Inspired by the human visual system, the attention mechanism is utilized to selectively concentrate on the context of the important frames. Also, a contextual hybrid embedding block is presented for connecting the two vector spaces generated during the encoding and decoding stages. The proposed architecture is investigated through well-known CNN architectures and various word embeddings. It is assessed using two benchmark video captioning datasets, MSVD and MSR-VTT, considering standard evaluation metrics such as BLEU, METEOR, ROUGE, and CIDEr. In accordance with experimental exploration, when the proposed model with NASNet-large alone is viewed across all three embeddings, the BERT findings on MSVD Dataset performed better than the results obtained with the other two embeddings. Inception-v4 outperformed VGG-16, ResNet-152, and NASNet-Large for feature extraction. Considering word embedding initiatives, BERT is far superior to ELMo and GloVe based on the MSR-VTT dataset. © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
  • Item
    The Effect of Phrase Vector Embedding in Explainable Hierarchical Attention-Based Tamil Code-Mixed Hate Speech and Intent Detection
    (Institute of Electrical and Electronics Engineers Inc., 2024) Sharmila Devi, V.S.; Subramanian, S.; Anand Kumar, A.K.
    The substantial growth in social media users has led to a significant increase in code-mixed content on social media platforms. Millions of users on these platforms upload pictures and videos and post comments regarding their recent or exciting activities. Responding to this uploaded content, a few users occasionally use offensive language to insult others or specific groups. Social media platforms encounter challenges identifying and removing hate speech and objectionable content in various languages. Hate speech, in its general sense, refers to harmful posts directed at individuals or groups based on factors such as their sexuality, religion, community affiliation, disability, and others. Typically, offensive language is directly or indirectly utilized in hate speech posts to insult someone, causing psychological distress to users. In light of this, we propose developing a system to automatically block, remove, or report posts written in code-mixed Tamil containing hate speech. We have gathered code-mixed Tamil comments from Twitter and the Helo App, categorizing them as hate speech and classifying their intent. We have identified three categories of hate speech intent, namely Targeted Individual (TI), Targeted Group (TG), and Others (O). The Targeted Individual (TI) class encompasses posts aimed at a specific individual target. At the same time, the Targeted Group (TG) category primarily focuses on identifying people based on their religion, community, gender, and other characteristics. The Others (O) category encompasses untargeted offensive posts and other posts containing offensive language. In this context, we propose using a phrase-based, Explainable Hierarchical Attention model for hate speech detection. The results demonstrate that the proposed method is more effective in identifying and explaining hate speech and offensive language in social media posts. © 2013 IEEE.