Faculty Publications

Permanent URI for this communityhttps://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Browse

Search Results

Now showing 1 - 7 of 7

Machine Learning-Based Malware Detection andÂ Classification inÂ Encrypted TLS Traffic
(Springer Science and Business Media Deutschland GmbH, 2023) Kashyap, H.; Pais, A.R.; Kondaiah, C.
Malware has become a significant threat to Internet users in the modern digital era. Malware spreads quickly and poses a significant threat to cyber security. As a result, network security measures play an important role in countering these cyber threats. Existing malware detection techniques are unable to detect them effectively. A novel Ensemble Machine Learning (ML)-based malware detection technique from Transport Layer Security (TLS)-encrypted traffic without decryption is proposed in this paper. The features are extracted from TLS traffic. Based on the extracted features, malware detection is performed using Ensemble ML algorithms. The benign and malware file datasets are created using features extracted from TLS traffic. According to the experimental results, the 65 new extracted features perform well in detecting malware from encrypted traffic. The proposed method achieves an accuracy of 99.85% for random forest and 97.43% for multiclass classification for identifying malware families. The ensemble model achieved an accuracy of 99.74% for binary classification and 97.45% for multiclass classification. Â© 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Machine learning models for phishing detection from TLS traffic
(Springer, 2023) Kumar, M.; Kondaiah, C.; Pais, A.R.; Rao, R.S.
Phishing is a fraudulent tactic for attackers to obtain victims personal information, such as passwords, account details, credit card details, and other sensitive information. Existing anti-phishing detection methods using at the application layer and cannot be applied at the transport layer. A novel machine learning (ML) based phishing detection technique from transport layer security (TLS) 1.2 and TLS 1.3 encrypted traffic without decryption is proposed in this paper. Our proposed model detects phishing URLs at the transport layer and classifies them as legitimate or phishing. The features are extracted from TLS 1.2 and TLS 1.3 traffic, and phishing detection is performed using ML algorithms based on the extracted features. The datasets for legitimate and phishing sites are created using features derived from TLS 1.2 and TLS 1.3 traffic. According to the experimental results, the proposed model effectively detects phishing URLs in encrypted traffic. The proposed model achieves an accuracy of 93.63% for Random Forest (RF), 95.07% for XGBoost (XGB), and the highest accuracy of 95.40% for Light GBM (LGBM). © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
Enhanced Malicious Traffic Detection in Encrypted Communication Using TLS Features and a Multi-class Classifier Ensemble
(Springer, 2024) Kondaiah, C.; Pais, A.R.; Rao, R.S.
The use of encryption for network communication leads to a significant challenge in identifying malicious traffic. The existing malicious traffic detection techniques fail to identify malicious traffic from the encrypted traffic without decryption. The current research focuses on feature extraction and malicious traffic classification from the encrypted network traffic without decryption. In this paper, we propose an ensemble model using Deep Learning (DL), Machine Learning (ML), and self-attention-based methods. Also, we propose novel TLS features extracted from the network and perform experimentation on the ensemble model. The experimental results demonstrated that the ML-based (RF, LGBM, XGB) ensemble model achieved a significant accuracy of 94.85% whereas the other ensemble model using RF, LSTM, and Bi-LSTM with self-attention technique achieved an accuracy of 96.71%. To evaluate the efficacy of our proposed models, we curated datasets encompassing both phishing, legitimate and malware websites, leveraging features extracted from TLS 1.2 and 1.3 traffic without decryption. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
An ensemble learning approach for detecting phishing URLs in encrypted TLS traffic
(Springer, 2024) Kondaiah, C.; Pais, A.R.; Rao, R.S.
Phishing is a fraudulent method used by hackers to acquire confidential data from victims, including security passwords, bank account details, debit card data, and other sensitive data. Owing to the increase in internet users, the corresponding network attacks have also grown over the last decade. Existing phishing detection methods are implemented for the application layer and are not effectively adapted to the transport layer. In this paper, we propose a novel phishing detection method that extends beyond traditional approaches by utilizing a multi-model ensemble of deep neural networks, long short term memory, and Random Forest classifiers. Our approach is distinguished by its unique feature extraction from transport layer security (TLS) 1.2 and 1.3 network traffic and the application of advanced deep learning algorithms to enhance phishing detection capabilities. To assess the effectiveness of our model, we curated datasets that include both phishing and legitimate websites, using features derived from TLS 1.2 and 1.3 traffic. The experimental results show that our proposed model achieved a classification accuracy of 99.61%, a precision of 99.80%, and a Matthews Correlation Coefficient of 99.22% on an in-house dataset. Our model excels at detecting phishing Uniform Resource Locator at the transport layer without data decryption. It is designed to block phishing attacks at the network gateway or firewall level. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
TrackPhish: A Multi-Embedding Attention-Enhanced 1D CNN Model for Phishing URL Detection
(Institute of Electrical and Electronics Engineers Inc., 2025) Kondaiah, C.; Pais, A.R.; Rao, R.S.
Phishing attacks are a growing threat to online security, with increasingly sophisticated and frequent tactics. This rise in cyber threats underscores the need for advanced detection methods. While the Internet is crucial for modern communication and commerce, it also exposes users to risks such as phishing, spamming, malware, and performance degradation attacks. Among these, malicious URLs, commonly embedded in static links within emails and websites, are a significant challenge in identifying and mitigating these attacks. This study proposes TrackPhish, a novel lightweight application that predicts URL legitimacy without visiting the associated website. The proposed model combines traditional word embeddings (Word2Vec, FastText, GloVe) with transformer models (BERT, RoBERTa, GPT-2) to create a comprehensive feature set fed into a Deep Learning (DL) model for detecting phishing URLs. The integration of these embeddings captures semantic relationships and contextual understanding of the text, generating a robust feature set enhanced by an attention mechanism to choose relevant features. The refined features are then used to train a One-Dimensional Convolutional Neural Network (1D CNN) model for phishing URL detection. The proposed model offers key advantages over existing methods, including independence from third-party features, adaptability for client-side deployment, and target-independent detection. Experimental results demonstrate the model’s effectiveness, achieving 95.41% accuracy with a low false positive rate of 1.44% on our dataset and an impressive 98.55% accuracy on benchmark datasets, outperforming existing baseline models. The proposed model represents a significant advancement over traditional methods, enhancing online security against phishing URLs. © 2005-2012 IEEE.
GraPhish: A graph-based approach for phishing detection from encrypted TLS traffic
(Elsevier Ltd, 2025) Manguli, K.; Kondaiah, C.; Pais, A.R.; Rao, R.S.
Phishing has increased substantially over the last few years, with cybercriminals deceiving users via spurious websites or confusing mails to steal confidential data like username and password. Even with browser-integrated security indicators like HTTPS prefixes and padlock symbols, new phishing strategies have circumvented these security features. This paper proposes GraPhish, a novel graph-based phishing detection framework that leverages encrypted TLS traffic features. We constructed an in-house dataset and proposed an effective method for graph generation based solely on TLS-based features. Our model performs better than traditional machine learning algorithms. GraPhish achieved an accuracy of 94.82%, a precision of 96.28%, a recall of 92.11%, and an improved AUC-ROC score of 98.29%. © 2025 Elsevier Ltd
A hybrid super learner ensemble for phishing detection on mobile devices
(Nature Research, 2025) Rao, R.S.; Kondaiah, C.; Pais, A.R.; Lee, B.
In today’s digital age, the rapid increase in online users and massive network traffic has made ensuring security more challenging. Among the various cyber threats, phishing remains one of the most significant. Phishing is a cyberattack in which attackers steal sensitive information, such as usernames, passwords, and credit card details, through fake web pages designed to mimic legitimate websites. These attacks primarily occur via emails or websites. Several antiphishing techniques, such as blacklist-based, source code analysis, and visual similarity-based methods, have been developed to counter phishing websites. However, these methods have specific limitations, including vulnerability to zero-day attacks, susceptibility to drive-by-downloads, and high detection latency. Furthermore, many of these techniques are unsuitable for mobile devices, which face additional constraints, such as limited RAM, smaller screen sizes, and lower computational power. To address these limitations, this paper proposes a novel hybrid super learner ensemble model named Phish-Jam, a mobile application specifically designed for phishing detection on mobile devices. Phish-Jam utilizes a super learner ensemble that combines predictions from diverse Machine Learning (ML) algorithms to classify legitimate and phishing websites. By focusing on extracting features from URLs, including handcrafted features, transformer-based text embeddings, and other Deep Learning (DL) architectures, the proposed model offers several advantages: fast computation, language independence, and robustness against accidental malware downloads. From the experimental analysis, it is observed that the super learner ensemble achieved significant accuracy of 98.93%, precision of 99.15%, MCC of 97.81% and F1 Score of 99.07%. © The Author(s) 2025.

Faculty Publications

Browse

Filters

Settings

Sort By

Results per page

Search Results