Faculty Publications

Permanent URI for this communityhttps://idr.nitk.ac.in/handle/123456789/18736

Publications by NITK Faculty

Browse

Search Results

Now showing 1 - 5 of 5

PhishDump: A multi-model ensemble based technique for the detection of phishing sites in mobile devices
(Elsevier B.V., 2019) Rao, R.S.; Vaishnavi, T.; Pais, A.R.
Phishing is a technique in which the attackers trick the online users to reveal the sensitive information by creating the phishing sites which look similar to that of legitimate sites. There exist many techniques to detect phishing sites in desktop computers. In recent years, the number of mobile users accessing the web has increased which lead to a rise in the number of attacks in mobile devices. Existing techniques designed for desktop computers may not be suitable for mobile devices due to their hardware limitations such as RAM, Screen size, low computational power etc. In this paper, we propose a mobile application named PhishDump to classify the legitimate and phishing websites in mobile devices. PhishDump is based on the multi-model ensemble of Long Short Term Memory (LSTM) and Support Vector Machine (SVM) classifier. As PhishDump focuses on the extraction of features from URL, it has several advantages over existing works such as fast computation, language independence and robust to accidental download of malwares. From the experimental analysis, we observed that our proposed multi-model ensemble outperformed traditional LSTM character and word-level models. PhishDump performed better than the existing baseline models with an accuracy of 97.30% on our dataset and 98.50% on the benchmark dataset. © 2019 Elsevier B.V.
Two level filtering mechanism to detect phishing sites using lightweight visual similarity approach
(Springer, 2020) Rao, R.S.; Pais, A.R.
The visual similarity-based techniques detect the phishing sites based on the similarity between the suspicious site and the existing database of resources such as screenshots, styles, logos, favicons etc. These techniques fail to detect phishing sites which target non-whitelisted legitimate domain or when phishing site with manipulated whitelisted legitimate content is encountered. Also, these techniques are not well adaptable at the client-side due to their computation and space complexity. Thus there is a need for light weight visual similarity-based technique detecting phishing sites targeting non-whitelisted legitimate resources. Unlike traditional visual similarity-based techniques using whitelists, in this paper, we employed a light-weight visual similarity based blacklist approach as a first level filter for the detection of near duplicate phishing sites. For the non-blacklisted phishing sites, we have incorporated a heuristic mechanism as a second level filter. We used two fuzzy similarity measures, Simhash and Perceptual hash for calculating the similarity score between the suspicious site and existing blacklisted phishing sites. Each similarity measure generates a unique fingerprint for a given website and also differs with less number of bits with a similar website. All three fingerprints together represent a website which undergoes blacklist filtering for the identification of the target website. The phishing sites which bypassed from the first level filter undergo second level heuristic filtering. We used comprehensive heuristic features including URL and source code based features for the detection of non-blacklisted phishing sites. The experimental results demonstrate that the blacklist filter alone is able to detect 55.58% of phishing sites which are either replicas or near duplicates of existing phishing sites. We also proposed an ensemble model with Random Forest (RF), Extra-Tree and XGBoost to evaluate the contribution of both blacklist and heuristic filters together as an entity and the model achieved a significant accuracy of 98.72% and Matthews Correlation Coefficient (MCC) of 97.39%. The proposed model is deployed as a chrome extension named as BlackPhish to provide real time protection against phishing sites at the client side. We also compared BlackPhish with the existing anti-phishing techniques where it outperformed existing works with a significant difference in accuracy and MCC. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
A Boosting-Based Hybrid Feature Selection and Multi-Layer Stacked Ensemble Learning Model to Detect Phishing Websites
(Institute of Electrical and Electronics Engineers Inc., 2023) Lakshmana Rao, L.R.; Rao, R.S.; Pais, A.R.; Gabralla, L.A.
Phishing is a type of online scam where the attacker tries to trick you into giving away your personal information, such as passwords or credit card details, by posing as a trustworthy entity like a bank, email provider, or social media site. These attacks have been around for a long time and unfortunately, they continue to be a common threat. In this paper, we propose a boosting based multi layer stacked ensemble learning model that uses hybrid feature selection technique to select the relevant features for the classification. The dataset with selected features are sent to various classifiers at different layers where the predictions of lower layers are fed as input to the upper layers for the phishing detection. From the experimental analysis, it is observed that the proposed model achieved an accuracy ranging from 96.16 to 98.95% without feature selection across different datasets and also achieved an accuracy ranging from 96.18 to 98.80% with feature selection. The proposed model is compared with baseline models and it has outperformed the existing models with a significant difference. © 2013 IEEE.
Enhanced Malicious Traffic Detection in Encrypted Communication Using TLS Features and a Multi-class Classifier Ensemble
(Springer, 2024) Kondaiah, C.; Pais, A.R.; Rao, R.S.
The use of encryption for network communication leads to a significant challenge in identifying malicious traffic. The existing malicious traffic detection techniques fail to identify malicious traffic from the encrypted traffic without decryption. The current research focuses on feature extraction and malicious traffic classification from the encrypted network traffic without decryption. In this paper, we propose an ensemble model using Deep Learning (DL), Machine Learning (ML), and self-attention-based methods. Also, we propose novel TLS features extracted from the network and perform experimentation on the ensemble model. The experimental results demonstrated that the ML-based (RF, LGBM, XGB) ensemble model achieved a significant accuracy of 94.85% whereas the other ensemble model using RF, LSTM, and Bi-LSTM with self-attention technique achieved an accuracy of 96.71%. To evaluate the efficacy of our proposed models, we curated datasets encompassing both phishing, legitimate and malware websites, leveraging features extracted from TLS 1.2 and 1.3 traffic without decryption. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
An ensemble learning approach for detecting phishing URLs in encrypted TLS traffic
(Springer, 2024) Kondaiah, C.; Pais, A.R.; Rao, R.S.
Phishing is a fraudulent method used by hackers to acquire confidential data from victims, including security passwords, bank account details, debit card data, and other sensitive data. Owing to the increase in internet users, the corresponding network attacks have also grown over the last decade. Existing phishing detection methods are implemented for the application layer and are not effectively adapted to the transport layer. In this paper, we propose a novel phishing detection method that extends beyond traditional approaches by utilizing a multi-model ensemble of deep neural networks, long short term memory, and Random Forest classifiers. Our approach is distinguished by its unique feature extraction from transport layer security (TLS) 1.2 and 1.3 network traffic and the application of advanced deep learning algorithms to enhance phishing detection capabilities. To assess the effectiveness of our model, we curated datasets that include both phishing and legitimate websites, using features derived from TLS 1.2 and 1.3 traffic. The experimental results show that our proposed model achieved a classification accuracy of 99.61%, a precision of 99.80%, and a Matthews Correlation Coefficient of 99.22% on an in-house dataset. Our model excels at detecting phishing Uniform Resource Locator at the transport layer without data decryption. It is designed to block phishing attacks at the network gateway or firewall level. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Faculty Publications

Browse

Filters

Settings

Sort By

Results per page

Search Results