Phishing Email and URL Detection using Machine learning and Deep learning

M, Somesha

Files

187105-CO004-Somesha M.pdf (8.74 MB)

Date

2023

Authors

M, Somesha

Publisher

National Institute Of Technology Karnataka Surathkal

Abstract

This research thesis addresses email phishing, which poses a serious risk to businesses and corporations. Through social engineering strategies, email phishing attacks persuade users to divulge personal data that can be exploited to access their digital assets. Despite the presence of several defenses, the Anti-Phishing Working Group survey reveals that current approaches to phishing attack detection remain insufficient and ineffective. This underlines the need for a more effective system that identifies phishing emails and offers the end user greater protection against such attacks.

Many machine learning based techniques exist to detect phishing emails, but they rely on a large number of heuristics to classify an email. To overcome the disadvantages of existing schemes, we have presented an efficient word embedding cum machine learning framework to classify emails. The presented technique uses only four email header based heuristics (i.e., From, Return-path, Subject, and Message-ID). The model achieved an accuracy of 99.50% using the FastText-CBOW algorithm in combination with the Random Forest classifier.

Although machine learning based techniques achieved high accuracy, it is advisable to use deep learning models whenever sufficient data is available. We have presented an efficient deep learning model called "DeepEPhishNet" for the classification of emails. The presented model, based on FastText-SkipGram with a Deep Neural Network (DNN), achieved an accuracy of 99.52%, TPR of 99.38%, TNR of 99.92%, F-Score of 99.68%, Precision of 99.97%, and MCC of 98.71%.

The above methods use only four email header based heuristics for classification. To study the contribution of the email body to the detection of phishing emails, we have presented an efficient model using transformers. The presented model achieved an accuracy of 99.51% on open source datasets.

The body of an email might contain phishing URLs, which may lead to a phishing attack. To address this, we have presented an efficient deep learning based model for phishing URL detection. The accuracies achieved by the DNN, LSTM, and CNN models are 99.52%, 99.57%, and 99.43%, respectively.

Overall, this research thesis presents efficient techniques for detecting phishing emails and URLs using word embedding, deep learning, and machine learning classifiers.
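As an illustration of the four header based heuristics named in the abstract (From, Return-path, Subject, and Message-ID), the following is a minimal sketch of how those fields might be pulled from a raw email using Python's standard `email` module. It is not the thesis implementation: the function names `extract_header_fields` and `header_text`, the `SAMPLE` message, and the choice to join the fields into a single string for a later word-embedding step are all assumptions made for this example.

```python
from email import message_from_string

# The four header fields named in the abstract.
HEADER_FIELDS = ("From", "Return-Path", "Subject", "Message-ID")


def extract_header_fields(raw_email: str) -> dict:
    """Return the four header fields as strings; missing fields become ''."""
    message = message_from_string(raw_email)
    return {name: str(message.get(name, "")) for name in HEADER_FIELDS}


def header_text(raw_email: str) -> str:
    """Join the four fields into one string (assumed input to an embedding step)."""
    fields = extract_header_fields(raw_email)
    return " ".join(fields[name] for name in HEADER_FIELDS)


# Hypothetical sample message, for illustration only.
SAMPLE = (
    "From: Support <support@example.com>\n"
    "Return-Path: <bounce@example.net>\n"
    "Subject: Verify your account\n"
    "Message-ID: <abc123@example.net>\n"
    "\n"
    "Please click the link.\n"
)
```

Calling `extract_header_fields(SAMPLE)` returns a dictionary keyed by the four field names. In a pipeline like the one the abstract describes, such text would presumably be embedded and then passed to a classifier, but those steps are outside this sketch.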

URI

https://idr.nitk.ac.in/handle/123456789/17720

Collections

1. Ph.D Theses
