CatchPhish: detection of phishing websites by inspecting URLs

Date

2020

Authors

Rao, R.S.
Vaishnavi, T.
Pais, A.R.

Abstract

Many existing anti-phishing techniques use source code-based features and third-party services to detect phishing sites. These techniques have limitations, one of which is that they fail to handle drive-by downloads. Their reliance on third-party services for detecting phishing URLs also delays the classification process. Hence, in this paper, we propose a lightweight application, CatchPhish, which predicts the legitimacy of a URL without visiting the website. The proposed technique extracts hostname, full-URL, and Term Frequency-Inverse Document Frequency (TF-IDF) features, together with phish-hinted words, from the suspicious URL and classifies it using a Random Forest classifier. With TF-IDF features alone, the proposed model achieved an accuracy of 93.25% on our dataset. Combining TF-IDF with hand-crafted features achieved an accuracy of 94.26% on our dataset and accuracies of 98.25% and 97.49% on benchmark datasets, which is much better than the existing baseline models. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
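
The URL-only feature extraction described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the `PHISH_HINTS` word list, the tokenization rule, the four hand-crafted features, and the plain `tf * log(N / df)` weighting are all assumptions chosen for the example, and the Random Forest training step is omitted.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical hint words for illustration only; the paper's actual list is not given here.
PHISH_HINTS = {"login", "secure", "verify", "account", "update", "signin", "banking"}


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed tokenization)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def handcrafted_features(url):
    """Lexical features computed from the URL string alone, without visiting the site."""
    host = urlparse(url).hostname or ""
    tokens = tokenize(url)
    return {
        "url_length": len(url),
        "host_length": len(host),
        "host_dots": host.count("."),
        "hint_words": sum(t in PHISH_HINTS for t in tokens),
    }


def tfidf(urls):
    """Plain TF-IDF per URL: tf = count / tokens in URL, idf = log(N / document frequency)."""
    docs = [tokenize(u) for u in urls]
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    weights = []
    for d in docs:
        counts = Counter(d)
        weights.append({t: (c / len(d)) * math.log(n / df[t]) for t, c in counts.items()})
    return weights


urls = [
    "http://paypal.com.secure-login.example.net/verify",
    "https://www.example.org/about",
]
features = [handcrafted_features(u) for u in urls]
weights = tfidf(urls)
```

In a full pipeline, the hand-crafted feature values and TF-IDF weights would be combined into one vector per URL and passed to a classifier such as a Random Forest; tokens that occur in every URL (here `example`) receive zero weight under this weighting.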

Citation

Journal of Ambient Intelligence and Humanized Computing, 2020, Vol. 11, No. 2, pp. 813-825
