CatchPhish: detection of phishing websites by inspecting URLs

Rao, R.S.; Vaishnavi, T.; Pais, A.R.

dc.contributor.author	Rao, R.S.
dc.contributor.author	Vaishnavi, T.
dc.contributor.author	Pais, A.R.
dc.date.accessioned	2026-02-05T09:29:04Z
dc.date.issued	2020
dc.description.abstract	Many anti-phishing techniques use source code-based features and third-party services to detect phishing sites. These techniques have limitations, one of which is that they fail to handle drive-by downloads. Their reliance on third-party services for detecting phishing URLs also delays classification. Hence, in this paper, we propose a lightweight application, CatchPhish, which predicts URL legitimacy without visiting the website. The proposed technique uses hostname, full URL, Term Frequency-Inverse Document Frequency (TF-IDF) features, and phish-hinted words from the suspicious URL for classification with a random forest classifier. With only TF-IDF features, the proposed model achieved an accuracy of 93.25% on our dataset. Combining TF-IDF and hand-crafted features achieved an accuracy of 94.26% on our dataset and accuracies of 98.25% and 97.49% on benchmark datasets, outperforming the existing baseline models. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
dc.identifier.citation	Journal of Ambient Intelligence and Humanized Computing, 2020, 11, 2, pp. 813-825
dc.identifier.issn	18685137
dc.identifier.uri	https://doi.org/10.1007/s12652-019-01311-4
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/24087
dc.publisher	Springer
dc.subject	Decision trees
dc.subject	Information retrieval systems
dc.subject	Text processing
dc.subject	Websites
dc.subject	Anti-phishing
dc.subject	Hostname
dc.subject	Phishing
dc.subject	Random forests
dc.subject	TF-IDF
dc.subject	Computer crime
dc.title	CatchPhish: detection of phishing websites by inspecting URLs
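
The abstract describes classifying a URL from hostname features, full-URL features, phish-hinted words, and TF-IDF weights, without visiting the site. The following is a minimal sketch of that kind of feature extraction using only the Python standard library. It is not the authors' implementation: the word list `PHISH_HINTS`, the function names, and the specific features chosen are all hypothetical, and the classifier itself is omitted.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical word list for illustration; the paper's actual list is not given here.
PHISH_HINTS = {"login", "signin", "verify", "secure", "account", "update"}


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def handcrafted_features(url):
    """Illustrative hostname and full-URL features (assumed, not the paper's exact set)."""
    host = urlparse(url).hostname or ""
    tokens = tokenize(url)
    return {
        "url_length": len(url),
        "host_length": len(host),
        "host_dots": host.count("."),
        "host_has_digit": int(any(c.isdigit() for c in host)),
        "has_at_symbol": int("@" in url),
        "phish_hint_count": sum(t in PHISH_HINTS for t in tokens),
    }


def tfidf(urls):
    """Return one {token: weight} dict per URL, using tf * log(N / df)."""
    docs = [tokenize(u) for u in urls]
    # Document frequency: number of URLs in which each token appears.
    df = Counter(t for d in docs for t in set(d))
    n = len(docs)
    weights = []
    for d in docs:
        tf = Counter(d)
        weights.append(
            {t: (c / len(d)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights
```

In a full pipeline, the hand-crafted values and the TF-IDF weights would be combined into one feature vector per URL and passed to a random forest, for example scikit-learn's `RandomForestClassifier`. That step is left out here to keep the sketch dependency-free.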
