Novel hybrid feature selection models for unsupervised document categorization

Bhopale, A.P.; Kamath S., S.

Date

2017

Authors

Bhopale, A.P.

Kamath S., S.

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Dealing with high-dimensional data is a challenging and computationally complex task in the data pre-processing phase of text clustering. Conventionally, union and intersection approaches have been used to combine the results of different feature selection methods to optimize the relevant feature space for a document collection. The union method selects all features from the considered sub-models, whereas the intersection method selects only the common features identified by the sub-models. However, in reality, any type of feature selection can cause a loss of some potentially important features. In this paper, a hybrid feature selection model called Modified Hybrid Union (MHU) is proposed, which selects features by considering the individual strengths and weaknesses of each constituent component of the model. A comparative evaluation of its performance for K-means clustering and bio-inspired flock-based clustering is also presented on standard data sets such as OWL-S TC and Reuters-21578. © 2017 IEEE.
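
The two conventional combination strategies mentioned in the abstract can be illustrated with a minimal Python sketch. It covers only the plain union and intersection baselines, not the proposed MHU model, whose selection rule is not given in this record. The function names, the sub-model names and the example terms are all hypothetical and chosen purely for illustration.

```python
def combine_union(*selections):
    """Keep every feature chosen by at least one sub-model."""
    combined = set()
    for selected in selections:
        combined |= set(selected)
    return combined


def combine_intersection(*selections):
    """Keep only the features chosen by every sub-model."""
    sets = [set(selected) for selected in selections]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical outputs of two feature selection sub-models
# (the terms are invented for illustration, not taken from the paper).
by_document_frequency = {"service", "price", "book", "hotel"}
by_term_variance = {"price", "hotel", "flight"}

union_features = combine_union(by_document_frequency, by_term_variance)
common_features = combine_intersection(by_document_frequency, by_term_variance)

print(sorted(union_features))
print(sorted(common_features))
```

In this toy case the union retains all five terms, so little dimensionality is removed, while the intersection retains only the two shared terms and discards features that a single sub-model considered relevant.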

Keywords

Dimensionality reduction, Feature selection, Text categorization, Unsupervised learning

Citation

2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017, 2017, Vol. 2017-January, pp. 1471-1477

URI

https://doi.org/10.1109/ICACCI.2017.8126048
https://idr.nitk.ac.in/handle/123456789/31738

Collections

Conference Papers
