Classification of protein sequences by means of an ensemble classifier with an improved feature selection strategy

Date

2018

Authors

Sriram, A.
Sanapala, M.
Patel, R.
Patil, N.

Abstract

With the decreasing cost of biological sequencing, the influx of new sequences into biological databases such as NCBI, SwissProt, and UniProt is growing at an ever-increasing pace. Annotating these newly sequenced proteins will aid groundbreaking discoveries in the development of novel drugs and potential therapies for diseases. Previous work in this field has harnessed the high computational power of modern machines to achieve good prediction quality, but at the cost of high dimensionality. To address this disparity, we propose a novel word segmentation-based feature selection strategy to classify protein sequences using a highly condensed feature set. An incremental classifier selection strategy was seen to yield better results than all existing methods. The antioxidant protein data curated in previous work was used to provide a level ground for evaluation and comparison of results. The proposed method was found to outperform all existing works on this data, with an accuracy of 95%. © Springer Nature Singapore Pte Ltd. 2018.

Citation

Advances in Intelligent Systems and Computing, 2018, Vol. 708, pp. 167-174
