Faculty Publications

Search Results

A novel technique of feature selection with ReliefF and CFS for protein sequence classification
(Springer Verlag, 2019) Kaur, K.; Patil, N.
Bioinformatics has gained wide importance as a research area over the last few decades. Its main aim is to store biological data and analyze it for better understanding. Classifying existing protein sequences is of great use in predicting the functions of newly added ones. Protein sequence data is accumulating at an exponentially increasing rate, so dealing with the large number of features produced by various encoding techniques has become a very challenging task for researchers. Here, a two-stage feature selection algorithm is proposed that combines the ReliefF and CFS techniques; it takes the extracted features as input and returns a discriminative set of features. The n-gram sequence encoding technique is used to extract the feature vector from the protein sequences. In the first stage, ReliefF is used to rank the features and obtain a candidate feature set. In the second stage, CFS is applied to this candidate set to obtain features that are highly correlated with the class but weakly correlated with one another. Classification methods such as Naive Bayes, decision tree, and k-nearest neighbor can be used to analyze the performance of the proposed approach. The approach is observed to increase the accuracy of these classification methods in comparison with existing methods. © Springer Nature Singapore Pte Ltd. 2019
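The two ingredients named in the abstract above, n-gram encoding and a correlation-based subset score, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the correlations are assumed to be precomputed by some measure of the reader's choice, and the merit formula is the standard CFS heuristic k * r_cf / sqrt(k + k(k-1) * r_ff).

```python
from collections import Counter
from math import sqrt


def ngram_counts(sequence, n=2):
    """Count overlapping n-grams in a protein sequence string (hypothetical helper)."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def cfs_merit(class_corrs, feature_corrs):
    """Standard CFS merit of a feature subset.

    class_corrs:   one feature-class correlation per selected feature
    feature_corrs: one feature-feature correlation per pair of selected features
    Both are assumed to be precomputed; the correlation measure is not fixed here.
    """
    k = len(class_corrs)
    r_cf = sum(abs(c) for c in class_corrs) / k
    r_ff = sum(abs(c) for c in feature_corrs) / len(feature_corrs) if feature_corrs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


if __name__ == "__main__":
    print(ngram_counts("MKVMK", n=2))
    # Adding a perfectly redundant feature does not raise the merit,
    # while adding an uncorrelated, equally relevant one does.
    print(cfs_merit([0.8], []))
    print(cfs_merit([0.8, 0.8], [1.0]))
    print(cfs_merit([0.8, 0.8], [0.0]))
```

The merit grows with the average feature-class correlation and shrinks with the average feature-feature correlation, which matches the second-stage goal described in the abstract.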
Grey relational effort analysis technique using regression methods for software estimation
(Zarka Private University, 2014) Geeta, N.; Moin, U.; Kaur, K.
Software project planning and estimation is the most important challenge for software developers and researchers. It involves estimating the size of the software project to be produced, estimating the effort required, developing initial project schedules, and ultimately estimating the overall cost of the project. Numerous empirical studies have been performed on the existing methods, but they have not converged on a best prediction methodology. Analogy-based estimation, which derives effort from similar projects in the project repository, is still one of the most extensively used methods in industry. Two alternative approaches using analogy for estimation are proposed in this study. The first is a precise and comprehensible predictive model based on the integration of Grey Relational Analysis (GRA) and regression. The second deals with the uncertainty in software projects and shows how fuzzy set theory, in fusion with grey relational analysis, can minimize this uncertainty. The empirical results are remarkable, indicating that the methodologies have great potential and can be used as candidate approaches for software effort estimation. The results obtained using both methods are subjected to rigorous statistical testing using the Wilcoxon signed-rank test.
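To make the grey relational idea in the abstract above concrete, the sketch below computes grey relational grades between a new project and past projects, then reuses the effort of the closest analogue. It is an illustration under stated assumptions, not the model from the paper: features are assumed to be already normalized to [0, 1], the distinguishing coefficient zeta = 0.5 is a conventional default, the function names are hypothetical, and the regression and fuzzy components are omitted.

```python
def grey_relational_grades(target, candidates, zeta=0.5):
    """Grey relational grade of each candidate with respect to the target.

    target:     list of normalized feature values for the new project
    candidates: list of feature lists for historical projects
    """
    deltas = [[abs(t - c) for t, c in zip(target, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to the target.
        return [1.0] * len(candidates)
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deltas
    ]


def estimate_by_analogy(target, candidates, efforts, zeta=0.5):
    """Return the effort of the historical project with the highest grade."""
    grades = grey_relational_grades(target, candidates, zeta)
    best = max(range(len(grades)), key=grades.__getitem__)
    return efforts[best]


if __name__ == "__main__":
    history = [[0.5, 0.5], [1.0, 0.0]]   # made-up normalized features
    efforts = [120.0, 300.0]             # made-up person-hours
    print(grey_relational_grades([0.5, 0.5], history))
    print(estimate_by_analogy([0.5, 0.5], history, efforts))
```

Picking the single closest project is the simplest choice; averaging the efforts of the few highest-graded projects is a common alternative.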
A fast and novel approach based on grouping and weighted mRMR for feature selection and classification of protein sequence data
(Inderscience Publishers, 2020) Kaur, K.; Patil, N.
The analysis of protein sequences in bioinformatics has gained wide importance as a research area. Newly added protein sequences can be analysed using existing proteins by converting them into feature vector form. However, dealing with the huge number of features obtained using sequence encoding techniques is a challenging task. Since not all of the features obtained are actually required, a three-stage feature selection approach is proposed. In the first stage, features are ranked and the most irrelevant features are removed; in the second stage, conflicting features are grouped together; and in the third stage, a fast approach based on weighted Minimum Redundancy Maximum Relevance (wMRMR) is proposed and applied to the grouped features. Different classification methods are used to analyse the performance of the proposed approach. The proposed approach is observed to increase classification accuracy and reduce time consumption in comparison with state-of-the-art methods. © 2020 Inderscience Enterprises Ltd.
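For readers unfamiliar with the criterion the third stage builds on, here is a minimal sketch of standard, unweighted greedy mRMR selection. The weighting scheme and the grouping step of the paper are not reproduced, since the abstract does not specify them. The function name is hypothetical, and the relevance scores and pairwise redundancy matrix are assumed to be precomputed (for example from mutual information).

```python
def greedy_mrmr(relevance, redundancy, k):
    """Greedily pick up to k feature indices by relevance minus mean redundancy.

    relevance:  relevance[j] is the relevance of feature j to the class
    redundancy: redundancy[i][j] is the redundancy between features i and j
    """
    selected = []
    remaining = set(range(len(relevance)))
    while remaining and len(selected) < k:

        def score(j):
            if not selected:
                return relevance[j]
            mean_red = sum(redundancy[j][s] for s in selected) / len(selected)
            return relevance[j] - mean_red

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    relevance = [0.9, 0.85, 0.5]          # made-up scores
    redundancy = [
        [1.0, 0.8, 0.1],
        [0.8, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    # Feature 1 is relevant but nearly duplicates feature 0, so feature 2 is chosen second.
    print(greedy_mrmr(relevance, redundancy, k=2))
```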
