Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 8 of 8
  • Item
    Feature selection using Markov clustering and maximum spanning tree in high dimensional data
    (Institute of Electrical and Electronics Engineers Inc., 2017) Bisht, N.; Annappa, B.
    Feature selection is the most important preprocessing step for classification of high dimensional data. It reduces the computational cost and prediction time of the classification algorithm by selecting only the salient features from the data set for learning. The main challenges in applying feature selection to high dimensional data (HDD) are handling the relevancy, redundancy and correlation between features. The proposed algorithm works in three main steps to overcome these issues. It focuses on a filtering strategy for its effectiveness in handling data sets with large size and high dimensions. Initially, to measure the relevancy of features with respect to the class, the Fisher score is calculated for each feature independently. Next, only relevant features are passed to the clustering algorithm to check the redundancy of features. Finally, the correlation between features is calculated using a maximum spanning tree and the most appropriate features are selected. The classification accuracy of the presented approach is validated using the C4.5, IB1 and Naive Bayes classifiers. The proposed algorithm gives high classification accuracy when compared against the accuracies given by the three classifiers on the dataset containing features extracted by the Fisher score method and on the dataset containing all the features (the full-featured dataset). © 2016 IEEE.
  • Item
    Video Affective Content Analysis based on multimodal features using a novel hybrid SVM-RBM classifier
    (Institute of Electrical and Electronics Engineers Inc., 2017) Ashwin, T.S.; Saran, S.; Guddeti, G.R.M.
    Video Affective Content Analysis is an active research area in computer vision. Live streaming video has become one of the modes of communication in the recent decade; hence video affective content analysis plays a vital role. Existing works on video affective content analysis focus on predicting the current state of the users using either the visual or the acoustic features. In this paper, we propose a novel hybrid SVM-RBM classifier which recognizes emotion for both live streaming video and stored video data using audio-visual features, and thus recognizes the users' mood based on categorical emotion descriptors. The proposed method is evaluated on human emotion recognition for live streaming data using devices such as Microsoft Kinect and a web cam. Further, we tested and validated it using standard datasets like HUMANE and SAVEE. Classification of emotion is performed for both acoustic and visual data using Restricted Boltzmann Machine (RBM) and Support Vector Machine (SVM). It is observed that the SVM-RBM classifier outperforms RBM and SVM for annotated datasets. © 2016 IEEE.
  • Item
    Concise semantic analysis based text categorization using modified hybrid union feature selection approach
    (Institute of Electrical and Electronics Engineers Inc., 2018) Bhopale, A.P.; Kamath S., S.; Tiwari, A.
    Text categorization mainly comprises deriving a representation of the corpus in a standard bag-of-words format. The merit of bag-of-words representations is that they consider every term as a feature, while the downside is that the computation cost increases with the number of features and with the representation of relations between documents and features. Semantic analysis can help in gaining an edge through document and term correlation in a concept space. However, most semantic analysis techniques have their own limitations when used for text categorization. In this work, a Concise Semantic Analysis (CSA) technique is proposed that extracts concepts from the corpus and then interprets the document and word relationships in a given concept space. To improve the performance of CSA, a novel feature selection technique called the Modified Hybrid Union (MHU) was designed, which considerably reduced computation time and cost. To experimentally validate the proposed approach, MHU-based CSA was applied to the problem of text categorization. Experiments performed on standard data sets like Reuters-21578 and WSDL-TC show that the proposed CSA with MHU approach significantly improved performance in terms of execution time and categorization accuracy. © 2018 IEEE.
  • Item
    Recursive Harmony Search Based Classifier Ensemble Reduction
    (Institute of Electrical and Electronics Engineers Inc., 2018) Kailas, P.; Chandrasekaran, K.
    In recent times classifier ensembles have become a mainstay in data mining and machine learning. The combination of several classifiers generally results in better performance and accuracy as compared to a single classifier. There are many different methods and techniques for constructing ensembles. Most of the time however, when these ensemble classifiers are constructed, the data used in the construction of ensemble classifiers becomes redundant. This redundant data results in a loss of accuracy and an increase in memory and system overhead. Therefore by removing this redundant data we can reduce the memory and system overhead as well as obtain an increase in accuracy. The redundant data can be eliminated by using a technique called feature selection. Feature selection is used to select the most relevant features while performing any task. There are many different feature selection algorithms such as memetic algorithms, sub-modular feature selection, etc. The feature selection technique can be used to choose the relevant data and eliminate the redundant data. The way to eliminate redundant data in ensemble classifiers is to perform classifier ensemble reduction. This paper discusses using feature selection and in particular employing recursive harmony search to perform classifier ensemble reduction via feature selection. The final ensemble classifier will be a reduced set of the original ensemble classifier, while maintaining diversity and accuracy of the original one. © 2018 IEEE.
  • Item
    Comparative Analysis of Intrusion Detection System using ML and DL Techniques
    (Springer Science and Business Media Deutschland GmbH, 2023) Sunil, C.K.; Reddy, S.; Kanber, S.G.; Vuddanti, V.R.; Patil, N.
    An intrusion detection system (IDS) protects the network from suspicious and harmful activities. It scans the network for harmful activity and any potential breach. Even with the many network intrusion APIs available, there are still problems in detecting intrusions. These problems can be handled by normalizing the whole dataset and ranking the features on a benchmark dataset before training the classification models. This paper uses the NSL-KDD dataset to analyse various features and test the efficiency of various algorithms. This work makes use of feature selection techniques such as Information Gain, SelectKBest, Pearson coefficient and Random Forest. It also iterates over the number of features k to pick the best value for training the models; for each value of k, each model is trained separately and the feature selection approach is evaluated with the algorithms. The selected features are then tested on different machine learning and deep learning approaches. This work makes use of a stacked ensemble learning technique for classification. The stacked ensemble learner contains models that make uncorrelated errors, thereby making the overall model more robust. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  • Item
    Fertilizer Recommendation Using Ensemble Filter-Based Feature Selection Approach
    (Springer Science and Business Media Deutschland GmbH, 2023) Sujatha, M.; Jaidhar, C.D.
    Precise application of fertilizer is essential for sustainable agricultural yield. Machine learning-based classifiers are vital in evaluating soil fertility without contaminating the environment. This work uses machine learning-based classifiers such as Classification and Regression Tree, Extra Tree, J48 Decision Tree, Random Forest, REPTree, Naive Bayes, and Support Vector Machine to classify soil fertility. Initially, soil classification was conducted using chemical measurements of 11 soil parameters such as Electrical Conductivity, pH, Organic Carbon, Boron, Copper, Iron, Manganese, Phosphorus, Potassium, Sulphur, and Zinc. The traditional laboratory analysis of soil chemical parameters is time-consuming and expensive. This research work focuses on developing a robust machine learning-based classification approach by employing prominent features recommended by the ensemble filter-based feature selection. To overcome the inconsistency in generating different feature scores, an ensemble filter-based feature selection is devised using three different filter-based feature selection approaches: Information Gain, Gain Ratio, and Relief Feature. Two different datasets are used to evaluate the robustness of the proposed approach. Obtained experimental results demonstrated that the proposed approach with the Random Forest classifier achieved the highest Accuracy of 99.96% and 99.90% for dataset-1 and dataset-2, respectively. The proposed method reduces the inconsistency in feature selection by eliminating a common parameter from both datasets. It minimizes the cost of soil fertility classification by using relevant soil parameters. The classification results are used to provide fertilizer prescriptions. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  • Item
    Feature Selection for Peer-to-Peer Lending Default Risk Using Boruta and mRMR Approach
    (Institute of Electrical and Electronics Engineers Inc., 2023) Anusha Hegde, H.; Bhowmik, B.
    Peer-to-peer (P2P) lending in the Financial Technology (FinTech) sector is increasingly gaining attention, as the online platform enables lenders to offer loans to borrowers. The platform, as a much needed mechanism, aims to reduce the risk of default and increase profitability for lenders and the platform. Each loan record maintains a variety of attributes, including details about the loan, the borrower, their credit history, their finances, and public data. If all the features are considered, the performance of the lending platform may decline. Finding the characteristics most helpful in forecasting loan default is a concern. This paper investigates essential features of the P2P lending mechanism with adequate performance in lending money to individuals or businesses. We employ two algorithms to find the pertinent features: Boruta and Max-Relevance and Min-Redundancy (mRMR). Further, we use two classifiers, decision tree and XGBoost, that use the selected features to predict loan defaults. © 2023 IEEE.
  • Item
    Comparative analysis of Software Reliability using Grey Wolf Optimisation and Machine Learning
    (Institute of Electrical and Electronics Engineers Inc., 2024) Kelkar, S.; Vishvasrao, S.P.; Agarwal, A.; Rajput, C.; Mohan, B.R.; Das, M.
    Software reliability is a crucial aspect of software quality. In this paper, we aim to explore the application of Grey Wolf Optimization (GWO) for feature selection and classification on various software datasets, such as KC1, JM1, and PC5. We compare the performance of Machine Learning models (Random Forest, Decision Tree, Support Vector Machine, XGBoost and Neural Networks) with and without GWO-based feature selection. Our results demonstrate the effectiveness of GWO in enhancing the accuracy of software reliability analysis. © 2024 IEEE.
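
Several of the abstracts listed above rely on filter-based feature ranking, and the first one scores each feature independently by Fisher score before clustering. The sketch below illustrates that scoring step only. It assumes the common textbook definition of the Fisher score (class-size-weighted spread of the class means divided by class-size-weighted within-class variance); the abstract does not state the exact formula, and the function name `fisher_scores` and the toy data are hypothetical, not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def fisher_scores(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Return one Fisher score per feature column of X (rows are samples).

    Assumed definition (not confirmed by the abstract):
        score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj
    """
    # Group row indices by class label.
    rows_by_class: Dict[int, List[int]] = defaultdict(list)
    for i, label in enumerate(y):
        rows_by_class[label].append(i)

    scores: List[float] = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # weighted spread of class means around the overall mean
        within = 0.0   # weighted sum of per-class (population) variances
        for idx in rows_by_class.values():
            values = [column[i] for i in idx]
            class_mean = sum(values) / len(values)
            class_var = sum((v - class_mean) ** 2 for v in values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += len(values) * class_var
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


# Toy data: feature 0 separates the two classes well, feature 1 does not.
X = [[1.0, 5.0], [2.0, 3.0], [9.0, 4.0], [10.0, 6.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores)   # [64.0, 0.25]
print(ranking)  # [0, 1]
```

In a pipeline like the one the first abstract describes, only the top-ranked features would be passed on to the redundancy and correlation steps; how many to keep is a threshold choice the abstract does not specify.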