Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Association analysis of significant frequent colossal itemsets mined from high dimensional datasets
    (Institute of Electrical and Electronics Engineers Inc., 2017) Vanahalli, M.K.; Patil, N.
    Bioinformatics has given rise to a distinct class of data known as high dimensional datasets, which are characterized by a large number of features and a small number of samples. Traditional algorithms spend most of their running time mining a large number of small and mid-sized itemsets, which do not carry valuable or significant information. Recent research has focused on mining large-cardinality itemsets, called colossal itemsets, which are significant to many applications, especially in bioinformatics. Existing frequent colossal itemset mining algorithms fail to discover the complete set of significant frequent colossal itemsets, and the colossal itemsets they do mine carry erroneous support information, which affects association analysis. Mining significant frequent colossal itemsets with accurate support information helps attain high accuracy in association analysis. The proposed work presents a novel pre-processing technique and a bottom-up row enumeration algorithm to mine significant frequent colossal itemsets with accurate support information. The pre-processing technique uses the minimum support threshold and the minimum cardinality threshold to prune irrelevant samples and features. The experimental results demonstrate that the proposed algorithm is more accurate than existing algorithms, and a performance study indicates the efficiency of the pre-processing technique. © 2016 IEEE.
  • Item
    Distributed mining of significant frequent colossal closed itemsets from long biological dataset
    (Springer Verlag, 2020) Vanahalli, M.K.; Patil, N.
    Mining colossal itemsets has gained increasing attention in recent times. An extensive set of short and average-sized itemsets does not contain complete and valuable information for decision making, yet traditional itemset mining algorithms spend a great deal of time mining these short and average-sized itemsets. Colossal itemsets are significant for numerous applications, including bioinformatics, and are influential in decision making. Bioinformatics has contributed a new kind of dataset known as the long biological dataset. These are high dimensional datasets, characterized by a large number of features (attributes) and a small number of rows (samples). Extracting a large amount of information and knowledge from a high dimensional long biological dataset is a nontrivial task. The existing algorithms for mining significant Frequent Colossal Closed Itemsets (FCCI) from long biological datasets are sequential and computationally expensive. Distributed computing is a good strategy to overcome the inefficiency of the existing sequential algorithms. The paper proposes a distributed computing approach for mining FCCI. The row-enumerated mining search space is efficiently cut down by the pruning strategy built into the Distributed Row Enumerated Frequent Colossal Closed Itemset Mining (DREFCCIM) algorithm. The proposed DREFCCIM algorithm is the first distributed algorithm to mine FCCI from long biological datasets. The experimental results demonstrate the efficient performance of the DREFCCIM algorithm in comparison with the current algorithms. © Springer Nature Switzerland AG 2020.
  • Item
    Analysis and Prediction of Fantasy Cricket Contest Winners Using Machine Learning Techniques
    (Springer Science and Business Media Deutschland GmbH, 2021) Karthik, K.; S. Krishnan, G.S.; Shetty, S.; Bankapur, S.; Kolkar, R.; Ashwin, T.S.; Vanahalli, M.K.
    Cricket is one of the most well-known sports in the world. Growing interest in cricket in recent years has led to newer formats such as T20 and T10, alongside the Test and one-day formats. The popularity of all these formats has carried over into online fantasy cricket league games. Dream11 is one of the most popular apps of this kind, alongside many similar apps. Creating a dream team of 11 players from the playing 11 of both teams involves skill, ideas and luck. Predicting a winner among all the participating contestants based on historical data is a challenging task. In this paper, we used a feed-forward deep neural network (DNN) classifier to predict the winning contestants for the top three positions in a fantasy league cricket contest. The performance of the DNN approach was compared against that of state-of-the-art machine learning approaches such as k-nearest neighbours (KNN), logistic regression (LR), Naive Bayes (NB), random forest (RF) and support vector machines (SVM) in predicting the fantasy cricket contest winners. Among the methods used, the DNN showed the best results for all three positions, demonstrating its consistency in predicting the winners, and outperformed the state-of-the-art machine learning classifiers by 13%, 8% and 9% for the first, second and third winning positions, respectively. © 2021, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
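
The first abstract in this list mentions a pre-processing step that uses a minimum support threshold and a minimum cardinality threshold to prune irrelevant samples and features. The abstract does not give the procedure, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that a feature is irrelevant when it occurs in fewer than `min_support` samples, that a sample is irrelevant when it holds fewer than `min_cardinality` features, and that the two rules are repeated until nothing changes. The function name `prune_dataset` and the toy data are hypothetical.

```python
from collections import Counter


def prune_dataset(rows, min_support, min_cardinality):
    """Iteratively prune features and samples (illustrative assumption only).

    rows: dict mapping a sample id to the set of features present in it.
    A feature is dropped if it occurs in fewer than min_support samples.
    A sample is dropped if it keeps fewer than min_cardinality features.
    Both rules are repeated until a full pass changes nothing.
    """
    # Work on copies so the caller's data is left untouched.
    rows = {rid: set(items) for rid, items in rows.items()}
    changed = True
    while changed:
        changed = False

        # Count in how many remaining samples each feature occurs.
        support = Counter(item for items in rows.values() for item in items)
        infrequent = {item for item, count in support.items()
                      if count < min_support}
        if infrequent:
            changed = True
            for rid in rows:
                rows[rid] -= infrequent

        # Remove samples that are now too short to contain a colossal itemset.
        short = [rid for rid, items in rows.items()
                 if len(items) < min_cardinality]
        if short:
            changed = True
            for rid in short:
                del rows[rid]
    return rows


# Hypothetical toy dataset: four samples over features a..f.
toy = {
    "r1": {"a", "b", "c", "d"},
    "r2": {"a", "b", "c"},
    "r3": {"a", "b", "e"},
    "r4": {"f"},
}
pruned = prune_dataset(toy, min_support=2, min_cardinality=3)
```

The loop repeats because removing a short sample can push a feature below the support threshold, and removing a feature can push a sample below the cardinality threshold. In the toy data, features `d`, `e` and `f` are dropped first, which leaves `r3` and `r4` too short to keep.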