Faculty Publications
Permanent URI for this community: https://idr.nitk.ac.in/handle/123456789/18736
Publications by NITK Faculty
3 results
Search Results
Prediction model for peninsular Indian summer monsoon rainfall using data mining and statistical approaches
(Elsevier Ltd, 2017) Vathsala, H.; Koolagudi, S.G.
In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the local predictors from conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good. © 2016 Elsevier Ltd

An efficient parallel row enumerated algorithm for mining frequent colossal closed itemsets from high dimensional datasets
(Elsevier Inc., 2019) Vanahalli, M.K.; Patil, N.
Mining colossal itemsets from high dimensional datasets has gained attention in recent times. Conventional algorithms spend most of their time mining small and mid-sized itemsets, which do not contain valuable and complete information for decision making. Mining Frequent Colossal Closed Itemsets (FCCI) from a high dimensional dataset plays a highly significant role in decision making for many applications, especially in the field of bioinformatics.
To mine FCCI from a high dimensional dataset, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. Besides, the state-of-the-art algorithms for this task are sequential and computationally expensive. The proposed work highlights an Effective Improved Parallel Preprocessing (EIPP) technique to prune the complete set of irrelevant features and irrelevant rows from a high dimensional dataset, and a novel efficient Parallel Frequent Colossal Closed Itemset Mining (PFCCIM) algorithm. Further, the PFCCIM algorithm is integrated with a novel Rowset Cardinality Table (RCT), an efficient method to check the closeness of a rowset, and an efficient pruning strategy to cut down the mining search space. The proposed PFCCIM algorithm is the first parallel algorithm to mine FCCI from a high dimensional dataset. The performance study shows the improved effectiveness of the proposed EIPP technique over the existing preprocessing techniques and the improved efficiency of the proposed PFCCIM algorithm over the existing algorithms. © 2018 Elsevier Inc.

An efficient dynamic switching algorithm for mining colossal closed itemsets from high dimensional datasets
(Elsevier B.V., 2019) Vanahalli, M.K.; Patil, N.
The abundance of data across a variety of domains, including bioinformatics, has led to the formation of datasets with high dimensionality. Conventional algorithms spend most of their time mining a large number of small and mid-sized itemsets, which do not contain complete and valuable information for decision making. Recent research is focused on Frequent Colossal Closed Itemsets (FCCI), which play a significant role in decision making for many applications, especially in the field of bioinformatics.
The state-of-the-art algorithms for mining FCCI from datasets consisting of a large number of rows and a large number of features are computationally expensive, as they are based purely on either row enumeration or feature enumeration. Moreover, the existing preprocessing techniques fail to prune the complete set of irrelevant features and irrelevant rows. The proposed work emphasizes an Effective Improvised Preprocessing (EIP) technique to prune the complete set of irrelevant features and irrelevant rows, and a novel efficient Dynamic Switching Frequent Colossal Closed Itemset Mining (DSFCCIM) algorithm. The proposed DSFCCIM algorithm efficiently switches between row and feature enumeration methods based on data characteristics during the mining process. Further, the DSFCCIM algorithm is integrated with a novel Rowset Cardinality Table and Itemset Support Table, two efficient methods to check the closeness of a rowset and an itemset, and two efficient pruning strategies to cut down the search space. The proposed DSFCCIM algorithm is the first dynamic switching algorithm to mine FCCI from datasets consisting of a large number of rows and a large number of features. The performance study shows the improved effectiveness of the proposed EIP technique over the existing preprocessing techniques and the improved efficiency of the proposed DSFCCIM algorithm over the existing algorithms. © 2019 Elsevier B.V.
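
The abstracts above refer to row enumeration and to checking the closeness of a rowset without giving details. As a rough illustration only, the following Python sketch shows the textbook idea: a rowset is closed when no row outside it contains all the items its rows share. This is a brute-force toy, not the PFCCIM or DSFCCIM algorithm, and it does not reproduce the Rowset Cardinality Table or any pruning strategy. The function names, the `min_support` and `min_size` thresholds (the latter standing in for "colossal"), and the sample data are all assumptions made for this example.

```python
from itertools import combinations


def common_items(dataset, rowset):
    """Return the items shared by every row in `rowset`."""
    rows = [dataset[r] for r in rowset]
    return frozenset.intersection(*rows) if rows else frozenset()


def supporting_rows(dataset, itemset):
    """Return the ids of all rows that contain every item in `itemset`."""
    return frozenset(r for r, items in dataset.items() if itemset <= items)


def is_closed_rowset(dataset, rowset):
    """A rowset is closed if exactly its rows support its common itemset."""
    itemset = common_items(dataset, rowset)
    return bool(itemset) and supporting_rows(dataset, itemset) == frozenset(rowset)


def closed_itemsets(dataset, min_support, min_size):
    """Brute-force row enumeration (hypothetical helper, exponential cost).

    Enumerates every rowset with at least `min_support` rows and keeps the
    common itemset when it has at least `min_size` items and the rowset is
    closed. Maps each closed itemset to its support (number of rows).
    """
    found = {}
    row_ids = sorted(dataset)
    for k in range(min_support, len(row_ids) + 1):
        for rowset in combinations(row_ids, k):
            itemset = common_items(dataset, rowset)
            if len(itemset) >= min_size and is_closed_rowset(dataset, rowset):
                found[itemset] = len(rowset)
    return found


# Made-up data: few rows, each a set of features (items).
dataset = {
    "r1": frozenset("abcde"),
    "r2": frozenset("abcd"),
    "r3": frozenset("abcf"),
    "r4": frozenset("xy"),
}
result = closed_itemsets(dataset, min_support=2, min_size=3)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

In this toy data the rowset (r1, r3) is not closed, because r2 also contains the shared items a, b and c; the closed rowset for that itemset is (r1, r2, r3). Enumerating rows rather than features is attractive when rows are few and features are many, which is the setting the abstracts describe.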
