Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

  • Item
    Hybrid intelligent Bayesian model for analyzing spatial data
    (Springer Verlag, 2018) Velmurugan, J.; Venkatesan, M.
    Spatial data mining refers to the extraction of geospatial knowledge, spatial relationships, and other interesting patterns that are not explicitly stored in spatial datasets. The overall objective of this research is to apply GIS-based data mining classification techniques to spatial landslide risk analysis in the Nilgiris district, Tamil Nadu, India. Landslides are among the most significant hazards affecting different parts of India every year. They have a broad impact on the people of the affected area through the devastation of material and human resources. Landslides are triggered by various factors such as rainfall, soil, slope, land use and land cover, and geology. Each landslide factor takes a different range of values. Ranking these values and assigning weights to the factors yields a good classification of landslide risk level. Data science and soft computing play a major role in landslide risk analysis. Ranks and weights are assigned to each landslide factor and its levels using data science classification techniques. In this paper, we propose a new model that integrates rough sets and Bayesian classification, called the Hybrid Intelligent Bayesian Model (HIBM), to analyze the likelihood of various landslide risk levels. The proposed model is compared with real-time data, and its performance is validated against other data science models. © 2018, Springer Nature Singapore Pte Ltd.
  • Item
    Text document analysis using map-reduce framework
    (Springer Verlag, 2018) Kanimozhi, K.V.; Prabhavathy, P.; Venkatesan, M.
    Owing to advances in the Internet and increasing globalization, electronic forms of information are growing rapidly. Extracting useful hidden information from multiple documents is a recent challenge. Hence, an efficient, automated clustering algorithm that is effective in identifying topics plays a major role in information retrieval. In this paper, a large unstructured text document corpus is analyzed using our proposed map-reduce algorithm. The results show the advantage of the proposed method, which detects clusters of document features with less computation time and provides a solution for increasing the precision of retrieval in information extraction. © 2018, Springer Nature Singapore Pte Ltd.
  • Item
    A Bag-of-Phonetic-Codes Model for Cyber-Bullying Detection in Twitter
    (Institute of Electrical and Electronics Engineers Inc., 2018) Shekhar, A.; Venkatesan, M.
    Social networking sites such as Twitter, Facebook, MySpace, and Instagram are emerging as a strong medium of communication and have become part and parcel of daily life. People can share their thoughts and activities with their social circle, which brings them closer to their community. However, this freedom of expression has its drawbacks. Sometimes people express aggression on social media, which hurts the sentiments of the targeted victims. Certain forms of cyber-bullying are sexual, racial, or based on physical disability. Hence, proper surveillance is necessary to tackle such situations. Twitter, as a micro-blogging site, sees cyber abuse on a daily basis. However, tweets are raw text containing many misspelled and censored words. This paper proposes a novel method to detect cyber-bullying: a Bag-of-Phonetic-Codes model. Using the pronunciation of words as features can rectify misspelled words and identify censored words. Correctly identifying duplicate words leads to a smaller vocabulary, thereby reducing the feature space. The inspiration for the proposed work is drawn from the well-known Bag-of-Words model for extracting textual features. Phonetic codes are generated using the Soundex algorithm. Besides the proposed model, experiments were carried out with both supervised and unsupervised machine learning approaches on multiple datasets to understand the approaches and challenges in the domain of cyber-bullying detection. © 2018 IEEE.
  • Item
    Predicting Influenza Outbreak using Constrained Static and dynamic Feature
    (Institute of Electrical and Electronics Engineers Inc., 2018) Dofadar, S.; Venkatesan, M.
    Twitter is a free social networking and micro-blogging service that lets its 330 million users worldwide write and read each other's tweets, with a limit of 280 characters per tweet. As a result, Twitter can provide a huge amount of data about what is happening around the world at a particular time. One use of this data is epidemic event detection and prediction. In this study, the use of Twitter data to detect influenza outbreaks is examined. The results of the experiment show that an estimate of an influenza outbreak can be derived from Twitter by correctly combining constrained supervised and unsupervised features and then applying a prediction model. © 2018 IEEE.
  • Item
    Hybrid Approach for Intrusion Detection System
    (Institute of Electrical and Electronics Engineers Inc., 2018) Singh, P.; Venkatesan, M.
    In recent research, machine learning based intrusion detection systems have shown good detection rates and high accuracy on novel attacks. The main purpose of this study is to implement a method that combines the Random Forest classification technique with the K-Means clustering algorithm. In misuse detection, the random forest algorithm builds patterns of intrusion from training data. In anomaly detection, intrusions are identified by the outlier detection mechanism of the random forest algorithm. The hybrid detection system combines the advantages of anomaly and misuse detection and improves detection performance. This paper focuses on evaluating the performance of two hybrid approaches for detecting intrusions: Gaussian Mixture clustering with Random Forest classifiers and K-Means clustering with Random Forest classifiers. These algorithms were evaluated on four categories of attacks in terms of accuracy, false alarm rate, and detection rate. In our experiments, K-Means clustering with Random Forest classifiers outperformed Gaussian Mixture clustering with Random Forest classifiers. © 2018 IEEE.
  • Item
    Optimal Band Selection Using Generalized Covering-Based Rough Sets on Hyperspectral Remote Sensing Big Data
    (Springer Verlag, 2019) Kelam, H.; Venkatesan, M.
    Hyperspectral remote sensing has been gaining attention over the past few decades. Because of its diverse and high-dimensional nature, remote sensing data is referred to as remote sensing Big Data. Hyperspectral images have high dimensionality because of the large number of spectral bands and pixels with a continuous spectrum. These images provide more detail than other images, but they still suffer from the ‘curse of dimensionality’. Band selection is the conventional method for reducing dimensionality and removing redundant bands. Many methods have been developed in past years to find the optimal set of bands. The generalized covering-based rough set is an extension of rough sets in which the indiscernibility relations are replaced by coverings. Recently, this method has been used for attribute reduction in pattern recognition and data mining. In this paper, we discuss the implementation of covering-based rough sets for optimal band selection in hyperspectral images and compare the results with existing methods such as PCA, SVD, and rough sets. © 2019, Springer Nature Singapore Pte Ltd.
  • Item
    Graph based Unsupervised Learning Methods for Edge and Node Anomaly Detection in Social Network
    (Institute of Electrical and Electronics Engineers Inc., 2019) Venkatesan, M.; Prabhavathy, P.
    In the last decade, online social network analysis has become an active area of research. Studying and analyzing user activities makes it possible to identify user interaction patterns and capture anomalies within a user community. Detecting anomalous users can help identify malicious individuals such as automated bots, fake accounts, spammers, sexual predators, and fraudsters. Detecting anomalies (outliers, deviant patterns, exceptions, abnormal data points, malicious users) is an important task in social network analysis. The major hurdle in social network anomaly detection is identifying irregular patterns in data that behave significantly differently from regular patterns. The focus of this paper is to propose graph-based unsupervised machine learning methods for edge anomaly and node anomaly detection in social network data. © 2019 IEEE.
  • Item
    Spatial data-based prediction models for crop yield analysis: A systematic review
    (Springer, 2020) Mohan, A.; Venkatesan, M.
    Agriculture plays a vital role in the global economy. WHO states that there are three pillars of food security: availability, access, and usage. Among these three pillars, availability is the most important. Ensuring food for the entire population of a country is achieved only through an increase in crop production. Accurate and timely weather forecasting can help to increase crop yield. Early prediction of crop yield plays a vital role in measuring food availability. Researchers regularly monitor the different parameters that affect crop yield. Yield prediction is done using either statistical data or spatial data. Crop monitoring through remote sensing can cover a vast land area. Therefore, spatial data-based prediction has become widespread in recent decades. Satellite images such as multispectral, hyperspectral, and radar images are used to calculate crop area, soil moisture, field greenness, etc. Among these imaging modalities, hyperspectral images give more accurate results, but their high dimensionality is a challenge. Optimal band selection from hyperspectral images helps to mitigate this curse of dimensionality. Crop area is one of the essential parameters for yield prediction. An exact measure of crop area can be achieved only through the best crop discrimination methods. This paper provides a comprehensive review of crop yield prediction using hyperspectral images. In addition, we explore the research challenges and open issues in this area. © Springer Nature Singapore Pte Ltd 2020.
  • Item
    Mineral identification using unsupervised classification from hyperspectral data
    (Springer, 2020) Gupta, P.; Venkatesan, M.
    Hyperspectral imagery is one of the research areas in the field of remote sensing. Hyperspectral sensors record the reflectance of an object, material, or region across the electromagnetic spectrum. Mineral identification is an urban application of hyperspectral data in remote sensing. Challenges with hyperspectral data include its high dimensionality and size. Principal component analysis (PCA) is used to reduce the dimension of the data through a band selection approach. Unsupervised classification is an active research topic. Because ground truth data is unavailable, an unsupervised algorithm is used to classify the minerals present in the remotely sensed hyperspectral data. K-means, an unsupervised clustering algorithm, is used to classify the minerals, and SVM is then used to check the classification accuracy. K-means is applied to end-member data only. SVM uses the k-means result as labelled data and classifies another dataset. © Springer Nature Singapore Pte Ltd 2020.
  • Item
    Tea leaf disease prediction using texture-based image processing
    (Springer, 2020) Srivastava, A.R.; Venkatesan, M.
    Tea is widely consumed in India and around the world. It is produced in many Indian states, including Assam, West Bengal, Tamil Nadu, and Karnataka. However, tea production is heavily affected by various diseases and pests, which can damage the tea crop. Crop damage can be reduced by recognizing tea leaf diseases at an early stage. Once the type of disease has been detected, a suitable preventive method can be used to reduce the damage. Several classification methods are available for detecting tea leaf diseases, including the random forest classifier, k-nearest neighbor classifier, support vector machine classifier, and neural networks. After the classifier is trained on the dataset, an image of a tea leaf is given as input, and the classifier finds the best possible match and recognizes the disease. This project uses various classification techniques to recognize and predict tea leaf diseases, which helps to improve tea production in India. © Springer Nature Singapore Pte Ltd 2020.
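
Illustrative note on the Bag-of-Phonetic-Codes entry listed above (Shekhar and Venkatesan, 2018): the abstract says that Soundex codes are used in place of raw words so that misspelled variants collapse into one feature. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the standard American Soundex rules and a simple letters-only tokenizer; the function names `soundex` and `bag_of_phonetic_codes` are hypothetical.

```python
import re
from collections import Counter

# Standard American Soundex digit groups (vowels, y, h, w carry no digit).
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of a word, or '' if it has no letters."""
    letters = re.sub(r"[^a-z]", "", word.lower())
    if not letters:
        return ""
    digits = []
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal neighbouring codes.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        # A vowel resets prev to '', so a repeated code after it is kept.
        prev = code
    return (letters[0].upper() + "".join(digits) + "000")[:4]


def bag_of_phonetic_codes(text):
    """Count Soundex codes instead of raw tokens (assumed tokenizer: runs of letters)."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return Counter(soundex(token) for token in tokens)


if __name__ == "__main__":
    # Two spellings of the same word fall into a single feature.
    print(bag_of_phonetic_codes("you are stupid stoopid"))
```

In this sketch, spelling variants such as "stupid" and "stoopid" map to the same code, which is the vocabulary-shrinking effect the abstract describes. Handling of censored words (for example, characters replaced by symbols) depends on preprocessing choices that the abstract does not specify, so it is left out here.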