Conference Papers

Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506

Search Results

Now showing 1 - 6 of 6
  • Item
    Windows malware detection based on cuckoo sandbox generated report using machine learning algorithm
    (Institute of Electrical and Electronics Engineers Inc., 2016) Shiva Darshan, S.L.S.; M.a, M.A.A.; Jaidhar, C.D.
    Malicious software, or malware, has grown rapidly, and many anti-malware solutions fail to detect unknown malware because most of them rely on signature-based techniques. Such a technique detects malware based on a pre-defined signature and performs poorly when classifying unseen malware that evades detection through code obfuscation. This growing evasion capability of new and unknown malware needs to be countered by analyzing the malware dynamically in a sandbox, which provides an isolated environment for observing its behavior. In this paper, the malware is executed in the Cuckoo sandbox to obtain its run-time behavior. At the end of the execution, the Cuckoo sandbox reports the system calls invoked by the malware during execution. However, this report is in JSON format and has to be converted to MIST format to extract the system calls. The collected system calls are structured in the form of N-Grams, which help to build the classifier by using Information Gain (IG) as a feature selection technique. A comprehensive experiment was conducted to identify the best-fit classifier among the chosen classifiers, namely Bayesian-Logistic-Regression, SPegasos, IB1, Bagging, PART, and J48, as defined within the WEKA tool. From the experimental results, the overall best performance for each of the selected top N-Gram counts (200, 400, and 600) goes to SPegasos, with the highest accuracy, highest True Positive Rate (TPR), and lowest False Positive Rate (FPR). © 2016 IEEE.
  • Item
    Information gain score computation for N-grams using multiprocessing model
    (Institute of Electrical and Electronics Engineers Inc., 2017) Shiva Darshan, S.L.S.; M.a, M.A.A.; Jaidhar, C.D.
    Currently, the Internet faces a serious threat from malware, whose propagation may cause great havoc on computers and network security solutions. Several existing anti-malware solutions detect known malware accurately. However, they fail to recognize unseen malware, since most of them rely on signature-based techniques, which are easily evaded using obfuscation or polymorphism. Therefore, there is an immediate need for new techniques that can detect and classify new malware. In this context, heuristic analysis is promising, since it is capable of detecting unknown malware and new variants of existing malware. N-Gram extraction is one such heuristic method commonly used in malware detection. Previous work has shown that shorter N-Grams are easier to extract. In order to identify and remove noisy N-Grams, this work uses a popular Feature Selection Technique (FST), namely Information Gain (IG), which computes a score for each N-Gram (feature) in the dataset. N-Grams with the highest IG scores are considered the best features, while the remaining N-Grams are discarded. The IG-FST (Information Gain-Feature Selection Technique) is computationally demanding and takes a long time to generate IG scores for larger N-Gram datasets if the processing is carried out sequentially. To address this issue, the present work proposes a multiprocessing model that computes IG scores rapidly for larger N-Gram datasets. The proposed model has been designed, implemented, and compared with the sequential mode of IG score computation. The experimental results demonstrate that the proposed multiprocessing model is 80% faster than the sequential model of IG score computation. © 2017 IEEE.
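As a rough illustration of the idea in the abstract above, the sketch below scores binary N-Gram presence features with Information Gain and spreads the per-feature work across worker processes. It is not the authors' implementation. The function names, the binary feature encoding, the toy data, and the choice of Python's `multiprocessing.Pool` are all assumptions made for this example.

```python
import math
from multiprocessing import Pool


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(args):
    """IG of one binary feature column: H(class) - H(class | feature)."""
    column, labels = args
    n = len(labels)
    classes = sorted(set(labels))
    base = entropy([labels.count(c) for c in classes])
    conditional = 0.0
    for value in (0, 1):
        # Labels of the samples where the N-Gram is absent (0) or present (1).
        subset = [y for x, y in zip(column, labels) if x == value]
        if subset:
            weight = len(subset) / n
            conditional += weight * entropy([subset.count(c) for c in classes])
    return base - conditional


def ig_scores(columns, labels, workers=2):
    """Score every feature column, one task per column, in a process pool."""
    tasks = [(col, labels) for col in columns]
    with Pool(processes=workers) as pool:
        return pool.map(information_gain, tasks)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = malware, 0 = benign; each column is one N-Gram.
    labels = [1, 1, 0, 0]
    columns = [[1, 1, 0, 0], [1, 0, 1, 0]]
    print(ig_scores(columns, labels))
```

Each column is scored independently, so the work parallelises without shared state. Any speed-up depends on the dataset size and the number of cores.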
  • Item
    Fertilizer Recommendation Using Ensemble Filter-Based Feature Selection Approach
    (Springer Science and Business Media Deutschland GmbH, 2023) Sujatha, M.; Jaidhar, C.D.
    Precise application of fertilizer is essential for sustainable agricultural yield. Machine learning-based classifiers are vital in evaluating soil fertility without contaminating the environment. This work uses machine learning-based classifiers such as Classification and Regression Tree, Extra Tree, J48 Decision Tree, Random Forest, REPTree, Naive Bayes, and Support Vector Machine to classify soil fertility. Initially, soil classification was conducted using chemical measurements of 11 soil parameters such as Electrical Conductivity, pH, Organic Carbon, Boron, Copper, Iron, Manganese, Phosphorus, Potassium, Sulphur, and Zinc. The traditional laboratory analysis of soil chemical parameters is time-consuming and expensive. This research work focuses on developing a robust machine learning-based classification approach by employing prominent features recommended by the ensemble filter-based feature selection. To overcome the inconsistency in generating different feature scores, an ensemble filter-based feature selection is devised using three different filter-based feature selection approaches: Information Gain, Gain Ratio, and Relief Feature. Two different datasets are used to evaluate the robustness of the proposed approach. Obtained experimental results demonstrated that the proposed approach with the Random Forest classifier achieved the highest Accuracy of 99.96% and 99.90% for dataset-1 and dataset-2, respectively. The proposed method reduces the inconsistency in feature selection by eliminating a common parameter from both datasets. It minimizes the cost of soil fertility classification by using relevant soil parameters. The classification results are used to provide fertilizer prescriptions. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
  • Item
    Symbolic Deterministic Finite Automata-based Automated Fertilizer Prescription
    (Institute of Electrical and Electronics Engineers Inc., 2023) Sujatha, M.; Jaidhar, C.D.
    Sustainable agriculture requires the use of an adequate amount of fertilizers. In this research work, an attempt was first made to classify soil fertility using machine learning-based classifiers. To overcome the drawbacks of machine learning-based classifiers, this research uses Symbolic Deterministic Finite Automata (SDFA) for soil fertility classification. The proposed method classifies soil fertility as LOW, MEDIUM (MED), or HIGH using the levels of four soil parameters: pH, Electrical Conductivity (EC), Organic Carbon (OC), and Nitrogen (N). The proposed approach was assessed using Sentinel-2 remotely sensed data and laboratory-measured soil-health data. The experimental outcomes show that the proposed approach effectively classifies soil fertility. The accuracy achieved using Sentinel-2 data was 100%, while the accuracies achieved using laboratory-measured data with four and twelve soil parameters were 100% and 98.37%, respectively. The results of soil fertility classification were used to recommend fertilizers. © 2023 IEEE.
  • Item
    Smart Appliance Abnormal Electrical Power Consumption Detection
    (Springer Science and Business Media Deutschland GmbH, 2024) Nayak, R.; Jaidhar, C.D.
    Potential cyber threats now have a far larger attack surface due to the widespread use of smart devices and smart environments. Smart home appliances form a network of linked objects that exchange information and communicate with each other. Detecting abnormal electrical power consumption becomes a first line of protection for bolstering the security of smart homes. Using Machine Learning (ML), anomalous electrical power consumption of a Smart Appliance can be identified. This work proposes an ML-based approach to detecting anomalous electrical power consumption in order to identify security breaches of Smart Appliances. SimDataset is used for anomalous power consumption detection as a proof of concept, and the results showed that the Random Forest (RF) classifier outperformed the other ML-based classifiers in detecting abnormal electrical power usage. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
  • Item
    Experimental Study on Impact of Appliance ID-Based Normalization on SimDataset for Anomalous Power Consumption Classification
    (Institute of Electrical and Electronics Engineers Inc., 2024) Nayak, R.; Jaidhar, C.D.
    In terms of annual worldwide energy consumption, buildings use more energy than any other sector. Enhancing buildings' energy efficiency and ensuring the security of appliances requires identifying abnormal power usage. Identifying anomalous power usage is essential for energy conservation. This study presents an experimental analysis of SimDataset, which is used for detecting micro-moment-based abnormal power usage. Five machine learning-based classifiers, namely Random Forest (RF), Support Vector Machine (SVM), K Nearest Neighbors (KNN), Naive Bayes (NB), and Decision Tree (DT), are used to detect unusual consumption of electricity. Both binary and multi-class classification are performed on SimDataset. The effect of including new features on classifier performance is examined. The computational complexity of the classifiers is also analyzed. Experimental results showed that binary and multi-class classification using the RF model with the original dataset, with the Min-Max Normalized Power feature, and with the Appliance Id-based Normalized Power feature produced identical and maximum accuracy, precision, recall, and F1-Score. © 2024 IEEE.
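The abstract above contrasts a Min-Max Normalized Power feature with an Appliance Id-based Normalized Power feature but does not define either. The sketch below shows one plausible reading, assumed for this example only: global min-max scaling over all readings, versus min-max scaling within each appliance's own readings. The function names, appliance labels, and wattages are hypothetical.

```python
from collections import defaultdict


def min_max(values):
    """Scale values to [0, 1] over the whole series; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def appliance_min_max(appliance_ids, powers):
    """Scale each reading against the min and max of its own appliance."""
    groups = defaultdict(list)
    for aid, p in zip(appliance_ids, powers):
        groups[aid].append(p)
    bounds = {aid: (min(ps), max(ps)) for aid, ps in groups.items()}
    scaled = []
    for aid, p in zip(appliance_ids, powers):
        lo, hi = bounds[aid]
        scaled.append(0.0 if hi == lo else (p - lo) / (hi - lo))
    return scaled


# Hypothetical readings in watts for two appliances.
ids = ["fridge", "fridge", "heater", "heater"]
watts = [100.0, 200.0, 1000.0, 2000.0]
print(min_max(watts))
print(appliance_min_max(ids, watts))
```

Under global scaling, low-power appliances are squeezed toward zero. Per-appliance scaling puts every appliance on the same [0, 1] range, which is the kind of difference the study's feature comparison would examine.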