Conference Papers
Permanent URI for this collection: https://idr.nitk.ac.in/handle/123456789/28506
10 results
Item Sound event detection in urban soundscape using two-level classification (Institute of Electrical and Electronics Engineers Inc., 2016) Luitel, B.; Vishnu Srinivasa Murthy, Y.V.S.; Koolagudi, S.G.
A huge increase in the automobile field has led to the creation of different sounds in large volume, especially in urban cities. An analysis of the increased quantity of automobiles gives information related to traffic and vehicles. It also provides scope to understand the scenario of a particular location using soundscape information. In this paper, a two-level classification is proposed to classify urban sound events such as bus engine (BE), bus horn (BH), car horn (CH) and whistle (W) sounds. These sounds are chosen because they play a major role in the traffic scenario. Real-time data is collected from live recordings at major locations of the urban city. Prior to the detection of events, the class of events is identified using signal processing techniques. Further, features such as Mel-frequency cepstral coefficients (MFCCs) are extracted based on the analysis of the spectrum of the above-mentioned events, and they are prominent enough to classify even in complex scenarios. Classifiers such as artificial neural networks (ANN), naive Bayesian (NB), decision tree (J48) and random forest (RF) are used at two levels. The proposed approach outperforms existing approaches that usually perform direct feature extraction without signal-level analysis. © 2016 IEEE.

Item Characterization of aspirated and unaspirated sounds in speech (Institute of Electrical and Electronics Engineers Inc., 2017) Ramteke, P.B.; Sadanand, A.; Koolagudi, S.G.; Pai, V.
In this work, consonant aspiration and unaspiration phenomena are studied. It is known that the pronunciation of aspirated and unaspirated sounds is characterized by the 'puff of air' released at the place of constriction in the vocal tract, which is known as the burst.
Here, the properties of the vowel immediately after the burst are studied to characterize the burst. The excitation source signal estimated from the speech linear prediction residual is used for the task. Signal characteristics such as the glottal pulse, duration of the open, closed and return phases, slope of the open and return phases, duration of the burst, ratio of the highest and lowest energies of the signal, and voice onset time (VOT) are explored to characterize aspiration and unaspiration. The TIMIT English speech corpus is used to test the proposed approach. Random forest (RF) and support vector machines (SVMs) are used as classifiers to test the effectiveness of the features used for the task. Accuracies of 99.93% and 94.03% are achieved, respectively. From the results, it is observed that the proposed features are robust in classifying aspirated and unaspirated consonants. © 2017 IEEE.

Item Characterization of Consonant Sounds Using Features Related to Place of Articulation (Springer, 2020) Ramteke, P.B.; Hegde, S.; Koolagudi, S.G.
Speech sounds are classified into 5 classes, grouped based on place and manner of articulation: velar, palatal, retroflex, dental and labial. In this paper, an attempt has been made to explore the role of place of articulation and vocal tract length in characterizing the different classes of speech sounds. Formants and the vocal tract length available for the production of each class of sound are extracted from the region of transition from the consonant burst to the rising profile of the immediately following vowel. These features, along with their statistical variations, are considered for the analysis. Based on the non-linear nature of the features, Random Forest (RF) is used for the classification. From the results, it is observed that the proposed features are efficient in discriminating the following pairs of consonant classes: velar and palatal, palatal and retroflex, and palatal and labial sounds, with accuracies of 92.9%, 93.83% and 94.07%, respectively.
© 2020, Springer Nature Singapore Pte Ltd.

Item Estimation of Tyre Pressure from the Characteristics of the Wheel: An Image Processing Approach (Springer, 2020) Vineeth Reddy, V.B.; Ananda Rao, H.; Yeshwanth, A.; Ramteke, P.B.; Koolagudi, S.G.
Improper tyre pressure is a safety issue that is often ignored by users. However, a drop in tyre pressure can reduce mileage, tyre life, vehicle safety and performance. In this paper, an approach is proposed to measure the tyre pressure from an image of the wheel. The tyre pressure is classified into under pressure and normal pressure using the load index, tyre type, tyre position and the ratio of compressed to uncompressed tyre radius. The efficiency of the features is evaluated using three classifiers, namely Random Forest, AdaBoost and Artificial Neural Networks. It is observed that the ratio of radii plays a major role in classifying the tyres. The proposed system can be used to obtain a rough idea of whether the tyre should be refilled or not. © 2020, Springer Nature Singapore Pte Ltd.

Item Prevention of webshell attack using machine learning techniques (Grenze Scientific Society, 2021) Satish, Y.C.; Naik, P.M.; Rudra, B.
A webshell is a web vulnerability and a security threat to any user or server, which attackers can access to control our system. Attackers may also use our system as a command-and-control device to attack other systems. It is difficult to monitor and identify such threats because attackers always try to attack using different methods and new technologies. However, we can detect webshells with machine learning techniques with better accuracy; all we need is a larger number of samples. In this project, we present a PHP-based webshell detection model. We used different ML algorithms: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM) and K-Nearest Neighbour (KNN).
In addition to the PHP files' standard statistical features, we also added opcode sequences from the PHP files, represented as a TF-IDF vector and a hash vector. Using these features, we trained different machine learning models (SVM, RF, LR, KNN). Among these models, Random Forest gave the best results, with an accuracy of 96.45% and a false-positive rate of 3.5%, which compares well with several popular detection techniques. © Grenze Scientific Society, 2021.

Item Hate Speech and Offensive Content Identification in Hindi and Marathi Language Tweets using Ensemble Techniques (CEUR-WS, 2021) Rajalakshmi, R.; Mattins, F.; Srivarshan, S.; Reddy, L.P.; Anand Kumar, M.
Hate speech is described as any form of speech in which speakers attempt to ridicule, humiliate, or inculcate hatred in someone else’s mind based on characteristics such as religion, skin colour, race, or sexual preference. In recent years, social networking sites have been a major source of excessive amounts of hate speech. If unaddressed, this might cause anxiety and despair in the affected individuals or groups. As a result, these social networks utilize an assortment of algorithms to identify such hate speech. Detecting hate speech in English texts has been one of the hottest topics in recent years, with many studies being published. However, in regional and indigenous languages, hate speech detection is a recent area with little research so far. It is difficult to perform hate speech detection using data in regional languages due to a lack of sufficiently large training data and a lack of resources for that domain. The HASOC [1] 2021 Hate Speech Detection Task solves one of the problems. It provides a dataset containing Tweet data in English, Hindi [2] and Marathi [3] languages. There were two subtasks as part of the main task.
The subtask was to classify the hate speech and offensive texts in the Hindi and Marathi tweet dataset as Hate Speech (HATE), Offensive (OFFN) or Profane (PRF). This work compares the performance of different models on both subtasks and identifies the best performing model. The Random Forest classifier performs best on the first subtask, with macro F1 scores of 75.19% and 73.12% on the Marathi and Hindi tweet datasets, respectively. The XGBoost algorithm performs best on the second subtask, with a 46.5% macro F1 score. Overall, any of these models can give satisfactory results for hate speech detection in regional languages. This work has been submitted to the FIRE2021 shared task, Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC-2021), by team DLRG. © 2021 Copyright for this paper by its authors.

Item A Machine Learning Approach for Daily Temperature Prediction Using Big Data (Springer Science and Business Media Deutschland GmbH, 2022) Divakarla, U.; Chandrasekaran, K.; Hemant Kumar Reddy, K.H.K.; Reddy, R.V.; Rao, M.
Due to global warming, weather forecasting has become a complex problem that is affected by many factors such as temperature, wind speed, humidity, year, month, day, etc. Weather prediction depends on historical data and on the computational power available to analyze it. Weather prediction helps us in many ways, for example in astronomy, agriculture, and predicting tsunamis and droughts. This helps us to be prepared in advance for any kind of disaster. With the rapid development in the computational power of high-end machines and the availability of enormous data, weather prediction is becoming more and more popular. But handling such huge data becomes an issue for real-time prediction. In this paper, we introduced a machine learning-based prediction approach in Hadoop clusters.
The extensive use of the map-reduce function helps us distribute the big data across different clusters, as it is designed to scale up from single servers to thousands of machines, each offering local computation and storage. An ensemble of distributed machine learning algorithms is employed to predict the daily temperature. The experimental results show that the proposed model outperforms the techniques available in the literature. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Item Estimation of Breast Tumor Parameters by Random Forest Method with the Help of Temperature Data on the Surface of the Numerical Breast Model (Springer Science and Business Media Deutschland GmbH, 2023) Venkatapathy, G.; Rahul, V.M.; Gnanasekaran, N.
Breast cancer is the second most frequent cause of cancer-related fatalities in women. When the condition is identified early, better treatment choices are available. Tumors change the blood perfusion rate and metabolic heat production, which produces different temperature patterns on the breast surface. Thermography is an infrared imaging technology for breast cancer screening that records temperature variations. The current study needed a dataset of breast surface temperatures corresponding to the tumor’s diameter and location, but such actual data are not accessible. Thus, the modeling and development of a dataset constitute the first component of the current study. The bio-heat transport equation is solved using COMSOL Multiphysics software, and the model consists of a spherical tumor inside a hemispherical breast model. By changing the sizes and positions of the tumor inside the breast during simulations, a reliable dataset is created. The training and testing of the random forest method on the dataset produced from the simulations make up the second portion of the current study.
Breast skin temperature is used as an input to a random forest machine learning algorithm in the current work to determine the diameter and location of the tumor inside the breast. The diameter and area of the tumor location are estimated by a trained random forest algorithm. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Item Prediction of Pore Solution Concentration in Cement Composite System by Using Machine Learning Techniques (Springer Science and Business Media Deutschland GmbH, 2024) Walke, S.; Sundaramoorthi, S.; Palanisamy, T.
A thorough understanding of the pore solution's composition is crucial for a number of cementitious material properties, including durability. The pore solution concentration is determined by a variety of experimental techniques. However, these approaches aren't always straightforward. Machine learning (ML) models could be a possible substitute for complex pore solution extraction and analysis procedures. The objective of this research is to explore ML techniques for predicting the pore solution composition of cement composite systems produced with Ordinary Portland cement (OPC) and supplemental cementitious materials (SCM). Data on the compositions of pore solutions for different cementitious systems were gathered from the literature and combined into a comprehensive database that has over 400 data entries. Random Forest and Gradient Boosting techniques were applied to the database. Statistical metrics such as R², RMSE and MAE were used to evaluate the prediction accuracy of the built models. Sensitivity analysis of the built models was carried out and compared. The gradient boosting technique was found to be the most effective method for predicting the pore solution concentration (R² ranging from 0.80 to 0.98, with lower RMSE values) due to its effective problem-solving capacity and minimal requirement for feature engineering.
Thus, ML models offer a potential approach for determining the pore solution concentration. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Item An Approach for Integrating Behavioral Analytics and Machine Learning for Enhanced Cybersecurity (Institute of Electrical and Electronics Engineers Inc., 2024) Shivappa, P.K.; Shetty D, P.
Data breaches and cyber threats have evolved into increasingly complex and stealthy forms. Conventional rule-based anomaly detection is ineffective in identifying numerous contemporary attacks. Hence, User Behavior Analysis is performed on network traffic flow data to comprehend, model, and forecast users' actions. Nevertheless, the diversity of the methods makes them exceedingly complex to understand. Therefore, domain experts use machine learning (ML) to accomplish their goals. Thus, this paper aims to suggest an innovative architecture that can detect anomalies in the network traffic flow by analyzing user behavior. Two different datasets are used, one for two-class and one for four-class classification. Both datasets are pre-processed to handle duplicates and missing values and to apply encoding techniques. Correlation analysis is performed to understand the users' behavior before training the ML models. Four ML algorithms, namely logistic regression (LR), KNN, DT, and RF, are applied to the pre-processed datasets. The Random Forest algorithm outperforms the others, achieving 100% accuracy on both two- and four-class classification. The described behavioral modeling approach updates cyber threat detection to match the needs of the modern, ever-changing threat landscape. © 2024 IEEE.
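
The pre-processing steps named in the last abstract (removing duplicates, handling missing values, encoding, and correlation analysis) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the field names (`protocol`, `bytes`, `duration`, `label`), the sample flow records and all helper functions are hypothetical, and the classifier training step is left out.

```python
from math import sqrt

def drop_duplicates(rows):
    """Keep the first occurrence of each record, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def drop_missing(rows):
    """Discard records in which any field is None."""
    return [r for r in rows if all(v is not None for v in r.values())]

def label_encode(rows, column):
    """Replace a categorical column with integer codes (sorted category order)."""
    mapping = {c: i for i, c in enumerate(sorted({r[column] for r in rows}))}
    return [{**r, column: mapping[r[column]]} for r in rows], mapping

def pearson(xs, ys):
    """Pearson correlation coefficient; assumes both inputs have non-zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical network-flow records (label 1 = anomalous), invented for illustration.
flows = [
    {"protocol": "tcp", "bytes": 1200, "duration": 3.0, "label": 0},
    {"protocol": "tcp", "bytes": 1200, "duration": 3.0, "label": 0},   # duplicate
    {"protocol": "udp", "bytes": None, "duration": 1.0, "label": 0},   # missing value
    {"protocol": "udp", "bytes": 400, "duration": 1.0, "label": 0},
    {"protocol": "icmp", "bytes": 9000, "duration": 22.5, "label": 1},
    {"protocol": "tcp", "bytes": 8000, "duration": 20.0, "label": 1},
]

clean = drop_missing(drop_duplicates(flows))
clean, protocol_codes = label_encode(clean, "protocol")

# Correlation between one feature and the class label.
r = pearson([f["bytes"] for f in clean], [f["label"] for f in clean])
print(len(clean), protocol_codes, round(r, 3))
```

In this toy data, `bytes` correlates strongly with the label, which is the kind of signal a correlation analysis would surface before model training. On real traffic data the picture is rarely this clean.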
