Authors: Imbwaga, J.L.; Chittaragi, N.B.; Koolagudi, S.G.
Date: 2026-02-04
Year: 2024
Citation: International Journal of Speech Technology, 2024, 27, 2, pp. 447-469
ISSN: 13812416
DOI: https://doi.org/10.1007/s10772-024-10116-6
Handle: https://idr.nitk.ac.in/handle/123456789/21069

Abstract: Even though every individual is entitled to freedom of speech, limitations apply when that freedom is used to target and harm another individual or a group of people, at which point it becomes hate speech. This study addresses the detection of hate speech in English and Kiswahili audio. The dataset was collected manually from YouTube videos, which were then converted to audio. Audio-based features, namely spectral, temporal, prosodic, and excitation source features, were extracted and used to train various machine learning classifiers. Initial experiments were conducted for English and later for Kiswahili. The literature, however, shows comparatively little research on Kiswahili. The accuracy, recall, precision, AUC, and F1 scores for hate speech detection suggest that the Random Forest classifier performed better for English, while the Extreme Gradient Boosting classifier performed better for Kiswahili. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Audio acoustics; Learning algorithms; Speech recognition; Audio classification; English languages; Hate speech; Kiswahili; Machine learning algorithms; Machine-learning; Prosodic features; Spectral feature; Speech detection; YouTube; Machine learning

Title: Automatic hate speech detection in audio using machine learning algorithms
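
The abstract compares classifiers by accuracy, recall, precision, AUC, and F1 score. As a minimal sketch of how these five measures are commonly computed for a binary task (1 = hate speech, 0 = not), the Python below uses only the standard library. The function names `binary_metrics` and `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the paper, and the record does not say which implementation the authors used.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not results from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Precision and recall are reported alongside accuracy because hate speech datasets are often imbalanced, and AUC is computed from classifier scores rather than hard predictions, which is why `roc_auc` takes a separate `scores` argument.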