Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network
Date
2022
Publisher
Springer Science and Business Media Deutschland GmbH
Abstract
Acoustic Scene Classification (ASC) is the task of identifying a scene from sound cues and assigning a label to it. Over the past two years, the datasets released for ASC have included audio samples recorded with multiple devices, bringing the problem closer to real-world scenarios. We therefore aim to develop a device-robust ASC model for audio samples recorded with three different devices. The dataset considered is DCASE 2019 ASC Task 1a, which contains recordings from the primary recording device (Device A) and two mobile devices (Devices B and C). This work introduces the Adaptive Noise Reduction (ANR) technique to reduce the device distortion present in the audio samples from Devices B and C. Spectrograms are extracted from all audio samples and normalized to remove bias in the input signal. The normalized features are fed to a lightweight Convolutional Recurrent Attention Neural Network (CRANN) to perform ASC. The key contributions of this work are the reduction of device distortion in mismatched devices and the introduction of an attention layer into the Convolutional Recurrent Neural Network. The results achieved with the proposed method show a considerable improvement in ASC accuracy on the mismatched devices. © 2022, Springer Nature Switzerland AG.
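The feature steps named in the abstract (spectrogram extraction, normalization, and an attention layer over time) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, per-frequency standardization, and the single-vector attention scoring are all assumptions chosen for illustration, the function names are hypothetical, and the ANR step and the convolutional and recurrent layers are not reproduced here.

```python
import numpy as np


def log_spectrogram(signal, frame_len=1024, hop=512, eps=1e-10):
    """Log-magnitude STFT, shape (num_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(num_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)


def normalize(spec, eps=1e-8):
    """Standardize each frequency bin to zero mean and unit variance.

    Assumed normalization scheme; the abstract does not specify one.
    """
    mean = spec.mean(axis=0, keepdims=True)
    std = spec.std(axis=0, keepdims=True)
    return (spec - mean) / (std + eps)


def attention_pool(features, w):
    """Softmax attention over time: features (T, D), w (D,) -> pooled (D,).

    In a trained network, w would be learned; here it is just an input.
    """
    scores = features @ w
    scores -= scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()
    return weights @ features, weights


# Usage with a synthetic one-second signal at 16 kHz (placeholder data).
rng = np.random.default_rng(0)
audio = rng.standard_normal(16000)
spec = normalize(log_spectrogram(audio))
pooled, weights = attention_pool(spec, rng.standard_normal(spec.shape[1]))
```

In this sketch the attention weights form a distribution over time frames, so frames that score higher contribute more to the pooled vector that a classifier would consume.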
Keywords
Adaptive noise reduction, Device distortion, Device robust acoustic scene classification (ASC), Light weight convolutional recurrent attention neural network
Citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022, Vol. 13721 LNAI, pp. 688-699
