Polyphonic sound event detection using transposed convolutional recurrent neural network

Date

2020

Authors

Chatterjee C.C.
Mulimani M.
Koolagudi S.G.

Abstract

In this paper, we propose a Transposed Convolutional Recurrent Neural Network (TCRNN) architecture for polyphonic sound event recognition. A transposed convolution layer, which carries out a regular convolution operation but reverts the spatial transformation, is combined with a bidirectional Recurrent Neural Network (RNN) to form the TCRNN. Instead of the traditional mel spectrogram features, the proposed methodology incorporates mel-IFgram (Instantaneous Frequency spectrogram) features. The performance of the proposed approach is evaluated on sound events from the publicly available TUT-SED 2016 dataset and the Joint sound scene and polyphonic sound event recognition dataset. Results show that the proposed approach outperforms state-of-the-art methods. © 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
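
The statement that a transposed convolution "reverts the spatial transformation" of a regular convolution can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names `conv1d` and `conv_transpose1d`, the kernel, and the stride are assumptions chosen for illustration only. A strided convolution shortens the input, and a transposed convolution with the same kernel size and stride restores the original length (the shape is recovered, not the original values).

```python
def conv1d(x, kernel, stride=1):
    """Valid (unpadded) 1-D convolution, written as cross-correlation."""
    k = len(kernel)
    out_len = (len(x) - k) // stride + 1
    return [
        sum(x[i * stride + j] * kernel[j] for j in range(k))
        for i in range(out_len)
    ]


def conv_transpose1d(y, kernel, stride=1):
    """Transposed 1-D convolution: each input value scatters a scaled
    copy of the kernel into the output at offset i * stride."""
    k = len(kernel)
    out_len = (len(y) - 1) * stride + k
    out = [0.0] * out_len
    for i, value in enumerate(y):
        for j in range(k):
            out[i * stride + j] += value * kernel[j]
    return out


# Hypothetical toy input; a real model would operate on feature maps.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
kernel = [1.0, 0.0, -1.0]

down = conv1d(signal, kernel, stride=2)        # length 7 -> 3
up = conv_transpose1d(down, kernel, stride=2)  # length 3 -> 7

print(len(signal), len(down), len(up))
```

In a deep learning framework the same length relationship holds for the built-in transposed convolution layers, where the kernel weights are learned rather than fixed.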

Citation

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, Vol. 2020-May, p. 661-665
