Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network

Venkatesh, S.; Koolagudi, S.G.

dc.contributor.author	Venkatesh, S.
dc.contributor.author	Koolagudi, S.G.
dc.date.accessioned	2026-02-06T06:34:01Z
dc.date.issued	2024
dc.description.abstract	In this paper, a novel approach to performing polyphonic Sound Event Detection (SED) is presented. A new deep learning architecture named “Modified Recurrent Temporal Pyramid Neural Network (MR-TPNN)” is introduced. The input features fed to the network are spectrograms generated from Constant Q-Transform (CQT). CQT spectrograms provided better sound event information in the audio recording than the Short Time Fourier Transform (STFT) and Fast Fourier Transform (FFT) methods. The temporal information is an essential factor for detecting the onset and offset of events in an audio recording. Capturing the temporal information is ensured by fusing Temporal pyramids and Bi-directional long short-term memory (LSTM) recurrent layers in deep learning architecture. Extensive experiments are carried out on three benchmark datasets, and the results of the proposed method are superior to those of the existing polyphonic SED systems. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
dc.identifier.citation	Communications in Computer and Information Science, 2024, Vol. 2009 CCIS, p. 554-564
dc.identifier.issn	18650929
dc.identifier.uri	https://doi.org/10.1007/978-3-031-58181-6_47
dc.identifier.uri	https://idr.nitk.ac.in/handle/123456789/29009
dc.publisher	Springer Science and Business Media Deutschland GmbH
dc.subject	Constant Q-Transform (CQT)
dc.subject	Deep learning
dc.subject	Modified Recurrent Temporal Pyramid Network
dc.subject	Polyphonic Sound Event Detection (SED)
dc.title	Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network

Collections

Conference Papers
