Rare Sound Event Detection Using Multi-resolution Cochleagram Features and CRNN with Attention Mechanism

Date

2025

Publisher

Birkhäuser

Abstract

Acoustic event detection (AED), also called sound event detection (SED), is the task of automatically detecting acoustic events in an audio recording together with their onset and offset times. Rare AED, which aims to detect infrequent but significant sound events in an audio signal, is a particularly challenging case. Traditional SED methods often struggle to detect rare sound events accurately because the events occur infrequently and have diverse characteristics. This paper introduces a novel feature, the multi-resolution cochleagram (MRCG), for rare SED tasks. Cochleagrams at several resolutions are extracted from the audio recording and stacked to form the MRCG feature vector. The equivalent rectangular bandwidth (ERB) scale used in the cochleagram simulates the human auditory filter. The classifier is a convolutional recurrent neural network (CRNN) with an embedded attention module. The method is evaluated on the DCASE 2017 Task 2 dataset for rare sound event detection. Results show that combining MRCG features with the attention-based CRNN improves detection performance. The proposed method achieves an average error rate of 0.11 and an average F1 score of 94.3%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
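The stacking idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes the commonly used Glasberg and Moore ERB-rate formula for spacing channel center frequencies, and it assumes that coarser resolutions are obtained by mean-smoothing a base cochleagram over square time-frequency windows. The function names, the channel count, the frequency range, and the window sizes are all hypothetical choices made for illustration, and the base cochleagram here is random placeholder data rather than the output of a gammatone filterbank.

```python
import numpy as np


def hz_to_erb_rate(f_hz):
    """ERB-rate scale (Glasberg & Moore): number of ERBs below f_hz."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437


def erb_center_frequencies(f_min, f_max, n_channels):
    """Center frequencies spaced uniformly on the ERB-rate scale."""
    erbs = np.linspace(hz_to_erb_rate(f_min), hz_to_erb_rate(f_max), n_channels)
    return erb_rate_to_hz(erbs)


def mean_smooth(tf, half_width):
    """Average each time-frequency unit over a square window.

    The window has side 2 * half_width + 1 and the input is zero-padded
    at the edges, so the output has the same shape as the input.
    """
    k = 2 * half_width + 1
    padded = np.pad(tf, half_width, mode="constant")
    # Summed-area table with a leading row and column of zeros.
    sat = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    sat = np.pad(sat, ((1, 0), (1, 0)), mode="constant")
    box = sat[k:, k:] - sat[:-k, k:] - sat[k:, :-k] + sat[:-k, :-k]
    return box / (k * k)


def stack_resolutions(cochleagrams):
    """Stack frame-aligned (channels, frames) arrays along the channel axis."""
    n_frames = cochleagrams[0].shape[1]
    if any(c.shape[1] != n_frames for c in cochleagrams):
        raise ValueError("all cochleagrams must have the same number of frames")
    return np.concatenate(cochleagrams, axis=0)


# Placeholder base cochleagram: 64 channels by 100 frames of random energies.
centers = erb_center_frequencies(50.0, 8000.0, 64)  # assumed range and count
base = np.random.default_rng(0).random((64, 100))

# Assumed coarser resolutions: 11 x 11 and 23 x 23 mean-smoothed copies.
mrcg = stack_resolutions([base, mean_smooth(base, 5), mean_smooth(base, 11)])
```

With three resolutions of 64 channels each, `mrcg` has shape `(192, 100)`: one stacked feature vector of length 192 per frame, which is the kind of input a CRNN would consume frame by frame.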

Keywords

Acoustic measuring instruments, Audio acoustics, Audio recordings, Audition, Deep neural networks, Feature extraction, Gears, Acoustic event detections, Cochleagram, DCASE, Deep learning, Multi-resolution cochleagram, Neural-networks, Rare sound event detection, Sound event detection, Time-frequency Analysis, Convolution

Citation

Circuits, Systems, and Signal Processing, 2025
