Vocal and Non-vocal Segmentation based on the Analysis of Formant Structure
Date
2018
Authors
Murthy, Y.V.S.
Koolagudi, S.G.
Swaroop, V.G.
Abstract
Classifying the vocal and non-vocal regions of an audio clip is the basis for many Music Information Retrieval (MIR) tasks. In this work, we compute novel features based on formant structure for segmenting the vocal and non-vocal regions of a given music clip. After thorough analysis, features such as the obtuse angles at formant peaks, valley locations, convexity, and concavity are proposed for this task. The obtuse angles are computed for the second, third, and fourth formants, since the first formant offers little discrimination. The computed formant-related features are appended to the baseline Mel-frequency cepstral coefficients (MFCCs) to improve performance. The singer's formant (F5) is also computed, yielding a 19-dimensional feature vector. As artificial neural networks (ANNs) are well suited to nonlinear data, an ANN is used as the classifier. Further, an 11-point moving window is applied to remove intermittent misclassifications. The proposed approach achieves an accuracy of 88% with the 19-dimensional feature vector. © 2017 IEEE.
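The two most code-like steps in the abstract are the angle at a spectral peak and the 11-point moving-window smoothing of frame labels. The sketch below illustrates both under stated assumptions: the angle is taken between the line segments joining a peak to its neighboring valleys (the paper's exact formulation and frequency/magnitude scaling are not given here), and the smoothing is a majority vote over an 11-frame window — `angle_at_peak` and `smooth_labels` are hypothetical names, not the authors' code.

```python
import numpy as np

def angle_at_peak(freqs, mags, peak, left_valley, right_valley):
    """Angle (degrees) at a spectral peak, formed by the segments to its
    neighboring valleys. Hypothetical formulation: the paper's exact
    computation and axis scaling may differ."""
    p = np.array([freqs[peak], mags[peak]], dtype=float)
    a = np.array([freqs[left_valley], mags[left_valley]]) - p
    b = np.array([freqs[right_valley], mags[right_valley]]) - p
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def smooth_labels(labels, window=11):
    """Majority-vote smoothing of frame-wise vocal (1) / non-vocal (0)
    labels over a sliding window, removing brief spurious flips."""
    labels = np.asarray(labels)
    half = window // 2
    padded = np.pad(labels, half, mode="edge")  # repeat edge frames
    smoothed = np.empty_like(labels)
    for i in range(len(labels)):
        win = padded[i:i + window]
        smoothed[i] = 1 if 2 * int(win.sum()) > window else 0
    return smoothed

# A lone misclassified frame inside a vocal run is corrected:
frames = [1] * 6 + [0] + [1] * 6
print(smooth_labels(frames).tolist())  # → [1]*13
```

A sharp symmetric peak (steep slopes on both sides) yields an angle near 90°, while a shallow peak yields an obtuse angle, which is the discriminative quantity the abstract refers to.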
Citation
2017 9th International Conference on Advances in Pattern Recognition (ICAPR 2017), 2018, pp. 304-309