Speech enhancement using multiple deep neural networks

dc.contributor.author: Karjol, P.
dc.contributor.author: Kumar, M.A.
dc.contributor.author: Ghosh, P.K.
dc.date.accessioned: 2026-02-06T06:37:57Z
dc.date.issued: 2018
dc.description.abstract: In this work, we present a variant of the multiple deep neural network (DNN) based speech enhancement method. We directly estimate the clean speech spectrum as a weighted average of the outputs of multiple DNNs. The weights are provided by a gating network. The multiple DNNs and the gating network are trained jointly. The objective function is the mean square logarithmic error between the target clean spectrum and the estimated spectrum. We conduct experiments with two and four DNNs on the TIMIT corpus, with nine noise types (four seen and five unseen) taken from the AURORA database at four different signal-to-noise ratios (SNRs). We also compare the proposed method with a single-DNN speech enhancement scheme and with existing multiple-DNN schemes, using segmental SNR, perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) as the evaluation metrics. These comparisons show that the proposed method outperforms the baseline schemes in both seen and unseen noises. Specifically, we observe absolute improvements of 0.07 and 0.04 in PESQ over the single DNN for the seen and unseen noise cases respectively, averaged over all noises and SNRs. © 2018 IEEE.
dc.identifier.citation: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2018, Vol. 2018-April, p. 5049-5052
dc.identifier.issn: 07367791; 15206149
dc.identifier.uri: https://doi.org/10.1109/ICASSP.2018.8462649
dc.identifier.uri: https://idr.nitk.ac.in/handle/123456789/31335
dc.publisher: Institute of Electrical and Electronics Engineers Inc.
dc.subject: Deep neural networks
dc.subject: Gating network
dc.subject: Speech enhancement
dc.title: Speech enhancement using multiple deep neural networks
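
The combination step described in the abstract (a gating network weighting the outputs of several DNNs, scored by a mean square logarithmic error) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The record does not specify the details, so the following are all assumptions made for illustration: the softmax normalisation of the gate, the per-frame weighting, the array shapes, the `eps` floor, and the function names `combine_experts` and `mean_square_log_error`. Random arrays stand in for the DNN and gating-network outputs.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def combine_experts(expert_outputs, gate_logits):
    """Weighted average of K spectrum estimates.

    expert_outputs: (K, T, F) array, one spectrum estimate per DNN.
    gate_logits:    (T, K) array, unnormalised gating scores per frame.
    Returns a (T, F) array. Softmax gating is an assumption here.
    """
    weights = softmax(gate_logits, axis=-1)  # (T, K), each row sums to 1
    return np.einsum("tk,ktf->tf", weights, expert_outputs)


def mean_square_log_error(target, estimate, eps=1e-8):
    """Mean squared difference of log spectra (one reading of the objective)."""
    diff = np.log(target + eps) - np.log(estimate + eps)
    return float(np.mean(diff ** 2))


# Placeholder data: 2 DNNs, 5 frames, 129 frequency bins (hypothetical sizes).
rng = np.random.default_rng(0)
K, T, F = 2, 5, 129
expert_outputs = rng.uniform(0.1, 1.0, size=(K, T, F))
gate_logits = rng.normal(size=(T, K))
clean = rng.uniform(0.1, 1.0, size=(T, F))

estimate = combine_experts(expert_outputs, gate_logits)
loss = mean_square_log_error(clean, estimate)
```

Because the gate weights are non-negative and sum to one for each frame, every bin of the estimate lies between the smallest and largest value proposed by the individual DNNs. In joint training, the gradient of the loss would flow into both the DNNs and the gating network; that step is omitted from this sketch.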
