Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis

Hu, Yonggang; Xiongwei, Zhang; Zou, Xia; Sun, Meng; Zheng, Yunfei; Gang, Min

dc.contributor.author	Hu, Yonggang
dc.contributor.author	Xiongwei, Zhang
dc.contributor.author	Zou, Xia
dc.contributor.author	Sun, Meng
dc.contributor.author	Zheng, Yunfei
dc.contributor.author	Gang, Min
dc.date.accessioned	2024-05-15T23:23:26Z
dc.date.issued	2017
dc.date.updated	2023-01-15T07:17:52Z
dc.description.abstract	Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. Supervised NMF-based speech enhancement is accomplished by iterative updates that use prior knowledge of the clean speech and noise spectral bases. However, in many real-world scenarios, prior training is not always possible. The traditional semi-supervised NMF (SNMF) variant overcomes this shortcoming, but its performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm that combines NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained offline from training samples chosen from a public dataset. The noise samples used for training the noise bases, instead of being characterized a priori as usual, are obtained on the fly via the RPCA algorithm. This letter also studies whether the duration of the estimated noise samples affects the performance of the algorithm. Three metrics, PESQ, SDR and SNR, are used to evaluate the algorithms in experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm.	en_AU
dc.description.sponsorship	This work is partially supported by NSF of China (Grant No. 61471394, 61402519) and NSF of Jiangsu Province (Grant No. BK2012510, BK20140071, BK20140074).	en_AU
dc.format.mimetype	application/pdf	en_AU
dc.identifier.issn	0916-8508	en_AU
dc.identifier.uri	http://hdl.handle.net/1885/317534
dc.language.iso	en_AU	en_AU
dc.publisher	J-STAGE	en_AU
dc.rights	© 2017 The Institute of Electronics, Information and Communication Engineers	en_AU
dc.source	IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences	en_AU
dc.title	Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis	en_AU
dc.type	Journal article	en_AU
local.bibliographicCitation.issue	8	en_AU
local.bibliographicCitation.lastpage	1719	en_AU
local.bibliographicCitation.startpage	1714	en_AU
local.contributor.affiliation	Hu, Yonggang, College of Engineering, Computing and Cybernetics, ANU	en_AU
local.contributor.affiliation	Xiongwei, Zhang, PLA University of Science and Technology	en_AU
local.contributor.affiliation	Zou, Xia, The Army Engineering University of PLA	en_AU
local.contributor.affiliation	Sun, Meng, PLA University of Science and Technology	en_AU
local.contributor.affiliation	Zheng, Yunfei, PLA University of Science and Technology	en_AU
local.contributor.affiliation	Gang, Min, XI’AN Communications Institute	en_AU
local.contributor.authoremail	repository.admin@anu.edu.au	en_AU
local.contributor.authoruid	Hu, Yonggang, u6014346	en_AU
local.description.embargo	2099-12-31
local.description.notes	Imported from ARIES	en_AU
local.identifier.absfor	400607 - Signal processing	en_AU
local.identifier.absfor	460302 - Audio processing	en_AU
local.identifier.ariespublication	u4485658xPUB785	en_AU
local.identifier.citationvolume	E100A	en_AU
local.identifier.doi	10.1587/transfun.E100.A.1714	en_AU
local.identifier.scopusID	2-s2.0-85026627051
local.identifier.thomsonID	WOS:000406923700011
local.identifier.uidSubmittedBy	u4485658	en_AU
local.publisher.url	https://www.jstage.jst.go.jp/	en_AU
local.type.status	Published Version	en_AU
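
Illustrative sketch (not taken from the paper): the abstract describes semi-supervised NMF in which speech bases are fixed and noise bases are learned. The Python/NumPy code below shows one common form of that idea, using Lee-Seung multiplicative updates for the Euclidean cost and a Wiener-style soft mask. The function names, the cost function, the mask, and the toy data are all assumptions made for illustration. The paper's RPCA-based noise estimation is not reproduced here; in this sketch the noise bases are simply learned from the noisy input itself.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=200, seed=0):
    """Factor V ~= [W_speech, W_noise] @ [H_speech; H_noise].

    W_speech stays fixed; W_noise and all activations are learned with
    multiplicative updates for the Euclidean cost (an assumed choice).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    n_speech = W_speech.shape[1]
    W_noise = rng.random((n_freq, n_noise)) + EPS
    H = rng.random((n_speech + n_noise, n_frames)) + EPS

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # Update every activation row.
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        # Update only the noise columns of the dictionary.
        H_noise = H[n_speech:]
        W_noise *= (V @ H_noise.T) / (W @ H @ H_noise.T + EPS)

    return W_noise, H[:n_speech], H[n_speech:]


def enhance(V, W_speech, W_noise, H_speech, H_noise):
    """Apply a Wiener-style soft mask to the noisy magnitude spectrogram."""
    speech = W_speech @ H_speech
    noise = W_noise @ H_noise
    return V * speech / (speech + noise + EPS)


# Toy nonnegative matrices standing in for magnitude spectrograms (not real audio).
rng = np.random.default_rng(1)
W_speech = rng.random((32, 6))  # pretend these bases were trained offline
true_noise = rng.random((32, 2))
V = W_speech @ rng.random((6, 50)) + true_noise @ rng.random((2, 50))

W_noise, H_s, H_n = semi_supervised_nmf(V, W_speech, n_noise=2)
S_hat = enhance(V, W_speech, W_noise, H_s, H_n)
rel_err = np.linalg.norm(V - (W_speech @ H_s + W_noise @ H_n)) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3f}")
```

In a real pipeline V would be the magnitude of a short-time Fourier transform of the noisy signal, and the enhanced magnitude would be recombined with the noisy phase before inversion; those steps are omitted here.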

Downloads

Original bundle

Name: Semi-Supervised Speech Enhancement Combining Nonnegative.pdf
Size: 470.6 KB
Format: Adobe Portable Document Format

Collections

ANU Research Publications