Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis
dc.contributor.author | Hu, Yonggang | |
dc.contributor.author | Xiongwei, Zhang | |
dc.contributor.author | Zou, Xia | |
dc.contributor.author | Sun, Meng | |
dc.contributor.author | Zheng, Yunfei | |
dc.contributor.author | Gang, Min | |
dc.date.accessioned | 2024-05-15T23:23:26Z | |
dc.date.issued | 2017 | |
dc.date.updated | 2023-01-15T07:17:52Z | |
dc.description.abstract | Nonnegative matrix factorization (NMF) is one of the most popular machine learning tools for speech enhancement. Supervised NMF-based speech enhancement is accomplished by iterative updates that use prior knowledge of the clean speech and noise spectral bases. However, in many real-world scenarios, prior training is not always possible. The traditional semi-supervised NMF (SNMF) overcomes this shortcoming, but its performance degrades. In this letter, without any prior knowledge of the speech and noise, we present an improved semi-supervised NMF-based speech enhancement algorithm that combines NMF and robust principal component analysis (RPCA). In this approach, fixed speech bases are obtained offline from training samples chosen from a public dataset. The noise samples used for training the noise bases, instead of being characterized a priori as usual, are obtained on the fly via the RPCA algorithm. This letter also studies whether the time length of the estimated noise samples affects the performance of the algorithm. Three metrics, PESQ, SDR, and SNR, are used to evaluate the algorithms in experiments on TIMIT with 20 noise types at various signal-to-noise ratio levels. Extensive experimental results demonstrate the superiority of the proposed algorithm over the competing speech enhancement algorithm. | en_AU |
dc.description.sponsorship | This work is partially supported by NSF of China (Grant No. 61471394, 61402519) and NSF of Jiangsu Province (Grant No. BK2012510, BK20140071, BK20140074). | en_AU |
dc.format.mimetype | application/pdf | en_AU |
dc.identifier.issn | 0916-8508 | en_AU |
dc.identifier.uri | http://hdl.handle.net/1885/317534 | |
dc.language.iso | en_AU | en_AU |
dc.publisher | J-STAGE | en_AU |
dc.rights | © 2017 The Institute of Electronics, Information and Communication Engineers | en_AU |
dc.source | IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | en_AU |
dc.title | Semi-Supervised Speech Enhancement Combining Nonnegative Matrix Factorization and Robust Principal Component Analysis | en_AU |
dc.type | Journal article | en_AU |
local.bibliographicCitation.issue | 8 | en_AU |
local.bibliographicCitation.lastpage | 1719 | en_AU |
local.bibliographicCitation.startpage | 1714 | en_AU |
local.contributor.affiliation | Hu, Yonggang, College of Engineering, Computing and Cybernetics, ANU | en_AU |
local.contributor.affiliation | Xiongwei, Zhang, PLA University of Science and Technology | en_AU |
local.contributor.affiliation | Zou, Xia, The Army Engineering University of PLA | en_AU |
local.contributor.affiliation | Sun, Meng, PLA University of Science and Technology | en_AU |
local.contributor.affiliation | Zheng, Yunfei, PLA University of Science and Technology | en_AU |
local.contributor.affiliation | Gang, Min, XI’AN Communications Institute | en_AU |
local.contributor.authoremail | repository.admin@anu.edu.au | en_AU |
local.contributor.authoruid | Hu, Yonggang, u6014346 | en_AU |
local.description.embargo | 2099-12-31 | |
local.description.notes | Imported from ARIES | en_AU |
local.identifier.absfor | 400607 - Signal processing | en_AU |
local.identifier.absfor | 460302 - Audio processing | en_AU |
local.identifier.ariespublication | u4485658xPUB785 | en_AU |
local.identifier.citationvolume | E100A | en_AU |
local.identifier.doi | 10.1587/transfun.E100.A.1714 | en_AU |
local.identifier.scopusID | 2-s2.0-85026627051 | |
local.identifier.thomsonID | WOS:000406923700011 | |
local.identifier.uidSubmittedBy | u4485658 | en_AU |
local.publisher.url | https://www.jstage.jst.go.jp/ | en_AU |
local.type.status | Published Version | en_AU |
Downloads
Original bundle
- Name:
- Semi-Supervised Speech Enhancement Combining Nonnegative.pdf
- Size:
- 470.6 KB
- Format:
- Adobe Portable Document Format
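The abstract in this record describes enhancement with fixed speech bases and noise bases trained from estimated noise samples. The following is a minimal sketch of that NMF stage only, under several assumptions that are not stated in the record: Euclidean-cost multiplicative updates, magnitude spectrograms as inputs, and a Wiener-style mask for reconstruction. The function names `train_bases` and `enhance` are hypothetical, and the RPCA step that would supply the noise samples is not implemented here; a noise spectrogram is simply taken as given.

```python
import numpy as np

EPS = 1e-10  # guards against division by zero


def train_bases(V, rank, n_iter=200, seed=0):
    """Learn nonnegative bases W (freq x rank) from a magnitude spectrogram V.

    Assumption: Euclidean-cost multiplicative updates; the record does not
    state which cost function the paper uses.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W


def enhance(V_noisy, W_speech, W_noise, n_iter=200, seed=0):
    """Estimate the speech magnitude with both basis sets held fixed.

    Only the activations H are updated. W_noise is assumed to have been
    trained on noise estimated elsewhere (in the paper, via RPCA).
    """
    W = np.hstack([W_speech, W_noise])
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V_noisy) / (W.T @ W @ H + EPS)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # speech part of the model
    N = W_noise @ H[k:]    # noise part of the model
    mask = S / (S + N + EPS)  # assumed Wiener-style soft mask
    return mask * V_noisy
```

In use, `train_bases` would be called once offline on clean speech and once per utterance on the estimated noise, and `enhance` would then be applied to the noisy magnitude spectrogram; the noisy phase would typically be reused for resynthesis.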