The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Traditional wavelet-based speech enhancement algorithms are ineffective in highly non-stationary noise because the local noise spectrum is difficult to estimate accurately. This paper proposes a simple noise estimation method that uses a voice activity detector (VAD). Adapting the enhancement to the VAD decision improves the output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts. The noisy speech is first preprocessed with a bark-scale wavelet packet decomposition (BSWPD), which converts the noisy signal into wavelet coefficients (WCs). A VAD based on a bark-scale spectral entropy parameter, called BS-Entropy, is found to be superior to energy-based approaches, especially when the noise level varies. The wavelet coefficient threshold (WCT) of each subband is then adjusted over time according to the VAD result. In a speech-dominated frame, the speech is categorized as either voiced or unvoiced. A voiced frame has a strong tone-like spectrum in the lower subbands, so the lower-band WCs must be preserved. Conversely, the WCT tends to increase in the lower bands when the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be removed almost completely by increasing the WCT. The proposed system is evaluated with objective and subjective experiments, which show that the algorithm is effective under various noise conditions, especially colored noise and non-stationary noise.
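The idea of steering the wavelet threshold by a spectral-entropy VAD can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the decision rule (low entropy means speech), the parameters `entropy_limit` and `noise_gain`, and all function names are assumptions made for illustration. The subbands are taken as already-computed lists of wavelet coefficients, and the voiced/unvoiced adjustment of the lower bands is left out.

```python
import math

def spectral_entropy(band_energies):
    """Shannon entropy (in nats) of the normalized subband energies."""
    total = sum(band_energies)
    if total <= 0:
        return 0.0
    probs = [e / total for e in band_energies]
    return -sum(p * math.log(p) for p in probs if p > 0)

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (soft thresholding)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]

def denoise_frame(subbands, base_threshold, entropy_limit, noise_gain=2.0):
    """Threshold one frame of subband coefficients.

    Assumed rule: a peaky (low-entropy) energy distribution is treated as
    speech-dominated and keeps the base threshold; a flat (high-entropy)
    distribution is treated as noise-dominated and the threshold is raised
    by the hypothetical factor `noise_gain`.
    """
    energies = [sum(c * c for c in band) for band in subbands]
    is_speech = spectral_entropy(energies) < entropy_limit
    scale = 1.0 if is_speech else noise_gain
    cleaned = [soft_threshold(band, base_threshold * scale) for band in subbands]
    return is_speech, cleaned
```

In this sketch a frame whose energy is concentrated in one subband keeps most of its coefficients, while a frame with evenly spread energy is thresholded more aggressively, which mirrors the behaviour the abstract describes for speech-dominated and noise-dominated frames.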
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Kun-Ching WANG, "An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment" in IEICE TRANSACTIONS on Information,
vol. E93-D, no. 2, pp. 341-349, February 2010, doi: 10.1587/transinf.E93.D.341.
Abstract: Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulties in the accurate estimation of the local noise spectrum. In this paper, a simple method of noise estimation employing the use of a voice activity detector is proposed. We can improve the output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts according to the results of VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition ( BSWPD ) to convert a noisy signal into wavelet coefficients (WCs). It is found that the VAD using bark-scale spectral entropy, called as BS-Entropy, parameter is superior to other energy-based approach especially in variable noise-level. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the result of VAD approach. In a speech-dominated frame, the speech is categorized into either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in lower subbands, so that the WCs of lower-band must be reserved. On the contrary, the WCT tends to increase in lower-band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. The objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid on various noise conditions, especially for color noise and non-stationary noise conditions.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.E93.D.341/_p
@ARTICLE{e93-d_2_341,
author={Kun-Ching WANG},
journal={IEICE TRANSACTIONS on Information},
title={An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment},
year={2010},
volume={E93-D},
number={2},
pages={341--349},
abstract={Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulties in the accurate estimation of the local noise spectrum. In this paper, a simple method of noise estimation employing the use of a voice activity detector is proposed. We can improve the output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts according to the results of VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition ( BSWPD ) to convert a noisy signal into wavelet coefficients (WCs). It is found that the VAD using bark-scale spectral entropy, called as BS-Entropy, parameter is superior to other energy-based approach especially in variable noise-level. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the result of VAD approach. In a speech-dominated frame, the speech is categorized into either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in lower subbands, so that the WCs of lower-band must be reserved. On the contrary, the WCT tends to increase in lower-band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. The objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid on various noise conditions, especially for color noise and non-stationary noise conditions.},
keywords={},
doi={10.1587/transinf.E93.D.341},
ISSN={1745-1361},
month={February},}
TY - JOUR
TI - An Adaptive Wavelet-Based Denoising Algorithm for Enhancing Speech in Non-stationary Noise Environment
T2 - IEICE TRANSACTIONS on Information
SP - 341
EP - 349
AU - Kun-Ching WANG
PY - 2010
DO - 10.1587/transinf.E93.D.341
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E93-D
IS - 2
JA - IEICE TRANSACTIONS on Information
Y1 - February 2010
AB - Traditional wavelet-based speech enhancement algorithms are ineffective in the presence of highly non-stationary noise because of the difficulties in the accurate estimation of the local noise spectrum. In this paper, a simple method of noise estimation employing the use of a voice activity detector is proposed. We can improve the output of a wavelet-based speech enhancement algorithm in the presence of random noise bursts according to the results of VAD decision. The noisy speech is first preprocessed using bark-scale wavelet packet decomposition ( BSWPD ) to convert a noisy signal into wavelet coefficients (WCs). It is found that the VAD using bark-scale spectral entropy, called as BS-Entropy, parameter is superior to other energy-based approach especially in variable noise-level. The wavelet coefficient threshold (WCT) of each subband is then temporally adjusted according to the result of VAD approach. In a speech-dominated frame, the speech is categorized into either a voiced frame or an unvoiced frame. A voiced frame possesses a strong tone-like spectrum in lower subbands, so that the WCs of lower-band must be reserved. On the contrary, the WCT tends to increase in lower-band if the speech is categorized as unvoiced. In a noise-dominated frame, the background noise can be almost completely removed by increasing the WCT. The objective and subjective experimental results are then used to evaluate the proposed system. The experiments show that this algorithm is valid on various noise conditions, especially for color noise and non-stationary noise conditions.
ER -