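Summary:
To improve the accuracy of voice activity detection at low signal-to-noise ratios, an endpoint detection algorithm based on cepstral features and spectral entropy is proposed. First, the cepstral feature vector of the speech frame under test is obtained by analysis of the frame. The likelihoods of this feature vector under Gaussian mixture models of speech and of noise, both obtained by training, are then computed, and a preliminary speech/non-speech decision is made by comparing the two likelihoods. This decision is combined with the result of energy-entropy endpoint detection to give the final decision, and a hangover mechanism is finally applied to protect the speech as far as possible. Experimental results show that the method remedies the weakness of energy-entropy endpoint detection in babble noise and outperforms G.729 Annex B in all of the noise environments tested.
Keywords: speech signal processing; voice activity detection; spectral entropy; linear prediction coefficients; cepstral coefficients; Gaussian mixture model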
Funding: Supported by the National Natural Science Foundation of China (60572081)
Voice Activity Detection Based on LPCC and Spectrum Entropy
ZHU Xiao-jing, HOU Xu-chu, CUI Hui-juan, TANG Kun
(National Laboratory of Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China)
Abstract:
To improve the accuracy of voice activity detection (VAD) in noisy environments with low SNR, an algorithm based on linear predictive cepstral coefficients (LPCC) and energy entropy is proposed. First, the LPCC extracted from the input speech are fed to a speech model and a noise model, each of which is a Gaussian mixture model (GMM), to calculate the likelihood ratio of speech to noise. The first-stage VAD decision is made from this likelihood ratio. The spectrum entropy is then applied in the second decision-making stage. Finally, a hangover mechanism is used to better protect the speech. Experimental results show that the new algorithm compensates for the drawbacks of the spectrum entropy method in babble noise. Furthermore, it outperforms G.729 Annex B in various noise environments.
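The first-stage decision described in the abstract, comparing the likelihood of a frame's feature vector under a speech GMM and a noise GMM, can be illustrated with a minimal sketch. The diagonal covariances, the zero threshold, the function names, and the toy model parameters below are all assumptions made for illustration; the abstract does not state the model configuration or the threshold used by the authors.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM.

    Diagonal covariance is an assumption of this sketch.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over mixture components for numerical stability.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def first_stage_decision(x, speech_gmm, noise_gmm, threshold=0.0):
    """Return (is_speech, log-likelihood ratio) for one frame.

    The threshold value is hypothetical; it is not given in the abstract.
    """
    llr = gmm_log_likelihood(x, *speech_gmm) - gmm_log_likelihood(x, *noise_gmm)
    return llr > threshold, llr


# Toy two-dimensional models as (weights, means, variances); purely illustrative.
speech_gmm = ([0.5, 0.5], [[1.0, 1.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]])
noise_gmm = ([1.0], [[-1.0, -1.0]], [[1.0, 1.0]])

is_speech, llr = first_stage_decision([1.5, 1.5], speech_gmm, noise_gmm)
```

In practice the feature vector would be the LPCC of the frame and the two models would come from training on labelled speech and noise, as the abstract describes.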
Key words: speech signal processing; voice activity detection (VAD); spectrum entropy; linear prediction coefficient (LPC); linear predictive cepstral coefficient (LPCC); Gaussian mixture model (GMM)
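Two further ingredients named in the abstract, spectrum entropy and the hangover mechanism, can be sketched in a few lines. This is a generic illustration rather than the authors' implementation: the entropy is the usual Shannon entropy of the normalized power spectrum, and the hangover simply holds the speech label for a fixed number of frames after the last detected speech frame. The function names and the `hang_frames` parameter are hypothetical, and the rule that fuses the entropy result with the first-stage decision is not given in the abstract, so it is not shown.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (in nats) of a frame's normalized power spectrum.

    A peaky, speech-like spectrum gives low entropy; a flat, noise-like
    spectrum gives high entropy. A frame with zero total power returns 0.0,
    which is a convention chosen for this sketch.
    """
    total = sum(power_spectrum)
    if total <= 0.0:
        return 0.0
    entropy = 0.0
    for p in power_spectrum:
        if p > 0.0:
            q = p / total
            entropy -= q * math.log(q)
    return entropy


def apply_hangover(decisions, hang_frames=3):
    """Keep labelling frames as speech for hang_frames frames after speech ends."""
    smoothed = []
    remaining = 0
    for is_speech in decisions:
        if is_speech:
            remaining = hang_frames
            smoothed.append(True)
        elif remaining > 0:
            remaining -= 1
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed


flat = spectral_entropy([1.0, 1.0, 1.0, 1.0])    # flat spectrum, maximal entropy
peaky = spectral_entropy([1.0, 0.0, 0.0, 0.0])   # single peak, zero entropy
held = apply_hangover([True, False, False, False, False], hang_frames=2)
```

The hangover trades a few extra frames labelled as speech for a lower risk of clipping weak word endings, which matches the stated aim of protecting the speech.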