Chinese Academy of Sciences Institutional Repositories Grid: Research on Helium Speech Enhancement Algorithm

Research on Helium Speech Enhancement Algorithm

Document type: Thesis


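Author	成少锋 (Cheng Shaofeng)
Degree type	Doctoral
Defense date	2003
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Place of conferral	Institute of Acoustics, Chinese Academy of Sciences
Keywords	helium speech; short-time Fourier transform; linear prediction
Alternative title	Research on Helium Speech Enhancement Algorithm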
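Abstract (translated from Chinese)	For physiological reasons, divers working in the deep sea cannot breathe air and must instead breathe a high-pressure helium-oxygen mixture. The helium-oxygen mixture severely distorts divers' speech and makes it very hard to understand, which is a serious obstacle to voice communication between divers and the outside world; this motivates the problem of helium speech enhancement. Using the classical acoustic tube theory of speech production, this thesis analyzes in detail how helium speech changes relative to normal speech and draws several conclusions that are important for designing helium speech enhancement algorithms. The analysis shows that the narrowband short-time Fourier transform (STFT) of the speech signal is a powerful tool for helium speech enhancement. After a brief introduction to this tool, the thesis discusses an STFT-based helium speech enhancement algorithm in detail. For spectral envelope estimation, a key step in the algorithm, the thesis presents a simple and effective piecewise linearization method. Experiments show that the STFT-based algorithm is computationally simple and requires neither pitch extraction nor voiced/unvoiced decisions, so it is very robust and easy to combine with spectral-subtraction noise reduction. Deeper theoretical analysis reveals some shortcomings of the STFT-based algorithm: first, it applies a mapping formula that is valid only at formant center frequencies to the entire frequency range; second, it cannot adjust formant bandwidths separately. To address this, the thesis proposes a new algorithm based on linear prediction (LP), in which formant center frequencies and bandwidths can be adjusted independently by operating on the poles of the all-pole vocal tract filter model, overcoming in theory the shortcomings of the STFT-based algorithm. Experiments verify the feasibility of this algorithm. Experimental comparison and subjective listening tests show that both algorithms clearly improve the intelligibility and naturalness of the corrected helium speech, but the overall performance of the LP-based algorithm is slightly below that of the STFT-based algorithm, and the setting of the formant bandwidth adjustment parameter requires further study.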
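The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of the STFT-based idea for a single magnitude-spectrum frame: estimate a piecewise-linear envelope through the spectral peaks, then compress that envelope along the frequency axis while leaving the fine structure where it is. The function names, the constant compression factor `k`, and the choice of local maxima as envelope knots are all assumptions made for illustration; the thesis's actual frequency mapping and envelope method may differ.

```python
def piecewise_linear_envelope(mag):
    """Connect the local maxima of a magnitude spectrum with straight lines.

    Assumes len(mag) >= 2. Knot choice (local maxima) is an illustrative guess.
    """
    n = len(mag)
    knots = [0]
    for i in range(1, n - 1):
        if mag[i] >= mag[i - 1] and mag[i] > mag[i + 1]:
            knots.append(i)
    knots.append(n - 1)
    env = [0.0] * n
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = (1 - t) * mag[a] + t * mag[b]
    return env


def sample_linear(values, pos):
    """Linearly interpolate `values` at fractional index `pos`; 0 past the end."""
    last = len(values) - 1
    if pos >= last:
        return values[-1] if pos == last else 0.0
    i = int(pos)
    t = pos - i
    return (1 - t) * values[i] + t * values[i + 1]


def correct_frame(mag, k):
    """Move envelope features from bin k*i down to bin i (k > 1 is assumed).

    The fine structure (spectrum divided by its envelope) stays at its
    original bins, so no pitch extraction is involved.
    """
    env = piecewise_linear_envelope(mag)
    out = []
    for i, m in enumerate(mag):
        fine = m / env[i] if env[i] > 0 else 0.0
        out.append(fine * sample_linear(env, k * i))
    return out


# Toy frame with a single envelope peak at bin 6; k = 2 is a made-up factor.
frame = [1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1]
corrected = correct_frame(frame, 2)
```

This sketch handles magnitudes only. Resynthesis (for example, reusing the original phase and overlap-adding the frames) is left out, and a single constant `k` is exactly the kind of simplification the abstract criticizes when it notes that the mapping formula holds only at formant center frequencies.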
English abstract	For physiological reasons, divers working in the deep sea have to breathe a hyperbaric helium-oxygen mixture instead of air. Unfortunately, this mixture makes divers' speech distorted and unintelligible, which is a severe obstacle to speech communication between divers and others. As a result, helium speech enhancement is of great importance. In this thesis, helium speech is analysed on the basis of the classical acoustic tube theory of speech production; from this analysis the essential laws of helium speech transformation are obtained, and some conclusions important for designing helium speech enhancement algorithms are drawn. The analysis shows that the narrowband short-time Fourier transform (STFT) of the speech signal is an effective tool for helium speech enhancement. Therefore, after a brief introduction to the STFT, an STFT-based algorithm is discussed in detail. A main step of the algorithm is estimating the spectral envelope, for which the thesis presents a piecewise linear method that is both simple and effective. Because neither pitch extraction nor voiced/unvoiced decisions are needed, the algorithm is very robust. Moreover, it can easily be combined with noise reduction by spectral subtraction. A deeper theoretical analysis reveals two drawbacks of the STFT-based algorithm. One is that a mapping formula that holds only at formant center frequencies is applied over the whole frequency range; the other is that the algorithm cannot adjust formant bandwidth independently. A new algorithm based on linear prediction (LP) is therefore proposed, in which formant center frequency and bandwidth can be controlled independently by operating directly on the poles of the all-pole model of the vocal tract filter, so that the deficiencies of the STFT-based algorithm are avoided in theory. The experimental results show that both algorithms can correct helium speech and greatly enhance its intelligibility and naturalness. However, the performance of the LP-based algorithm is slightly inferior to that of the STFT-based algorithm, and the formant bandwidth adjustment parameter needs further study.
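To make the LP idea concrete, here is a minimal sketch of how formant center frequency and bandwidth could be adjusted independently once the poles of the all-pole model are known. It relies on the standard relations between a pole z = r·e^{jθ} and its formant (frequency θ·fs/2π, bandwidth −ln(r)·fs/π). The function names and the scale factors `freq_scale` and `bw_scale` are hypothetical; the abstract does not state the mapping or parameter values the thesis uses.

```python
import cmath
import math


def pole_to_formant(pole, fs):
    """Return (center frequency, bandwidth) in Hz for one complex pole."""
    freq = cmath.phase(pole) * fs / (2 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth


def formant_to_pole(freq, bandwidth, fs):
    """Inverse of pole_to_formant."""
    radius = math.exp(-math.pi * bandwidth / fs)
    return cmath.rect(radius, 2 * math.pi * freq / fs)


def adjust_poles(poles, fs, freq_scale, bw_scale):
    """Scale formant frequencies and bandwidths independently.

    Assumes the poles come in complex-conjugate pairs; real poles would
    need separate handling and are not covered by this sketch.
    """
    adjusted = []
    for p in poles:
        f, b = pole_to_formant(p, fs)
        adjusted.append(formant_to_pole(f * freq_scale, b * bw_scale, fs))
    return adjusted


def poles_to_lp_coeffs(poles):
    """Expand prod(1 - p z^-1) into denominator coefficients [1, a1, a2, ...]."""
    coeffs = [1 + 0j]
    for p in poles:
        nxt = coeffs + [0j]
        for i in range(1, len(nxt)):
            nxt[i] -= p * coeffs[i - 1]
        coeffs = nxt
    # Imaginary parts cancel when the poles form conjugate pairs.
    return [c.real for c in coeffs]


# Illustrative values only: one formant at 2000 Hz, 100 Hz wide, fs = 8 kHz.
fs = 8000
p = formant_to_pole(2000, 100, fs)
new_poles = adjust_poles([p, p.conjugate()], fs, freq_scale=0.5, bw_scale=2.0)
new_coeffs = poles_to_lp_coeffs(new_poles)
```

Obtaining the poles from estimated LP coefficients requires a polynomial root finder (for example `numpy.roots`), which is omitted here to keep the sketch dependency-free. Because frequency and bandwidth map to the pole angle and radius respectively, changing one leaves the other untouched, which is the independence the abstract describes.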
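Language	Chinese
Release date	2011-05-07
Pages	66
Source URL	[http://159.226.59.140/handle/311008/1070]
Collection	Institute of Acoustics_IOA Doctoral and Master's Theses_1981-2009 Doctoral and Master's Theses
Recommended citation (GB/T 7714)	成少锋 (Cheng Shaofeng). 氦语音增强算法研究 (Research on Helium Speech Enhancement Algorithm) [D]. Institute of Acoustics, Chinese Academy of Sciences. 2003.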

Ingest method: OAI harvesting

Source: Institute of Acoustics

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.