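Chinese Academy of Sciences Institutional Repositories Grid: Research on Time Delay Estimation and Speech Signal Enhancement for Microphone Arrays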

Chinese Academy of Sciences Institutional Repositories Grid

Research on Time Delay Estimation and Speech Signal Enhancement for Microphone Arrays

Document type: Thesis


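Author	Zeng Hui (曾辉)
Degree	Doctoral
Defense date	2005
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Place conferred	Institute of Acoustics, Chinese Academy of Sciences
Keywords	microphone array; speech enhancement; time delay estimation; Wiener post-filtering; beamforming; distributed speech recognition
Alternative title	Research on Time Delay Estimate And Speech Signal Enhance Based on Microphone Array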
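Abstract (translated from the Chinese)	Speech enhancement is a selective signal-processing technique whose main goal is to extract the target speech, as clean as possible, from speech signals that have been contaminated in various ways. Because results in speech enhancement are highly practical and closely tied to everyday life, the field has drawn growing attention. As communication technology has developed, it has become clear that complex application environments such as video conferencing or in-vehicle communication involve not only ambient noise but also interference from echo and reverberation, and conventional single-channel speech enhancement systems cannot cope with them. Methods that use microphone arrays for speech enhancement were developed in response. This thesis studies time delay estimation and speech signal enhancement algorithms for microphone arrays, with the aim of real-time microphone-array speech enhancement. The main work of the thesis is as follows: 1. Through a systematic comparison of how SNR and reverberation time affect the delay estimation accuracy of common time delay estimation algorithms, a time delay estimation algorithm based on auditory characteristics was selected as the processing front end of the speech enhancement system, so that the system beam can follow a moving sound source well; 2. Building on previous work, a beamformer that combines delay-and-sum with an improved Wiener post-filter is proposed; experiments show substantial improvements over the original algorithm in both the SNR of the enhanced speech and the recognition rate of large-vocabulary speech recognition; 3. Using Aardvark's Direct Pro Q10 as the digital audio input/output interface, a complete experimental microphone-array speech enhancement system was implemented, which met the expected targets for both real-time processing and SNR; 4. A distributed speech recognition experiment platform was implemented that makes full use of the idle computing capacity of the laboratory's computers and greatly speeds up speech recognition experiments.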
English abstract	Speech enhancement is a selective signal-processing technique whose main goal is to extract the target speech, as clean as possible, from a contaminated speech signal. Because research results in speech enhancement are highly practical and closely related to daily life, the field is receiving more and more attention. With the advance of communication technology, speech must be captured in more complex environments such as video conferencing or in-vehicle telephony. Under these conditions there is not only ambient noise but also interference from echo and reverberation. Conventional single-channel approaches to speech enhancement do not perform well in such changing surroundings. In recent years, microphone arrays have received considerable attention as a means of dramatically improving on the performance of traditional single-channel systems. A microphone array captures rich spatial and temporal information, which gives speech processing research more room to work with. The primary topics of this thesis are time delay estimation (TDE), speech enhancement with a post-filter, and the construction of a real-time speech enhancement system based on a microphone array. The main contributions of this thesis are: 1. The performance of different TDE algorithms is compared under varying SNR and reverberation time. A TDE algorithm based on auditory characteristics is implemented and used as the front end of the real-time microphone-array speech enhancement system, so that the beamformer can track a moving sound source. 2. A range of multi-channel speech enhancement methods is analysed and their noise-cancelling capability compared under diverse noise conditions. Building on previous work, an improved beamformer is presented that integrates a delay-and-sum beamformer with an improved Wiener post-filter. Experiments show that the improved beamformer reduces noise markedly and that the enhanced speech achieves a higher recognition rate. 3. Using the Direct Pro Q10 digital audio I/O interface (manufactured by Aardvark) as the hardware platform, a real-time multi-channel speech enhancement system is implemented. Testing shows that it can enhance speech in real time and that the SNR of the enhanced speech meets the requirement. 4. A distributed speech recognition experiment platform based on the client/server model is designed, which accelerates speech recognition experiments by making use of idle computers in the laboratory.
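Language	Chinese
Release date	2011-05-07
Pages	103
Source URL	[http://159.226.59.140/handle/311008/1058]
Collection	Institute of Acoustics_Master's and Doctoral Theses of the Institute of Acoustics_Master's and Doctoral Theses 1981-2009
Recommended citation (GB/T 7714)	曾辉. 麦克风阵列时间延迟估计和语音信号增强的研究[D]. 中国科学院声学研究所. 中国科学院声学研究所. 2005.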
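
The abstract describes a time delay estimation front end that steers a delay-and-sum beamformer. As a rough illustration of that general idea only, the Python sketch below estimates integer sample delays by plain time-domain cross-correlation and then aligns and averages the channels. It is not the thesis's method: the auditory-based TDE algorithm and the improved Wiener post-filter are not reproduced, and the function names, the synthetic white-noise source, the chosen delays, and the noise level are all assumptions made for the example.

```python
import numpy as np

def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) by which `sig` trails `ref`.

    Assumption: plain cross-correlation over the overlapping samples,
    searched over integer lags in [-max_lag, max_lag].
    """
    n = len(ref)
    lags = np.arange(-max_lag, max_lag + 1)
    scores = []
    for lag in lags:
        # Pair ref[m] with sig[m + lag] wherever both indices are valid.
        r = ref[max(0, -lag):n - max(0, lag)]
        s = sig[max(0, lag):n - max(0, -lag)]
        scores.append(np.dot(r, s))
    return int(lags[int(np.argmax(scores))])

def delay_and_sum(channels, delays):
    """Advance each channel by its delay and average the aligned channels.

    Simplification: np.roll shifts circularly, so a few samples wrap
    around at the edges of the block.
    """
    aligned = [np.roll(ch, -d) for ch, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)

# Synthetic four-microphone recording (hypothetical values throughout).
rng = np.random.default_rng(0)
n = 4000
source = rng.standard_normal(n)
true_delays = [0, 3, 7, 12]  # per-microphone delay in samples
channels = [np.roll(source, d) + 0.5 * rng.standard_normal(n)
            for d in true_delays]

# Use microphone 0 as the reference for every delay estimate.
estimated = [estimate_delay(channels[0], ch, max_lag=20) for ch in channels]
enhanced = delay_and_sum(channels, estimated)

# Mean squared error against the clean source, before and after beamforming.
err_single = np.mean((channels[0] - source) ** 2)
err_enhanced = np.mean((enhanced - source) ** 2)
```

Because the noise added to each channel is independent, averaging the aligned channels lowers the noise power while leaving the source intact, which is the gain a delay-and-sum stage provides before any post-filter is applied.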

Ingest method: OAI harvesting

Source: Institute of Acoustics
