Chinese Academy of Sciences Institutional Repositories Grid
Research on Speech Enhancement Algorithms Based on a Small Number of Microphones

Document type: Thesis

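Author: Wei Jianqiang (魏建强)
Degree: Doctoral
Defense date: 2005
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences
Degree-granting location: Institute of Acoustics, Chinese Academy of Sciences
Keywords: single-channel speech enhancement; spectral subtraction; Wiener filtering; minimum mean-square error estimator; Kalman filtering; masking effect; multi-channel (microphone array) speech enhancement; time delay estimation; delay-and-sum beamforming; adaptive beamforming; generalized sidelobe canceller; post-filtering beamforming
Alternative title: Research on Algorithm of Speech Enhancement Based on Small Number of Microphones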
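Abstract (translated from Chinese): In practice, speech is often corrupted by environmental noise, which degrades call quality or prevents speech processing systems from working properly. In such cases the noisy speech must be enhanced to improve its quality. Depending on the number of microphones used to capture the speech signal, speech enhancement systems fall into two types: single-channel systems and multi-channel (microphone array) systems. A single-channel system needs only one speech signal, so its algorithmic complexity and hardware requirements are low. In settings such as car phones and video conferencing systems, however, echo and reverberation are present in addition to environmental noise, and single-channel systems cannot cope. Microphone array speech enhancement was proposed to address this problem. A microphone array is spatially selective: it effectively suppresses noise and interference arriving from directions other than that of the desired speech signal, and so achieves marked noise reduction. Against this background and to meet different requirements, this thesis studies speech enhancement algorithms based on a small number of microphones, in two main parts: single-channel algorithms and two-channel (microphone array) algorithms. The main contributions are as follows: 1. For single-channel speech enhancement, the thesis proposes a method based on human auditory perception of speech and an improved Kalman filter. The method uses spectral subtraction to separate the speech and noise characteristics of the noisy speech and to compute the AR model parameters, and adds a perceptual post-filter at the output of the Kalman filter to further improve speech quality. Extensive experiments show that the algorithm denoises well without noticeable musical noise, and its advantage is most pronounced at low signal-to-noise ratios and in non-stationary noise. 2. To address the problems of the traditional phase transform (PHAT) weighting function, the thesis proposes a new time delay estimation method based on an improved generalized cross-correlation phase transform (GCC-PHAT) weighting function. Because the method has a relatively low computational cost and provides some suppression of both reverberation and noise, it is well suited to real-time implementation in practical microphone array speech enhancement systems. 3. For multi-channel speech enhancement, building on earlier work, the thesis proposes a novel two-channel (microphone array) speech enhancement system based on frequency-domain adaptive beamforming and post-filtering. Because the system integrates, in the frequency domain, adaptive beamforming, adaptive filtering in the array look direction, soft signal decision, and microphone array post-filtering, it reduces noise of many kinds (coherent or incoherent, stationary or non-stationary) fairly well.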
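The abstract names spectral subtraction as the front end that separates speech and noise characteristics before the Kalman filter. As a rough illustration only, the textbook power spectral subtraction rule for a single frame might be sketched as below. This is not the thesis's own technique, and the function name, the `over_subtraction` factor, and the `floor` parameter are hypothetical choices made for this example.

```python
import math


def spectral_subtract(noisy_mag, noise_mag, over_subtraction=1.0, floor=0.01):
    """Textbook power spectral subtraction for one frame (illustrative only).

    noisy_mag: magnitude spectrum |Y(k)| of the noisy frame
    noise_mag: estimated noise magnitude spectrum |N(k)|
    Returns an estimate of the clean magnitude spectrum |S(k)|.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    clean = []
    for y, n in zip(noisy_mag, noise_mag):
        # Subtract the (scaled) noise power from the noisy power.
        residual = y * y - over_subtraction * n * n
        # Spectral floor: never go below a small fraction of the noisy power,
        # which also keeps the result non-negative before the square root.
        power = max(residual, floor * y * y)
        clean.append(math.sqrt(power))
    return clean
```

In a complete system the returned magnitudes would be recombined with the noisy phase and transformed back to the time domain; hard flooring of this kind is a common source of the musical noise that the abstract says the proposed method avoids.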
English abstract: Speech has become an increasingly vital component of modern human-machine interfaces. Background noise, reverberation, and interfering signals produce aesthetically undesirable effects and diminish the system's ability to convey information across the interface, so speech enhancement has become an important topic of study. Conventional systems for speech acquisition use one microphone, and there is a rich history of work on single-channel methods for speech enhancement. While capable of improving speech quality in restrictive environments (additive noise, no multipath, high to moderate signal-to-noise ratio (SNR), single source), these approaches do not perform well in the face of reverberant distortions, competing sources, and severe noise conditions. In recent years, microphone arrays have received considerable attention as a means of dramatically improving on the performance of traditional single-channel systems. Against this background and to meet different demands, this thesis studies speech enhancement algorithms based on a small number of microphones, covering mainly single-channel speech enhancement and two-channel (microphone array) speech enhancement. The main contributions of this thesis are: 1. For single-channel speech enhancement, an improved Kalman filter-based speech enhancement algorithm with perceptual post-filtering is presented. A new technique based on spectral subtraction is used to separate speech and noise characteristics from the noisy speech and to compute the speech and noise autoregressive (AR) parameters. To obtain a Kalman filter output with high audible quality, a perceptual post-filter is placed at the output of the Kalman filter to smooth the enhanced speech spectra. 2. A new method of time delay estimation (TDE) based on an improved generalized cross-correlation phase transform (GCC-PHAT) weighting function is presented, which addresses the problems of the traditional PHAT weighting function. Owing to its low computational complexity and its robustness to reverberation and noise, this approach is well suited to implementation in microphone array speech enhancement systems. 3. For multi-channel speech enhancement, building on previous work, a novel two-channel (microphone array) speech enhancement system based on frequency-domain adaptive beamforming and post-filtering is presented. Adaptive beamforming, adaptive look-direction Wiener filtering, soft signal detection, and microphone array post-filtering are integrated in the proposed system, which suppresses coherent as well as incoherent noise and works especially well in non-stationary noise environments.
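For readers unfamiliar with the baseline that contribution 2 improves on, conventional GCC-PHAT delay estimation between two microphone signals can be sketched as follows. This shows only the traditional PHAT weighting, not the improved weighting function proposed in the thesis, whose form the abstract does not give. The function names and the `max_lag` parameter are hypothetical, and the naive O(N^2) DFT is used only to keep the sketch dependency-free; a practical implementation would presumably use an FFT library.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (dependency-free sketch)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the delay of x2 relative to x1, in samples.

    Uses the conventional PHAT weighting 1 / |cross-spectrum|.
    A positive result means x2 lags behind x1.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation does not wrap
    spec1 = dft(list(x1) + [0.0] * (n - len(x1)))
    spec2 = dft(list(x2) + [0.0] * (n - len(x2)))
    # Cross-power spectrum between the two channels.
    cross = [b * a.conjugate() for a, b in zip(spec1, spec2)]
    # PHAT weighting: discard magnitude, keep phase (eps avoids division by zero).
    eps = 1e-12
    whitened = [c / (abs(c) + eps) for c in cross]
    # Back to the lag domain; the peak location is the delay estimate.
    cc = [v.real for v in dft(whitened, inverse=True)]
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cc[lag % n])
```

With a noise burst `s`, calling `gcc_phat_delay(s + [0.0] * 16, [0.0] * 5 + s + [0.0] * 11, max_lag=10)` should return a delay of 5 samples, since the second signal is the first shifted by five samples.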
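Language: Chinese
Release date: 2011-05-07
Pages: 178
Source URL: http://159.226.59.140/handle/311008/946
Collection: Institute of Acoustics / Master's and Doctoral Theses of the Institute of Acoustics / Master's and Doctoral Theses 1981-2009
Recommended citation
GB/T 7714
Wei Jianqiang. Research on Speech Enhancement Algorithms Based on a Small Number of Microphones [D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 2005.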

Ingest method: OAI harvesting

Source: Institute of Acoustics

