Chinese Academy of Sciences Institutional Repositories Grid: Research on Very Low Bit Rate Mixed Excitation Speech Coding Algorithm

Chinese Academy of Sciences Institutional Repositories Grid

Research on Very Low Bit Rate Mixed Excitation Speech Coding Algorithm

Document type: Thesis


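Author	宾清原
Degree	Doctoral
Defense date	2005
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Place of conferral	Institute of Acoustics, Chinese Academy of Sciences
Keywords	speech coding; mixed excitation; very low bit rate speech coding; multi-frame joint quantization; variable rate speech coding
Original title	甚低速率混合激励语音编码算法研究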
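Abstract (translated from Chinese)	Speech coding is a key enabling technology in modern communications. As digital communication services grow, lowering the bit rate is one way to address bandwidth constraints, and high-quality very low bit rate speech coding algorithms are urgently needed. Such algorithms have considerable practical value in secure telephony, military shortwave communication, underwater communication, speech storage, and similar fields. Mixed excitation linear prediction (MELP) is an excellent low bit rate algorithm: by adopting five new techniques (mixed excitation, aperiodic pulses, Fourier magnitudes, adaptive spectral enhancement, and pulse dispersion), it greatly improves synthesized speech quality over the traditional linear prediction vocoder. Addressing very low bit rate speech coding, this thesis first studies the MELP algorithm in depth, then focuses on developing and implementing a MELP-based 600 bps speech coding algorithm. Based on a study of the inter-frame correlation of speech parameters, multi-frame joint vector quantization is used to remove inter-frame redundancy. Several vector quantization schemes for the line spectral frequency (LSF) parameters at a very low rate (320 bps) are compared, and a three-frame joint vector quantization scheme with prediction is selected for the 600 bps algorithm. Because bits are allocated sensibly among the speech parameters, the main parameters are quantized well. Informal listening tests indicate that speech synthesized by the algorithm has high intelligibility and a certain degree of naturalness. The thesis also exploits the slowly varying nature of the spectral parameters of adjacent speech frames, using spectral distortion (SD) as the distance measure between the spectral parameters of two adjacent frames, to implement a variable rate MELP algorithm. When the SD falls below a preset threshold, the LSF parameters of the current frame are replaced by the quantized LSF parameters of the previous frame; otherwise the current frame's LSF parameters are quantized and transmitted. Experiments show that when the average rate drops to about 1.8 kbps, the synthesized speech quality is close to that of 2.4 kbps MELP.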
English abstract	Speech coding is one of the major supporting technologies in modern telecommunication systems. As digital telecommunication services develop, reducing the bit rate is one solution to the bandwidth problem, so high quality very low bit rate speech coding algorithms are in high demand. Very low bit rate speech coders have great practical value in secure telephony, military short wave telecommunication, underwater telecommunication, speech storage, and similar fields. The mixed excitation linear prediction (MELP) algorithm is a widely used low bit rate speech coding method. Its five techniques (mixed excitation, aperiodic pulses, Fourier magnitude modeling, adaptive spectral enhancement, and pulse dispersion) make the synthesized speech quality of the MELP coder much better than that of the conventional LPC speech coder. In this thesis, the MELP algorithm is thoroughly studied. In particular, we focus on researching and implementing a MELP-based 600 bps speech coder. A multiframe structure is adopted to exploit the interframe correlation of the speech signal. After comparing several LSF quantization schemes at very low bit rate, we select the multiframe multistage vector quantization (including prediction) scheme as the LSF parameter quantization scheme for the 600 bps speech coder. Informal listening results show that the output speech of the coder has high intelligibility and a certain degree of naturalness. Furthermore, we implement a variable bit rate MELP speech coder. We use spectral distortion (SD) as the distance measure between two consecutive frames. If the SD value is lower than a preset threshold, the LSF parameters of the current frame are replaced by the quantized LSF parameters of the previous frame; otherwise the LSF parameters of the current frame are quantized and transmitted. We find that even when the average bit rate drops to about 1.8 kbps, the quality of the synthesized speech is still close to that of the 2.4 kbps MELP coder.
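Language	Chinese
Date made public	2011-05-07
Pages	58
Source URL	[http://159.226.59.140/handle/311008/982]
Collection	Institute of Acoustics_IOA Doctoral and Master's Theses_1981-2009 Doctoral and Master's Theses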
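Recommended citation (GB/T 7714)	宾清原. Research on Very Low Bit Rate Mixed Excitation Speech Coding Algorithm [D]. Institute of Acoustics, Chinese Academy of Sciences. 2005.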
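
The variable rate rule described in the abstract (compare the current frame's spectrum with the previous frame's quantized spectrum, and resend spectral parameters only when the spectral distortion reaches a threshold) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: it uses LPC coefficient vectors as a stand-in for the LSF vectors, a uniform scalar rounding as a stand-in for the vector quantizer, and a 1 dB threshold chosen arbitrarily. The function names `lpc_log_spectrum`, `spectral_distortion`, `quantize`, and `encode_variable_rate` are all hypothetical.

```python
import cmath
import math


def lpc_log_spectrum(a, n_points=128):
    """Log power spectrum in dB of the all-pole filter 1/A(z),
    where A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ... (assumed convention)."""
    spectrum = []
    for i in range(n_points):
        w = math.pi * (i + 0.5) / n_points  # frequencies in (0, pi)
        A = 1.0 + sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum


def spectral_distortion(a1, a2, n_points=128):
    """RMS difference in dB between two LPC log spectra."""
    s1 = lpc_log_spectrum(a1, n_points)
    s2 = lpc_log_spectrum(a2, n_points)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s1, s2)) / n_points)


def quantize(a, step=0.05):
    """Stand-in quantizer (uniform rounding); the thesis uses vector quantization."""
    return [round(x / step) * step for x in a]


def encode_variable_rate(frames, threshold_db=1.0):
    """Decide per frame whether to send new spectral parameters or reuse
    the previous frame's quantized parameters. threshold_db is an assumed value."""
    decisions = []
    prev_q = None
    for a in frames:
        if prev_q is not None and spectral_distortion(a, prev_q) < threshold_db:
            decisions.append(("repeat", prev_q))  # no new spectral bits for this frame
        else:
            prev_q = quantize(a)
            decisions.append(("send", prev_q))
    return decisions


# Toy first-order frames: the second frame matches the first, the third does not.
frames = [[-0.9], [-0.9], [0.5]]
kinds = [kind for kind, _ in encode_variable_rate(frames)]
```

In this toy run the second frame reuses the first frame's quantized parameters and the third is sent afresh. A real coder would also have to signal each decision to the decoder; the abstract does not say how that is done.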

Ingestion method: OAI harvesting

Source: Institute of Acoustics

