Research on Very Low Bit Rate Segment Vocoder Coding Algorithms (甚低速率音段声码器编码算法研究)
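Document type: Dissertation
Author | Deng Hao (邓昊) |
Degree | Doctoral |
Defense date | 2003 |
Degree-granting institution | Institute of Acoustics, Chinese Academy of Sciences |
Place of conferral | Institute of Acoustics, Chinese Academy of Sciences |
Keywords | very low bit rate speech coding; segment vocoder; segmentation algorithm; codebook design method |
Alternative title | Research on Very Low Bit Rate Segment Vocoder |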
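Abstract (Chinese original, translated) | Speech coding plays an important role in digital speech communication systems. Where the transmission bit rate is severely constrained, research on very low bit rate speech coding algorithms is particularly important. At very low rates of about 300 bits/s, conventional single-frame vector quantization can hardly guarantee quantization quality, so the intelligibility of the synthesized speech is poor. The variable-length segment vocoder is an effective very low bit rate coding algorithm: it uses a suitable segment-boundary determination algorithm to divide the speech data into segments of unequal length such that the speech within each segment is relatively stationary, and then quantizes the parameter matrix formed by each segment's spectral parameter vectors as a whole by matrix quantization. This thesis studies the key components of the segment vocoder algorithm in depth: it compares in detail a segmentation algorithm based on dynamic programming with one that uses "floating block boundaries", and introduces the physical meaning of matrix quantization and a distance measure that agrees comparatively well with subjective perception. Codebook design for a segment vocoder is challenging, and codebook quality has an important influence on the quality of the synthesized speech. The thesis therefore proposes a design scheme that constructs a random initial codebook using a linear warping algorithm and builds the final codebook using "joint segmentation and quantization"; it also adopts an improved pitch-track quantization method that lowers the bit rate while preserving the quantization quality of the excitation information. Listening tests demonstrate the effectiveness of the design. A segment vocoder based on this scheme can produce intelligible, natural reconstructed speech at about 300 bits/s. The proposed encoding and decoding scheme is therefore of some significance to the development of very low bit rate speech coding technology. |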
English abstract | Speech coding is of great importance in digital communication systems. In applications where the transmission rate is strictly limited, research on Very Low Bit Rate Speech Coding (VLBRSC) is especially significant. At a very low bit rate of about 300 bits/s, conventional single-frame vector quantization can no longer maintain fair quantization quality of the spectral parameters, which results in poor intelligibility of the reconstructed speech. The variable-length segment vocoder is an effective VLBRSC algorithm in which the input speech is decomposed into variable-length segments by an algorithm that places segment boundaries so that the speech data within each segment is relatively stationary; the parameter matrix composed of the spectral parameter vectors of the frames belonging to one segment is then matrix quantized as a whole. In this thesis, the key components of the segment vocoder algorithm are studied thoroughly: a segmentation algorithm based on dynamic programming and one using a method called "floating block boundary" are compared in detail, and the concept of matrix quantization and a distortion measure that matches human auditory perception comparatively well are introduced. Designing the matrix codebook of a segment vocoder is challenging because codebook quality has a great influence on the intelligibility of the reconstructed speech. A novel codebook design scheme is therefore proposed, in which the initial codebook is constructed using a linear warping algorithm and the final codebook is constructed using joint segmentation and quantization. To preserve the quantization quality of the excitation information while lowering the bit rate, an improved pitch track quantization scheme is also used in the segment vocoder.
Experiments show the effectiveness of the scheme: the reconstructed speech of the segment vocoder at a bit rate of about 300 bits/s is intelligible and natural. The coding scheme proposed in the thesis is expected to contribute to the development of VLBRSC technology. |
Language | Chinese |
Date made public | 2011-05-07 |
Pages | 86 |
Source URL | [http://159.226.59.140/handle/311008/1086] |
Collection | Institute of Acoustics_IOA Doctoral and Master's Theses_Doctoral and Master's Theses 1981-2009 |
Recommended citation (GB/T 7714) | 邓昊. 甚低速率音段声码器编码算法研究[D]. 中国科学院声学研究所. 中国科学院声学研究所. 2003. |
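The abstract describes splitting speech into variable-length segments so that each segment is relatively stationary, using a dynamic-programming segmentation. The record does not give the thesis's actual cost function or constraints, so the following is only a minimal sketch of the general idea under stated assumptions: the number of segments is fixed in advance, the cost of a segment is the squared error of its frames around the segment mean, and the names `segment_cost` and `dp_segment` are hypothetical.

```python
from typing import List, Sequence


def segment_cost(frames: Sequence[Sequence[float]], start: int, end: int) -> float:
    """Squared error of frames[start:end] around their mean vector.

    Assumption: this stands in for a stationarity measure; the thesis's
    own distortion measure is not given in this record.
    """
    seg = frames[start:end]
    dim = len(seg[0])
    mean = [sum(f[d] for f in seg) / len(seg) for d in range(dim)]
    return sum((f[d] - mean[d]) ** 2 for f in seg for d in range(dim))


def dp_segment(frames: Sequence[Sequence[float]], num_segments: int,
               min_len: int = 1, max_len: int = 8) -> List[int]:
    """Return boundaries [0, b1, ..., n] that minimize the total segment cost."""
    n = len(frames)
    inf = float("inf")
    # best[k][t]: minimum cost of splitting frames[:t] into exactly k segments
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    # back[k][t]: start index of the k-th segment in that optimal split
    back = [[-1] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for t in range(1, n + 1):
            for length in range(min_len, min(max_len, t) + 1):
                s = t - length
                if best[k - 1][s] == inf:
                    continue  # no valid split of frames[:s] into k-1 segments
                cost = best[k - 1][s] + segment_cost(frames, s, t)
                if cost < best[k][t]:
                    best[k][t] = cost
                    back[k][t] = s
    if best[num_segments][n] == inf:
        raise ValueError("no valid segmentation for these length limits")
    # Trace the stored segment starts back from the end of the utterance.
    bounds = [n]
    t = n
    for k in range(num_segments, 0, -1):
        t = back[k][t]
        bounds.append(t)
    return bounds[::-1]
```

For example, with the one-dimensional toy frames `[[0.0], [0.1], [0.0], [5.0], [5.1], [9.0], [9.2], [9.1]]` and `num_segments=3`, `dp_segment` returns `[0, 3, 5, 8]`, placing a boundary at each jump in the values. In a real segment vocoder the frames would be spectral parameter vectors and the cost would typically involve the quantization distortion itself.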
Ingestion method: OAI harvesting
Source: Institute of Acoustics