Chinese Academy of Sciences Institutional Repositories Grid System: Spectrogram Implementation for Speaker Recognition

Chinese Academy of Sciences Institutional Repositories Grid

Spectrogram Implementation for Speaker Recognition

Document type: Thesis


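Author	You Wenying (游文颖)
Degree	Doctoral
Defense date	1999
Degree-granting institution	Institute of Acoustics, Chinese Academy of Sciences
Degree-granting location	Institute of Acoustics, Chinese Academy of Sciences
Keywords	spectrogram; Lerner filter; speaker recognition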
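Chinese abstract (translated)	The earliest research on spectrograms can be found in the book "Visible Speech" by R. K. Potter et al. Although the hope of using spectrogram displays to let deaf people converse was not realized, the spectrogram provides rich information about the spectral and temporal characteristics of speech; for example, speech parameters can be determined by taking measurements from a spectrogram. With the development of computers, displaying speech spectrograms directly on a TV or CRT has become more convenient and faster, which provides a good tool for automatic speaker recognition. In this thesis, implementing a spectrogram means passing the digitized speech signal through a low-pass filter and a bank of linear-phase band-pass filters whose passbands adjoin one another in the frequency domain, obtaining an estimated spectrum, and from it producing a three-dimensional spectrogram over time, frequency and energy that displays the characteristic parameters of speech. The aim of the thesis is to use spectrograms for automatic speaker recognition, and it mainly presents the principles and methods of implementing spectrograms with a computer and a DSP. It first describes the mechanism and fundamentals of speech production and the corresponding digital model. It then introduces one of the Fourier methods of spectral estimation, the filter-bank method. In this method, a set of closely adjoining narrow-band filters is placed side by side across the frequency band of interest, and the output of each narrow-band filter is squared and integrated; the results give estimates of the input power spectrum at a series of frequencies. The thesis gives a detailed proof and derivation. It also introduces a linear-phase filter used to implement the filter-bank method, the Lerner filter. A Lerner filter can be built as either a wide-band or a narrow-band band-pass filter; here, N narrow-band filters are joined across the required frequency band to form a narrow-band filter bank. Each narrow-band filter has an approximately flat passband magnitude, a sharp cutoff and linear phase, and the filter bank built from them has the same properties while remaining compact in structure. Using such a filter bank for spectral estimation yields a precise analysis bandwidth, and because of the linear phase the spectral estimates are also relatively accurate. The thesis then describes the hardware structure of the spectrogram system, comprising a pre-processing board, an A/D and D/A board, a high-speed DSP board and a post-processing board. On the software side, it presents how spectral estimation is implemented on the DSP (TMS320C50). Building on the spectral estimates, it then presents methods for displaying and processing spectrograms on a PC under Windows. Spectrogram analysis covers both the narrow-band and the wide-band case. Finally, the results of the system's electrical performance tests and of the spectrogram tests are given.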
English abstract	Early research on the spectrogram can be found in the original work "Visible Speech" by R. K. Potter. The hope of conversing with deaf people by means of spectrograms was not realized, but the spectrogram provides a great deal of information in the frequency and time domains. The spectrogram has since become a good tool for speaker recognition. With the development of computers, spectrograms can be displayed more conveniently on a TV or CRT. Here, realizing a spectrogram refers to a 3D display of speech over time, frequency and energy. The procedure mainly consists of narrow-band filtering and energy estimation in the narrow bands. The purpose of this thesis is to introduce the theory and method of the 3D spectrogram. First, the vocalization mechanism and the digital model of speech are described. Second, the filter-bank method for spectrum estimation is proved and derived. The Lerner linear-phase band-pass filter used in the filter bank is also analyzed in detail. Third, the hardware structure of the whole system is introduced, including the pre-processing board, the A/D and D/A board, the DSP board and the post-processing board. The spectrum estimation algorithm running on the DSP (TMS320C50) is presented. Display and post-processing methods for the spectrogram are also discussed. Finally, the performance and test results of the system are given.
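Language	Chinese
Date made public	2011-05-07
Pages	127
Source URL	[http://159.226.59.140/handle/311008/654]
Collection	Institute of Acoustics_IOA Master's and Doctoral Theses_1981-2009 Master's and Doctoral Theses
Recommended citation (GB/T 7714)	You Wenying. Spectrogram Implementation for Speaker Recognition [D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 1999.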
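
The filter-bank method summarized in the abstract (adjoining narrow-band filters, each output squared and integrated) can be illustrated with a minimal sketch. This is not the thesis implementation: Hamming-windowed-sinc FIR filters stand in for the Lerner filters, which are not reproduced here, and every function name, the 8 kHz sampling rate, the eight 500 Hz bands, the 400-sample frame and the 1250 Hz test tone are hypothetical choices made for illustration only.

```python
import math


def bandpass_fir(f_lo, f_hi, fs, num_taps=101):
    """Linear-phase band-pass FIR taps (Hamming-windowed sinc).

    Assumption: a stand-in for the Lerner filter named in the abstract.
    num_taps must be odd so the taps are symmetric about an integer index.
    """
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            h = 2.0 * (f_hi - f_lo) / fs
        else:
            h = (math.sin(2 * math.pi * f_hi * k / fs)
                 - math.sin(2 * math.pi * f_lo * k / fs)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * window)
    return taps


def fir_filter(taps, x):
    """Direct-form FIR convolution with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(min(len(taps), n + 1)):
            acc += taps[k] * x[n - k]
        y.append(acc)
    return y


def filter_bank_spectrogram(x, fs, f_min, f_max, num_bands, frame_len):
    """Return spec[frame][band]: mean squared output of each band per frame."""
    width = (f_max - f_min) / num_bands
    band_outputs = []
    for b in range(num_bands):
        taps = bandpass_fir(f_min + b * width, f_min + (b + 1) * width, fs)
        band_outputs.append(fir_filter(taps, x))
    spec = []
    for m in range(len(x) // frame_len):
        row = []
        for y in band_outputs:
            frame = y[m * frame_len:(m + 1) * frame_len]
            # Square and integrate (here: average) over the frame.
            row.append(sum(v * v for v in frame) / frame_len)
        spec.append(row)
    return spec


# Hypothetical test signal: a unit-amplitude 1250 Hz tone sampled at 8 kHz.
fs = 8000
x = [math.sin(2 * math.pi * 1250 * n / fs) for n in range(1600)]
spec = filter_bank_spectrogram(x, fs, 0.0, 4000.0, num_bands=8, frame_len=400)
strongest = [row.index(max(row)) for row in spec]
print(strongest)
```

Each row of `spec` is one time slice of the time-frequency-energy display; the tone lies inside the third band (1000 to 1500 Hz), so that band should carry most of the energy in every frame. Narrower bands with longer frames would correspond to narrow-band analysis, wider bands with shorter frames to wide-band analysis.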

Ingestion method: OAI harvesting

Source: Institute of Acoustics

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.