Chinese Academy of Sciences Institutional Repositories Grid
An MT-oriented Study of Sentence Category and Sentence Format Transfer from Chinese to English (面向机器翻译的汉英句类及句式转换研究)

Document type: Dissertation

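Author: Zhang Keliang (张克亮)
Degree: Doctoral
Defense date: 2004
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences
Place of conferral: Institute of Acoustics, Chinese Academy of Sciences
Keywords: machine translation; Hierarchical Network of Concepts (HNC) theory; sentence category; sentence format; Chinese-English transfer
Alternative title: An MT-oriented Study of Sentence Category and Sentence Format Transfer from Chinese to English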
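Chinese abstract: Over more than 50 years, machine translation (MT) worldwide has followed a tortuous path of development and achieved many remarkable results. However, although researchers have tried all manner of linguistic theories and translation techniques, the MT systems developed have never exceeded 70% accuracy, giving rise to the "snowline" phenomenon in the MT field. This is especially true of Chinese-English MT: lacking the guidance of a linguistic theory well suited to analyzing and understanding Chinese, the many Chinese-English MT systems developed have failed to achieve a breakthrough and remain far from a practical level that satisfies users. The Hierarchical Network of Concepts (HNC) theory is a theory of the cognitive mechanism of human language and also a computer-oriented theory of natural language understanding. Research on Chinese-English sentence category and sentence format transfer based on the HNC theory is significant in two respects: (1) taking English as a pilot case and point of breakthrough, it tests the universality of the HNC theory, thereby further developing this innovative theory of natural language understanding and accelerating the progress of HNC from its base in Chinese out into the world; (2) taking MT as its goal, it explores the rules and mechanisms of source-to-target language transfer, thereby advancing research on an HNC-based MT engine and creating the conditions needed to develop an HNC MT system. The research relies on two main methods. The first is contrastive study: the sentence categories and sentence formats that the HNC theory induced and deduced from Modern Chinese are applied on a trial basis to English, and the similarities and differences between the two languages' sentence categories and sentence formats are analyzed in terms of number, structure and distribution. The second is induction and generalization: the general rules of sentence category and sentence format transfer between Chinese and English are revealed by tagging and analyzing Chinese-English/English-Chinese sentence-aligned corpora. The goal of the research is to explore the general rules of Chinese-English sentence category and sentence format transfer under the guidance of the HNC theory's ideas on sentence categories, sentence formats and MT. The content mainly covers the following: (1) analyzing the principles and structure of a Chinese-English MT system based on the HNC theory, and formulating the strategies and methods that HNC MT should adopt; (2) delimiting the types of sentence category transfer in theoretical terms, and defining a formal framework for describing sentence category transfer, TransFrame; (3) in view of the language-specific features of English, defining new sentence format expressions such as !2/1J, !24/1J and !212/1J, analyzing in detail the similarities and differences between Chinese and English in sentence format expression, and studying the general rules of Chinese-English sentence format transfer; (4) selecting from the 57 groups of HNC basic sentence categories such important categories as yes-no judgment sentences, bearing sentences (comprising four first-level subclasses: general, active, passive and special), chunk-extended action sentences and concise state sentences, together with such typical categories as effect sentences, existential judgment sentences and comparative judgment sentences, and studying in depth the rules governing their sentence category and sentence format transfer; (5) collecting and collating Chinese-English/English-Chinese parallel texts, building a Chinese-English/English-Chinese sentence-aligned corpus, and tagging and analyzing the corpus.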
English abstract: Machine translation (MT) worldwide has made remarkable progress over the past 50-odd years of strenuous endeavor. However, among all the available MT systems, which have drawn on various linguistic theories and translation technologies, none has ever broken through the "snowline" of 70% translation accuracy. For Chinese-English MT, the status quo is even worse. Owing to the absence of a linguistic theory suited to the analysis and understanding of the Chinese language, and to inadequacies in the models of natural language representation and processing, the Chinese-English MT systems now available still have a very long way to go before they can satisfactorily meet users' needs. The Hierarchical Network of Concepts (HNC) theory, which is by nature intended for exploring the human cognitive mechanism of language acquisition, is well suited to the task of computer understanding of natural languages. The present research, an MT-oriented study of sentence category (SC) and sentence format (SF) transfer from Chinese to English, is significant in two respects. First, by taking English as a starting point and test bed, we can check to what extent the HNC theory applies to languages other than Chinese, and thus further develop this innovative theory and accelerate its advance toward the world. Second, by exploring the rules and mechanisms of transfer from source language to target language in MT, we can promote research on an HNC-based MT engine and thus lay the foundations for developing HNC MT systems. Two main methods are used in this study. One is comparison and contrast: a comparative study is made of Chinese and English in terms of the quantity, form and distribution of their respective SCs and SFs. The other is induction and generalization: the general rules of SC and SF transfer from Chinese to English are inferred by tagging and analyzing a bilingual corpus of aligned Chinese-English sentence pairs. The study aims to explore the general rules underlying SC and SF transfer from Chinese to English under the guidance of the HNC theory, especially its ideas on SC, SF and MT. The thesis mainly covers the following: (1) introducing the HNC viewpoints on MT, analyzing the framework of a possible MT system based on the HNC theory, and proposing general strategies and guidelines for HNC-based MT systems to follow; (2) defining the categories of SC transfer and a formal way to describe them, i.e. TransFrame; (3) defining some novel SFs that are characteristic of the English language, making a comparative study of Chinese and English in terms of SF, and discussing in detail the general rules underlying SF transfer between the two languages; (4) investigating the general rules underlying the SC and SF transfer of such significant sentence categories as the yes-no judgment sentence, bearing sentence, chunk-extended action sentence, concise state sentence, effect sentence, existential sentence, comparative sentence, and so on; (5) collecting sufficient bidirectional, bilingual Chinese-English raw materials, building corpora of aligned Chinese-English sentence pairs, and tagging and analyzing the materials.
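Language: Chinese
Date made public: 2011-05-07
Pages: 126
Source URL: [http://159.226.59.140/handle/311008/824]
Collection: Institute of Acoustics_IOA Doctoral and Master's Theses_Doctoral and Master's Theses 1981-2009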
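Recommended citation
GB/T 7714
Zhang Keliang (张克亮). An MT-oriented Study of Sentence Category and Sentence Format Transfer from Chinese to English (面向机器翻译的汉英句类及句式转换研究)[D]. Institute of Acoustics, Chinese Academy of Sciences. Institute of Acoustics, Chinese Academy of Sciences. 2004.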

Ingest method: OAI harvesting

Source: Institute of Acoustics

