Chinese Academy of Sciences Institutional Repositories Grid: Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora

Chinese Academy of Sciences Institutional Repositories Grid

Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora

Document type: Conference paper


Authors	Nuo Minghua ; Liu Huidan ; Ma Longlong ; Wu Jian ; Ding Zhiming
Publication date	2011
Conference name	2011 International Conference on Asian Language Processing, IALP 2011
Conference date	November 1
Conference location	Penang, Malaysia
Keywords	Natural language processing systems
Pages	177-180
Abstract	This paper aims to construct a Chinese-Tibetan multi-word equivalent pair dictionary for a Chinese-Tibetan computer-aided translation system. Since Tibetan is a morphologically rich language, we propose a two-phase framework to automatically extract multi-word equivalent pairs. The first phase extracts Chinese multi-word units (MWUs): we propose the CBEM model, which partitions a Chinese sentence into MWUs using two measures, collocation and binding degree. The second phase obtains Tibetan translations of the extracted Chinese MWUs: we propose the TSIM model, which focuses on extracting 1-to-n bilingual MWUs. Preliminary experimental results show that the mixed method combining the CBEM model with the TSIM model is effective. © 2011 IEEE.
Indexed by	EI
Proceedings	Proceedings - 2011 International Conference on Asian Language Processing, IALP 2011
Language	English
ISBN	9780769545547
Source URL	[http://ir.iscas.ac.cn/handle/311060/16257]
Collection	Institute of Software_Institute of Software Library_Conference Papers
Recommended citation (GB/T 7714)	Nuo Minghua, Liu Huidan, Ma Longlong, et al. Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora[C]. In: 2011 International Conference on Asian Language Processing, IALP 2011. Penang, Malaysia. November 1.

Ingest method: OAI harvesting

Source: Institute of Software

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.