A Phrase Table Filtering Model Based on Binary Classification for Uyghur-Chinese Machine Translation
Document type: Journal article
Authors | Chenggang Mi; Yating Yang |
Journal | Journal of Computers |
Publication date | 2014 |
Volume | 9; Issue: 12; Pages: 2780-2786 |
English abstract | In statistical machine translation, a large number of unreasonable phrase pairs in a phrase table can degrade decoding efficiency and overall translation performance, especially in Uyghur-Chinese machine translation. In this paper, we present a novel phrase table filtering model based on binary classification, which considers the differences between Uyghur and Chinese and draws on binary classification in machine learning. In our model, four features are considered: 1) difference in length between the source and target phrases; 2) proportion of translated words in phrase pairs; 3) proportion of symbol words; 4) average number of co-occurrence words in the training corpus. We use this model to generate a filtered phrase table. Experimental results show that this new filtering model can improve the performance and efficiency of our current Uyghur-Chinese machine translation system. |
Source URL | [http://ir.xjipc.cas.cn/handle/365002/5193] |
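Collection | Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory |
Author affiliations | Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China |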
Recommended citation GB/T 7714 | Chenggang Mi, Yating Yang, Xi Zhou, et al. A Phrase Table Filtering Model Based on Binary Classification for Uyghur-Chinese Machine Translation[J]. Journal of Computers, 2014, 9(12): 2780-2786. |
APA | Chenggang Mi, Yating Yang, Xi Zhou, Lei Wang, Xiao Li, & Tursun, E. (2014). A Phrase Table Filtering Model Based on Binary Classification for Uyghur-Chinese Machine Translation. Journal of Computers, 9(12), 2780-2786. |
MLA | Chenggang Mi, et al. "A Phrase Table Filtering Model Based on Binary Classification for Uyghur-Chinese Machine Translation." Journal of Computers 9.12 (2014): 2780-2786. |
Ingest method: OAI harvesting
Source: Xinjiang Technical Institute of Physics and Chemistry
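
The abstract names four features and a binary keep/discard decision but gives no formulas, so the following is only a minimal sketch of how such a filter might be wired together. Everything concrete here is an assumption made for illustration: the feature definitions, the helper names (`phrase_pair_features`, `keep_probability`, `filter_phrase_table`), the toy lexicon and co-occurrence counts, and the hand-picked logistic weights. In the paper's setting the classifier would presumably be trained on labeled phrase pairs rather than given fixed weights.

```python
import math


def is_symbol(token):
    # Assumption: a "symbol word" is a token with no letter or digit in it.
    return token != "" and not any(ch.isalnum() for ch in token)


def phrase_pair_features(src, tgt, lexicon, cooc_counts):
    """Compute four illustrative features for one phrase pair.

    src, tgt: lists of tokens (source and target phrase).
    lexicon: hypothetical dict, source word -> set of target translations.
    cooc_counts: hypothetical dict, (source word, target word) -> count
        of co-occurrences in the training corpus.
    """
    # 1) Difference in length between source and target phrase.
    length_diff = abs(len(src) - len(tgt))

    # 2) Proportion of source words with a known translation in the target.
    tgt_set = set(tgt)
    translated = sum(1 for w in src if lexicon.get(w, set()) & tgt_set)
    translated_ratio = translated / len(src) if src else 0.0

    # 3) Proportion of symbol tokens over both sides of the pair.
    tokens = src + tgt
    symbol_ratio = (
        sum(1 for t in tokens if is_symbol(t)) / len(tokens) if tokens else 0.0
    )

    # 4) Average co-occurrence count over all source/target word pairs.
    word_pairs = [(s, t) for s in src for t in tgt]
    avg_cooc = (
        sum(cooc_counts.get(p, 0) for p in word_pairs) / len(word_pairs)
        if word_pairs
        else 0.0
    )
    return [length_diff, translated_ratio, symbol_ratio, avg_cooc]


def keep_probability(features, weights, bias):
    # Logistic (sigmoid) score of a linear combination of the features.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def filter_phrase_table(pairs, lexicon, cooc_counts, weights, bias, threshold=0.5):
    # Keep only the pairs that the binary decision rule classifies as "keep".
    kept = []
    for src, tgt in pairs:
        feats = phrase_pair_features(src, tgt, lexicon, cooc_counts)
        if keep_probability(feats, weights, bias) >= threshold:
            kept.append((src, tgt))
    return kept


# Toy data, invented for this sketch (not taken from the paper).
toy_lexicon = {"kitab": {"书"}, "yaxshi": {"好"}}
toy_cooc = {("kitab", "书"): 4, ("yaxshi", "好"): 2}
toy_pairs = [
    (["yaxshi", "kitab"], ["好", "书"]),          # plausible pair
    (["kitab", ","], ["。", "的", "了", "在", "，"]),  # noisy pair
]
# Hypothetical weights; a real system would learn these from labeled pairs.
toy_weights = [-1.0, 2.0, -2.0, 0.5]
toy_bias = -0.5

kept = filter_phrase_table(toy_pairs, toy_lexicon, toy_cooc, toy_weights, toy_bias)
print(kept)
```

With these made-up weights, a large length difference and a high share of symbol tokens push a pair toward being discarded, while translated words and frequent co-occurrence push it toward being kept. Any standard binary classifier could take the place of the fixed logistic rule.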