Towards Compact and Fast Neural Machine Translation Using a Combined Method
Document Type | Conference Paper
Authors | Xiaowei Zhang 1,2
Publication Date | 2017-09
Conference Date | 2017-09
Conference Venue | Copenhagen, Denmark
Keywords | Machine Translation; Neural Network; Model Compression; Decoding Speedup
Pages | 1475–1481
Abstract | Neural Machine Translation (NMT) places an intensive burden on computation and memory. It is a challenge to deploy NMT models on devices with limited computation and memory budgets. This paper presents a four-stage pipeline to compress the model and speed up decoding for NMT. Our method first introduces a compact architecture based on a convolutional encoder and weight-shared embeddings. Then weight pruning is applied to obtain a sparse model. Next, we propose a fast sequence interpolation approach which enables greedy decoding to achieve performance on par with beam search; hence, the time-consuming beam search can be replaced by simple greedy decoding. Finally, vocabulary selection is used to reduce the computation of the softmax layer. Our final model achieves a 10× speedup, a 17× reduction in parameters, a storage size under 35MB, and performance comparable to the baseline model. (An illustrative sketch of the pruning stage appears after the record fields below.)
Language | English
Source URL | http://ir.ia.ac.cn/handle/173211/21185
Collection | Research Center for Brain-Inspired Intelligence_Neural Computation and Brain-Computer Interaction
Author Affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
Recommended Citation (GB/T 7714) | Xiaowei Zhang, Wei Chen, Feng Wang, et al. Towards Compact and Fast Neural Machine Translation Using a Combined Method[C]. Copenhagen, Denmark, 2017-09: 1475–1481.
Deposit Method | OAI Harvesting
Source | Institute of Automation
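
The abstract's second stage, weight pruning, is concrete enough to sketch. Below is a minimal magnitude-based pruning example in Python/NumPy; the global per-matrix threshold rule, the `prune_by_magnitude` name, and the 0.8 prune rate are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def prune_by_magnitude(weights: np.ndarray, prune_rate: float = 0.8):
    """Zero out the smallest-magnitude entries, keeping (1 - prune_rate) of them."""
    # Magnitude below which weights are dropped (assumed per-matrix quantile rule).
    threshold = np.quantile(np.abs(weights), prune_rate)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Usage: prune a stand-in 512x512 projection matrix to ~80% sparsity.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512)).astype(np.float32)
W_sparse, mask = prune_by_magnitude(W, prune_rate=0.8)
print(f"sparsity: {1.0 - mask.mean():.2f}")  # ~0.80
```

Storing only the surviving weights (e.g. in a compressed sparse format) is what would let a pruned model fit in the reported <35MB footprint.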