Chinese Academy of Sciences Institutional Repositories Grid
Convolutional Attention Networks for Scene Text Recognition

Document type: Journal article

Authors: Xie, HT (Xie, Hongtao) [1]; Fang, SC (Fang, Shancheng) [2,3]; Zha, ZJ (Zha, Zheng-Jun) [1]; Yang, YT (Yang, Yating) [4]; Li, Y (Li, Yan) [5]; Zhang, YD (Zhang, Yongdong) [1]
Journal: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
Publication date: 2019
Volume: 15; Issue: 1 (supplement); Pages: 3-17
Keywords: text recognition; text detection; convolutional neural networks; multi-level supervised information; attention model
ISSN: 1551-6857
DOI: 10.1145/3231737
Abstract

In this article, we present Convolutional Attention Networks (CAN) for unconstrained scene text recognition. Recent dominant approaches for scene text recognition are mainly based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), where the CNN encodes images and the RNN generates character sequences. Unlike these methods, our CAN is built entirely on CNNs and includes an attention mechanism. The distinctive characteristics of our method are: (i) CAN follows an encoder-decoder architecture, in which the encoder is a deep two-dimensional CNN and the decoder is a one-dimensional CNN; (ii) the attention mechanism is applied in every convolutional layer of the decoder, and we propose a novel spatial attention method using average pooling; and (iii) position embeddings are used in both the spatial encoder and the sequence decoder to give our networks a sense of location. We conduct experiments on standard datasets for scene text recognition, including Street View Text, IIIT5K, and ICDAR datasets. The experimental results validate the effectiveness of the different components and show that our convolution-based method achieves state-of-the-art or competitive performance compared with prior works, even without the use of RNNs.
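
As a rough illustration of point (ii), the sketch below shows how a single decoder step could attend over a two-dimensional encoder feature map. It is not the authors' implementation: the abstract does not say how average pooling enters the attention computation, so smoothing the score map with a 3x3 average pool before the softmax is purely an assumption, as are the function names (`spatial_attention`, `average_pool_3x3`), the scaled dot-product scoring, and the nested-list data layout. Only the Python standard library is used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_pool_3x3(grid):
    """3x3 average pooling with stride 1, averaging only in-bounds neighbours."""
    height, width = len(grid), len(grid[0])
    pooled = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            pooled[r][c] = sum(window) / len(window)
    return pooled


def spatial_attention(query, features):
    """Attend from one decoder state over an H x W x D feature map.

    query:    list of D floats (hypothetical decoder state for one step).
    features: H x W x D nested lists (hypothetical encoder output).
    Returns (context vector of length D, H x W attention weights).
    """
    height, width, dim = len(features), len(features[0]), len(query)
    # Scaled dot-product score at every spatial position.
    scores = [
        [
            sum(q * f for q, f in zip(query, features[r][c])) / math.sqrt(dim)
            for c in range(width)
        ]
        for r in range(height)
    ]
    # ASSUMPTION: average pooling is applied to the score map; the abstract
    # only states that the spatial attention "uses average pooling".
    scores = average_pool_3x3(scores)
    flat = softmax([s for row in scores for s in row])
    weights = [flat[r * width:(r + 1) * width] for r in range(height)]
    # Context vector: attention-weighted sum of the feature vectors.
    context = [
        sum(weights[r][c] * features[r][c][d]
            for r in range(height) for c in range(width))
        for d in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    toy_features = [[[float(r + c), 1.0] for c in range(4)] for r in range(2)]
    context, weights = spatial_attention([1.0, 0.0], toy_features)
    print(context)
    print(weights)
```

In a CAN-style decoder, a step like this would run once per decoder layer and per output position, with position embeddings added to both `query` and `features` beforehand; those parts are omitted here.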

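WOS accession number: WOS:000459798100003
Source URL: http://ir.xjipc.cas.cn/handle/365002/5690
Collection: Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory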
Author affiliations:
1. Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Anhui, Peoples R China
2. Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
4. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China
5. Beijing Kuaishou Technol Co Ltd, Beijing, Peoples R China
Recommended citation
GB/T 7714
Xie, HT, Fang, SC, Zha, ZJ, et al. Convolutional Attention Networks for Scene Text Recognition[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15(1 Suppl.): 3-17.
APA: Xie, HT, Fang, SC, Zha, ZJ, Yang, YT, Li, Y, & Zhang, YD. (2019). Convolutional Attention Networks for Scene Text Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 15(1 Suppl.), 3-17.
MLA: Xie, HT, et al. "Convolutional Attention Networks for Scene Text Recognition". ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 15.1 Suppl. (2019): 3-17.

Ingest method: OAI harvesting

Source: Xinjiang Technical Institute of Physics and Chemistry


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.