An End-to-end TextSpotter with Explicit Alignment and Attention
Document type: Conference paper
Authors | Tong He; Zhi Tian; Weilin Huang; Chunhua Shen; Yu Qiao; Changming Sun |
Publication date | 2018 |
Conference date | 2018 |
Conference location | United States |
Abstract | Text detection and recognition in natural images have long been considered as two separate tasks that are processed sequentially. Training of two tasks in a unified framework is non-trivial due to significant differences in optimisation difficulties. In this work, we present a conceptually simple yet efficient framework that simultaneously processes the two tasks in one shot. Our main contributions are three-fold: 1) we propose a novel text-alignment layer that allows it to precisely compute convolutional features of a text instance in arbitrary orientation, which is the key to boost the performance; 2) a character attention mechanism is introduced by using character spatial information as explicit supervision, leading to large improvements in recognition; 3) two technologies, together with a new RNN branch for word recognition, are integrated seamlessly into a single model which is end-to-end trainable. This allows the two tasks to work collaboratively by sharing convolutional features, which is critical to identify challenging text instances. Our model achieves impressive results in end-to-end recognition on the ICDAR2015 dataset, significantly advancing most recent results, with improvements of F-measure from (0.54, 0.51, 0.47) to (0.82, 0.77, 0.63), by using a strong, weak and generic lexicon respectively. Thanks to joint training, our method can also serve as a good detector by achieving a new state-of-the-art detection performance on two datasets. |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/13686] |
Collection | Shenzhen Institutes of Advanced Technology (深圳先进技术研究院), Integration Institute (集成所) |
Recommended citation (GB/T 7714) | Tong He, Zhi Tian, Weilin Huang, et al. An End-to-end TextSpotter with Explicit Alignment and Attention[C]. In: . United States. 2018. |
Ingest method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology