A Two-Level Rectification Attention Network for Scene Text Recognition
Document type: Journal article
Authors | Wu, Lintai1,8,9; Xu, Yong1,7,9; Hou, Junhui6,8; Chen, C. L. Philip4,5; Liu, Cheng-Lin2,3 |
Journal | IEEE TRANSACTIONS ON MULTIMEDIA |
Publication date | 2023 |
Volume | 25, Pages: 2404-2414 |
Keywords | Scene text recognition; text rectification; spatial transformer network; optical character recognition |
ISSN | 1520-9210 |
DOI | 10.1109/TMM.2022.3146779 |
Corresponding author | Xu, Yong(yongxu@ymail.com) |
Abstract | Scene text recognition is a challenging task in the computer vision field due to the diversity of text styles and the complexity of the image backgrounds. In recent decades, numerous text rectification and recognition methods have been proposed to solve these problems. However, most of these methods rectify texts at the geometry level or pixel level. The former is limited by geometric constraints, and the latter is prone to blurring the text. In this paper, we propose a two-level rectification attention network (TRAN) to rectify and recognize texts. This network consists of two parts: a two-level rectification network (TORN) and an attention-based recognition network (ABRN). Specifically, the TORN first rectifies texts at the geometry level and then performs a pixel-level adjustment, which not only eliminates the geometric constraints but also renders clear texts. The ABRN's role is to recognize text in the rectified images. To improve the feature extraction ability of our model, we design a new channel-wise and kernel-wise attention unit, which enables the network to handle significant variations of character size and channel interdependencies. Furthermore, we propose a skip training strategy to make our model converge smoothly. We conduct experiments on various benchmarks, including regular and irregular datasets. The experimental results show that our method achieves a state-of-the-art performance. |
WOS keywords | EFFICIENT |
Funding projects | National Nature Science Foundation of China[61876051] ; Shenzhen Key Laboratory of Visual Object Detection and Recognition[ZDSYS20190902093015527] |
WOS research areas | Computer Science ; Telecommunications |
Language | English |
WOS accession number | WOS:001007432100062 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Funding organizations | National Nature Science Foundation of China ; Shenzhen Key Laboratory of Visual Object Detection and Recognition |
Source URL | [http://ir.ia.ac.cn/handle/173211/53681] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding author | Xu, Yong |
Author affiliations | 1.Harbin Inst Technol, Biocomp Res Ctr, Shenzhen 518055, Guangdong, Peoples R China 2.Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China 3.Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China 4.Pazhou Lab, Guangzhou 510335, Peoples R China 5.South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China 6.City Univ Hong Kong, Shenzhen Res Inst, Hong Kong, Peoples R China 7.Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China 8.City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China 9.Shenzhen Key Lab Visual Object Detect & Recognit, Shenzhen 518055, Guangdong, Peoples R China |
Recommended citation: GB/T 7714 | Wu, Lintai,Xu, Yong,Hou, Junhui,et al. A Two-Level Rectification Attention Network for Scene Text Recognition[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2023,25:2404-2414. |
APA | Wu, Lintai,Xu, Yong,Hou, Junhui,Chen, C. L. Philip,&Liu, Cheng-Lin.(2023).A Two-Level Rectification Attention Network for Scene Text Recognition.IEEE TRANSACTIONS ON MULTIMEDIA,25,2404-2414. |
MLA | Wu, Lintai,et al."A Two-Level Rectification Attention Network for Scene Text Recognition".IEEE TRANSACTIONS ON MULTIMEDIA 25(2023):2404-2414. |
Ingestion method: OAI harvesting
Source: Institute of Automation