Adaptive Scaling and Refined Pyramid Feature Fusion Network for Scene Text Segmentation
Document type: Journal article
Author | Li TZ (李天佐) |
Publication | ICDAR2024 |
Publication date | 2024 |
Pages | 1 |
Document subtype | International conference |
English abstract | Although scene text recognition has achieved high performance, text segmentation still needs to be improved. The goal of text segmentation is to obtain pixel-level foreground text masks from scene images. In this paper, we adaptively resize the input images to their optimal scales and propose the Refined Pyramid Feature Fusion Network (RPFF-Net) for robust scene text segmentation. To address the issue of inconsistent text scaling, we propose an adaptive image scaling method that takes into account the density of text regions in each scene image. In the RPFF-Net, we first extract multi-scale features from the backbone network, and then combine these features using effective pyramid feature fusion methods. To enhance the interaction between texts from contextual characters and extract features at different levels, we apply two self-attention mechanisms to the fusion feature map in spatial and channel dimensions. The experimental results demonstrate the effectiveness of our approach on several text segmentation benchmarks, including the monolingual TextSeg and bilingual BTS datasets, and show that it outperforms the existing state-of-the-art scene text segmentation methods even without OCR (optical character recognition) enhancement. |
Language | English |
Source URL | [http://ir.ia.ac.cn/handle/173211/57528] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Pattern Analysis and Learning Group |
Recommended citation GB/T 7714 | Li TZ,Zhang H,Li XH,et al. Adaptive Scaling and Refined Pyramid Feature Fusion Network for Scene Text Segmentation[J]. ICDAR2024,2024:1. |
APA | Li TZ,Zhang H,Li XH,&Yin F.(2024).Adaptive Scaling and Refined Pyramid Feature Fusion Network for Scene Text Segmentation.ICDAR2024,1. |
MLA | Li TZ,et al."Adaptive Scaling and Refined Pyramid Feature Fusion Network for Scene Text Segmentation".ICDAR2024 (2024):1. |
Ingest method: OAI harvesting
Source: Institute of Automation