Chinese Academy of Sciences Institutional Repositories Grid
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Document type: Conference paper

Authors: Chen, Zhiyang (2,4); Zhu, Yousong (2); Li, Zhaowen (2,4); Yang, Fan (1,2); Li, Wei (3); Wang, Haixin (2,4); Zhao, Chaoyang (2); Wu, Liwei (3); Zhao, Rui (3); Wang, Jinqiao (1,2,4)
Publication date: 2022-11-01
Conference date: 2022-11-28
Conference location: New Orleans, Louisiana & Online
Keywords: transformer; general visual framework; sequence prediction; multi-task
Abstract

Visual tasks vary a lot in their output formats and concerned contents, therefore it is hard to process them with an identical structure. One main obstacle lies in the high-dimensional outputs in object-level visual tasks. In this paper, we propose an object-centric vision framework, Obj2Seq. Obj2Seq takes objects as basic units, and regards most object-level visual tasks as sequence generation problems of objects. Therefore, these visual tasks can be decoupled into two steps. First recognize objects of given categories, and then generate a sequence for each of these objects. The definition of the output sequences varies for different tasks, and the model is supervised by matching these sequences with ground-truth targets. Obj2Seq is able to flexibly determine input categories to satisfy customized requirements, and be easily extended to different visual tasks. When experimenting on MS COCO, Obj2Seq achieves 45.7% AP on object detection, 89.0% AP on multi-label classification and 65.0% AP on human pose estimation. These results demonstrate its potential to be generally applied to different visual tasks.
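
The two steps described in the abstract (format each object as a task-specific sequence, then supervise by matching predicted sequences with ground-truth targets) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the sequence layouts ([x, y, w, h] for detection, box plus flattened keypoints for pose), the L1 cost, the brute-force one-to-one assignment, and the function names `detection_sequence`, `pose_sequence`, `l1_cost` and `match_sequences`.

```python
from itertools import permutations


def detection_sequence(box):
    """Hypothetical layout: a detected object as the sequence [x, y, w, h]."""
    x, y, w, h = box
    return [x, y, w, h]


def pose_sequence(box, keypoints):
    """Hypothetical layout: box coordinates followed by flattened (x, y) keypoints."""
    seq = list(box)
    for kx, ky in keypoints:
        seq.extend([kx, ky])
    return seq


def l1_cost(pred, target):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def match_sequences(preds, targets):
    """Find the one-to-one assignment of predictions to targets with minimal total L1 cost.

    Brute force over permutations, so only suitable for a handful of objects.
    Returns (pairs, cost), where pairs is a list of (pred_index, target_index).
    """
    assert len(preds) >= len(targets), "need at least one prediction per target"
    best_cost, best_pairs = float("inf"), []
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(l1_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(perm)]
    return best_pairs, best_cost


# Three predicted objects, two ground-truth objects (made-up numbers).
preds = [
    detection_sequence((10, 10, 5, 5)),
    detection_sequence((0, 0, 2, 2)),
    detection_sequence((50, 50, 8, 8)),
]
targets = [[0, 0, 2, 3], [10, 11, 5, 5]]
pairs, cost = match_sequences(preds, targets)
```

In this toy setup, only the sequence-building function changes between tasks, while the matching step stays the same. A practical system would likely replace the brute-force search with a polynomial-time assignment solver and a task-appropriate cost.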

Source URL: http://ir.ia.ac.cn/handle/173211/56593
Collection: Zidong Taichu Large Model Research Center_Large Model Computing
Author affiliations:
1. Peng Cheng Laboratory
2. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
3. SenseTime Research
4. School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended citation
GB/T 7714
Chen, Zhiyang, Zhu, Yousong, Li, Zhaowen, et al. Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks[C]. In: . New Orleans, Louisiana & Online. 2022-11-28.

Ingestion method: OAI harvesting

Source: Institute of Automation

