Chinese Academy of Sciences Institutional Repositories Grid
Zero-Shot Predicate Prediction for Scene Graph Parsing

Document type: Journal article

Authors: Li, Yiming (5); Yang, Xiaoshan (1,3,4); Huang, Xuhui (2); Ma, Zhe (2); Xu, Changsheng (1,3,4)
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Publication date: 2023
Volume: 25; Pages: 3140-3153
ISSN: 1520-9210
Keywords: Deep learning; zero-shot; scene graph
DOI: 10.1109/TMM.2022.3155928
Corresponding author: Xu, Changsheng (csxu@nlpr.ia.ac.cn)
Abstract: The scene graph is a structured semantic representation of an image in which vertices represent objects and edges represent relationships. Since it is impossible to manually label all potential relationships in the real world, some previous methods apply zero-shot learning to scene graph generation. However, existing methods take the triplet (i.e., subject-predicate-object) as the basic unit of a relationship, so each element (i.e., subject, predicate, or object) of an unseen relationship has actually been seen in the training data. Therefore, they ignore unseen predicates. To predict unseen predicates, we introduce a novel task named zero-shot predicate prediction, which is crucial to extending existing scene graph generation methods to recognize more relationship classes. The new task is challenging and cannot be resolved simply through conventional zero-shot learning methods because each predicate has large intra-class variation. First, the large intra-class variation makes it difficult to compute discriminative instance-level features for a predicate class. Second, it makes transferring knowledge from seen classes to unseen classes more difficult. To address the first challenge, we propose distilling lexical knowledge of different objects and constructing multi-modal representations of pairwise objects to reduce the intra-class variation of the predicate. To address the second challenge, we build a compact semantic space in which the representations of unseen classes are reconstructed from the seen classes for zero-shot predicate classification. We evaluate the proposed method on the public Visual Genome dataset. Extensive experimental results under the zero-shot, few-shot, and supervised settings demonstrate the effectiveness of the proposed method.
Funding projects: National Key Research and Development Program of China [2018AAA0100604]; National Natural Science Foundation of China [61720106006]; National Natural Science Foundation of China [62036012]; National Natural Science Foundation of China [61721004]; National Natural Science Foundation of China [62072455]; National Natural Science Foundation of China [U1836220]; National Natural Science Foundation of China [U1705262]; National Natural Science Foundation of China [61872424]; Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-JSC039]; Beijing Natural Science Foundation [L201001]
WOS research areas: Computer Science; Telecommunications
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:001045742200015
Funding agencies: National Key Research and Development Program of China; National Natural Science Foundation of China; Key Research Program of Frontier Sciences of CAS; Beijing Natural Science Foundation
Source URL: http://ir.ia.ac.cn/handle/173211/54026
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding author: Xu, Changsheng
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
2. CASIC, Acad 2, Lab 10, Beijing 100854, Peoples R China
3. PengCheng Lab, Shenzhen 518066, Peoples R China
4. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
5. Zhengzhou Univ, Sch Informat Engn, Zhengzhou 450001, Peoples R China
Recommended citation:
GB/T 7714: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, et al. Zero-Shot Predicate Prediction for Scene Graph Parsing[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 3140-3153.
APA: Li, Yiming, Yang, Xiaoshan, Huang, Xuhui, Ma, Zhe, & Xu, Changsheng. (2023). Zero-Shot Predicate Prediction for Scene Graph Parsing. IEEE TRANSACTIONS ON MULTIMEDIA, 25, 3140-3153.
MLA: Li, Yiming, et al. "Zero-Shot Predicate Prediction for Scene Graph Parsing." IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023): 3140-3153.

Ingest method: OAI harvesting

Source: Institute of Automation

