Learning Hierarchical Video Graph Networks for One-Stop Video Delivery
Document type: Journal article
Authors | Song, Yaguang (1,3); Gao, Junyu (1,3); Yang, Xiaoshan (1,2,3); Xu, Changsheng (1,2,3) |
Journal | ACM Transactions on Multimedia Computing, Communications, and Applications |
Publication date | 2022-01-27 |
Volume | 18; Issue: 1; Pages: 1-23 |
Keywords | Cross modal video retrieval; deep learning; graph neural networks |
Document subtype | Journal article |
Abstract | The explosive growth of video data has brought great challenges to video retrieval, which aims to find out related videos from a video collection. Most users are usually not interested in all the content of retrieved videos but have a more fine-grained need. In the meantime, most existing methods can only return a ranked list of retrieved videos lacking a proper way to present the video content. In this paper, we introduce a distinctively new task, namely One-Stop Video Delivery (OSVD) aiming to realize a comprehensive retrieval system with the following merits: it not only retrieves the relevant videos but also filters out irrelevant information and presents compact video content to users, given a natural language query and video collection. To solve this task, we propose an end-to-end Hierarchical Video Graph Reasoning framework (HVGR), which considers relations of different video levels and jointly accomplishes the one-stop delivery task. Specifically, we decompose the video into three levels, namely the video-level, moment-level, and the clip-level in a coarse-to-fine manner, and apply Graph Neural Networks (GNNs) on the hierarchical graph to model the relations. Furthermore, a pairwise ranking loss named Progressively Refined Loss is proposed based on prior knowledge that there is a relative order of the similarity of query-video, query-moment, and query-clip due to the different granularity of matched information. Extensive experimental results on benchmark datasets demonstrate that the proposed method achieves superior performance compared with baseline methods. |
Language | English |
Source URL | [http://ir.ia.ac.cn/handle/173211/51526] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding author | Xu, Changsheng |
Author affiliations | 1. National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; 2. Peng Cheng Laboratory; 3. School of Artificial Intelligence, University of Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Song, Yaguang,Gao, Junyu,Yang, Xiaoshan,et al. Learning Hierarchical Video Graph Networks for One-Stop Video Delivery[J]. ACM Transactions on Multimedia Computing, Communications, and Applications,2022,18(1):1-23. |
APA | Song, Yaguang,Gao, Junyu,Yang, Xiaoshan,&Xu, Changsheng.(2022).Learning Hierarchical Video Graph Networks for One-Stop Video Delivery.ACM Transactions on Multimedia Computing, Communications, and Applications,18(1),1-23. |
MLA | Song, Yaguang,et al."Learning Hierarchical Video Graph Networks for One-Stop Video Delivery".ACM Transactions on Multimedia Computing, Communications, and Applications 18.1(2022):1-23. |
Ingestion method: OAI harvesting
Source: Institute of Automation
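
Illustrative note on the abstract: the pairwise ranking idea behind the Progressively Refined Loss can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract says only that a relative order exists among the query-video, query-moment, and query-clip similarities; it does not give the direction of that order, the margin, or the exact formula. The function names `pairwise_hinge` and `ordered_ranking_loss`, the margin value, and the hinge form are all hypothetical, and the caller supplies the scores in whatever ascending order is assumed.

```python
def pairwise_hinge(s_low, s_high, margin=0.1):
    """Hinge penalty: zero once s_high exceeds s_low by at least `margin`.

    The hinge form and the default margin are assumptions for illustration.
    """
    return max(0.0, margin - (s_high - s_low))


def ordered_ranking_loss(sims_ascending, margin=0.1):
    """Sum hinge penalties over adjacent pairs of similarity scores.

    `sims_ascending` lists the similarities in the order the caller
    ASSUMES should be ascending (for example, the three granularity
    levels named in the abstract). The direction is not taken from
    the paper.
    """
    return sum(
        pairwise_hinge(low, high, margin)
        for low, high in zip(sims_ascending, sims_ascending[1:])
    )


if __name__ == "__main__":
    # Scores that respect the assumed order incur no penalty.
    print(ordered_ranking_loss([0.2, 0.5, 0.9]))
    # Scores that violate the assumed order are penalized.
    print(ordered_ranking_loss([0.5, 0.2, 0.9]))
```

Passing the scores as an ordered list keeps the assumed direction explicit at the call site, so the same sketch works for either ordering of the three levels.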