Chinese Academy of Sciences Institutional Repositories Grid
Dilated temporal relational adversarial network for generic video summarization

Document type: Journal article

Authors: Zhang, Yujia [2,3]; Kampffmeyer, Michael [4]; Liang, Xiaodan [5]; Zhang, Dingwen [1]; Tan, Min [2,3]; Xing, Eric P. [5]
Journal: MULTIMEDIA TOOLS AND APPLICATIONS
Publication date: 2019-10-12
Pages: 25
Keywords: Video summarization; Dilated temporal relation; Generative adversarial network; Three-player loss
ISSN: 1380-7501
DOI: 10.1007/s11042-019-08175-y
Corresponding author: Zhang, Yujia (zhangyujia2014@ia.ac.cn)
Abstract: The large number of videos appearing every day makes it increasingly critical that key information within videos can be extracted and understood in a very short time. Video summarization, the task of finding the smallest subset of frames that still conveys the whole story of a given video, is thus of great significance for improving the efficiency of video understanding. We propose a novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to achieve frame-level video summarization. Given a video, it selects the set of key frames that contain the most meaningful and compact information. Specifically, DTR-GAN learns a dilated temporal relational generator and a discriminator with a three-player loss in an adversarial manner. A new dilated temporal relation (DTR) unit is introduced to enhance the capture of temporal representations. The generator uses this unit to effectively exploit global multi-scale temporal context to select key frames and to complement the commonly used Bi-LSTM. To ensure that summaries capture enough key video representation from a global perspective, rather than being a trivial randomly shortened sequence, we present a discriminator that learns to enforce both the information completeness and the compactness of summaries via a three-player loss. The loss includes the generated summary loss, the random summary loss, and the real summary (ground-truth) loss, which play important roles in better regularizing the learned model to obtain useful summaries. Comprehensive experiments on three public datasets show the effectiveness of the proposed approach.
Funding projects: Department of Defense [FA8702-15-D-0002]; Carnegie Mellon University; National Natural Science Foundation of China [61673378]; National Natural Science Foundation of China [61333016]; Norwegian Research Council FRIPRO grant [239844]
WOS research areas: Computer Science; Engineering
Language: English
WOS accession number: WOS:000492236300003
Publisher: SPRINGER
Funding organizations: Department of Defense; Carnegie Mellon University; National Natural Science Foundation of China; Norwegian Research Council FRIPRO grant
Source URL: http://ir.ia.ac.cn/handle/173211/28910
Collection: Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Advanced Robot Control Team
Author affiliations:
1. Xidian Univ, Xian 710071, Shaanxi, Peoples R China
2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
3. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
4. UiT Arctic Univ Norway, Machine Learning Grp, N-9019 Tromso, Norway
5. Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
Recommended citation:
GB/T 7714: Zhang, Yujia, Kampffmeyer, Michael, Liang, Xiaodan, et al. Dilated temporal relational adversarial network for generic video summarization[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019: 25.
APA: Zhang, Yujia, Kampffmeyer, Michael, Liang, Xiaodan, Zhang, Dingwen, Tan, Min, & Xing, Eric P. (2019). Dilated temporal relational adversarial network for generic video summarization. MULTIMEDIA TOOLS AND APPLICATIONS, 25.
MLA: Zhang, Yujia, et al. "Dilated temporal relational adversarial network for generic video summarization." MULTIMEDIA TOOLS AND APPLICATIONS (2019): 25.

Ingestion method: OAI harvesting

Source: Institute of Automation

