Chinese Academy of Sciences Institutional Repositories Grid: Dilated temporal relational adversarial network for generic video summarization

Dilated temporal relational adversarial network for generic video summarization

Document type: Journal article


Authors	Zhang, Yujia [2,3]; Kampffmeyer, Michael [4]; Liang, Xiaodan [5]; Zhang, Dingwen [1]; Tan, Min [2,3]; Xing, Eric P. [5]
Journal	MULTIMEDIA TOOLS AND APPLICATIONS
Publication date	2019-10-12
Pages	25
Keywords	Video summarization; Dilated temporal relation; Generative adversarial network; Three-player loss
ISSN	1380-7501
DOI	10.1007/s11042-019-08175-y
Corresponding author	Zhang, Yujia (zhangyujia2014@ia.ac.cn)
Abstract	The large number of videos appearing every day makes it increasingly critical that key information within videos can be extracted and understood in a very short time. Video summarization, the task of finding the smallest subset of frames that still conveys the whole story of a given video, is thus of great significance for improving the efficiency of video understanding. We propose a novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to achieve frame-level video summarization. Given a video, it selects the set of key frames that contain the most meaningful and compact information. Specifically, DTR-GAN learns a dilated temporal relational generator and a discriminator with a three-player loss in an adversarial manner. A new dilated temporal relation (DTR) unit is introduced to enhance the capture of temporal representations. The generator uses this unit to effectively exploit global multi-scale temporal context to select key frames and to complement the commonly used Bi-LSTM. To ensure that summaries capture enough key video representation from a global perspective, rather than being a trivial, randomly shortened sequence, we present a discriminator that learns to enforce both the information completeness and compactness of summaries via a three-player loss. The loss includes the generated summary loss, the random summary loss, and the real summary (ground-truth) loss, which play important roles in better regularizing the learned model to obtain useful summaries. Comprehensive experiments on three public datasets show the effectiveness of the proposed approach.
Funding projects	Department of Defense [FA8702-15-D-0002]; Carnegie Mellon University; National Natural Science Foundation of China [61673378]; National Natural Science Foundation of China [61333016]; Norwegian Research Council FRIPRO grant [239844]
WOS research areas	Computer Science; Engineering
Language	English
WOS accession number	WOS:000492236300003
Publisher	SPRINGER
Funding organizations	Department of Defense; Carnegie Mellon University; National Natural Science Foundation of China; Norwegian Research Council FRIPRO grant
Source URL	[http://ir.ia.ac.cn/handle/173211/28910]
Collection	Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Advanced Robot Control Team
Corresponding author	Zhang, Yujia
Author affiliations	1. Xidian Univ, Xian 710071, Shaanxi, Peoples R China; 2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 4. UiT Arctic Univ Norway, Machine Learning Grp, N-9019 Tromso, Norway; 5. Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
Recommended citation
GB/T 7714	Zhang, Yujia, Kampffmeyer, Michael, Liang, Xiaodan, et al. Dilated temporal relational adversarial network for generic video summarization[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019: 25.
APA	Zhang, Yujia, Kampffmeyer, Michael, Liang, Xiaodan, Zhang, Dingwen, Tan, Min, & Xing, Eric P. (2019). Dilated temporal relational adversarial network for generic video summarization. MULTIMEDIA TOOLS AND APPLICATIONS, 25.
MLA	Zhang, Yujia, et al. "Dilated temporal relational adversarial network for generic video summarization". MULTIMEDIA TOOLS AND APPLICATIONS (2019): 25.
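
The abstract's idea of "multi-scale temporal context" from dilation can be illustrated with a small sketch. This is not the authors' DTR unit, whose details are not given in this record. It is a generic dilated temporal average over per-frame scores, and the function names, the three-tap window, and the dilation rates (1, 2, 4) are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def dilated_context(features: Sequence[float], dilation: int) -> List[float]:
    """Average each frame with its neighbours `dilation` steps away.

    Illustrative only: a three-tap dilated window, not the paper's DTR unit.
    """
    n = len(features)
    out = []
    for t in range(n):
        # Neighbours that fall outside the video are skipped, not zero-padded.
        idx = [i for i in (t - dilation, t, t + dilation) if 0 <= i < n]
        out.append(sum(features[i] for i in idx) / len(idx))
    return out


def multi_scale_context(
    features: Sequence[float], dilations: Sequence[int] = (1, 2, 4)
) -> List[List[float]]:
    """Return one context row per dilation rate (rates are assumed values)."""
    return [dilated_context(features, d) for d in dilations]


# Hypothetical per-frame scores for an 8-frame clip.
scores = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
context = multi_scale_context(scores)
```

Larger dilation rates let each frame see more distant frames without widening the window, which is the general sense in which dilation yields temporal context at several scales.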

Ingestion method: OAI harvesting

Source: Institute of Automation
