Dilated temporal relational adversarial network for generic video summarization
Document type: Journal article
Author | Zhang, Yujia2,3 |
Journal | MULTIMEDIA TOOLS AND APPLICATIONS |
Publication date | 2019-10-12 |
Pages | 25 |
Keywords | Video summarization; Dilated temporal relation; Generative adversarial network; Three-player loss |
ISSN | 1380-7501 |
DOI | 10.1007/s11042-019-08175-y |
Corresponding author | Zhang, Yujia (zhangyujia2014@ia.ac.cn) |
Abstract | The large number of videos appearing every day makes it increasingly critical that key information within videos can be extracted and understood in a very short time. Video summarization, the task of finding the smallest subset of frames that still conveys the whole story of a given video, is thus of great significance for improving the efficiency of video understanding. We propose a novel Dilated Temporal Relational Generative Adversarial Network (DTR-GAN) to achieve frame-level video summarization. Given a video, it selects the set of key frames that contain the most meaningful and compact information. Specifically, DTR-GAN learns a dilated temporal relational generator and a discriminator with a three-player loss in an adversarial manner. A new dilated temporal relation (DTR) unit is introduced to better capture temporal representations. The generator uses this unit to effectively exploit global multi-scale temporal context to select key frames and to complement the commonly used Bi-LSTM. To ensure that summaries capture enough key video representation from a global perspective, rather than forming a trivial, randomly shortened sequence, we present a discriminator that learns to enforce both the information completeness and the compactness of summaries via a three-player loss. The loss includes the generated summary loss, the random summary loss, and the real summary (ground-truth) loss, which play important roles in better regularizing the learned model to obtain useful summaries. Comprehensive experiments on three public datasets show the effectiveness of the proposed approach. |
Funding projects | Department of Defense[FA8702-15-D-0002] ; Carnegie Mellon University ; National Natural Science Foundation of China[61673378] ; National Natural Science Foundation of China[61333016] ; Norwegian Research Council FRIPRO grant[239844] |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:000492236300003 |
Publisher | SPRINGER |
Funding organizations | Department of Defense ; Carnegie Mellon University ; National Natural Science Foundation of China ; Norwegian Research Council FRIPRO grant |
Source URL | [http://ir.ia.ac.cn/handle/173211/28910] |
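Collection | Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Advanced Robot Control Team |
Corresponding author | Zhang, Yujia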
Author affiliations | 1. Xidian Univ, Xian 710071, Shaanxi, Peoples R China; 2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 4. UiT Arctic Univ Norway, Machine Learning Grp, N-9019 Tromso, Norway; 5. Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA |
Recommended citation (GB/T 7714) | Zhang, Yujia,Kampffmeyer, Michael,Liang, Xiaodan,et al. Dilated temporal relational adversarial network for generic video summarization[J]. MULTIMEDIA TOOLS AND APPLICATIONS,2019:25. |
APA | Zhang, Yujia,Kampffmeyer, Michael,Liang, Xiaodan,Zhang, Dingwen,Tan, Min,&Xing, Eric P..(2019).Dilated temporal relational adversarial network for generic video summarization.MULTIMEDIA TOOLS AND APPLICATIONS,25. |
MLA | Zhang, Yujia,et al."Dilated temporal relational adversarial network for generic video summarization".MULTIMEDIA TOOLS AND APPLICATIONS (2019):25. |
Ingest method: OAI harvesting
Source: Institute of Automation