Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Adversarial Multimodal Network for Movie Story Question Answering

Adversarial Multimodal Network for Movie Story Question Answering

Document type: Journal article


Authors	Yuan, Zhaoquan 1; Sun, Siyuan 2,3; Duan, Lixin 2,3; Li, Changsheng 4; Wu, Xiao 1; Xu, Changsheng 5
Journal	IEEE TRANSACTIONS ON MULTIMEDIA
Publication date	2021
Volume	23 Pages: 1744-1756
Keywords	Knowledge discovery ; Motion pictures ; Visualization ; Task analysis ; Generators ; Gallium nitride ; Natural languages ; Movie question answering ; adversarial network ; multimodal understanding
ISSN	1520-9210
DOI	10.1109/TMM.2020.3002667
Corresponding authors	Duan, Lixin (lxduan@uestc.edu.cn) ; Li, Changsheng (lcs@bit.edu.cn)
Abstract	Visual question answering by using information from multiple modalities has attracted more and more attention in recent years. However, it is a very challenging task, as the visual content and natural language have quite different statistical properties. In this work, we present a method called Adversarial Multimodal Network (AMN) to better understand video stories for question answering. In AMN, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions) based on generative adversarial networks. Moreover, a self-attention mechanism is developed to enforce our newly introduced consistency constraint in order to preserve the self-correlation between the visual cues of the original video clips in the learned multimodal representations. Extensive experiments on the benchmark MovieQA and TVQA datasets show the effectiveness of our proposed AMN over other published state-of-the-art methods.
Funding projects	Major Project for New Generation of AI[2018AAA0100400] ; National Natural Science Foundation of China[61802053] ; National Natural Science Foundation of China[61772436] ; National Natural Science Foundation of China[61772118] ; National Natural Science Foundation of China[61806044] ; Sichuan Science and Technology Program[2020YJ0037] ; Sichuan Science and Technology Program[2020YJ0207] ; Foundation for Department of Transportation of Henan Province[2019J-2-2] ; Fundamental Research Funds for the Central Universities[2682019CX62]
WOS research areas	Computer Science ; Telecommunications
Language	English
WOS accession number	WOS:000655830300021
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding organizations	Major Project for New Generation of AI ; National Natural Science Foundation of China ; Sichuan Science and Technology Program ; Foundation for Department of Transportation of Henan Province ; Fundamental Research Funds for the Central Universities
Source URL	[http://ir.ia.ac.cn/handle/173211/45316]
Collection	Institute of Automation_National Laboratory of Pattern Recognition_Multimedia Computing and Graphics Team
Corresponding authors	Duan, Lixin; Li, Changsheng
Author affiliations	1. Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Peoples R China ; 2. Univ Elect Sci & Technol China, Big Data Res Ctr, Chengdu 610051, Peoples R China ; 3. Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 610051, Peoples R China ; 4. Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China ; 5. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended citation, GB/T 7714	Yuan, Zhaoquan, Sun, Siyuan, Duan, Lixin, et al. Adversarial Multimodal Network for Movie Story Question Answering[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23: 1744-1756.
APA	Yuan, Zhaoquan, Sun, Siyuan, Duan, Lixin, Li, Changsheng, Wu, Xiao, & Xu, Changsheng. (2021). Adversarial Multimodal Network for Movie Story Question Answering. IEEE TRANSACTIONS ON MULTIMEDIA, 23, 1744-1756.
MLA	Yuan, Zhaoquan, et al. "Adversarial Multimodal Network for Movie Story Question Answering". IEEE TRANSACTIONS ON MULTIMEDIA 23 (2021): 1744-1756.

Ingestion method: OAI harvesting

Source: Institute of Automation
