GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation
Document type: Journal article
Author | Lian, Zheng (3) |
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE |
Publication date | 2023-07-01 |
Volume | 45; Issue: 7; Pages: 8419-8432 |
Keywords | Oral communication ; Correlation ; Data models ; Task analysis ; Feature extraction ; Tensors ; Benchmark testing ; Conversational data ; graph complete network (GCNet) ; incomplete multimodal learning ; speaker-sensitive modeling ; temporal-sensitive modeling |
ISSN | 0162-8828 |
DOI | 10.1109/TPAMI.2023.3234553 |
Corresponding authors | Liu, Bin (liubin@nlpr.ia.ac.cn) ; Tao, Jianhua (jhtao@tsinghua.edu.cn) |
Abstract | Conversations have become a critical data format on social media platforms. Understanding conversation from emotion, content and other aspects also attracts increasing attention from researchers due to its widespread application in human-computer interaction. In real-world environments, we often encounter the problem of incomplete modalities, which has become a core issue of conversation understanding. To address this problem, researchers propose various methods. However, existing approaches are mainly designed for individual utterances rather than conversational data, which cannot fully exploit temporal and speaker information in conversations. To this end, we propose a novel framework for incomplete multimodal learning in conversations, called "Graph Complete Network (GCNet)," filling the gap of existing works. Our GCNet contains two well-designed graph neural network-based modules, "Speaker GNN" and "Temporal GNN," to capture temporal and speaker dependencies. To make full use of complete and incomplete data, we jointly optimize classification and reconstruction tasks in an end-to-end manner. To verify the effectiveness of our method, we conduct experiments on three benchmark conversational datasets. Experimental results demonstrate that our GCNet is superior to existing state-of-the-art approaches in incomplete multimodal learning. |
Funding projects | National Key Research and Development Plan of China[2020AAA0140003] ; National Natural Science Foundation of China (NSFC)[62201572] ; National Natural Science Foundation of China (NSFC)[61831022] ; National Natural Science Foundation of China (NSFC)[62276259] ; National Natural Science Foundation of China (NSFC)[U21B2010] ; Open Research Projects of Zhejiang Lab[2021KH0AB06] |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:001004665900031 |
Publisher | IEEE COMPUTER SOC |
Funding organizations | National Key Research and Development Plan of China ; National Natural Science Foundation of China (NSFC) ; Open Research Projects of Zhejiang Lab |
Source URL | [http://ir.ia.ac.cn/handle/173211/53615] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding authors | Liu, Bin; Tao, Jianhua |
Author affiliations | 1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China 2. Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China 3. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China |
Recommended citation (GB/T 7714) | Lian, Zheng,Chen, Lan,Sun, Licai,et al. GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2023,45(7):8419-8432. |
APA | Lian, Zheng,Chen, Lan,Sun, Licai,Liu, Bin,&Tao, Jianhua.(2023).GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,45(7),8419-8432. |
MLA | Lian, Zheng,et al."GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 45.7(2023):8419-8432. |
Ingestion method: OAI harvesting
Source: Institute of Automation