MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation
Document type: Conference paper
Authors | Wang Kaisiyuan6; Song LS (Song Linsen)4,5 |
Publication date | 2020-08-23 |
Conference date | 2020-08-23 |
Conference location | Glasgow |
Abstract (English) | The synthesis of natural emotional reactions is an essential criterion in vivid talking-face video generation. This criterion is nevertheless seldom taken into consideration in previous works due to the absence of a large-scale, high-quality emotional audio-visual dataset. To address this issue, we build the Multi-view Emotional Audio-visual Dataset (MEAD), a talking-face video corpus featuring 60 actors and actresses talking with eight different emotions at three different intensity levels. High-quality audio-visual clips are captured at seven different view angles in a strictly-controlled environment. Together with the dataset, we release an emotional talking-face generation baseline that enables the manipulation of both emotion and its intensity. Our dataset could benefit a number of different research fields including conditional generation, cross-modal understanding and expression recognition. Code, model and data are publicly available on our project page. |
Language | English |
Source URL | [http://ir.ia.ac.cn/handle/173211/52265] |
Department | Institute of Automation_Center for Research on Intelligent Perception and Computing |
Author affiliations | 1. Carnegie Mellon University 2. Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences 3. Nanyang Technological University 4. Institute of Automation, Chinese Academy of Sciences 5. University of Chinese Academy of Sciences 6. Beijing SenseTime Technology Co., Ltd. |
Recommended citation (GB/T 7714) | Wang Kaisiyuan, Song LS, Wu QY, et al. MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation[C]. In: . Glasgow. 2020-08-23. |
Deposit method: OAI harvesting
Source: Institute of Automation