Multi-caption text-to-face synthesis: Dataset and algorithm
Document type: Conference paper
Author | Sun, Jianxin |
Publication date | 2021-10 |
Conference date | 2021-10 |
Conference location | China |
Abstract | Text-to-face synthesis with multiple captions remains an important yet under-addressed problem because effective algorithms and large-scale datasets are lacking. We accordingly propose a Semantic Embedding and Attention (SEA-T2F) network that accepts multiple captions as input to generate face images that are highly semantically related to them. With a novel Sentence Features Injection Module, SEA-T2F can integrate any number of captions into the network. In addition, an attention mechanism named Attention for Multiple Captions is proposed to fuse multiple word features and synthesize fine-grained details. Because text-to-face generation is an ill-posed problem, we also introduce an attribute loss to guide the network to generate sentence-related attributes. Existing text-to-face datasets are either too small or roughly generated from attribute labels, which is insufficient for training deep-learning-based methods to synthesize natural face images. We therefore build a large-scale dataset named CelebAText-HQ, in which each image is manually annotated with 10 captions. Extensive experiments demonstrate the effectiveness of our algorithm. |
Source URL | [http://ir.ia.ac.cn/handle/173211/55261] |
Collection | Institute of Automation_Center for Research on Intelligent Perception and Computing |
Corresponding author | Sun, Zhenan |
Author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Sun, Jianxin, Li, Qi, Wang, Weining, et al. Multi-caption text-to-face synthesis: Dataset and algorithm[C]. In: . China. 2021-10. |
Ingest method: OAI harvesting
Source: Institute of Automation