AnyFace: Free-style Text-to-Face Synthesis and Manipulation
Document type: Conference paper
Authors | Sun, Jianxin1,2 |
Publication date | 2022 |
Conference dates | 2022-6-19 to 2022-6-24 |
Conference location | New Orleans, USA / online |
Keywords | face generation; text-to-image generation |
Pages | 18687-18696 |
Abstract | Existing text-to-image synthesis methods are generally applicable only to words in the training dataset. However, human faces are too variable to be described with a limited vocabulary. This paper therefore proposes the first free-style text-to-face method, namely AnyFace, enabling much wider open-world applications such as the metaverse, social media, cosmetics, and forensics. AnyFace has a novel two-stream framework for face image synthesis and manipulation given arbitrary descriptions of the human face. Specifically, one stream performs text-to-face generation and the other conducts face image reconstruction. Facial text and image features are extracted using the CLIP (Contrastive Language-Image Pre-training) encoders. A collaborative Cross Modal Distillation (CMD) module is designed to align the linguistic and visual features across these two streams. Furthermore, a Diverse Triplet Loss (DT loss) is developed to model fine-grained features and improve facial diversity. Extensive experiments on Multi-modal CelebA-HQ and CelebAText-HQ demonstrate significant advantages of AnyFace over state-of-the-art methods. AnyFace can achieve high-quality, high-resolution, and high-diversity face synthesis and manipulation results without any constraints on the number and content of input captions. |
Source URL | [http://ir.ia.ac.cn/handle/173211/48943] |
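Collection | Institute of Automation_Center for Research on Intelligent Perception and Computing |
Corresponding author | Li, Qi |
Author affiliations | 1. Center for Research on Intelligent Perception and Computing, NLPR, CASIA 2. School of Artificial Intelligence, University of Chinese Academy of Sciences (UCAS) |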
Recommended citation (GB/T 7714) | Sun, Jianxin, Deng, Qiyao, Li, Qi, et al. AnyFace: Free-style Text-to-Face Synthesis and Manipulation[C]. In: . New Orleans, USA / online. 2022-6-19 to 2022-6-24. |
Ingest method: OAI harvesting
Source: Institute of Automation