Chinese Academy of Sciences Institutional Repositories Grid
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

Document type: Journal article

Authors: Zhang, Yuxin (5,6); Dong, Weiming (5,6); Tang, Fan (4); Huang, Nisha (5,6); Huang, Haibin (3); Ma, Chongyang (3); Lee, Tong-Yee (2); Deussen, Oliver (1); Xu, Changsheng (5,6)
Journal: ACM TRANSACTIONS ON GRAPHICS
Publication date: 2023-12-01
Volume: 42; Issue: 6; Pages: 14
Keywords: Image generation; Diffusion models; Attribute-aware editing; Model personalization
ISSN: 0730-0301
DOI: 10.1145/3618342
Corresponding author: Dong, Weiming (weiming.dong@ia.ac.cn)
Abstract: Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information, providing a new perspective on representing, generating, and editing images. We develop the Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Our source code is available at https://github.com/zyxElsa/ProSpect.
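The abstract's idea of one prompt per generation stage, where a stage is a group of consecutive denoising steps, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the even split of steps into stages, and the use of plain float lists in place of token embeddings are all assumptions made for illustration only.

```python
from typing import List, Sequence


def stage_of_step(step: int, num_steps: int, num_stages: int) -> int:
    """Map a denoising step index (0 = first step) to a stage index.

    Assumption: steps are split into `num_stages` contiguous groups of
    (nearly) equal size; the record does not say how stages are sized.
    """
    if not 1 <= num_stages <= num_steps:
        raise ValueError("need 1 <= num_stages <= num_steps")
    if not 0 <= step < num_steps:
        raise ValueError("step out of range")
    return step * num_stages // num_steps


def conditioning_schedule(
    stage_embeddings: Sequence[List[float]], num_steps: int
) -> List[List[float]]:
    """Return the (hypothetical) conditioning embedding used at each step."""
    num_stages = len(stage_embeddings)
    return [
        stage_embeddings[stage_of_step(t, num_steps, num_stages)]
        for t in range(num_steps)
    ]


def swap_stages(
    base: Sequence[List[float]],
    donor: Sequence[List[float]],
    stages: Sequence[int],
) -> List[List[float]]:
    """Build an edited representation by taking selected stages from `donor`."""
    if len(base) != len(donor):
        raise ValueError("both representations need the same number of stages")
    chosen = set(stages)
    return [donor[i] if i in chosen else base[i] for i in range(len(base))]


if __name__ == "__main__":
    # Toy stand-ins for per-stage token embeddings of two images.
    image_a = [[0.0], [0.1], [0.2]]
    image_b = [[1.0], [1.1], [1.2]]
    mixed = swap_stages(image_a, image_b, stages=[2])
    print(conditioning_schedule(mixed, num_steps=6))
```

In this sketch, replacing only some stages of one image's representation with those of another changes the conditioning for the corresponding steps and leaves the rest untouched, which is the kind of per-stage control the abstract describes. Which stages govern which attribute is not stated in this record and is left open here.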
Funding: National Key R&D Program of China [2020AAA0106200]; National Natural Science Foundation of China [61832016]; National Natural Science Foundation of China [62102162]; National Natural Science Foundation of China [U20B2070]; Beijing Natural Science Foundation [L221013]; National Science and Technology Council [111-2221-E-006-112-MY3]; Deutsche Forschungsgemeinschaft (DFG) [413891298]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:001139790400072
Publisher: ASSOC COMPUTING MACHINERY
Funding organizations: National Key R&D Program of China; National Natural Science Foundation of China; Beijing Natural Science Foundation; National Science and Technology Council; Deutsche Forschungsgemeinschaft (DFG)
Source URL: http://ir.ia.ac.cn/handle/173211/55394
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Author affiliations:
1. Univ Konstanz, Constance, Germany
2. Natl Cheng Kung Univ, Tainan, Taiwan
3. Kuaishou Technol, Beijing, Peoples R China
4. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
5. UCAS, Sch Artificial Intelligence, Beijing, Peoples R China
6. Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China
Recommended citation
GB/T 7714
Zhang, Yuxin,Dong, Weiming,Tang, Fan,et al. ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models[J]. ACM TRANSACTIONS ON GRAPHICS,2023,42(6):14.
APA Zhang, Yuxin.,Dong, Weiming.,Tang, Fan.,Huang, Nisha.,Huang, Haibin.,...&Xu, Changsheng.(2023).ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models.ACM TRANSACTIONS ON GRAPHICS,42(6),14.
MLA Zhang, Yuxin,et al."ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models".ACM TRANSACTIONS ON GRAPHICS 42.6(2023):14.

Ingestion method: OAI harvesting

Source: Institute of Automation

