Chinese Academy of Sciences Institutional Repositories Grid
DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks

Document type: Journal article

Authors: Li, Yinqi [1,2]; Chang, Hong [1,2]; Hou, Ruibing [1]; Shan, Shiguang [1,2]; Chen, Xilin [1,2]
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Publication date: 2026
Volume: 28; Pages: 297-308
Keywords: Diffusion models; Object detection; Layout; Noise; Training; Noise reduction; Vocabulary; Dogs; Bicycles; Image synthesis; Diffusion model; generative modeling; discriminative task; object detection; visual recognition
ISSN: 1520-9210
DOI: 10.1109/TMM.2025.3623508
Abstract: Diffusion models have shown remarkable progress in various generative tasks such as image and video generation. This paper studies the problem of leveraging pretrained diffusion models for performing discriminative tasks. Specifically, we extend the discriminative capability of pretrained frozen generative diffusion models from the classification task (Li et al., 2023; Clark et al., 2023) to the more complex object detection task, by "inverting" a pretrained layout-to-image diffusion model. To this end, we propose a gradient-based discrete optimization approach that replaces the heavy prediction enumeration process, and a prior distribution model that makes more accurate use of Bayes' rule. Empirical results show that this method is on par with basic discriminative object detection baselines on the COCO dataset. In addition, our method can greatly speed up the previous diffusion-based methods (Li et al., 2023; Clark et al., 2023) for classification without sacrificing accuracy. Code and models are available at https://github.com/LiYinqi/DIVE.
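Illustrative note: the Bayes' rule step mentioned in the abstract can be sketched roughly as follows. This is not taken from the paper or its repository. It assumes that a lower conditional denoising error stands in for a higher log-likelihood log p(x | c), up to a constant shared by all classes; the function name `posterior_from_errors` and all numbers are hypothetical.

```python
import math

def posterior_from_errors(errors, log_prior):
    """Return p(c | x) for each class c via Bayes' rule.

    Assumption (illustrative only): -errors[c] approximates log p(x | c)
    up to a constant that is the same for every class.
    """
    # Unnormalized log posterior: log p(c) + log p(x | c)
    scores = [lp - e for e, lp in zip(errors, log_prior)]
    # Subtract the maximum before exponentiating for numerical stability
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical per-class denoising errors for one image
errors = [0.92, 0.75, 1.10]

# A uniform prior versus a skewed prior over the three classes
uniform = [math.log(1 / 3)] * 3
skewed = [math.log(p) for p in (0.7, 0.2, 0.1)]

post_uniform = posterior_from_errors(errors, uniform)
post_skewed = posterior_from_errors(errors, skewed)

best_uniform = max(range(3), key=lambda c: post_uniform[c])
best_skewed = max(range(3), key=lambda c: post_skewed[c])
```

With the uniform prior the class with the lowest error wins; with the skewed prior the prediction shifts to the class the prior favors, which is why the choice of prior affects how accurately Bayes' rule is applied.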
Funding: National Natural Science Foundation of China (NSFC) [62376259]; National Natural Science Foundation of China (NSFC) [62306301]; National Postdoctoral Program for Innovative Talents [BX20220310]
WOS research areas: Computer Science; Telecommunications
Language: English
WOS accession number: WOS:001658631000033
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: http://119.78.100.204/handle/2XEOYT63/42841
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding author: Chang, Hong
Author affiliations:
1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
Recommended citation
GB/T 7714: Li, Yinqi, Chang, Hong, Hou, Ruibing, et al. DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2026, 28: 297-308.
APA: Li, Yinqi, Chang, Hong, Hou, Ruibing, Shan, Shiguang, & Chen, Xilin. (2026). DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks. IEEE TRANSACTIONS ON MULTIMEDIA, 28, 297-308.
MLA: Li, Yinqi, et al. "DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks". IEEE TRANSACTIONS ON MULTIMEDIA 28 (2026): 297-308.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

