Hierarchical image-to-image translation with nested distributions modeling
Document type: Journal article
Authors | Qiao, Shishi1,2,3; Wang, Ruiping1,2; Shan, Shiguang1,2; Chen, Xilin1,2 |
Journal | PATTERN RECOGNITION
Publication date | 2024-02-01 |
Volume | 146; Pages: 12 |
Keywords | Image-to-image translation; Distribution modeling; Information entropy; Generative adversarial network |
ISSN | 0031-3203 |
DOI | 10.1016/j.patcog.2023.110058 |
Abstract | Unpaired image-to-image translation among category domains has achieved remarkable success in past decades. Recent studies mainly focus on two challenges. For one thing, such translation is inherently multi-modal (i.e. many-to-many mapping) due to variations of domain-specific information (e.g., the domain of house cat contains multiple sub-modes), which is usually addressed by predefined distribution sampling. For another, most existing multi-modal approaches have limits in handling more than two domains with one model, i.e. they have to independently build two distributions to capture variations for every pair of domains. To address these problems, we propose a Hierarchical Image-to-image Translation (HIT) method which jointly formulates the multi-domain and multi-modal problem in a semantic hierarchy structure by modeling a common and nested distribution space. Specifically, domains have inclusion relationships under a particular hierarchy structure. With the assumption of Gaussian prior for domains, distributions of domains at lower levels capture the local variations of their ancestors at higher levels, leading to the so-called nested distributions. To this end, we propose a nested distribution loss in light of the distribution divergence measurement and information entropy theory to characterize the aforementioned inclusion relations among domain distributions. Experiments on ImageNet, ShapeNet, and CelebA datasets validate the promising results of our HIT against state-of-the-arts, and as additional benefits of nested modeling, one can even control the uncertainty of multi-modal translations at different hierarchy levels. |
Funding | National Key R&D Program of China[2021ZD0111901] ; Natural Science Foundation of China[U21B2025] ; Natural Science Foundation of China[U19B2036] ; Natural Science Foundation of China[62206260] |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:001102929200001 |
Publisher | ELSEVIER SCI LTD |
Source URL | [http://119.78.100.204/handle/2XEOYT63/38089] |
Collection | Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English) |
Corresponding author | Wang, Ruiping |
Author affiliations | 1. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 3. Ocean Univ China, Coll Informat Sci & Engn, Qingdao 266100, Peoples R China |
Recommended citation (GB/T 7714) | Qiao, Shishi,Wang, Ruiping,Shan, Shiguang,et al. Hierarchical image-to-image translation with nested distributions modeling[J]. PATTERN RECOGNITION,2024,146:12. |
APA | Qiao, Shishi,Wang, Ruiping,Shan, Shiguang,&Chen, Xilin.(2024).Hierarchical image-to-image translation with nested distributions modeling.PATTERN RECOGNITION,146,12. |
MLA | Qiao, Shishi,et al."Hierarchical image-to-image translation with nested distributions modeling".PATTERN RECOGNITION 146(2024):12. |
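The abstract describes child-domain Gaussians that sit inside their parent-domain Gaussians, characterized through a divergence measure and information entropy. The record does not give the actual form of the nested distribution loss, so the following is only a minimal sketch of the general idea under stated assumptions: diagonal Gaussians, KL divergence as the divergence measure, and a hypothetical `nesting_penalty` that adds a hinge on the entropy gap. None of these names or choices are taken from the paper.

```python
import math


def gaussian_entropy(var):
    """Differential entropy of a diagonal Gaussian with per-dimension variances `var`."""
    return 0.5 * sum(math.log(2 * math.pi * math.e * v) for v in var)


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for diagonal Gaussians p = N(mu_p, var_p) and q = N(mu_q, var_q)."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


def nesting_penalty(child, parent, margin=0.0):
    """Hypothetical penalty (an assumption, not the paper's loss).

    It is small when the child Gaussian lies close to the parent (low KL)
    and is no more spread out than the parent (child entropy <= parent entropy).
    """
    (mu_c, var_c), (mu_p, var_p) = child, parent
    divergence = kl_diag_gaussians(mu_c, var_c, mu_p, var_p)
    entropy_gap = gaussian_entropy(var_c) - gaussian_entropy(var_p) + margin
    return divergence + max(0.0, entropy_gap)


# Illustrative values only: a broad parent domain and a narrow child domain.
parent = ([0.0, 0.0], [4.0, 4.0])
child = ([0.5, -0.5], [0.25, 0.25])

print(nesting_penalty(child, parent))  # child inside parent: smaller penalty
print(nesting_penalty(parent, child))  # roles swapped: larger penalty
```

In this sketch the entropy term only activates when the supposed child is broader than its parent, which is one simple way to encode the inclusion relation the abstract mentions.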
Ingest method: OAI harvesting
Source: Institute of Computing Technology