Chinese Academy of Sciences Institutional Repositories Grid: Cross-modal semantic correlation learning by Bi-CNN network

Chinese Academy of Sciences Institutional Repositories Grid

Cross-modal semantic correlation learning by Bi-CNN network

Document type: Journal article


Authors	Wang, Chaoyi 2; Li, Liang 1; Yan, Chenggang 2; Wang, Zhan 3; Sun, Yaoqi 2; Zhang, Jiyong 2
Journal	IET IMAGE PROCESSING
Publication date	2021-03-18
Pages	11
ISSN	1751-9659
DOI	10.1049/ipr2.12176
Abstract	Cross-modal retrieval can retrieve images through a text query and vice versa, and it has attracted extensive attention in recent years. Most currently available cross-modal retrieval methods aim to find a common subspace and maximize the correlation between modalities. To generate specific representations consistent with cross-modal tasks, this paper proposes a novel cross-modal retrieval framework that integrates feature learning and latent space embedding. In detail, we propose a deep CNN and a shallow CNN to extract features from the samples. The deep CNN extracts the representation of images, and the shallow CNN uses a multi-dimensional kernel to extract multi-level semantic representations of text. Meanwhile, we enhance the semantic manifold by constructing a cross-modal ranking loss and a within-modal discriminant loss to improve the separation of semantic representations. Moreover, the most representative samples are selected using an online sampling strategy, so that the approach can be applied to large-scale data. This approach not only increases the discriminative ability among different categories but also maximizes the correlation between different modalities. Experiments on three real-world datasets show that the proposed method is superior to popular methods.
WOS research areas	Computer Science ; Engineering ; Imaging Science & Photographic Technology
Language	English
WOS accession number	WOS:000630032600001
Publisher	WILEY
Source URL	[http://119.78.100.204/handle/2XEOYT63/16807]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding author	Zhang, Jiyong
Author affiliations	1. Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China; 2. Hangzhou Dianzi Univ, Hangzhou, Peoples R China; 3. RTInvent Technol Co Ltd, Beijing, Peoples R China
Recommended citation (GB/T 7714)	Wang, Chaoyi,Li, Liang,Yan, Chenggang,et al. Cross-modal semantic correlation learning by Bi-CNN network[J]. IET IMAGE PROCESSING,2021:11.
APA	Wang, Chaoyi,Li, Liang,Yan, Chenggang,Wang, Zhan,Sun, Yaoqi,&Zhang, Jiyong.(2021).Cross-modal semantic correlation learning by Bi-CNN network.IET IMAGE PROCESSING,11.
MLA	Wang, Chaoyi,et al."Cross-modal semantic correlation learning by Bi-CNN network".IET IMAGE PROCESSING (2021):11.
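
The abstract mentions a cross-modal ranking loss combined with online selection of representative samples, but this record does not give the formulation. As a rough illustration only, the sketch below shows one common generic form of such a loss: a bidirectional hinge ranking loss over matched image/text embeddings that penalises only the hardest in-batch negative. The function names, the cosine similarity, the margin value, and the hardest-negative rule are all assumptions for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bidirectional_ranking_loss(image_embs, text_embs, margin=0.2):
    """Hypothetical hinge ranking loss over matched (image, text) pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same sample.
    For each pair, only the hardest in-batch negative is penalised in each
    retrieval direction (an assumed stand-in for "online sampling").
    """
    n = len(image_embs)
    # sim[i][j] = similarity between image i and text j
    sim = [[cosine(img, txt) for txt in text_embs] for img in image_embs]
    neg_inf = float("-inf")
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Image-to-text direction: hardest non-matching text for image i.
        hardest_txt = max((sim[i][j] for j in range(n) if j != i), default=neg_inf)
        # Text-to-image direction: hardest non-matching image for text i.
        hardest_img = max((sim[j][i] for j in range(n) if j != i), default=neg_inf)
        total += max(0.0, margin - pos + hardest_txt)
        total += max(0.0, margin - pos + hardest_img)
    return total / n


# Toy 2-D embeddings: aligned pairs give zero loss, swapped pairs do not.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
aligned_loss = bidirectional_ranking_loss(images, aligned_texts)
swapped_loss = bidirectional_ranking_loss(images, swapped_texts)
```

In this toy case the loss is zero when every matched pair is more similar than any mismatched pair by at least the margin, and it grows when the pairing is wrong. A within-modal discriminant term, as named in the abstract, would be added separately and is not sketched here.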

Ingest method: OAI harvesting

Source: Institute of Computing Technology

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.