Chinese Academy of Sciences Institutional Repositories Grid System: Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning

Chinese Academy of Sciences Institutional Repositories Grid

Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning

Document type: Conference paper


Authors	Jian Wang; Yonghao He; Cuicui Kang; Shiming Xiang; Chunhong Pan
Publication date	2015
Conference date	2015-6
Conference location	Shanghai, China
Abstract	Cross-modal retrieval extends the ability of search engines to deal with the massive cross-modal data. The goal of image-text cross-modal retrieval is to search images (texts) by using text (image) queries by computing the similarities of images and texts directly. Many existing methods rely on low-level visual features and textual features for cross-modal retrieval, ignoring the characteristics existing in the raw data of different modalities. In this paper, a novel model based on modality-specific feature learning is proposed. Considering the characteristics of different modalities, the model uses two types of convolutional neural networks to map the raw data to the latent space representations for images and texts, respectively. Particularly, the convolution based network used for texts involves word embedding learning, which has been proved effective to extract meaningful textual features for text classification. In the latent space, the mapped features of images and texts form relevant and irrelevant image-text pairs, which are used by the one-vs-more learning scheme. This learning scheme can achieve ranking functionality by allowing for one relevant and more irrelevant pairs. The standard back-propagation technique is employed to update the parameters of two convolutional networks. Extensive cross-modal retrieval experiments are carried out on three challenging datasets that consist of image-document pairs or image-query click-through data from a search engine, and the results firmly demonstrate that the proposed model is much more effective.
Source URL	[http://ir.ia.ac.cn/handle/173211/20369]
Collection	Institute of Automation_National Laboratory of Pattern Recognition_Remote Sensing Image Processing Team
Author affiliation	National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Jian Wang, Yonghao He, Cuicui Kang, et al. Image-Text Cross-Modal Retrieval via Modality-Specific Feature Learning[C]. In:. Shanghai, China. 2015-6.
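
Illustrative note: the abstract above describes a one-vs-more learning scheme that ranks one relevant image-text pair against several irrelevant ones in a shared latent space. This record does not give the paper's actual loss function, so the following is only a minimal sketch under stated assumptions: cosine similarity as the latent-space similarity, a softmax cross-entropy over one relevant and several irrelevant pairs as the ranking objective, and a hypothetical `scale` parameter. The function names are invented for illustration, and the feature vectors stand in for the outputs of the two convolutional networks.

```python
import math


def cosine(u, v):
    """Cosine similarity between two latent-space feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def one_vs_more_loss(query, relevant, irrelevant, scale=10.0):
    """Assumed ranking loss: negative log-softmax of the relevant pair.

    query      -- latent feature of the query (e.g. a text)
    relevant   -- latent feature of the one relevant item (e.g. an image)
    irrelevant -- list of latent features of irrelevant items
    scale      -- hypothetical sharpening factor, not taken from the paper
    """
    sims = [cosine(query, relevant)] + [cosine(query, x) for x in irrelevant]
    logits = [scale * s for s in sims]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]


def rank(query, candidates):
    """Return candidate indices sorted from most to least similar."""
    return sorted(
        range(len(candidates)),
        key=lambda i: cosine(query, candidates[i]),
        reverse=True,
    )


if __name__ == "__main__":
    text_feature = [1.0, 0.0]
    image_features = [[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
    print(rank(text_feature, image_features))
    print(one_vs_more_loss(text_feature, image_features[1],
                           [image_features[0], image_features[2]]))
```

Under these assumptions the loss shrinks as the relevant pair becomes more similar than every irrelevant pair, which is the ranking behavior the abstract attributes to the scheme. In a trained model, the gradient of such a loss would be back-propagated into both networks.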

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.