|
Authors | Huang, Yan; Long, Yang; Wang, Liang
|
Publication date | 2019-01
|
Conference dates | 2019.1.27-2019.2.1
|
Conference location | Honolulu
|
Keywords | Image And Sentence Matching
|
Volume | 33
|
Issue | 0
|
DOI | 0
|
Pages | 8489-8496
|
Abstract | Although image and sentence matching has been widely studied,
its intrinsic few-shot problem is commonly ignored,
which has become a bottleneck for further performance improvement.
In this work, we focus on the challenging problem
of few-shot image and sentence matching, and propose a Gated
Visual-Semantic Embedding (GVSE) model to deal with
it. The model consists of three cooperative modules:
uncommon VSE, common VSE, and gated metric fusion.
The uncommon VSE exploits external auxiliary sources to
extract generic features for describing uncommon instances
and words in images and sentences, and then integrates them
by modeling their semantic relation to obtain global representations
for association analysis. To better model the most
common instances and words in the rest of the content of images and
sentences, the common VSE learns their discriminative representations
directly from scratch. After obtaining two similarity
metrics from the two VSE modules, the gated metric
fusion module adaptively fuses them by automatically balancing
their relative importance. Based on the fused metric,
we perform extensive experiments on both few-shot
and conventional image and sentence matching, and demonstrate
the effectiveness of the proposed model by achieving
state-of-the-art results on two public benchmark datasets. |
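Source publication author | Zhihua Zhou
|
Proceedings publisher | IEEE
|
Proceedings place of publication | USA
|
URL identifier | View original
|
Source URL | [http://ir.ia.ac.cn/handle/173211/25798] |
Collection | Institute of Automation_Center for Research on Intelligent Perception and Computing
|
Author affiliation | Institute of Automation, Chinese Academy of Sciences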
|
Recommended citation (GB/T 7714) |
Huang, Yan,Long, Yang,Wang, Liang. Few-Shot Image and Sentence Matching via Gated Visual-Semantic Embedding[C]. In:. Honolulu. 2019.1.27-2019.2.1.
|
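
The abstract says the gated metric fusion module fuses the two similarity metrics by balancing their relative importance, but this record does not give the formula. The sketch below shows one common way such a gate could look, assuming a scalar sigmoid gate computed from the two scores. The function name `gated_fusion` and the parameters `w_uncommon`, `w_common`, and `bias` are hypothetical and are not taken from the paper; in a trained model they would be learned rather than set by hand.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(s_uncommon: float, s_common: float,
                 w_uncommon: float = 0.0, w_common: float = 0.0,
                 bias: float = 0.0) -> float:
    """Fuse two similarity scores with a scalar gate.

    Assumed form (not the paper's stated equation):
        gate  = sigmoid(w_uncommon * s_uncommon + w_common * s_common + bias)
        fused = gate * s_uncommon + (1 - gate) * s_common
    """
    gate = sigmoid(w_uncommon * s_uncommon + w_common * s_common + bias)
    # The result is a convex combination, so it always lies between the two inputs.
    return gate * s_uncommon + (1.0 - gate) * s_common


if __name__ == "__main__":
    # With all gate parameters at zero the gate is 0.5, i.e. a plain average.
    print(gated_fusion(0.8, 0.2))
    # A large positive bias pushes the gate toward the uncommon-VSE score.
    print(gated_fusion(0.8, 0.2, bias=5.0))
```

Because the gate is a convex weight, the fused score cannot fall outside the range spanned by the two input similarities, which keeps it comparable across image-sentence pairs under this assumed form.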