Image2song: Song Retrieval via Bridging Image Content and Lyric Words
Document type: Conference paper
Authors | Li, Xuelong1; Hu, Di2; Lu, Xiaoqiang1 |
Publication date | 2017-12-22 |
Conference date | 2017-10-22 |
Conference location | Venice, Italy |
Volume | 2017-October |
DOI | 10.1109/ICCV.2017.602 |
Pages | 5650-5659 |
Abstract | Image is usually taken for expressing some kinds of emotions or purposes, such as love, celebrating Christmas. There is another better way that combines the image and relevant song to amplify the expression, which has drawn much attention in the social network recently. Hence, the automatic selection of songs should be expected. In this paper, we propose to retrieve semantic relevant songs just by an image query, which is named as the image2song problem. Motivated by the requirements of establishing correlation in semantic/content, we build a semantic-based song retrieval framework, which learns the correlation between image content and lyric words. This model uses a convolutional neural network to generate rich tags from image regions, a recurrent neural network to model lyric, and then establishes correlation via a multi-layer perceptron. To reduce the content gap between image and lyric, we propose to make the lyric modeling focus on the main image content via a tag attention. We collect a dataset from the social-sharing multimodal data to study the proposed problem, which consists of (image, music clip, lyric) triplets. We demonstrate that our proposed model shows noticeable results in the image2song retrieval task and provides suitable songs. Besides, the song2image task is also performed. © 2017 IEEE. |
Ownership ranking | 1 |
Proceedings | Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017 |
Proceedings publisher | Institute of Electrical and Electronics Engineers Inc. |
Language | English |
ISSN | 15505499 |
ISBN | 9781538610329 |
Source URL | [http://ir.opt.ac.cn/handle/181661/29942] |
Collection | Xi'an Institute of Optics and Precision Mechanics_Center for OPTical IMagery Analysis and Learning |
Author affiliations | 1. Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, 710119, China; 2. School of Computer Science, Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, China |
Recommended citation (GB/T 7714) | Li, Xuelong,Hu, Di,Lu, Xiaoqiang. Image2song: Song Retrieval via Bridging Image Content and Lyric Words[C]. In:. Venice, Italy. 2017-10-22. |
Ingest method: OAI harvesting
Source: Xi'an Institute of Optics and Precision Mechanics