Chinese Academy of Sciences Institutional Repositories Grid System: A novel duplicate images detection method based on PLSA model

Chinese Academy of Sciences Institutional Repositories Grid

A novel duplicate images detection method based on PLSA model

Document type: Conference paper


Authors	Liao Xiaofeng ; Wang Yongji ; Ding Liping ; Gu Jian
Publication date	2012
Conference name	4th International Conference on Machine Vision: Machine Vision, Image Processing, and Pattern Analysis, ICMV 2011
Conference date	December 9, 2011 - December 10, 2011
Conference location	Singapore, Singapore
Keywords	Affine transforms ; Clustering algorithms ; Image retrieval ; Semantics
Pages	-
Abstract	Web image search results usually contain duplicate copies. This paper considers the problem of detecting and clustering duplicate images in web image search results. Grouping the duplicate images together makes the results easier for users to view. A novel method is presented to detect and cluster duplicate images by measuring the similarity between their topics. More specifically, images are viewed as documents consisting of visual words formed by vector quantizing affine-invariant visual features. Then a statistical model widely used in the text domain, the PLSA (Probabilistic Latent Semantic Analysis) model, is used to map images into a probabilistic latent semantic space. Because the main content remains unchanged despite small digital alterations, duplicate images will be close to each other in the derived semantic space. Based on this, a simple clustering process can successfully detect duplicate images and cluster them together. Compared with methods that compare hash values of visual words, this method is more robust to alterations of the images at the visual feature level. Experiments demonstrate the effectiveness of this method. © 2012 Copyright Society of Photo-Optical Instrumentation Engineers (SPIE).
Indexed by	EI
Conference organizer	Int. Assoc. Comput. Sci. Inf. Technol. (IACSIT)
Proceedings	Proceedings of SPIE - The International Society for Optical Engineering
Language	English
ISSN	0277-786X
ISBN	9780819490254
Source URL	[http://ir.iscas.ac.cn/handle/311060/15725]
Collection	Institute of Software_Institute of Software Library_Conference Papers
Recommended citation (GB/T 7714)	Liao Xiaofeng, Wang Yongji, Ding Liping, et al. A novel duplicate images detection method based on PLSA model[C]. In: 4th International Conference on Machine Vision: Machine Vision, Image Processing, and Pattern Analysis, ICMV 2011. Singapore, Singapore. December 9, 2011 - December 10, 2011.

Ingest method: OAI harvesting

Source: Institute of Software
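
The pipeline outlined in the abstract (visual-word histograms, PLSA topic mapping, then a simple clustering step) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations or parameters, so the standard EM updates for PLSA are used, and the function names `plsa` and `cluster_duplicates`, the number of topics, the total-variation distance, the 0.1 threshold, and the toy count matrix are all assumptions made for illustration. Feature extraction and vector quantization are omitted; each row of the count matrix stands in for one image's visual-word histogram.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=200, seed=0):
    """Fit PLSA with EM on an (images x visual words) count matrix.

    Returns P(z|d) with shape (n_images, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    counts = np.asarray(counts, dtype=float)
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random, row-normalized initial distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from n(d,w) * P(z|d,w).
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z


def cluster_duplicates(topic_dist, threshold=0.1):
    """Group images whose topic distributions are close (assumed rule).

    Two images are linked when the total-variation distance between their
    P(z|d) vectors is at most `threshold`; linked images share a label.
    """
    n = len(topic_dist)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            tv = 0.5 * np.abs(topic_dist[i] - topic_dist[j]).sum()
            if tv <= threshold:
                parent[find(j)] = find(i)
    return [find(i) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical histograms: rows 0-1 and rows 2-3 are near-duplicates.
    toy_counts = np.array([
        [10, 8, 9, 0, 0, 0],
        [9, 8, 10, 0, 0, 0],
        [0, 0, 0, 10, 9, 8],
        [0, 0, 0, 9, 10, 8],
    ])
    topics, _ = plsa(toy_counts, n_topics=2)
    print(cluster_duplicates(topics))
```

Comparing images in the low-dimensional topic space rather than on raw visual-word hashes is what the abstract credits for robustness to small feature-level changes: slightly different histograms can still map to nearly the same topic mixture.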
