基于哈希编码的大规模图像检索方法研究
文献类型:学位论文
作者 | 袁勇 |
学位类别 | 硕士 |
答辩日期 | 2011-05-24 |
授予单位 | 中国科学院研究生院 |
授予地点 | 北京 |
导师 | 李学龙 |
关键词 | 基于内容的图像检索 近似最近邻 稀疏重构哈希 潜在语义最小哈希 |
学位专业 | 信号与信息处理 |
中文摘要 |
随着数字多媒体技术、互联网技术以及计算机硬件技术的快速发展,人们每天接触到的图片数量得到了空前的增长,以图像为基础的各类应用正遍及人们的日常,并为人们的生活和生产提供了极大的便利。图片作为信息表示的载体,能够传递更加丰富的信息,因而它是一种更加直观自然的表达方式。人类在利用图像数据并对图像内容进行结构化组织管理的时候,第一步要解决的是海量图像数据的检索问题。而对于海量图像数据,因图像数量大、图像表达特征维度高以及要求在检索时要求实时响应等特点,使得基于内容的图像检索技术对海量数据的检索面临极大的挑战。传统的线性扫描方法虽然能够获得很好的检索精度,但这种传统的线性扫描方式极其耗时而且内存使用效率非常低下,因此基于树结构的近似最近邻方法被提出来。基于树结构的搜索方法虽然能够较好的处理低维数据,但对于高维数据基于树结构的搜索方法其搜索效率会急剧下降,使得基于树结构的图像检索方法难以适应海量图像数据的索引,为此基于图像哈希的检索方法被提出来。基于图像哈希的检索方法将一幅图像对应的特征用一个二进制编码序列来表示,并采用编码序列间的汉明距离来度量图像间的相似性。由于将原特征用更紧凑的哈希码来表示,并且在计算汉明距离的时候可以充分发挥计算机长于位异或的运算,使得基于图像哈希的检索方法不仅大大提高了内存的使用效率,而且极大的缩短了检索响应的时间,因而它能够更好的适应海量图像数据的检索。本文围绕基于哈希编码的大规模图像检索展开了深入研究,论文的主要工作和贡献总结如下:
(1) 研究了大规模图像检索中典型的近似最近邻搜索方法,概括了它们的优缺点,总结归纳了目前的图像哈希方法,并对主流的图像哈希方法的基本思想进行了阐述。
(2) 提出了基于稀疏重构的图像哈希方法,对重构的系数用l21 范数进行稀疏约束以达到特征选择的目的,增强了特征在编码时的判别性,并且在所提出的方法中,将稀疏重构与哈希编码过程紧密关联起来,最后在调和协方差矩阵和最小化数据的重构误差间建立了一种合理的均衡机制。此外,作为本方法的拓展,在原有的基础上构建了图拉普拉斯进行局部邻域关系约束,使得特征在编码过程中其编码后的数据局部结构关系能够保持原特征空间数据局部结构关系,从而保证了原空间特征之间的局部关系和编码空间哈希码之间局部关系的一致性。本方法在多个公开的图像数据集上进行了测试,与一些主流的无监督图像哈希方法相比,本文方法在查准率、召回率以及平均检索精度指标下均有了明显的提升,本文方法的提出在一定程度上弥补了现有方法在构建图像哈希算法中对数据稀疏结构研究的不足。
(3) 提出了一种基于潜在语义最小哈希图像检索方法,该方法主要考虑了哈希编码过程是特征从连续域到离散域进行离散化的一个过程,在量化过程中其量化损失应尽可能的小,并且从特征层面看,现有的很多欧氏距离相似性保持图像哈希方法直接或间接以降维的方式将特征从原特征空间量化到汉明空间,缺乏对原始特征的再次提炼,有可能造成在量化时特征表达得就不是很好。出于这两点考虑,在确保特征量化损失最小的前提下,本文将特征以矩阵分解的方式将其映射到潜在语义空间,然后将它们结合起来构建了一种潜在语义最小哈希模型,在多个公开的图像检索数据库上与目前一些主流的图像哈希算法进行比较,本文所提出的潜在语义最小哈希图像检索方法在图像检索精度方面均取得了最好的结果。 |
英文摘要 |
With the rapid development of the multimedia technology, internet technology and computer hardware technology, the number of image people can get access to increases with exploration. The applications based on image provide great convenience for life and production. As an information carrier, image can deliver more information, so it is a more natural way to use it for communication. To mine the content of images and make them into well management, the first step is to retrieve the vast amounts of images. It’s a challenge work to retrieve on such vast amounts of images, since the number of images is large, and the feature dimension is high, and the search time need to be reduced. The brute-force search can get good accuracy, but it’s extremely time-consuming and low memory usage efficiency, so tree-based method has been proposed. The tree-based method shows promising performance, however, the performance of tree indexing drastically degrades to exhaustive linear search for high-dimensional data. Besides,
it also suffers from memory constraint. In order to alleviate this problem, hashing-based
method is exploited as a solution to the Approximate Nearest Neighbor (ANN) search. Compared with the tree-based method, hashing-based method maps feature into compact binary codes. Since relatively few bits are required , hashing-based method greatly reduces the storage consumption, which makes the storage efficient. In addition, hamming distance between two binary codes can be computed with milliseconds by conducting bit XOR operation, which makes the retrieval finish in a sub-linear or even constant time. This paper focuses on hashing-based method for large-scale image retrieval, and the main work and contributions can be summarized as follows:
(1). This paper makes research on the typical ANN methods for image retrieval, and it
summarizes their advantages and shortcomings. The existing hashing-based methods have been surveyed and classified into categories. Besides, for popular hashing-based methods, we summarize theirs main ideas.
(2). A hashing method based on sparse reconstruction for image retrieval is proposed. For the reconstructed coefficient of the proposed method, a constraint of l21 norm is imposed for feature selection, so as to increase the discrimination of feature before encoding binary codes. Moreover, the sparse reconstruction is fully related with the encoding process in the framework, so that it can interact and influence with each other. Finally, a reasonable balance mechanism is built between adjusted covariance matrix and the minimization reconstruction error. As a further expansion, local neighborhood relation is constrained to the sparse reconstruction using graph Laplacian matrix , which makes sure the local structure relationship in the Hamming space preserves the local structure relationship in the Euclidean space. Therefore, it ensures the local relationship between original feature space and binary codes is the same. Compared with the mainstream popular unsupervised hashing methods, the proposed hashing method tested on several public image datasets shows the performance is improved under different evaluations.
(3). A latent semantic minimal hashing for image retrieval is proposed. The motivations
of the proposed hashing method has two points. Firstly, since the binary encoding process for data is a discrete domain to a continuous domain, the quantization loss should be as small as possible. Secondly, from the view of feature representation level, the existing Euclidean distance similarity hashing methods directly or indirectly reduce dimensional feature from the feature space to quantified hamming space, which lacks of feature re-refining. The feature may not represent good when it quantifys to binary codes. Taking these two considerations, the proposed hashing method using matrix decomposition to map the feature to the latent semantic feature space under the premise of minimal quantization loss, and a latent semantic minimal hashing method model is proposed in this paper.
|
学科主题 | 信号与模式识别 |
语种 | 中文 |
源URL | [http://ir.opt.ac.cn/handle/181661/27967] ![]() |
专题 | 西安光学精密机械研究所_研究生部 |
作者单位 | 中国科学院西安光学精密机械研究所 |
推荐引用方式 GB/T 7714 | 袁勇. 基于哈希编码的大规模图像检索方法研究[D]. 北京. 中国科学院研究生院. 2011. |
入库方式: OAI收割
来源:西安光学精密机械研究所
浏览0
下载0
收藏0
其他版本
除非特别说明,本系统中所有内容都受版权保护,并保留所有权利。