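Chinese Academy of Sciences Institutional Repositories Grid: Image Representation Based on the Bag-of-Words Model and Its Application in Image Classification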

Image Representation Based on the Bag-of-Words Model and Its Application in Image Classification

Document type: Thesis


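Author	Sun Tao (孙涛)
Degree type	Master of Engineering
Defense date	2012-11-29
Degree-granting institution	University of Chinese Academy of Sciences
Place of conferral	Institute of Automation, Chinese Academy of Sciences
Advisor	Lu Hanqing (卢汉清)
Keywords	Image classification; Image representation; Bag of Words; Feature dimension reduction; Local feature context
Alternative title	Image Representation Based on The Bag-of-Words Model and Application in Image Classification
Major	Pattern Recognition and Intelligent Systems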
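Chinese abstract	With the rapid development of multimedia technology and the Internet, multimedia data is emerging in massive volumes. As an information carrier, images hold a great deal of useful information to be mined. Given this mass of image resources, how to analyze, recognize, and index images automatically and effectively has become one of the challenging research problems of the day. Image classification is an image processing method that distinguishes targets of different categories according to the different features they exhibit in the image information, and it is a classic problem in computer vision and pattern recognition. Doing image classification well matters a great deal: it helps people browse and manage images by semantic content, greatly reduces manual annotation on image-sharing websites, assists image retrieval, and so on. Image classification has already been applied successfully in several areas, including search engines, trademark classification systems, family album management, and digital libraries. The main work of this thesis is to design effective, robust image representation methods based on the bag-of-words model so that they can be better applied to image classification. First, we introduce the main modules of image classification algorithms and study each module's function, open problems, and solutions. Then, based on the bag-of-words model, we study in depth the image representation used in image classification. The main results fall into two parts. First, we propose a feature dimension reduction algorithm based on semi-supervised spectral discriminant analysis. It combines the strengths of locality preserving projections and local Fisher discriminant analysis: it preserves the global structure of the unlabeled samples while separating labeled samples of different classes. The algorithm also avoids the overfitting that most supervised dimension reduction algorithms suffer when labeled samples are scarce. We convert the optimization problem into an eigenvalue decomposition problem, so the globally optimal solution is available in analytic form, which keeps the algorithm computationally feasible and efficient. This yields an image feature representation that is more robust, simpler, and more discriminative. Second, we propose an image representation method that takes the context of local features into account. To represent the visual information at a given position in an image, we use not only the local feature at that position but also the distance and angle of other local features relative to it. Image features built from this local feature context are more discriminative and are invariant to rotation and scale changes. This novel context-based representation combines effectively with currently popular image classification methods and achieves good classification performance.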
English abstract	With the rapid development of multimedia technology and the Internet, multimedia data is emerging in vast quantities. As a carrier of information, an image holds a great deal of information to be mined. Given these massive image resources, automatic and effective image analysis, recognition, and indexing has become a challenging research problem. Image classification is an image processing method that distinguishes different types of targets by the different characteristics they exhibit in the image information; it is a classical topic in computer vision and pattern recognition. Image classification is of great significance: it helps people view and manage images by semantic content, greatly reduces manual annotation on image-sharing sites, aids image retrieval, and so on. It has been applied successfully in areas such as search engines, trademark classification systems, family album management, and digital libraries. The main work of this thesis is to design efficient, robust image representation methods based on the bag-of-words (BoW) model that can be applied successfully to image classification. First, we introduce the main modules of image classification algorithms and study the function of each module, its existing problems, and the solutions. Then we focus on image representation in the image classification task based on the BoW model. The main contributions are twofold. First, we propose a feature dimension reduction algorithm based on semi-supervised spectral discriminant analysis. It combines locality preserving projections and local Fisher discriminant analysis, so that the global structure of the unlabeled samples is preserved while labeled samples of different classes are separated from each other.
Our method avoids the overfitting problem that limits most supervised dimensionality reduction methods when few labeled samples are available. We transform the optimization problem into an eigenvalue decomposition problem, so the globally optimal solution is obtained in analytic form. This guarantees that the proposed method is computationally reliable and efficient. The algorithm yields a more robust, simpler, and more discriminative image representation. Second, we propose an image representation method that considers the local feature context. Given a positio...
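Language	Chinese
Other identifier	200628014628045
Source URL	[http://ir.ia.ac.cn/handle/173211/7650]
Collection	Graduates_Master's Theses
Recommended citation (GB/T 7714)	Sun Tao. Image Representation Based on the Bag-of-Words Model and Its Application in Image Classification[D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences. 2012.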
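
For readers unfamiliar with the bag-of-words (BoW) model named in the abstract, the basic idea can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes a codebook of visual words is already available (in practice often obtained by clustering local descriptors) and uses hard nearest-word assignment with L1 normalization. The function names `nearest_codeword` and `bow_histogram` and the toy 2-D descriptors are hypothetical.

```python
from math import inf


def nearest_codeword(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor
    (squared Euclidean distance; hard assignment is an assumption here)."""
    best_index, best_dist = -1, inf
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, codebook):
    """Build an L1-normalized histogram of visual-word counts for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    # An image with no descriptors yields an all-zero histogram.
    return [c / total for c in counts] if total else [float(c) for c in counts]


# Toy data: three visual words and four local descriptors (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.7, 5.2)]
histogram = bow_histogram(descriptors, codebook)
print(histogram)
```

The resulting fixed-length histogram is the image representation that a classifier would consume. It discards where each feature occurred, which is the kind of spatial information the local feature context described in the abstract aims to recover.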

Ingest method: OAI harvesting

Source: Institute of Automation
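
The abstract states that the dimension reduction objective is converted into an eigenvalue decomposition problem with a globally optimal solution in analytic form. This record does not give the thesis's actual objective, so the following shows only the generic form that discriminant criteria of this family commonly take. The symbols $S_b$ (a between-class style scatter matrix), $S_w$ (a within-class or locality style scatter matrix, assumed invertible), $T$, $d$, and $r$ are placeholders, not the author's notation.

```latex
% Generic discriminant criterion: find a d-by-r projection T
T^{*} = \arg\max_{T \in \mathbb{R}^{d \times r}}
        \operatorname{tr}\!\left( (T^{\top} S_w T)^{-1} \, T^{\top} S_b T \right)

% A maximizer is obtained from the generalized eigenvalue problem
S_b \, \varphi_k = \lambda_k \, S_w \, \varphi_k ,
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d ,

% by stacking the eigenvectors of the r largest eigenvalues
T^{*} = \left[ \varphi_1, \varphi_2, \dots, \varphi_r \right]
```

Because the solution comes from a single eigendecomposition rather than an iterative search, it is global and inexpensive to compute, which matches the abstract's claim about computational reliability and efficiency.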

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.