Chinese Academy of Sciences Institutional Repositories Grid System: Image Content Analysis Method and Applications

Chinese Academy of Sciences Institutional Repositories Grid

Image Content Analysis Method and Applications

Document type: Thesis


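Author	Zhang Rongguo (张荣国)
Degree type	Doctor of Engineering
Defense date	2011-12-07
Degree-granting institution	Graduate University of Chinese Academy of Sciences
Place of conferral	Institute of Automation, Chinese Academy of Sciences
Advisor	Wang Chunheng (王春恒)
Keywords	Image Content Analysis; Image Classification; Object Detection; Sparse Representation; Feature Transformation; Online Learning
Alternative title	Image Content Analysis Method and Applications
Major	Pattern Recognition and Intelligent Systems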
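Chinese abstract	With the rapid development of multimedia, Internet, and mobile information technologies, images and video have become increasingly important carriers of information. How to manage massive image and video collections effectively, and how to mine useful information from them efficiently, is an important research topic in applied computer technology. Pattern recognition and computer vision provide the necessary methods and tools for this problem, and image content analysis, as an important component of computer vision, has significant research value and application prospects. In practice, the objects in an image are usually important clues for understanding its content and form one of the important levels at which image content is analyzed. Object-based content analysis is an important branch of image content analysis research. This thesis focuses on problems in image content analysis based on visual objects: object classification in images, object detection in images, detection-based object tracking, and feature fusion and feature transformation. It covers both method research and application system development. The main work is summarized as follows:
(1) An image object classification method based on non-negative sparse decomposition was designed. It differs from traditional sparse representation classification methods in three ways: i) for each class, a class-specific "positive and negative dictionary" is learned through non-negative sparse representation, rather than a single class-independent dictionary, which strengthens the discriminative power of the sparse representation dictionaries; ii) the non-negativity constraint on the decomposition coefficients has a clearer biophysical basis, and non-negative decomposition agrees better with the intuition of human visual perception; iii) the test image is reconstructed by non-negative sparse reconstruction over the positive and negative dictionaries and is classified by analyzing the reconstruction coefficients rather than the reconstruction error. Because bag-of-words data are inherently non-negative and the whole is a non-negative linear combination of local parts, combining this method with the bag-of-words model achieved good results in image object classification experiments.
(2) A simple and effective power-exponent feature transformation was introduced for the histogram features commonly used in pattern recognition and computer vision. Extensive experiments show that the transformation improves the linear separability and discriminative power of histogram features, so that the transformed features clearly improve the accuracy of image classification and object detection with a simple Euclidean distance, without the more complex chi-square distance or geodesic distance (EMD). In object detection experiments, the transformation combined with an SVM RBF kernel outperformed the chi-square kernel while running nearly 20 times faster.
(3) A superpixel-based image object detection method was designed. Through over-segmentation of the image, an efficient algorithm derives candidate detection windows from superpixels. Experiments show that, compared with the traditional sliding-window approach, the number of candidate windows was reduced by 38% on average while detection accuracy also improved. In addition, the candidate windows obtained from a single segmentation and analysis pass can be used to detect multiple objects in an image.
(4) A semi-supervised online learning method for video object detection and tracking was designed. To address the need of supervised machine learning for large numbers of labeled samples, a semi-supervised online learning method based on video frame-domain adaptation was developed. It exploits the similarity of video frames within the same frame domain and updates a random forest classifier by self-training, achieving online adaptive detection and tracking of specific target objects in video. Comparative experiments demonstrate the effectiveness of the method.
(5) Building on the above results, an illegal image information identification system and an image retrieval system for e-commerce platforms were developed for practical applications. The illegal image information identification system has run stably in production at Jiangsu Telecom for a year and a half and at Guangxi Telecom for one year, and has been used by Huawei to promote its overseas business.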
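As an illustration of item (2) of the abstract, the following is a minimal sketch of a power-exponent transformation applied to a histogram feature before a Euclidean comparison. The abstract does not state the exponent or the normalization used in the thesis, so the default `alpha=0.5`, the L1 normalization step, and the function names `power_transform` and `euclidean` are all assumptions made for this example.

```python
import math


def power_transform(hist, alpha=0.5):
    """L1-normalize a non-negative histogram, then raise each bin to alpha.

    alpha=0.5 is an assumed value; the abstract does not give the exponent.
    """
    if any(v < 0 for v in hist):
        raise ValueError("histogram bins must be non-negative")
    total = sum(hist)
    if total == 0:
        # An empty histogram stays all-zero.
        return [0.0] * len(hist)
    return [(v / total) ** alpha for v in hist]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    h1 = [8, 1, 1]  # hypothetical bag-of-words counts
    h2 = [6, 2, 2]
    raw = euclidean(power_transform(h1, alpha=1.0), power_transform(h2, alpha=1.0))
    transformed = euclidean(power_transform(h1), power_transform(h2))
    print("distance on normalized histograms:", raw)
    print("distance after power transform:  ", transformed)
```

With `alpha=0.5` the transformed vector has unit L2 norm, so the Euclidean distance between two transformed histograms behaves like a Hellinger-style distance. Whether the thesis uses this particular exponent is not stated in the record.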
English abstract	With the rapid progress of multimedia, Internet, and mobile information technologies, images and videos play an increasingly important role as carriers of information. How to manage these massive resources and extract valuable information from them efficiently has become a significant research issue in applied computer technology. Applications of pattern recognition and computer vision have achieved some remarkable results. As an important part of computer vision, image content analysis has significant research value and application prospects. In practical applications, objects are usually important clues. This thesis focuses on object-based image content analysis: object classification, object detection, object tracking by detection, and feature fusion and feature transformation. Application systems based on this research were also built. The main contributions include: (1) An image classification method based on non-negative sparse representation was proposed. i) The positive and negative dictionaries were first obtained from image patches, which improved the discriminative capacity of the sparse representation dictionaries. ii) The non-negativity constraint corresponds to human cognition and has a clear physical meaning. iii) The learned dictionaries were used to reconstruct the test image sparsely, and classification was performed by analyzing the reconstruction coefficients. The non-negative sparse representation was also fused into the bag-of-words model to obtain the histogram feature. (2) The thesis proposed a data transformation for common histogram features. Extensive experiments showed that the feature transformation can enhance linear separability and discriminative ability. In an SVM classifier, the RBF kernel applied to the transformed features achieved better results than the chi-square kernel and was almost 20 times faster. The experiments also showed that object detection and classification performance improved after the feature transformation. (3) A superpixel-based algorithm was proposed in which low-level and mid-level visual features are fused and candidate detection windows are obtained efficiently. This method greatly reduced the number of candidate detection windows and improved detection precision. The detection windows obtained by the...
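To make the idea in item (3) concrete, here is a rough sketch of how candidate detection windows might be derived from an over-segmentation. The abstract does not describe the actual algorithm, so this is only one plausible reading: each superpixel contributes its bounding box, and each pair of touching superpixels contributes the bounding box of their union. The label-map input and the function names `superpixel_boxes`, `adjacent_pairs`, and `candidate_windows` are hypothetical.

```python
def superpixel_boxes(labels):
    """Return {label: (row_min, col_min, row_max, col_max)} for a 2D label map."""
    boxes = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab in boxes:
                r0, c0, r1, c1 = boxes[lab]
                boxes[lab] = (min(r0, r), min(c0, c), max(r1, r), max(c1, c))
            else:
                boxes[lab] = (r, c, r, c)
    return boxes


def adjacent_pairs(labels):
    """Return the set of label pairs that touch horizontally or vertically."""
    pairs = set()
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols and labels[r][c] != labels[rr][cc]:
                    pairs.add(tuple(sorted((labels[r][c], labels[rr][cc]))))
    return pairs


def candidate_windows(labels):
    """Boxes of single superpixels plus boxes of adjacent-pair unions (assumed rule)."""
    boxes = superpixel_boxes(labels)
    windows = set(boxes.values())
    for a, b in adjacent_pairs(labels):
        ra0, ca0, ra1, ca1 = boxes[a]
        rb0, cb0, rb1, cb1 = boxes[b]
        windows.add((min(ra0, rb0), min(ca0, cb0), max(ra1, rb1), max(ca1, cb1)))
    return sorted(windows)


if __name__ == "__main__":
    # Hypothetical 3x3 label map with three superpixels.
    label_map = [
        [0, 0, 1],
        [0, 0, 1],
        [2, 2, 1],
    ]
    for window in candidate_windows(label_map):
        print(window)
```

The point of such a scheme is that the number of windows scales with the number of superpixels rather than with every position and scale in the image, which is consistent with the reduction in candidate windows the abstract reports. The specific merging rule here is an assumption.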
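Language	Chinese
Other identifier	200818014628076
Source URL	[http://ir.ia.ac.cn/handle/173211/6411]
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Zhang Rongguo. Image Content Analysis Method and Applications [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences. 2011.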

Ingestion method: OAI harvesting

Source: Institute of Automation
