中国科学院机构知识库网格系统: 快速判别性稀疏编码及其在图像识别中的应用

中国科学院机构知识库网格

Chinese Academy of Sciences Institutional Repositories Grid

快速判别性稀疏编码及其在图像识别中的应用

文献类型：学位论文


作者	姜锐
学位类别	工学博士
答辩日期	2015-12
授予单位	中国科学院研究生院
授予地点	北京
导师	乔红
关键词	判别性稀疏编码机器学习图像识别特征提取
中文摘要	稀疏编码方法自1996年在国际著名学术期刊《Nature》上被首次提出以来，已经成为图像处理领域的研究热点。作为稀疏编码基本模型的一类推广模型，判别性稀疏编码模型被成功应用于针对识别任务的图像特征提取，形成了基于判别性稀疏编码的图像特征学习技术和图像特征构造技术。然而，判别性稀疏编码模型的训练和推断的时间长，该问题已经成为限制这类模型走向实际应用的瓶颈。针对这一问题，本文开展了对判别性稀疏编码快速算法的研究。具体地，我们聚焦于以下几个问题： (1) 图正则化稀疏编码模型是一种典型的判别性稀疏编码模型。虽然该模型已经在图像识别领域被成功应用于图像特征学习和图像特征构造，但是图正则化稀疏编码非常耗时的问题使得这些技术很难走向实用。针对该问题，需要设计图正则化稀疏编码的快速算法。 (2) Fisher判别性字典学习方法是近年来提出的一种判别性稀疏编码方法。该方法的原始算法非常耗时，虽然简化的模型训练速度很快，但该模型忽视了协同性重构的作用，因此在不同类别变化不平衡的分类任务中表现不稳定。针对这样的分类任务，需要为Fisher判别性字典学习重新设计快速而稳定的算法。 (3) 现有的非负字典更新策略是批量式的，或者是为基于ℓ0“范数”的非负稀疏编码模型所设计。针对这一问题，需要设计适用于一般非负稀疏编码模型的字典更新快速算法。本文致力于解决以上问题。全文的主要创新点和贡献总结如下。 (1) 针对图正则化稀疏编码（Graph regularized Sparse Coding，GSC）运行速度过慢的问题，本文提出了一种加速GSC的方法。GSC采用了一种交替优化框架，该框架反复解一个ℓ1正则最小二乘问题的变体问题，我们简称该问题为GSRsub。解GSRsub的传统方法是解该问题的原问题，虽然GSRsub原问题的目标函数是强凸的，但是不可微，这导致算法收敛过慢。我们提出通过解GSRsub的一个对偶问题来加速GSC，我们简称该问题为D-GSRsub。D-GSRsub的目标函数强凸、平滑，且D-GSRsub的变量数比GSRsub要少。基于D-GSRsub的性质，我们尝试了四种计算复杂度较低的对偶上升策略，并且在真实数据集的图像分类、图像聚类任务上测试了这四种策略的效果。实验结果表明，这些策略在保证GSC效果的同时，可以显著提高GSC的计算效率。 (2) 为了设计适用于不同类别变化不平衡的分类任务的快速Fisher判别性字典学习（Efficient Fisher Discrimination Dictionary Learning，E-FDDL）算法，我们提出解一个近似的Fisher判别性稀疏表示（Fisher Discrimination based Sparse Representation，FDSR）问题，它的目标函数是原始FDDL算法中FDSR问题目标函数的上界。该近似FDSR（Approximate FDSR，AFDSR）问题考虑了判别性重构和协同性重构两方面的作用，并且稍稍重视后者的作用，这使得E-FDDL在处理同类别变化不均匀的分类任务时更加鲁棒。进一步地，A-FDSR问题的结构使得快速的优化策略适用于该问题，这又带来了E-FDDL的快速性。我们在人脸识别实验的结果证实了E-FDDL快速稳定的表现。 (3) 针对快速非负字典更新策略的问题，我们设计了一种既适用于基于ℓ0“范数”的非负稀疏编码又适用于基于ℓ1范数的非负稀疏编码的字典更新快速算法。该算法按顺序解K个单位范数和非负约束下的投影最大化问题，我们简称该方法为K-UNPM。我们比较了K-UNPM和N-K-SVD在单个字典列更新过程中的浮点运算次数，说明K-UNPM比N-K-SVD的计算复杂度低。我们设计了一种以K-UNPM为字典更新策略的基于图像块的判别性非负稀疏编码（PDNSC）方法，该方法在USPS手写体数字数据集上的聚类实验中超越了经典特征学习方法，这间接佐证了K-UNPM在优化PDNSC模型时的有效性。本文的研究工作得到了国家自然科学基金委重点项目（课题编号：11131006）“稀疏信息处理的数学理论与方法”、国家自然科学基金委重点项目（课题编号：61033011）“引入人的视―触觉认知及其融合实现机器人与人的仿人交互和合作”和国家自然科学基金国际合作与交流项目（课题编号：61210009）“‘视觉认知―反应’融合神经系统可计算模型及其在机器人中的应用――面向重大任务的智能决策研究”的支持，在理论和应用上都具有重要的研究意义。
英文摘要	Sparse coding has drawn great research interests in the field of image processing, since it was firstly proposed in Nature in 1996. As extended versions of basic models of sparse coding, Discriminative Sparse Coding (DSC) models has been successfully applied to feature extraction for image recognition tasks. The developed feature extraction techniques generally fall into two categories, the DSC based image feature learning and the DSC based image feature construction. However, the great execution time of DSC algorithms has been a bottleneck that restricts their practical use. In view of this problem, in this thesis we studied fast algorithms for DSC models. Specifically, we focused on the following issues. (1) Graph regularized Sparse Coding (GSC) model is a typical DSC model and has been successfully applied in image feature learning and image feature construction. However, GSC is very time-consuming. Thus, efficient strategies are aspired to speed up GSC. (2) Fisher Discrimination Dictionary Learning (FDDL) is another effective DSC method presented in resent years. However, the original FDDL is timeconsuming. The simplified FDDL is fast, but it ignored the role of collaborative reconstruction and thus has an unstable performance in classification tasks with unbalanced changes in different classes. For such classification problem, an efficient and robust FDDL is needed. (3) Current nonnegative dictionary update strategies are either batch projected gradient descent method or designed for ℓ0 norm based Nonnegative Sparse Coding (NSC) model. A fast nonnegative dictionary update strategy for more nonnegative sparse coding models has not been developed. This thesis is devoted to addressing the above issues. And the innovations and contributions of this thesis can be summarized as follows. (1) For the time-consuming issue of GSC, we presented a new efficient way to speed up GSC. Specifically, the alternating optimization framework for GSC involves repeatedly solving a variant of ℓ1 regularized least square problem referred to as GSRsub. Traditional ways to deal with GSRsub are to generalize optimization strategies for ℓ1 regularized least square problem to solve its primal problem that is strongly convex but non-differentiable, thus converging slowly. We propose that GSC can be accelerated by solving a new dual problem of GSRsub called D-GSRsub. Compared with the primal form and the existing dual form of GSRsub, D-GSRsub has a strongly convex and smooth objective function with less variables. Based on these properties, four dual gradient ascent strategies with lower computational complexities are developed. Experimental results on real-world datasets demonstrate that these strategies can dramatically and stably speed up GSC without affecting its performance in the corresponding image classification and clustering tasks. (2) For developing an Efficient FDDL (E-FDDL) method for classification tasks with unbalanced changes in different classes, we propose to solve an approximate Fisher Discrimination based Sparse Representation (FDSR) problem whose objective function is an upper bound of that of the original FDSR problem in the O-FDDL algorithm. The Approximate FDSR (A-FDSR) problem considers the role of both the discriminative reconstruction and the collaborative reconstruction, and is slightly weighted in favor of the latter. This grants E-FDDL more stability when dealing with the above issues. Furthermore, the structure of A-FDSR makes fast optimization strategies applicable, thus leading to the high efficiency of E-FDDL. Results of face recognition experiments demonstrates the fast and stable performance of E-FDDL. (3) For fast nonnegative dictionary update strategy issue, we proposed an efficient nonnegative dictionary update strategy which is suitable for both ℓ0 norm based NSC model and ℓ1 norm based NSC model. This strategy sequentially updates dictionary columns by sequentially solving K Unit-norm and Nonnegativity constrained Projection Maximization problems. Thus this strategy is referred to as K-UNPM. We compared floating point operation counts of K-UNPM and Nonnegative K-SVD (N-K-SVD) during each single dictionary column update, and establish the lower computational complexity of K-UNPM. Furthermore, we design a Patch based Discriminative Nonnegative Sparse Coding (PDNSC) method which adopts K-UNPM as the dictionary update strategy. PDNSC outperforms some classical feature learning methods in an image clustering experiment on the USPS handwritten digit database, thus verifying the effectiveness of K-UNPM. The research work in this thesis is partially supported by the National Natural and Science Foundation (NNSF) key project “The mathematical theory and method of sparse information processing” under grant no.11131006, the NNSF key project “Introducing human visual-tactile cognition and its fusion to robothuman imitated interaction and cooperation” under grant no.61033011, the NNSF international cooperation and exchange project “‘Visual Cognition-Reaction’ fused Neurocomputional Model — for key tasks intelligent strategy investigation” under grant no.61210009. Therefore, the research work in this thesis is very important in both theory and application.
学科主题	模式识别与智能系统
源URL	[http://ir.ia.ac.cn/handle/173211/12463]
专题	毕业生_博士学位论文
作者单位	中国科学院自动化研究所
推荐引用方式 GB/T 7714	姜锐. 快速判别性稀疏编码及其在图像识别中的应用[D]. 北京. 中国科学院研究生院. 2015.

入库方式： OAI收割

来源：自动化研究所

浏览0

下载0

收藏0

其他版本

除非特别说明，本系统中所有内容都受版权保护，并保留所有权利。