中国科学院机构知识库网格系统: 排序学习相关问题的研究

中国科学院机构知识库网格

Chinese Academy of Sciences Institutional Repositories Grid

排序学习相关问题的研究

文献类型：学位论文


作者	孙正雅
学位类别	工学博士
答辩日期	2011-05-29
授予单位	中国科学院研究生院
授予地点	中国科学院自动化研究所
导师	王珏
关键词	排序学习位置敏感评价约简方法稀疏学习泛函梯度下降 Learn to Rank Position-Sensitive Measures Reduction Approach Sparse Learning Functional Gradient Descent
其他题名	Study on Learning to Rank Related Problems
学位专业	模式识别与智能系统
中文摘要	本文主要针对排序学习的一些相关问题进行研究。由于排序学习重要的实际意义，因此已经得到了机器学习以及信息检索领域研究学者的广泛关注。排序学习是一个典型的非光滑优化问题，为探索解决问题的有效方法，论文主要内容如下：依据可约简性原则提出从位置敏感排序学习到分类的统一约简框架，并且在这个框架下引入相关性增益函数的定义。通过遗憾转换分析指出保证约简具有一致性的充分条件是学习依据相关性增益加权的二元偏好关系，然后证明加权后的分类遗憾是如何有界约束位置敏感排序遗憾的，从而得出遗憾之间的转换比率不超过最大相邻位置折扣偏差的两倍，当只关心前面位置的排序性能时，进而推导获得更精确的遗憾上界。结合分类技术设计以优化位置敏感评价准则为目标的排序学习算法。为了实现二元偏好学习的有效求解，研究具有贝叶斯一致性的凸替代损失，同时为了解决有限样本情况下加权偏好损失与排序损失之间的不匹配，在优化过程中引入位置敏感算子作为启发式信息，并且提出基于启发式梯度的排序学习算法，在证明算法单调性的同时指出该算法对于处理大规模问题的有效性。利用启发式梯度的更新机制分别求解两种稀疏优化目标，得到相应的稀疏排序学习算法。对于L1约束下最小化凸替代损失，提出将逐点确定下降方向的类梯度迭代机制与L1投影相结合的单调求解过程；对于最小化凸替代损失与L1正则化项的求和，提出将启发式梯度构造与截断技术相结合加以求解，同时指出截断梯度技术实质上相当于在不定可行域范围内的L1投影。基于泛函空间中的梯度下降思想设计排序学习算法。针对位置敏感排序学习设计相应的泛函梯度更新规则，同时在搜索弱学习模型的过程中融入位置敏感算子导出的启发式信息，从而使得经过有限次迭代之后可以最大限度地减少位置敏感排序损失，如果采用提前终止策略，那么在约束弱学习模型生成个数的同时，也起到控制学习模型复杂度的作用。总体说来，本文围绕排序学习从约简框架的提出到遗憾转换的分析，从优化目标的构建到求解算法的设计，从线性模型的假设到弱学习器的集群，构成了一个连贯统一的排序学习研究体系。
英文摘要	This dissertation studies on learning to rank related problems. The ranking problem has drawn much attention in the machine learning community as well as in the information retrieval community primarily due to its significant practical interests. Direct optimization of natural ranking measures leads to non-smooth optimization problems. Therefore a computationally more tractable approach is needed. We summarize the main contributions of our work as follows: Based on reducible principles, we provide a unifying reduction framework from position-sensitive ranking to classification, under which the relevance gain function is introduced. Through regret transform analysis, we prove that the sufficient condition on consistent reduction is given by learning pairwise preferences assigned with importance weights according to relevance gains. We then quantify the reduction with consistency guarantee on at most how much the classification regret can be transferred into the position-sensitive ranking regret, and show that their ratio is at most 2 times the maximal deviation of discounts between adjacent positions. We also present a refined version of this bound when only the quality over the top rank positions is of concern. Coupled with binary classification techniques, we proceed with ranking algorithm design, which is evaluated by position-sensitive measures. For the sake of effective computation, we investigate convex surrogate losses satisfying Bayes consistency with respect to weighted pairwise preferences. When only finite examples are available, we also deal with the mismatch during optimization between weighted preference losses and ranking losses. By introducing position-sensitive operator, we propose a novel ranking algorithm that performs heuristic gradient. Not only its monotonicity is proved, the power of handling large-scale problems is also emphasized. In order to achieve sparsity, two novel ranking algorithms are devised, which learn sparse representations using heuristic gradient descent. In terms of L1 constrained convex surrogate, we alternate monotonously between gradient related estimation and L1 projection; In terms of L1 regularized convex surrogate, we combine calculating heuristic gradient with truncation techniques. Meanwhile, we find that truncation techniques can be regarded as L1 projection within variable constraints. We propose a novel ranking algorithm which performs gradient related optimization in functional space. With th...
语种	中文
其他标识符	200718014628060
源URL	[http://ir.ia.ac.cn/handle/173211/6368]
专题	毕业生_博士学位论文
推荐引用方式 GB/T 7714	孙正雅. 排序学习相关问题的研究[D]. 中国科学院自动化研究所. 中国科学院研究生院. 2011.

入库方式： OAI收割

来源：自动化研究所

浏览0

下载0

收藏0

其他版本

除非特别说明，本系统中所有内容都受版权保护，并保留所有权利。