Online Similarity Learning for Big Data with Overfitting
Document type: Journal article
Authors | Fan BJ(范保杰); Zeng P(曾鹏) |
Journal | IEEE Transactions on Big Data |
Publication date | 2018 |
Volume | 4, Issue: 1, Pages: 78-89 |
Keywords | online learning; similarity learning; low rank; sparse representation; feature selection; overfitting; redundancy |
ISSN | 2332-7790 |
Institution rank order | 1 |
Corresponding author | Cong Y(丛杨) |
Abstract | In this paper, we propose a general model to address the overfitting problem in online similarity learning for big data, which is generally caused by two kinds of redundancy: 1) feature redundancy, that is, there exist redundant (irrelevant) features in the training data; 2) rank redundancy, that is, the non-redundant (relevant) features lie in a low-rank space. To overcome these, our model is designed to obtain a simple and robust metric matrix by detecting the redundant rows and columns in the metric matrix and constraining the remaining matrix to a low-rank space. To reduce feature redundancy, we employ group sparsity regularization, i.e., the $\ell_{2,1}$ norm, to encourage a sparse feature set. To address rank redundancy, we adopt a low-rank regularization, the max norm, instead of calculating the SVD as in traditional models using the nuclear norm. Therefore, our model can not only generate a low-rank metric matrix to avoid overfitting, but also achieve feature selection simultaneously. For model optimization, an online algorithm based on the stochastic proximal method is derived to solve this problem efficiently with a complexity of $O(d^2)$. To validate the effectiveness and efficiency of our algorithms, we apply our model to online scene categorization and synthesized data and conduct experiments on various benchmark datasets with comparisons to several state-of-the-art methods. Our model is as efficient as the fastest online similarity learning model, OASIS, while performing generally as well as the accurate model OMLLR. Moreover, our model can exclude irrelevant or redundant feature dimensions simultaneously. |
Indexed by | EI |
Language | English |
Source URL | [http://ir.sia.cn/handle/173321/21394] |
Collection | Shenyang Institute of Automation_Robotics Laboratory |
Author affiliations | 1. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; 2. Department of Computer Science, University of Rochester, Rochester, NY 14611 USA; 3. College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, 210042 China |
Recommended citation GB/T 7714 | Fan BJ,Zeng P,Yu HB,et al. Online Similarity Learning for Big Data with Overfitting[J]. IEEE Transactions on Big Data,2018,4(1):78-89. |
APA | Fan BJ,Zeng P,Yu HB,Luo JB,Cong Y,&Liu J.(2018).Online Similarity Learning for Big Data with Overfitting.IEEE Transactions on Big Data,4(1),78-89. |
MLA | Fan BJ,et al."Online Similarity Learning for Big Data with Overfitting".IEEE Transactions on Big Data 4.1(2018):78-89. |
Ingestion method: OAI harvesting
Source: Shenyang Institute of Automation
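
The abstract describes an online update that combines a similarity-learning step with group sparsity on the metric matrix. The following is a minimal sketch of that general idea, not the authors' algorithm: it assumes a bilinear similarity S(x, y) = xᵀMy with an OASIS-style triplet hinge loss, applies only the row-wise $\ell_{2,1}$ proximal step, and omits the max norm regularizer entirely. The function names, the learning rate `lr`, and the weight `lam` are hypothetical choices made for illustration.

```python
import numpy as np

def prox_l21_rows(M, tau):
    """Proximal operator of tau * ||M||_{2,1} (row-wise group soft-thresholding).

    Rows whose Euclidean norm is at most tau are set to zero, which is what
    removes a redundant feature; the other rows are shrunk toward zero.
    """
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return M * scale

def online_step(M, p, p_pos, p_neg, lr=0.1, lam=0.01):
    """One stochastic proximal step on a triplet (hypothetical helper).

    Assumed loss: max(0, 1 - S(p, p_pos) + S(p, p_neg)) with S(x, y) = x^T M y.
    The gradient step and the proximal step each cost O(d^2).
    """
    margin = 1.0 - p @ M @ p_pos + p @ M @ p_neg
    if margin > 0:
        # Negative gradient of the hinge loss with respect to M
        M = M + lr * np.outer(p, p_pos - p_neg)
    return prox_l21_rows(M, lr * lam)

# Usage: one update from a zero matrix on a toy triplet in three dimensions.
d = 3
M = np.zeros((d, d))
p, p_pos, p_neg = np.eye(d)[0], np.eye(d)[0], np.eye(d)[1]
M = online_step(M, p, p_pos, p_neg)
```

Applying the proximal step after every gradient step keeps each update at O(d²), since both the rank-one outer product and the row-wise shrinkage touch each matrix entry once.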