The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization
Document type: Journal article
Authors | Tao, Wei (1); Pan, Zhisong (1); Wu, Gaowei (3)
Journal | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Publication date | 2020-07-01
Volume | 31; Issue: 7; Pages: 2557-2568
Keywords | Convergence; Extrapolation; Optimization; Acceleration; Machine learning; Task analysis; Machine learning algorithms; Individual convergence; machine learning; Nesterov's extrapolation; nonsmooth optimization; sparsity
ISSN | 2162-237X
DOI | 10.1109/TNNLS.2019.2933452 |
Corresponding author | Tao, Qing (taoqing@gmail.com)
Abstract | The extrapolation strategy proposed by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude for smooth convex objectives, has led to tremendous success in training machine learning models. In this article, the convergence of the individual iterates of projected subgradient (PSG) methods for nonsmooth convex optimization problems is studied theoretically on the basis of Nesterov's extrapolation, which we name individual convergence. We prove that Nesterov's extrapolation has the strength to make the individual convergence of PSG optimal for nonsmooth problems. In light of this consideration, a direct modification of the subgradient evaluation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step toward the open question about stochastic gradient descent (SGD) posed by Shamir. Furthermore, we extend the derived algorithms to solve regularized learning tasks with nonsmooth losses in stochastic settings. Compared with other state-of-the-art nonsmooth methods, the derived algorithms can serve as an alternative to basic SGD, especially for machine learning problems where an individual output is needed to guarantee the regularization structure while keeping an optimal rate of convergence. Typically, our method is applicable as an efficient tool for solving large-scale l1-regularized hinge-loss learning problems. Several comparison experiments demonstrate that our individual output not only achieves an optimal convergence rate but also guarantees better sparsity than the averaged solution.
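The record reproduces only the abstract, not the algorithm itself. For orientation, the following is a minimal sketch in the spirit of what the abstract describes, not the authors' exact method: stochastic subgradient steps evaluated at a Nesterov-style extrapolation point, combined with a soft-thresholding (proximal) step for the l1 regularizer, returning the individual (last) iterate rather than an average. The step-size and extrapolation schedules and the helper names (hinge_subgrad, soft_threshold, extrapolated_subgradient) are assumptions made for illustration only.

```python
# Illustrative sketch only: Nesterov-style extrapolation with stochastic
# subgradient steps for an l1-regularized hinge-loss problem. Schedules and
# helper names are assumptions, not taken from the paper.
import numpy as np

def hinge_subgrad(w, x, y):
    """Subgradient of the hinge loss max(0, 1 - y * <w, x>) with respect to w."""
    return -y * x if y * np.dot(w, x) < 1.0 else np.zeros_like(w)

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def extrapolated_subgradient(X, y, lam=0.01, T=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w_prev = np.zeros(d)
    w = np.zeros(d)
    for t in range(1, T + 1):
        beta = (t - 1) / (t + 2)          # Nesterov-style extrapolation weight (assumed schedule)
        eta = 1.0 / np.sqrt(t)            # O(1/sqrt(t)) step size typical for nonsmooth problems
        z = w + beta * (w - w_prev)       # extrapolation point
        i = rng.integers(n)               # sample one example (stochastic setting)
        g = hinge_subgrad(z, X[i], y[i])  # subgradient of the loss at the extrapolated point
        w_prev, w = w, soft_threshold(z - eta * g, eta * lam)  # prox step preserves sparsity
    return w                              # individual (last) iterate, not an averaged solution

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 50))
    w_true = np.zeros(50); w_true[:5] = 1.0
    y = np.sign(X @ w_true + 0.1 * rng.standard_normal(500))
    w_hat = extrapolated_subgradient(X, y, lam=0.05)
    print("nonzeros in individual iterate:", np.count_nonzero(np.abs(w_hat) > 1e-8))
```

Returning the last iterate instead of the running average is what the abstract calls the individual output; the proximal step is what lets that output retain the sparsity induced by the l1 regularizer.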
Funding projects | NSFC [61673394]; National Key Research and Development Program of China [2016QY03D0501]
WOS research areas | Computer Science; Engineering
Language | English
WOS accession number | WOS:000546986600027
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding organizations | NSFC; National Key Research and Development Program of China
Source URL | http://ir.ia.ac.cn/handle/173211/40050
Collection | Research Center for Precision Sensing and Control, Artificial Intelligence and Machine Learning
Author affiliations | 1. Army Engn Univ PLA, Command & Control Engn Coll, Nanjing 210007, Peoples R China; 2. Army Acad Artillery & Air Def, Hefei 230031, Peoples R China; 3. Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
Recommended citation (GB/T 7714) | Tao, Wei, Pan, Zhisong, Wu, Gaowei, et al. The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31(7): 2557-2568.
APA | Tao, Wei, Pan, Zhisong, Wu, Gaowei, & Tao, Qing. (2020). The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 31(7), 2557-2568.
MLA | Tao, Wei, et al. "The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 31.7 (2020): 2557-2568.
Ingestion method: OAI harvesting
Source: Institute of Automation