Model-Free Optimal Tracking Control via Critic-Only Q-Learning
Document type: Journal article
Authors | Luo, Biao1; Liu, Derong2; Huang, Tingwen3; Wang, Ding1 |
Journal | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS |
Publication date | 2016-10-01 |
Volume | 27, Issue: 10, Pages: 2134-2144 |
Keywords | Critic-only Q-learning (CoQL); Model-free; Nonaffine nonlinear systems; Optimal tracking control |
DOI | 10.1109/TNNLS.2016.2585520 |
Document subtype | Article |
Abstract | Model-free control is an important and promising topic in control fields, which has attracted extensive attention in the past few years. In this paper, we aim to solve the model-free optimal tracking control problem of nonaffine nonlinear discrete-time systems. A critic-only Q-learning (CoQL) method is developed, which learns the optimal tracking control from real system data, and thus avoids solving the tracking Hamilton-Jacobi-Bellman equation. First, the Q-learning algorithm is proposed based on the augmented system, and its convergence is established. Using only one neural network for approximating the Q-function, the CoQL method is developed to implement the Q-learning algorithm. Furthermore, the convergence of the CoQL method is proved with the consideration of neural network approximation error. With the convergent Q-function obtained from the CoQL method, the adaptive optimal tracking control is designed based on the gradient descent scheme. Finally, the effectiveness of the developed CoQL method is demonstrated through simulation studies. The developed CoQL method learns with off-policy data and is implemented with a critic-only structure; thus, it is easy to realize and overcomes the inadequate exploration problem. |
WOS keywords | TIME NONLINEAR-SYSTEMS ; H-INFINITY CONTROL ; ADAPTIVE OPTIMAL-CONTROL ; SPATIALLY DISTRIBUTED PROCESSES ; LINEAR-SYSTEMS ; CONTROL DESIGN ; UNKNOWN DYNAMICS ; CONTROL SCHEME ; ATTITUDE TRACKING ; POLICY ITERATION |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:000384644000012 |
Funding | National Natural Science Foundation of China (61233001 ; 61273140 ; 61304086 ; 61374105 ; 61503377 ; 61533017 ; U1501251) ; State Key Laboratory of Management and Control for Complex Systems ; National Priorities Research Program through the Qatar National Research Fund (a member of Qatar Foundation) (NPRP 7-1482-1-278) |
Source URL | [http://ir.ia.ac.cn/handle/173211/12301] |
Collection | Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Intelligent Systems Team |
Corresponding author | Luo, Biao |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; 2. Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; 3. Texas A&M Univ Qatar, Doha 23874, Qatar |
Recommended citation GB/T 7714 | Luo, Biao, Liu, Derong, Huang, Tingwen, et al. Model-Free Optimal Tracking Control via Critic-Only Q-Learning[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27(10): 2134-2144. |
APA | Luo, Biao, Liu, Derong, Huang, Tingwen, & Wang, Ding. (2016). Model-Free Optimal Tracking Control via Critic-Only Q-Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 27(10), 2134-2144. |
MLA | Luo, Biao, et al. "Model-Free Optimal Tracking Control via Critic-Only Q-Learning." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 27.10 (2016): 2134-2144. |
Ingest method: OAI harvesting
Source: Institute of Automation