Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics

Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics

Document type: Journal article


Authors	Zhang, Qichao [1,2] ; Zhao, Dongbin [1,2]
Journal	IEEE TRANSACTIONS ON CYBERNETICS
Publication date	2019-08-01
Volume	49 Issue: 8 Pages: 2874-2885
Keywords	Integral reinforcement learning (IRL) ; neural network (NN) ; nonzero-sum (NZS) games ; off-policy ; single-critic ; unknown drift dynamics
ISSN	2168-2267
DOI	10.1109/TCYB.2018.2830820
Abstract	This paper is concerned with the nonlinear optimization problem of nonzero-sum (NZS) games with unknown drift dynamics. A data-based integral reinforcement learning (IRL) method is proposed to approximate the Nash equilibrium of NZS games iteratively. Furthermore, we prove that the data-based IRL method is equivalent to the model-based policy iteration algorithm, which guarantees the convergence of the proposed method. For implementation, a single-critic neural network structure for NZS games is given. To broaden the applicability of the data-based IRL method, we design updating laws for the critic weights based on offline and online iterative learning methods, respectively. The experience replay technique is introduced in the online iterative learning, which can improve the convergence rate of the critic weights during the learning process. The uniform ultimate boundedness of the critic weights is guaranteed using the Lyapunov method. Finally, numerical results demonstrate the effectiveness of the data-based IRL algorithm for nonlinear NZS games with unknown drift dynamics.
WOS keywords	H-INFINITY CONTROL ; NONLINEAR-SYSTEMS ; ALGORITHM
Funding	National Natural Science Foundation of China[61533017] ; National Natural Science Foundation of China[61573353] ; National Key Research and Development Plan[2016YFB0101000]
WOS research areas	Automation & Control Systems ; Computer Science
Language	English
WOS accession number	WOS:000467561700005
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL	[http://ir.ia.ac.cn/handle/173211/24567]
Collection	State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning
Corresponding author	Zhao, Dongbin
Author affiliations	1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China 2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Recommended citation GB/T 7714	Zhang, Qichao,Zhao, Dongbin. Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics[J]. IEEE TRANSACTIONS ON CYBERNETICS,2019,49(8):2874-2885.
APA	Zhang, Qichao,&Zhao, Dongbin.(2019).Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics.IEEE TRANSACTIONS ON CYBERNETICS,49(8),2874-2885.
MLA	Zhang, Qichao,et al."Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics".IEEE TRANSACTIONS ON CYBERNETICS 49.8(2019):2874-2885.

Ingest method: OAI harvesting

Source: Institute of Automation
