A data-based online reinforcement learning algorithm satisfying probably approximately correct principle
Document type: Journal article
Author | Zhu, Yuanheng |
Journal | NEURAL COMPUTING & APPLICATIONS |
Publication date | 2015-05-01 |
Volume | 26 |
Issue | 4 |
Pages | 775-787 |
Keywords | Reinforcement learning; Probably approximately correct; Kd-tree |
Abstract | This paper proposes a probably approximately correct (PAC) algorithm that directly utilizes online data efficiently to solve the optimal control problem of continuous deterministic systems without system parameters for the first time. The dependence on some specific approximation structures is crucial to limit the wide application of online reinforcement learning (RL) algorithms. We utilize the online data directly with the kd-tree technique to remove this limitation. Moreover, we design the algorithm in the PAC principle. Complete theoretical proofs are presented, and three examples are simulated to verify its good performance. It draws the conclusion that the proposed RL algorithm specifies the maximum running time to reach a near-optimal control policy with only online data. |
WOS headings | Science & Technology ; Technology |
Category [WOS] | Computer Science, Artificial Intelligence |
Research area [WOS] | Computer Science |
Keywords [WOS] | TIME NONLINEAR-SYSTEMS |
Indexed in | SCI |
Language | English |
WOS accession number | WOS:000353356000003 |
Source URL | [http://ir.ia.ac.cn/handle/173211/8114] |
Collection | State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning |
Author affiliation | Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China |
Recommended citation GB/T 7714 | Zhu, Yuanheng, Zhao, Dongbin. A data-based online reinforcement learning algorithm satisfying probably approximately correct principle[J]. NEURAL COMPUTING & APPLICATIONS, 2015, 26(4): 775-787. |
APA | Zhu, Yuanheng, & Zhao, Dongbin. (2015). A data-based online reinforcement learning algorithm satisfying probably approximately correct principle. NEURAL COMPUTING & APPLICATIONS, 26(4), 775-787. |
MLA | Zhu, Yuanheng, et al. "A data-based online reinforcement learning algorithm satisfying probably approximately correct principle". NEURAL COMPUTING & APPLICATIONS 26.4 (2015): 775-787. |
Ingest method: OAI harvesting
Source: Institute of Automation