Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
Document type: Conference paper
| Authors | Li HF (李海芳)1; Yingce Xia2; Wensheng Zhang1 |
| Publication date | 2018-04 |
| Conference date | July 13-19, 2018 |
| Conference location | Stockholm, Sweden |
| English abstract | Policy evaluation with linear function approximation is an important problem in reinforcement learning. When facing high-dimensional feature spaces, such a problem becomes extremely hard considering the computation efficiency and quality of approximations. We propose a new algorithm, LSTD(λ)-RP, which leverages random projection techniques and takes eligibility traces into consideration to tackle the above two challenges. We carry out theoretical analysis of LSTD(λ)-RP, and provide meaningful upper bounds of the estimation error, approximation error and total generalization error. These results demonstrate that LSTD(λ)-RP can benefit from random projection and eligibility traces strategies, and LSTD(λ)-RP can achieve better performances than prior LSTDRP and LSTD(λ) algorithms. |
| Proceedings publisher | Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18) |
| Language | English |
| Source URL | [http://ir.ia.ac.cn/handle/173211/26084] |
| Collection | Research Center of Precision Sensing and Control_Artificial Intelligence and Machine Learning |
| Corresponding author | Li HF (李海芳) |
| Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Science and Technology of China |
| Recommended citation (GB/T 7714) | Li HF, Yingce Xia, Wensheng Zhang. Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces[C]. In: . Stockholm, Sweden. July 13-19 2018. |
Ingestion method: OAI harvesting
Source: Institute of Automation
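
The abstract describes LSTD(λ)-RP only at a high level: LSTD with eligibility traces, run on randomly projected features. The following is a minimal sketch of that general idea, not the paper's implementation. The function name `lstd_lambda_rp`, its parameters, the Gaussian projection with N(0, 1/d) entries, and the least-squares solve are all assumptions made for illustration.

```python
import numpy as np


def lstd_lambda_rp(features, rewards, gamma, lam, d, seed=0):
    """Sketch of LSTD(lambda) run on randomly projected features.

    features: array of shape (n + 1, D), high-dimensional features of the
              visited states s_0, ..., s_n along one trajectory.
    rewards:  array of shape (n,), reward observed on each transition.
    Returns (theta, P): weights in the d-dimensional space and the
    projection matrix, so V(s) is approximated by (P @ phi(s)) @ theta.
    """
    features = np.asarray(features, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    n, D = rewards.shape[0], features.shape[1]

    # Assumed choice: Gaussian random projection, entries ~ N(0, 1/d).
    rng = np.random.default_rng(seed)
    P = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    low = features @ P.T  # projected features, shape (n + 1, d)

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)  # eligibility trace
    for t in range(n):
        z = gamma * lam * z + low[t]
        A += np.outer(z, low[t] - gamma * low[t + 1])
        b += z * rewards[t]

    # Minimum-norm least-squares solve; tolerates a rank-deficient A.
    theta = np.linalg.lstsq(A, b, rcond=1e-10)[0]
    return theta, P


# Toy check: a single state revisited 50 times with reward 1 per step.
phi = np.ones((51, 20))
theta, P = lstd_lambda_rp(phi, np.ones(50), gamma=0.9, lam=0.5, d=3)
value = (P @ phi[0]) @ theta
```

In this toy case the true value is 1 / (1 - 0.9) = 10, and `value` should come out close to that. The sketch says nothing about the error bounds analyzed in the paper.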

