A Model-Based Exploration Policy in Deep Q-Network
Document type: Conference paper
Authors | Li SL(李帅龙)2,3,4; Zhang W(张伟)2,4 |
Publication date | 2021 |
Conference date | December 3-4, 2021 |
Conference location | Virtual, Chengdu, China |
Keywords | reinforcement learning; exploration and exploitation dilemma; model-based exploration method |
Pages | 336-343 |
English abstract | Reinforcement learning (RL) has been applied successfully in many domains, such as video games, where it has achieved remarkable performance, and DQN is a well-known RL algorithm. However, there are some disadvantages in practical applications, and the exploration and exploitation dilemma is one of them. To address this problem, common exploration strategies such as ε-greedy have arisen. Unfortunately, they are sample-inefficient and ineffective because of the uncertainty of later exploration. In this paper, we propose a model-based exploration method that learns the state transition model to explore. Using the training rules of machine learning, we can train the state transition model networks to improve exploration efficiency and sample efficiency. We compare our algorithm with ε-greedy on the Deep Q-Network (DQN) algorithm and apply it to Atari 2600 games. Our algorithm outperforms the decaying ε-greedy strategy when evaluated across 14 Atari games in the Arcade Learning Environment (ALE). |
Ownership ranking | 1 |
Proceedings | 2021 International Conference on Digital Society and Intelligent Systems, DSInS 2021 |
Proceedings publisher | IEEE |
Proceedings place of publication | New York |
Language | English |
ISBN | 978-1-6654-0630-7 |
Source URL | [http://ir.sia.cn/handle/173321/30505] |
Collection | Shenyang Institute of Automation_Space Automation Technology Research Laboratory |
Corresponding authors | Zhang W(张伟); Leng YQ(冷雨泉) |
Author affiliations | 1. Department of Mechanical and Energy Engineering, Southern University of Science and Technology, Shenzhen, China 2. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, China 3. University of Chinese Academy of Sciences, Shenyang, China 4. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China |
Recommended citation (GB/T 7714) | Li SL, Zhang W, Leng YQ, et al. A Model-Based Exploration Policy in Deep Q-Network[C]. In:. Virtual, Chengdu, China. December 3-4, 2021. |
Ingest method: OAI harvesting
Source: Shenyang Institute of Automation