Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Offline reinforcement learning with representations for actions

Offline reinforcement learning with representations for actions

Document type: Journal article


Authors	Lou, Xingzhou 4,5; Yin, Qiyue 4; Zhang, Junge 4; Yu, Chao 1; He, Zhaofeng 2; Cheng, Nengjie 3; Huang, Kaiqi 4
Journal	INFORMATION SCIENCES
Publication date	2022-09-01
Volume	610 Pages: 746-758
Keywords	Offline reinforcement learning; Action embedding
ISSN	0020-0255
DOI	10.1016/j.ins.2022.08.019
Corresponding author	Zhang, Junge
Abstract	Prevailing offline reinforcement learning (RL) methods limit the policy within the area supported by the offline dataset to avoid the distributional shift problem. But potential high-reward actions, which are out of the distribution of the dataset, are neglected in these methods. To address this issue, we propose a new method, which generalizes from the offline dataset to out-of-distribution (OOD) actions. Specifically, we design a novel action embedding model to help infer the effect of actions. As a result, our value function reaches a better generalization over the action space, and further alleviates the distributional shift caused by overestimation of OOD actions. Theoretically, we give an information-theoretic explanation of the improvement of the value function's generalization over the action space. Experiments on D4RL demonstrate that our model improves the performance compared to previous offline RL methods, especially when the experience in the offline dataset is good. We conduct further study and validate that the value function's generalization on OOD actions is improved, which reinforces the effectiveness of our proposed action embedding model. (c) 2022 Published by Elsevier Inc.
Funding projects	National Natural Science Foundation of China [61876181]; Beijing Nova Program of Science and Technology [Z191100001119043]; Youth Innovation Promotion Association, CAS
WOS research area	Computer Science
Language	English
WOS accession number	WOS:000860782400007
Publisher	ELSEVIER SCIENCE INC
Funding organizations	National Natural Science Foundation of China; Beijing Nova Program of Science and Technology; Youth Innovation Promotion Association, CAS
Source URL	[http://ir.ia.ac.cn/handle/173211/50376]
Collection	Intelligent Systems and Engineering
Author affiliations	1. Sun Yat Sen Univ, Guangzhou, Peoples R China; 2. Beijing Univ Posts & Telecommun, Beijing, Peoples R China; 3. Nanchang Univ, Nanchang, Peoples R China; 4. Chinese Acad Sci, Inst Automat, Beijing, Peoples R China; 5. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Recommended citation formats
GB/T 7714	Lou, Xingzhou,Yin, Qiyue,Zhang, Junge,et al. Offline reinforcement learning with representations for actions[J]. INFORMATION SCIENCES,2022,610:746-758.
APA	Lou, Xingzhou.,Yin, Qiyue.,Zhang, Junge.,Yu, Chao.,He, Zhaofeng.,...&Huang, Kaiqi.(2022).Offline reinforcement learning with representations for actions.INFORMATION SCIENCES,610,746-758.
MLA	Lou, Xingzhou,et al."Offline reinforcement learning with representations for actions".INFORMATION SCIENCES 610(2022):746-758.

Ingest method: OAI harvesting

Source: Institute of Automation
