Chinese Academy of Sciences Institutional Repositories Grid
Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning

Document type: Journal article

Authors: Liu MS (刘民颂)1; Li LT (李伦通)1; Hao S (郝帅)2; Zhu YH (朱圆恒)1; Zhao DB (赵冬斌)1
Journal: IEEE Transactions on Cognitive and Developmental Systems
Publication date: 2023-09
Volume: 15; Issue: 3; Pages: 1463-1473
DOI: 10.1109/TCDS.2022.3218940
Abstract

The difference between training and testing environments is a huge challenge to generalizing reinforcement learning (RL) algorithms. We propose a soft contrastive learning with a coarser approximate Q-irrelevance abstraction for RL (SCQRL) to increase RL generalization. Specifically, we specify the coarser approximate Q-irrelevance abstraction as the feature of the state with a theoretical analysis for better generalization ability. We construct a positive and negative sample selection mechanism based on the Q value for contrastive learning to achieve efficient representation learning. Considering the selection error of positive and negative samples, we design soft contrastive learning and combine it with RL in the form of an auxiliary task to propose SCQRL. The generalization experiments on several Procgen environments demonstrate that SCQRL outperforms the excellent generalized RL algorithm.
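
The abstract does not give the loss itself, so the following is only an illustrative sketch of how a Q-value-based, "soft" contrastive objective could look. It is not the authors' SCQRL implementation. Everything here is an assumption: the function name `soft_contrastive_loss`, the temperatures `tau_z` and `tau_q`, the use of Euclidean distance between Q-value vectors to decide how "positive" a pair is, and the InfoNCE-style cross-entropy form. Instead of hard positive/negative labels, each pair gets a soft target weight, which is one plausible way to tolerate selection errors.

```python
import numpy as np

def log_softmax(x):
    # Row-wise, numerically stable log-softmax.
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))

def soft_contrastive_loss(z, q, tau_z=0.1, tau_q=0.1):
    """Hypothetical soft contrastive loss (an assumption, not the paper's formula).

    z: (N, d) state embeddings; q: (N, A) Q-value vectors for the same states.
    States with similar Q-values receive a high soft "positive" weight.
    """
    n = z.shape[0]
    mask = np.eye(n, dtype=bool)  # used to exclude self-pairs

    # Cosine similarity between embeddings, scaled by a temperature.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = np.where(mask, -1e9, z @ z.T / tau_z)

    # Soft targets: a smaller Q-value distance gives a larger target weight.
    q_dist = np.linalg.norm(q[:, None, :] - q[None, :, :], axis=-1)
    targets = np.exp(log_softmax(np.where(mask, -1e9, -q_dist / tau_q)))

    # Cross-entropy between the soft targets and the embedding similarities.
    return float(-(targets * log_softmax(logits)).sum(axis=1).mean())

# Toy data: states 0,1 share one Q profile and states 2,3 share another.
q = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
z_good = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # matches Q grouping
z_bad = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])   # mixes the groups

loss_good = soft_contrastive_loss(z_good, q)
loss_bad = soft_contrastive_loss(z_bad, q)
```

In this toy setup the embedding that groups states by Q-value gets the lower loss. Used as an auxiliary task in the sense the abstract describes, such a term would be added to the RL objective with some weighting coefficient, which is likewise an assumption here.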

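Source URL: http://ir.ia.ac.cn/handle/173211/57521
Collection: State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning
Corresponding author: Zhu YH (朱圆恒)
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. Beihang University
Recommended citation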
GB/T 7714
Liu MS,Li LT,Hao S,et al. Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning[J]. IEEE Transactions on Cognitive and Developmental Systems,2023,15(3):1463 - 1473.
APA Liu MS,Li LT,Hao S,Zhu YH,&Zhao DB.(2023).Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning.IEEE Transactions on Cognitive and Developmental Systems,15(3),1463 - 1473.
MLA Liu MS,et al."Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning".IEEE Transactions on Cognitive and Developmental Systems 15.3(2023):1463 - 1473.

Deposit method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.