中国科学院机构知识库网格
Chinese Academy of Sciences Institutional Repositories Grid
Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning

Document type: Conference paper

Authors: Zhang TL (张天乐)1,2; Liu Z (刘振)1,2; Wu SG (吴士广)1,2; Pu ZQ (蒲志强)1,2; Yi JQ (易建强)1,2
Publication date: 2022
Conference date: 18-23 July 2022
Conference location: Online
Abstract (English)

In this paper, we propose a novel Intrinsic Reward method with Peer Incentives (IRPI) to promote direct inter-agent interactions and implicitly address the credit assignment problem in cooperative multi-agent reinforcement learning (MARL). IRPI builds mutual incentives between agents from their causal effects on one another, enabling more advanced cooperation. Specifically, a new intrinsic reward mechanism is constructed that equips each agent with the ability to reward other agents according to the causal effect between them. The mechanism is implemented as a neural network and learned from the causal effect between the agents. Counterfactual reasoning is used to infer this causal effect from the joint action-state value function, and the quality of the effect is then assessed using the individual state value function. Simulation results in StarCraft II Micromanagement demonstrate that the proposed IRPI enhances cooperation among the RL agents and achieves better performance than some state-of-the-art MARL methods in various cooperative multi-agent tasks.
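The abstract does not give the formulas behind the counterfactual step, so the following is only a minimal sketch of one common way to measure one agent's causal effect from a joint action-value function: compare the value of the joint action actually taken with the average value obtained when that agent's action is swapped for each alternative. The function names, the uniform averaging over alternatives, and the toy value table are all assumptions made for illustration and are not taken from the paper.

```python
def counterfactual_effect(q_joint, state, joint_action, j, n_actions):
    """Return Q(s, a) minus the mean Q over agent j's alternative actions.

    Assumption: alternatives are averaged uniformly; the paper may weight
    them differently (for example, by agent j's policy).
    """
    actual = q_joint(state, joint_action)
    baseline = 0.0
    for alt in range(n_actions):
        # Replace only agent j's action, keeping the other agents' actions fixed.
        cf_action = joint_action[:j] + (alt,) + joint_action[j + 1:]
        baseline += q_joint(state, cf_action)
    return actual - baseline / n_actions


def toy_q(state, joint_action):
    # Hypothetical joint value: pays 1.0 only when both agents pick action 1.
    return 1.0 if joint_action == (1, 1) else 0.0


# Effect of agent 1's choice when agent 0 plays action 1.
effect_coop = counterfactual_effect(toy_q, None, (1, 1), j=1, n_actions=2)
effect_defect = counterfactual_effect(toy_q, None, (1, 0), j=1, n_actions=2)
```

In this toy setting the cooperative choice gets a positive effect and the non-cooperative one a negative effect, which is the kind of signal a peer could turn into an incentive. How IRPI maps such an effect to an intrinsic reward, and how the individual state value function is used to judge its quality, is not specified in the abstract and is not modeled here.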

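Proceedings publisher: IEEE
Language: English
Source URL: [http://ir.ia.ac.cn/handle/173211/51961]
Collection: Integrated Information System Research Center_Aircraft Intelligent Technology
Corresponding author: Liu Z (刘振)
Author affiliations: 1. School of Artificial Intelligence, University of Chinese Academy of Sciences
2. Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714
Zhang TL, Liu Z, Wu SG, et al. Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning[C]. In: . Online. 18-23 July 2022.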

Ingestion method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.