Multi-Agent Reinforcement Learning Based on Clustering in Two-Player Games
Document type: Conference paper
Authors | Li WF (李伟凡)
Publication date | 2019-12
Conference date | 2019-12-6
Conference location | Xiamen, China
Keywords | reinforcement learning; unsupervised clustering; matrix game
Abstract | Non-stationary environments are common in the real world, including adversarial settings and multi-agent problems; the multi-agent setting is a typical non-stationary environment. Each agent sharing the environment must learn an efficient interaction policy to maximize its expected reward. Independent reinforcement learning (InRL) is the simplest approach, in which each agent treats the other agents as part of the environment. In this paper, we present Max-Mean-Learning-Win-or-Learn-Fast (MML-WoLF), an independent on-policy learning algorithm based on reinforcement clustering. A variational autoencoder method based on reinforcement learning is proposed to extract features for unsupervised clustering. From the clustering results, MML-WoLF uses statistics and the dominated factor to compute the values of the states belonging to each cluster, and the agent's policy is iteratively updated from these values. We apply the algorithm to multi-agent problems including matrix games, grid world, and a continuous-world game. The clustering results reveal the distribution of strategies under the agent's current policy, and the experimental results suggest that our method significantly improves average performance over other independent learning algorithms in multi-agent problems.
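The record does not include the paper's algorithm details, so as orientation only, below is a minimal sketch of the classic WoLF-PHC update (Bowling and Veloso, 2002), the "Win-or-Learn-Fast" building block that MML-WoLF extends with clustering-based value estimates, run in self-play on matching pennies. The class name, hyperparameters, and game are illustrative assumptions, not details from the paper.

```python
import numpy as np

class WoLFPHC:
    """Stateless WoLF-PHC learner for a small matrix game (illustrative sketch)."""
    def __init__(self, n_actions, alpha=0.1, d_win=0.01, d_lose=0.04, seed=0):
        self.n = n_actions
        self.alpha, self.d_win, self.d_lose = alpha, d_win, d_lose  # assumed values
        self.Q = np.zeros(n_actions)                      # action values
        self.pi = np.full(n_actions, 1.0 / n_actions)     # current mixed policy
        self.avg_pi = np.full(n_actions, 1.0 / n_actions) # running average policy
        self.t = 0
        self.rng = np.random.default_rng(seed)

    def act(self):
        return self.rng.choice(self.n, p=self.pi)

    def update(self, a, r):
        self.Q[a] += self.alpha * (r - self.Q[a])          # stateless Q-update
        self.t += 1
        self.avg_pi += (self.pi - self.avg_pi) / self.t    # incremental average
        # Win-or-learn-fast: small step when the current policy beats its average.
        winning = self.pi @ self.Q > self.avg_pi @ self.Q
        delta = self.d_win if winning else self.d_lose
        greedy = int(np.argmax(self.Q))
        for act in range(self.n):                          # hill-climb toward greedy
            step = delta if act == greedy else -delta / (self.n - 1)
            self.pi[act] = np.clip(self.pi[act] + step, 0.0, 1.0)
        self.pi /= self.pi.sum()                           # keep a valid distribution

# Matching pennies, zero-sum: row gets +1 on a match, column gets the negative.
payoff = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])
row, col = WoLFPHC(2, seed=1), WoLFPHC(2, seed=2)
for _ in range(50000):
    a, b = row.act(), col.act()
    row.update(a, payoff[a, b])
    col.update(b, -payoff[a, b])
print("row policy:", row.pi, " col policy:", col.pi)
```

In self-play on this zero-sum game, both policies should settle near the mixed equilibrium (0.5, 0.5); a losing agent moves with the larger step d_lose, which is what "learn fast when losing" refers to. MML-WoLF, per the abstract, replaces the per-state value estimate with cluster-level statistics over VAE features.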
Proceedings publisher | IEEE
Source URL | http://ir.ia.ac.cn/handle/173211/52212
Collection | State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning
Corresponding author | Zhu YH (朱圆恒)
Affiliation | State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Recommended citation (GB/T 7714) | Li WF, Zhu YH, Zhao DB. Multi-Agent Reinforcement Learning Based on Clustering in Two-Player Games[C]. Xiamen, China, 2019-12-6.
Ingest method: OAI harvesting
Source: Institute of Automation