Multi-Agent Reinforcement Learning Based on Clustering in Two-Player Games
Document type: Conference paper
Authors | Li WF (李伟凡)
Publication date | 2019-12
Conference date | 2019-12-6
Conference location | Xiamen, China
Keywords | reinforcement learning; unsupervised clustering; matrix game
Abstract | Non-stationary environments are common in the real world, including adversarial settings and multi-agent problems; the multi-agent setting is a typical non-stationary environment. Each agent sharing the environment must learn an efficient interaction policy to maximize its expected reward. Independent reinforcement learning (InRL) is the simplest approach, in which each agent treats the other agents as part of the environment. In this paper, we present Max-Mean-Learning-Win-or-Learn-Fast (MML-WoLF), an independent on-policy learning algorithm based on reinforcement clustering. A variational autoencoder method based on reinforcement learning is proposed to extract features for unsupervised clustering. From the clustering results, MML-WoLF uses statistics and the dominated factor to compute the values of the states belonging to each cluster, and the agent's policy is iteratively updated from these values. We apply the algorithm to multi-agent problems including matrix games, grid world, and a continuous-world game. The clustering results reveal the distribution of strategies under the agent's current policy, and the experimental results suggest that our method significantly improves average performance over other independent learning algorithms in multi-agent problems.
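The record does not include the paper's algorithm details, so as orientation only, below is a minimal sketch of the classic WoLF-PHC update (Bowling and Veloso, 2002), the "Win-or-Learn-Fast" building block that MML-WoLF extends with clustering-based value estimates, run in self-play on matching pennies. The class name, hyperparameters, and game are illustrative assumptions, not details from the paper.

```python
import numpy as np

class WoLFPHC:
    """Stateless WoLF-PHC learner for a small matrix game (illustrative sketch)."""
    def __init__(self, n_actions, alpha=0.1, d_win=0.01, d_lose=0.04, seed=0):
        self.n = n_actions
        self.alpha, self.d_win, self.d_lose = alpha, d_win, d_lose  # assumed values
        self.Q = np.zeros(n_actions)                      # action values
        self.pi = np.full(n_actions, 1.0 / n_actions)     # current mixed policy
        self.avg_pi = np.full(n_actions, 1.0 / n_actions) # running average policy
        self.t = 0
        self.rng = np.random.default_rng(seed)

    def act(self):
        return self.rng.choice(self.n, p=self.pi)

    def update(self, a, r):
        self.Q[a] += self.alpha * (r - self.Q[a])          # stateless Q-update
        self.t += 1
        self.avg_pi += (self.pi - self.avg_pi) / self.t    # incremental average
        # Win-or-learn-fast: small step when the current policy beats its average.
        winning = self.pi @ self.Q > self.avg_pi @ self.Q
        delta = self.d_win if winning else self.d_lose
        greedy = int(np.argmax(self.Q))
        for act in range(self.n):                          # hill-climb toward greedy
            step = delta if act == greedy else -delta / (self.n - 1)
            self.pi[act] = np.clip(self.pi[act] + step, 0.0, 1.0)
        self.pi /= self.pi.sum()                           # keep a valid distribution

# Matching pennies, zero-sum: row gets +1 on a match, column gets the negative.
payoff = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])
row, col = WoLFPHC(2, seed=1), WoLFPHC(2, seed=2)
for _ in range(50000):
    a, b = row.act(), col.act()
    row.update(a, payoff[a, b])
    col.update(b, -payoff[a, b])
print("row policy:", row.pi, " col policy:", col.pi)
```

In self-play on this zero-sum game, both policies should settle near the mixed equilibrium (0.5, 0.5); a losing agent moves with the larger step d_lose, which is what "learn fast when losing" refers to. MML-WoLF, per the abstract, replaces the per-state value estimate with cluster-level statistics over VAE features.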
Proceedings publisher | IEEE
Source URL | http://ir.ia.ac.cn/handle/173211/52212
Collection | State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning
Corresponding author | Zhu YH (朱圆恒)
Affiliation | State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Recommended citation (GB/T 7714) | Li WF, Zhu YH, Zhao DB. Multi-Agent Reinforcement Learning Based on Clustering in Two-Player Games[C]. Xiamen, China, 2019-12-6.
Ingest method: OAI harvesting
Source: Institute of Automation