Learning Superior Cooperative Policy in Competitive Multi-team Reinforcement Learning
Document Type: Conference Paper
Author | Qingxu Fu1,2
Publication Date | 2023-06
Conference Date | 2023-06
Conference Location | Gold Coast, Australia
Abstract | Multi-agent Reinforcement Learning (MARL) has become a powerful tool for addressing multi-agent challenges. Existing studies have explored numerous models that use MARL to solve single-team cooperation (or competition) problems and adversarial problems against opponents controlled by static knowledge-based policies. However, most studies in the literature ignore adversarial multi-team problems involving dynamically evolving opponents. We investigate adversarial multi-team problems in which all participating teams use MARL learners to learn policies against each other. This study achieves two objectives. First, we design an adversarial team-versus-team learning framework that generates cooperative multi-agent policies to compete against opponents, without preprogrammed opponents or partners and without any supervision. Second, we explore the key factors for achieving win-rate superiority during dynamic competitions. We then put forward a novel FeedBack MARL (FBMARL) algorithm that takes advantage of feedback loops to adjust optimizer hyper-parameters based on real-time game statistics. Finally, the effectiveness of our FBMARL model is tested in a benchmark environment named Multi-Team Decentralized Collective Assault (MT-DCA). The results demonstrate that our feedback MARL model achieves superior performance over baseline competitor MARL learners in 2-team and 3-team dynamic competitions.
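The abstract's central mechanism is a feedback loop that tunes optimizer hyper-parameters from real-time game statistics, but this record does not specify which hyper-parameters or what feedback law FBMARL uses. The Python sketch below is therefore only an illustrative assumption of that idea: it scales a PyTorch optimizer's learning rate from a running win-rate estimate. The class name `WinRateFeedback` and the specific feedback rule are hypothetical, not taken from the paper.

```python
import torch

# Hedged sketch of a feedback loop in the spirit of FBMARL: the abstract says
# optimizer hyper-parameters are adjusted from real-time game statistics; the
# exact rule is not given here, so the one below is an assumed placeholder.

class WinRateFeedback:
    """Scale an optimizer's learning rate from a running win-rate estimate."""

    def __init__(self, optimizer, base_lr=3e-4, smoothing=0.9):
        self.optimizer = optimizer
        self.base_lr = base_lr
        self.smoothing = smoothing
        self.win_rate = 0.5  # neutral prior before any games are observed

    def update(self, won_last_game: bool):
        # Exponential moving average of the team's win rate (a real-time
        # game statistic in the abstract's terms).
        self.win_rate = (self.smoothing * self.win_rate
                         + (1.0 - self.smoothing) * float(won_last_game))
        # Assumed feedback rule: learn faster while losing, slower while
        # winning, so a trailing team can catch up in a dynamic competition.
        scale = 1.0 + (0.5 - self.win_rate)
        for group in self.optimizer.param_groups:
            group["lr"] = self.base_lr * scale


policy = torch.nn.Linear(8, 4)        # stand-in for one team's policy network
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
feedback = WinRateFeedback(opt)
feedback.update(won_last_game=False)  # feed each game's outcome back in
print(opt.param_groups[0]["lr"])      # lr rises above base while losing
```

In a multi-team setting, each MARL learner would hold its own `WinRateFeedback` instance and call `update` after every game, so the optimizer hyper-parameters of all competing teams evolve with the match statistics rather than staying fixed.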
Source URL | http://ir.ia.ac.cn/handle/173211/57225
Collection | Research Center for Integrated Information Systems_Aircraft Intelligence Technology
Author Affiliations | 1. Institute of Automation, Chinese Academy of Sciences; 2. University of Chinese Academy of Sciences
Recommended Citation (GB/T 7714) | Qingxu Fu, Tenghai Qiu, Zhiqiang Pu, et al. Learning Superior Cooperative Policy in Competitive Multi-team Reinforcement Learning[C]. Gold Coast, Australia, 2023-06.
Deposit Method: OAI Harvesting
Source: Institute of Automation