Chinese Academy of Sciences Institutional Repositories Grid
Learning in Bi-Level Markov Games

Document type: Conference paper

Authors: Meng Linghui1,2; Ruan Jingqing1,2; Xing Dengpeng1; Xu Bo1,2
Publication date: 2022-07
Conference dates: 2022.7.18-2022.7.23
Conference venue: Padua, Italy
Abstract

Although multi-agent reinforcement learning (MARL) has demonstrated remarkable progress in tackling sophisticated cooperative tasks, the assumption that agents take simultaneous actions still limits the applicability of MARL to many real-world problems. In this work, we relax this assumption by proposing the framework of the bi-level Markov game (BMG). BMG breaks simultaneity by assigning the two players a leader-follower relationship, in which the leader considers the policy of the follower, who takes the best response to the leader's actions. We propose two provably convergent algorithms to solve BMG: BMG-1 and BMG-2. The former uses standard Q-learning, while the latter avoids solving the local Stackelberg equilibrium required in BMG-1 by using a further two-step transition to estimate the state value. For both methods, we consider temporal difference learning techniques with both tabular and neural network representations. To verify the effectiveness of our BMG framework, we test on a series of games that existing MARL solvers find challenging: Seeker, Cooperative Navigation, and Football. Experimental results show that our BMG methods achieve competitive advantages in terms of better performance and lower variance.
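The bi-level structure the abstract describes, a leader whose value estimate anticipates the follower's best response, with both levels trained by standard Q-learning as in BMG-1, can be sketched on a toy single-state cooperative game. The payoff matrix, hyperparameters, and epsilon-greedy exploration below are illustrative assumptions for this sketch, not the authors' implementation or environments.

```python
import numpy as np

# Hypothetical 2x2 cooperative game: both players receive the same payoff.
# Payoffs are chosen so that the coordinated joint action (1, 1) is optimal.
PAYOFF = np.array([[1.0, 0.0],
                   [0.0, 2.0]])  # rows: leader action, cols: follower action

rng = np.random.default_rng(0)
alpha, eps, episodes = 0.1, 0.2, 5000

q_leader = np.zeros(2)          # leader's value of its own actions
q_follower = np.zeros((2, 2))   # follower's value, conditioned on leader action

for _ in range(episodes):
    # Leader acts first (epsilon-greedy over its anticipatory Q-values).
    a_l = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(q_leader))
    # Follower observes the leader's action and best-responds (epsilon-greedy).
    a_f = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(q_follower[a_l]))
    r = PAYOFF[a_l, a_f]  # shared reward in this cooperative toy game
    # Standard Q-learning updates; single state, so there is no bootstrap term.
    q_follower[a_l, a_f] += alpha * (r - q_follower[a_l, a_f])
    # Bi-level step: the leader evaluates its action under the follower's
    # current best response, rather than the joint action actually taken.
    best_response_value = np.max(q_follower[a_l])
    q_leader[a_l] += alpha * (best_response_value - q_leader[a_l])

print(int(np.argmax(q_leader)), int(np.argmax(q_follower[int(np.argmax(q_leader))])))
```

The key difference from independent Q-learning is the leader's update target: it backs up the follower's best-response value, so the leader's policy accounts for how the follower will react to each of its actions.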

Source URL: [http://ir.ia.ac.cn/handle/173211/57335]
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Affiliations:
1.Institute of Automation, Chinese Academy of Sciences
2.School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended citation
GB/T 7714
Meng Linghui, Ruan Jingqing, Xing Dengpeng, et al. Learning in Bi-Level Markov Games[C]. Padua, Italy, 2022.7.18-2022.7.23.

Ingestion method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.