Chinese Academy of Sciences Institutional Repositories Grid

Clique-based cooperative multiagent reinforcement learning using factor graphs

Document type: Journal article


Authors	Zhang, Zhen (1); Zhao DB (赵冬斌) (2)
Journal	IEEE/CAA Journal of Automatica Sinica
Publication date	2015
Volume	3; Issue: 1; Pages: 248-256
Keywords	Reinforcement Learning; Factor Graphs
Abstract	In this paper, we propose a clique-based sparse reinforcement learning (RL) algorithm for solving cooperative tasks. The aim is to speed up learning relative to the original sparse RL algorithm and to make it applicable to tasks decomposed in a more general manner. First, a transition function is estimated and used to update the Q-value function, which greatly reduces the learning time. Second, agents are divided into cliques, each of which is responsible only for a specific subtask. In this way, the global Q-value function is decomposed into the sum of several simpler local Q-value functions. This decomposition is expressed by a factor graph and exploited by the general max-plus algorithm to obtain the greedy joint action. Experimental results show that the proposed approach outperforms the other approaches it is compared against.
Source URL	[http://ir.ia.ac.cn/handle/173211/19321]
Collection	State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning
Author affiliations	1. Department of Electric Engineering, College of Automation Engineering, Qingdao University 2. State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China
Recommended citation GB/T 7714	Zhang, Zhen, Zhao DB. Clique-based cooperative multiagent reinforcement learning using factor graphs[J]. IEEE/CAA Journal of Automatica Sinica, 2015, 3(1): 248-256.
APA	Zhang, Zhen, & Zhao DB. (2015). Clique-based cooperative multiagent reinforcement learning using factor graphs. IEEE/CAA Journal of Automatica Sinica, 3(1), 248-256.
MLA	Zhang, Zhen, et al. "Clique-based cooperative multiagent reinforcement learning using factor graphs". IEEE/CAA Journal of Automatica Sinica 3.1 (2015): 248-256.
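
The decomposition described in the abstract (a global Q-value written as a sum of local clique Q-values, with the greedy joint action found by max-plus on a factor graph) can be illustrated with a small sketch. This is not the authors' implementation: the three-agent chain, the two clique tables, and the function names `global_q`, `brute_force_greedy`, and `max_plus` are all hypothetical, chosen only to show the idea on a tree-structured factor graph where max-plus is exact.

```python
import itertools
import numpy as np

# Hypothetical example: 3 agents, 2 actions each, two cliques (0,1) and (1,2).
# Each factor is (scope, table); the table values are invented for illustration.
FACTORS = [
    ((0, 1), np.array([[1.0, 0.0], [0.0, 3.0]])),
    ((1, 2), np.array([[2.0, 0.5], [0.0, 1.5]])),
]
N_ACTIONS = [2, 2, 2]

def global_q(factors, joint_action):
    """Global Q-value as the sum of the local clique Q-values."""
    return sum(table[tuple(joint_action[i] for i in scope)]
               for scope, table in factors)

def brute_force_greedy(factors, n_actions):
    """Reference answer: enumerate every joint action."""
    joint_actions = itertools.product(*(range(n) for n in n_actions))
    return max(joint_actions, key=lambda a: global_q(factors, a))

def max_plus(factors, n_actions, n_iters=10):
    """Synchronous max-plus message passing on the factor graph."""
    edges = [(f, i) for f, (scope, _) in enumerate(factors) for i in scope]
    v2f = {e: np.zeros(n_actions[e[1]]) for e in edges}  # agent -> clique
    f2v = {e: np.zeros(n_actions[e[1]]) for e in edges}  # clique -> agent
    for _ in range(n_iters):
        # Clique -> agent: maximise table plus incoming messages over the others.
        for f, (scope, table) in enumerate(factors):
            for pos, i in enumerate(scope):
                total = table.astype(float)
                for pos2, j in enumerate(scope):
                    if j != i:
                        shape = [1] * len(scope)
                        shape[pos2] = -1
                        total = total + v2f[(f, j)].reshape(shape)
                other_axes = tuple(p for p in range(len(scope)) if p != pos)
                f2v[(f, i)] = total.max(axis=other_axes)
        # Agent -> clique: sum messages from the agent's other cliques.
        for (f, i) in edges:
            msg = sum((f2v[(g, j)] for (g, j) in edges if j == i and g != f),
                      np.zeros(n_actions[i]))
            v2f[(f, i)] = msg - msg.mean()  # constant shift keeps values bounded
    # Each agent picks the action that maximises the sum of its incoming messages.
    return tuple(
        int(np.argmax(sum((f2v[(g, j)] for (g, j) in edges if j == i),
                          np.zeros(n_actions[i]))))
        for i in range(len(n_actions))
    )

greedy = max_plus(FACTORS, N_ACTIONS)
```

On this toy chain the message-passing result can be compared with `brute_force_greedy(FACTORS, N_ACTIONS)`; on graphs with cycles, max-plus is in general only approximate, so such a check is a useful sanity test rather than a guarantee.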

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.