Chinese Academy of Sciences Institutional Repositories Grid
SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning

Document Type: Conference Paper

Authors: Zhiwei Xu 1,2; Yunpeng Bai 1,2; Dapeng Li 1,2; Bin Zhang 1,2; Guoliang Fan 1,2
Publication Date: 2022
Conference Date: May 9-13, 2022
Conference Venue: Auckland, New Zealand
DOI: 10.5555/3535850.3536006
Pages: 1400-1408
English Abstract

As one of the solutions to decentralized partially observable Markov decision process (Dec-POMDP) problems, the value decomposition method has achieved significant results recently. However, most value decomposition methods require the fully observable state of the environment during training, which is infeasible in scenarios where only incomplete and noisy observations are available. We therefore propose a novel value decomposition framework, named State Inference for value DEcomposition (SIDE), which eliminates the need to know the global state by simultaneously solving the two problems of optimal control and state inference. SIDE can be extended to any value decomposition method to tackle partially observable problems. By comparing the performance of different algorithms on StarCraft II micromanagement tasks, we verify that, even without access to the true state, SIDE can infer from past local observations a current state that contributes to the reinforcement learning process, and can even achieve superior results to many baselines in some complex scenarios.
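The abstract describes plugging an inferred state, built from past local observations, into a value decomposition method in place of the true global state. Below is a minimal PyTorch sketch of that idea, assuming a GRU encoder with a variational head whose sample conditions a QMIX-style monotonic mixer; all class names, shapes, and hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateInferenceEncoder(nn.Module):
    """Infers a latent state from the joint history of local observations."""
    def __init__(self, n_agents, obs_dim, latent_dim, hidden_dim=64):
        super().__init__()
        self.gru = nn.GRU(n_agents * obs_dim, hidden_dim, batch_first=True)
        # Variational-style head: mean and log-variance of the latent state.
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.log_var = nn.Linear(hidden_dim, latent_dim)

    def forward(self, joint_obs_history):
        # joint_obs_history: (batch, time, n_agents * obs_dim)
        h, _ = self.gru(joint_obs_history)
        h_last = h[:, -1]  # summary of all past local observations
        mu, log_var = self.mu(h_last), self.log_var(h_last)
        # Reparameterization trick keeps the sampled state differentiable.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)
        return z, mu, log_var

class MonotonicMixer(nn.Module):
    """QMIX-style mixer conditioned on the inferred state z rather than
    the environment's true global state."""
    def __init__(self, n_agents, latent_dim, embed_dim=32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        self.w1 = nn.Linear(latent_dim, n_agents * embed_dim)
        self.b1 = nn.Linear(latent_dim, embed_dim)
        self.w2 = nn.Linear(latent_dim, embed_dim)
        self.b2 = nn.Linear(latent_dim, 1)

    def forward(self, agent_qs, z):
        # agent_qs: (batch, n_agents); abs() enforces monotonic mixing.
        w1 = torch.abs(self.w1(z)).view(-1, self.n_agents, self.embed_dim)
        hidden = F.elu(agent_qs.unsqueeze(1).bmm(w1).squeeze(1) + self.b1(z))
        w2 = torch.abs(self.w2(z))
        return (hidden * w2).sum(dim=1, keepdim=True) + self.b2(z)
```

In a training loop, the mixer's TD loss would be combined with a regularization term on (mu, log_var) so the latent state stays informative; at no point does this setup require the environment's true state, which is the property the abstract emphasizes.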

Language: English
Source URL: http://ir.ia.ac.cn/handle/173211/56522
Collection: Fusion Innovation Center_Decision Command and Systems Intelligence
Corresponding Author: Guoliang Fan
Author Affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Zhiwei Xu, Yunpeng Bai, Dapeng Li, et al. SIDE: State Inference for Partially Observable Cooperative Multi-Agent Reinforcement Learning[C]. Auckland, New Zealand, May 9-13, 2022: 1400-1408.

Deposit Method: OAI Harvesting

Source: Institute of Automation

