Chinese Academy of Sciences Institutional Repositories Grid
Application of Reinforcement Learning to Task Allocation in Multi-agent Systems (强化学习在多智能体系统任务分配中的应用)

Document Type: Degree Thesis

Author: 王国权
Degree: Master's
Defense Date: 2004-06-29
Degree-Granting Institution: Shenyang Institute of Automation, Chinese Academy of Sciences
Place of Conferral: Shenyang Institute of Automation, Chinese Academy of Sciences
Supervisor: 于海斌
Keywords: multi-agent systems; task allocation; reinforcement learning; state aggregation; game theory
Alternative Title: Research on Task Allocation in Multi-agent Systems with Reinforcement Learning
Degree Discipline: Mechatronic Engineering
Chinese Abstract (translated): The theory and technology of agents and multi-agent systems are important research topics not only in distributed artificial intelligence and computer science, but also in manufacturing engineering, sociology, and other disciplines. Task allocation means mapping and partitioning a task of size N onto P individuals according to some strategy, in pursuit of a given objective. Traditional task allocation algorithms can be roughly divided into graph-theoretic strategies, integer programming methods, heuristics, and improved algorithms derived from them; their goal is generally to find the most reasonable sub-task assignment among many candidate matchings, so as to improve efficiency, reduce cost, allocate resources rationally, or maximize overall payoff.

The characteristics of multi-agent systems make task allocation within them quite different from the traditional setting. There is no central controller of the kind found in centralized systems, and each agent can observe only local rather than global information. During task allocation, each agent can select and compete for tasks only on the basis of its own information and that of some of its "neighbors". Task allocation in multi-agent systems has therefore become a research hotspot in distributed artificial intelligence.

Reinforcement learning is an important machine learning method: it can discover a system's optimal policy from uncertain environment rewards and realize online learning in dynamic environments, which makes it one of the foundations of agent intelligence.

Emphasizing competition among agents, this thesis studies the task allocation problem in multi-agent systems in depth. The main work is as follows:
- Analyzes the state of research on task allocation in multi-agent systems, summarizes several typical distributed task allocation algorithms, and points out their characteristics and applicable scope, emphasizing the feasibility of competition-based task allocation algorithms. Reviews the basic theory and methods of reinforcement learning, surveys the current state of distributed reinforcement learning research, and compares three typical methods, noting their strengths and weaknesses.
- Introduces coalition and coalition-formation theory for multi-agent systems; drawing on competition in social life, proposes a competition model for multi-agent task allocation in which agents accomplish system-wide task allocation through competition and negotiation, and gives the model's mathematical description.
- Introduces the basics of game theory; gives detailed algorithmic descriptions of task competition and resource competition; uses group reinforcement learning, a distributed learning method, to compute suboptimal payoffs; applies state aggregation to counter the explosion of state storage that group reinforcement learning suffers as the number of agents grows; presents simulation results verifying the convergence of the state aggregation method as well as simulation experiments on multi-agent task allocation, and analyzes the observed behavior.
- Finally, summarizes the work, points out its shortcomings, and looks ahead to future work.
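For concreteness only, the minimal sketch below, which is not drawn from the thesis, shows how a Q-learning task-selection rule combined with a simple state-aggregation mapping might look for a single agent. The bucketing in aggregate(), the reward signal, and the toy task loads are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Minimal sketch (not the thesis code): Q-learning for task selection with
# state aggregation. The raw state is the tuple of remaining task loads;
# aggregate() collapses it into coarse buckets so the Q-table does not grow
# explosively with the number of tasks/agents.

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration
N_TASKS = 4

def aggregate(raw_state):
    """Map a raw state (remaining load per task) to a coarse state:
    each load is bucketed as 0 (empty), 1 (low) or 2 (high)."""
    return tuple(0 if x == 0 else 1 if x <= 2 else 2 for x in raw_state)

Q = defaultdict(float)                  # Q[(aggregated_state, task)] -> value

def choose_task(raw_state):
    """Epsilon-greedy task choice over the aggregated state."""
    s = aggregate(raw_state)
    if random.random() < EPSILON:
        return random.randrange(N_TASKS)
    return max(range(N_TASKS), key=lambda a: Q[(s, a)])

def update(raw_state, task, reward, next_raw_state):
    """One-step Q-learning update on the aggregated states."""
    s, s_next = aggregate(raw_state), aggregate(next_raw_state)
    best_next = max(Q[(s_next, a)] for a in range(N_TASKS))
    Q[(s, task)] += ALPHA * (reward + GAMMA * best_next - Q[(s, task)])

# Toy episode: loads shrink as the agent works on the chosen task.
loads = [3, 1, 0, 2]
for _ in range(50):
    a = choose_task(loads)
    reward = 1.0 if loads[a] > 0 else -1.0   # assumed reward: useful work pays
    next_loads = loads[:]
    next_loads[a] = max(0, next_loads[a] - 1)
    update(loads, a, reward, next_loads)
    loads = next_loads if any(next_loads) else [3, 1, 0, 2]
```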
Call Number: TP18/W32/2004
English Abstract: The theory and technology of agents and multi-agent systems are not only a research focus of distributed artificial intelligence and computer science, but also a topic in manufacturing engineering and other subjects. The task allocation problem has been studied extensively for many years and has been proved to be NP-hard. Approaches to solving it can be classified into three categories: graph-theoretic approaches, integer programming approaches, and heuristic approaches. Traditionally, task allocation has been considered in a centralized, static setting: a single controller gathers and examines all relevant information about the current organization and mission, then decides and allocates tasks for every fellow agent. The characteristics of a multi-agent system make its task allocation quite different. There is no central controller in a MAS; agents are relatively independent and autonomous, and communication among agents is scarce and limited. Moreover, a solution to task allocation in a MAS must consider the dynamics of the environment and unexpected changes in agents. Agents in the organization must be able to negotiate without any fixed leader and find a solution that is relatively satisfactory to all agents. Reinforcement learning is an important machine learning approach: it can find the optimal system policy from uncertain environment rewards, allowing the system to learn online, and is therefore central to making agents intelligent. In this thesis, we propose a MAS model based on market competition to solve the task allocation problem. The main contributions are summarized as follows:
- State the task allocation problem and survey the existing algorithms for solving it, pointing out the merits and shortcomings of each; conclude that task allocation algorithms based on competition are promising.
- Introduce the basic theory and approaches of reinforcement learning; analyze in detail the current state and open problems of distributed reinforcement learning, and point out the merits and shortcomings of each approach.
- Introduce the basic theory of coalitions and coalition formation; propose a MAS model based on market competition and establish its mathematical model.
- Introduce the basics of game theory and Nash equilibrium; apply game theory to the task allocation problem to handle interacting agents; analyze the experimental results on the basis of statistical theory.
Our work is novel in that the profit of each agent is emphasized and the individual payoff is primary; on this precondition, each agent contributes to the group profit. Experiments prove the effectiveness of our approach. Finally, we conclude the thesis, point out its innovative parts, and outline future research.
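As a purely illustrative companion to the game-theoretic part of the abstract, the following sketch enumerates the pure-strategy Nash equilibria of a tiny two-agent task-competition game. The task values and the rule that contending agents split a task's value are assumptions for the example, not the thesis's model.

```python
import itertools

# Illustrative sketch (assumed, not from the thesis): find the pure-strategy
# Nash equilibria of a small task-competition game. Two agents each pick one
# of two tasks; if both pick the same task they split its value, otherwise
# each collects the full value of its own task.

TASK_VALUE = [6.0, 4.0]          # assumed values of task 0 and task 1

def payoff(choice_a, choice_b):
    """Return (payoff_a, payoff_b) for one joint choice of tasks."""
    if choice_a == choice_b:                     # contention: split the value
        return TASK_VALUE[choice_a] / 2, TASK_VALUE[choice_b] / 2
    return TASK_VALUE[choice_a], TASK_VALUE[choice_b]

def is_nash(choice_a, choice_b):
    """A joint choice is a Nash equilibrium if neither agent can gain by
    unilaterally switching to the other task."""
    pa, pb = payoff(choice_a, choice_b)
    best_a = max(payoff(alt, choice_b)[0] for alt in (0, 1))
    best_b = max(payoff(choice_a, alt)[1] for alt in (0, 1))
    return pa >= best_a and pb >= best_b

for a, b in itertools.product((0, 1), repeat=2):
    if is_nash(a, b):
        print(f"Nash equilibrium: agent A -> task {a}, agent B -> task {b}, "
              f"payoffs {payoff(a, b)}")
```

Running the sketch reports the two non-conflicting assignments (each agent on a different task) as the equilibria, which is the intuition behind resolving task competition through payoffs rather than a central controller.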
Language: Chinese
Date Available: 2012-08-29
Rights Ranking: 1
Classification Number: TP18
Source URL: [http://ir.sia.ac.cn/handle/173321/9471]
Collection: 沈阳自动化研究所_工业信息学研究室_工业控制系统研究室
Recommended Citation
GB/T 7714:
王国权. 强化学习在多智能体系统任务分配中的应用[D]. 中国科学院沈阳自动化研究所, 2004.

Ingestion Method: OAI harvesting

Source: Shenyang Institute of Automation

