Self-teaching adaptive dynamic programming for Gomoku
Document type: Journal article
Author | Zhao, Dongbin |
Journal | NEUROCOMPUTING |
Publication date | 2012-02-15 |
Volume | 78, Issue: 1, Pages: 23-29 |
Keywords | Gomoku; Reinforcement learning; Adaptive dynamic programming; Temporal difference learning; Neural network |
Abstract | In this paper adaptive dynamic programming (ADP) is applied to learn to play Gomoku. The critic network is used to evaluate board situations. The basic idea is to penalize the last move taken by the loser and reward the last move selected by the winner at the end of a game. The results show that the presented program is able to improve its performance by playing against itself and has approached the candidate level of a commercial Gomoku program called 5-star Gomoku. We also examined the influence of two methods for generating games: self-teaching and learning through watching two experts playing against each other and presented the comparison results and reasons. (C) 2011 Elsevier B.V. All rights reserved. |
WOS headings | Science & Technology ; Technology |
Category [WOS] | Computer Science, Artificial Intelligence |
Research area [WOS] | Computer Science |
Keywords [WOS] | PLAY |
Indexed in | SCI |
Language | English |
WOS accession number | WOS:000298528200004 |
Source URL | [http://ir.ia.ac.cn/handle/173211/3874] |
Collection | Institute of Automation_State Key Laboratory of Management and Control for Complex Systems_Intelligent Systems Team |
作者单位 | Chinese Acad Sci, Inst Automat, State Key Lab Intelligent Control & Management Co, Beijing 100190, Peoples R China |
Recommended citation GB/T 7714 | Zhao, Dongbin, Zhang, Zhen, Dai, Yujie. Self-teaching adaptive dynamic programming for Gomoku[J]. NEUROCOMPUTING, 2012, 78(1): 23-29. |
APA | Zhao, Dongbin, Zhang, Zhen, & Dai, Yujie. (2012). Self-teaching adaptive dynamic programming for Gomoku. NEUROCOMPUTING, 78(1), 23-29. |
MLA | Zhao, Dongbin, et al. "Self-teaching adaptive dynamic programming for Gomoku". NEUROCOMPUTING 78.1 (2012): 23-29. |
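
Illustrative note on the abstract: the idea of rewarding the winner's last move and penalizing the loser's last move can be sketched as a temporal difference update on a critic. The sketch below is not the authors' implementation. It assumes a hypothetical `LinearCritic` (a sigmoid over a weighted sum) in place of the paper's critic network, hand-made two-element feature vectors in place of real board encodings, terminal targets of 1 (win) and 0 (loss), and TD(0) bootstrapping for earlier positions. All names and parameter values are invented for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LinearCritic:
    """Hypothetical stand-in for a critic network: V(s) = sigmoid(w . s + b)."""

    def __init__(self, n_features, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
        self.b = 0.0
        self.lr = lr

    def value(self, features):
        return sigmoid(sum(w * x for w, x in zip(self.w, features)) + self.b)

    def update(self, features, target):
        # One gradient step on the squared error 0.5 * (target - V(s))^2.
        v = self.value(features)
        grad = (target - v) * v * (1.0 - v)
        for i, x in enumerate(features):
            self.w[i] += self.lr * grad * x
        self.b += self.lr * grad


def td_update_game(critic, states, final_reward, gamma=1.0):
    """Apply TD(0) updates along one player's sequence of positions.

    The last position is trained toward final_reward (assumed 1 for the
    winner, 0 for the loser); earlier positions are trained toward the
    discounted value of the position that followed them.
    """
    for t, state in enumerate(states):
        if t == len(states) - 1:
            target = final_reward
        else:
            target = gamma * critic.value(states[t + 1])
        critic.update(state, target)


# Toy data (assumed): feature vectors for the positions each side reached.
winner_states = [[0.5, 0.0], [1.0, 0.0]]
loser_states = [[0.0, 0.5], [0.0, 1.0]]

critic = LinearCritic(n_features=2)
for _ in range(200):  # replay the same toy game many times
    td_update_game(critic, winner_states, final_reward=1.0)
    td_update_game(critic, loser_states, final_reward=0.0)

v_win = critic.value(winner_states[-1])
v_lose = critic.value(loser_states[-1])
```

After training on this toy data, `v_win` should be higher than `v_lose`, meaning the critic rates the winner's final position above the loser's. In self-play, the two state sequences would come from games the program plays against itself rather than from fixed lists.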
Ingest method: OAI harvesting
Source: Institute of Automation