Infinitely Many-Armed Bandits with Budget Constraints
Document Type | Conference Paper |
Authors | Li HF (Li Haifang)1; Yingce Xia2 |
Date of Publication | 2016-11-28 |
Conference Date | 2017-02-04 to 2017-02-09 |
Conference Venue | San Francisco, California, USA |
Keywords | Multi-armed Bandits; Budget Constraints; Infinitely Many Arms |
Abstract | We study the infinitely many-armed bandit problem with budget constraints, where the number of arms can be infinite and much larger than the number of possible experiments. The player aims to maximize his/her total expected reward under a budget constraint B on the cost of pulling arms. We introduce a weak stochastic assumption on the ratio of the expected reward to the expected cost of a newly pulled arm, which characterizes its probability of being a near-optimal arm. We propose an algorithm named RCB-I for this new problem, in which the player first randomly picks K arms, where K is sub-linear in B, and then runs the algorithm for the finite-arm setting on the selected arms. Theoretical analysis shows that this simple algorithm enjoys sub-linear regret in terms of the budget B. We also provide a lower bound for any algorithm in the Bernoulli setting; the regret bound of RCB-I matches this lower bound up to a logarithmic factor. We further extend the algorithm to the any-budget setting (i.e., where the budget is unknown in advance) and provide a corresponding theoretical analysis. |
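Algorithm Sketch | The abstract describes a two-phase scheme: randomly select K arms, with K sub-linear in the budget B, and then run a finite-arm budgeted bandit algorithm on that pool. The Python sketch below illustrates this structure only; the choice K = ceil(B^beta), the UCB-style reward-to-cost index, and all names (`rcb_i_sketch`, `sample_arm`) are illustrative assumptions rather than details taken from the paper. |

```python
import math
import random


def rcb_i_sketch(budget, sample_arm, beta=0.5):
    """Minimal sketch of the two-phase scheme described in the abstract.

    Phase 1: randomly draw K arms, with K sub-linear in the budget B
    (here K = ceil(B**beta) with beta < 1; this exact choice is an
    assumption for illustration, not the paper's prescription).
    Phase 2: run a finite-arm budgeted bandit rule on the selected pool,
    pulling the arm with the highest empirical reward-to-cost ratio plus
    an exploration bonus (a UCB-of-ratio heuristic standing in for the
    paper's actual RCB-I index).

    `sample_arm()` is assumed to return a fresh arm: a zero-argument
    function that, when called (pulled), yields a (reward, cost) pair
    with cost > 0.
    """
    k = max(1, math.ceil(budget ** beta))    # sub-linear number of arms
    arms = [sample_arm() for _ in range(k)]  # Phase 1: random arm pool
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    spent, total_reward, t = 0.0, 0.0, 0

    while spent < budget:
        t += 1
        if t <= k:
            i = t - 1  # pull each arm once before trusting the index
        else:
            i = max(
                range(k),
                key=lambda j: reward_sum[j] / cost_sum[j]
                + math.sqrt(2.0 * math.log(t) / pulls[j]),
            )
        reward, cost = arms[i]()
        if spent + cost > budget:
            break  # simplification: stop when a pull would overrun B
        pulls[i] += 1
        reward_sum[i] += reward
        cost_sum[i] += cost
        spent += cost
        total_reward += reward
    return total_reward


# Synthetic usage: each fresh arm has a random Bernoulli reward mean and
# a fixed random cost (purely illustrative data, not from the paper).
def sample_arm():
    p, c = random.random(), 0.1 + 0.9 * random.random()
    return lambda: (float(random.random() < p), c)


print(rcb_i_sketch(budget=1000.0, sample_arm=sample_arm))
```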
Source URL | http://ir.ia.ac.cn/handle/173211/15670 |
Collection | Institute of Automation: Pre-2009 Achievements |
Corresponding Author | Li HF (Li Haifang) |
Affiliations | 1. Institute of Automation, Chinese Academy of Sciences; 2. University of Science and Technology of China |
Recommended Citation (GB/T 7714) | Li HF, Yingce Xia. Infinitely Many-Armed Bandits with Budget Constraints[C]. San Francisco, California, USA, 2017-02-04 to 2017-02-09. |
Deposit Method: OAI Harvesting
Source: Institute of Automation