Some reward–penalty rules for the multi-armed bandit problem which are asymptotically optimal
Published online by Cambridge University Press: 01 July 2016
Abstract
In the mathematical learning literature, reward–penalty rules have been studied in various decision-theoretic and game-theoretic contexts, including the multi-armed bandit problem. Here we propose an elaboration of Bather's randomised allocation indices which yields reward–penalty rules for the multi-armed bandit that are asymptotically optimal.
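The sketch below is a minimal illustration of a randomised allocation index in the spirit of Bather (1980), (1981): each arm's empirical success rate receives a random bonus whose scale decays with the number of times that arm has been pulled, so exploration never stops entirely but increasingly concentrates on the apparently best arm. The Bernoulli reward model, the exponential perturbation with constant c, and the name bather_style_index_rule are illustrative assumptions, not the specific rule or the reward–penalty elaboration proposed in this letter.

```python
import random


def bather_style_index_rule(n_arms, horizon, true_probs, c=1.0, seed=0):
    """Illustrative randomised allocation index for Bernoulli arms.

    At each step, pull the arm maximising
        p_hat_i + (c / (n_i + 1)) * E_i,
    where p_hat_i is the observed success rate of arm i, n_i is the
    number of times arm i has been pulled, and E_i is a fresh
    Exponential(1) draw.  The random bonus shrinks as an arm is
    sampled more, so the rule keeps sampling every arm while the
    proportion of pulls of inferior arms tends to zero.

    The exponential perturbation and the constant c are assumptions
    made for this sketch, not the form used in the letter.
    """
    rng = random.Random(seed)
    pulls = [0] * n_arms
    successes = [0] * n_arms
    for _ in range(horizon):
        # Randomised index: empirical mean plus a decaying random bonus.
        indices = [
            (successes[i] / pulls[i] if pulls[i] else 0.0)
            + (c / (pulls[i] + 1)) * rng.expovariate(1.0)
            for i in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda i: indices[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        pulls[arm] += 1
        successes[arm] += reward
    return pulls, successes


if __name__ == "__main__":
    pulls, successes = bather_style_index_rule(3, 5000, [0.3, 0.5, 0.7])
    print("pulls:", pulls)        # most pulls should go to the best arm
    print("successes:", successes)
```

Running the example, the arm with the highest success probability should receive the large majority of pulls as the horizon grows, which is the sense of asymptotic optimality at issue here.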
- Type: Letters to the Editor
- Copyright © Applied Probability Trust 1983
References
Bather, J. (1980) Randomised allocation of treatments in sequential trials. Adv. Appl. Prob. 12, 174–182.
Bather, J. (1981) Randomised allocation of treatments in sequential experiments (with discussion). J. R. Statist. Soc. B43, 265–292.
Glazebrook, K. D. (1980) On randomized dynamic allocation indices for the sequential design of experiments. J. R. Statist. Soc. B42, 342–346.
Meybodi, M. R. and Lakshmivarahan, S. (1982) ε-optimality of a general class of learning algorithms. In Proc. Conf. Mathematical Learning Models–Theory and Applications. To appear.