Hostname: page-component-78c5997874-mlc7c Total loading time: 0 Render date: 2024-11-10T13:55:17.212Z Has data issue: false hasContentIssue false

Bounds for discounted stochastic scheduling problems

Published online by Cambridge University Press:  14 July 2016

K. D. Glazebrook*
Affiliation:
University of Newcastle upon Tyne
*
Postal address: Department of Mathematics and Statistics, The University, Newcastle upon Tyne NE1 7RU, UK.

Abstract

Suppose that π is a policy for resource allocation in a stochastic environment and π ∗ is an optimal policy. Two existing procedures for policy evaluation are described and compared. Both of these evaluate π by means of upper bounds on R(π ∗) – R(π), the total reward lost when making resource allocations according to π rather than π∗. The bounds developed by these two methods are called Type 1 and Type 2. We demonstrate by example that neither of these procedures dominates the other in the sense of always yielding tighter bounds. A modification to Type 2 bounds is proposed resulting in an improved procedure which always dominates the Type 1 approach.

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1991 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

During the course of this research the author was supported by the National Research Council as a Senior Research Associate at the Department of Operations Research, Naval Postgraduate School, Monterey, CA 93943–5000, USA.

References

[1] Bather, J. A. (1981) Randomised allocation of treatments in sequential experiments (with discussion). J. R. Statist. Soc. B43, 265292.Google Scholar
[2] Benkherouf, L., Glazebrook, K. D. and Owen, R. W. (1991) Gittins indices and oil exploration.10.1111/j.2517-6161.1992.tb01877.xGoogle Scholar
[3] Bergman, S. W. and Gittins, J. C. (1985) Statistical Methods for Pharmaceutical Research Planning. Marcel Dekker, New York.Google Scholar
[4] Bruno, J. and Hofri, M. (1975) On scheduling chains of jobs on one processor with limited preemption. SIAM J. Comput. 4, 478490.10.1137/0204041Google Scholar
[5] Gittins, J. C. (1979) Bandit processes and dynamic allocation indices (with discussion). J. R. Statist. Soc. B41, 148177.Google Scholar
[6] Gittins, J. C. (1989) Multi-armed Bandit Allocation Indices. Wiley, New York.Google Scholar
[7] Gittins, J. C. and Glazebrook, K. D. (1977) On Bayesian models in stochastic scheduling. J. Appl. Prob. 14, 556565.Google Scholar
[8] Glazebrook, K. D. (1978) On the optimal allocation of two or more treatments in a controlled clinical trial. Biometrika 65, 335340.10.1093/biomet/65.2.335Google Scholar
[9] Glazebrook, K. D. (1982) On the evaluation of suboptimal strategies for families of alternative bandit processes. J. Appl. Prob. 19, 716722.10.2307/3213534Google Scholar
[10] Glazebrook, K. D. (1983) Methods for the evaluation of permutations as strategies in stochastic scheduling. Management Sci. 29, 11421155.10.1287/mnsc.29.10.1142Google Scholar
[11] Glazebrook, K. D. (1987) Sensitivity analysis for stochastic scheduling problems. Math. Operat. Res. 12, 205223.10.1287/moor.12.2.205Google Scholar
[12] Glazebrook, K. D. (1990) Procedures for the evaluation of strategies for resource allocation in a stochastic environment. J. Appl. Prob. 27, 215220.Google Scholar
[13] Katehakis, M. N. and Veinott, A. F. (1987) The multi-armed bandit problem: decomposition and computation. Math. Operat. Res. 12, 262268.10.1287/moor.12.2.262Google Scholar
[14] Ross, S. M. (1970) Applied Probability Models with Optimization Applications. Holden-Day, San Francisco.Google Scholar