The Computation of Average Optimal Policies in Denumerable State Markov Decision Chains

Linn I. Sennott

doi:10.2307/1427863

Published online by Cambridge University Press: 01 July 2016

Linn I. Sennott*
Affiliation: Illinois State University
* Postal address: Department of Mathematics 4520, Illinois State University, Normal, IL 61790-4520, USA. Email: sennott@math.ilstu.edu

Abstract

This paper studies the expected average cost control problem for discrete-time Markov decision processes with denumerably infinite state spaces. A sequence of finite state space truncations is defined such that the average costs and average optimal policies in the sequence converge to the optimal average cost and an optimal policy in the original process. The theory is illustrated with several examples from the control of discrete-time queueing systems. Numerical results are discussed.
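
The truncation idea can be illustrated numerically. The following is a minimal sketch, not the paper's algorithm or one of its examples. It assumes a hypothetical single-server discrete-time queue with Bernoulli arrivals and two service rates, truncates the state space to {0, ..., N} by folding any transition above N back onto N (one possible way to redistribute the excess probability, chosen here for simplicity), and solves each truncated problem by relative value iteration. All function names and parameter values are illustrative assumptions.

```python
def build_truncated_queue(n_max, p_arrival, service_probs, service_costs, holding_cost):
    """Transition rows and one-step costs for a truncated queue on states 0..n_max.

    Hypothetical model: one arrival per slot with probability p_arrival;
    under action a, a waiting customer completes service with probability
    service_probs[a]. Mass that would leave the truncation stays at n_max.
    """
    trans, cost = {}, {}
    for i in range(n_max + 1):
        for a, mu in enumerate(service_probs):
            mu_eff = mu if i > 0 else 0.0          # nothing to serve when empty
            up = p_arrival * (1.0 - mu_eff)        # arrival, no departure
            down = (1.0 - p_arrival) * mu_eff      # departure, no arrival
            row = {}
            j_up = min(i + 1, n_max)               # fold excess mass onto the boundary
            row[j_up] = row.get(j_up, 0.0) + up
            if i > 0:
                row[i - 1] = row.get(i - 1, 0.0) + down
            row[i] = row.get(i, 0.0) + (1.0 - up - down)
            trans[i, a] = row
            cost[i, a] = holding_cost * i + service_costs[a]
    return trans, cost


def relative_value_iteration(trans, cost, n_states, n_actions, tol=1e-9, max_iter=100_000):
    """Approximate the optimal average cost (gain) and a greedy stationary policy."""
    h = [0.0] * n_states                           # relative values, h[0] pinned to 0
    gain, policy = 0.0, [0] * n_states
    for _ in range(max_iter):
        q = [[cost[i, a] + sum(p * h[j] for j, p in trans[i, a].items())
              for a in range(n_actions)] for i in range(n_states)]
        v = [min(row) for row in q]
        policy = [row.index(min(row)) for row in q]
        diff = [vi - hi for vi, hi in zip(v, h)]
        lo, hi = min(diff), max(diff)              # lower and upper bounds on the gain
        gain = 0.5 * (lo + hi)
        h = [vi - v[0] for vi in v]                # renormalize at reference state 0
        if hi - lo < tol:
            break
    return gain, policy


if __name__ == "__main__":
    # Solve a sequence of growing truncations and watch the gain settle.
    for n_max in (5, 10, 20, 40):
        trans, cost = build_truncated_queue(n_max, 0.2, (0.5, 0.9), (0.0, 2.0), 1.0)
        gain, policy = relative_value_iteration(trans, cost, n_max + 1, 2)
        print(n_max, round(gain, 6), policy[:6])
```

Under these assumed parameters the computed gains should stabilize as N grows, which is the qualitative behaviour the abstract describes. The conditions under which the paper proves such convergence are not reproduced here.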

Keywords

CONTROLLED MARKOV PROCESSES; COMPUTATION OF AVERAGE COST OPTIMAL STATIONARY POLICIES; CONTROL OF DISCRETE-TIME QUEUES

MSC classification

Primary: 93E20: Optimal stochastic control

Secondary: 90B22: Queues and service; 60J10: Markov chains (discrete-time Markov processes on discrete state spaces)

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 29, Issue 1, March 1997, pp. 114–137

DOI: https://doi.org/10.2307/1427863
Copyright: © Applied Probability Trust 1997

Footnotes

This material is based upon work supported by the National Science Foundation under Grant No. ECS-9309154.
