
Sensitivity Analysis in Markov Decision Processes with Uncertain Reward Parameters

Published online by Cambridge University Press:  14 July 2016

Chin Hon Tan*
Affiliation:
University of Florida
Joseph C. Hartman*
Affiliation:
University of Florida
Postal address: Department of Industrial and Systems Engineering, University of Florida, 303 Weil Hall, PO Box 116595, Gainesville, FL 32611-6595, USA.

Abstract


Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be performed directly for a Markov decision process with uncertain reward parameters using the Bellman equations. In particular, we consider problems involving (i) a single stationary parameter, (ii) multiple stationary parameters, and (iii) multiple nonstationary parameters. We illustrate the applicability of this work through a capacitated stochastic lot-sizing problem.
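The single-stationary-parameter case (i) can be illustrated with a small numerical sketch. The two-state, two-action MDP below is purely hypothetical (the transition matrices, rewards, and the perturbed entry are illustrative choices, not taken from the paper): value iteration solves the Bellman equations for each value of an uncertain reward parameter θ, and sweeping θ reveals the range over which the optimal policy is unchanged.

```python
import numpy as np

# Hypothetical MDP: states {0, 1}, actions {0, 1}.
# P[a][s, s'] are transition probabilities under action a;
# r_base[a][s] are nominal one-step rewards. All values are illustrative.
P = [np.array([[0.9, 0.1], [0.4, 0.6]]),   # action 0
     np.array([[0.2, 0.8], [0.7, 0.3]])]   # action 1
r_base = [np.array([1.0, 0.0]),            # action 0
          np.array([0.0, 1.5])]            # action 1
gamma = 0.9                                # discount factor

def optimal_policy(theta, tol=1e-8):
    """Value iteration; theta perturbs the reward of action 1 in state 0.

    Returns the greedy policy (action chosen in each state) once the
    value function has converged.
    """
    r = [r_base[0].copy(), r_base[1].copy()]
    r[1][0] += theta                       # the single uncertain parameter
    V = np.zeros(2)
    while True:
        # Bellman update: Q[a, s] = r[a][s] + gamma * sum_s' P[a][s,s'] V[s']
        Q = np.array([r[a] + gamma * P[a] @ V for a in range(2)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return tuple(Q.argmax(axis=0))
        V = V_new

# Sweep theta to locate where the optimal policy changes: for small theta
# the base rewards dominate; for large theta the perturbed action wins.
for theta in [0.0, 0.5, 1.0, 1.5, 2.0]:
    print(theta, optimal_policy(theta))
```

Re-solving the MDP per parameter value, as above, is the brute-force baseline; the paper's contribution is to obtain such policy-invariance ranges directly from the Bellman equations without a full re-solve at every candidate value.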

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 2011 
