Published online by Cambridge University Press: 14 October 2015
We study the optimal buffer capacity K for the M/M/1/K queue under some standard cost and reward structures by comparing various Markov reward processes. Using explicit expressions for the deviation matrix of the underlying Markov chains, we find the bias-optimal value of K when two consecutive values of K tie as gain-optimal policies. We show that the bias-optimal value depends both on whether the reward is granted upon the arrival or the departure of a customer, and on the initial queue size. Moreover, we demonstrate that in some specific cases the optimal policy is threshold-based with respect to the initial queue size.
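The quantities named above can be illustrated with a minimal numerical sketch. It is not the paper's method: the paper uses explicit expressions for the deviation matrix, whereas this sketch computes it by matrix inversion from the standard identity D = (Π − Q)^{-1} − Π, where Q is the generator and Π has the stationary distribution in every row. The reward structure is an assumption made only for illustration: a reward R per admitted customer, modelled as a rate λR in states below K (on arrival) or μR in non-empty states (on departure), minus a holding cost c per customer per unit time. All function names are hypothetical.

```python
from fractions import Fraction


def mm1k_generator(lam, mu, K):
    """Generator Q of the M/M/1/K queue-length chain on states 0..K."""
    lam, mu = Fraction(lam), Fraction(mu)
    n = K + 1
    Q = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        if i < K:
            Q[i][i + 1] = lam      # arrival admitted (buffer not full)
        if i > 0:
            Q[i][i - 1] = mu       # service completion
        Q[i][i] = -sum(Q[i])       # rows of a generator sum to zero
    return Q


def stationary(lam, mu, K):
    """Stationary distribution: pi_i proportional to (lam/mu)**i."""
    rho = Fraction(lam) / Fraction(mu)
    weights = [rho ** i for i in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def inverse(A):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(A)
    M = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [x / pivot for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def deviation_matrix(lam, mu, K):
    """D = (Pi - Q)^{-1} - Pi, computed numerically (not in closed form)."""
    Q = mm1k_generator(lam, mu, K)
    pi = stationary(lam, mu, K)
    n = K + 1
    A_inv = inverse([[pi[j] - Q[i][j] for j in range(n)] for i in range(n)])
    return [[A_inv[i][j] - pi[j] for j in range(n)] for i in range(n)]


def reward_rates(lam, mu, K, R, c, on_arrival):
    """ASSUMED reward rates: R per customer, holding cost c per customer."""
    lam, mu = Fraction(lam), Fraction(mu)
    rates = []
    for i in range(K + 1):
        if on_arrival:
            income = lam * R if i < K else 0   # reward when admitted
        else:
            income = mu * R if i > 0 else 0    # reward when served
        rates.append(income - c * i)
    return rates


def gain_and_bias(lam, mu, K, reward):
    """Gain g = pi . r and bias vector h = D r (indexed by initial state)."""
    pi = stationary(lam, mu, K)
    D = deviation_matrix(lam, mu, K)
    gain = sum(p * r for p, r in zip(pi, reward))
    bias = [sum(D[i][j] * reward[j] for j in range(K + 1))
            for i in range(K + 1)]
    return gain, bias


if __name__ == "__main__":
    lam, mu, K, R, c = 1, 2, 3, 5, 1   # illustrative values only
    for on_arrival in (True, False):
        r = reward_rates(lam, mu, K, R, c, on_arrival)
        print(on_arrival, gain_and_bias(lam, mu, K, r))
```

Under these assumed rates the two reward schemes give the same gain, because the admission rate equals the departure rate in steady state, while their bias vectors differ by R(E[N] − i) for initial queue size i, where E[N] is the stationary mean queue length. This is one way to see why a tie in gain can be broken by the bias, and why the tie-break can depend on the initial state.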