A Certainty Equivalence Result in Team-Optimal Control of Mean-Field Coupled Markov Chains

Published 2 Dec 2020 in math.OC | (2012.01020v1)

Abstract: This paper studies a large number of homogeneous Markov decision processes whose transition probabilities and costs are coupled through the empirical distribution of states (also called the mean-field). The state of each process is not known to the others, so the information structure is fully decentralized. The objective is to minimize the average cost, defined as the empirical mean of the individual costs, and a sub-optimal solution is proposed for this problem. The solution does not depend on the number of processes, yet it converges to the optimal solution of the so-called mean-field sharing as the number of processes tends to infinity. Under some mild conditions, it is shown that the proposed decentralized solution converges at a rate proportional to the inverse square root of the number of processes. In general, finding this sub-optimal solution involves a non-smooth, non-convex optimization problem over an uncountable set. To overcome this difficulty, a combinatorial optimization problem is introduced that achieves the same rate of convergence.
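To make the terms in the abstract concrete, the following is a minimal Python sketch, not the paper's method. It computes the empirical distribution of states (the mean-field) and the average cost as the empirical mean of individual costs. It also estimates how far the empirical distribution of independently sampled states lies from the sampling distribution. That last part only illustrates the familiar law-of-large-numbers scaling of roughly one over the square root of the number of processes; the paper's convergence result concerns controlled, coupled chains and is not reproduced here. All function names, the cost function `congestion_cost`, and the two-state example are assumptions introduced for illustration.

```python
import random
from collections import Counter


def empirical_distribution(states, num_states):
    """Fraction of processes in each state 0..num_states-1 (the mean-field)."""
    counts = Counter(states)
    n = len(states)
    return [counts[s] / n for s in range(num_states)]


def average_cost(states, actions, per_step_cost, num_states):
    """Empirical mean of individual costs, each coupled through the mean-field."""
    m = empirical_distribution(states, num_states)
    total = sum(per_step_cost(x, u, m) for x, u in zip(states, actions))
    return total / len(states)


def mean_field_deviation(n, probs, trials=200, seed=0):
    """Monte Carlo estimate of E[max_s |m_n(s) - probs[s]|] for n i.i.d. draws.

    Assumption: states are sampled independently, which is a simplification
    used only to show the 1/sqrt(n) scaling of the empirical distribution.
    """
    rng = random.Random(seed)
    support = list(range(len(probs)))
    total = 0.0
    for _ in range(trials):
        states = rng.choices(support, weights=probs, k=n)
        m = empirical_distribution(states, len(probs))
        total += max(abs(mi - pi) for mi, pi in zip(m, probs))
    return total / trials


def congestion_cost(x, u, m):
    """Hypothetical cost: a process in state 1 pays the fraction in state 1."""
    return m[1] if x == 1 else 0.0


if __name__ == "__main__":
    # Four processes, two states; actions are unused by this toy cost.
    print(average_cost([0, 1, 1, 0], [0, 0, 0, 0], congestion_cost, 2))
    # Quadrupling n should roughly halve the estimated deviation.
    for n in (100, 400, 1600):
        print(n, mean_field_deviation(n, [0.5, 0.5]))
```

Running the script prints the average cost of the toy example and then the estimated deviation for increasing numbers of processes, which should shrink by about half each time the number of processes is quadrupled.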