A belief-state restless bandit model for treatment adherence: Whittle indexability via partial conservation laws

Published 11 Jan 2026 in math.OC | (2601.06976v1)

Abstract: We study capacity-constrained treatment-adherence outreach via a belief-state restless multi-armed bandit model where each patient is a partially observed two-state (adherent/nonadherent) Markov process and interventions induce reset-type belief dynamics. Using partial conservation laws (PCLs), we establish Whittle indexability of the single-patient problem and derive a closed-form Whittle (marginal productivity) index, together with closed-form reward/work performance metrics under threshold policies and an explicit optimal threshold map. This yields an analytic Lagrangian relaxation: the single-patient Lagrangian value is a piecewise-affine convex function of the intervention price, enabling efficient computation of multi-patient dual bounds and certified relative optimality gaps. We also analyze how the Whittle index depends on the lapse and spontaneous-recovery parameters, providing qualitative insights on intervention priorities. In a large-scale numerical study over heterogeneous two-type populations, we compare Whittle's index policy with a myopic index rule and simple baselines; while the myopic rule is highly competitive on most instances, Whittle's policy yields substantial gains in tight-capacity regimes with a fragile minority, reaching up to about $26\%$ higher reward and markedly smaller relative optimality gaps.
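To illustrate how a piecewise-affine convex single-patient Lagrangian value can lead to a multi-patient dual bound and a relative optimality gap, the following is a minimal sketch, not the paper's implementation. It assumes a per-period (average) capacity constraint and that each patient is summarized by a finite list of (reward, work) pairs, one per threshold policy, including a zero-work "never intervene" policy. The function names and all numbers are hypothetical and do not come from the paper, whose closed-form metrics are not reproduced here.

```python
from itertools import combinations

def lagrangian_value(pairs, price):
    """Single-patient Lagrangian value at a given intervention price.

    pairs: list of (reward, work) metrics, one per threshold policy
    (hypothetical inputs). The result is the upper envelope of affine
    functions of price, hence piecewise-affine and convex.
    """
    return max(reward - price * work for reward, work in pairs)

def dual_bound(patients, capacity):
    """Minimize capacity * price + sum of single-patient values over price >= 0.

    The objective is convex and piecewise-affine, so (assuming every patient
    has a zero-work policy, which keeps it bounded below) a minimizer lies at
    price 0 or at an intersection of two lines of some patient.
    Returns (bound, minimizing_price).
    """
    candidates = {0.0}
    for pairs in patients:
        for (r1, w1), (r2, w2) in combinations(pairs, 2):
            if w1 != w2:
                price = (r1 - r2) / (w1 - w2)
                if price > 0:
                    candidates.add(price)

    def objective(price):
        return capacity * price + sum(lagrangian_value(p, price) for p in patients)

    best_price = min(candidates, key=objective)
    return objective(best_price), best_price

def relative_gap(bound, policy_value):
    """Relative optimality gap of a feasible policy against the dual bound."""
    return (bound - policy_value) / bound

# Illustrative (made-up) metrics: (reward, work) for never / mid-threshold / always.
patient_a = [(0.50, 0.0), (0.70, 0.25), (0.80, 1.0)]
patient_b = [(0.40, 0.0), (0.65, 0.50), (0.75, 1.0)]
bound, price = dual_bound([patient_a, patient_b], capacity=0.5)
gap = relative_gap(bound, policy_value=1.10)  # 1.10 is a hypothetical policy value
```

Evaluating only the breakpoints avoids a numerical line search, which is the practical benefit of having the single-patient value in piecewise-affine form. A discounted-reward formulation would scale the capacity term differently; that variant is not shown here.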