Implicit Numerical Scheme for HJBQVI

Updated 27 January 2026

The paper introduces an implicit backward‐Euler discretization for HJBQVI, achieving numerical stability without restrictive time‐step limitations.
It combines monotone finite-difference/finite-element approximations with structured policy iteration to solve coupled nonlinear systems in stochastic and impulse control.
Applications in market-making and optimal harvesting showcase fast convergence, reliable error bounds, and effective impulse control decision-making.

An implicit numerical scheme for the Hamilton-Jacobi-Bellman quasi-variational inequality (HJBQVI) is a discretization technique for combined stochastic and impulse control problems, which targets numerical stability and convergence to the viscosity solution without restrictive time-step limitations. This approach has been developed and refined in the context of optimal stochastic control with impulses, with recent applications including optimal market-making with alpha signals and resource management problems. Implicit time discretization, when combined with monotone finite difference or finite element approximations and structured policy iteration, is now a central tool for numerically solving HJBQVIs (Meteykin, 24 Dec 2025, Ieda, 2013, Azimzadeh et al., 2017, Reisinger et al., 2018).

1. Formulation of the HJBQVI

The canonical HJBQVI for a combined stochastic and impulse control problem is expressed as: $\min\Big\{ - \partial_t v(t,\alpha,q) - \sup_{l^a, l^b \in \{0,1\}} \big[L^{l^a, l^b}v(t,\alpha,q) + \tilde f^{l^a, l^b}(t, \alpha, q)\big], \; v(t,\alpha,q) - \mathcal{M}v(t,\alpha,q) \Big\} = 0$ with terminal condition $v(T,\alpha,q) = \tilde g(\alpha, q)$. The local generator $L^{l^a, l^b}$ incorporates controlled diffusion and pure-jump effects, while the nonlocal (impulse) operator $\mathcal{M}$ encodes discrete, instantaneous changes due to impulse controls. The variational inequality reflects the trade-off: at each point, the system either continues under the local stochastic control or executes an impulse if the latter yields a higher value (Meteykin, 24 Dec 2025, Ieda, 2013).
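
The trade-off can be made explicit by reading the QVI as a complementarity system. As a sketch, write $\mathcal{H}v = \sup_{l^a, l^b \in \{0,1\}} \big[L^{l^a, l^b}v + \tilde f^{l^a, l^b}\big]$ (a shorthand introduced here for illustration, not notation from the cited papers). Since $\min\{a, b\} = 0$ holds exactly when $a \ge 0$, $b \ge 0$ and $ab = 0$, the equation above is equivalent to: $-\partial_t v - \mathcal{H}v \ge 0, \quad v - \mathcal{M}v \ge 0, \quad \big(-\partial_t v - \mathcal{H}v\big)\big(v - \mathcal{M}v\big) = 0$. Where $v > \mathcal{M}v$ (the continuation region), the HJB equation $-\partial_t v - \mathcal{H}v = 0$ holds with equality; where $v = \mathcal{M}v$, an impulse is optimal.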

2. Implicit Scheme Construction

A backward-Euler (fully implicit) time discretization of the HJBQVI, coupled with a monotone finite-difference operator in space, is implemented on a structured grid for the state-space variables (e.g., time $t$, signal $\alpha$, inventory $q$):

Time derivative: $(\partial_t v) \approx \frac{v^{n+1}_{i,j} - v^n_{i,j}}{\delta t}$, evaluated at the future (implicit) level $n+1$.
Local dynamics (drift, diffusion, controlled jumps): finite-difference stencils, with upwinding for advection to ensure monotonicity.
Impulse operator: the continuation value $v^{n+1}_{i,j}$ is compared to the maximized post-impulse value, $\sup_{z\in\{\pm1\}} [v^{n+1}_{i,j+z} - \Upsilon]$.

Together, these form at each grid point a coupled system of nonlinear algebraic equations reflecting both local and impulsive controls (Meteykin, 24 Dec 2025, Ieda, 2013, Azimzadeh et al., 2017).
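
The role of upwinding can be illustrated on a single interior node. The following is a minimal sketch, assuming a one-dimensional operator $\mu\,\partial_\alpha + \tfrac12\sigma^2\,\partial_{\alpha\alpha}$ on a uniform grid of spacing `h`; the function names and parameter values are hypothetical and are not taken from the cited papers.

```python
def upwind_weights(mu, sigma, h):
    """Weights (w_minus, w_center, w_plus) such that
    (L_h v)_i = w_minus*v[i-1] + w_center*v[i] + w_plus*v[i+1]
    approximates mu*v' + 0.5*sigma**2*v'' with an upwinded first derivative."""
    diff = 0.5 * sigma**2 / h**2
    w_plus = diff + max(mu, 0.0) / h    # forward difference when mu > 0
    w_minus = diff + max(-mu, 0.0) / h  # backward difference when mu < 0
    return w_minus, -(w_minus + w_plus), w_plus


def central_weights(mu, sigma, h):
    """Same operator with a central first derivative (shown for contrast)."""
    diff = 0.5 * sigma**2 / h**2
    w_plus = diff + 0.5 * mu / h
    w_minus = diff - 0.5 * mu / h
    return w_minus, -(w_minus + w_plus), w_plus


# Drift-dominated example (illustrative values): strong negative drift, small diffusion.
up = upwind_weights(mu=-2.0, sigma=0.1, h=0.1)
ce = central_weights(mu=-2.0, sigma=0.1, h=0.1)
print("upwind :", up)   # both neighbour weights are nonnegative
print("central:", ce)   # the weight on v[i+1] is negative here
```

Nonnegative neighbour weights that sum with the centre weight to zero are what make the backward-Euler matrix have nonpositive off-diagonals. In this sketch the central stencil loses that sign pattern once $|\mu| h > \sigma^2$, which is why upwinding is used for advection.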

The general discrete equation at mesh-point $(i, j)$ and time-level $n$ is: $\max \Bigg\{ \sup_{l^a, l^b} \left( \frac{v^{n+1}_{i,j} - v^n_{i,j}}{\delta t} + L_h^{l^a, l^b} v^{n+1}_{i,j} + \tilde{f}^{l^a, l^b, n}_{i,j} \right), \ \sup_{z \in \{\pm1\}} \left( v^{n+1}_{i, j+z} - v^{n+1}_{i, j} - \Upsilon \right) \Bigg\} = 0$ with all operators applied “at the new time” for implicit stability (Meteykin, 24 Dec 2025). The spatial approximation uses central or upwind finite differences for derivatives and monotone linear interpolation for non-grid-aligned impulses (Ieda, 2013, Azimzadeh et al., 2017).
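
The intervention term of the discrete equation can be evaluated along one inventory column (fixed $i$ and time level). The sketch below assumes unit inventory moves $z \in \{\pm 1\}$ that fall exactly on grid nodes, so no interpolation is needed, and it simply drops moves that would leave the grid; that boundary treatment, the function names, and the sample values are assumptions made for illustration.

```python
def intervention_operator(v, cost):
    """(Mv)_j = max over z in {-1, +1} of v[j+z] - cost, skipping moves off the grid."""
    m = len(v)
    return [max(v[j + z] for z in (-1, 1) if 0 <= j + z < m) - cost
            for j in range(m)]


def impulse_region(v, cost, tol=1e-12):
    """Indices where the constraint v >= Mv is active, i.e. an impulse is optimal."""
    mv = intervention_operator(v, cost)
    return [j for j in range(len(v)) if v[j] - mv[j] <= tol]


# Illustrative values on five inventory levels, with fixed impulse cost 1.0.
v = [2.0, 3.0, 3.5, 3.0, 2.0]
print(intervention_operator(v, 1.0))
print(impulse_region(v, 1.0))
```

For these sample values the constraint is active only at the two extreme inventory levels, where moving one unit towards the centre and paying the cost gives the same value as staying put.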

3. Policy Iteration and Solution Algorithm

At each time slice, a policy iteration (Howard’s algorithm) solves the coupled nonlinear discrete system:

Each “policy” specifies for every node which local control and (if any) impulse is taken.
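For a fixed policy, the discrete equation reduces to a sparse linear system whose matrix has M-matrix structure: nonpositive off-diagonals and diagonal dominance, which is strict on rows where the local control is active and only weak on impulse rows of the form $v^{n+1}_{i,j} - v^{n+1}_{i,j+z} = -\Upsilon$.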
The iterative algorithm alternates between:
1. Policy improvement: selecting controls and impulses that maximize the local value at each node.
2. Policy evaluation: solving the associated linear system.

Termination is achieved when the relative error between consecutive iterates falls below a tolerance (e.g., $10^{-8}$) (Meteykin, 24 Dec 2025, Ieda, 2013, Reisinger et al., 2018).
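
The alternation of improvement and evaluation can be sketched for one implicit time step. The toy model below is an assumption made for illustration and is not the model of the cited papers: the state is inventory only, quoting flags $l^a, l^b \in \{0,1\}$ produce unit fills at rate `lam` earning `edge` each, a running penalty `phi*q**2` applies, and an impulse moves inventory by one unit at cost `cost`. The list `w` holds the already-computed values at the neighbouring time level and `v` the unknowns; all names and parameter values are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        piv = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / piv
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / piv
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def options(j, m):
    """Admissible choices at node j: ("quote", l_a, l_b) or ("impulse", z, 0)."""
    quotes = [("quote", la, lb) for la in (0, 1) for lb in (0, 1)
              if not (la and j == 0) and not (lb and j == m - 1)]
    impulses = [("impulse", z, 0) for z in (-1, 1) if 0 <= j + z < m]
    return quotes + impulses


def row(opt, j, w, p):
    """Coefficients (lower, diag, upper, rhs) of the equation `opt` imposes at node j."""
    kind, a, b = opt
    if kind == "impulse":  # v_j - v_{j+a} = -cost
        return (-1.0 if a == -1 else 0.0, 1.0, -1.0 if a == 1 else 0.0, -p["cost"])
    q = j - (len(w) - 1) // 2  # inventory at node j
    k = p["dt"] * p["lam"]     # fill intensity times step size
    rhs = w[j] + k * p["edge"] * (a + b) - p["dt"] * p["phi"] * q * q
    return (-k * a, 1.0 + k * (a + b), -k * b, rhs)


def residual(opt, j, v, w, p):
    """How much option `opt` would raise the value at node j, given the iterate v."""
    lo, di, up, rhs = row(opt, j, w, p)
    left = v[j - 1] if j > 0 else 0.0
    right = v[j + 1] if j < len(v) - 1 else 0.0
    return rhs - (lo * left + di * v[j] + up * right)


def policy_iteration(w, p, tol=1e-8, max_sweeps=50):
    m = len(w)
    policy = [("quote", 0, 0)] * m  # start from "do nothing" at every node
    v_old = None
    for sweep in range(1, max_sweeps + 1):
        # Policy evaluation: one tridiagonal solve for the current policy.
        lower, diag, upper, rhs = zip(*(row(policy[j], j, w, p) for j in range(m)))
        v = solve_tridiagonal(lower, diag, upper, rhs)
        if v_old is not None:
            change = max(abs(x - y) for x, y in zip(v, v_old))
            if change <= tol * max(1.0, max(abs(x) for x in v)):
                return v, policy, sweep
        # Policy improvement: pointwise argmax over quoting flags and impulses.
        policy = [max(options(j, m), key=lambda o: residual(o, j, v, w, p))
                  for j in range(m)]
        v_old = v
    raise RuntimeError("policy iteration did not converge")


Q = 5
w = [-0.1 * (j - Q) ** 2 for j in range(2 * Q + 1)]
params = {"dt": 0.1, "lam": 1.0, "edge": 0.05, "phi": 0.1, "cost": 0.5}
v, policy, sweeps = policy_iteration(w, params)
print(sweeps, policy[0], policy[Q])
```

Starting from the "do nothing" policy keeps every evaluated system nonsingular in this sketch: after an exact evaluation, two neighbouring nodes cannot both prefer an impulse towards each other, because the two corresponding residuals sum to `-2*cost`. With unit moves, each impulse row touches only an adjacent node, so the system stays tridiagonal; general impulse sizes would require a general sparse solver.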

The matrix system at each policy step has block-tridiagonal structure (in $\alpha$) and is sparse. Standard direct sparse LU or Krylov solvers (e.g., GMRES with preconditioning) are applied efficiently because of the Z-matrix/M-matrix properties (Meteykin, 24 Dec 2025, Ieda, 2013).

4. Monotonicity, Stability, and Convergence

Three theoretical properties underpin the reliability and accuracy of implicit HJBQVI schemes:

Monotonicity: The scheme operator is nondecreasing in the candidate solution, enforced by the selection of upwind/central discretization and positive weights. This property ensures the validity of the comparison principle for the discrete system (Meteykin, 24 Dec 2025, Azimzadeh et al., 2017).
Stability: The coefficient matrix at each time-step is an M-matrix, guaranteeing that the discrete maximum principle holds and that solutions remain uniformly bounded independent of discretization (i.e., unconditional stability). No CFL-type restriction ties the time-step to the spatial grid size (Meteykin, 24 Dec 2025, Ieda, 2013, Azimzadeh et al., 2017).
Consistency: As the grid parameters $\delta t$, $\delta \alpha$ (and analogues) approach zero, the scheme’s local truncation error vanishes, so the discrete operator approximates the HJBQVI operator (Azimzadeh et al., 2017).

Taken together, these three properties yield convergence: a Barles-Souganidis–type theorem confirms locally uniform convergence of the scheme to the unique viscosity solution of the HJBQVI, assuming the continuous problem admits the standard viscosity comparison principle (Azimzadeh et al., 2017, Ieda, 2013).
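
The stability property can be seen on a model problem. The sketch below assumes pure diffusion $u_\tau = D\,u_{xx}$ in time-to-maturity $\tau$ with zero Dirichlet ends, with no control and no impulse, and compares backward Euler against an explicit step at a ratio $r = D\,\delta t/\delta x^2 = 5$, ten times beyond the explicit stability limit of $1/2$. The model problem, names, and values are illustrative assumptions.

```python
def thomas(lower, diag, upper, rhs):
    """Tridiagonal solve; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        piv = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / piv
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / piv
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_step(u, r):
    """Backward Euler: solve (I + r*T) u_new = u, with T = tridiag(-1, 2, -1)."""
    n = len(u)
    return thomas([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n, u)


def explicit_step(u, r):
    """Forward Euler: u_new = (I - r*T) u, with zero values outside the grid."""
    pad = [0.0] + list(u) + [0.0]
    return [pad[j + 1] + r * (pad[j] - 2.0 * pad[j + 1] + pad[j + 2])
            for j in range(len(u))]


r = 5.0                                       # far beyond the explicit limit r <= 0.5
u_imp = [(-1.0) ** j for j in range(21)]      # rough (sawtooth) initial data
u_exp = list(u_imp)
for _ in range(10):
    u_imp = implicit_step(u_imp, r)
    u_exp = explicit_step(u_exp, r)

print("implicit max:", max(abs(x) for x in u_imp))
print("explicit max:", max(abs(x) for x in u_exp))
```

The implicit matrix is strictly diagonally dominant with nonpositive off-diagonals, so its inverse is nonnegative with row sums at most one and the maximum norm cannot grow, whatever the value of `r`. The explicit update amplifies the sawtooth mode at every step.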

5. Implementation Considerations and Extensions

Practical aspects include:

Matrix Assembly: The system matrix is assembled at each time-step based on the current policy; storage in compressed-row format exploits sparsity.
Boundary Conditions: Zero-flux (Neumann) at spatial edges, and either reflecting or Dirichlet conditions in control variables.
Dimensionality: For moderate state-space sizes (e.g., the value function of three variables $(t, \alpha, q)$ in (Meteykin, 24 Dec 2025)), computations are tractable on a standard desktop. In higher dimensions, tensor decompositions or sparse-grid methods are beneficial (Meteykin, 24 Dec 2025).
Parallelization: The pointwise structure of policy improvement is amenable to parallel computing. The global linear system is solved using traditional sparse techniques (Ieda, 2013, Reisinger et al., 2018).
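
The compressed-row storage mentioned under Matrix Assembly can be sketched in a few lines. This is a minimal standard-library illustration of the format for a tridiagonal policy matrix, with hypothetical function names; in practice a sparse library type such as `scipy.sparse.csr_matrix` would typically take the same three arrays.

```python
def tridiagonal_to_csr(lower, diag, upper):
    """Store a tridiagonal matrix as CSR arrays (data, indices, indptr)."""
    n = len(diag)
    data, indices, indptr = [], [], [0]
    for i in range(n):
        for j, a in ((i - 1, lower[i]), (i, diag[i]), (i + 1, upper[i])):
            if 0 <= j < n and a != 0.0:   # keep only in-range nonzeros
                data.append(a)
                indices.append(j)
        indptr.append(len(data))          # row i occupies data[indptr[i]:indptr[i+1]]
    return data, indices, indptr


def csr_matvec(data, indices, indptr, x):
    """y = A @ x; each row is independent, so rows can be processed in parallel."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]


data, indices, indptr = tridiagonal_to_csr(
    lower=[0.0, -1.0, -1.0], diag=[2.0, 2.0, 2.0], upper=[-1.0, -1.0, 0.0])
y = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0])
print(indptr, y)
```

Because the sparsity pattern changes only on rows whose policy switches between continuation and impulse, a re-assembly at each policy step needs to touch only those rows.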

6. Applications and Numerical Results

Applications of implicit HJBQVI schemes span financial engineering and operations research:

Market-Making with Signal: (Meteykin, 24 Dec 2025) applies the scheme to an optimal market-making problem with a persistent “alpha” signal, combining stochastic control (limit orders) with impulse control (market orders). The reported results show unconditional stability, fast convergence (at most 5 sweeps per time-slice), and tractable computational times (about 2 minutes for the full solution).
Optimal Harvesting: (Ieda, 2013) demonstrates the methodology in finite-horizon optimal forest harvesting, confirming second-order convergence in space and stability irrespective of grid sizes. Switching thresholds for control are tracked over time, with clear solution regularity and structure.
Numerical Accuracy: Quantitative error bounds match the expected first-order accuracy in time and second-order accuracy in space (Ieda, 2013); penalization or semi-implicit variants further extend the range of admissible problem structures (Reisinger et al., 2018).
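
Convergence orders such as these are usually estimated from errors on successively refined grids. The following sketch shows the calculation; the error values are synthetic, generated from the model $e(h) = C h^2$ purely to exercise the formula, and are not results from the cited papers.

```python
import math


def observed_orders(errors, refinement=2.0):
    """Estimated order between consecutive grids, each refined by `refinement`:
    order = log(e_coarse / e_fine) / log(refinement)."""
    return [math.log(e_coarse / e_fine) / math.log(refinement)
            for e_coarse, e_fine in zip(errors, errors[1:])]


# Synthetic errors from e(h) = 0.5 * h**2 on grids h, h/2, h/4, h/8.
hs = [0.1 / 2**k for k in range(4)]
errors = [0.5 * h**2 for h in hs]
orders = observed_orders(errors)
print(orders)
```

With measured errors, the estimates would approach 2 for a second-order spatial discretization and 1 for backward Euler in time, provided the other grid parameter is fine enough not to dominate the error.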

7. Theoretical Developments and Generalizations

Further theoretical analysis demonstrates:

Nonlocal Consistency: Discrete schemes must be nonlocally consistent with the nonlocal PDE, specifically handling the intervention operator in the limiting process (Azimzadeh et al., 2017).
Structural Generality: Convergence theorems established in (Azimzadeh et al., 2017) do not depend on the precise form of the nonlocal operator, making the approach robust to a broad class of impulse, jump, or intervention effects.
Penalty and Semi-implicit Schemes: The penalty approach introduces a penalized variational inequality, and semi-implicit variants combine backward-Euler discretization in diffusion and explicit treatment of jumps; these achieve a balance between unconditional stability and ease of numerical implementation (Reisinger et al., 2018).
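
As a sketch of the penalized formulation in its standard form (the penalty parameter $\rho > 0$ and the notation $v_\rho$ are introduced here for illustration; see (Reisinger et al., 2018) for the precise scheme and its error estimates), the obstacle constraint is replaced by a penalty term: $-\partial_t v_\rho - \sup_{l^a, l^b \in \{0,1\}} \big[L^{l^a, l^b}v_\rho + \tilde f^{l^a, l^b}\big] - \rho\,\big(\mathcal{M}v_\rho - v_\rho\big)^+ = 0$, where $x^+ = \max\{x, 0\}$. Wherever $v_\rho$ falls below $\mathcal{M}v_\rho$, the penalty term pushes it back up, and the constraint $v \ge \mathcal{M}v$ is formally recovered as $\rho \to \infty$.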

The development of implicit numerical methods for HJBQVIs enables robust and efficient solution of a wide class of stochastic control problems with impulses, providing rigorous stability and convergence while accommodating general control structures and nonlocal effects (Meteykin, 24 Dec 2025, Ieda, 2013, Azimzadeh et al., 2017, Reisinger et al., 2018).