- The paper presents π-MPPI, which integrates a projection filter within MPPI to enforce bounds on control magnitudes and derivatives, significantly improving smoothness.
- It employs a custom ADMM-based QP solver and a neural warm-start strategy to achieve efficient, parallelized optimization suitable for real-time UAV operation.
- Empirical validations in obstacle avoidance and terrain following benchmarks demonstrate higher success rates, minimal constraint violations, and faster convergence compared to standard MPPI.
Projection-based Model Predictive Path Integral Scheme for Fixed-Wing Aerial Vehicles
Motivation and Background
Recent advances in sampling-based model predictive control algorithms, specifically Model Predictive Path Integral (MPPI), have enabled robust trajectory optimization for nonlinear systems. However, for fixed-wing aerial vehicles (FWVs), the inherent non-smoothness of MPPI-generated control sequences can result in actuator oscillations, risking instability. Attempts to remedy this by post-hoc filtering (e.g., Savitzky–Golay) or by augmenting control penalties within the cost function have proven inadequate for guaranteeing bounded control derivatives, making cost tuning challenging and often yielding suboptimal smoothness.
Algorithmic Contributions
The paper proposes π-MPPI, a variant of MPPI augmented by a projection filter π that enforces bounds not only on control magnitudes but also on higher-order derivatives throughout the control sequence. This projection is applied directly to sampled control trajectories, solving a quadratic program (QP) to minimally adjust the samples while satisfying derivative constraints. Notably, the computational overhead is mitigated by a custom, parallelizable ADMM-based solver, with further acceleration through a warm-start neural policy trained in a self-supervised manner.
The π-MPPI algorithm modifies the MPPI pipeline by:
- Applying projection-based filtering to all control samples prior to their evaluation and averaging,
- Updating sample statistics post-projection,
- Ensuring feasibility by projecting the final averaged control trajectory,
- Enabling arbitrary smoothness orders by adjusting constraints in the projection QP.
The impact is the production of smooth, feasible control profiles for FWVs—without reliance on cost penalties for smoothness or post-hoc filtering.
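The modified pipeline above can be sketched in a few lines. Note that the projection below is a simple clip-based stand-in, not the paper's minimal-adjustment QP filter, and all function names and parameter values are illustrative:

```python
import numpy as np

def project(u, u_max, du_max, dt):
    """Stand-in for the paper's QP projection: clip control magnitudes,
    then clip rates with a forward pass. Illustrates the interface only;
    the actual filter solves a QP for the minimal adjustment."""
    u = np.clip(u, -u_max, u_max)
    for t in range(1, len(u)):
        u[t] = np.clip(u[t], u[t - 1] - du_max * dt, u[t - 1] + du_max * dt)
    return u

def pi_mppi_step(u_mean, cost_fn, n_samples=64, sigma=0.5,
                 lam=1.0, u_max=1.0, du_max=2.0, dt=0.05, rng=None):
    rng = rng or np.random.default_rng(0)
    H = len(u_mean)
    # 1. Sample perturbed control sequences around the current mean.
    samples = u_mean + sigma * rng.standard_normal((n_samples, H))
    # 2. Project every sample BEFORE evaluation (the core pi-MPPI change).
    samples = np.stack([project(s, u_max, du_max, dt) for s in samples])
    # 3. Evaluate costs and compute the standard MPPI softmax weights.
    costs = np.array([cost_fn(s) for s in samples])
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    # 4. Weighted average, then a final projection to guarantee feasibility.
    u_new = (w[:, None] * samples).sum(axis=0)
    return project(u_new, u_max, du_max, dt)
```

Because every sample is projected before averaging, the weighted mean inherits feasibility up to the convexity of the constraint set, and the final projection closes any remaining gap.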
Efficient Quadratic Programming and Warm-Start
The QP for projection is designed with batch processing and GPU acceleration in mind. All constraints—initial conditions and bounds on control and derivatives—are handled via slack variables and augmented Lagrangian minimization, enabling efficient parallelized ADMM iterations.
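A minimal single-sequence sketch of such an ADMM projection, handling only magnitude and first-derivative bounds (the paper's solver additionally handles initial conditions, higher-order derivatives, and batched GPU execution; variable names and the monomial structure here are assumptions):

```python
import numpy as np

def admm_project(u_raw, u_max, du_max, dt, iters=200, rho=1.0):
    """Projection QP via ADMM: min ||u - u_raw||^2 subject to
    |u| <= u_max and |du/dt| <= du_max. Splitting z = A @ u makes the
    constraint step an elementwise clip, which is what makes the
    iterations batch/GPU friendly."""
    H = len(u_raw)
    # Finite-difference operator mapping u to its (H-1) discrete derivatives.
    D = (-np.eye(H) + np.eye(H, k=1))[:-1] / dt
    A = np.vstack([np.eye(H), D])
    lo = np.concatenate([-u_max * np.ones(H), -du_max * np.ones(H - 1)])
    hi = -lo
    # The quadratic step's normal-equation factor is fixed, so invert once.
    M = np.linalg.inv(np.eye(H) + rho * A.T @ A)
    z = np.clip(A @ u_raw, lo, hi)
    lam = np.zeros_like(z)
    u = u_raw.copy()
    for _ in range(iters):
        u = M @ (u_raw + rho * A.T @ (z - lam))   # quadratic (u) update
        z = np.clip(A @ u + lam, lo, hi)          # projection (z) update
        lam = lam + A @ u - z                     # scaled dual update
    return u
```

The elementwise clip in the z-update is the only nonlinearity, so a batch of sampled trajectories can be projected with one tensorized loop.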
Warm-starting is achieved by training a multi-layer perceptron (MLP) to output initial values for the QP variables, with training gradients backpropagated through the solver steps. This approach ensures the neural policy is cognizant of its downstream effect, yielding higher convergence rates while maintaining minimal constraint violations.
Control Parametrization
To further reduce computation, the control sequence is parametrized by time-dependent polynomials, mapping mean trajectories and perturbations to low-dimensional coefficient spaces. The resulting QP operates on these coefficients, compressing the optimization problem and improving speed over waypoint parametrization, especially for long planning horizons.
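The idea can be sketched with a simple monomial basis (the paper's exact basis and degree may differ; `poly_basis` and its parameters are illustrative):

```python
import numpy as np

def poly_basis(horizon, degree, dt=0.05):
    """Build a time-polynomial basis P (horizon x (degree+1)) so that a
    control sequence is u = P @ c: sampling and projection then act on the
    (degree+1)-dimensional coefficients c instead of per-timestep waypoints."""
    t = np.linspace(0.0, horizon * dt, horizon)
    return np.vander(t, degree + 1, increasing=True)

H, d = 100, 5
P = poly_basis(H, d)          # 100 x 6 basis matrix
c = np.zeros(d + 1)
c[0] = 0.3                    # a constant control, expressed as coefficients
u = P @ c                     # expand to the full horizon
```

The projection QP then scales with the coefficient dimension (d + 1 = 6 here) rather than the horizon length H = 100, which is where the speedup over waypoint parametrization comes from.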
Empirical Validation
Benchmarks
Two complex scenarios were evaluated:
- Obstacle Avoidance: FWV encircles a static goal while avoiding randomized 3D obstacles.
- Terrain Following: FWV tracks a goal while maintaining altitude constraints above complex terrain.
Metrics included:
- Success rate (no collisions/crashes),
- Smoothness (constraint residuals on control and derivatives),
- Proximity to goal.
Quantitative Results
π-MPPI demonstrated:
- Higher success rates compared to classic MPPI variants, especially under high control covariance settings,
- Minimal constraint violations across all control components and derivatives—often several orders of magnitude lower than the MPPIwSGF and polynomial baselines,
- Reduced average distance to goal and fewer outlier events,
- Robustness against high perturbation noise, facilitating broader exploration without destabilizing the control process.
Warm-starting using the neural policy further reduced QP iterations required for convergence (as few as 2 iterations per batch), with negligible loss in constraint satisfaction and significant computational gains.
Ablative Studies
Alternative warm-start strategies (sample-based or direct solution prediction) failed to match the residual minimization achieved by the neural policy. Results are consistent across both obstacle avoidance and terrain following tasks.
Computation Time
Polynomial control parametrization consistently outperformed waypoint parametrization as batch sizes and projection iterations grew, with π-MPPI maintaining feedback rates suitable for real-time FWV operation (≥50 Hz). Neural warm-starting matched MPPIwSGF computation times with lower worst-case delays.
Relation to Prior Work
π-MPPI generalizes smoothness enforcement in MPPI without adding penalties to the primary cost function or augmenting the system dynamics, as done in [kim2022smooth]. Unlike the evolutionary-optimization or learning-based noise adaptation of [bhardwaj2022storm] and [sacks2023learning], the projection filter ensures smoothness and feasibility without covariance constraints or indirect tuning. The differentiable solver plus neural warm-start strategy aligns with theoretical advances in warm-starting fixed-point optimization [sambharya2024learning], offering practical benefits for real-time MPC pipelines.
Practical and Theoretical Implications
π-MPPI's demonstrated ability to generate smooth, bounded controls for FWVs addresses actuator reliability and mission robustness in high-speed agile flight. By tolerating larger perturbation noise, it widens the feasible exploration space, potentially benefiting safe navigation, target tracking, and collision avoidance across unmanned aerial vehicles.
On a theoretical front, projection-based enforcement of arbitrarily smooth constraints opens avenues for MPC policy design in other nonlinear systems where control smoothness is critical (e.g., vehicle guidance, manipulator arms, autonomous driving).
Conclusion
The π-MPPI scheme delivers robust, computationally efficient, smooth trajectory optimization for fixed-wing aerial vehicles by embedding a projection-based QP filter into the MPPI pipeline and leveraging neural warm-starting. Empirical validations underscore its superiority over conventional MPPI and penalty-based methods in both robustness and smoothness across challenging benchmarks. Future work includes generalization to autonomous driving, projection onto non-convex state-constraint sets, and extending the neural warm-starting paradigm to broader classes of optimization algorithms (2504.10962).