
Direct Leaf-Trajectory Optimization

Updated 1 February 2026
  • Direct leaf-trajectory optimization directly parameterizes system states and controls to generate feasible trajectories while enforcing complex nonlinear constraints.
  • It leverages diffusion models and augmented-Lagrangian techniques, using score-based methods to ensure dynamic feasibility and robust convergence.
  • Applications in robotics and radiotherapy demonstrate its practical benefits, including improved safety, efficient QP formulations, and deliverable VMAT plans.

Direct leaf-trajectory optimization refers to a class of optimization methods that generate complete state or control trajectories for dynamical or physical systems by directly parameterizing and optimizing over trajectories, often subject to complex nonlinear, equality, and inequality constraints. This paradigm contrasts with indirect or shooting-based methods, in which controls are optimized and system dynamics are imposed by integration or rollouts. Direct trajectory optimization is essential in fields such as nonlinear control, robotics, and medical physics, where physical, safety, or hardware-specific constraints must be satisfied exactly in the generated solutions.

1. Formulation of Direct Trajectory Optimization

Direct trajectory optimization is typically posed as a constrained nonlinear program (NLP) over discretized trajectories of system states and possibly controls. Let $x_{0:N} = (x_0, x_1, \dots, x_N)$ denote the sequence of system states, and $u_{0:N-1} = (u_0, \dots, u_{N-1})$ denote control inputs. The canonical discrete-time, finite-horizon optimal control problem is formulated as

$$
\begin{aligned}
\min_{x_{0:N},\,u_{0:N-1}} \quad & J(x_{0:N}, u_{0:N-1}) = \phi(x_N) + \sum_{t=0}^{N-1} \ell(x_t, u_t) \\
\text{s.t.} \quad & x_{t+1} = f(x_t, u_t), \quad t = 0, \dots, N-1, \\
& g(x_{0:N}, u_{0:N-1}) = 0, \\
& x_0 = x_\mathrm{init}, \\
& u_t \in [u_{\min}, u_{\max}], \quad x_t \in \mathcal{X}.
\end{aligned}
$$

Here, $\ell$ and $\phi$ are the stage and terminal cost, $f$ is the (possibly nonlinear) system dynamics, $g$ represents equality constraints (e.g., collision or boundary constraints), and $\mathcal{X}$ denotes admissible state sets. Constraints are enforced at every node, enabling hard satisfaction of task, safety, and physical requirements (Kurtz et al., 2024, Chen et al., 6 Oct 2025).
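The NLP above can be transcribed directly with an off-the-shelf solver. The sketch below is a minimal illustration for a 1-D double integrator; the horizon, weights, goal state, and control bounds are illustrative assumptions, not values from any cited paper.

```python
import numpy as np
from scipy.optimize import minimize

# Direct transcription for a 1-D double integrator:
# state x_t = (position, velocity), control u_t = acceleration.
N, dt = 20, 0.1
n_x, n_u = 2, 1
x_goal = np.array([1.0, 0.0])   # illustrative terminal target

def unpack(z):
    x = z[:(N + 1) * n_x].reshape(N + 1, n_x)
    u = z[(N + 1) * n_x:].reshape(N, n_u)
    return x, u

def cost(z):
    # stage cost on control effort plus a terminal cost phi(x_N)
    x, u = unpack(z)
    return 0.5 * dt * np.sum(u**2) + 100.0 * np.sum((x[-1] - x_goal)**2)

def dynamics_residual(z):
    # x_{t+1} = f(x_t, u_t): Euler-discretized double integrator
    x, u = unpack(z)
    pred = x[:-1] + dt * np.column_stack([x[:-1, 1], u[:, 0]])
    return (x[1:] - pred).ravel()

def initial_state(z):
    return unpack(z)[0][0]      # enforce x_0 = x_init = (0, 0)

z0 = np.zeros((N + 1) * n_x + N * n_u)
res = minimize(cost, z0, method="SLSQP",
               bounds=[(None, None)] * ((N + 1) * n_x) + [(-5.0, 5.0)] * N,
               constraints=[{"type": "eq", "fun": dynamics_residual},
                            {"type": "eq", "fun": initial_state}],
               options={"maxiter": 500})
x_opt, u_opt = unpack(res.x)
```

Note that, in keeping with the direct paradigm, both states and controls are decision variables and the dynamics appear as equality constraints at every node rather than being enforced by forward integration.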

In the context of radiotherapy planning, direct leaf-trajectory optimization parameterizes multileaf collimator (MLC) leaf trajectories with decision variables that directly encode the physical motion of leaves as a function of time, subject to machine constraints and clinical dose requirements (Papp et al., 2013).

2. Direct Trajectory Optimization via Diffusion Models

Recent work leverages diffusion models as generative samplers for trajectory optimization problems, due to their capacity to represent complex, multimodal distributions. The problem is recast as sampling feasible, low-cost trajectories $X$ from an implicit target distribution

$$p(X) \propto \exp(-J(X)/\sigma^2)$$

with annealed noise $\sigma \to 0$ (Kurtz et al., 2024, Chen et al., 6 Oct 2025).

Score-based Diffusion

A neural network score model $s_\theta(X, t)$ is trained to approximate the gradient of the log target density, $\nabla_X \log p_t(X)$. In unconstrained cases, this allows application of Langevin or score-guided sampling to generate trajectories with decreasing noise. The forward process adds Gaussian noise to trajectories; the reverse process iteratively denoises trajectories guided by the score model, ultimately generating candidate solutions near local minima of the objective (Kurtz et al., 2024).
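As a minimal sketch of annealed Langevin sampling from $p(X) \propto \exp(-J(X)/\sigma^2)$, the toy below uses the exact score of a quadratic objective in place of a trained network $s_\theta$; the objective, noise schedule, and step-size rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective J(X) = 0.5 * ||X - X_star||^2 with known minimizer X_star.
X_star = np.array([1.0, -2.0])

def score(X, sigma):
    # Exact score of p(X) ∝ exp(-J(X)/sigma^2); a trained network
    # s_theta(X, t) would replace this in practice.
    return -(X - X_star) / sigma**2

# Annealed Langevin dynamics: run a chain at each decreasing noise level.
X = rng.normal(size=2) * 5.0
for sigma in [3.0, 1.0, 0.3, 0.1]:
    eps = 0.05 * sigma**2          # step size scaled to the noise level
    for _ in range(200):
        X = X + eps * score(X, sigma) + np.sqrt(2 * eps) * rng.normal(size=2)
```

At the final noise level the chain's stationary distribution concentrates around the minimizer, mirroring how the annealed chain localizes near low-cost trajectories.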

Enforcement of Nonlinear Constraints

Recent methods have extended score-based diffusion to handle the critical nonlinear equality constraints required for enforcing system dynamics or other physical feasibility (Kurtz et al., 2024, Chen et al., 6 Oct 2025):

  • Augmented-Lagrangian Langevin Diffusion: Incorporates Lagrange multipliers $\lambda$ and penalty coefficients $\mu$ for equality constraints $h(X) = 0$, yielding updates via the augmented Lagrangian

$$\mathcal{L}_\mu(X, \lambda) = J(X) + \lambda^T h(X) + \frac{\mu}{2}\|h(X)\|^2$$

and interleaving state and multiplier diffusion dynamics.

  • Projection-Augmented Reverse Diffusion (PAD-TRO): Enforces dynamics in each reverse diffusion step via a gradient-free projection $\Pi$. Given an unconstrained denoised sample $\tilde{X}$, the next iterate is projected onto the feasible trajectory manifold:

$$X^\mathrm{next} = \Pi(\tilde{X}) = \arg\min_X \|X - \tilde{X}\|^2 \quad \text{s.t.} \quad g(X) = 0$$

This is typically solved using fixed-point or Gauss–Newton methods applied at each reverse step (Chen et al., 6 Oct 2025).
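For linear dynamics the projection $\Pi$ has a closed form and the Gauss–Newton iteration converges in a single step, which makes a compact illustration possible. In the sketch below the transition matrix $A$, dimensions, and the minimum-norm step are illustrative assumptions; nonlinear dynamics would require re-linearizing over a few iterations.

```python
import numpy as np

# Project a noisy trajectory onto the dynamics manifold {X : g(X) = 0},
# illustrated for linear dynamics x_{t+1} = A x_t.
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # double-integrator transition
N, n = 10, 2

def g(X):
    # stacked dynamics residuals; X has shape (N+1, n)
    return (X[1:] - X[:-1] @ A.T).ravel()

def project(X_tilde, iters=5):
    X = X_tilde.copy()
    for _ in range(iters):
        r = g(X)
        # Jacobian of g with respect to the flattened trajectory
        J = np.zeros((N * n, (N + 1) * n))
        for t in range(N):
            J[t*n:(t+1)*n, t*n:(t+1)*n] = -A
            J[t*n:(t+1)*n, (t+1)*n:(t+2)*n] = np.eye(n)
        # minimum-norm Gauss-Newton step: X <- X - J^T (J J^T)^{-1} r;
        # for affine constraints this is the exact Euclidean projection
        step = J.T @ np.linalg.solve(J @ J.T, r)
        X = X - step.reshape(N + 1, n)
    return X

rng = np.random.default_rng(1)
X_tilde = rng.normal(size=(N + 1, n))
X_proj = project(X_tilde)
```

After projection the dynamics residual is zero to machine precision, which is the property PAD-TRO exploits at every reverse step.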

3. Direct Generation and Conditioning Mechanisms

Diffusion-based direct methods can generate full state sequences in a single reverse sampling chain, bypassing explicit control parameterization. Conditioning is achieved by appending binary masks and known values to the model input, thus enforcing fixed endpoints, waypoints, or obstacles within the generative process. Architectures such as temporal U-Nets or transformers are typically employed to capture trajectory-wide dependencies and respect imposed state constraints (Chen et al., 6 Oct 2025).
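The mask-based conditioning mechanism can be reduced to a few lines: after every denoising update, entries marked as known are clamped back to their given values. The dummy update below stands in for a real model's reverse step; the horizon, mask layout, and goal state are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
H, n = 16, 2                       # horizon and state dimension
mask = np.zeros((H, n), dtype=bool)
mask[0], mask[-1] = True, True     # fix the start and goal states
known = np.zeros((H, n))
known[-1] = [1.0, 0.0]

X = rng.normal(size=(H, n))
for step in range(50):
    # stand-in for one reverse-diffusion denoising update; a temporal U-Net
    # or transformer would produce this update in practice
    X = 0.9 * X + 0.01 * rng.normal(size=(H, n))
    # conditioning: clamp the known entries after every update
    X[mask] = known[mask]
```

Because the clamp is reapplied at every step, the fixed endpoints hold exactly in the final sample while the free entries are shaped by the generative process.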

In classical direct methods (non-generative), variables such as time-indexed positions of system components (e.g., MLC leaves) parameterize the trajectory directly. For VMAT planning, each leaf trajectory is encoded as a piecewise-linear function determined by the breakpoints corresponding to when leaves cross discretized spatial bixel boundaries, with constraints that ensure mechanical deliverability (Papp et al., 2013).
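The breakpoint parameterization can be made concrete with a small example: the decision variables are the times at which a leaf edge crosses successive bixel boundaries, positions in between follow by linear interpolation, and the maximum-leaf-speed constraint becomes a lower bound on the time between crossings. The bixel width, speed limit, and crossing times below are illustrative assumptions.

```python
import numpy as np

# Leaf trajectory parameterized by boundary-crossing times: the leaf edge is
# at boundary i exactly at crossing_times[i], piecewise linear in between.
w, v_max = 0.5, 2.5                      # bixel width (cm), speed limit (cm/s)
boundaries = w * np.arange(6)            # bixel boundary positions
crossing_times = np.array([0.0, 0.3, 0.7, 1.2, 1.5, 1.8])

def leaf_position(t):
    # piecewise-linear interpolation of position versus time
    return np.interp(t, crossing_times, boundaries)

# mechanical deliverability: the speed between consecutive breakpoints must
# not exceed v_max, equivalently dt >= w / v_max between crossings
speeds = w / np.diff(crossing_times)
deliverable = bool(np.all(speeds <= v_max))
```

In the full QP these crossing times are the optimization variables, so the speed bound appears as a linear constraint on their differences.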

4. Algorithmic Implementation

Diffusion-based Direct Trajectory Optimization

Key algorithmic steps include:

  1. Offline Trajectory Generation: Create a dataset of feasible trajectories by solving direct trajectory optimization via collocation or other direct transcription methods.
  2. Score Model Training: Add Gaussian noise and train the score network using denoising objectives.
  3. Inference (Sampling): Run the reverse diffusion chain initialized with noise. At each step, apply projection to the feasible manifold, update the trajectory, and continue until convergence.

PAD-TRO specifically incorporates:

  • Reverse diffusion steps $N \sim 100$
  • Linear noise schedule $\beta_n$ ramping from $10^{-4}$ to $0.02$
  • U-Net architecture with temporal embedding and residual blocks
  • Repeated projection (at each reverse step) dominating computational runtime (Chen et al., 6 Oct 2025)
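The linear noise schedule above can be written down directly; the cumulative products included below are the standard DDPM-style quantities used by reverse samplers and are an assumption beyond the bullet list itself.

```python
import numpy as np

# Linear beta schedule ramping from 1e-4 to 0.02 over N reverse steps.
N = 100
beta = np.linspace(1e-4, 0.02, N)
alpha = 1.0 - beta
alpha_bar = np.cumprod(alpha)   # cumulative signal-retention factors
```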

Direct Convex Optimization for Physical Systems

Direct leaf-trajectory optimization for VMAT is formulated as a convex quadratic program (QP) with variables representing the entrance/exit times of leaf edges at bixel boundaries. Constraints enforce:

  • Maximum leaf speed
  • Non-interdigitation (adjacent leaves do not cross)
  • Dose limits on each voxel
  • Fixed or modulated gantry speed and dose rate

Solutions are efficiently obtained via standard QP solvers (e.g., MOSEK), with an outer loop refining dose-influence matrices based on computed trajectories and convergence typically achieved in a few iterations (Papp et al., 2013).
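The structure of such a QP can be sketched for a single leaf pair sweeping unidirectionally over a row of bixels; this is a simplified stand-in, not the clinical formulation of Papp et al. (2013), and the exposure-time dose model, targets, and limits are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy unidirectional sweep over B bixels: variables are the times t_A[i]
# (leading leaf uncovers bixel i) and t_B[i] (trailing leaf covers it);
# the exposure time t_B[i] - t_A[i] stands in for delivered fluence.
B, w, v_max = 5, 0.5, 2.0
dt_min = w / v_max                      # max-leaf-speed bound on crossings
target = np.array([0.8, 1.2, 1.0, 0.6, 0.9])   # desired exposure times (s)

def unpack(z):
    return z[:B], z[B:]

def objective(z):
    # quadratic deviation of delivered exposure from the prescription
    tA, tB = unpack(z)
    return np.sum((tB - tA - target)**2)

def ineq(z):
    tA, tB = unpack(z)
    return np.concatenate([
        np.diff(tA) - dt_min,   # leading leaf speed limit
        np.diff(tB) - dt_min,   # trailing leaf speed limit
        tB - tA,                # trailing leaf never overtakes the leading leaf
    ])

z0 = np.concatenate([dt_min * np.arange(B), dt_min * np.arange(B) + 1.0])
res = minimize(objective, z0, method="SLSQP",
               constraints=[{"type": "ineq", "fun": ineq}])
tA, tB = unpack(res.x)
```

All constraints are linear in the crossing times and the objective is quadratic, which is the structural property that lets the full problem be handed to a standard QP solver.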

5. Comparative Results and Performance

Diffusion Methods with Projection

Direct diffusion-based optimizers augmented with constraint projections (PAD-TRO) achieve the following in cluttered dynamical scenarios such as quadrotor navigation (Chen et al., 6 Oct 2025):

| Method | Dyn. Error (mean) | Success Rate (%) |
|---|---|---|
| SDS | $1.2 \times 10^{-1}$ | 21 |
| PAD-TRO | $8.5 \times 10^{-8}$ | 89 |

PAD-TRO drives dynamic residuals to negligible values, outperforming single-shooting diffusion baselines (e.g., SDS) by approximately a factor of four in planning success rate. The projection mechanism guarantees dynamic feasibility at every reverse-diffusion step.

Equality-Constrained Diffusion

Direct constrained diffusion demonstrates robust convergence for complex, multimodal tasks (e.g., pendulum swing-up, robot navigation in bug-trap mazes), showing resilience to initial infeasibility and local minima that commonly confound shooting or classical direct methods (Kurtz et al., 2024).

Classical Direct Optimization in Radiotherapy

For VMAT planning, direct leaf-trajectory optimization produces deliverable plans that—within 3–4 minutes of treatment time—closely match dosimetric quality benchmarks of 20-beam IMRT plans in diverse clinical cases (head-and-neck, prostate, paraspinal), provided machine constraints (dose rate, leaf speed, unidirectionality, and interdigitation) are all enforced in the QP (Papp et al., 2013).

6. Advantages, Limitations, and Applications

Advantages

  • Hard satisfaction of nonlinear equality constraints (e.g., system dynamics) throughout the optimization, avoiding post-hoc correction or infeasible intermediates (Kurtz et al., 2024, Chen et al., 6 Oct 2025).
  • Direct output of feasible trajectories, obviating the need for shooting-based integration and allowing physical or mechanical constraints to be imposed explicitly and efficiently (Papp et al., 2013).
  • Ability to escape local minima and improve robustness through early-stage stochasticity (diffusion methods) (Kurtz et al., 2024).
  • Efficient convex formulations (where possible) admit fast QP-based solutions for large-scale problems (Papp et al., 2013).

Limitations

  • Iterative projection or augmented-Lagrangian steps may incur high computational overhead, as projections must be solved repeatedly throughout generative diffusion sampling (Chen et al., 6 Oct 2025).
  • Diffusion hyperparameters (noise schedules, penalty coefficients) require careful tuning for numerical stability and performance.
  • For highly nonlinear or discontinuous constraints, projection and score estimation may become challenging.

7. Future Directions

Ongoing research explores:

  • Dynamics-aware, adaptive noise schedules to reduce the need for repeated projections and accelerate convergence in diffusion-based optimizers (Chen et al., 6 Oct 2025).
  • Development of learned, amortized projection operators for faster enforcement of complex feasibility constraints.
  • Extension to real-time and hardware-in-the-loop settings, including empirical deployment on robotic systems and medical devices.
  • Extension of differentiable direct trajectory optimization to classes of problems with richer constraints and objectives (e.g., hybrid or stochastic systems).
  • Theoretical advances in convergence analysis and sampling guarantees for equality-constrained diffusion processes (Kurtz et al., 2024).

Advances in direct leaf-trajectory optimization continue to enhance the fidelity, safety, and efficiency of trajectory generation in both control and physical application domains. These developments enable the direct synthesis of rich, constraint-satisfying solutions within tractable computational frameworks (Papp et al., 2013, Kurtz et al., 2024, Chen et al., 6 Oct 2025).
