Convexified Shape Layers & Depth Ordering
- The paper presents a vectorization framework that decomposes quantized raster images into scalable vector shape layers with an explicit, energy-driven depth order.
- It employs curvature-based convexification via variational inpainting to remove pixelation while preserving boundary smoothness and accurate occlusion relationships.
- The pipeline integrates combinatorial graph construction, efficient Bézier curve fitting, and SVG export, achieving lower MSE and faster performance compared to previous methods.
Convexified shape layers with depth ordering is a framework for image vectorization that decomposes a color-quantized raster image into scalable vector shape layers with an explicit, energy-driven global depth order. The methodology integrates combinatorial graph construction, convexification via variational inpainting, and scalable SVG export with layer semantics. This approach addresses both the removal of pixelation and the preservation of boundary smoothness while simultaneously encoding relative depth relationships among segmented shapes. The pipeline leverages curvature-based inpainting to convexify occluded shapes, following a depth order inferred from pairwise covered-area measures, and outputs vector graphics amenable to editing and semantic grouping (Law et al., 2024).
1. Depth-Ordering Energy
Given an image grid $\Omega$ and a $K$-color quantization of the input, each connected component of a single color defines a shape layer $S_i$ with binary mask $M_i$. The pairwise covered-area measure for $(S_i, S_j)$ is
$$m_{ij} = \frac{|M_i \cap H_j|}{|M_i|},$$
where $H_j$ is the convex hull mask of $S_j$. This quantifies the fraction of $S_i$'s area occluded by $S_j$'s convex hull. The depth-ordering energy is then
$$E_{ij} = m_{ij} - m_{ji}.$$
Thresholding $E_{ij}$ by $\tau > 0$ leads to directed edges in a shape graph: if $E_{ij} > \tau$, $S_i$ is above $S_j$; if $E_{ij} < -\tau$, $S_j$ is above $S_i$; otherwise, no order is set. The resulting directed graph $G = (V, E)$, with a node for each $S_i$, may have cycles. Cycles are broken by identifying the edge $(i, j)$ in each cycle maximizing the convex-hull symmetric difference
$$d_{ij} = |H_i \,\triangle\, H_j|,$$
and removing edges accordingly until $G$ is acyclic. A topological sort then extracts the linear order $\sigma$. These steps are purely combinatorial: Graham scan for the convex hulls, raster intersections for the area calculations, and no continuous optimization.
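The graph-construction step above can be sketched in Python. This is a minimal illustration, not the paper's implementation: `convex_hull_mask` and `depth_edges` are hypothetical helper names, and the hull is rasterized with simple half-plane tests against a monotone-chain hull.

```python
import numpy as np
from itertools import combinations

def convex_hull_mask(mask):
    """Rasterized convex hull of a binary mask (plays the role of H_j)."""
    ys, xs = np.nonzero(mask)
    pts = sorted(set(zip(xs.tolist(), ys.tolist())))
    if len(pts) <= 2:
        return mask.astype(bool)
    cross = lambda o, a, b: (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:                      # Andrew's monotone chain: lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    hull = lower[:-1] + upper[:-1]     # CCW vertex list
    h, w = mask.shape
    X, Y = np.meshgrid(np.arange(w), np.arange(h))
    inside = np.ones((h, w), dtype=bool)
    for (x0, y0), (x1, y1) in zip(hull, hull[1:] + hull[:1]):
        # interior lies on the left of each CCW edge
        inside &= (x1-x0)*(Y-y0) - (y1-y0)*(X-x0) >= 0
    return inside

def depth_edges(masks, tau=0.1):
    """Directed (above, below) pairs from m_ij = |M_i ∩ H_j| / |M_i|."""
    hulls = [convex_hull_mask(m) for m in masks]
    edges = []
    for i, j in combinations(range(len(masks)), 2):
        m_ij = (masks[i] & hulls[j]).sum() / masks[i].sum()
        m_ji = (masks[j] & hulls[i]).sum() / masks[j].sum()
        if m_ij - m_ji > tau:
            edges.append((i, j))       # S_i is above S_j
        elif m_ji - m_ij > tau:
            edges.append((j, i))       # S_j is above S_i
    return edges
```

A shape whose mask heavily overlaps another shape's convex hull is the occluder: the occluded shape's hull fills in the hole the occluder punched in it.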
2. Curvature-Based Convexification via Inpainting
Given the total order $\sigma$, each $S_i$ is convexified into $\tilde{S}_i$, constrained to grow only into the occluded region
$$D_i = \bigcup_{j:\, S_j \text{ above } S_i} M_j \;\cup\; R_i,$$
where $R_i$ groups small, spurious regions. The minimization target is Euler's elastica energy,
$$E(\partial \tilde{S}_i) = \int_{\partial \tilde{S}_i} \left( a + b\,\kappa^2 \right) ds,$$
for positive weights $a, b$ and boundary curvature $\kappa$, promoting smooth, near-convex boundaries that may extend into $D_i$.
The sharp-interface elastica formulation is approximated with a Modica–Mortola diffuse interface:
- The phase field $\phi: \Omega \to [0,1]$ satisfies $\phi \approx 1$ inside $\tilde{S}_i$ and $\phi \approx 0$ outside.
- The double-well potential $W(\phi) = \phi^2 (1-\phi)^2$ penalizes intermediate values.

The diffuse elastica energy is
$$E_\epsilon(\phi) = \int_\Omega \left[\, a \left( \frac{\epsilon}{2} |\nabla \phi|^2 + \frac{W(\phi)}{\epsilon} \right) + \frac{b}{2\epsilon} \left( \epsilon\, \Delta \phi - \frac{W'(\phi)}{\epsilon} \right)^{\!2} \right] dx,$$
subject to $\phi = 1$ on $M_i$ and $\phi = 0$ on $\Omega \setminus (M_i \cup D_i)$. As $\epsilon \to 0$, this $\Gamma$-converges to the sharp-interface elastica plus fidelity at the inpainting data.
The solver introduces an auxiliary variable for the Laplacian term and splits the problem into two linear subproblems, each solved efficiently by FFT thanks to the choice of Laplacian discretization and Dirichlet boundary conditions. Thresholding $\phi$ at $1/2$ yields the convexified region $\tilde{S}_i = \{\phi \geq 1/2\}$.
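A stripped-down version of one such splitting step can be sketched as follows. This is a hedged illustration only: it evolves the plain Modica–Mortola (Allen–Cahn) gradient flow with a periodic FFT and re-imposes the constraints by projection, whereas the paper's solver also carries the curvature term and uses Dirichlet conditions; `allen_cahn_fft_step` is a hypothetical name.

```python
import numpy as np

def allen_cahn_fft_step(phi, fixed, fixed_vals, eps=2.0, dt=0.5):
    """One semi-implicit step of phi_t = Δphi - W'(phi)/eps²,
    with W(s) = s²(1-s)². The stiff Laplacian is treated implicitly
    and inverted in Fourier space in O(|Ω| log |Ω|); pixels marked
    `fixed` are reset to their known values afterwards."""
    h, w = phi.shape
    ky = 2*np.pi*np.fft.fftfreq(h)
    kx = 2*np.pi*np.fft.fftfreq(w)
    k2 = ky[:, None]**2 + kx[None, :]**2              # symbol of -Δ
    dW = 2*phi*(1 - phi)*(1 - 2*phi)                  # W'(phi)
    phi_hat = np.fft.fft2(phi - dt*dW/eps**2)         # explicit double-well part
    phi = np.real(np.fft.ifft2(phi_hat/(1 + dt*k2)))  # implicit diffusion solve
    phi[fixed] = fixed_vals[fixed]                    # hard data constraints
    return phi
```

Iterating this step diffuses the phase field into the free (occluded) region while the double-well term sharpens it back toward $\{0, 1\}$; thresholding the result at $1/2$ gives the inpainted mask.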
3. Bézier Curve Fitting and SVG Layered Export
The boundary of $\tilde{S}_i$ is extracted as the zero-level set of $\phi - 1/2$, then sampled into a closed, ordered set of points $\{p_k\}_{k=1}^{n}$. The discrete curvature
$$\kappa_k = \frac{2\,\big| (p_k - p_{k-1}) \times (p_{k+1} - p_k) \big|}{|p_k - p_{k-1}|\;|p_{k+1} - p_k|\;|p_{k+1} - p_{k-1}|}$$
is computed at each sample to identify curvature extrema above a threshold $\tau_\kappa$, which demarcate Bézier curve segments.
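One plausible discretization is the Menger (circumradius-based) curvature of consecutive sample triples; the sketch below assumes that formula and is not necessarily the paper's exact stencil.

```python
import numpy as np

def discrete_curvature(pts):
    """Menger curvature at each vertex of a closed polyline (n×2 array):
    kappa_k = 2|a × b| / (|a||b||c|) for the triangle formed by
    (p_{k-1}, p_k, p_{k+1}), i.e. the inverse circumradius."""
    prev = np.roll(pts, 1, axis=0)
    nxt = np.roll(pts, -1, axis=0)
    a = pts - prev                    # incoming edge
    b = nxt - pts                     # outgoing edge
    c = nxt - prev                    # chord spanning both
    cross = np.abs(a[:, 0]*b[:, 1] - a[:, 1]*b[:, 0])
    denom = (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
             * np.linalg.norm(c, axis=1))
    return 2*cross/np.maximum(denom, 1e-12)
```

For points sampled on a circle of radius $R$, this returns exactly $1/R$, which makes it easy to sanity-check thresholds such as $\tau_\kappa$.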
For each segment, a cubic Bézier curve
$$B(t) = \sum_{m=0}^{3} \binom{3}{m} (1-t)^{3-m}\, t^m\, P_m$$
with control points $P_0, \dots, P_3$ is fitted via least-squares minimization,
$$\min_{P_0, \dots, P_3} \sum_k \big\| B(t_k) - p_k \big\|^2,$$
with $t_k$ the normalized arc-length parameter of each point $p_k$. Segments whose Hausdorff distance to the samples exceeds a threshold are subdivided and re-fitted recursively.
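Once the parameters $t_k$ are fixed by chord length, the fit is linear in the four control points. A minimal numpy sketch (with a per-sample maximum error standing in for the true Hausdorff subdivision test):

```python
import numpy as np

def fit_cubic_bezier(pts):
    """Least-squares cubic Bézier fit: min over P of Σ_k ||B(t_k) - p_k||²,
    with t_k the normalized chord-length (arc-length) parameter.
    Returns the 4×2 control points and the fitted samples B(t_k)."""
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d/d[-1]                                   # normalized arc length
    # Bernstein basis matrix, one row per sample, one column per P_m
    B = np.stack([(1-t)**3, 3*(1-t)**2*t, 3*(1-t)*t**2, t**3], axis=1)
    ctrl, *_ = np.linalg.lstsq(B, pts, rcond=None)
    return ctrl, B @ ctrl

def max_deviation(pts, fitted):
    """Cheap per-sample error used in place of the Hausdorff distance."""
    return np.linalg.norm(pts - fitted, axis=1).max()
```

If `max_deviation` exceeds the tolerance, the segment would be split (e.g. at its worst sample) and each half re-fitted, mirroring the recursive subdivision described above.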
SVG export creates one `<path>` element per fitted Bézier, grouped by shape-layer index. Layers are ordered in SVG z-order according to the reverse of $\sigma$ (bottom-to-top). Each path is filled with its original color at full opacity; SVG `<g>` wrappers encode the grouping into semantic units.
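A toy exporter illustrating the layer stacking (hypothetical glue code, not the paper's exporter; for brevity each layer's closed Bézier chain becomes a single filled path rather than one `<path>` per curve):

```python
def beziers_to_svg(layers, size=(256, 256)):
    """Write layered cubic Béziers as SVG. `layers` is a list, bottom
    layer first, of (fill_color, segments); each segment is a 4-tuple
    of (x, y) control points. Later groups paint on top under SVG's
    painter's model, encoding the depth order."""
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{size[0]}" height="{size[1]}">']
    for idx, (color, segments) in enumerate(layers):
        parts.append(f'<g id="layer{idx}">')
        d = []
        for k, (p0, p1, p2, p3) in enumerate(segments):
            if k == 0:
                d.append(f'M {p0[0]:g} {p0[1]:g}')       # move to chain start
            d.append(f'C {p1[0]:g} {p1[1]:g}, '
                     f'{p2[0]:g} {p2[1]:g}, {p3[0]:g} {p3[1]:g}')
        d.append('Z')                                    # close the outline
        parts.append(f'<path d="{" ".join(d)}" fill="{color}"/>')
        parts.append('</g>')
    parts.append('</svg>')
    return '\n'.join(parts)
```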
4. Algorithmic Workflow and Computational Complexity
The vectorization procedure follows:
- Color quantization (K-means) yielding $K$ representative colors.
- Extraction of shape layers $\{S_i\}_{i=1}^{N}$; denoising removes small spurious components.
- For all pairs $(i, j)$: compute convex hulls, $m_{ij}$, and $m_{ji}$; build the directed graph $G$.
- Remove cycles in $G$ by deleting edges maximizing $d_{ij}$; topologically sort to obtain $\sigma$.
- For each $S_i$ (following $\sigma$): construct the occlusion mask $D_i$; solve the diffuse elastica with FFT splitting; extract $\tilde{S}_i$; sample the boundary and its curvature extrema; fit cubic Béziers.
- Export SVG, stacking layers by the reverse of $\sigma$.
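The first two steps (quantization and layer extraction) can be sketched as follows. This is illustrative only: a plain Lloyd iteration with deterministic seeding and a BFS connected-components pass, where a real pipeline would use optimized routines.

```python
import numpy as np
from collections import deque

def quantize(img, K=4, iters=10):
    """Plain Lloyd K-means on pixel colors (workflow step 1); centers
    are seeded from evenly spaced distinct colors for determinism."""
    px = img.reshape(-1, img.shape[-1]).astype(float)
    uniq = np.unique(px, axis=0)
    centers = uniq[np.linspace(0, len(uniq) - 1, K).astype(int)]
    for _ in range(iters):
        lbl = np.argmin(((px[:, None] - centers[None])**2).sum(-1), axis=1)
        for k in range(K):
            if np.any(lbl == k):
                centers[k] = px[lbl == k].mean(0)
    return lbl.reshape(img.shape[:2])

def shape_layers(labels):
    """4-connected components of each quantized color (the shape
    layers S_i, workflow step 2), returned as boolean masks."""
    h, w = labels.shape
    seen = np.zeros((h, w), bool)
    masks = []
    for y in range(h):
        for x in range(w):
            if seen[y, x]:
                continue
            mask = np.zeros((h, w), bool)
            q = deque([(y, x)]); seen[y, x] = True
            while q:                                # BFS flood fill
                cy, cx = q.popleft(); mask[cy, cx] = True
                for ny, nx in ((cy+1, cx), (cy-1, cx), (cy, cx+1), (cy, cx-1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny, nx]
                            and labels[ny, nx] == labels[y, x]):
                        seen[ny, nx] = True; q.append((ny, nx))
            masks.append(mask)
    return masks
```

The resulting masks feed directly into the pairwise covered-area stage; small components would be merged or dropped during denoising before the depth graph is built.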
Computational complexity per stage:
- Convex hull per shape: $O(n_i \log n_i)$ in the number of boundary pixels $n_i$ (Graham scan).
- Pairwise measures: $O(N^2)$ mask/hull intersections.
- Cycle removal: polynomial in the size of $G$ (repeated cycle detection plus edge deletion).
- Inpainting: $O(|\Omega| \log |\Omega|)$ per FFT iteration per shape.
- Bézier fitting: linear in boundary length.
For typical settings ($K$ up to $50$ colors) and moderate image sizes, the entire pipeline executes in tens of seconds on modern CPUs. The iterative elastica solver converges in $50$–$200$ iterations, with $\Gamma$-convergence ensuring alignment with the sharp-interface elastica minimizer as $\epsilon \to 0$.
5. Empirical Comparison with Prior Layered Vectorization Methods
Quantitative and qualitative performance was assessed against LIVE [Ma et al. 2022], DiffVG [Li et al. 2020], and LIVSS [Wang et al. 2024] on benchmark scenes. Representative numeric results:
| Method | # Bézier Curves | MSE ↓ | PSNR ↑ (dB) | Time (s) |
|---|---|---|---|---|
| Ours (≈7 layers) | 93 | 13.4 | 41.6 | 37 |
| LIVE (32 paths) | 128 | 28.6 | 38.3 | 20,640 |
| DiffVG (128 paths) | 517 | 71.4 | 34.4 | 194 |
| LIVSS | 200–500 | – | – | 888 |
This framework accurately recovers the correct depth ordering of occluded regions, yields fewer Bézier segments per semantic shape, produces lower rasterization error (MSE) than LIVE and DiffVG, and executes orders of magnitude faster than LIVE in experimental runs.
Limitations include instability for quantizations generating very small noisy shapes (mitigated by pre-grouping), potential over-convexification of highly concave objects, and sensitivity to pairwise area cues where T-junctions are ambiguous.
6. Theoretical and Practical Implications
By convexifying image shape layers and establishing explicit, globally consistent depth orderings, this approach offers a principled tool for producing editable vector representations, compatible with human visual perceptual biases (e.g., boundary smoothness, convex completion). The integration of variational inpainting and combinatorial depth inference differentiates this pipeline from previous layer-based vectorization techniques.
The methodology is closely tied to $\Gamma$-convergence theory (ensuring the diffuse-interface energy converges to the elastica), links to raster-to-vector learning paradigms, and provides a foundation for further integration of learned depth cues, interactive layer annotation, or GPU-accelerated elastica solvers. Grouping of shape layers for semantic vectorization is also considered, suggesting directions for future work in semantic abstraction and user-guided editing (Law et al., 2024).