Convexified Shape Layers & Depth Ordering
- The paper presents a vectorization framework that decomposes quantized raster images into scalable vector shape layers with an explicit, energy-driven depth order.
- It employs curvature-based convexification via variational inpainting to remove pixelation while preserving boundary smoothness and accurate occlusion relationships.
- The pipeline integrates combinatorial graph construction, efficient Bézier curve fitting, and SVG export, achieving lower MSE and faster performance compared to previous methods.
Convexified shape layers with depth ordering is a framework for image vectorization that decomposes a color-quantized raster image into scalable vector shape layers with an explicit, energy-driven global depth order. The methodology integrates combinatorial graph construction, convexification via variational inpainting, and scalable SVG export with layer semantics. This approach addresses both the removal of pixelation and the preservation of boundary smoothness while simultaneously encoding relative depth relationships among segmented shapes. The pipeline leverages curvature-based inpainting to convexify occluded shapes, following a depth order inferred from pairwise covered-area measures, and outputs vector graphics amenable to editing and semantic grouping (Law et al., 2024).
1. Depth-Ordering Energy
Given an image grid $\Omega$ and a $K$-color quantization of the input, each connected component of a single color defines a shape layer $S_i$ with binary mask $M_i$. The pairwise covered-area measure for $(S_i, S_j)$ is
$$m_{ij} = \frac{|M_i \cap H_j|}{|M_i|},$$
where $H_j$ is the convex hull mask of $S_j$. This quantifies the fraction of $S_i$'s area occluded by $S_j$'s convex hull. The depth-ordering energy is then
$$E_{ij} = m_{ij} - m_{ji}.$$
Thresholding $E_{ij}$ by $\tau > 0$ leads to directed edges in a shape graph: if $E_{ij} > \tau$, $S_i$ is above $S_j$; if $E_{ij} < -\tau$, $S_j$ is above $S_i$; otherwise, no order is set. The resulting directed graph $G = (V, E)$, with a node for each $S_i$, may have cycles. Cycles are broken by identifying the edge $(i, j)$ in each cycle maximizing the convex-hull symmetric difference
$$d_{ij} = |H_i \,\triangle\, H_j|,$$
and removing edges accordingly until $G$ is acyclic. A topological sort then extracts the linear order $\sigma$. These steps are purely combinatorial: Graham scan for the convex hulls, raster intersections for the area calculations, and no continuous optimization.
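The graph-construction step above can be sketched in Python. This is a minimal illustration, not the paper's implementation: `convex_hull_mask` and `depth_edges` are hypothetical helper names, and the hull is rasterized with simple half-plane tests against a monotone-chain hull.

```python
import numpy as np
from itertools import combinations

def convex_hull_mask(mask):
    """Rasterized convex hull of a binary mask (plays the role of H_j)."""
    ys, xs = np.nonzero(mask)
    pts = sorted(set(zip(xs.tolist(), ys.tolist())))
    if len(pts) <= 2:
        return mask.astype(bool)
    cross = lambda o, a, b: (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:                      # Andrew's monotone chain: lower hull
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # upper hull
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    hull = lower[:-1] + upper[:-1]     # CCW vertex list
    h, w = mask.shape
    X, Y = np.meshgrid(np.arange(w), np.arange(h))
    inside = np.ones((h, w), dtype=bool)
    for (x0, y0), (x1, y1) in zip(hull, hull[1:] + hull[:1]):
        # interior lies on the left of each CCW edge
        inside &= (x1-x0)*(Y-y0) - (y1-y0)*(X-x0) >= 0
    return inside

def depth_edges(masks, tau=0.1):
    """Directed (above, below) pairs from m_ij = |M_i ∩ H_j| / |M_i|."""
    hulls = [convex_hull_mask(m) for m in masks]
    edges = []
    for i, j in combinations(range(len(masks)), 2):
        m_ij = (masks[i] & hulls[j]).sum() / masks[i].sum()
        m_ji = (masks[j] & hulls[i]).sum() / masks[j].sum()
        if m_ij - m_ji > tau:
            edges.append((i, j))       # S_i is above S_j
        elif m_ji - m_ij > tau:
            edges.append((j, i))       # S_j is above S_i
    return edges
```

A shape whose mask heavily overlaps another shape's convex hull is the occluder: the occluded shape's hull fills in the hole the occluder punched in it.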
2. Curvature-Based Convexification via Inpainting
Given the total order $\sigma$, each $S_i$ is convexified into $\tilde{S}_i$, constrained to grow only into the occluded region
$$D_i = \bigcup_{j:\, S_j \text{ above } S_i} M_j \;\cup\; R_i,$$
where $R_i$ groups small, spurious regions. The minimization target is Euler's elastica energy,
$$E(\partial \tilde{S}_i) = \int_{\partial \tilde{S}_i} \left( a + b\,\kappa^2 \right) ds,$$
for positive weights $a, b$ and boundary curvature $\kappa$, promoting smooth, near-convex boundaries that may extend into $D_i$.
The sharp-interface elastica formulation is approximated with a Modica–Mortola diffuse interface:
- The phase field $\phi: \Omega \to [0,1]$ satisfies $\phi \approx 1$ inside $\tilde{S}_i$ and $\phi \approx 0$ outside.
- The double-well potential $W(\phi) = \phi^2 (1-\phi)^2$ penalizes intermediate values.

The diffuse elastica energy is
$$E_\epsilon(\phi) = \int_\Omega \left[\, a \left( \frac{\epsilon}{2} |\nabla \phi|^2 + \frac{W(\phi)}{\epsilon} \right) + \frac{b}{2\epsilon} \left( \epsilon\, \Delta \phi - \frac{W'(\phi)}{\epsilon} \right)^{\!2} \right] dx,$$
subject to $\phi = 1$ on $M_i$ and $\phi = 0$ on $\Omega \setminus (M_i \cup D_i)$. As $\epsilon \to 0$, this $\Gamma$-converges to the sharp-interface elastica plus fidelity at the inpainting data.
The solver introduces an auxiliary variable for the Laplacian term and splits the problem into two linear subproblems, each solved efficiently by FFT thanks to the choice of Laplacian discretization and Dirichlet boundary conditions. Thresholding $\phi$ at $1/2$ yields the convexified region $\tilde{S}_i = \{\phi \geq 1/2\}$.
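A stripped-down version of one such splitting step can be sketched as follows. This is a hedged illustration only: it evolves the plain Modica–Mortola (Allen–Cahn) gradient flow with a periodic FFT and re-imposes the constraints by projection, whereas the paper's solver also carries the curvature term and uses Dirichlet conditions; `allen_cahn_fft_step` is a hypothetical name.

```python
import numpy as np

def allen_cahn_fft_step(phi, fixed, fixed_vals, eps=2.0, dt=0.5):
    """One semi-implicit step of phi_t = Δphi - W'(phi)/eps²,
    with W(s) = s²(1-s)². The stiff Laplacian is treated implicitly
    and inverted in Fourier space in O(|Ω| log |Ω|); pixels marked
    `fixed` are reset to their known values afterwards."""
    h, w = phi.shape
    ky = 2*np.pi*np.fft.fftfreq(h)
    kx = 2*np.pi*np.fft.fftfreq(w)
    k2 = ky[:, None]**2 + kx[None, :]**2              # symbol of -Δ
    dW = 2*phi*(1 - phi)*(1 - 2*phi)                  # W'(phi)
    phi_hat = np.fft.fft2(phi - dt*dW/eps**2)         # explicit double-well part
    phi = np.real(np.fft.ifft2(phi_hat/(1 + dt*k2)))  # implicit diffusion solve
    phi[fixed] = fixed_vals[fixed]                    # hard data constraints
    return phi
```

Iterating this step diffuses the phase field into the free (occluded) region while the double-well term sharpens it back toward $\{0, 1\}$; thresholding the result at $1/2$ gives the inpainted mask.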
3. Bézier Curve Fitting and SVG Layered Export
The boundary of $\tilde{S}_i$ is extracted as the zero-level set of $\phi - 1/2$, then sampled into a closed, ordered set of points $\{p_k\}_{k=1}^{n}$. The discrete curvature
$$\kappa_k = \frac{2\,\big| (p_k - p_{k-1}) \times (p_{k+1} - p_k) \big|}{|p_k - p_{k-1}|\;|p_{k+1} - p_k|\;|p_{k+1} - p_{k-1}|}$$
is computed at each sample to identify curvature extrema above a threshold $\tau_\kappa$, which demarcate Bézier curve segments.
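One plausible discretization is the Menger (circumradius-based) curvature of consecutive sample triples; the sketch below assumes that formula and is not necessarily the paper's exact stencil.

```python
import numpy as np

def discrete_curvature(pts):
    """Menger curvature at each vertex of a closed polyline (n×2 array):
    kappa_k = 2|a × b| / (|a||b||c|) for the triangle formed by
    (p_{k-1}, p_k, p_{k+1}), i.e. the inverse circumradius."""
    prev = np.roll(pts, 1, axis=0)
    nxt = np.roll(pts, -1, axis=0)
    a = pts - prev                    # incoming edge
    b = nxt - pts                     # outgoing edge
    c = nxt - prev                    # chord spanning both
    cross = np.abs(a[:, 0]*b[:, 1] - a[:, 1]*b[:, 0])
    denom = (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
             * np.linalg.norm(c, axis=1))
    return 2*cross/np.maximum(denom, 1e-12)
```

For points sampled on a circle of radius $R$, this returns exactly $1/R$, which makes it easy to sanity-check thresholds such as $\tau_\kappa$.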
For each segment, a cubic Bézier curve
$$B(t) = \sum_{m=0}^{3} \binom{3}{m} (1-t)^{3-m}\, t^m\, P_m$$
with control points $P_0, \dots, P_3$ is fitted via least-squares minimization,
$$\min_{P_0, \dots, P_3} \sum_k \big\| B(t_k) - p_k \big\|^2,$$
with $t_k$ the normalized arc-length parameter of each point $p_k$. Segments whose Hausdorff distance to the samples exceeds a threshold are subdivided and re-fitted recursively.
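Once the parameters $t_k$ are fixed by chord length, the fit is linear in the four control points. A minimal numpy sketch (with a per-sample maximum error standing in for the true Hausdorff subdivision test):

```python
import numpy as np

def fit_cubic_bezier(pts):
    """Least-squares cubic Bézier fit: min over P of Σ_k ||B(t_k) - p_k||²,
    with t_k the normalized chord-length (arc-length) parameter.
    Returns the 4×2 control points and the fitted samples B(t_k)."""
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d/d[-1]                                   # normalized arc length
    # Bernstein basis matrix, one row per sample, one column per P_m
    B = np.stack([(1-t)**3, 3*(1-t)**2*t, 3*(1-t)*t**2, t**3], axis=1)
    ctrl, *_ = np.linalg.lstsq(B, pts, rcond=None)
    return ctrl, B @ ctrl

def max_deviation(pts, fitted):
    """Cheap per-sample error used in place of the Hausdorff distance."""
    return np.linalg.norm(pts - fitted, axis=1).max()
```

If `max_deviation` exceeds the tolerance, the segment would be split (e.g. at its worst sample) and each half re-fitted, mirroring the recursive subdivision described above.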
SVG export creates one `<path>` element per fitted Bézier, grouped by shape-layer index. Layers are ordered in SVG z-order according to the reverse of $\sigma$ (bottom-to-top). Each path is filled with its original color at full opacity; SVG `<g>` wrappers encode the grouping into semantic units.
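A toy exporter illustrating the layer stacking (hypothetical glue code, not the paper's exporter; for brevity each layer's closed Bézier chain becomes a single filled path rather than one `<path>` per curve):

```python
def beziers_to_svg(layers, size=(256, 256)):
    """Write layered cubic Béziers as SVG. `layers` is a list, bottom
    layer first, of (fill_color, segments); each segment is a 4-tuple
    of (x, y) control points. Later groups paint on top under SVG's
    painter's model, encoding the depth order."""
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{size[0]}" height="{size[1]}">']
    for idx, (color, segments) in enumerate(layers):
        parts.append(f'<g id="layer{idx}">')
        d = []
        for k, (p0, p1, p2, p3) in enumerate(segments):
            if k == 0:
                d.append(f'M {p0[0]:g} {p0[1]:g}')       # move to chain start
            d.append(f'C {p1[0]:g} {p1[1]:g}, '
                     f'{p2[0]:g} {p2[1]:g}, {p3[0]:g} {p3[1]:g}')
        d.append('Z')                                    # close the outline
        parts.append(f'<path d="{" ".join(d)}" fill="{color}"/>')
        parts.append('</g>')
    parts.append('</svg>')
    return '\n'.join(parts)
```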
4. Algorithmic Workflow and Computational Complexity
The vectorization procedure follows:
- Color quantization (K-means) yielding $K$ representative colors.
- Extraction of shape layers $\{S_i\}_{i=1}^{N}$; denoising removes small spurious components.
- For all pairs $(i, j)$: compute convex hulls, $m_{ij}$, and $m_{ji}$; build the directed graph $G$.
- Remove cycles in $G$ by deleting edges maximizing $d_{ij}$; topologically sort to obtain $\sigma$.
- For each $S_i$ (following $\sigma$): construct the occlusion mask $D_i$; solve the diffuse elastica with FFT splitting; extract $\tilde{S}_i$; sample the boundary and its curvature extrema; fit cubic Béziers.
- Export SVG, stacking layers by the reverse of $\sigma$.
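The first two steps (quantization and layer extraction) can be sketched as follows. This is illustrative only: a plain Lloyd iteration with deterministic seeding and a BFS connected-components pass, where a real pipeline would use optimized routines.

```python
import numpy as np
from collections import deque

def quantize(img, K=4, iters=10):
    """Plain Lloyd K-means on pixel colors (workflow step 1); centers
    are seeded from evenly spaced distinct colors for determinism."""
    px = img.reshape(-1, img.shape[-1]).astype(float)
    uniq = np.unique(px, axis=0)
    centers = uniq[np.linspace(0, len(uniq) - 1, K).astype(int)]
    for _ in range(iters):
        lbl = np.argmin(((px[:, None] - centers[None])**2).sum(-1), axis=1)
        for k in range(K):
            if np.any(lbl == k):
                centers[k] = px[lbl == k].mean(0)
    return lbl.reshape(img.shape[:2])

def shape_layers(labels):
    """4-connected components of each quantized color (the shape
    layers S_i, workflow step 2), returned as boolean masks."""
    h, w = labels.shape
    seen = np.zeros((h, w), bool)
    masks = []
    for y in range(h):
        for x in range(w):
            if seen[y, x]:
                continue
            mask = np.zeros((h, w), bool)
            q = deque([(y, x)]); seen[y, x] = True
            while q:                                # BFS flood fill
                cy, cx = q.popleft(); mask[cy, cx] = True
                for ny, nx in ((cy+1, cx), (cy-1, cx), (cy, cx+1), (cy, cx-1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny, nx]
                            and labels[ny, nx] == labels[y, x]):
                        seen[ny, nx] = True; q.append((ny, nx))
            masks.append(mask)
    return masks
```

The resulting masks feed directly into the pairwise covered-area stage; small components would be merged or dropped during denoising before the depth graph is built.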
Computational complexity per stage:
- Convex hull per shape: $O(n_i \log n_i)$ in the number of boundary pixels $n_i$ (Graham scan).
- Pairwise measures: $O(N^2)$ mask/hull intersections.
- Cycle removal: polynomial in the size of $G$ (repeated cycle detection plus edge deletion).
- Inpainting: $O(|\Omega| \log |\Omega|)$ per FFT iteration per shape.
- Bézier fitting: linear in boundary length.
For typical settings ($K$ up to $50$ colors) and moderate image sizes, the entire pipeline executes in tens of seconds on modern CPUs. The iterative elastica solver converges in $50$–$200$ iterations, with $\Gamma$-convergence ensuring alignment with the sharp-interface elastica minimizer as $\epsilon \to 0$.
5. Empirical Comparison with Prior Layered Vectorization Methods
Quantitative and qualitative performance was assessed against LIVE [Ma et al. 2022], DiffVG [Li et al. 2020], and LIVSS [Wang et al. 2024] on benchmark scenes. Representative numeric results:
| Method | # Bézier Curves | MSE ↓ | PSNR ↑ (dB) | Time (s) |
|---|---|---|---|---|
| Ours (≈7 layers) | 93 | 13.4 | 41.6 | 37 |
| LIVE (32 paths) | 128 | 28.6 | 38.3 | 20,640 |
| DiffVG (128 paths) | 517 | 71.4 | 34.4 | 194 |
| LIVSS | 200–500 | – | – | 888 |
This framework accurately recovers the correct depth ordering of occluded regions, yields fewer Bézier segments per semantic shape, produces lower rasterization error (MSE) than LIVE and DiffVG, and executes orders of magnitude faster than LIVE in experimental runs.
Limitations include instability for quantizations generating very small noisy shapes (mitigated by pre-grouping), potential over-convexification of highly concave objects, and sensitivity to pairwise area cues where T-junctions are ambiguous.
6. Theoretical and Practical Implications
By convexifying image shape layers and establishing explicit, globally consistent depth orderings, this approach offers a principled tool for producing editable vector representations, compatible with human visual perceptual biases (e.g., boundary smoothness, convex completion). The integration of variational inpainting and combinatorial depth inference differentiates this pipeline from previous layer-based vectorization techniques.
The methodology is closely tied to $\Gamma$-convergence theory (ensuring the diffuse-interface energy converges to the elastica), links to raster-to-vector learning paradigms, and provides a foundation for further integration of learned depth cues, interactive layer annotation, or GPU-accelerated elastica solvers. Grouping of shape layers for semantic vectorization is also considered, suggesting directions for future work in semantic abstraction and user-guided editing (Law et al., 2024).