Convexified Shape Layers & Depth Ordering

Updated 18 January 2026
  • The paper presents a vectorization framework that decomposes quantized raster images into scalable vector shape layers with an explicit, energy-driven depth order.
  • It employs curvature-based convexification via variational inpainting to remove pixelation while preserving boundary smoothness and accurate occlusion relationships.
  • The pipeline integrates combinatorial graph construction, efficient Bézier curve fitting, and SVG export, achieving lower MSE and faster performance compared to previous methods.

Convexified shape layers with depth ordering is a framework for image vectorization that decomposes a color-quantized raster image into scalable vector shape layers with an explicit, energy-driven global depth order. The methodology integrates combinatorial graph construction, convexification via variational inpainting, and scalable SVG export with layer semantics. This approach addresses both the removal of pixelization and the preservation of boundary smoothness while simultaneously encoding relative depth relationships among segmented shapes. The pipeline leverages curvature-based inpainting to convexify occluded shapes, following a depth order inferred by pairwise covered-area measures, and outputs vector graphics amenable to editing and semantic grouping (Law et al., 2024).

1. Depth-Ordering Energy

Given an image grid $\Omega \subset \mathbb{Z}^2$ and a $K$-color quantization $f:\Omega\to\{c_\ell\}_{\ell=1}^K$, each connected component $S_i \subset \Omega$ of a single color defines a shape layer with binary mask $\chi_i(x) = 1_{S_i}(x)$. The pairwise covered-area measure for $i \neq j$ is

$$A(i,j) = \frac{\int_\Omega \chi_i(x)\,\chi_j^{\rm Conv}(x)\,dx}{\int_\Omega \chi_i(x)\,dx} \in [0,1],$$

where $\chi_j^{\rm Conv}$ is the mask of the convex hull of $S_j$. $A(i,j)$ is therefore the fraction of $S_i$'s area that lies inside the convex hull of $S_j$. The depth-ordering energy is then

$$D(i,j) = A(i,j) - A(j,i),\qquad D(i,j)\in [-1,1].$$
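The two measures above can be sketched on pixel sets as follows. This is a minimal illustration, not the paper's implementation: shapes are assumed to be sets of integer pixel coordinates, the hull is taken over pixel centers, and Andrew's monotone chain is used for the hull. The names `convex_hull`, `in_hull`, `covered_area`, and `depth_energy` are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def cross(o: Pixel, a: Pixel, b: Pixel) -> int:
    """z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(shape: Set[Pixel]) -> List[Pixel]:
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(shape)
    if len(pts) <= 2:
        return pts
    lower: List[Pixel] = []
    upper: List[Pixel] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(p: Pixel, hull: List[Pixel]) -> bool:
    """True if p is inside or on the hull (assumes a non-empty hull)."""
    xs = [q[0] for q in hull]
    ys = [q[1] for q in hull]
    # The bounding-box test also handles degenerate (point or segment) hulls.
    if not (min(xs) <= p[0] <= max(xs) and min(ys) <= p[1] <= max(ys)):
        return False
    n = len(hull)
    return all(cross(hull[k], hull[(k + 1) % n], p) >= 0 for k in range(n))


def covered_area(s_i: Set[Pixel], s_j: Set[Pixel]) -> float:
    """A(i, j): fraction of S_i lying inside the convex hull of S_j."""
    hull = convex_hull(s_j)
    return sum(in_hull(p, hull) for p in s_i) / len(s_i)


def depth_energy(s_i: Set[Pixel], s_j: Set[Pixel]) -> float:
    """D(i, j) = A(i, j) - A(j, i), a value in [-1, 1]."""
    return covered_area(s_i, s_j) - covered_area(s_j, s_i)


# A 3x3 block sitting inside a one-pixel-wide 5x5 frame.
frame = {(x, y) for x in range(5) for y in range(5) if x in (0, 4) or y in (0, 4)}
inner = {(x, y) for x in range(1, 4) for y in range(1, 4)}
print(depth_energy(inner, frame))
```

In this toy case the block lies entirely inside the frame's hull while no frame pixel lies inside the block's hull, so the energy takes its extreme value and the block would be placed above the frame.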

Thresholding $D(i,j)$ by $\delta>0$ leads to directed edges in a shape graph: if $D(i,j)>\delta$, $i$ is above $j$; if $D(i,j)<-\delta$, $j$ is above $i$; otherwise, no order is set. The resulting directed graph $G=(M,E)$, with one node for each $S_i$, may have cycles. Cycles are broken by identifying the edge $(i,j)$ in each cycle maximizing the convex-hull symmetric difference

$$V(i,j) = \int_\Omega \left(\chi_i(x) + \chi_j(x) - \chi_i(x)\,\chi_j^{\rm Conv}(x)\right)\,dx,$$

and removing edges accordingly until $G$ is acyclic. A topological sort extracts the linear order $\mathcal{D}:\{1,\ldots,N_s\}\to\{1,\ldots,N_s\}$. These steps are purely combinatorial: Graham scan for convex hulls ($O(n\log n)$) and raster intersections for area calculations, with no continuous optimization.
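The thresholding, cycle-breaking, and sorting steps can be sketched as below. This is an illustrative sketch under assumptions: `D` and `V` are taken as precomputed square matrices (nested lists), cycles are found one at a time with a depth-first search, and Kahn's algorithm performs the topological sort. None of these choices or function names come from the source.

```python
from collections import deque
from typing import List, Optional, Set, Tuple

Edge = Tuple[int, int]  # (i, j) means "shape i is above shape j"


def build_edges(D: List[List[float]], delta: float) -> Set[Edge]:
    """Threshold the depth energy into directed 'above' edges."""
    edges: Set[Edge] = set()
    n = len(D)
    for i in range(n):
        for j in range(i + 1, n):
            if D[i][j] > delta:
                edges.add((i, j))
            elif D[i][j] < -delta:
                edges.add((j, i))
    return edges


def find_cycle(n: int, edges: Set[Edge]) -> Optional[List[Edge]]:
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    adj = {u: sorted(v for (a, v) in edges if a == u) for u in range(n)}
    color = [0] * n  # 0 = unvisited, 1 = on the DFS stack, 2 = finished
    stack: List[int] = []

    def dfs(u: int) -> Optional[List[Edge]]:
        color[u] = 1
        stack.append(u)
        for v in adj[u]:
            if color[v] == 1:  # back edge closes a cycle
                cyc = stack[stack.index(v):] + [v]
                return list(zip(cyc, cyc[1:]))
            if color[v] == 0:
                found = dfs(v)
                if found:
                    return found
        stack.pop()
        color[u] = 2
        return None

    for s in range(n):
        if color[s] == 0:
            found = dfs(s)
            if found:
                return found
    return None


def break_cycles(n: int, edges: Set[Edge], V: List[List[float]]) -> Set[Edge]:
    """Delete the max-V edge of each detected cycle until none remain."""
    edges = set(edges)
    while True:
        cyc = find_cycle(n, edges)
        if cyc is None:
            return edges
        edges.remove(max(cyc, key=lambda e: V[e[0]][e[1]]))


def topological_order(n: int, edges: Set[Edge]) -> List[int]:
    """Kahn's algorithm; nodes are listed from top-most to bottom-most."""
    indeg = [0] * n
    for _, v in edges:
        indeg[v] += 1
    ready = deque(u for u in range(n) if indeg[u] == 0)
    order: List[int] = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for a, v in sorted(edges):
            if a == u:
                indeg[v] -= 1
                if indeg[v] == 0:
                    ready.append(v)
    return order


# Three shapes whose pairwise energies form the cycle 0 -> 1 -> 2 -> 0.
D = [[0.0, 0.5, -0.4], [-0.5, 0.0, 0.6], [0.4, -0.6, 0.0]]
V = [[0, 10, 30], [10, 0, 20], [30, 20, 0]]
dag = break_cycles(3, build_edges(D, delta=0.1), V)
print(topological_order(3, dag))
```

The recursive search is adequate for the few dozen shapes mentioned in the text; a much larger graph would call for an iterative traversal.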

2. Curvature-Based Convexification via Inpainting

Given the total order $\mathcal{D}$, each $S_i$ is convexified into $C_i\supseteq S_i$, constrained to grow only into occluded regions

$$O_i = \bigcup_{j:\mathcal{D}(j)\leq \mathcal{D}(i)} S_j \cup S_{\rm noise},$$

where $S_{\rm noise}$ groups small, spurious regions. The minimization target is Euler's elastica energy,

$$E(C_i) = \int_{\partial C_i}(a + b\,\kappa^2)\,ds,$$

for positive $a,b$ and curvature $\kappa$, promoting smooth, near-convex boundaries into $O_i$.
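As a quick check on how the two weights interact (a worked example, not taken from the paper), consider a circle of radius $R$, for which $\kappa = 1/R$ and the boundary length is $2\pi R$:

$$E(R) = 2\pi R\left(a + \frac{b}{R^2}\right) = 2\pi\left(aR + \frac{b}{R}\right), \qquad \frac{dE}{dR} = 2\pi\left(a - \frac{b}{R^2}\right) = 0 \;\Rightarrow\; R^* = \sqrt{b/a}, \quad E(R^*) = 4\pi\sqrt{ab}.$$

The weight $a$ penalizes long boundaries, $b$ penalizes tight bends, and $\sqrt{b/a}$ sets the length scale below which the bending term dominates.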

The sharp-interface elastica formulation is approximated with a Modica–Mortola diffuse interface:

  • The phase field $u:\Omega\to[-1,1]$ satisfies $u\approx +1$ inside, $u\approx -1$ outside.
  • The double-well potential $W(u) = (u-1)^2(u+1)^2$ penalizes intermediate values.

Diffuse elastica energy:

$$E_{\epsilon}(u) = \int_{O_i} \left[a\left(\frac{\epsilon}{2}|\nabla u|^2 + \frac{W(u)}{2\epsilon}\right) + \frac{b}{\epsilon}\left(\epsilon\Delta u - \frac{W'(u)}{2\epsilon}\right)^2 \right]dx + \sum_{p\in\mathcal{B}_i}\int_{B(p,r)\cap O_i}(u-\psi_p)^2\,dx$$

subject to $u=+1$ on $S_i$ and $u=-1$ on $\Omega\setminus O_i$. As $\epsilon\to 0$, this $\Gamma$-converges to sharp-interface elastica plus fidelity at inpainting data $\psi_p\in\{-1,0,1\}$.
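A one-dimensional check helps build intuition for the two bracketed terms. With the double well above, $W'(u) = 4u(u^2-1)$, and the profile $u(x) = \tanh(x/\epsilon)$ makes $\epsilon u'' - W'(u)/(2\epsilon)$ vanish, so a flat interface carries no curvature penalty, while the perimeter term integrates to $\int_{-1}^{1}(1-u^2)\,du = 4/3$ across the interface. The sketch below verifies both numerically with finite differences; the function names, step size, and interval are arbitrary choices for illustration.

```python
import math


def W(u: float) -> float:
    """Double-well potential (u - 1)^2 (u + 1)^2."""
    return (u - 1.0) ** 2 * (u + 1.0) ** 2


def dW(u: float) -> float:
    """Derivative W'(u) = 4 u (u^2 - 1)."""
    return 4.0 * u * (u * u - 1.0)


def profile(x: float, eps: float) -> float:
    """Candidate 1D interface profile tanh(x / eps)."""
    return math.tanh(x / eps)


def residual_and_energy(eps: float = 0.1, h: float = 1e-3, half_width: float = 1.0):
    """Return (max |eps u'' - W'(u)/(2 eps)|, perimeter-term integral) on [-half_width, half_width]."""
    n = int(round(2 * half_width / h))
    max_res = 0.0
    energy = 0.0
    for k in range(n + 1):
        x = -half_width + k * h
        u = profile(x, eps)
        up = profile(x + h, eps)
        um = profile(x - h, eps)
        du = (up - um) / (2 * h)            # central difference for u'
        d2u = (up - 2 * u + um) / (h * h)   # central difference for u''
        v = eps * d2u - dW(u) / (2 * eps)
        max_res = max(max_res, abs(v))
        energy += (0.5 * eps * du * du + W(u) / (2 * eps)) * h
    return max_res, energy


max_res, energy = residual_and_energy()
print(max_res, energy)
```

The residual is zero only up to discretization error, and the energy approaches $4/3$ independently of $\epsilon$ as long as the interval is wide compared with $\epsilon$.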

The solver introduces $v = \epsilon\Delta u - W'(u)/(2\epsilon)$ and splits the problem into two linear subproblems, each efficiently solved by FFT due to the choice of Laplacian discretization and Dirichlet boundary conditions. Thresholding $u>0$ yields the convexified region $C_i$.

3. Bézier Curve Fitting and SVG Layered Export

The boundary of $C_i$ is extracted as the zero-level set $\Gamma_i = \{u=0\}$, then sampled into a closed, ordered set of points $\{p_k\}$. Discrete curvature,

$$\kappa(p_k) = -2\,\frac{\det\left[\overrightarrow{p_kp_{k-h}},\,\overrightarrow{p_kp_{k+h}}\right]}{\|\overrightarrow{p_kp_{k-h}}\|\;\|\overrightarrow{p_kp_{k+h}}\|\;\|\overrightarrow{p_{k+h}p_{k-h}}\|},$$

is computed at each sample to identify curvature extrema above threshold $T$, which demarcate Bézier curve segments.
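The curvature formula and the extremum test can be sketched as follows. The formula is transcribed from the text; the local-maximum rule in `segment_breaks` (compare each sample with its two immediate neighbors) is an assumption, as are the function names.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def discrete_curvature(pts: Sequence[Point], k: int, h: int) -> float:
    """Three-point curvature at pts[k] using neighbors h samples away (closed curve)."""
    n = len(pts)
    p, pm, pp = pts[k], pts[(k - h) % n], pts[(k + h) % n]
    ax, ay = pm[0] - p[0], pm[1] - p[1]      # vector p_k -> p_{k-h}
    bx, by = pp[0] - p[0], pp[1] - p[1]      # vector p_k -> p_{k+h}
    cx, cy = pm[0] - pp[0], pm[1] - pp[1]    # vector p_{k+h} -> p_{k-h}
    det = ax * by - ay * bx
    denom = math.hypot(ax, ay) * math.hypot(bx, by) * math.hypot(cx, cy)
    return 0.0 if denom == 0 else -2.0 * det / denom


def segment_breaks(pts: Sequence[Point], h: int, T: float) -> List[int]:
    """Indices where |curvature| exceeds T and is a local maximum."""
    n = len(pts)
    kap = [abs(discrete_curvature(pts, k, h)) for k in range(n)]
    return [k for k in range(n)
            if kap[k] > T and kap[k] >= kap[k - 1] and kap[k] > kap[(k + 1) % n]]


# Counterclockwise circle of radius 5: curvature should be 1/5 at every sample.
circle = [(5 * math.cos(2 * math.pi * k / 64), 5 * math.sin(2 * math.pi * k / 64))
          for k in range(64)]
print(discrete_curvature(circle, 10, 3))

# Counterclockwise 10x10 square sampled at integer points: breaks at the corners.
square = ([(x, 0) for x in range(10)] + [(10, y) for y in range(10)]
          + [(x, 10) for x in range(10, 0, -1)] + [(0, y) for y in range(10, 0, -1)])
print(segment_breaks(square, h=2, T=0.5))
```

Because the three-point formula is the inverse radius of the circle through the three samples, it is exact on a circle for any stride `h`, and with the sign convention above a counterclockwise curve has positive curvature.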

For each segment, cubic Bézier curves with control points $P_0,\ldots,P_3$ are fitted via least-squares minimization,

$$\min_{P_0,\ldots,P_3}\,\sum_{q=1}^Q \left\|\sum_{i=0}^3\binom{3}{i}(1-t_q)^{3-i}t_q^i\,P_i-p_q\right\|^2,$$

with $t_q$ the normalized arc-length parameter of each point. Segments with Hausdorff distance exceeding a threshold $\tau$ are subdivided and re-fitted recursively.
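The least-squares problem above is linear in the control points, so it reduces to a 4x4 system of normal equations per coordinate. The sketch below is one way to solve it in plain Python; it assumes at least four distinct samples, uses cumulative polyline length as a stand-in for arc length, and reports the maximum residual at the sample parameters as a simple proxy for the Hausdorff test (the recursive subdivision itself is not shown). All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bernstein(t: float) -> List[float]:
    """Cubic Bernstein basis values at parameter t."""
    s = 1.0 - t
    return [s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3]


def chord_params(pts: Sequence[Point]) -> List[float]:
    """Normalized cumulative polyline length in [0, 1]."""
    d = [0.0]
    for a, b in zip(pts, pts[1:]):
        d.append(d[-1] + math.dist(a, b))
    return [x / d[-1] for x in d]


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_cubic_bezier(pts: Sequence[Point]) -> List[Point]:
    """Least-squares control points P0..P3 for the sampled segment."""
    B = [bernstein(t) for t in chord_params(pts)]
    N = [[sum(row[i] * row[j] for row in B) for j in range(4)] for i in range(4)]
    coords = []
    for dim in (0, 1):
        rhs = [sum(row[i] * p[dim] for row, p in zip(B, pts)) for i in range(4)]
        coords.append(solve(N, rhs))
    return list(zip(coords[0], coords[1]))


def max_residual(ctrl: Sequence[Point], pts: Sequence[Point]) -> float:
    """Largest distance between each sample and the curve at its parameter."""
    worst = 0.0
    for t, p in zip(chord_params(pts), pts):
        w = bernstein(t)
        q = (sum(wi * c[0] for wi, c in zip(w, ctrl)),
             sum(wi * c[1] for wi, c in zip(w, ctrl)))
        worst = max(worst, math.dist(p, q))
    return worst


# Samples on a straight segment: the fit should place P_i at thirds of the segment.
line = [(0.5 * k, float(k)) for k in range(7)]
ctrl = fit_cubic_bezier(line)
print(ctrl, max_residual(ctrl, line))
```

Leaving the endpoints $P_0$ and $P_3$ free matches the minimization as written; pinning them to the first and last samples would be a common variant that guarantees adjacent segments meet exactly.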

SVG export creates one <path> element per fitted Bézier, grouped by shape-layer index $i$. Layers are ordered in SVG z-order according to reverse $\mathcal{D}$ (bottom-to-top). Each path is filled with its original color $c$ and full opacity; SVG <g> wrappers encode grouping for semantic units.
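A minimal export sketch is shown below. SVG paints elements in document order, so writing the bottom-most layer first yields the intended stacking. As a simplifying assumption that differs from the description above, this sketch chains all cubic segments of a layer into one closed <path> so that the fill is well defined; the `layer-N` ids and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Cubic = Tuple[Point, Point, Point, Point]  # (start, control 1, control 2, end)


def path_data(segments: Sequence[Cubic]) -> str:
    """Build the 'd' attribute for a closed chain of cubic Bézier segments."""
    x0, y0 = segments[0][0]
    parts = [f"M {x0:g} {y0:g}"]
    for _, c1, c2, end in segments:
        parts.append(f"C {c1[0]:g} {c1[1]:g}, {c2[0]:g} {c2[1]:g}, {end[0]:g} {end[1]:g}")
    parts.append("Z")
    return " ".join(parts)


def export_svg(layers: Sequence[Tuple[str, Sequence[Cubic]]], width: int, height: int) -> str:
    """layers: (fill color, segments) pairs listed bottom-most first."""
    out: List[str] = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">'
    ]
    for idx, (color, segs) in enumerate(layers):
        out.append(f'  <g id="layer-{idx}">')
        out.append(f'    <path d="{path_data(segs)}" fill="{color}" fill-opacity="1"/>')
        out.append("  </g>")
    out.append("</svg>")
    return "\n".join(out)


blob = [((0, 0), (1, 0), (2, 1), (2, 2)), ((2, 2), (2, 3), (1, 3), (0, 0))]
svg = export_svg([("#ff0000", blob), ("#00ff00", blob)], 10, 10)
print(svg)
```

Keeping one group per layer means an editor can reorder, recolor, or hide a shape without touching the others.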

4. Algorithmic Workflow and Computational Complexity

The vectorization procedure follows:

  1. Color quantization (K-means) yielding $f:\Omega\to\{c_\ell\}$.
  2. Extraction of shape layers $\{S_i,\chi_i\}$; denoising forms $S_{\rm noise}$.
  3. For all $i<j$: compute convex hulls, $A(i,j)$, and $D(i,j)$; build directed graph $G$.
  4. Remove cycles in $G$ by deleting maximal $V(i,j)$ edges; topologically sort to obtain $\mathcal{D}$.
  5. For each $S_i$ (following $\mathcal{D}$): construct occlusion mask $O_i$; solve diffuse elastica $E_\epsilon(u_i)$ with FFT splitting; extract $C_i$; sample boundary and curvature extrema; fit cubic Béziers.
  6. Export SVG, stacking layers by $\mathcal{D}$.
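Step 2 of the workflow, extracting shape layers as single-color connected components, can be sketched with a breadth-first flood fill. The 4-connectivity and the size threshold `min_size` used to route small components into the noise set are assumptions made for illustration; the source does not specify either.

```python
from collections import deque
from typing import List, Sequence, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def shape_layers(f: Sequence[Sequence[int]], min_size: int = 1):
    """Split a label image into (label, pixel set) layers plus a noise pixel set."""
    rows, cols = len(f), len(f[0])
    seen = [[False] * cols for _ in range(rows)]
    layers: List[Tuple[int, Set[Pixel]]] = []
    noise: Set[Pixel] = set()
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            label = f[r][c]
            comp: Set[Pixel] = set()
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                comp.add((y, x))
                # Visit the four axis-aligned neighbors with the same label.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and f[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(comp) < min_size:
                noise |= comp          # small components are pooled as noise
            else:
                layers.append((label, comp))
    return layers, noise


quantized = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 0],
]
layers, noise = shape_layers(quantized, min_size=2)
print(len(layers), noise)
```

Note that the same color label can yield several layers: the isolated label-0 pixel in the corner is a separate component from the label-0 region at the top left, and here it falls below the size threshold.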

Computational complexity per stage:

  • Convex hull per shape: $O(n_i\log n_i)$.
  • Pairwise measures: $O(N_s^2 \cdot n_{\rm avg})$.
  • Cycle removal: $O(\#\text{cycles} \cdot |C|)$.
  • Inpainting: $O(|O_i|\,\log|O_i|)$ per shape.
  • Bézier fitting: linear in boundary length.

For typical settings ($N_s \approx 10$–$50$, image size $\leq 500\times 500$), the entire pipeline executes in tens of seconds on modern CPUs. The iterative elastica solver converges in $50$–$200$ iterations, with $\Gamma$-convergence ensuring alignment with the sharp-interface elastica minimizer as $\epsilon\to 0$.

5. Empirical Comparison with Prior Layered Vectorization Methods

Quantitative and qualitative performance was assessed against LIVE [Ma et al. 2022], DiffVG [Li et al. 2020], and LIVSS [Wang et al. 2024] on benchmark scenes (e.g., $400\times 400$ pixels). Representative numeric results:

| Method | # Bézier Curves | MSE↓ | PSNR↑ | Time (s) |
|---|---|---|---|---|
| Ours (≈7 layers) | 93 | 13.4 | 41.6 dB | 37 |
| LIVE (32 paths) | 128 | 28.6 | 38.3 dB | 20,640 |
| DiffVG (128 paths) | 517 | 71.4 | 34.4 dB | 194 |
| LIVSS | 200–500 | | | 888 |

This framework accurately recovers the correct depth ordering of occluded regions, yields fewer Bézier segments per semantic shape, produces lower rasterization error (MSE) than LIVE and DiffVG, and executes $10^2$–$10^3\times$ faster than LIVE in experimental runs.

Limitations include instability for quantizations generating very small noisy shapes (mitigated by pre-grouping), potential over-convexification of highly concave objects, and sensitivity to pairwise area cues where T-junctions are ambiguous.

6. Theoretical and Practical Implications

By convexifying image shape layers and establishing explicit, globally consistent depth orderings, this approach offers a principled tool for producing editable vector representations, compatible with human visual perceptual biases (e.g., boundary smoothness, convex completion). The integration of variational inpainting and combinatorial depth inference differentiates this pipeline from previous layer-based vectorization techniques.

The methodology is closely tied to $\Gamma$-convergence theory (ensuring the diffuse interface energy converges to the elastica), links to raster-to-vector learning paradigms, and provides a foundation for further integration of learned depth cues, interactive layer annotation, or GPU-accelerated elastica solvers. Grouping of shape layers for semantic vectorization is also considered, suggesting directions for future work in semantic abstraction and user-guided editing (Law et al., 2024).
