SwiftTailor: Efficient 3D Garment Generation with Geometry Image Representation
Abstract: Realistic and efficient 3D garment generation remains a longstanding challenge in computer vision and digital fashion. Existing methods typically rely on large vision-LLMs to produce serialized representations of 2D sewing patterns, which are then transformed into simulation-ready 3D meshes using garment modeling frameworks such as GarmentCode. Although these approaches yield high-quality results, they often suffer from slow inference, ranging from 30 seconds to a minute. In this work, we introduce SwiftTailor, a novel two-stage framework that unifies sewing-pattern reasoning and geometry-based mesh synthesis through a compact geometry image representation. SwiftTailor comprises two lightweight modules: PatternMaker, an efficient vision-LLM that predicts sewing patterns from diverse input modalities, and GarmentSewer, an efficient dense prediction transformer that converts these patterns into a novel Garment Geometry Image, encoding the 3D surfaces of all garment panels in a unified UV space. The final 3D mesh is reconstructed through an efficient inverse mapping process that incorporates remeshing and dynamic stitching algorithms to assemble the garment directly, thereby amortizing the cost of physical simulation. Extensive experiments on the Multimodal GarmentCodeData benchmark demonstrate that SwiftTailor achieves state-of-the-art accuracy and visual fidelity while significantly reducing inference time. This work offers a scalable, interpretable, and high-performance solution for next-generation 3D garment generation.
Explain it Like I'm 14
SwiftTailor: A simple explanation for teens
What’s this paper about?
This paper introduces SwiftTailor, a fast and smart way for a computer to make 3D clothes (like shirts, skirts, or hoodies) from a picture or a text description. The big idea is to follow how real clothes are made—start with flat sewing patterns—then quickly turn those flat pieces into a 3D garment without running slow physics simulations.
What were the researchers trying to do?
They focused on two main goals:
- Make 3D clothes that look realistic and are built the way real clothes are (from sewing patterns), so they’re easy to understand, edit, and even manufacture.
- Do it much faster than usual methods, which often take half a minute or more because they simulate how cloth moves and drapes.
How does SwiftTailor work? (Methods in everyday terms)
Think of real clothing design:
- First, you draw and cut flat paper shapes (sewing patterns).
- Then you sew the edges together to form a 3D item.
SwiftTailor follows this in two stages:
- PatternMaker: a small AI that makes the “paper pieces”
- Input: a picture of a garment, a text description (“a blue hoodie with a front pocket”), or both.
- Output: a sewing pattern—flat panels (front, back, sleeves, etc.) and instructions for which edges should be sewn together.
- Analogy: It’s like a helper that looks at a reference and writes a clear recipe of the pieces you need and how to join them.
- GarmentSewer: a fast tool that “sews” the pieces into 3D—without heavy physics
- Instead of simulating cloth step-by-step (which is slow), it predicts a special “picture” called a Garment Geometry Image (GGI).
- What’s a GGI? Imagine you took all the flat pattern pieces and packed them neatly into a square image, like laying countries on a map. Each pixel stores the 3D position of the cloth at that spot—so a 2D image now “contains” a 3D shape.
- The GGI actually has three aligned images working together:
- A semantic image: which part each pixel belongs to (e.g., left sleeve, collar).
- A geometry image: the 3D coordinates per pixel (where it sits in 3D).
- A stitching image: which edges should be joined (matching colors mean “sew these edges together”).
- After predicting these images, a quick “post-processing” step cuts the shapes from the image and zips (stitches) matching edges—like reconnecting puzzle pieces—into one clean 3D clothing mesh.
Helpful analogy:
- “UV space” (where they pack the panels) is like flattening a globe into a world map.
- The “geometry image” is like a paint-by-numbers sheet where each pixel doesn’t just hold a color—it holds a 3D point. Read the whole image, and the 3D garment pops out.
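The "read the image, get the 3D shape" idea above can be sketched in a few lines of NumPy. This is a toy illustration only, not the paper's actual format; the resolution, panel labels, and array names are all hypothetical:

```python
import numpy as np

H = W = 8  # tiny illustrative resolution; a real GGI would be much larger
semantic = np.zeros((H, W), dtype=np.int32)       # per-pixel panel id (0 = empty)
geometry = np.zeros((H, W, 3), dtype=np.float32)  # per-pixel 3D position (x, y, z)

# Fill a toy "front panel" region: a flat 4x4 patch of cloth in 3D
ys, xs = np.meshgrid(np.arange(4), np.arange(4), indexing="ij")
semantic[0:4, 0:4] = 1  # label 1 = front panel (hypothetical)
geometry[0:4, 0:4] = np.stack(
    [xs * 0.1, ys * 0.1, np.zeros_like(xs, dtype=np.float32)], axis=-1
)

# "Reading out" the garment: every non-empty pixel yields a labeled 3D point
mask = semantic > 0
points = geometry[mask]   # (16, 3) point cloud of the front panel
labels = semantic[mask]   # which panel each point belongs to
print(points.shape)
```

The real pipeline additionally carries a stitching image and runs remeshing to recover mesh connectivity, but the core trick is exactly this: a 2D array whose pixels store 3D coordinates.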
What did they find, and why is it important?
- It’s much faster: The neural part of the second stage (predicting the geometry image) runs in about 0.02 seconds, plus a few seconds of post-processing — not the tens of seconds physics-based pipelines need. Overall, the whole pipeline is roughly 4× faster than common alternatives that depend on physics engines.
- It’s accurate and robust: Their system makes high-quality clothes with correct seams (edges line up cleanly), and it scored better on standard datasets than other methods. It works from images, text, or both.
- It’s practical: Because it keeps the sewing pattern, the result is easy to understand, edit, and reuse in design tools. And while it skips slow physics during construction, the final 3D clothes are still compatible with later simulations if you want extra details like draping or wrinkles.
What could this change in the real world?
- Faster design cycles: Fashion designers, game developers, and AR/VR creators can prototype 3D clothes quickly and adjust styles on the fly.
- Easier collaboration: The system’s “pattern-first” approach is understandable to both humans and machines, which helps teams edit and manufacture designs.
- Scalable and flexible: Because it avoids slow physics at the core “sewing” step, it can handle lots of garments quickly—useful for large collections, try-on apps, or content creation.
The authors also note future directions, like making the pattern generation step even snappier and adding realistic textures and wrinkles without needing heavy physics.
Knowledge Gaps
Knowledge gaps, limitations, and open questions
Below is a concise, actionable list of what remains uncertain or unexplored in the paper and would benefit from follow-up research.
Representation and reconstruction
- Sensitivity to GGI resolution and packing: How geometry quality scales with the geometry image resolution, and how different UV packing strategies (panel layout, spacing, aspect ratios) affect reconstruction fidelity, seam alignment, and artifacts.
- Invertibility accuracy: Quantitative error introduced by rasterization, interpolation, and inverse mapping (remeshing + stitching) is not reported; no ablation on aliasing/blur vs. fine detail retention (e.g., sharp seam features, darts, pleats).
- Seam correctness beyond proximity: The stitching loss enforces edge proximity via Chamfer distance but not orientation, arclength matching, or one-to-one correspondence; it is unclear how the method prevents seam twists, length mismatches, or layer inversions without a physics solver.
- Watertightness and manifoldness: No statistics on rates of watertight meshes, non-manifold edges, self-intersections, or seam gaps after post-processing.
- Global scale and placement: 3D coordinates are normalized, but how global scale, garment thickness, and placement relative to a body or world frame are recovered or standardized is unspecified.
- Panel-level topology variation: It is unclear how the framework handles garments with complex topology (e.g., slits, holes, multi-piece collars), non-disk panels, or panels requiring multiple charts within a panel.
- Multi-layer garments and accessories: Current GGI focuses on panels and seams; representation of pockets, linings, hoods with inner layers, belts, zippers, buttons, and trims is not specified.
- Material-aware geometry: The GGI encodes only surface geometry; there is no representation or prediction of thickness, multilayer offsets, or structural features (e.g., seam allowances, hem rolls).
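The "seam correctness beyond proximity" gap above can be made concrete with a small sketch. This is a hypothetical helper, not the paper's loss: Chamfer distance rewards edge proximity but is invariant to point order, so a reversed (twisted) seam scores exactly the same as a correctly oriented one.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

# Two seam edges sampled along matching boundaries
edge_a = np.stack([np.linspace(0, 1, 10), np.zeros(10), np.zeros(10)], axis=-1)
edge_b = edge_a + np.array([0.0, 0.01, 0.0])  # nearly aligned copy

print(chamfer_distance(edge_a, edge_b))        # ≈ 0.02: close edges score well
# Chamfer is order-invariant: reversing one edge (a "twisted" seam) does
# not change the point set, so the score is identical
print(chamfer_distance(edge_a, edge_b[::-1]))  # also ≈ 0.02
```

This is why proximity-only losses cannot by themselves rule out seam twists or length mismatches without extra correspondence or orientation terms.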
Learning and generalization
- Error propagation from PatternMaker: Robustness of GarmentSewer to incorrect or partially wrong patterns/stitch maps is not quantified; no mechanism for uncertainty-aware assembly or correction.
- Generalization to unseen garment categories: The method is trained on GCD-MM; performance on categories, panel taxonomies, or seam conventions not present in this dataset remains unknown.
- Zero-shot extensibility: It is unclear whether GarmentSewer can adapt to new panel types or taxonomies without retraining, despite claims of modularity.
- Domain gap to real-world inputs: Generalization from synthetic/curated datasets (e.g., GCD-MM) to in-the-wild photos, noisy sketches, or real CAD patterns is not evaluated.
- Body shape/pose conditioning: The approach does not model pose- or shape-conditioned drape; how geometry changes with SMPL shape/pose or different mannequins is not studied.
Physical plausibility and downstream simulation
- Absence of physical parameter modeling: No estimation or conditioning on fabric parameters (stretch, bending, shear) or seam properties; unclear how results behave under downstream physics-based draping.
- Dynamic behavior and wrinkles: The method produces static meshes; generating fine wrinkles, fold structures, and dynamic cloth behavior without simulation is deferred to future work and remains open.
- Simulation-readiness at scale: Although “compatible with downstream simulation” is claimed, there is no quantitative validation (e.g., collision stability, convergence rate, or failure rate) across diverse garments in standard physics engines.
Training and losses
- Edge-aware loss design: Weighting and band width hyperparameters for edge-aware regression are fixed; sensitivity analyses are missing, and it is unclear how band width interacts with image resolution and panel sizes.
- Stitching loss limitations: Chamfer-based edge matching does not enforce consistent arc-length parameterization; whether adding explicit correspondence, normal/curvature continuity, or tangent alignment improves seams is untested.
- Training realism gap: It is not stated whether GarmentSewer is trained on ground-truth vs. predicted patterns; robustness to train–test mismatch (teacher-forcing vs. free-running) is not analyzed.
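To make the edge-aware loss discussion above concrete, here is a minimal sketch of a band-weighted L1 geometry loss. The `band_weight` value and the band construction are hypothetical stand-ins for the fixed hyperparameters the paper does not ablate:

```python
import numpy as np

def edge_aware_l1(pred, target, edge_mask, band_weight=5.0):
    # Pixels inside the edge band get band_weight; interior pixels get 1.0.
    # band_weight = 5.0 is an illustrative choice, not the paper's value.
    weights = np.where(edge_mask, band_weight, 1.0)[..., None]  # (H, W, 1)
    return float((weights * np.abs(pred - target)).mean())

pred = np.zeros((4, 4, 3))
target = np.full((4, 4, 3), 0.1)        # uniform 0.1 error everywhere
edge_mask = np.zeros((4, 4), dtype=bool)
edge_mask[0, :] = True                  # top row stands in for the "edge band"
loss = edge_aware_l1(pred, target, edge_mask)
print(loss)  # 0.1 * (4*5 + 12*1) / 16 = 0.2
```

The open question flagged above is exactly how such a weight and band width should scale with image resolution and panel size.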
Evaluation and baselines
- Limited metrics: MMD/COV (point-cloud based) do not capture seam integrity, manifoldness, or reconstruction topology; no benchmarks for edge alignment error, hole counts, or surface self-intersection rates.
- Missing baselines: Comparisons exclude recent learning-based pattern-to-mesh methods and geometry-image-based stitchers beyond the listed works; direct mesh generators (e.g., atlas-based or SDF-based) with stitch-aware postprocessing are not included.
- Text-conditioned evaluation: Quantitative evaluation for text-only inputs is sparse; no human study on text-to-design intent fidelity or edit satisfaction.
- Fairness of runtime comparisons: Stage-2 timing (0.02s) excludes non-neural post-processing (~4.83s); hardware (A100) may not reflect deployment conditions (e.g., edge/mobile), and memory footprint is not reported.
Robustness and failure cases
- Complex garments and high panel counts: Performance degradation with many panels, symmetric parts (left/right confusion), or small/narrow panels is qualitatively hinted but not systematically quantified.
- UV overlap and degenerate cases: Handling of overlapping panels in the packed layout, extremely thin features, or very short seam segments is not described.
- Self-collision and interpenetration: The pipeline lacks explicit constraints preventing self-intersection between panels during reconstruction; prevalence of such artifacts is unreported.
Usability and extensibility
- Editing-to-geometry path: While PatternMaker supports editing, the end-to-end impact on final mesh quality (post-edit) is not evaluated; no latency or stability analysis for interactive use.
- Texture and appearance: Texture/material generation is left for future work; the GGI framework’s compatibility with consistent UVs for texture authoring, and how textures would align across stitched seams, is unspecified.
- CAD interoperability: Detailed mapping to industrial CAD constraints (seam allowances, notches, grainlines) is absent; integration requirements for real apparel workflows are unclear.
Open research directions
- Seam-consistent correspondence losses that enforce orientation and arclength matching, and topology-aware regularization to guarantee watertightness and manifoldness.
- Conditioning on fabric parameters and body shape/pose to produce physically plausible, simulation-ready geometry with predictable drape.
- Multi-layer and accessory-aware GGI extensions supporting closures and non-sewn constraints (zippers, buttons) and internal structures (linings).
- Domain adaptation to real-world photos/CAD patterns and uncertainty-aware inference to mitigate pattern prediction errors.
- Comprehensive benchmarks with topology and seam-quality metrics, and large-scale simulation validation to quantify downstream stability and realism.
Practical Applications
Immediate Applications
The following applications can be deployed now by leveraging SwiftTailor’s two-stage pipeline (PatternMaker + GarmentSewer), the Garment Geometry Image (GGI) representation, and the demonstrated 4× end-to-end speedup over physics-based pipelines.
- Rapid 3D garment prototyping and pre-visualization [Fashion/Apparel, CAD]
- Use PatternMaker to draft patterns from moodboards, sketches, or text, and GarmentSewer to instantly preview coherent 3D meshes without a sewing simulator.
- Tools/workflows: Plugin for CLO3D/Style3D/Marvelous Designer; Blender/Unity/Unreal importer for GGI; “instant preview” inside CAD.
- Assumptions/dependencies: Designs close to GarmentCodeData distribution; complex multilayer/lining details may still need manual refinement or later simulation.
- Fast content creation for e-commerce and virtual catalogs [E-commerce, Marketing, 3D Assets]
- Batch-generate consistent garment meshes from product photos or descriptions for 360° viewers and basic try-on previews.
- Tools/workflows: Cloud API for text/image-to-3D garments; pipeline to GLB/FBX export; WebGL/three.js viewers.
- Assumptions/dependencies: “Simulation-ready” meshes may still require drape refinements for premium photoreal renders.
- Iterative design and editing-by-text/image [Fashion Design, HCI]
- Text-guided edits (e.g., “add a hood,” “shorten sleeves”) on pattern structure with immediate updated 3D mesh via GGI.
- Tools/workflows: Interactive UI that visualizes panels, seams, and instant 3D; versioning for A/B design exploration.
- Assumptions/dependencies: Edit reliability depends on correct stitching maps and panel semantics; edge cases need human oversight.
- Game, VFX, and AR asset generation at scale [Media/Entertainment, AR/VR]
- Generate lightweight, coherent garment meshes for NPC wardrobes or AR filters with minimal simulation overhead.
- Tools/workflows: DCC add-ons (Maya/Blender) for GGI import; Unreal/Unity asset pipeline; LOD baking (GGI → decimated meshes).
- Assumptions/dependencies: For close-ups or cloth–body interactions in motion, physics simulation or rig-specific adjustments remain necessary.
- Dataset augmentation for learning-based fashion systems [Academia, R&D]
- Create large, labeled sets of patterns, semantics, seams, and meshes for training virtual try-on, sewing reasoning, or reconstruction models.
- Tools/workflows: Synthetic data generators based on GGI; curriculum of complexity (panels/seams) for model training.
- Assumptions/dependencies: Avoid domain shift by mixing real and synthetic; ensure license compliance for seed images.
- CAD interoperability and panel QA checks [Manufacturing, CAD]
- Use the semantic/stitching maps to validate panel counts, seam pairings, and topology before production.
- Tools/workflows: GGI-to-DXF(AAMA/ASTM) converters; automated seam consistency reports; pre-production QA dashboards.
- Assumptions/dependencies: Export compliance with house CAD standards; coverage of accessories (zippers, pockets) may be partial.
- Reduced simulation cost via better initial states [Simulation, HPC]
- Feed GarmentSewer’s stitched, coherent mesh to physics engines to shorten convergence and reduce failed drapes.
- Tools/workflows: Bridge to XPBD/C-IPC/Newton-based solvers; “warm-start” draping from GGI.
- Assumptions/dependencies: Gains depend on solver configuration, fabric properties, and avatar pose coverage.
- Education and training for patternmaking [Education]
- Visualize how 2D pattern changes affect 3D form in real time; practice tasks with instant structural feedback.
- Tools/workflows: Classroom app showing panels, seam maps, and 3D outcomes; step-by-step assignments.
- Assumptions/dependencies: Curriculum alignment; simplified coverage of advanced tailoring may still require traditional instruction.
Long-Term Applications
The following applications are feasible with further research, scaling, standardization, or integration with external systems (materials, bodies, robotics, or policy frameworks).
- Real-time mobile AR try-on with controllable garments [Retail, AR/VR]
- On-device PatternMaker + GarmentSewer for instant garment synthesis, editing, and approximate drape in AR mirrors.
- Tools/workflows: Mobile-optimized MLLMs and DPT; on-device GGI rasterization; body-segmentation pipelines.
- Assumptions/dependencies: Robust performance on diverse body shapes/poses; efficient cloth–body collision approximations.
- End-to-end digital-to-physical pipeline (text/image → CNC cutting) [Manufacturing, Robotics]
- Translate designs to validated patterns, nest panels, and drive cutters/robots with minimal human intervention.
- Tools/workflows: GGI→DXF nesting; BOM and marker making; robotic sewing integration; QC checkpoints.
- Assumptions/dependencies: Industrial-grade validation of fit, tolerances, and materials; safety and compliance procedures.
- Material-aware, simulation-free drape prediction [Simulation, Materials]
- Extend GGI with material priors and dynamic cues to approach physics fidelity without expensive solvers.
- Tools/workflows: GGI channels for fabric properties; neural surrogates of cloth dynamics; hybrid differentiable pipelines.
- Assumptions/dependencies: Large-scale multimaterial datasets; generalization to complex garments and motions.
- Personalization and made-to-measure at scale [Retail, Health/Fitness]
- Combine body scans/measurements with pattern reasoning to auto-adjust panels and seams for fit before production.
- Tools/workflows: SMPL/SMPL-X alignment; anthropometric constraints in PatternMaker; automated fit simulation.
- Assumptions/dependencies: Accurate body capture and privacy-preserving data handling; returns/fitting policies.
- Cross-platform standard for garment geometry images [Standards, Interoperability]
- Establish GGI as an interchange format bridging CAD, content creation, and simulation systems.
- Tools/workflows: Open spec for semantic/stitching channels; reference encoders/decoders; conformance tests.
- Assumptions/dependencies: Industry adoption (CLO3D, Style3D, apparel CAD vendors); governance by standards bodies.
- Sustainable sampling and carbon reduction via digital twins [Policy, Sustainability]
- Replace a large fraction of physical prototypes with high-fidelity digital samples and analytics.
- Tools/workflows: Lifecycle assessment dashboards; audit trails linking designs to virtual tests; procurement overlays.
- Assumptions/dependencies: Stakeholder buy-in; audit standards for “digital-first” approvals; traceability frameworks.
- IP protection, watermarking, and audit for AI-generated apparel [Policy, Legal]
- Embed provenance and watermarking into GGI channels; define usage policies for text/image-to-garment systems.
- Tools/workflows: Content credentials for GGI; license-aware generation; rights management integrations.
- Assumptions/dependencies: Legal clarity on training data; enforcement across platforms and jurisdictions.
- Multimodal co-design assistants for non-experts [Consumer, Creator Economy]
- Conversational agents that turn style goals into manufacturable patterns, surfaces, and textures with cost/fit constraints.
- Tools/workflows: Constraint-aware PatternMaker; budget/fabric-aware suggestions; marketplaces for shareable GGI assets.
- Assumptions/dependencies: Reliable constraint satisfaction; accessible UX for complex garment logic; moderation and safety.
- Robotics-aware garment construction and inspection [Manufacturing, Robotics]
- Use stitching maps and panel semantics to plan robotic assembly steps and automated seam inspection.
- Tools/workflows: Path planning from GGI; vision-in-the-loop QA; feedback to update pattern constraints.
- Assumptions/dependencies: Robust manipulation of deformable objects; alignment with factory hardware.
- Research platforms for structured 3D generation [Academia, Core AI]
- Benchmarking structured 3D representations and zippering/stitching algorithms; studying MLLMs for CAD reasoning.
- Tools/workflows: Open datasets with GGI; challenge tracks on seam alignment, panel reasoning, and material surrogates.
- Assumptions/dependencies: Community curation and reproducibility; diverse garment typologies and materials.
Notes on feasibility across applications:
- Coverage limits: Current training (GCD-MM) may under-represent extreme styles (multi-layer, boning, complex pleats) and accessories.
- Physics gap: “Simulation-ready” does not equal physically accurate drape in motion; premium use-cases still benefit from solvers.
- Interoperability: Production use requires robust exporters (DXF/AAMA/ASTM) and panel annotation conventions.
- Ethics/IP: Generating from third-party product images risks infringement without rights or provenance controls.
- Compute/latency: On-device and at-scale deployments need model compression, quantization, and hardware-specific optimizations.
Glossary
- Atlas representation: A way to partition a 3D surface into multiple parameterized patches (charts) for processing or packing. "leveraging an atlas representation that partitions the surface into a geometrically natural set of charts."
- Barycentric interpolation: An interpolation method over triangles using barycentric coordinates, useful for smoothly filling values across mesh faces. "we apply a hybrid interpolation strategy that combines linear and barycentric interpolation to fill missing pixel values"
- C-IPC: A contact-aware implicit collision handling method used in physics-based simulation for robust cloth/rigid-body interactions. "based on XPBD~\cite{xpbd}, or C-IPC~\cite{cipc}, or more recent Newton framework~\cite{newton}."
- Chamfer Distance (CD): A set-to-set distance metric between point clouds, often used to evaluate 3D reconstruction quality. "We compute Chamfer Distance between point clouds for distance-based metrics."
- Coverage (COV): A diversity metric assessing how well a set of generated samples covers the distribution of references. "we evaluate garment generation quality using Minimum Matching Distance (MMD) and Coverage (COV)."
- Dense Prediction Transformer (DPT): A transformer architecture for per-pixel (dense) prediction tasks such as depth or geometry estimation. "Our GarmentSewer is a dense prediction transformer (DPT) that predicts a garment geometry image"
- Dynamic stitching: A process of algorithmically reconnecting panel boundaries to form a continuous mesh without iterative physics simulation. "remeshing and dynamic stitching algorithms to directly assemble the garment"
- Earth Mover’s Distance (EMD): A metric (Wasserstein distance) measuring the cost of transforming one distribution into another; used for comparing point sets. "EMD"
- Garment Geometry Image (GGI): A unified image-based representation that encodes garment geometry, semantics, and stitching in a common UV layout. "the Garment Geometry Image (GGI), which represents 3D garment meshes in a unified UV texture space."
- GarmentCode: A programmable garment modeling framework that simulates sewing patterns into 3D garments. "using garment modeling frameworks such as GarmentCode."
- Geometry Image (GIM): A 2D image-like encoding of a 3D surface where pixels store geometric information mapped from the surface. "A Geometry Image (GIM)~\cite{gu2002geometry} represents a 3D surface in a 2D image-like format"
- Inverse mapping (f⁻¹): The operation that reconstructs 3D geometry and connectivity from the 2D geometry image domain. "Reconstructing the original surface requires the inverse mapping"
- Minimum Matching Distance (MMD): A fidelity metric that measures how close generated samples are to the nearest references. "we evaluate garment generation quality using Minimum Matching Distance (MMD) and Coverage (COV)."
- Multi-chart Geometry Image (MCGIM): An extension of geometry images that packs multiple parameterized charts into a single image. "For more complicated shapes, the Multi-chart Geometry Image (MCGIM) extends this concept"
- Normal-regularization term: A loss that encourages smooth surface normals to reduce artifacts in reconstructed meshes. "we adopt the normal-regularization term from~\cite{turkulainen2025dn}"
- NVIDIA Warp: A GPU-accelerated computational framework used to implement high-performance simulation kernels. "built on NVIDIA Warp~\cite{warp}"
- Remeshing: The process of reconstructing or resampling a surface into a new mesh, often with different connectivity or regularity. "A remeshing step then reconstructs individual panel surfaces"
- Semantic UV map: A UV layout image where pixels encode panel types or parts, guiding geometry prediction and reconstruction. "Ablation on semantic UV map and auxiliary losses"
- SMPL: A skinned, parametric human body model commonly used for clothing and animation tasks. "their draping on SMPL~\cite{smpl} is often misaligned"
- Stitching loss: A training loss that enforces alignment of corresponding panel edges to enable seamless garment assembly. "Stitching loss: To enable garment assembly without re-simulating sewing, stitched panel edges must closely align in 3D."
- UV mapping: The correspondence from 3D surface points to 2D texture coordinates, used here to place geometry into image pixels. "and rasterize each garment-mesh vertex to its corresponding pixel location via UV mapping."
- UV space: The 2D parameter domain in which surface geometry or textures are represented and manipulated. "encoding the 3D surface of all garment panels in a unified UV space."
- Vision Transformer (ViT): A transformer-based architecture applied to image patches for visual representation learning. "a ViT-based encoder"
- Vision-LLM (VLM): A model that jointly processes visual and textual inputs for tasks like pattern reasoning. "a large vision-LLM (VLM) such as LLaVA-1.5V-7B"
- XPBD: Extended Position-Based Dynamics, a constraint-based physics formulation for stable and efficient simulation. "based on XPBD~\cite{xpbd}, or C-IPC~\cite{cipc}, or more recent Newton framework~\cite{newton}."
- Zippering scheme: A technique for reconnecting chart boundaries during inverse mapping to produce a watertight mesh. "MCGIM~\cite{sander2003multi} also introduces the zippering scheme as a part of inverse mapping to reconnect charts."
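The barycentric interpolation entry above can be illustrated with a short, self-contained sketch. This is a standard formulation of barycentric coordinates, not the paper's implementation; the function name is hypothetical:

```python
import numpy as np

def barycentric_interpolate(p, tri, values):
    """Interpolate per-vertex values at 2D point p inside triangle tri (3x2)."""
    a, b, c = tri
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom  # weight of vertex b
    w = (d00 * d21 - d01 * d20) / denom  # weight of vertex c
    u = 1.0 - v - w                      # weight of vertex a
    return u * values[0] + v * values[1] + w * values[2]

tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
vals = np.array([0.0, 1.0, 2.0])
# At the centroid all three weights are 1/3, so the result is the mean
print(barycentric_interpolate(np.array([1/3, 1/3]), tri, vals))  # 1.0
```

In a GGI pipeline, interpolation like this is what fills pixel values between rasterized vertices so the geometry image is dense.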