- The paper introduces a novel joint continuous-discrete diffusion framework using Gaussian-Softmax diffusion to enhance the fidelity and diversity of CAD sketch generation.
- It employs a permutation-equivariant transformer architecture that effectively addresses heterogeneous primitive parameterization and permutation invariance.
- The approach achieves state-of-the-art results on the SketchGraphs dataset by significantly reducing FID and Negative Log-Likelihood compared to previous methods.
SketchDNN: Joint Continuous-Discrete Diffusion for CAD Sketch Generation
The paper "SketchDNN: Joint Continuous-Discrete Diffusion for CAD Sketch Generation" introduces an approach for generating computer-aided design (CAD) sketches with a unified continuous-discrete diffusion process. The model, termed SketchDNN, relies on a Gaussian-Softmax diffusion mechanism in which intermediate states carry a blend of class label probabilities rather than a single hard label. According to the paper, this improves both the fidelity and the diversity of generated sketches on the SketchGraphs dataset relative to prior methods.
Figure 1: The generation pipeline of SketchDNN. Starting from a pure noise seed $X_T$, the denoiser network iteratively refines the sample, leading to the final generated sketch $X_0$ through successive denoising steps.
Methodology
SketchDNN is a diffusion model that jointly models the continuous parameters and the discrete class labels of CAD primitives. For the discrete part it uses Gaussian-Softmax diffusion, which perturbs logits with Gaussian noise and maps them onto the probability simplex with a softmax transformation. Discrete variables such as primitive class labels can therefore be noised and denoised alongside the continuous parameters. The design targets two key challenges in CAD sketch generation: heterogeneous primitive parameterizations and permutation invariance.
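The Gaussian-Softmax step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function name `gaussian_softmax_noise`, the `scale` parameter, and the choice to encode clean labels as scaled one-hot logits are all assumptions made for the example.

```python
import numpy as np

def gaussian_softmax_noise(one_hot, a_t, rng, scale=1.0):
    """Noise class labels: Gaussian perturbation in logit space, then softmax.

    one_hot: (n, k) array of one-hot class labels.
    a_t:     signal level in [0, 1]; 1 keeps the label, 0 is pure noise.
    scale:   hypothetical logit magnitude assigned to the clean label.
    """
    logits = scale * one_hot                           # assumed clean-label encoding
    eps = rng.standard_normal(one_hot.shape)           # Gaussian noise
    noisy = np.sqrt(a_t) * logits + np.sqrt(1.0 - a_t) * eps
    noisy = noisy - noisy.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(noisy)
    return probs / probs.sum(axis=-1, keepdims=True)   # rows lie on the simplex

rng = np.random.default_rng(0)
labels = np.eye(4)[[0, 2, 3]]                          # three primitives, four classes
blended = gaussian_softmax_noise(labels, a_t=0.5, rng=rng)
print(blended.sum(axis=-1))                            # each row sums to 1
```

Each output row is a strictly positive probability vector, so a noised label is a blend of classes rather than a hard jump from one class to another.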
Continuous and Discrete Diffusion
Continuous diffusion uses a Markov chain that adds Gaussian noise over successive timesteps, progressively destroying information about the input until it resembles pure noise. Discrete diffusion has traditionally struggled to achieve such a gradual transition because of its inherent categorical constraints. SketchDNN's discrete diffusion instead uses the Gaussian-Softmax distribution, a continuous relaxation obtained by applying a softmax to Gaussian vectors, so that intermediate states can hold blended class labels.
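For the continuous side, the forward noising process has a well-known closed form. The sketch below assumes the standard DDPM formulation with the cosine schedule of Nichol and Dhariwal; the exact schedule and constants used by SketchDNN may differ, and the names `cosine_alpha_bar` and `forward_noise` are hypothetical.

```python
import numpy as np

def cosine_alpha_bar(t, T, s=0.008):
    """Cumulative signal level at step t under a cosine schedule (assumed form)."""
    def f(u):
        return np.cos((u / T + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f(t) / f(0)

def forward_noise(x0, t, T, rng):
    """Sample x_t from q(x_t | x_0) in closed form, without stepping the chain."""
    a_bar = cosine_alpha_bar(t, T)
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(16, 8))   # hypothetical primitive parameters
x_mid, _ = forward_noise(x0, t=500, T=1000, rng=rng)
x_end, _ = forward_noise(x0, t=1000, T=1000, rng=rng)
```

At `t=0` the sample equals the input, and at `t=T` the signal level is effectively zero, so the sample is pure Gaussian noise.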

Figure 2: Left: The orange curve represents the raw cosine variance schedule $a_t$, while the blue curve depicts the probability that the class label remains unchanged.
CAD Sketch Representation and Permutation Invariance
Each primitive in a CAD sketch is characterized by discrete and continuous attributes, and SketchDNN models each attribute independently within the diffusion framework. The denoising process is permutation-equivariant: it does not depend on the order in which primitives are listed, so reordering the input primitives reorders the output in the same way and the represented geometry is preserved. This matters for CAD sketches, where many orderings of the same primitives are geometrically identical.
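The equivariance property can be checked numerically. The sketch below uses a single self-attention layer with no positional encoding as a stand-in for the denoiser; the layer, its random weights, and the sizes are illustrative assumptions rather than SketchDNN's actual network.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head self-attention over a set of primitives, no positional encoding."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 5, 8                                   # 5 primitives, 8 features each
x = rng.standard_normal((n, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))

perm = rng.permutation(n)
out = self_attention(x, wq, wk, wv)
out_perm = self_attention(x[perm], wq, wk, wv)

# Equivariance: permuting the input rows permutes the output rows identically.
assert np.allclose(out[perm], out_perm)
```

Adding a positional encoding to `x` before the attention step would break this check, which is why an order-free architecture suits unordered sets of primitives.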
Implementation and Training
The denoiser network within SketchDNN is a permutation-equivariant transformer without positional encodings, which preserves equivariance across permutations of the primitives. SketchDNN was trained for a large number of epochs with a large batch size distributed across multiple GPUs. Training minimizes a reconstruction loss: Mean-Squared Error for continuous variables and Cross-Entropy for discrete variables. An important detail is that irrelevant primitive parameters are masked based on the predicted class probabilities, which improves the fidelity of the predictions.
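A rough sketch of such a combined loss is shown below. It is one plausible reading of the description, not the paper's code: the soft mask (predicted class probabilities times a class-to-parameter table), the table `param_mask`, and the name `sketch_loss` are all assumptions introduced for illustration.

```python
import numpy as np

def sketch_loss(param_pred, param_true, logits, labels, param_mask):
    """Masked MSE on continuous parameters plus cross-entropy on class labels.

    param_pred, param_true: (n, p) continuous parameters per primitive.
    logits: (n, k) predicted class logits; labels: (n,) integer classes.
    param_mask: (k, p) hypothetical table, 1 where a class uses a parameter slot.
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    probs = np.exp(log_probs)

    # Soft mask: expected relevance of each slot under the predicted classes.
    mask = probs @ param_mask
    sq_err = mask * (param_pred - param_true) ** 2
    mse = sq_err.sum() / max(mask.sum(), 1.0)

    # Cross-entropy between predicted class distribution and true labels.
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    return mse + ce

# Two classes: class 0 uses the first two of four slots, class 1 uses all four.
param_mask = np.array([[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
labels = np.array([0, 1])
logits = np.array([[50.0, 0.0], [0.0, 50.0]])      # confident, correct predictions
true = np.zeros((2, 4))
pred = true.copy()
pred[0, 3] = 100.0                                  # large error in an unused slot
loss_unused = sketch_loss(pred, true, logits, labels, param_mask)
```

Here the large error sits in a slot that class 0 does not use, so the mask suppresses it and `loss_unused` stays near zero; the same error in a used slot would be penalized.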
Results and Evaluation
SketchDNN achieves lower Fréchet Inception Distance (FID) and Negative Log-Likelihood (NLL) than existing baselines such as Vitruvion and SketchGen, establishing new state-of-the-art performance on SketchGraphs. Lower values on both metrics indicate that the generated sketches improve in fidelity and diversity.

Figure 3: Comparison of CAD sketches from the SketchGraphs dataset (top), generations from SketchDNN (middle), and generations from Vitruvion (bottom).
Conclusion
SketchDNN introduces a framework for generating CAD sketches that integrates continuous and discrete diffusion processes. Its Gaussian-Softmax diffusion addresses a long-standing difficulty of diffusing discrete variables and may apply to other generative tasks involving mixed data types. Future work can explore its extension to conditional generation tasks and further optimization for a wider range of CAD applications.