- The paper introduces SketchGraphs, a dataset of 15 million CAD sketches represented as geometric constraint graphs.
- It employs Onshape's API to extract real-world sketches and model design intent using multi-hypergraph structures.
- The paper demonstrates two applications, constraint inference (autoconstrain) and generative modeling, as steps toward more automated CAD tools.
Overview of SketchGraphs: A Large-Scale Dataset for Modeling Relational Geometry in Computer-Aided Design
The paper "SketchGraphs: A Large-Scale Dataset for Modeling Relational Geometry in Computer-Aided Design" introduces a dataset and an accompanying processing pipeline intended to support machine learning research on parametric computer-aided design (CAD). SketchGraphs is a collection of 15 million sketches extracted from real-world CAD models hosted on Onshape, a cloud-based CAD platform. The dataset emphasizes relational geometry by representing each sketch as a geometric constraint graph, a form well suited to reasoning and generative modeling tasks in CAD.
Dataset Features and Acquisition
SketchGraphs is constructed by extracting sketches from publicly available CAD models on Onshape. A sketch, the 2D basis of a CAD model, consists of geometric primitives such as lines, circles, and arcs, together with constraints between them such as coincidence, perpendicularity, and symmetry. Each sketch is encoded as a constraint graph in which nodes represent primitives and edges represent the constraints imposed on them. This representation lets researchers study relational reasoning and program synthesis on structured data.
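The encoding described above can be sketched in a few lines. This is a minimal illustration, not the dataset's actual schema or API: the class names (`Primitive`, `Constraint`, `ConstraintGraph`), the parameter layout, and the example sketch are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Primitive:
    kind: str      # e.g. "line", "circle", "arc"
    params: tuple  # numeric parameters; this layout is an assumption


@dataclass(frozen=True)
class Constraint:
    kind: str       # e.g. "coincident", "perpendicular"
    members: tuple  # indices of the primitives the constraint acts on


@dataclass
class ConstraintGraph:
    nodes: list = field(default_factory=list)  # primitives
    edges: list = field(default_factory=list)  # constraints

    def add_primitive(self, kind, *params):
        """Append a primitive and return its node index."""
        self.nodes.append(Primitive(kind, tuple(params)))
        return len(self.nodes) - 1

    def add_constraint(self, kind, *members):
        """Append a constraint over already-existing primitives."""
        for m in members:
            if not 0 <= m < len(self.nodes):
                raise IndexError(f"unknown primitive index {m}")
        self.edges.append(Constraint(kind, tuple(members)))

    def constraints_on(self, index):
        """Return every constraint that involves the given primitive."""
        return [e for e in self.edges if index in e.members]


# Hypothetical sketch: two lines meeting at a right angle, plus a free circle.
g = ConstraintGraph()
a = g.add_primitive("line", 0.0, 0.0, 1.0, 0.0)
b = g.add_primitive("line", 1.0, 0.0, 1.0, 1.0)
c = g.add_primitive("circle", 0.5, 0.5, 0.25)
g.add_constraint("coincident", a, b)
g.add_constraint("perpendicular", a, b)
```

Here `g.constraints_on(a)` returns the two constraints linking the lines, while `g.constraints_on(c)` is empty because the circle is unconstrained.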
The data acquisition process leverages Onshape's API and spans documents created over five years, resulting in over 15 million sketches. The sketches range from simple ones to complex configurations with large constraint graphs, so the dataset reflects a wide variety of design operations and choices.
Figure 1: Example sketches from the dataset containing at least six geometric primitives. Dashed lines indicate construction geometry, which is used as a reference for other primitives but not physically realized.
Geometric Constraint Graphs
Each sketch in the SketchGraphs dataset is represented by a geometric constraint graph. Nodes correspond to geometric primitives and carry parameters such as coordinates and dimensions; edges correspond to the constraints the designer imposed to relate those primitives. The graph therefore records the dependencies among primitives that the designer specified.
Because constraints are not always pairwise, the paper introduces specialized data structures that treat the graphs as multi-hypergraphs: two primitives may be linked by several constraints at once, a constraint on a single primitive forms a loop, and a constraint involving more than two primitives forms a hyperedge. This accommodates the interactions among multiple primitives that occur in real-world designs.
Figure 2: Example sketch (left) and a portion of its geometric constraint graph (right). Constraints are denoted as edges that either act on a primitive as a whole or some subcomponent of the primitive.
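To make the multi-hypergraph terminology concrete, the following sketch classifies constraints by how many distinct primitives they touch and finds pairs of primitives joined by more than one constraint. The edge format, the primitive ids, the sub-component names (`"start"`, `"end"`), and the particular constraints are hypothetical choices for illustration, not the dataset's representation.

```python
from collections import defaultdict

# Each edge is (constraint kind, references). A reference is (primitive id, part),
# where part is None for the whole primitive or the name of a sub-component.
edges = [
    ("horizontal", (("l1", None),)),
    ("coincident", (("l1", "end"), ("l2", "start"))),
    ("perpendicular", (("l1", None), ("l2", None))),
    ("mirror", (("c1", None), ("c2", None), ("l2", None))),
]


def arity_class(refs):
    """Classify an edge by the number of distinct primitives it references."""
    n = len({prim for prim, _part in refs})
    if n == 0:
        raise ValueError("a constraint must reference at least one primitive")
    if n == 1:
        return "loop"
    if n == 2:
        return "binary"
    return "hyperedge"


def parallel_edges(edge_list):
    """Group constraint kinds by the set of primitives they act on; keep sets with more than one."""
    groups = defaultdict(list)
    for kind, refs in edge_list:
        groups[frozenset(prim for prim, _part in refs)].append(kind)
    return {prims: kinds for prims, kinds in groups.items() if len(kinds) > 1}


classes = {kind: arity_class(refs) for kind, refs in edges}
multi = parallel_edges(edges)
```

In this toy example the two lines are joined by both a coincidence and a perpendicularity constraint, which is why a plain graph with at most one edge per node pair would not be enough.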
Supported Applications
The paper outlines several applications enabled by the SketchGraphs dataset, focusing on generative modeling and constraint inference, which can both facilitate and automate CAD design workflows. Two key applications are autoconstrain models and generative modeling, each targeting different aspects of the design process:
Autoconstrain Models
Autoconstrain models aim to infer design intent: given geometric primitives with no constraints, they predict the constraints a human designer would plausibly impose. The ground truth constraints recorded in the dataset serve as the prediction target, so researchers can train and evaluate such models against real designer choices and use them to streamline design operations in CAD environments.
Figure 3: Autoconstraining a sketch. On the left is the original input sketch. Two user modifications are shown: dragging the top circle upwards and enlarging it to the right. The model correctly identifies several constraints, aligning with expected design intent.
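The prediction setup can be illustrated with a toy example: strip the constraints from a sketch, predict them from geometry alone, and compare against the recorded ones. The predictor below is a hand-written geometric heuristic, not the paper's learned model, and the line coordinates and the `truth` set are invented for the illustration.

```python
import math
from itertools import combinations

# Hypothetical input: line primitives as (x1, y1, x2, y2), with no constraints.
lines = {
    "a": (0.0, 0.0, 2.0, 0.0),
    "b": (2.0, 0.0, 2.0, 1.0),
    "c": (0.0, 1.0, 2.0, 1.0),
}

# Hypothetical ground truth: the constraints a designer recorded for this sketch.
truth = {
    ("coincident", "a", "b"),
    ("perpendicular", "a", "b"),
    ("coincident", "b", "c"),
    ("parallel", "a", "c"),
}


def endpoints(line):
    x1, y1, x2, y2 = line
    return [(x1, y1), (x2, y2)]


def unit_direction(line):
    x1, y1, x2, y2 = line
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("degenerate line with zero length")
    return dx / norm, dy / norm


def predict_constraints(prims, tol=1e-6):
    """Heuristic stand-in for a learned model: propose constraints from geometry."""
    predicted = set()
    for i, j in combinations(sorted(prims), 2):
        p, q = prims[i], prims[j]
        if any(math.dist(u, v) <= tol for u in endpoints(p) for v in endpoints(q)):
            predicted.add(("coincident", i, j))
        (ux, uy), (vx, vy) = unit_direction(p), unit_direction(q)
        if abs(ux * vx + uy * vy) <= tol:    # dot product near zero
            predicted.add(("perpendicular", i, j))
        elif abs(ux * vy - uy * vx) <= tol:  # cross product near zero
            predicted.add(("parallel", i, j))
    return predicted


def precision_recall(predicted, target):
    hits = len(predicted & target)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(target) if target else 0.0
    return precision, recall


predicted = predict_constraints(lines)
precision, recall = precision_recall(predicted, truth)
```

The heuristic recovers every recorded constraint but also proposes a perpendicularity between `b` and `c` that the hypothetical designer did not record. A constraint can hold geometrically without being part of the recorded intent, which is the gap a model trained on designer-imposed constraints is meant to close.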
Generative Modeling
Generative modeling covers two related tasks: generating plausible CAD sketches and scoring how reasonable a given sketch is under patterns learned from the dataset. Such models could offer completion suggestions or corrections inside CAD software. The baseline generative model trained on SketchGraphs data is able to model sequences of constraints, although initializing the primitives requires additional effort.
Figure 4: Random samples from the trained generative model containing at least two primitives. A solver is used to determine the final configuration of the sketch after the model samples the geometric constraint graph.
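The idea of scoring a sketch against learned patterns can be shown with a deliberately simple stand-in. The code below fits an add-one-smoothed bigram model over sequences of constraint types and compares the log-likelihood of two candidate sequences. This is not the paper's baseline model; the bigram model, the token names, and the three training sequences are assumptions chosen to keep the example small.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"


def fit_bigrams(sequences):
    """Count adjacent token pairs, including start and end markers."""
    pair_counts, context_counts, vocab = Counter(), Counter(), {END}
    for seq in sequences:
        tokens = [START, *seq, END]
        vocab.update(seq)
        for prev, cur in zip(tokens, tokens[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return pair_counts, context_counts, vocab


def log_likelihood(seq, model):
    """Add-one-smoothed log-probability of a constraint sequence."""
    pair_counts, context_counts, vocab = model
    tokens = [START, *seq, END]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = pair_counts[(prev, cur)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy training data (invented): constraint types in the order they were added.
train = [
    ["coincident", "coincident", "perpendicular"],
    ["coincident", "perpendicular", "coincident"],
    ["coincident", "coincident", "parallel"],
]
model = fit_bigrams(train)

common = log_likelihood(["coincident", "coincident", "perpendicular"], model)
odd = log_likelihood(["parallel", "parallel", "parallel"], model)
```

With this toy data, `common` is larger than `odd`, so the sequence resembling the training sketches is rated more plausible. A model of the kind the paper describes would operate on the full constraint graph rather than on a flat list of constraint types.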
Implications and Future Developments
SketchGraphs provides a fertile ground for research at the intersection of machine learning, CAD design, relational modeling, and program synthesis. By facilitating the exploration of these domains, the dataset can lead to improved design automation, enhanced modeling techniques, and insights into human design practices.
Potential future developments include extending the dataset to cover CAD operations beyond 2D sketches and establishing benchmarks for CAD inference from images. Such extensions would broaden the range of engineering design tasks to which the dataset applies.
Conclusion
The introduction of SketchGraphs marks a significant step towards integrating machine learning into CAD workflows. Its scale and its detailed constraint representation create opportunities for advancing design automation and studying relational geometry. By providing benchmarks and facilitating model development, SketchGraphs gives researchers a common basis for work on CAD modeling and reasoning.