- The paper introduces LegoGPT, an autoregressive LLM fine-tuned to generate sequences of LEGO brick placements while ensuring physical stability.
- It employs rejection sampling and physics-aware rollback mechanisms to validate design integrity and maintain structural realism.
- Experimental results show that LegoGPT outperforms baselines, producing stable, buildable LEGO designs that align closely with the input text.
Generating Physically Stable and Buildable Brick Structures from Text
The paper presents a method for generating LEGO brick structures from text descriptions, with a focus on physical stability and buildability. Building on advances in autoregressive LLMs, it introduces LegoGPT, with potential applications in education and entertainment.
Overview of LegoGPT
LegoGPT's core innovation is an autoregressive LLM fine-tuned to predict, from a text prompt, the sequence of bricks needed to construct a stable LEGO design. The model builds the design one brick at a time, predicting each next brick from the prompt and the bricks placed so far, with physical stability and buildability as explicit requirements.
Dataset and Training
The foundation of the research is the StableText2Lego dataset, encompassing over 47,000 LEGO structures derived from 28,000 unique 3D objects (Figure 1). These structures are accompanied by detailed captions, ensuring rich data for model training. The models are trained to predict the positioning and orientation of each brick, adhering strictly to structural integrity principles.
Figure 1: StableText2Lego Dataset.
Methodology
Autoregressive Model Training
The approach reformulates LEGO design generation as a text prediction task. A fine-tuned LLaMA-3.2-Instruct-1B model generates a sequence of brick placements. Each LEGO structure is serialized into a streamlined text format that records the size and position of every brick, and this text is then tokenized for the model (Figure 2).
Figure 2: Method overview including tokenization, model fine-tuning, and prediction validation.
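To make the "streamlined text format" concrete, the sketch below serializes a list of bricks to one line per brick and parses it back. The exact line layout (`HxW (x,y,z)`), the `Brick` fields, and the function names are assumptions made for illustration; the summary only states that size and positioning data are encoded as text.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Brick:
    # Hypothetical fields: footprint size in studs and grid position.
    h: int
    w: int
    x: int
    y: int
    z: int


def bricks_to_text(bricks: List[Brick]) -> str:
    """Serialize bricks in placement order, one brick per line."""
    return "\n".join(f"{b.h}x{b.w} ({b.x},{b.y},{b.z})" for b in bricks)


# Assumed line layout: "<h>x<w> (<x>,<y>,<z>)" with non-negative integers.
_LINE = re.compile(r"^(\d+)x(\d+) \((\d+),(\d+),(\d+)\)$")


def text_to_bricks(text: str) -> List[Brick]:
    """Parse the format above; raise ValueError on an ill-formatted line."""
    bricks = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            raise ValueError(f"ill-formatted brick line: {line!r}")
        bricks.append(Brick(*map(int, match.groups())))
    return bricks
```

A plain-text layout like this lets a standard LLM tokenizer handle the structure without new vocabulary, and a strict parser gives a cheap format check on generated output.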
Ensuring Structural Stability
A critical component of LegoGPT is ensuring that designs are not only aligned with the prompt but also physically viable. During inference, physics-aware checks roll back predictions that would result in an unstable structure. Stability is assessed with a force model that simulates the real-world forces acting on each brick (Figure 3).
Figure 3: Force Model illustrating various forces considered in stability assessments.
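The force model itself is not specified in this summary, so the sketch below is only a much weaker stand-in: a connectivity check that flags bricks with no chain of stud connections down to the ground layer. Passing it is necessary for stability but far from sufficient. The brick tuple layout, the assumption that every brick is one layer tall, and the function names are all hypothetical.

```python
from collections import deque
from typing import List, Tuple

# Hypothetical layout: (h, w, x, y, z) with an h-by-w footprint whose
# corner is at (x, y) on layer z. Every brick is assumed one layer tall.
Brick = Tuple[int, int, int, int, int]


def footprints_overlap(a: Brick, b: Brick) -> bool:
    """True if the two footprints share at least one stud position."""
    ah, aw, ax, ay, _ = a
    bh, bw, bx, by, _ = b
    return ax < bx + bh and bx < ax + ah and ay < by + bw and by < ay + aw


def floating_bricks(bricks: List[Brick]) -> List[int]:
    """Return indices of bricks not connected to the ground layer (z == 0)."""
    grounded = [b[4] == 0 for b in bricks]
    queue = deque(i for i, ok in enumerate(grounded) if ok)
    while queue:
        i = queue.popleft()
        for j, other in enumerate(bricks):
            if grounded[j]:
                continue
            # Bricks connect only across adjacent layers with shared studs.
            if abs(bricks[i][4] - other[4]) == 1 and footprints_overlap(bricks[i], other):
                grounded[j] = True
                queue.append(j)
    return [i for i, ok in enumerate(grounded) if not ok]
```

A real stability analysis also has to account for weight, torque, and connection strength, which is what a force model adds on top of this kind of reachability test.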
Model Inference and Optimization
The inference process combines validity checks with physics-aware rollbacks. Rejection sampling handles the first: a predicted brick that fails a validity check is discarded and resampled. Rollback handles the second: when a design is found to be physically unstable, generation returns to an earlier state and continues from there. A design is deemed complete only once it satisfies the defined physical constraints.
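The control flow of such an inference loop can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `propose`, `is_valid`, and `first_unstable` are hypothetical callables standing in for the LLM sampler, the validity check, and the stability analysis, and the retry limits and the choice to check stability when the sequence ends are assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

B = TypeVar("B")


def sample_valid(
    propose: Callable[[Sequence[B]], Optional[B]],
    is_valid: Callable[[Sequence[B], B], bool],
    bricks: Sequence[B],
    max_rejections: int,
) -> Tuple[Optional[B], bool]:
    """Rejection sampling: resample until a candidate passes is_valid.

    Returns (brick, done). done is True when the sampler signals the end
    of the sequence (None) or every attempt was rejected (assumed policy).
    """
    for _ in range(max_rejections):
        candidate = propose(bricks)
        if candidate is None:
            return None, True
        if is_valid(bricks, candidate):
            return candidate, False
    return None, True


def generate(
    propose: Callable[[Sequence[B]], Optional[B]],
    is_valid: Callable[[Sequence[B], B], bool],
    first_unstable: Callable[[Sequence[B]], Optional[int]],
    max_rejections: int = 10,
    max_rollbacks: int = 5,
    max_bricks: int = 1000,
) -> List[B]:
    bricks: List[B] = []
    rollbacks = 0
    while True:
        if len(bricks) < max_bricks:
            brick, done = sample_valid(propose, is_valid, bricks, max_rejections)
        else:
            brick, done = None, True
        if not done:
            bricks.append(brick)
            continue
        # Sequence ended: accept it only if no brick is reported unstable.
        bad = first_unstable(bricks)
        if bad is None or rollbacks >= max_rollbacks:
            return bricks
        # Physics-aware rollback: drop the first unstable brick and
        # everything placed after it, then resume generation.
        del bricks[bad:]
        rollbacks += 1
```

Capping rejections and rollbacks keeps inference time bounded; the trade-off is that a design returned after the rollback budget is exhausted may still be unstable, so a caller may want to re-check the result.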
Texturing and Coloring
Beyond basic structural generation, the method supports detailed texturing and coloring (Figure 4). This extension uses UV mapping and FlashTex for text-based mesh texturing, allowing a range of visual styles for the generated LEGO structures.
Figure 4: Textured and Colored LEGO Generation.
Practical Applications
LegoGPT's designs can be built in the real world through both automated robotic assembly and manual construction (Figures 5 and 6). The automated pipeline uses dual robotic arms, leveraging task-distribution frameworks and precise manipulation policies.
Figure 5: Automated Assembly.
Figure 6: Manual Assembly.
Experimental Results
Quantitative analysis shows that LegoGPT outperforms existing baselines, producing higher proportions of stable and valid designs. The approach improves text-structure alignment, stability, and adherence to LEGO geometry. Techniques such as physics-aware rollback and stitching support robust real-world implementation.
Conclusion
This research identifies and overcomes key challenges in real-world object generation, advancing the field of text-to-3D model generation by ensuring structural stability and buildability of LEGO designs. Future directions involve scaling the dataset and exploring more complex structures to further enhance the granularity of generated models. The methodology promotes wider access to automated design tools across disciplines, emphasizing the bridging of digital models to tangible constructs.