Compositional Generation in 3D/4D Gaussian Splatting

Develop a compositional text-to-3D and text-to-4D generation framework based on 3D Gaussian Splatting that supports the simultaneous creation and arrangement of multiple objects within a single scene. This addresses the current limitation that most methods do not support compositional creation, as well as constraints in existing attempts such as rigid-body-only interactions and the inability to represent topological changes in dynamic sequences.

Background

The paper surveys text-to-3D and text-to-4D generation methods utilizing 3D Gaussian Splatting and identifies gaps in current approaches. While techniques such as DreamGaussian, GaussianDreamer, and AGG provide efficient object-level generation, they largely focus on single assets rather than multi-object composition.

The authors note that CG3D attempts a compositional framework but supports only rigid-body interactions, and that AYG's compositional 4D sequences cannot depict topological changes. They therefore explicitly flag true compositional generation with 3D Gaussian Splatting as an unresolved problem.
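To make the notion of rigid-body composition concrete, the sketch below shows the simplest possible form of it: each object is a set of Gaussians (centers, orientations, scales), a rigid transform places each object in a shared frame, and the scene is just the concatenation of all placed Gaussians. The helper names `place_object` and `compose_scene` are hypothetical and not taken from any of the surveyed methods; this is a minimal NumPy illustration, not an implementation of CG3D or AYG.

```python
import numpy as np

# Hypothetical helpers illustrating naive rigid-body composition of
# Gaussian-splat objects; not code from any surveyed method.

def place_object(means, rots, scales, R, t):
    """Apply a rigid transform (R, t) to one object's Gaussians."""
    placed_means = means @ R.T + t                    # rotate, then translate centers
    placed_rots = np.einsum('ij,njk->nik', R, rots)   # compose per-Gaussian orientations
    return placed_means, placed_rots, scales          # scales unchanged under a rigid map

def compose_scene(objects):
    """Concatenate per-object Gaussian parameters into one scene-level set."""
    return tuple(np.concatenate(parts) for parts in zip(*objects))

# Two toy "objects": 3 and 2 Gaussians, identity orientations, unit scales.
obj_a = (np.zeros((3, 3)), np.tile(np.eye(3), (3, 1, 1)), np.ones((3, 3)))
obj_b = (np.zeros((2, 3)), np.tile(np.eye(3), (2, 1, 1)), np.ones((2, 3)))

# Place object B one unit along x; leave A at the origin.
R, t = np.eye(3), np.array([1.0, 0.0, 0.0])
obj_b = place_object(*obj_b, R, t)

scene_means, scene_rots, scene_scales = compose_scene([obj_a, obj_b])
```

Because every object can only be moved as a whole, this scheme cannot express deformation or topological change within an object, which is exactly the limitation the authors highlight in CG3D and AYG.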

References

Compositional generation remains an open problem since most methods do not support such creation~\citep{yin2023_4dgen, yi2023gaussiandreamer, tang2023dreamgaussian, xu2024agg}.

3D Gaussian as a New Era: A Survey  (2402.07181 - Fei et al., 2024) in Challenges, Section 6 (Generation)