Advanced Text-to-3D Generative Model

Invention Description

Generating high-quality 3D content is essential for applications such as gaming, film, and virtual reality, but it typically requires large, well-annotated 3D datasets. These datasets are expensive and time-consuming to create, limiting the scalability of current text-to-3D generation methods. As a result, many existing models struggle to produce geometrically consistent and high-fidelity 3D objects. There is a need for approaches that can generate accurate 3D content without relying on massive training datasets.

Researchers at Arizona State University have developed a text-to-3D generative model that leverages high-fidelity 3D objects, depth maps, and deep geometric moments (DGM) to improve the quality and consistency of 3D outputs. By incorporating geometric constraints directly into the learning process, the model ensures structurally accurate representations even with limited training data. It also integrates ControlNet and LoRA to condition on depth data, which supports diverse and consistent 3D representations. This design overcomes data scarcity while maintaining strong geometric integrity. The model uses 3D Gaussian Splatting for efficient rendering and refinement, and its well-structured, high-quality 3D outputs have been validated against state-of-the-art techniques.
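To give a sense of what a geometric moment measures, the sketch below computes classical (hand-crafted, non-learned) geometric moments of a toy depth map. It is illustrative only and is not the researchers' implementation: deep geometric moments are learned by a network, and the function names, the synthetic disk-shaped depth map, and the image size used here are all assumptions made for the example.

```python
import numpy as np


def raw_moment(depth, p, q):
    """Raw geometric moment M_pq = sum over pixels of x^p * y^q * depth(x, y)."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return float(np.sum((xs ** p) * (ys ** q) * depth))


def central_moment(depth, p, q):
    """Central moment mu_pq, measured about the depth-weighted centroid."""
    m00 = raw_moment(depth, 0, 0)
    cx = raw_moment(depth, 1, 0) / m00
    cy = raw_moment(depth, 0, 1) / m00
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return float(np.sum(((xs - cx) ** p) * ((ys - cy) ** q) * depth))


def make_disk_depth(size=32, cx=16, cy=16, radius=6):
    """Hypothetical toy depth map: a flat disk of depth 1 on a zero background."""
    ys, xs = np.mgrid[0:size, 0:size]
    return ((xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2).astype(np.float64)


# The same shape placed at two different image positions.
disk_a = make_disk_depth(cx=12, cy=12)
disk_b = make_disk_depth(cx=20, cy=18)

# First-order raw moments recover the centroid of each shape.
centroid_a_x = raw_moment(disk_a, 1, 0) / raw_moment(disk_a, 0, 0)

# Second-order central moments describe the spread of the shape and
# do not change when the shape is translated within the image.
mu20_a = central_moment(disk_a, 2, 0)
mu20_b = central_moment(disk_b, 2, 0)
mu11_a = central_moment(disk_a, 1, 1)
```

Because central moments are unchanged by translation, they describe the shape itself and not where it sits in the frame. That property is the kind of shape awareness a geometric constraint can contribute, although the learned moments in the actual model may behave differently.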

This novel text-to-3D generative model significantly improves geometric consistency and reduces viewpoint bias in 3D object generation without requiring large datasets. It enables efficient generation of high-fidelity 3D assets suitable for games, films, and virtual reality environments.

Potential Applications

3D asset design and creation for gaming, VR and augmented reality environments
Enhanced content generation for film and visual effects production
3D model generation for simulation and training
Creative tools for digital artists, studios, and educational content creators
Rapid prototyping and visualization in design and manufacturing sectors
Platforms for e-commerce enabling detailed 3D product visualization

Benefits and Advantages

Minimal dependency on large-scale 3D datasets
Applicable to diverse domains such as gaming, film, and VR
Uses 3D Gaussian Splatting for efficient rendering and geometric refinement
Reduces viewpoint bias and geometric distortions such as the Janus problem
Incorporates high-fidelity depth maps and deep geometric moments for enhanced shape awareness
Employs ControlNet and LoRA for conditioning on depth data, improving model consistency
Demonstrates superior performance with a 38% improvement in Janus rate over leading competitors

For more information about this opportunity, please see

Nath et al., IEEE/CVF WACV, 2025
