US20250218121
2025-07-03
Physics
G06T17/20
The patent application introduces a method for quickly generating three-dimensional (3D) heads from natural language input. Two-dimensional images are converted into a 3D neural radiance field (NeRF), which is then adjusted according to descriptive text. The open-source CLIP model evaluates how well the resulting image aligns with the provided text, and the final 3D NeRF can be transformed into a polygonal mesh for use in computer simulations such as video games.
Creating characters for computer simulations, such as non-player characters (NPCs) in video games, traditionally demands significant time and expertise. The application addresses this by letting game developers and users create characters with text or voice commands, which is intended to make character creation faster and more intuitive.
The system described includes computer storage containing instructions executable by a processor to generate a base 3D NeRF from multiple images. Text input is used with the CLIP model to modify the base NeRF, creating a virtual human head that can be displayed in simulations. The CLIP model scores image-text agreement as the cosine similarity between the image and text embeddings, and that score is used to steer the generated 3D head toward the textual description.
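The cosine-similarity scoring described above can be illustrated with a minimal sketch. It assumes the image and text have already been turned into embedding vectors; the vectors below are made-up numbers, not real CLIP outputs, and the names `cosine_similarity` and `clip_style_loss` are hypothetical helpers, not part of the application or of any CLIP library. Using `1 - similarity` as a loss is one common choice and is an assumption here, since the application summary does not state the exact objective.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (range -1 to 1)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def clip_style_loss(image_embedding, text_embedding):
    """Hypothetical loss: 0 when the embeddings align, 2 when they are opposite."""
    return 1.0 - cosine_similarity(image_embedding, text_embedding)


# Made-up 4-dimensional embeddings, for illustration only.
text_emb = [0.2, 0.9, 0.1, 0.4]      # stands in for the encoded text prompt
render_a = [0.25, 0.85, 0.05, 0.45]  # stands in for a render that fits the prompt
render_b = [0.9, -0.1, 0.6, 0.0]     # stands in for a render that does not

score_a = cosine_similarity(render_a, text_emb)
score_b = cosine_similarity(render_b, text_emb)
print("render_a score:", round(score_a, 3), "loss:", round(clip_style_loss(render_a, text_emb), 3))
print("render_b score:", round(score_b, 3), "loss:", round(clip_style_loss(render_b, text_emb), 3))
```

In a full system the loss would be minimized with respect to the NeRF parameters, so that rendered views of the head drift toward higher similarity with the text.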
The application details the use of machine learning models to minimize discrepancies between text descriptions and the generated 3D heads. CLIP is trained on image-text pairs, while the NeRF is represented by a fully connected deep network. Inputs to that network are a spatial location and a viewing direction, and its outputs are volume density and view-dependent radiance.
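The input and output signature of such a fully connected network can be sketched as follows. This is a toy, untrained model with random weights, written in plain Python for readability. The layer sizes, the class name `TinyRadianceField`, and the choice to make density depend on position only while color also depends on viewing direction are assumptions borrowed from the common NeRF formulation, not details confirmed by the application.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Fully connected layer: weights @ x + biases."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softplus(v):
    # Numerically stable log(1 + exp(v)); keeps density strictly positive.
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


class TinyRadianceField:
    """Hypothetical toy network: (x, y, z, theta, phi) -> (density, rgb)."""

    def __init__(self, hidden=16, seed=0):
        rng = random.Random(seed)
        self.trunk1 = make_layer(3, hidden, rng)
        self.trunk2 = make_layer(hidden, hidden, rng)
        self.density_head = make_layer(hidden, 1, rng)
        self.color_head = make_layer(hidden + 3, 3, rng)

    def query(self, x, y, z, theta, phi):
        # Position-only features feed the density head.
        features = relu(dense(self.trunk2, relu(dense(self.trunk1, [x, y, z]))))
        sigma = softplus(dense(self.density_head, features)[0])
        # Viewing angles become a unit direction vector for the color head.
        direction = [
            math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta),
        ]
        rgb = [sigmoid(c) for c in dense(self.color_head, features + direction)]
        return sigma, rgb


field = TinyRadianceField()
sigma_front, rgb_front = field.query(0.1, 0.2, 0.3, theta=0.5, phi=1.0)
sigma_side, rgb_side = field.query(0.1, 0.2, 0.3, theta=1.2, phi=2.0)
print("density:", sigma_front, sigma_side)  # same point, so same density
print("color from two views:", rgb_front, rgb_side)
```

Querying the same point from two directions returns the same density but different colors, which is what "view-dependent radiance" refers to. A practical implementation would use a tensor library and train the weights from images.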
The system can be implemented across various devices such as game consoles, VR headsets, smart TVs, and mobile devices, running operating systems from Microsoft, Apple, or Google, or Linux-based systems. Client and server components exchange data over a network, with provisions for security and reliability. This flexibility allows for broad application across consumer electronics ecosystems.