Invention Title:

ARTIFICIAL INTELLIGENCE DEVICE FOR A HYBRID NEURAL RENDERING MODEL FOR 3D ANIMATION AND METHOD THEREOF

Publication number:

US20250252643

Section:

Physics

Class:

G06T13/40

Smart overview of the Invention

A novel method for controlling a device transforms two-dimensional (2D) images into three-dimensional (3D) animations using a hybrid neural rendering model. The approach combines a triangular mesh with neural feature maps to animate fine details, such as hair, on 3D avatars. The process begins by receiving a 2D image and a hybrid 3D model composed of two sets of triangles, each carrying its own rigging information. Both sets of triangles are deformed according to animation parameters to produce dynamic, realistic animations, as illustrated in the sketch below.
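The following Python sketch shows one way this per-set deformation could be implemented. The blendshape-style rigging, the array shapes, and names such as deform_vertices and animation_params are illustrative assumptions, not details taken from the patent.

import numpy as np

def deform_vertices(rest_vertices, blendshapes, animation_params):
    """Blendshape-style deformation: rest pose plus a weighted sum of offsets.

    rest_vertices:    (V, 3) rest-pose vertex positions
    blendshapes:      (K, V, 3) per-parameter offset fields (the rigging information)
    animation_params: (K,) scalar weights driving the current frame
    """
    offsets = np.tensordot(animation_params, blendshapes, axes=1)  # (V, 3)
    return rest_vertices + offsets

# The two triangle sets share the deformation routine but carry separate rigging.
rng = np.random.default_rng(0)
surface_verts = rng.normal(size=(100, 3))      # first set, e.g. face/body triangles
detail_verts  = rng.normal(size=(40, 3))       # second set, e.g. hair-region triangles
surface_rig   = rng.normal(size=(5, 100, 3)) * 0.01
detail_rig    = rng.normal(size=(5, 40, 3)) * 0.01

params = np.array([0.3, 0.0, 0.8, 0.1, 0.5])   # per-frame animation parameters
deformed_surface = deform_vertices(surface_verts, surface_rig, params)
deformed_detail  = deform_vertices(detail_verts, detail_rig, params)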

Rendering Process

The deformed triangles are rendered with two distinct techniques. The first set of triangles undergoes conventional texture mapping, while the second set is processed through deferred neural rendering, which uses neural feature maps and alpha maps. This dual approach produces a high-quality animated 3D object that integrates consistently with the input 2D image, and it is particularly effective for real-time applications on devices with limited processing power, such as mobile or edge devices. The sketch below illustrates how the two branches could be composited.
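Below is a minimal numpy sketch of this two-branch compositing, assuming rasterization has already produced a texture-mapped image for the first triangle set and a feature map plus alpha map for the second. The 1x1 "neural shader" weights are random placeholders standing in for a trained network.

import numpy as np

H, W, F = 64, 64, 8
rng = np.random.default_rng(1)

textured_rgb = rng.uniform(size=(H, W, 3))   # branch 1: texture-mapped triangles
feature_map  = rng.normal(size=(H, W, F))    # branch 2: rasterized neural features
alpha_map    = rng.uniform(size=(H, W, 1))   # branch 2: per-pixel coverage

# Deferred shading: a tiny per-pixel (1x1) network maps features to RGB.
shader_weights = rng.normal(size=(F, 3)) * 0.1
neural_rgb = 1.0 / (1.0 + np.exp(-(feature_map @ shader_weights)))  # sigmoid into [0, 1]

# Alpha-composite the neurally rendered detail layer over the textured layer.
final_rgb = alpha_map * neural_rgb + (1.0 - alpha_map) * textured_rgb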

Technical Challenges

Existing 3D animation methods struggle to capture intricate details such as hair and to achieve high-fidelity rendering on resource-constrained devices. They often demand substantial computational resources, leading to inefficiencies in storage and processing. Moreover, accurately portraying natural facial expressions and hair movement remains a significant hurdle, limiting realism and user engagement.

Model Training

The training process for the AI model begins by receiving video images along with camera data and fitting a 3D morphable model (3DMM) to them. A prism lattice structure is constructed over detailed areas such as hair, and two neural fields, an opacity field and a feature field, are defined over the canonical space. A color prediction network converts features into colors based on the viewing direction. Training minimizes the error between rendered and ground-truth images, refining the opacity predictions while keeping rendering efficient enough for real-time use. The sketch below illustrates the color-prediction step.
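The following PyTorch sketch illustrates the color-prediction step: a small MLP maps a sampled feature vector plus a viewing direction to RGB and is trained with a photometric loss against ground-truth pixels. The network shape, the feature dimension, and the stand-in tensors are assumptions, not the patent's specification.

import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, DIR_DIM = 16, 3

color_net = nn.Sequential(
    nn.Linear(FEAT_DIM + DIR_DIM, 64), nn.ReLU(),
    nn.Linear(64, 3), nn.Sigmoid(),            # RGB in [0, 1]
)
optimizer = torch.optim.Adam(color_net.parameters(), lr=1e-3)

# Stand-ins for features sampled from the feature field, viewing directions
# derived from the camera data, and ground-truth colors from the training video.
features  = torch.randn(1024, FEAT_DIM)
view_dirs = F.normalize(torch.randn(1024, DIR_DIM), dim=-1)
gt_colors = torch.rand(1024, 3)

pred = color_net(torch.cat([features, view_dirs], dim=-1))
loss = F.mse_loss(pred, gt_colors)             # rendered vs. ground-truth error
optimizer.zero_grad()
loss.backward()
optimizer.step()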

Inference and Optimization

During inference, the trained AI model is optimized for edge devices by removing occluded triangles and sampling the opacity field to refine the model. This step bakes texture maps that store the sampled opacity values and feature vectors, which are then used to render the hybrid model efficiently. Final rendering includes a post-processing step that combines the neural features with the camera direction to produce realistic animations of 3D avatars. A sketch of the baking step follows.
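The following sketch shows one way the baking step could work: the learned opacity and feature fields are sampled once at texel locations and the results are stored as textures, so the edge device never evaluates the fields at runtime. The placeholder callables opacity_field and feature_field, the flat UV grid, and the resolution are all illustrative assumptions.

import numpy as np

RES, FEAT_DIM = 256, 8
rng = np.random.default_rng(2)

def opacity_field(points):
    # Placeholder for the trained opacity field: one scalar per query point.
    return rng.uniform(size=(points.shape[0], 1))

def feature_field(points):
    # Placeholder for the trained feature field: one vector per query point.
    return rng.normal(size=(points.shape[0], FEAT_DIM))

# One canonical-space sample point per texel (here, a flat UV grid).
u, v = np.meshgrid(np.linspace(0, 1, RES), np.linspace(0, 1, RES))
texel_points = np.stack([u.ravel(), v.ravel(), np.zeros(RES * RES)], axis=-1)

# Bake: evaluate each field once and store the results as texture maps.
alpha_texture   = opacity_field(texel_points).reshape(RES, RES, 1)
feature_texture = feature_field(texel_points).reshape(RES, RES, FEAT_DIM)

np.savez("baked_textures.npz", alpha=alpha_texture, features=feature_texture)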