Invention Title:

AVATAR CONTROL

Publication number:

US20240290021

Section:

Physics

Class:

G06T13/40

Overview of the Invention

A method for avatar control manipulates rays associated with dynamic objects using neural networks. First, a ray linked to a dynamic object at a given time is deformed by a first neural network conditioned on a latent code, producing a deformed ray. The deformed ray, together with the original ray and the time parameter, is then input to a second neural network to obtain a hyperspace code.
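A minimal numpy sketch of these two steps, with tiny feedforward networks standing in for the first and second networks. All dimensions, the latent-code size, and the exact input layouts are illustrative assumptions, not details taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """Tiny feedforward network: ReLU hidden layers, linear output."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def init_mlp(sizes):
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

LATENT_DIM, CODE_DIM = 8, 2                       # assumed sizes
ray = np.concatenate([np.zeros(3), [0.0, 0.0, 1.0]])  # origin + direction
t = np.array([0.25])                              # time parameter
z = rng.normal(size=LATENT_DIM)                   # latent code

# First network: deform the ray, conditioned on the latent code and time.
deform_net = init_mlp([6 + LATENT_DIM + 1, 64, 6])
deformed_ray = ray + mlp(deform_net, np.concatenate([ray, z, t]))

# Second network: deformed ray + original ray + time -> hyperspace code.
hyper_net = init_mlp([6 + 6 + 1, 64, CODE_DIM])
hyper_code = mlp(hyper_net, np.concatenate([deformed_ray, ray, t]))
```

Here the networks are randomly initialized; in practice both would be trained jointly with the downstream rendering network.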

Sampling and Combining Data

Next, multiple points are sampled from the deformed ray. Each sampled point is combined with the hyperspace code to form the network input, which is fed into a third neural network. This network generates the RGB values used to render images of a three-dimensional scene representing the dynamic object at a subsequent time.
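The sampling and rendering step could look roughly like the following sketch, which also adds a standard volume-rendering composite (densities and alpha compositing) to turn per-point outputs into one pixel color; the density channel and all sizes are assumptions for illustration, not specifics from the patent:

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(params, x):
    """Tiny feedforward network: ReLU hidden layers, linear output."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def init_mlp(sizes):
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

CODE_DIM, N_SAMPLES = 2, 16
origin, direction = np.zeros(3), np.array([0.0, 0.0, 1.0])
hyper_code = rng.normal(size=CODE_DIM)   # stand-in for the second network's output

# Sample points along the deformed ray and append the hyperspace code to each.
depths = np.linspace(0.1, 4.0, N_SAMPLES)
points = origin + depths[:, None] * direction                # (N_SAMPLES, 3)
net_in = np.concatenate([points, np.tile(hyper_code, (N_SAMPLES, 1))], axis=1)

# Third network: per-point RGB (sigmoid) plus an assumed density channel.
rgb_net = init_mlp([3 + CODE_DIM, 64, 4])
out = mlp(rgb_net, net_in)
rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))
sigma = np.log1p(np.exp(out[:, 3]))

# Alpha-composite the samples along the ray into a single pixel color.
delta = np.diff(depths, append=depths[-1] + 0.1)
alpha = 1.0 - np.exp(-sigma * delta)
trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
pixel = ((trans * alpha)[:, None] * rgb).sum(axis=0)         # final RGB
```

Repeating this per camera ray yields the rendered image of the scene at the queried time.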

Advancements in Machine Vision

Recent advancements in machine vision have allowed for improved representation of 3D objects derived from 2D images. Traditional methods often struggled with dynamic objects due to increased computational complexity and longer processing times. The current approach aims to streamline this process, enabling faster rendering and more accurate representations of dynamic objects, which are harder to model than static ones.

Application of Hyperspace Representation

The concept of hyperspace is crucial in this method, representing multiple dimensions, including time and radiance. By converting 3D rays into higher-dimensional representations, the system can accommodate topological variations in dynamic objects. This allows for more realistic renderings, particularly for complex features such as human facial expressions, enhancing the overall visual output.

Neural Network Configuration

The implementation involves three distinct neural networks, each serving specific functions within the process. The first two networks are typically shallow feedforward networks designed to handle the initial deformation and hyperspace coding. The third network processes the combined data to produce final RGB values for rendering. This structured approach facilitates efficient computation and accurate visual representation of avatars based on dynamic objects.
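The shallow/deep split described above can be sketched end to end: the first two networks are single-hidden-layer MLPs, while the third is deeper. Every width, depth, and dimension here is an illustrative assumption rather than a configuration stated in the patent:

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(params, x):
    """Tiny feedforward network: ReLU hidden layers, linear output."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def init_mlp(sizes):
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

LATENT_DIM, CODE_DIM, N = 8, 2, 16

# Shallow networks for deformation and hyperspace coding ...
deform_net = init_mlp([6 + LATENT_DIM + 1, 32, 6])       # one hidden layer
hyper_net  = init_mlp([6 + 6 + 1, 32, CODE_DIM])         # one hidden layer
# ... and a deeper network for the final RGB (plus assumed density) output.
color_net  = init_mlp([3 + CODE_DIM, 64, 64, 64, 4])     # three hidden layers

def render_ray(ray, t, z):
    d_ray = ray + mlp(deform_net, np.concatenate([ray, z, t]))
    code = mlp(hyper_net, np.concatenate([d_ray, ray, t]))
    depths = np.linspace(0.1, 4.0, N)
    pts = d_ray[:3] + depths[:, None] * d_ray[3:]
    out = mlp(color_net, np.concatenate([pts, np.tile(code, (N, 1))], axis=1))
    rgb = 1.0 / (1.0 + np.exp(-out[:, :3]))
    sigma = np.log1p(np.exp(out[:, 3]))
    delta = np.diff(depths, append=depths[-1] + 0.1)
    alpha = 1.0 - np.exp(-sigma * delta)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return ((trans * alpha)[:, None] * rgb).sum(axis=0)

ray = np.concatenate([np.zeros(3), [0.0, 0.0, 1.0]])
pixel = render_ray(ray, np.array([0.5]), rng.normal(size=LATENT_DIM))
```

Keeping the first two networks shallow keeps per-ray overhead low, since they run once per ray, while the deeper third network runs once per sampled point.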