Invention Title:

SYSTEM AND METHOD FOR GENERATING VIDEOS DEPICTING VIRTUAL CHARACTERS

Publication number:

US20240346735

Publication date:

Section:

Physics

Class:

G06T13/40

Inventors:

Applicant:

Smart overview of the Invention

Recent advances in generative machine learning enable the creation of virtual characters for applications such as chatbots and virtual environments. The technology generates videos depicting these characters, making user interaction more engaging and lifelike. The process uses audio and visual inputs to produce realistic character representations that react emotionally and physically to stimuli.

Video Processing and Character Generation

The method begins by accessing a video of a first subject, together with audio of that subject's speech, and an image of a second subject. Machine learning models process these inputs to generate a new video of the second subject, in which the second subject exhibits actions such as blinking that respond to the first subject's speech and expressions.
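The patent text describes the pipeline only at the interface level. As a minimal sketch of that interface, the types and function below are hypothetical names (not from the patent): a driving video of the first subject with its speech audio, a reference image of the second subject, and a per-frame generation loop standing in for the learned models.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DrivingVideo:
    """Hypothetical container for the first subject's video and speech audio."""
    frames: List[list]   # per-frame pixel data (placeholder)
    audio: list          # speech waveform samples (placeholder)

def generate_avatar_video(driving: DrivingVideo, reference_image: list,
                          num_frames: int) -> List[dict]:
    """Sketch of the generation step: each output frame of the second
    subject is conditioned on the reference image plus the corresponding
    driving frame and audio window."""
    out = []
    for t in range(num_frames):
        # In the real system, learned models fuse these signals into pixels;
        # here we only record which inputs condition each output frame.
        out.append({"reference": reference_image,
                    "driving_frame_index": t % len(driving.frames),
                    "audio_window": t})
    return out
```

The sketch makes the conditioning structure explicit: every generated frame depends on the single reference image and on time-aligned driving signals, which is what lets the second subject react to the first subject's speech and expressions.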

Feature Extraction and Emotional Representation

To produce a realistic representation, the system extracts feature vectors from the input video: visual features such as facial expressions and audio features such as speech patterns. From these features it constructs an emotion vector reflecting the first subject's emotional state, which in turn informs the behavior of the second subject.
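The patent does not specify how the emotion vector is computed. As an illustration only, the function below fuses visual and audio feature vectors and projects them onto a few emotion dimensions with a softmax; the function name, the strided-sum "projection", and the number of emotions are placeholder assumptions standing in for learned weights.

```python
import math

def emotion_vector(visual_feats, audio_feats, num_emotions=4):
    """Illustrative fusion: concatenate visual and audio features, project
    onto num_emotions scores, and normalize with a softmax so the result
    reads as a distribution over emotional states."""
    fused = list(visual_feats) + list(audio_feats)
    # Placeholder "learned" projection: each emotion scores a strided sum.
    scores = [sum(fused[i::num_emotions]) for i in range(num_emotions)]
    # Numerically stable softmax over the emotion scores.
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

The point of the sketch is the data flow: both modalities feed one shared representation, and the resulting vector is what conditions the second subject's behavior downstream.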

Latent Space Mapping and Avatar Creation

The method then maps continuous latent spaces to discrete representations that encode the second subject's motion characteristics. From this mapping, the system generates coefficients that control the avatar's visual behavior. The final avatar is rendered from these coefficients as a sequence of frames that accurately portray both movements and emotional reactions.
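Mapping a continuous latent space to a discrete representation is commonly done by vector quantization: each latent vector is snapped to its nearest entry in a learned codebook. The patent overview does not name the technique, so the sketch below is an assumption; the codebook here is a fixed toy stand-in for learned motion codes.

```python
def quantize(latent, codebook):
    """Map a continuous latent vector to the nearest codebook entry
    (vector quantization). Returns the code index and the discrete code."""
    def dist2(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: dist2(latent, codebook[i]))
    return idx, codebook[idx]

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy motion codes
idx, code = quantize([0.9, 0.1], codebook)       # -> index 1
```

Each discrete code (or a coefficient vector derived from it) would then drive one step of the avatar's motion, so a sequence of codes yields the sequence of rendered frames.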

Applications and User Expectations

This technology addresses user expectations for dynamic interactions in virtual settings. Users anticipate that virtual characters will not only mimic human-like movements but also respond with appropriate emotional cues during conversations. By utilizing these advanced generative techniques, virtual characters can engage in more authentic interactions, thereby enhancing experiences in gaming, customer service, and online education.