Invention Title:

GENERATIVE FACIAL MAPPING AND BODY BLENDING DURING VIDEO CAPTURE

Publication number:

US20240412379

Section:

Physics

Class:

G06T7/20

Smart overview of the Invention

The patent application describes a video generation facility that captures and enhances audio/video sequences of a person speaking. It uses facial mapping to blend each live video frame with an existing image or video of the same person. Specific regions of the two frames are spatially correlated and then merged, producing a coherent merged video accompanied by the audio track.
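The correlate-then-merge step can be illustrated with a minimal sketch. It is not the method claimed in the application: it assumes grayscale frames stored as nested lists, a pure translation between the two frames estimated from landmark centroids, and a linear feathered mask around a rectangular face region. The names `estimate_offset`, `region_weight`, and `merge_frames` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Frame = List[List[float]]          # assumption: grayscale frame as rows of intensities
Point = Tuple[float, float]        # (row, col) landmark position
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), inclusive


def estimate_offset(live_pts: List[Point], ref_pts: List[Point]) -> Tuple[int, int]:
    """Spatial correlation, simplified: the translation that moves the
    centroid of the live landmarks onto the centroid of the reference ones."""
    n = len(live_pts)
    d_row = sum(p[0] for p in ref_pts) / n - sum(p[0] for p in live_pts) / n
    d_col = sum(p[1] for p in ref_pts) / n - sum(p[1] for p in live_pts) / n
    return round(d_row), round(d_col)


def region_weight(row: int, col: int, box: Box, feather: int) -> float:
    """Blend weight for the live frame: 1.0 inside the box, falling
    linearly to 0.0 at `feather` pixels outside it."""
    top, left, bottom, right = box
    d_row = max(top - row, 0, row - bottom)
    d_col = max(left - col, 0, col - right)
    dist = max(d_row, d_col)
    if feather <= 0:
        return 1.0 if dist == 0 else 0.0
    return max(0.0, 1.0 - dist / feather)


def merge_frames(live: Frame, reference: Frame, box: Box,
                 offset: Tuple[int, int], feather: int = 2) -> Frame:
    """Copy the reference frame, then blend in the live pixels that map
    into `box` (given in reference coordinates) after applying `offset`."""
    d_row, d_col = offset
    merged = [row[:] for row in reference]
    for r in range(len(reference)):
        for c in range(len(reference[0])):
            w = region_weight(r, c, box, feather)
            src_r, src_c = r - d_row, c - d_col
            inside = 0 <= src_r < len(live) and 0 <= src_c < len(live[0])
            if w > 0 and inside:
                merged[r][c] = w * live[src_r][src_c] + (1 - w) * reference[r][c]
    return merged
```

A production system would more likely use a learned face model and a richer warp than a translation. The feathered weight is kept here because it shows how a merged region can fade into the surrounding frame without a visible seam.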

Background

Modern communication relies heavily on text-based forms such as email and messaging, which lack the social cues inherent in face-to-face interaction. Video is a more expressive medium, but it remains underused because of barriers such as workflow disruption and self-consciousness about one's appearance while recording.

Challenges and Solutions

The inventors identified key issues that prevent widespread use of asynchronous video, chiefly that users often feel unprepared to record because of their appearance or environment. Conventional solutions such as filters and background replacement are insufficient when users are not dressed appropriately or are in unsuitable settings. The facility addresses these challenges through generative facial mapping and body blending, allowing users to create professional-looking videos regardless of their conditions at recording time.

Implementation

The facility can be implemented across various platforms, including mobile apps, desktop applications, browser plugins, or websites. It offers users the ability to combine baseline videos with previously recorded acceptable versions, maintaining essential expressions while improving aesthetic elements. The system supports real-time processing and employs individual identity models to ensure accurate representation.

Technological Impact

Beyond enabling professional-looking video creation, the facility is described as conserving computing resources by reducing processing demands. Its operations, which involve machine learning models and real-time blending, are too complex to be carried out by a human mentally. Because it can be integrated into diverse computing systems, the facility makes expressive video messaging more accessible and efficient.