US20250272808
2025-08-28
Physics
G06T5/77
The patent application describes an image generation system that transfers the motion of a target image onto a source image while preserving the source's appearance features. The system applies a three-dimensional (3D) flow field as a spatial transformation, producing a warped image that mimics the target motion. The technology is particularly useful in face reenactment, animation, and video synthesis, where maintaining the visual identity of the source object is crucial.
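As an illustration of what applying a 3D flow field as a spatial transformation can look like, the following is a minimal sketch of backward warping with trilinear interpolation. The summary does not specify the sampling scheme, so the border clamping, the (dz, dy, dx) offset convention, and the names `trilinear_sample` and `warp_volume` are assumptions made for this example, not details taken from the application.

```python
import math

def trilinear_sample(vol, z, y, x):
    """Sample vol[z][y][x] at fractional coordinates, clamping to the border."""
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    z = min(max(z, 0.0), D - 1.0)
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    z0, y0, x0 = int(math.floor(z)), int(math.floor(y)), int(math.floor(x))
    z1, y1, x1 = min(z0 + 1, D - 1), min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    fz, fy, fx = z - z0, y - y0, x - x0
    value = 0.0
    # Blend the eight surrounding voxels by their trilinear weights.
    for zi, wz in ((z0, 1 - fz), (z1, fz)):
        for yi, wy in ((y0, 1 - fy), (y1, fy)):
            for xi, wx in ((x0, 1 - fx), (x1, fx)):
                value += wz * wy * wx * vol[zi][yi][xi]
    return value

def warp_volume(vol, flow):
    """Backward warp: out[z][y][x] is vol sampled at (z, y, x) + flow[z][y][x].

    flow[z][y][x] is an assumed (dz, dy, dx) offset in voxel units.
    """
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(D)]
    for z in range(D):
        for y in range(H):
            for x in range(W):
                dz, dy, dx = flow[z][y][x]
                out[z][y][x] = trilinear_sample(vol, z + dz, y + dy, x + dx)
    return out

# Toy 1 x 2 x 3 feature volume and a constant half-voxel shift along x.
vol = [[[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]]
flow = [[[(0.0, 0.0, 0.5) for _ in range(3)] for _ in range(2)]]
warped = warp_volume(vol, flow)
```

With this toy input, each output voxel is the average of its value and its right-hand neighbour, except at the right border where the sample location is clamped. A learned model would predict `flow` from the source and target rather than fixing it by hand.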
Generative artificial intelligence (AI) techniques are commonly used in image synthesis to create realistic images. Traditional methods such as image warping, style-based generative adversarial networks (GANs), and volumetric 3D head reconstruction each have limitations. Image warping struggles with pose variations, style-based GANs often miss fine details, and volumetric methods can produce rigid results. The described system aims to overcome these challenges by combining 2D and 3D methodologies, improving facial expression transfer and the handling of head pose variation.
The image generation model has three components: motion estimation, image warping, and image refinement. It uses adaptive instance normalization (AdaIN) for feature modulation and a U-shaped network (UNet)-based architecture for refining images. The model employs a cyclic warp loss to improve motion estimation accuracy, ensuring realistic rendering of facial details while preventing unwanted background motion.
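For readers unfamiliar with AdaIN, the sketch below shows its standard formulation: each channel of a content feature map is normalized to zero mean and unit standard deviation, then rescaled and shifted to match the per-channel statistics of a second feature map. The summary does not say where the modulation parameters come from in the described model, so taking them from a `style` feature map, as well as the names `channel_stats` and `adain`, are assumptions of this example.

```python
import math

def channel_stats(channel, eps=1e-5):
    """Return (mean, std) of one H x W channel; eps keeps std above zero."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)

def adain(content, style, eps=1e-5):
    """Standard AdaIN over lists of channels, each channel a 2D list.

    out = style_std * (content - content_mean) / content_std + style_mean
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = channel_stats(c_chan, eps)
        s_mean, s_std = channel_stats(s_chan, eps)
        out.append([[s_std * (v - c_mean) / c_std + s_mean for v in row]
                    for row in c_chan])
    return out

# One-channel toy example: content has mean 3, style has mean 20 and std 10.
content = [[[0.0, 2.0], [4.0, 6.0]]]
style = [[[10.0, 10.0], [30.0, 30.0]]]
modulated = adain(content, style)
out_mean, out_std = channel_stats(modulated[0], eps=0.0)
```

After modulation, the output channel keeps the spatial pattern of `content` but carries the mean and (up to the small `eps` term) the standard deviation of `style`.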
Training occurs in two phases. In the first phase, the 3D warping and inpainting networks are pre-trained independently to ensure effective deformation generation and background restoration. In the second phase, end-to-end training optimizes the entire model using the cyclic warp loss for accurate motion estimation. This approach minimizes differences between synthesized images and source images, ensuring precise expression transfer and improved performance in challenging face reenactment tasks.
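The summary does not give a formula for the cyclic warp loss. One plausible reading, shown here purely as an assumption, is a cycle-consistency penalty: warp the source with a forward flow, warp the result back with a reverse flow, and measure the mean absolute difference from the original source. The sketch uses 2D images and bilinear sampling for brevity; the names `bilinear_sample`, `warp`, and `cyclic_warp_loss` and the (dy, dx) flow convention are hypothetical.

```python
def bilinear_sample(img, y, x):
    """Sample img[y][x] at fractional coordinates, clamping to the border."""
    H, W = len(img), len(img[0])
    y = min(max(y, 0.0), H - 1.0)
    x = min(max(x, 0.0), W - 1.0)
    y0, x0 = int(y), int(x)  # coordinates are non-negative after clamping
    y1, x1 = min(y0 + 1, H - 1), min(x0 + 1, W - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom

def warp(img, flow):
    """Backward warp: out[y][x] is img sampled at (y, x) + flow[y][x]."""
    H, W = len(img), len(img[0])
    return [[bilinear_sample(img, y + flow[y][x][0], x + flow[y][x][1])
             for x in range(W)] for y in range(H)]

def cyclic_warp_loss(source, flow_fwd, flow_bwd):
    """Assumed form: mean |warp(warp(source, fwd), bwd) - source|."""
    restored = warp(warp(source, flow_fwd), flow_bwd)
    H, W = len(source), len(source[0])
    total = sum(abs(restored[y][x] - source[y][x])
                for y in range(H) for x in range(W))
    return total / (H * W)

# Toy 1 x 4 image with hand-written flows.
source = [[0.0, 1.0, 2.0, 3.0]]
shift_right = [[(0.0, 1.0)] * 4]   # sample one pixel to the right
shift_left = [[(0.0, -1.0)] * 4]   # sample one pixel to the left
no_shift = [[(0.0, 0.0)] * 4]

consistent = cyclic_warp_loss(source, shift_right, shift_left)
inconsistent = cyclic_warp_loss(source, shift_right, no_shift)
```

In this toy case the mutually inverse flows give a loss of 0.25, which comes entirely from clamping at the image border, while the mismatched pair gives 0.75. Under this reading, minimizing the loss pushes the estimated forward and reverse motions to agree with each other.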