Invention Title:

SEMANTIC IMAGE SYNTHESIS

Publication number:

US20250086849

Publication date:

Section:

Physics

Class:

G06T11/00

Inventors:

Applicant:

Smart overview of the Invention

Overview:

The patent application describes a machine learning system for generating images. An image processing apparatus uses a multi-scale guided diffusion model to synthesize images from text prompts. Users can specify layout information and a precision level for each element of an image, which gives them detailed control over where objects are placed and how they appear in the generated image.

Image Generation Process:

The system receives user input via a user interface, including text prompts describing image elements, layout information indicating target regions, and precision levels for these elements. This input is used to create a text feature pyramid, which consists of multiple text feature maps at different scales. The pyramid serves as the foundation for generating images where objects are accurately placed and shaped according to user specifications.
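The idea of a text feature pyramid can be illustrated with a minimal sketch. It is not the application's implementation: the function names, the use of max-pooling to downsample region masks, the halving of resolution per level, and the two-dimensional toy embeddings are all assumptions made for illustration. In a real system the embeddings would come from a text encoder.

```python
import numpy as np

def downsample_mask(mask, factor):
    """Max-pool a binary (H, W) mask over factor x factor blocks.
    Assumes H and W are divisible by factor."""
    h, w = mask.shape
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.max(axis=(1, 3))

def build_text_feature_pyramid(elements, size, num_levels):
    """Hypothetical helper: elements is a list of (embedding, mask) pairs,
    where embedding is a (dim,) vector for one text prompt and mask is a
    (size, size) 0/1 array marking that prompt's target region.
    Returns feature maps ordered finest first; level k has spatial size
    size / 2**k. Later elements overwrite earlier ones where regions overlap."""
    dim = elements[0][0].shape[0]
    pyramid = []
    for level in range(num_levels):
        factor = 2 ** level
        fmap = np.zeros((size // factor, size // factor, dim), dtype=np.float32)
        for embedding, mask in elements:
            region = downsample_mask(mask, factor).astype(bool)
            fmap[region] = embedding  # write the text feature into the region
        pyramid.append(fmap)
    return pyramid

# Toy layout: "sky" fills the top half, "tree" is a vertical strip.
sky = np.zeros((8, 8), dtype=np.uint8)
sky[:4, :] = 1
tree = np.zeros((8, 8), dtype=np.uint8)
tree[2:8, 5:7] = 1
elements = [(np.array([1.0, 0.0]), sky), (np.array([0.0, 1.0]), tree)]
pyramid = build_text_feature_pyramid(elements, size=8, num_levels=3)
shapes = [fmap.shape for fmap in pyramid]
```

Each level carries the same prompts at a different spatial resolution, so a model consuming the pyramid can see both coarse placement and fine region boundaries.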

Diffusion Model and Precision Control:

A key component of the system is the diffusion model, which is trained to interpret the text feature pyramid and produce images with specified object layouts. The model incorporates a mask pyramid that encodes shape precision by selectively omitting regions with lower precision, thus ensuring that generated objects adhere closely to the desired layout. Users can adjust precision levels to control how strictly objects conform to the input layout.
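One way to picture a mask pyramid that encodes precision is sketched below. The summary does not say how precision levels map to pyramid scales, so the rule used here is an assumption: an element with a lower precision level has its mask omitted (zeroed) at the finer scales and kept only at the coarser ones. The names `build_mask_pyramid` and `max_pool` are hypothetical.

```python
import numpy as np

def max_pool(mask, factor):
    """Max-pool a binary (H, W) mask over factor x factor blocks.
    Assumes H and W are divisible by factor."""
    h, w = mask.shape
    return mask.reshape(h // factor, factor, w // factor, factor).max(axis=(1, 3))

def build_mask_pyramid(mask, precision, num_levels):
    """Hypothetical helper: return one mask per level, finest first.
    precision is an integer in [0, num_levels - 1]. Assumed rule:
    precision 0 keeps only the coarsest level, and the highest
    precision keeps every level. Omitted levels are all zeros."""
    pyramid = []
    for level in range(num_levels):  # level 0 is the finest scale
        coarse = max_pool(mask, 2 ** level)
        keep = level >= num_levels - 1 - precision
        pyramid.append(coarse if keep else np.zeros_like(coarse))
    return pyramid

# A 4x4 square region inside an 8x8 canvas.
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 1

loose = build_mask_pyramid(mask, precision=0, num_levels=3)
strict = build_mask_pyramid(mask, precision=2, num_levels=3)
loose_sums = [int(m.sum()) for m in loose]
strict_sums = [int(m.sum()) for m in strict]
```

Under this assumed rule, the low-precision element constrains the image only at the coarsest scale, leaving its exact outline free, while the high-precision element constrains every scale.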

Applications and Benefits:

According to the application, this technology improves on traditional image generation models by giving users finer control over object placement and shape within synthesized images. It is suited to fields that require precise image synthesis, such as digital art creation, design visualization, and automated content generation. Support for semantic labels and virtual brushes extends its usefulness in creative and professional settings.

Implementation Details:

The patent application describes the system architecture and the processes for training and using the diffusion model. It refers to figures that illustrate example applications, system components, and the procedural steps of image generation.