Invention Title:

AI-BASED VISUAL STYLE TRANSFER

Publication number:

US20250225430

Publication date:

Section:

Physics

Class:

G06N20/00

Inventors:

Assignee:

Applicant:

Smart overview of the Invention

The patent application describes a data processing system for AI-based visual style transfer. The system generates visual content by combining a user-selected style image with a topic content item, such as text or another image. It uses generative models to produce outputs that keep the aesthetic style of the original image while introducing the new topic, automating prompt-writing work that users would otherwise do by hand.

Background

Existing AI-driven image generation platforms often require users to write detailed text prompts to get the visual output they want. This is cumbersome and time-consuming, because users must repeatedly adjust their inputs until the results are satisfactory. An easier, more intuitive approach to visual content creation is needed, especially for users who want to reuse existing images and designs without extensive prompt engineering.

System Functionality

The system receives a user prompt that includes both a style visual content item and a topic content item. A prompt construction unit combines these inputs with instructions for a generative model, which produces a textual description that guides generation of the output. The output replaces visual elements of the style image according to the topic content while preserving the original style, and is presented through a user interface.
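
The prompt construction step can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the application: the UserPrompt container, the build_model_request function, the request layout, and the wording of INSTRUCTIONS (including the listed style attributes) are assumptions made only to show how a style item, a topic item, and model instructions might be combined into one request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserPrompt:
    """Hypothetical container for the two user inputs named in the text."""
    style_image_path: str                   # the style visual content item
    topic_text: Optional[str] = None        # topic content item given as text
    topic_image_path: Optional[str] = None  # or given as another image


# Illustrative instruction text; the application's actual wording is not given.
INSTRUCTIONS = (
    "Describe the attached style image as a text prompt for an image generator. "
    "Replace its visual elements to match the topic, but preserve its style "
    "(palette, layout, typography, rendering technique)."
)


def build_model_request(prompt: UserPrompt) -> dict:
    """Combine the style item, topic item, and instructions into one request."""
    if not (prompt.topic_text or prompt.topic_image_path):
        raise ValueError("a topic content item (text or image) is required")
    # The style image is always attached; a topic image is attached if present.
    attachments = [{"role": "style", "path": prompt.style_image_path}]
    if prompt.topic_image_path:
        attachments.append({"role": "topic", "path": prompt.topic_image_path})
    # A textual topic is appended to the instruction text.
    text = INSTRUCTIONS
    if prompt.topic_text:
        text += f"\nTopic: {prompt.topic_text}"
    return {"text": text, "attachments": attachments}
```

In this sketch the returned dictionary stands in for whatever payload a real system would send to its generative model; the model's textual description would then drive image generation.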

Technical Solution

To address the challenges in existing systems, this approach automates the conversion of images into text prompts for generative models. Users upload images directly as style prompts, and the system handles image generation and style transfer on its own. The pipeline simplifies workflows and improves accessibility: users can produce high-quality stylized images without expertise in writing text prompts.
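
The two-stage flow described here, image to text prompt and then text prompt to image, might look like the following Python sketch. The describe and generate callables are hypothetical stand-ins for real models, the stub implementations exist only so the example runs without any external service, and returning several candidates per request is a choice made for this sketch.

```python
from typing import Callable, List

# Hypothetical model interfaces: plain callables standing in for whatever
# multimodal and image-generation models a real system would call.
DescribeFn = Callable[[bytes, str], str]  # (style image, topic) -> text prompt
GenerateFn = Callable[[str], bytes]       # text prompt -> generated image bytes


def style_transfer(style_image: bytes, topic: str,
                   describe: DescribeFn, generate: GenerateFn,
                   n_candidates: int = 3) -> List[bytes]:
    """Turn an uploaded style image plus a topic into candidate images."""
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    # Step 1: the system, not the user, writes the text prompt.
    text_prompt = describe(style_image, topic)
    # Step 2: generate several candidates from the same prompt.
    return [generate(text_prompt) for _ in range(n_candidates)]


# Stub models so the sketch runs on its own; they do no real image processing.
def fake_describe(image: bytes, topic: str) -> str:
    return f"an image about {topic} in the style of a {len(image)}-byte reference"


def fake_generate(text_prompt: str) -> bytes:
    return text_prompt.encode("utf-8")


candidates = style_transfer(b"\x89PNG", "autumn sale", fake_describe, fake_generate)
```

Passing the models in as parameters keeps the pipeline logic separate from any particular model provider, so the stubs could be swapped for real model calls without changing style_transfer.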

Benefits

  • User Convenience: By allowing users to upload images directly as style prompts, the system reduces reliance on intricate text prompt engineering.
  • Enhanced User Experience: The automated process offers multiple output choices, improving satisfaction by presenting high-quality images consistent with user-selected styles and topics.
  • Technical Efficiency: Leveraging large multimodal models (LMMs) and large visual models (LVMs) facilitates effective style transfer within design platforms.