US20240378763
2024-11-14
Physics
G06T11/00
The patent application describes a method and system for creating an extended reality (XR) try-on experience using a diffusion model. This involves generating a realistic image where a target fashion item from one image is digitally placed onto a real-world object depicted in another image. The process uses machine learning to seamlessly integrate the fashion item into the original image, ensuring any incomplete areas are filled to create a cohesive visual representation.
The invention pertains to generating images using a diffusion model, particularly in augmented reality (AR) contexts. AR systems enhance real-world environments by overlaying virtual elements, with applications in gaming, messaging, and other areas. AR systems are one class of extended reality (XR) systems.
Current AR systems often require expensive equipment and significant user effort to produce high-quality images. Users typically have to adjust settings such as lighting and placement by hand to achieve the desired result, which is costly and time-consuming. The disclosed method aims to automate and simplify this process, making it more accessible and efficient for users.
The system receives two images: one of a real-world object and another of a target fashion item. It creates a warped image by replacing parts of the real-world object with the fashion item and uses segmentation maps to identify incomplete areas. A generative machine learning model then fills these gaps, producing an artificial image that realistically shows the object wearing the fashion item. This technique reduces the resources needed for creating high-quality XR experiences.
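The data flow described above (warp, locate incomplete areas, fill) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the application: images are small grids of grayscale integers, the segmentation maps are given as boolean masks, the function names are hypothetical, and the generative model is replaced by a trivial neighbor-averaging placeholder so that the example runs without any trained model.

```python
from typing import List, Optional

Image = List[List[Optional[int]]]  # None marks a pixel that has no content yet
Mask = List[List[bool]]


def make_warped_image(obj: Image, old_mask: Mask, item: Image, item_mask: Mask) -> Image:
    """Clear the region to be replaced, then paste the target item over it."""
    warped = [row[:] for row in obj]
    for y in range(len(obj)):
        for x in range(len(obj[0])):
            if old_mask[y][x]:
                warped[y][x] = None        # original content removed
            if item_mask[y][x]:
                warped[y][x] = item[y][x]  # target fashion item placed
    return warped


def incomplete_mask(old_mask: Mask, item_mask: Mask) -> Mask:
    """Pixels that were removed but are not covered by the new item."""
    return [[o and not i for o, i in zip(o_row, i_row)]
            for o_row, i_row in zip(old_mask, item_mask)]


def fill_incomplete(warped: Image, holes: Mask) -> Image:
    """Placeholder for the generative model (NOT a diffusion model):
    repeatedly fills each hole with the mean of its known 4-neighbors."""
    h, w = len(warped), len(warped[0])
    out = [row[:] for row in warped]
    remaining = {(y, x) for y in range(h) for x in range(w) if holes[y][x]}
    while remaining:
        filled = {}
        for y, x in remaining:
            known = [out[ny][nx]
                     for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                     if 0 <= ny < h and 0 <= nx < w and out[ny][nx] is not None]
            if known:
                filled[(y, x)] = round(sum(known) / len(known))
        if not filled:  # no hole touches known content; stop instead of looping
            break
        for (y, x), value in filled.items():
            out[y][x] = value
        remaining.difference_update(filled)
    return out


# Toy 3x3 inputs (hypothetical values): 10 = background, 50 = old garment, 90 = new item.
obj = [[10, 10, 10], [10, 50, 50], [10, 50, 50]]
old_mask = [[False, False, False], [False, True, True], [False, True, True]]
item = [[90] * 3 for _ in range(3)]
item_mask = [[False, False, False], [False, True, False], [False, True, False]]

warped = make_warped_image(obj, old_mask, item, item_mask)
holes = incomplete_mask(old_mask, item_mask)
result = fill_incomplete(warped, holes)
```

In this sketch the new item covers only the left column of the old garment, so the right column becomes the incomplete area that the fill step must complete. In the described system that step would be performed by the generative machine learning model rather than by neighbor averaging.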
The system operates within a networked environment where interaction clients on user devices communicate via servers. These servers provide functionalities such as data exchange and media processing, supporting the seamless integration of AR features in user interactions. The interaction clients can access external applications, enhancing their capabilities through linked resources provided by third parties or the system's creator.