US20250148701
2025-05-08
Physics
G06T15/40
The patent application describes a method for dynamically overlapping moving objects with real and virtual scenes using a video see-through (VST) extended reality (XR) device. It involves capturing image frames and depth data of a scene, which includes both static content and a moving object that is part of a user's body. The process uses machine learning to generate masks that distinguish human skin pixels from the rest of the scene, enabling accurate image reconstruction.
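The filing specifies only that a machine learning model classifies skin pixels; as a minimal stand-in sketch, the classification step can be illustrated with a fixed chroma threshold in YCbCr space. The function name, the threshold box, and the BT.601 conversion are illustrative assumptions, not the patented model:

```python
import numpy as np

def skin_mask(rgb: np.ndarray) -> np.ndarray:
    """Return a boolean mask of likely skin pixels for an HxWx3 uint8 frame.

    A fixed YCbCr chroma box stands in for the learned model here; the
    described method would apply a trained segmentation network instead.
    """
    rgb = rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # RGB -> CbCr chroma components (ITU-R BT.601 weights)
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Commonly cited skin-tone region in CbCr space
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```

In the patented method this mask separates the moving body part from the static scene so each can be reconstructed independently.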
The method integrates several steps to achieve seamless blending of real and virtual elements. Initially, image frames and depth data are obtained from the VST XR device's sensors. A machine learning model then creates masks to separate moving objects from static scenes by identifying human skin pixels. The reconstructed images of both moving objects and static scenes are then combined with virtual features to produce a cohesive visual output.
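The combination step above can be sketched as a depth-aware per-pixel blend. The function signature, array layout, and the z-test-plus-mask rule are assumptions for illustration; the filing describes the blend only at the level of masked reconstruction followed by combination with virtual features:

```python
import numpy as np

def composite(frame: np.ndarray,
              depth: np.ndarray,
              skin: np.ndarray,
              virtual_rgb: np.ndarray,
              virtual_depth: np.ndarray) -> np.ndarray:
    """Depth-aware blend of a captured frame with rendered virtual content.

    frame / virtual_rgb: HxWx3 color arrays; depth / virtual_depth: HxW
    distances from the camera; skin: HxW boolean mask of moving-object
    (body) pixels produced by the ML model.
    """
    # Virtual content is shown wherever it is closer to the camera...
    show_virtual = virtual_depth < depth
    # ...except on masked body pixels, where captured depth is often
    # noisy, so the mask keeps the user's hand in front of virtual objects.
    show_virtual &= ~skin
    return np.where(show_virtual[..., None], virtual_rgb, frame)
```

The mask thus acts as a reliability prior for the moving object, while the depth test handles occlusion between the static scene and virtual features.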
The VST XR device is equipped with imaging sensors, displays, and processing units capable of executing the described method. The processing unit captures scene data, generates masks, reconstructs images, and combines them with virtual elements for display. This configuration ensures that moving objects, such as parts of a user's body, are accurately integrated into both real and virtual environments.
A non-transitory machine-readable medium stores instructions for executing the method on the VST XR device. These instructions guide the processor in capturing scene data, generating masks using machine learning models, reconstructing images, and rendering the final combined visuals. This software component is essential for achieving the dynamic overlapping of real and virtual scenes.
This technology enhances user interaction with XR systems by providing a more immersive experience through accurate blending of real-world movements with virtual content. It enables applications in various fields such as gaming, training simulations, and augmented reality environments, where realistic integration of user movement is crucial.