Invention Title:

MACHINE-BASED OBJECT RECOGNITION OF VIDEO CONTENT

Publication number:

US20240403946

Publication date:

Section:

Physics

Class:

G06Q30/0643

Inventors:

Assignee:

Applicant:

Smart overview of the Invention

The invention is a system that uses neural networks to identify items in video content dynamically and unobtrusively. Unlike traditional methods that require manual annotation, it responds to user interactions such as voice queries, touchscreen taps, or cursor movements. It provides information about both visible and non-visible items without interrupting video playback.

Background and Current Challenges

Existing interactive video interfaces are limited by their reliance on manual annotations, which are costly and time-consuming to produce. Previous attempts to enhance video interactivity have been disruptive and unintuitive, and have failed to provide comprehensive information. The new technology aims to overcome these limitations by generating metadata automatically and by offering a user interface that integrates smoothly with the viewing experience.

Innovative User Interfaces

The technology includes two main user interfaces: the Overlaid Video Player and the Adjacent Layout. These interfaces work with a real-time image recognition engine to identify objects in videos and present relevant information. The Overlaid Video Player allows users to interact with products shown in videos efficiently, increasing engagement and purchase rates. Meanwhile, the Adjacent Layout offers additional visibility for products, facilitating easier purchases without disrupting the video content.
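As an illustration of how a tap or cursor position on the Overlaid Video Player might be resolved to a recognized object, the sketch below tests the pointer position against a list of detections. The summary does not say how object regions are represented; the bounding-box format, the normalized coordinates, and the names `Detection` and `detection_at` are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    """Hypothetical detection: a label plus a box in normalized (0..1) frame coordinates."""
    label: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)

    def area(self) -> float:
        return self.width * self.height


def detection_at(detections: List[Detection], px: float, py: float) -> Optional[Detection]:
    """Return the smallest detection under the pointer, or None if nothing is hit.

    Preferring the smallest box is an assumed tie-break so that a small item
    in front of a large one (a cushion on a sofa) remains selectable.
    """
    hits = [d for d in detections if d.contains(px, py)]
    return min(hits, key=lambda d: d.area(), default=None)


frame_detections = [
    Detection("sofa", 0.1, 0.4, 0.6, 0.5),
    Detection("cushion", 0.3, 0.5, 0.1, 0.1),
]
tapped = detection_at(frame_detections, 0.35, 0.55)
print(tapped.label if tapped else "no item at this position")
```

Normalized coordinates keep the hit test independent of the rendered player size, which would matter if the same data feeds both the overlaid and adjacent layouts.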

Technical Solutions for Seamless Interaction

A key challenge addressed by this technology is the frequent updating of the Overlaid Video Player interface. By preloading and caching data for each video segment on the user's device, updates can occur rapidly without relying on continuous network requests. This ensures a smooth user experience while maintaining synchronization with the video content. Compatibility with various video player technologies is achieved through a generic relay system, allowing broad distribution across different platforms.
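The preloading and caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the fixed segment length, the `fetch_segment` callback, the lookahead of one segment, and the overlay record fields (`label`, `start`, `end`) are all assumptions introduced here.

```python
from typing import Callable, Dict, List


class SegmentOverlayCache:
    """Caches overlay data per video segment so UI updates need no network call."""

    def __init__(self, segment_seconds: float,
                 fetch_segment: Callable[[int], List[dict]],
                 lookahead: int = 1) -> None:
        self.segment_seconds = segment_seconds
        self.fetch_segment = fetch_segment   # hypothetical loader, e.g. one network request
        self.lookahead = lookahead           # how many upcoming segments to preload
        self._cache: Dict[int, List[dict]] = {}

    def segment_index(self, playback_time: float) -> int:
        return int(playback_time // self.segment_seconds)

    def preload(self, playback_time: float) -> None:
        """Fetch the current segment and the next `lookahead` segments if missing."""
        first = self.segment_index(playback_time)
        for index in range(first, first + self.lookahead + 1):
            if index not in self._cache:
                self._cache[index] = self.fetch_segment(index)

    def overlays_at(self, playback_time: float) -> List[dict]:
        """Return overlays active at this time, served from the local cache."""
        self.preload(playback_time)
        segment = self._cache[self.segment_index(playback_time)]
        return [o for o in segment if o["start"] <= playback_time < o["end"]]


fetch_calls: List[int] = []


def fake_fetch(index: int) -> List[dict]:
    # Stand-in for a server request; records which segments were requested.
    fetch_calls.append(index)
    return [{"label": f"item-{index}", "start": index * 10.0, "end": index * 10.0 + 5.0}]


cache = SegmentOverlayCache(segment_seconds=10.0, fetch_segment=fake_fetch)
first_lookup = cache.overlays_at(2.0)     # loads segments 0 and 1
second_lookup = cache.overlays_at(12.0)   # segment 1 is cached; only segment 2 is loaded
third_lookup = cache.overlays_at(7.0)     # fully cached, no fetch
```

Because lookups are served from memory, the interface can be refreshed on every playback tick while the number of requests stays proportional to the number of segments rather than the number of updates.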

Advanced Object Recognition Capabilities

The system employs neural networks to recognize objects within videos: it generates an embedding for each object and compares it against a database of known objects. A match allows the system to identify the object and retrieve its metadata, which can include product links or similar items. User requests can arrive through various input methods, such as voice commands or touch interactions, which makes the system accessible and usable across different devices.
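The embedding comparison described above might look like the sketch below. The summary does not specify the similarity measure, the embedding size, or the database layout; cosine similarity, the 0.8 threshold, the three-dimensional vectors, and the `CATALOG` entries are assumptions chosen to keep the example small.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical database of known objects: an embedding plus metadata per entry.
CATALOG: Dict[str, dict] = {
    "lamp-01": {
        "embedding": [0.9, 0.1, 0.0],
        "metadata": {"title": "Desk lamp", "link": "https://example.com/lamp"},
    },
    "chair-07": {
        "embedding": [0.0, 0.2, 0.9],
        "metadata": {"title": "Lounge chair", "link": "https://example.com/chair"},
    },
}


def identify(embedding: List[float], catalog: Dict[str, dict],
             threshold: float = 0.8) -> Optional[dict]:
    """Return the best catalog match at or above the threshold, else None."""
    best_id, best_score = None, threshold
    for object_id, entry in catalog.items():
        score = cosine_similarity(embedding, entry["embedding"])
        if score >= best_score:
            best_id, best_score = object_id, score
    if best_id is None:
        return None
    return {"object_id": best_id, "score": best_score,
            "metadata": catalog[best_id]["metadata"]}


match = identify([0.8, 0.2, 0.1], CATALOG)
print(match["metadata"]["title"] if match else "no confident match")
```

A linear scan is enough for a handful of entries; a catalog of realistic size would more likely use an approximate nearest-neighbor index, which the summary neither confirms nor rules out.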