Invention Title:

SYSTEM AND METHOD FOR GENERATING A REAL-TIME, INTERACTIVE COMPANION ON A USER DEVICE

Publication number:

US20260112099

Publication date:

Section:

Physics

Class:

G06T13/40

Inventors:

Applicant:

Smart overview of the Invention

The invention describes a system and method for creating a real-time, interactive digital companion on a user device. The system comprises processors and memory storing instructions for processing user inputs such as text, voice, touch, or gesture data. Using natural-language processing, the system extracts semantic and emotional context from these inputs and generates animation parameters for facial expressions, gestures, and full-body motion. The resulting animations are rendered in sync with user inputs without transmitting data off-device, which keeps latency low, improves privacy, and reduces reliance on cloud infrastructure.
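The flow described above (input, context extraction, animation parameters) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the source does not specify the natural-language model or the parameter format, so a toy keyword lexicon stands in for the NLP step, and the names EMOTION_KEYWORDS, AnimationParams, extract_emotion, and generate_animation, along with all numeric values, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical keyword lexicon standing in for the natural-language
# processing step; the source does not specify a model.
EMOTION_KEYWORDS = {
    "joy": {"great", "happy", "love", "thanks"},
    "sadness": {"sad", "sorry", "miss", "lost"},
}


@dataclass
class AnimationParams:
    emotion: str        # dominant emotional context of the input
    mouth_smile: float  # facial weight, -1.0 (frown) to 1.0 (smile)
    gesture: str        # illustrative gesture clip name
    posture: float      # full-body posture, 0.0 (slumped) to 1.0 (upright)


# Illustrative mapping from emotion to animation parameters (assumed values).
PARAMS_BY_EMOTION = {
    "joy": AnimationParams("joy", 0.8, "wave", 0.9),
    "sadness": AnimationParams("sadness", -0.6, "head_down", 0.3),
    "neutral": AnimationParams("neutral", 0.0, "idle", 0.6),
}


def extract_emotion(text: str) -> str:
    """Return the emotion whose keywords match most words, else 'neutral'."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    scores = {e: len(words & kws) for e, kws in EMOTION_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def generate_animation(text: str) -> AnimationParams:
    """Map user text to animation parameters entirely in-process."""
    return PARAMS_BY_EMOTION[extract_emotion(text)]
```

Everything in this sketch runs in a single local process, which mirrors the on-device constraint: no input text or derived parameter leaves the function call. A real system would replace the lexicon with a learned model and feed the parameters to a renderer.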

Technical Field

The invention relates to real-time character animation, specifically the generation of interactive companions on user devices. The approach provides low-latency animation while preserving privacy through on-device processing and minimizing dependency on cloud services. Traditional systems often rely on fixed templates or cloud-based computation, which limits adaptability and adds latency from data transmission.

Background

Traditional animation systems use static frames. Their output is predictable and smooth but lacks adaptability and spontaneity, which makes interactions less engaging. These systems often depend on cloud-based computation, which introduces latency and requires constant internet connectivity, limiting their use in low-bandwidth or offline environments. Moreover, transmitting sensitive user data to third-party servers raises privacy concerns and incurs high operational costs.

Objectives

The invention aims to dynamically generate real-time animations from audio or contextual input, improving user engagement and reducing costs by eliminating cloud infrastructure dependency. It also seeks to enable offline-ready animation without continuous internet connectivity and to synchronize body, face, and lip-sync animations with conversation context, enhancing natural and emotionally expressive interactions.

Embodiments

Various embodiments of the invention use a transformer-based encoder-decoder for natural interactions and a mapping module for synchronized animation. The system normalizes multimodal inputs for reliable processing and uses a short-term memory to keep emotional expressions consistent. On-device optimization minimizes latency and power consumption. Together, these features enhance user engagement by producing adaptive, contextually appropriate animations.
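Two of the features above, input normalization and short-term memory, can be sketched in Python as follows. This is a rough illustration, not the claimed implementation: the source does not say how inputs are normalized or how the memory works, so whitespace and case folding stand in for normalization, and a sliding-window average over a scalar emotional valence stands in for the memory. The names NormalizedInput, normalize, and EmotionMemory are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class NormalizedInput:
    modality: str  # e.g. "text", "voice", "touch", "gesture"
    content: str   # payload reduced to a common lowercase text form


def normalize(modality: str, payload: str) -> NormalizedInput:
    """Fold case and collapse whitespace so later stages see one format."""
    return NormalizedInput(
        modality.strip().lower(),
        " ".join(payload.split()).lower(),
    )


class EmotionMemory:
    """Assumed short-term memory: averages the most recent valence values.

    Averaging over a small window keeps the displayed emotion from
    jumping abruptly when a single input differs from recent context.
    """

    def __init__(self, window: int = 4):
        # deque with maxlen discards the oldest value automatically.
        self.recent = deque(maxlen=window)

    def update(self, valence: float) -> float:
        """Record a new valence and return the smoothed value."""
        self.recent.append(valence)
        return sum(self.recent) / len(self.recent)
```

With a window of two, for example, a strongly positive input followed by a neutral one yields a smoothed valence halfway between them, so the companion's expression would ease toward neutral instead of switching at once.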