Invention Title:

Interaction of Multimodal Behavior Models with Natural Language Prompts

Publication number:

US20250094454

Publication date:

Section:

Physics

Class:

G06F16/3329

Inventors:

Applicant:

Smart overview of the Invention

The patent application outlines a computer system that utilizes an integrated multimodal neural network platform. This system processes sensor data collected from various devices in a physical environment. The data is used to identify signature events over a specific time period. The system then applies a large behavior model (LBM) to generate multimodal outputs based on natural language prompts, offering user-friendly information such as textual statements, software code, or interactive interfaces.

Technical Field

The application focuses on data processing through a multimodal neural network platform. This platform leverages large behavior models to handle diverse data types, including sensor and content data, producing outputs that are easily accessible to users and their devices. The system supports a variety of sensors like cameras, motion detectors, and temperature sensors, which may be distributed across different venues.

Functionality

The system's functionality includes compressing and presenting sensor data. It processes sequences of sensor samples to create parametric representations and uses them to identify signature events independently of sensor type. A large behavior model processes these representations alongside natural language prompts to produce outputs in predefined formats, such as dashboards or narrative messages, enhancing user interaction with the data.
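The compression step described above can be illustrated with a minimal sketch. The application summary does not specify the parametric form or the event criterion, so everything here is an assumption for illustration: the `ParametricSegment` fields, the fixed window size, and the z-score threshold are hypothetical choices, not the method claimed in the application.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Sequence


@dataclass
class ParametricSegment:
    """Hypothetical compact summary of one window of sensor samples."""
    start_index: int
    mean: float
    spread: float  # population standard deviation within the window
    slope: float   # average change per sample across the window


def summarize(samples: Sequence[float], window: int) -> List[ParametricSegment]:
    """Compress a sample sequence into one segment per full window."""
    segments = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        slope = (chunk[-1] - chunk[0]) / (window - 1) if window > 1 else 0.0
        segments.append(
            ParametricSegment(start, mean(chunk), pstdev(chunk), slope)
        )
    return segments


def signature_events(
    segments: Sequence[ParametricSegment], threshold: float = 2.0
) -> List[ParametricSegment]:
    """Flag segments whose mean deviates strongly from the typical segment.

    Assumption: a simple z-score over segment means stands in for whatever
    event criterion the platform actually applies.
    """
    means = [s.mean for s in segments]
    center, scale = mean(means), pstdev(means)
    if scale == 0:
        return []  # all segments look alike, so nothing stands out
    return [s for s in segments if abs(s.mean - center) / scale > threshold]
```

Because the segments carry only numeric parameters, the same event test applies whether the samples came from a temperature sensor or a motion detector, which is one way to read "independently of sensor type".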

Implementation

The integrated platform can either utilize its own LBM or rely on third-party models. When using third-party models, the platform focuses on preprocessing sensor data into required formats for external processing. The processed outputs are then integrated back into the platform for local use. This approach allows for flexibility in handling different types of user prompts and content data.
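One way to picture the choice between a platform-hosted LBM and a third-party model is a shared interface with interchangeable backends. The sketch below is an assumption-laden illustration: `BehaviorModel`, `LocalModelStub`, the JSON payload layout, and the `answer` helper are all hypothetical names, and the stub does not call any real model or vendor API.

```python
import json
from typing import Any, Dict, List, Protocol


class BehaviorModel(Protocol):
    """Hypothetical interface shared by in-platform and external models."""

    def generate(self, payload: str) -> str: ...


class LocalModelStub:
    """Stand-in for a platform-hosted LBM; it only reports what it received."""

    def generate(self, payload: str) -> str:
        request = json.loads(payload)
        count = len(request["events"])
        return f"{count} event(s) considered for: {request['prompt']}"


def to_model_payload(events: List[Dict[str, Any]], prompt: str) -> str:
    """Preprocess events and prompt into the (assumed) JSON input format."""
    return json.dumps({"prompt": prompt, "events": events}, sort_keys=True)


def answer(
    model: BehaviorModel, events: List[Dict[str, Any]], prompt: str
) -> Dict[str, str]:
    """Send preprocessed data to any backend and wrap the result for local use."""
    text = model.generate(to_model_payload(events, prompt))
    return {"format": "narrative", "text": text}
```

A third-party backend would implement the same `generate` method around its own client, so the preprocessing and the local integration step stay unchanged when the model is swapped.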

Applications

The system supports various implementations, including on-device processing to enhance privacy by minimizing cloud transmission. Machine learning techniques are employed both locally and in the cloud to optimize results. The LBM is adaptable for multimodal learning, capable of generating outputs like user interfaces or software code from diverse input types such as sensor data and textual queries.
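The privacy benefit of on-device processing can be sketched as follows: raw readings are reduced to a summary on the device, and only that summary is eligible to leave it. The function names, the summary fields, and the `allow_cloud` flag are hypothetical and serve only to illustrate the idea; the application summary does not describe a specific mechanism.

```python
import json
from typing import Dict, List, Optional


def local_summary(readings: List[float]) -> Dict[str, float]:
    """Reduce raw readings to a few aggregate values on the device."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }


def cloud_payload(
    device_id: str, readings: List[float], allow_cloud: bool
) -> Optional[str]:
    """Build what may be transmitted; raw readings are never included.

    Returns None when cloud transmission is disabled, so all processing
    stays local.
    """
    if not allow_cloud:
        return None
    return json.dumps({"device": device_id, "summary": local_summary(readings)})
```

Under these assumptions, the transmitted payload has a fixed size regardless of how many samples were collected, and disabling `allow_cloud` keeps every value on the device.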