US20250232786
2025-07-17
Physics
G10L25/63
The patent application describes a system that enhances user interfaces by integrating paralinguistic data with linguistic data during real-time interactions. Paralinguistic data, which includes non-verbal cues such as facial expressions and vocal intonations, can be simulated by AI or inferred from users through sensors. This integration aims to improve communication, facilitate understanding, and reduce misunderstandings in conversations involving AI agents or other users.
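The pairing of a verbal channel with non-verbal cues described above can be sketched as a simple data structure. This is a minimal illustration, not the patent's actual design: the class and field names (`ParalinguisticCue`, `AnnotatedUtterance`, the `source` values) are assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ParalinguisticCue:
    kind: str          # e.g. "facial_expression" or "vocal_intonation"
    label: str         # e.g. "smile" or "rising_pitch"
    confidence: float  # 0.0-1.0; inferences may be inaccurate
    source: str        # "sensor_inferred" or "ai_simulated"

@dataclass
class AnnotatedUtterance:
    speaker: str
    text: str                                    # linguistic (verbal) channel
    cues: List[ParalinguisticCue] = field(default_factory=list)

    def add_cue(self, kind: str, label: str, confidence: float, source: str) -> None:
        """Attach a non-verbal cue to this utterance."""
        self.cues.append(ParalinguisticCue(kind, label, confidence, source))

# Pair a verbal message with an inferred non-verbal cue for display in the UI.
msg = AnnotatedUtterance(speaker="user_1", text="That sounds great!")
msg.add_cue("vocal_intonation", "enthusiastic", 0.82, "sensor_inferred")
```

Carrying the cues alongside the text lets a client render both channels together during a real-time exchange.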
Generative AI, particularly large language models (LLMs), can create novel content similar to their training data. These models are capable of generating natural language and are valuable across various domains. The application focuses on using AI to present paralinguistic data alongside verbal data, enhancing the interaction experience by making it more intuitive and human-like.
The system addresses challenges in presenting AI-generated paralinguistic data effectively. It helps conversations with AI agents feel natural by conveying human-like characteristics such as emotion. It also addresses potential inaccuracies in the machine-learning models that infer human paralinguistic cues, which could otherwise lead to misinterpretation. The system provides interfaces that inform users about their own and others' inferred paralinguistic data, fostering better awareness and communication.
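One way to mitigate the inaccuracy concern mentioned above is to gate what is shown to users on model confidence and to always label a cue as inferred rather than factual. This is a hedged sketch of that idea only; the threshold value and function name are assumptions, not taken from the application.

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.75  # assumed cutoff, not specified in the application

def render_cue(label: str, confidence: float) -> Optional[str]:
    """Return UI text for an inferred cue, or None if confidence is too low.

    Suppressing low-confidence inferences reduces the risk that a wrong
    classification misleads the other participants in the conversation.
    """
    if confidence < CONFIDENCE_THRESHOLD:
        return None
    # Explicitly mark the cue as inferred so users can calibrate trust.
    return f"(inferred: {label}, {confidence:.0%} confidence)"
```

For example, `render_cue("happy", 0.9)` yields a labeled annotation, while `render_cue("sad", 0.5)` is suppressed entirely.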
An example paralinguistic system includes an interactive application with a client and server, facilitating interaction among users and AI agents. Sensors detect various contextual cues from users, which are processed to generate paralinguistic classifications. These classifications help the AI agent respond more appropriately by considering the user's emotional state and other contextual factors.
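The flow described here, sensor cues classified and then consulted by the agent when responding, can be sketched as follows. The classifier below is a toy rule-based stand-in for the machine-learning model in the application, and every name, feature, and threshold is an illustrative assumption.

```python
def classify_context(sensor_readings: dict) -> str:
    """Toy stand-in for the paralinguistic classifier.

    Real systems would use a trained model over richer features; here two
    assumed scalar features (smile_score, pitch_variance) drive the label.
    """
    pitch = sensor_readings.get("pitch_variance", 0.0)
    smile = sensor_readings.get("smile_score", 0.0)
    if smile > 0.6 and pitch > 0.5:
        return "excited"
    if smile < 0.2 and pitch < 0.2:
        return "flat"
    return "neutral"

def agent_reply(user_text: str, classification: str) -> str:
    """Condition the agent's verbal response on the inferred user state."""
    prefix = {
        "excited": "Glad you're enthusiastic! ",
        "flat": "I sense some hesitation. ",
        "neutral": "",
    }[classification]
    return prefix + f"Regarding '{user_text}': ..."

state = classify_context({"smile_score": 0.8, "pitch_variance": 0.7})
reply = agent_reply("the new plan", state)
```

The point of the sketch is the split of responsibilities: classification happens once per turn from sensor data, and the agent's response generator receives the label as additional context.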
Sensors collect diverse data types such as audio, video, and physiological signals to understand the user's paralinguistic state. The system pre-processes this sensor data to improve the accuracy and relevance of the resulting classifications. Using machine-learning models, it classifies emotions and other paralinguistic cues in real time, augmenting verbal interactions with additional context. This approach enables more empathetic and context-aware responses from AI agents.
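Combining audio, video, and physiological signals into one classification is commonly done by late fusion: each modality produces its own label distribution, and the distributions are merged. The sketch below shows a weighted average as one plausible merging rule; the modality names, weights, and emotion labels are all assumptions for illustration, not details from the application.

```python
def fuse_modalities(predictions: dict, weights: dict) -> str:
    """Late-fuse per-modality label distributions via a weighted average.

    predictions: modality name -> {emotion label: probability}
    weights:     modality name -> reliability weight
    Returns the label with the highest fused score.
    """
    labels = {label for dist in predictions.values() for label in dist}
    total_weight = sum(weights[m] for m in predictions)
    fused = {
        label: sum(
            weights[m] * predictions[m].get(label, 0.0) for m in predictions
        ) / total_weight
        for label in labels
    }
    return max(fused, key=fused.get)

# Audio and video lean "happy"; the physiological channel disagrees but
# carries less weight, so the fused label remains "happy".
result = fuse_modalities(
    {"audio": {"happy": 0.7, "sad": 0.3},
     "video": {"happy": 0.6, "sad": 0.4},
     "physio": {"happy": 0.2, "sad": 0.8}},
    {"audio": 0.5, "video": 0.3, "physio": 0.2},
)
```

Weighting by modality reliability lets the system degrade gracefully when one sensor stream is noisy or missing.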