US20250200855
2025-06-19
Physics
G06T13/40
A conversational AI system is designed to enhance human-computer interaction by incorporating real-time multimodal emotion recognition. The system leverages machine learning to facilitate natural conversations between users and virtual humans. This technology aims to improve the empathy expression of virtual characters during interactions, creating a more engaging and responsive experience for users.
The system comprises three main components: a model server, a terminal, and a multimodal empathetic conversation-generation system. The model server hosts a machine learning-based conversational model that processes user input and generates appropriate responses. The terminal is the user-facing interface: it lets users interact with the virtual human, displays the virtual human's responses, and captures images of the user's face so that the user's emotional state can be assessed.
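The division of labor among the three components can be illustrated with a minimal sketch. All class, field, and method names below (`Turn`, `Rendered`, `ModelServer.generate_reply`, `EmpatheticConversationSystem.handle_turn`) are hypothetical and chosen only for illustration; the source does not specify interfaces, and the reply logic is a placeholder rather than a trained model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    """One user turn as sent by the terminal (hypothetical structure)."""
    text: str          # the user's utterance
    face_image: bytes  # facial image captured during the turn


@dataclass
class Rendered:
    """What the terminal displays for the virtual human (hypothetical)."""
    reply: str
    expression: str


class ModelServer:
    """Stand-in for the server hosting the conversational model."""

    def generate_reply(self, text: str) -> str:
        # Placeholder: a real deployment would query a learned model here.
        return f"I understand: {text}"


class EmpatheticConversationSystem:
    """Combines the reply with an expression chosen from the user's emotion."""

    def __init__(
        self,
        server: ModelServer,
        recognize_emotion: Callable[[bytes], str],
        choose_expression: Callable[[str], str],
    ) -> None:
        self.server = server
        self.recognize_emotion = recognize_emotion
        self.choose_expression = choose_expression

    def handle_turn(self, turn: Turn) -> Rendered:
        emotion = self.recognize_emotion(turn.face_image)
        reply = self.server.generate_reply(turn.text)
        return Rendered(reply=reply, expression=self.choose_expression(emotion))
```

In this sketch the emotion recognizer and the expression chooser are passed in as plain callables, so each can be replaced independently; whether the actual system separates them this way is an assumption.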
Emotion recognition works by analyzing facial images that the terminal captures during the interaction. The system evaluates these images to determine the user's emotional state in real time. Knowing the user's emotion, it can adjust the virtual human's expressions so that responses are empathetic as well as contextually appropriate.
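One plausible way to turn a stream of facial images into a stable real-time emotion estimate is to smooth per-frame classifier scores over time. The sketch below assumes that some facial-expression classifier, not shown and not specified by the source, already yields a score per emotion for each frame. The label set `EMOTIONS`, the function `smooth_emotions`, and the exponential moving average itself are illustrative assumptions, not the method described in the source.

```python
from typing import Dict, Iterable

# Hypothetical label set; the source does not enumerate emotion categories.
EMOTIONS = ("neutral", "happy", "sad", "angry")


def smooth_emotions(
    frame_scores: Iterable[Dict[str, float]], alpha: float = 0.5
) -> str:
    """Return the dominant emotion after exponentially smoothing frame scores.

    `alpha` weights the newest frame; lower values react more slowly.
    """
    state = {emotion: 0.0 for emotion in EMOTIONS}
    seen_any = False
    for scores in frame_scores:
        for emotion in EMOTIONS:
            value = scores.get(emotion, 0.0)
            if seen_any:
                state[emotion] = alpha * value + (1 - alpha) * state[emotion]
            else:
                state[emotion] = value  # the first frame initializes the average
        seen_any = True
    if not seen_any:
        return "neutral"  # no frames captured: fall back to a neutral estimate
    return max(EMOTIONS, key=lambda emotion: state[emotion])
```

Smoothing keeps a single misclassified frame from flipping the virtual human's expression, at the cost of reacting slightly later to genuine changes in the user's emotion.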
The multimodal empathetic conversation-generation system controls the virtual human's expressions. Based on the user's assessed emotion, it modifies the virtual character's facial expressions and gestures to reflect empathy. This dynamic adjustment personalizes the interaction and fosters a sense of connection between the user and the virtual human.
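The step from an assessed emotion to the virtual human's expression and gesture can be pictured as a lookup with a neutral fallback. The table entries, the `Expression` type, and the function `choose_expression` are hypothetical; the source does not state which expressions or gestures correspond to which emotions, and a real system might learn this mapping rather than hard-code it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """A facial expression and gesture for the virtual human (hypothetical)."""
    face: str
    gesture: str


# Illustrative mapping only: each user emotion maps to an empathetic response.
EMPATHY_MAP = {
    "happy": Expression(face="smile", gesture="nod"),
    "sad": Expression(face="concerned", gesture="lean_forward"),
    "angry": Expression(face="calm", gesture="open_palms"),
}

DEFAULT_EXPRESSION = Expression(face="neutral", gesture="idle")


def choose_expression(user_emotion: str) -> Expression:
    """Pick the expression for a user emotion, or the neutral default."""
    return EMPATHY_MAP.get(user_emotion, DEFAULT_EXPRESSION)
```

For example, `choose_expression("sad")` selects the concerned face with a forward lean, while an unrecognized label falls back to the neutral default.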
The system is applicable to fields such as customer service, education, and mental health support. Its empathetic interactions are intended to improve user satisfaction and engagement, and its ability to adapt to each user's emotional state makes it suitable for a range of human-computer interaction settings.