Invention Title:

ARTIFICIAL INTELLIGENCE DRIVEN PRESENTER

Publication number:

US20240371089

Publication date:

2024-11-07

Section:

Physics

Class:

G06T17/00

Inventors:

Chenguang Yang (Seattle, WA, United States)

Xuan Li (Bellevue, WA, United States)

Max Wang (Seattle, WA, United States)

Michael Louis Urciuoli (Seattle, WA, United States)

Faryal Ali Khan (Seattle, WA, United States)

Qing Dai (Sammamish, WA, United States)

Assignee:

Microsoft Technology Licensing, LLC (Redmond, WA, United States)

Applicant:

Microsoft Technology Licensing, LLC (Redmond, WA, United States)

Smart overview of the Invention

The patent application describes a data processing system that uses artificial intelligence to automate the delivery of presentations. The system takes conventional presentation content, namely slides and an accompanying transcript, and has a virtual presenter deliver it in a more engaging form. The virtual presenter speaks in a conversational manner that follows a presentation style stored in a datastore. To achieve this, the system uses a large language model (LLM) to rewrite the transcript text into more natural, interactive dialogue.
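As an illustration of how a stored presentation style might shape the request sent to the LLM, the following is a minimal Python sketch. It is not taken from the patent application: the PresentationStyle fields, the in-memory STYLE_DATASTORE dictionary, and the build_prompt function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresentationStyle:
    """Hypothetical style record; the field names are illustrative only."""
    name: str
    tone: str
    guidance: str


# Stand-in for the style datastore: a plain in-memory dictionary (assumption).
STYLE_DATASTORE = {
    "conversational": PresentationStyle(
        name="conversational",
        tone="warm and informal",
        guidance="Use short sentences and address the audience directly.",
    ),
    "formal": PresentationStyle(
        name="formal",
        tone="precise and professional",
        guidance="Avoid slang and keep a measured pace.",
    ),
}


def build_prompt(slide_text: str, transcript: str, style_name: str) -> str:
    """Combine slide text, transcript, and a stored style into one LLM prompt."""
    style = STYLE_DATASTORE[style_name]  # raises KeyError for an unknown style
    return "\n".join([
        f"Rewrite the transcript below in a {style.tone} tone.",
        style.guidance,
        "Keep every fact from the original transcript.",
        "",
        "Slide content:",
        slide_text,
        "",
        "Original transcript:",
        transcript,
    ])


prompt = build_prompt(
    "Q3 revenue: up 12%",
    "Revenue increased by 12 percent in Q3.",
    "conversational",
)
print(prompt)
```

The resulting string would then be passed to whichever LLM the system uses. The wording of the instructions inside the prompt is likewise an assumption.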

Functionality

The process begins by obtaining the presentation content and the presentation style information. The system builds a prompt from this information and submits it to an LLM, which returns an augmented transcript. A text-to-speech model converts the augmented transcript into audio. A virtual presenter, represented by a 3D model, is animated in sync with that audio, and the combined audiovisual content is streamed to the audience's devices. The presentation therefore runs automatically, without a live human presenter.
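The sequence of steps described above can be sketched as a small Python pipeline. This is only an outline under stated assumptions: the LLM, text-to-speech, animation, and streaming stages are passed in as plain callables and replaced here with trivial stand-ins, and processing the transcript one sentence at a time is a choice made for this example rather than something the summary specifies. All function and class names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """One streamed unit: an audio chunk plus the animation pose that matches it."""
    audio: bytes
    pose: str


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows '.', '!' or '?' (a deliberately simple rule)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def run_presentation(
    slide_text: str,
    transcript: str,
    style: str,
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
    animate: Callable[[str, bytes], str],
    send: Callable[[Frame], None],
) -> int:
    """Run prompt -> LLM -> speech -> animation -> stream; return frames sent."""
    prompt = f"Style: {style}\nSlide: {slide_text}\nTranscript: {transcript}"
    augmented = llm(prompt)            # augmented transcript from the LLM
    count = 0
    for sentence in split_sentences(augmented):
        audio = tts(sentence)          # text-to-speech for this sentence
        pose = animate(sentence, audio)  # pose for the 3D presenter model
        send(Frame(audio, pose))       # stream to the audience's devices
        count += 1
    return count


# Trivial stand-ins so the sketch runs without any external service.
def fake_llm(prompt: str) -> str:
    return "Hi everyone! Revenue grew twelve percent this quarter."


def fake_tts(sentence: str) -> bytes:
    return sentence.encode("utf-8")


def fake_animate(sentence: str, audio: bytes) -> str:
    return f"talk:{len(audio)}"


sent: List[Frame] = []
count = run_presentation(
    "Q3 revenue: up 12%",
    "Revenue increased by 12 percent in Q3.",
    "conversational",
    fake_llm, fake_tts, fake_animate, sent.append,
)
print(count, sent[0].pose)
```

Passing the stages in as callables keeps the control flow visible and lets each stand-in be swapped for a real model or transport without changing run_presentation.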

Technical Implementation

The system is intended to address shortcomings of existing automated presentations, such as static prerecorded content. It can be integrated into communication or collaboration platforms, where users create and modify presentation content for remote or in-person audiences. The virtual presenter does more than narrate the slide content: it adapts its delivery to user-defined attributes and the selected presentation style, which keeps the presentation dynamic and engaging.

Customization

Users can define the presentation style and personality of the virtual presenter. The system can generate slides and transcripts automatically from user-provided documents, or users can work with an LLM to create the content. The augmented transcript includes emotional cues that drive the virtual presenter's gestures and expressions, which helps it engage the audience. These options keep each presentation aligned with the intended tone and style.
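The summary does not say how cues are encoded in the augmented transcript. As one possible illustration, the Python sketch below assumes a hypothetical inline markup in which a cue such as [smile] or [point] appears in square brackets and applies to the text that follows it, up to the next cue. The markup, the Segment class, and the parse_augmented_transcript function are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical cue markup: a single word in square brackets, e.g. "[smile]".
CUE_PATTERN = re.compile(r"\[(\w+)\]")


@dataclass
class Segment:
    """A stretch of spoken text and the cue (if any) that applies to it."""
    cue: Optional[str]
    text: str


def parse_augmented_transcript(transcript: str) -> List[Segment]:
    """Split a transcript into segments, pairing each with the cue that precedes it."""
    segments: List[Segment] = []
    cue: Optional[str] = None
    pos = 0
    for match in CUE_PATTERN.finditer(transcript):
        text = transcript[pos:match.start()].strip()
        if text:
            segments.append(Segment(cue, text))
        cue = match.group(1)   # this cue governs the text that follows
        pos = match.end()
    tail = transcript[pos:].strip()
    if tail:
        segments.append(Segment(cue, tail))
    return segments


segments = parse_augmented_transcript(
    "[smile] Welcome, everyone. [point] Here is our roadmap."
)
for segment in segments:
    print(segment.cue, "->", segment.text)
```

Each segment's text could be sent to the text-to-speech model while its cue selects a gesture or expression for the 3D model. Text that appears before any cue is returned with a cue of None.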

Benefits

The AI-driven presenter reduces the labor involved in preparing and delivering traditional presentations. Audio is generated from the augmented transcript in real time, which allows the virtual presenter to be animated seamlessly alongside it. The approach relieves users of the stress of public speaking and adapts to different presentation contexts and audience needs.