US20250199829
2025-06-19
Physics
G06F9/453
Large language models (LLMs) and vision-language models (VLMs) are powerful tools that produce effective results when prompted with well-formatted input. However, users often lack the expertise or patience to craft such prompts themselves. The patent application addresses this challenge by introducing a method that automatically generates AI prompts based on an understanding of the user's screen activity.
The method uses an image encoder to process a current screenshot into an image embedding. This embedding is then compared with text embeddings that represent various screenshot activities. By identifying the text embedding that most closely matches the image embedding, the system determines which activity the user is performing on screen.
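A minimal sketch of this matching step, assuming a CLIP-style joint image/text encoder (the application does not name a specific model; "openai/clip-vit-base-patch32" and the activity labels below are illustrative choices only):

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical text descriptions of screenshot activities.
activities = [
    "writing an email",
    "editing a spreadsheet",
    "reading documentation",
    "debugging source code",
]

screenshot = Image.open("screenshot.png")  # the user's current screen capture
inputs = processor(text=activities, images=screenshot,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; the highest-scoring
# text embedding identifies the activity shown on screen.
best = outputs.logits_per_image.softmax(dim=-1).argmax(dim=-1).item()
print(f"Detected activity: {activities[best]}")
```

Encoding both modalities into a shared embedding space is what allows new activity labels to be added by simply writing new text descriptions, with no retraining of the image encoder.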
Once the screen activity is identified, AI prompts, referred to as "pills," are generated in real time. These prompts assist users by offering suggestions or solutions related to their current activity. This real-time assistance aims to make LLMs and VLMs more accessible and easier to navigate.
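A hypothetical sketch of how a detected activity might map to prompt "pills"; the application describes real-time suggestions but not this exact mechanism, so the template table below is purely illustrative:

```python
# Illustrative activity-to-pill mapping (not taken from the application).
PILL_TEMPLATES = {
    "writing an email": [
        "Summarize this email thread",
        "Rewrite my draft in a more formal tone",
    ],
    "editing a spreadsheet": [
        "Explain this formula",
        "Suggest a chart for this data",
    ],
    "debugging source code": [
        "Explain this error message",
        "Suggest a fix for the highlighted function",
    ],
}

def generate_pills(activity: str) -> list[str]:
    """Return prompt suggestions ("pills") for the detected screen activity."""
    return PILL_TEMPLATES.get(activity, ["Ask the assistant about this screen"])

print(generate_pills("writing an email"))
```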
Automatic prompt generation applies wherever screen activity is present, including educational tools, customer support systems, and productivity software. By leveraging screen understanding, this technology has the potential to significantly improve user experience across platforms and applications.