Invention Title:

Large-Scale, Privacy Preserving Personalized Large Language Models (LLMs)

Publication number:

US20240403564

Publication date:

2024-12-05

Section:

Physics

Class:

G06F40/35

Inventors:

Michael Bendersky, Cupertino, CA, United States

Mingyang Zhang, San Jose, CA, United States

Assignee:

Google LLC, Mountain View, CA, United States

Applicant:

Google LLC, Mountain View, CA, United States

Smart overview of the Invention

The patent describes a method for generating personalized responses to user prompts with large language models (LLMs) while preserving user privacy. The method receives a prompt from a user and obtains user-specific features. These features determine a user prompt embedding, which conditions the LLM so that its response is personalized for the user. The personalized response is then delivered to the user's device.

Technical Field

The technology focuses on enhancing large language models by incorporating user context to produce more individualized outputs. This approach addresses the limitations of generic responses by integrating user-specific data efficiently during the model's inference process.

Methodology

The method collects user features such as location, age, or gender and uses them to determine a user prompt embedding. This embedding guides the LLM in generating a personalized response without altering the model's parameters. The process involves classifying users into categories, each associated with a predefined embedding that tailors responses to that category.
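
The flow from user features to a conditioned model input can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the category names, the hand-written classification rules, the four-dimensional embeddings, and in particular the choice to prepend the user prompt embedding to the prompt's token embeddings as a soft prefix, which is only one plausible way to condition a model while leaving its parameters unchanged.

```python
from dataclasses import dataclass

# Hypothetical per-category embeddings (toy 4-dimensional vectors).
CATEGORY_EMBEDDINGS = {
    "young_urban": [0.9, 0.1, 0.0, 0.2],
    "adult_urban": [0.1, 0.8, 0.1, 0.0],
    "default": [0.0, 0.0, 0.0, 0.0],
}


@dataclass(frozen=True)
class UserFeatures:
    location: str
    age: int


def classify_user(features: UserFeatures) -> str:
    """Map user features to a category using assumed, hand-written rules."""
    if features.location == "urban":
        return "young_urban" if features.age < 30 else "adult_urban"
    return "default"


def user_prompt_embedding(features: UserFeatures) -> list:
    """Look up the embedding stored for the user's category."""
    return list(CATEGORY_EMBEDDINGS[classify_user(features)])


def build_model_input(user_embedding: list, prompt_token_embeddings: list) -> list:
    """Prepend the user embedding as a soft prefix (an assumed conditioning scheme).

    Only the input sequence changes; no model weights are involved here.
    """
    return [user_embedding] + prompt_token_embeddings


if __name__ == "__main__":
    features = UserFeatures(location="urban", age=25)
    prompt_tokens = [[0.0] * 4, [1.0] * 4]  # stand-in token embeddings
    model_input = build_model_input(user_prompt_embedding(features), prompt_tokens)
    print(classify_user(features), len(model_input))
```

Because the personalization lives entirely in a small lookup table of embeddings, one frozen model could in principle serve every category, which is the property the summary emphasizes.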

Implementation Details

User prompt embeddings are refined through a clustered fine-tuning process using training datasets. Each dataset consists of prompts, ground-truth responses, and user features, helping classify users and adjust embeddings for accurate personalization. The embeddings can also be predicted using a trained model that processes user features.
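
As a rough illustration of clustered fine-tuning, the toy sketch below groups training examples by user cluster and learns one embedding per cluster while a stand-in model stays frozen. It makes strong simplifying assumptions that are not from the patent: the "LLM" is a fixed linear readout, prompts are small vectors, ground-truth responses are scalar targets, clusters are age bands, and training is plain gradient descent on squared error.

```python
from collections import defaultdict

# Frozen stand-in for the LLM: a fixed linear readout that is never updated.
FROZEN_WEIGHTS = [0.5, -0.25, 1.0]


def frozen_model(user_embedding, prompt_vector):
    """Score = w . (user_embedding + prompt_vector)."""
    return sum(
        w * (e + p) for w, e, p in zip(FROZEN_WEIGHTS, user_embedding, prompt_vector)
    )


def assign_cluster(user_features):
    """Hypothetical classifier: bucket users by age band."""
    return "under_30" if user_features["age"] < 30 else "30_plus"


def clustered_fine_tune(dataset, dim=3, lr=0.05, epochs=200):
    """Learn one embedding per user cluster; the model weights stay fixed."""
    by_cluster = defaultdict(list)
    for prompt_vector, target, user_features in dataset:
        by_cluster[assign_cluster(user_features)].append((prompt_vector, target))

    embeddings = {}
    for cluster, examples in by_cluster.items():
        emb = [0.0] * dim
        for _ in range(epochs):
            for prompt_vector, target in examples:
                error = frozen_model(emb, prompt_vector) - target
                # Gradient of error**2 with respect to emb[i] is 2 * error * w[i].
                emb = [e - lr * 2 * error * w for e, w in zip(emb, FROZEN_WEIGHTS)]
        embeddings[cluster] = emb
    return embeddings


if __name__ == "__main__":
    # Each example: (prompt vector, ground-truth target, user features).
    dataset = [
        ([1.0, 0.0, 0.0], 1.5, {"age": 22}),
        ([0.0, 0.0, 1.0], 2.0, {"age": 27}),
        ([1.0, 0.0, 0.0], -0.5, {"age": 41}),
    ]
    learned = clustered_fine_tune(dataset)
    print(sorted(learned))
```

The point of the sketch is the division of labor: examples are partitioned by cluster first, and only the per-cluster embedding receives gradient updates.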

Additional Features

The system can incorporate local context like recent activities or geographical data to enhance personalization further. If needed, it can access personal data repositories with specific search queries, ensuring relevant information is included in generating responses.
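
One way to picture this step is the sketch below, which derives a search query from the prompt, looks up matching records in a personal data repository, and combines them with local context. The repository contents, the `LocalContext` fields, the keyword heuristic, and the text layout of the assembled context are all illustrative assumptions rather than details from the patent.

```python
from dataclasses import dataclass


@dataclass
class LocalContext:
    recent_activity: str
    city: str


# Hypothetical in-memory stand-in for a personal data repository.
PERSONAL_REPOSITORY = [
    "Dinner reservation at Luigi's on Friday 7pm",
    "Flight to Denver departs Monday 9am",
    "Dentist appointment Tuesday 3pm",
]


def build_search_query(prompt):
    """Keep words longer than four letters as query terms (a naive heuristic)."""
    words = (word.strip("?.,!").lower() for word in prompt.split())
    return {word for word in words if len(word) > 4}


def search_repository(query_terms, repository):
    """Return records containing at least one query term, in stored order."""
    return [
        record
        for record in repository
        if any(term in record.lower() for term in query_terms)
    ]


def assemble_context(prompt, local, repository):
    """Combine local context, retrieved records, and the prompt into one text."""
    lines = [f"Recent activity: {local.recent_activity}", f"City: {local.city}"]
    matches = search_repository(build_search_query(prompt), repository)
    lines += [f"Personal record: {record}" for record in matches]
    lines.append(f"Prompt: {prompt}")
    return "\n".join(lines)


if __name__ == "__main__":
    local = LocalContext(recent_activity="searched for airport parking", city="Boulder")
    print(assemble_context("When is my flight to Denver?", local, PERSONAL_REPOSITORY))
```

Restricting retrieval to records that match the query means only the relevant entries, not the whole repository, are added to the text the model sees.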