Invention Title:

Large Language Model(s) System for Capturing, Maintaining, and Separating Copyrighted Information Within a Blockchain Network with Automatic Output of Information

Publication number:

US20250209141

Publication date:

2025-06-26

Section:

Physics

Class:

G06F21/106

Inventors:

Laura LEHMANN 🇺🇸 New York, NY, United States

Sorat TUNGKASIRI 🇺🇸 Skillman, NJ, United States

Applicant:

Pangee, Inc. 🇺🇸 New York, NY, United States

Smart overview of the Invention

Overview: The patent application describes a system for managing data input into generative AI models or large language models (LLMs). It involves creating a non-fungible token (NFT) for each data object, assigning a smart contract to each NFT to regulate interactions, and recording these onto a blockchain. This process ensures that copyrighted information is tracked, maintained, and separated appropriately.
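The flow described above (one token per data object, one contract per token, both recorded on a chain) can be sketched roughly as follows. This is a minimal in-memory illustration, not the applicant's implementation: `UsagePolicy`, `Ledger`, and `mint` are hypothetical names, the "smart contract" is reduced to two permission flags, and the "blockchain" is a simple hash-linked list rather than a real network.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


def sha256_hex(payload: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class UsagePolicy:
    """Hypothetical stand-in for the smart contract attached to a token."""
    allow_training: bool
    allow_retrieval: bool

    def permits(self, action: str) -> bool:
        # Unknown actions are denied by default (an assumption of this sketch).
        rules = {"train": self.allow_training, "retrieve": self.allow_retrieval}
        return rules.get(action, False)


@dataclass
class Ledger:
    """Hypothetical append-only, hash-linked record list (not a real chain)."""
    blocks: list = field(default_factory=list)

    def record(self, token_id: str, owner: str, policy: UsagePolicy) -> dict:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = {
            "token_id": token_id,
            "owner": owner,
            "policy": asdict(policy),
            "prev_hash": prev_hash,
        }
        # Each block's hash covers its body, including the previous hash.
        block = dict(body, hash=sha256_hex(json.dumps(body, sort_keys=True)))
        self.blocks.append(block)
        return block

    def is_intact(self) -> bool:
        """Recompute every hash and check the links between blocks."""
        prev_hash = "0" * 64
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            expected = sha256_hex(json.dumps(body, sort_keys=True))
            if block["prev_hash"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


def mint(ledger: Ledger, content: str, owner: str, policy: UsagePolicy) -> str:
    """Derive a token id from the content hash and record it with its policy."""
    token_id = sha256_hex(content)
    ledger.record(token_id, owner, policy)
    return token_id


if __name__ == "__main__":
    ledger = Ledger()
    policy = UsagePolicy(allow_training=False, allow_retrieval=True)
    token = mint(ledger, "Example article text", "Example Publisher", policy)
    print(token[:12], policy.permits("train"), ledger.is_intact())
```

Deriving the token id from a content hash is one possible design choice; it makes identical inputs map to the same id, which may or may not match how the application assigns tokens.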

Technical Field: The innovation lies at the intersection of artificial intelligence and blockchain technology. By generating an NFT for each data object fed into an AI model, the system offers a method for tracking and managing copyrighted material. Smart contracts attached to these NFTs further control how the data is used and interacted with, supporting compliance with copyright law.

Background: Copyright protection is central to safeguarding intellectual property and fostering creativity. However, the rise of generative AI has created challenges, including potential copyright infringement when protected works are used as training data. The system addresses these issues by providing a mechanism to monitor copyrighted content and keep it separate from the training data of AI models, thus respecting ownership rights and encouraging innovation.

Detailed Description: The system includes an intermediary AI model that works with search algorithms to compartmentalize copyrighted data without directly integrating it into model training. This approach allows publishers to opt out of sharing their data, preventing unauthorized use in training models. Additionally, it enables tracking the sources of information used by AI models, adding value by ensuring content authenticity and accuracy.
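One way to picture the compartmentalization described above is a retrieval layer that keeps copyrighted documents out of the training set and consults them only at query time, reporting which sources it used. The sketch below is an assumption-laden illustration: `Document`, `split_corpus`, `search`, and `answer_with_sources` are hypothetical names, keyword overlap stands in for the unspecified search algorithms, and opted-out documents are assumed to be excluded from both training and retrieval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    source: str             # publisher or origin label
    text: str
    copyrighted: bool
    opted_out: bool = False  # publisher declined to share (assumed semantics)


def split_corpus(documents):
    """Separate open training data from copyrighted data held apart."""
    training = [d for d in documents if not d.copyrighted]
    # Assumption: opted-out works are dropped entirely, not just from training.
    held_apart = [d for d in documents if d.copyrighted and not d.opted_out]
    return training, held_apart


def search(held_apart, query, limit=3):
    """Rank held-apart documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = []
    for doc in held_apart:
        overlap = len(terms & set(doc.text.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


def answer_with_sources(held_apart, query):
    """Return the query together with the sources that matched it."""
    hits = search(held_apart, query)
    return {"query": query, "sources": [d.source for d in hits]}


if __name__ == "__main__":
    docs = [
        Document("open-wiki", "vaccines train immune response", False),
        Document("journal-a", "vaccine trial results published", True),
        Document("journal-b", "vaccine trial data withheld", True, True),
    ]
    training, held_apart = split_corpus(docs)
    print([d.source for d in training])
    print(answer_with_sources(held_apart, "vaccine trial"))
```

Returning the matched source labels alongside each answer is what would allow the cross-referencing and provenance tracking the description mentions; a real system would presumably tie each label back to its on-chain token.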

Applications and Benefits: By keeping copyrighted data separate from open-source models, the system not only protects intellectual property but also enhances the reliability of AI-generated content. It supports industries that rely on precise information by facilitating cross-referencing of multiple sources. This capability is particularly valuable in fields such as politics and medicine, where accurate information is critical.