Invention Title:

FOUNDATION GENERATIVE ARTIFICIAL INTELLIGENCE (AI) MODEL WITH TRANSFORMER ARCHITECTURE FOR ENVIRONMENTAL, SOCIAL, AND GOVERNANCE (ESG) IMPACT

Publication number:

US20250299059

Publication date:

Section:

Physics

Class:

G06N3/092

Inventors:

Assignee:

Applicant:

Smart overview of the Invention

The invention is an AI foundation model designed specifically for the environmental, social, and governance (ESG) domain. It uses a Transformer-based architecture with approximately 30 billion parameters and supports context windows of up to 128,000 tokens, a capacity needed to analyze lengthy ESG documents such as sustainability reports and policies in full. The model integrates textual and visual data through gated cross-attention and a Mixture-of-Experts (MoE) architecture, which improves its understanding of multimodal contexts.
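The Mixture-of-Experts idea mentioned above can be illustrated with a minimal routing sketch. It is not the patented implementation: the top-k selection, the value k=2, the scalar "experts", and the names `route_top_k` and `moe_forward` are all assumptions made for illustration, and the router logits are passed in directly rather than produced by a learned layer.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k=2):
    # Hypothetical router: keep the k most probable experts and
    # renormalize their probabilities so the kept weights sum to 1.
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

def moe_forward(x, experts, router_logits, k=2):
    # Only the selected experts run; their outputs are mixed by router weight.
    weights = route_top_k(router_logits, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Toy usage: four scalar "experts" (stand-ins for feed-forward sub-networks).
toy_experts = [lambda v: v, lambda v: 2 * v, lambda v: 3 * v, lambda v: 4 * v]
toy_output = moe_forward(1.0, toy_experts, [0.0, 2.0, 1.0, -1.0], k=2)
```

Because only k experts are evaluated per input, a design of this kind can hold many parameters while spending compute on a fraction of them for each token.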

Challenges Addressed

Traditional AI models are limited in ESG tasks by their general-purpose training and short context windows. They often fail to process lengthy ESG-specific documents accurately or to integrate visual data with textual analysis. Moreover, typical adaptation techniques adjust only a subset of model parameters, which leaves domain adaptation incomplete. The new model addresses these issues by fine-tuning all of its parameters for the ESG domain, with the aim of comprehensive understanding and analysis.

Multimodal Capabilities

ESG analysis often requires the integration of both text and visual data. For instance, environmental assessments may involve satellite imagery while corporate reports include charts. Traditional models lack the ability to process such multimodal data cohesively. The new model combines a vision encoder with a language model, enabling holistic analysis of ESG issues by correlating textual descriptions with visual evidence, thus providing a more complete understanding of complex ESG topics.
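One way to picture how text features can draw on visual features is a gated cross-attention step. The sketch below is an assumption-laden illustration, not the model's actual layer: it uses a single text vector as the query, a tanh gate on one scalar (a common formulation, not confirmed by the source), and hypothetical names such as `gated_cross_attention` and `gate_param`.

```python
import math

def cross_attention(query, keys, values):
    # Scaled dot-product attention for a single query vector.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def gated_cross_attention(text_state, image_keys, image_values, gate_param):
    # The text state attends over image features; a tanh gate scales how
    # much visual information is added back onto the text state.
    attended = cross_attention(text_state, image_keys, image_values)
    gate = math.tanh(gate_param)  # gate_param = 0 leaves the text state unchanged
    return [t + gate * a for t, a in zip(text_state, attended)]

# Toy usage: one 2-d text vector attending over two 2-d image patch features.
text = [1.0, 0.0]
patch_keys = [[1.0, 0.0], [0.0, 1.0]]
patch_values = [[0.5, 0.5], [-0.5, 2.0]]
fused = gated_cross_attention(text, patch_keys, patch_values, gate_param=1.0)
```

With the gate at zero the language pathway behaves as if no image were present, so visual input can be introduced gradually as the gate is learned.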

Advanced Reinforcement Learning

The model employs Group Relative Policy Optimization (GRPO), a reinforcement learning strategy that refines outputs using group-relative advantages computed across multiple candidate generations. This approach strengthens the model's reasoning and output quality in ESG contexts by rewarding not only correctness but also the comprehensiveness and articulation of responses. It is presented as an improvement over standard reinforcement learning techniques because each output is assessed relative to the other candidates rather than in isolation.
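The group-relative advantage can be sketched in a few lines. This summary does not give the exact formula, so the version below follows the commonly described GRPO formulation, in which each candidate's reward is normalized by the mean and standard deviation of its group; the function name, the `eps` term, and the reward values are illustrative assumptions.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    # rewards: one scalar score per candidate generated for the same prompt.
    # Each advantage is the reward's distance from the group mean, in units
    # of the group's standard deviation (eps avoids division by zero).
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Toy usage: three candidate answers scored by a hypothetical reward function.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
```

Candidates scoring above the group average receive positive advantages and are reinforced, while those below average are discouraged; when all candidates score the same, every advantage is zero and the group contributes no learning signal.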

Comprehensive Training and Safety

The model is trained on a corpus of approximately 20 trillion tokens from diverse sources, giving broad coverage of both general language and ESG-specific knowledge. A detailed 47-class ESG classification framework is applied during data preprocessing and training to maintain domain specificity. The training process includes full fine-tuning across all parameters and incorporates safety controls to mitigate bias and inappropriate content. Together, these measures are intended to keep the model's outputs accurate, coherent, and aligned with ESG values.