US20260010969
2026-01-08
Physics
G06T1/20
The patent application describes a graphics processing unit (GPU) that accelerates neural network computations. The GPU includes a dynamic precision fixed-point unit that converts floating-point tensors into fixed-point tensors. The stated aim is to improve the efficiency and precision of neural network computation by dynamically managing the precision of integer deep learning primitives.
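As a rough illustration only (not taken from the application itself), the following CUDA sketch shows one common way a float-to-fixed-point conversion with dynamically chosen precision can work: the fractional bit count is derived from the tensor's largest magnitude so the values fit a 16-bit integer. The kernel name, the bit-allocation rule, and the int16 target width are all assumptions made for this example.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical sketch: quantize a float tensor to int16 fixed-point using a
// single fractional-bit count chosen from the tensor's maximum magnitude.
__global__ void float_to_fixed(const float* in, int16_t* out, int n, int frac_bits) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float scaled = in[i] * (float)(1 << frac_bits);
        // Round to nearest and clamp to the int16 range.
        scaled = fminf(fmaxf(scaled, -32768.0f), 32767.0f);
        out[i] = (int16_t)lrintf(scaled);
    }
}

int main() {
    std::vector<float> host = {0.5f, -1.25f, 3.75f, 0.015625f};
    int n = (int)host.size();

    // "Dynamic" step: derive fractional bits from the tensor's max magnitude
    // (1 sign bit + enough integer bits, rest of the 16-bit word is fraction).
    float max_abs = 0.0f;
    for (float v : host) max_abs = std::max(max_abs, std::fabs(v));
    int int_bits = (int)std::ceil(std::log2(max_abs + 1e-30f)) + 1;
    int frac_bits = std::max(0, 15 - int_bits);

    float* d_in; int16_t* d_out;          // error checks omitted for brevity
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(int16_t));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    float_to_fixed<<<(n + 255) / 256, 256>>>(d_in, d_out, n, frac_bits);

    std::vector<int16_t> q(n);
    cudaMemcpy(q.data(), d_out, n * sizeof(int16_t), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i)
        printf("%f -> %d (frac_bits=%d)\n", host[i], q[i], frac_bits);
    cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```

The fixed-point values can then feed integer multiply-accumulate primitives; the shared fractional-bit count is the piece of state that would be managed dynamically per tensor.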
Modern graphics processors handle a wide range of operations beyond traditional graphics tasks, including machine learning and general-purpose computation. The shift from fixed-function units to programmable units has enabled GPUs to support these diverse workloads. Single instruction, multiple thread (SIMT) architectures are central to parallel processing efficiency, allowing groups of threads to execute the same instruction in lockstep.
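To make the SIMT notion concrete, here is a minimal CUDA sketch (an illustration, not material from the application): the 32 threads of a warp issue each instruction together, so warp-level shuffles can combine their registers without shared memory or explicit barriers.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Minimal SIMT illustration: the 32 threads of a warp execute in lockstep,
// so each shuffle step folds partial sums across lanes of the same warp.
__global__ void warp_sum(const float* in, float* out) {
    float v = in[threadIdx.x];
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffff, v, offset);
    if (threadIdx.x == 0) *out = v;  // lane 0 holds the warp-wide sum
}

int main() {
    float host[32];
    for (int i = 0; i < 32; ++i) host[i] = 1.0f;  // expected sum: 32
    float *d_in, *d_out, result;
    cudaMalloc(&d_in, sizeof(host));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host, sizeof(host), cudaMemcpyHostToDevice);
    warp_sum<<<1, 32>>>(d_in, d_out);
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    printf("warp sum = %f\n", result);  // prints 32.0
    cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```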
The computing system incorporates a processing subsystem with one or more processors connected to a system memory. A memory hub facilitates communication between the processor and memory, while an I/O subsystem manages input and output operations. The GPU, part of the parallel processor subsystem, can be attached via high-speed interconnects such as PCIe or NVLink, and processes commands and instructions for a range of applications, including graphics and machine learning workloads.
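As a small illustrative sketch (an assumption-laden example, not the application's own software interface), the host side of such a system can probe how GPUs on the interconnect relate to one another by asking whether each pair supports direct peer memory access:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Illustrative sketch: when multiple GPUs sit on a high-speed interconnect
// such as PCIe or NVLink, the host can query whether each pair can access
// the other's memory directly, exposing the interconnect topology to software.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    printf("%d GPU(s) in the parallel processor subsystem\n", count);
    for (int a = 0; a < count; ++a) {
        for (int b = 0; b < count; ++b) {
            if (a == b) continue;
            int can = 0;
            cudaDeviceCanAccessPeer(&can, a, b);
            printf("GPU %d -> GPU %d: peer access %s\n",
                   a, b, can ? "available" : "not available");
        }
    }
    return 0;
}
```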
The system's architecture allows flexibility in integrating components. The GPU can be connected to the processor cores either through external interconnects or integrated directly on the same chip. This integration enhances the efficiency of command processing and supports various configurations, including system on chip (SoC) and system in package (SIP) designs. The components can also be part of a multi-chip module (MCM), facilitating modular computing systems.
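Again as an illustration under assumed tooling rather than anything stated in the application, software can detect which of these configurations it is running on: an integrated (SoC-style) GPU shares physical memory with the host processor, while a discrete part sits behind an external interconnect. The property names below are standard CUDA device attributes used here for the example.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Illustrative sketch: report whether each visible GPU is integrated with the
// host processor or discrete, and whether it can map host memory directly.
int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("GPU %d (%s): %s, canMapHostMemory=%d, unifiedAddressing=%d\n",
               dev, prop.name,
               prop.integrated ? "integrated with host" : "discrete",
               prop.canMapHostMemory, prop.unifiedAddressing);
    }
    return 0;
}
```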
The described computing system is adaptable, with potential variations in component connections and configurations. For instance, the memory hub and I/O hub could be integrated or separated, and multiple processors could be used. The design treats several components as optional, allowing customization for specific needs. This flexibility allows the system to accommodate different computational demands and future technological changes.