Invention Title:

GPU Asynchronous Direct Memory Access Applications

Publication number:

US20250291755

Publication date:

2025-09-18

Section:

Physics

Class:

G06F13/28

Inventors:

Aravindh Anantaraman 🇺🇸 Folsom, CA, United States

Doddaballapur Jayasimha 🇺🇸 Saratoga, CA, United States

John A. Wiegert 🇺🇸 Aloha, OR, United States

Yongsheng Liu 🇺🇸 San Diego, CA, United States

Fataneh Ghodrat 🇺🇸 Cambridge, MA, United States

Assignee:

Intel Corporation 🇺🇸 Santa Clara, CA, United States

Applicant:

Intel Corporation 🇺🇸 Santa Clara, CA, United States

Smart overview of the Invention

The patent application describes a graphics processor built from multiple chiplets. Each chiplet couples to a base die through a chiplet socket, a modular arrangement intended to let the processor scale by adding chiplets instead of enlarging a single die. Within this architecture, the application focuses on how data is moved into on-chip memory in graphics processing units (GPUs).

Graphics Core Cluster

Each chiplet contains a graphics core cluster made up of multiple graphics cores. The cores render and process graphical data, and because a cluster holds several cores, independent portions of a workload can be processed in parallel.
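The hierarchy described so far (base die, chiplet sockets, chiplets, graphics core cluster, graphics cores) can be pictured with a small data model. The following is a minimal sketch only: the class names, the four-socket base die, and the eight cores per chiplet are illustrative assumptions and are not taken from the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GraphicsCore:
    core_id: int


@dataclass
class GraphicsCoreCluster:
    cores: List[GraphicsCore]


@dataclass
class Chiplet:
    name: str
    cluster: GraphicsCoreCluster


@dataclass
class BaseDie:
    # One entry per chiplet socket; None marks an unpopulated socket.
    sockets: List[Optional[Chiplet]]

    def attach(self, socket_index: int, chiplet: Chiplet) -> None:
        """Place a chiplet into an empty socket (hypothetical helper)."""
        if self.sockets[socket_index] is not None:
            raise ValueError(f"socket {socket_index} is already populated")
        self.sockets[socket_index] = chiplet

    def total_cores(self) -> int:
        """Count graphics cores across all populated sockets."""
        return sum(len(c.cluster.cores) for c in self.sockets if c is not None)


def make_chiplet(name: str, num_cores: int) -> Chiplet:
    """Build a chiplet whose cluster holds num_cores graphics cores."""
    cores = [GraphicsCore(core_id=i) for i in range(num_cores)]
    return Chiplet(name=name, cluster=GraphicsCoreCluster(cores=cores))


# Assumed sizes, chosen only for illustration.
die = BaseDie(sockets=[None] * 4)
die.attach(0, make_chiplet("chiplet-a", 8))
die.attach(1, make_chiplet("chiplet-b", 8))
print(die.total_cores())  # 16
```

In this model, scaling the processor amounts to populating more sockets, which mirrors the modular intent of the chiplet design.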

Distributed Shared Local Memory

The graphics cores within each chiplet include distributed shared local memory, meaning the shared on-chip memory is spread across the cores instead of sitting in one central block. Holding working data in memory close to the cores reduces trips to slower external memory, which lowers access latency during complex computations.
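One way to picture distributed shared local memory is as a single flat address space backed by per-core slices. The sketch below is a toy model under that assumption: the `DistributedSLM` class, the contiguous address-to-core mapping, and the sizes are hypothetical and do not describe the mapping used in the application.

```python
from typing import List, Tuple


class DistributedSLM:
    """Toy model: one flat address space backed by per-core memory slices."""

    def __init__(self, num_cores: int, bytes_per_core: int) -> None:
        self.bytes_per_core = bytes_per_core
        self.slices: List[bytearray] = [
            bytearray(bytes_per_core) for _ in range(num_cores)
        ]

    @property
    def size(self) -> int:
        return len(self.slices) * self.bytes_per_core

    def locate(self, address: int) -> Tuple[int, int]:
        """Map a flat address to (owning core, offset within its slice)."""
        if not 0 <= address < self.size:
            raise IndexError(f"address {address} is outside shared local memory")
        return divmod(address, self.bytes_per_core)

    def write(self, address: int, data: bytes) -> None:
        # Byte-wise so that a write may span the boundary between two slices.
        for i, value in enumerate(data):
            core, offset = self.locate(address + i)
            self.slices[core][offset] = value

    def read(self, address: int, length: int) -> bytes:
        out = bytearray()
        for i in range(length):
            core, offset = self.locate(address + i)
            out.append(self.slices[core][offset])
        return bytes(out)


slm = DistributedSLM(num_cores=4, bytes_per_core=8)
slm.write(6, b"ABCD")      # spans the slices of core 0 and core 1
print(slm.locate(8))       # (1, 0)
print(slm.read(6, 4))      # b'ABCD'
```

The point of the model is that any core can address the whole space while each byte physically lives in exactly one core's slice.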

Asynchronous Direct Memory Access Engine

Each graphics core includes a direct memory access (DMA) engine designed to operate asynchronously. The engine copies data from external memory devices into the distributed shared local memory while the core continues executing other work, so computation does not stall waiting for the transfer. Overlapping transfers with computation in this way helps sustain throughput and reduce memory bottlenecks.
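The overlap of transfer and computation can be illustrated in software with double buffering. The sketch below is an analogy only, not the hardware mechanism: a background thread stands in for the DMA engine, `bytes` stands in for external memory, two `bytearray` buffers stand in for shared local memory, and `AsyncDmaEngine`, `copy_async`, and `sum_tiles` are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class AsyncDmaEngine:
    """Software stand-in for an asynchronous copy engine (assumption)."""

    def __init__(self) -> None:
        self._worker = ThreadPoolExecutor(max_workers=1)

    def copy_async(self, src: bytes, src_offset: int,
                   dst: bytearray, dst_offset: int, length: int) -> Future:
        """Start a copy and return immediately with a completion handle."""
        if src_offset + length > len(src) or dst_offset + length > len(dst):
            raise ValueError("copy range is out of bounds")

        def _copy() -> int:
            dst[dst_offset:dst_offset + length] = src[src_offset:src_offset + length]
            return length

        return self._worker.submit(_copy)

    def shutdown(self) -> None:
        self._worker.shutdown(wait=True)


def sum_tiles(external: bytes, tile_size: int) -> int:
    """Sum all whole tiles, prefetching tile i+1 while tile i is processed."""
    num_tiles = len(external) // tile_size  # a trailing partial tile is ignored
    if num_tiles == 0:
        return 0
    engine = AsyncDmaEngine()
    buffers = [bytearray(tile_size), bytearray(tile_size)]
    total = 0
    try:
        pending = engine.copy_async(external, 0, buffers[0], 0, tile_size)
        for tile in range(num_tiles):
            pending.result()  # wait until the current tile has arrived
            current = buffers[tile % 2]
            if tile + 1 < num_tiles:
                # Fill the other buffer in the background.
                pending = engine.copy_async(
                    external, (tile + 1) * tile_size,
                    buffers[(tile + 1) % 2], 0, tile_size,
                )
            total += sum(current)  # compute overlaps with the next copy
    finally:
        engine.shutdown()
    return total


print(sum_tiles(bytes(range(16)), 4))  # 120
```

Two buffers are used so that the engine never writes into the buffer the compute step is still reading; a buffer is reused only after its tile has been consumed.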

Benefits and Applications

By combining chiplets with per-core asynchronous DMA engines, the described GPU architecture targets large-scale graphical workloads that move substantial amounts of data. Potential applications include gaming, simulations, and other computationally intensive tasks that depend on rapid data processing and efficient memory usage. The modular design is also intended to keep resource allocation flexible.