US20250291755
2025-09-18
Physics
G06F13/28
The patent application describes a graphics processor with a modular design built from multiple chiplets. Each chiplet connects to a base die through a chiplet socket, so the processor can be scaled by the number of chiplets attached. The architecture is aimed at more efficient data processing and memory management within graphics processing units (GPUs).
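The hierarchy described above (base die, chiplet sockets, chiplets, graphics cores) can be pictured with a small data model. This is only an illustrative sketch: the class names, the socket count, the core count, and the memory size are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GraphicsCore:
    core_id: int
    slm_bytes: int  # assumed size of this core's share of shared local memory


@dataclass
class Chiplet:
    chiplet_id: int
    # The chiplet's graphics core cluster, modeled as a plain list of cores.
    cores: List[GraphicsCore] = field(default_factory=list)


@dataclass
class BaseDie:
    # One entry per chiplet socket; None marks an empty socket.
    sockets: List[Optional[Chiplet]]

    def attach(self, socket_index: int, chiplet: Chiplet) -> None:
        """Place a chiplet in a free socket on the base die."""
        if self.sockets[socket_index] is not None:
            raise ValueError(f"socket {socket_index} is already occupied")
        self.sockets[socket_index] = chiplet

    def total_cores(self) -> int:
        """Count graphics cores across all populated sockets."""
        return sum(len(c.cores) for c in self.sockets if c is not None)


def make_chiplet(chiplet_id: int, num_cores: int, slm_bytes: int) -> Chiplet:
    """Build a chiplet whose cluster holds num_cores identical cores."""
    cores = [GraphicsCore(core_id=i, slm_bytes=slm_bytes) for i in range(num_cores)]
    return Chiplet(chiplet_id=chiplet_id, cores=cores)


# Hypothetical configuration: 4 sockets, 2 of them populated with 8-core chiplets.
base = BaseDie(sockets=[None] * 4)
base.attach(0, make_chiplet(0, num_cores=8, slm_bytes=128 * 1024))
base.attach(1, make_chiplet(1, num_cores=8, slm_bytes=128 * 1024))
print(base.total_cores())  # 16
```

In this model, scaling the processor amounts to populating more sockets. The base die itself does not change.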
A key component of each chiplet is the graphics core cluster, which contains several graphics cores. These cores render and process graphical data. Because the cluster groups multiple cores, graphics computations can be spread across them and executed in parallel.
The graphics cores within each chiplet are equipped with distributed shared local memory. Each core has access to this shared memory, which keeps working data close to the cores and reduces retrieval latency during complex computations. The arrangement is intended to make better use of memory resources across the different cores.
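One way to picture a distributed shared local memory is a single address range whose storage is split into per-core slices. The summary does not say how the memory is addressed, so the sketch below assumes equal-sized contiguous slices and a flat address space. The class `DistributedSLM` and its methods are hypothetical names used only for illustration.

```python
from typing import List, Tuple


class DistributedSLM:
    """Toy model: one flat address space backed by one slice per core."""

    def __init__(self, num_cores: int, slice_bytes: int) -> None:
        self.slice_bytes = slice_bytes
        self.slices: List[bytearray] = [bytearray(slice_bytes) for _ in range(num_cores)]

    def locate(self, addr: int) -> Tuple[int, int]:
        """Map a flat address to (owning core index, offset within its slice)."""
        if not 0 <= addr < len(self.slices) * self.slice_bytes:
            raise IndexError(f"address {addr} is outside shared local memory")
        return divmod(addr, self.slice_bytes)

    def write(self, addr: int, data: bytes) -> None:
        """Write bytes starting at addr; the data may span several slices."""
        for i, value in enumerate(data):
            core, offset = self.locate(addr + i)
            self.slices[core][offset] = value

    def read(self, addr: int, length: int) -> bytes:
        """Read length bytes starting at addr, crossing slices if needed."""
        out = bytearray()
        for i in range(length):
            core, offset = self.locate(addr + i)
            out.append(self.slices[core][offset])
        return bytes(out)


# Four cores with 16-byte slices (sizes chosen only to keep the example small).
slm = DistributedSLM(num_cores=4, slice_bytes=16)
slm.write(14, b"abcd")          # straddles the boundary between core 0 and core 1
print(slm.locate(14))           # (0, 14)
print(slm.locate(16))           # (1, 0)
print(slm.read(14, 4))          # b'abcd'
```

The point of the model is that any core can name any address, while the physical storage stays spread across the cores.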
Each graphics core includes a direct memory access (DMA) engine designed to operate asynchronously. The engine copies data from external memory devices into the distributed shared local memory while the core continues with other work, so the transfer does not stall ongoing execution. Overlapping data movement with computation in this way helps sustain throughput and reduces memory bottlenecks.
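The overlap between transfer and computation can be mimicked in software. The sketch below is an analogy, not the hardware mechanism: a worker thread stands in for the DMA engine, a `bytes` object stands in for external memory, and a `bytearray` stands in for shared local memory. The class `AsyncDmaEngine` and its `copy_async` method are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class AsyncDmaEngine:
    """Toy stand-in for a per-core DMA engine: copies run on a worker thread."""

    def __init__(self) -> None:
        # A single worker means queued copies complete in submission order.
        self._worker = ThreadPoolExecutor(max_workers=1)

    def copy_async(self, src: bytes, src_off: int,
                   dst: bytearray, dst_off: int, length: int) -> Future:
        """Queue a copy and return at once; the Future resolves to bytes copied."""
        if src_off < 0 or dst_off < 0 or length < 0:
            raise ValueError("offsets and length must be non-negative")
        if src_off + length > len(src) or dst_off + length > len(dst):
            raise ValueError("copy would run past the end of a buffer")

        def _copy() -> int:
            dst[dst_off:dst_off + length] = src[src_off:src_off + length]
            return length

        return self._worker.submit(_copy)

    def shutdown(self) -> None:
        self._worker.shutdown(wait=True)


external_memory = bytes(range(256))   # stand-in for an external memory device
local_memory = bytearray(64)          # stand-in for shared local memory

dma = AsyncDmaEngine()
pending = dma.copy_async(external_memory, 32, local_memory, 0, 16)

# The "core" keeps computing while the copy is in flight.
partial_sum = sum(range(1000))

# Wait for the transfer only at the point where the data is actually needed.
copied = pending.result()
dma.shutdown()
print(copied, partial_sum)            # 16 499500
```

The key design point is that the wait is deferred to the moment the data is consumed. Everything between `copy_async` and `result()` overlaps with the transfer.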
The described GPU architecture targets large-scale graphics workloads by combining chiplets with asynchronous DMA engines. Potential applications include gaming, simulations, and other computationally intensive tasks that require rapid data processing and efficient memory usage. The design is intended to raise performance while keeping resource allocation flexible.