US20240152457
2024-05-09
Physics
G06F12/0815
The disclosed technology introduces a scalable method for managing data coherency using shared virtual memory in heterogeneous processing systems. Unlike traditional hardware-based solutions that rely on costly structures such as inclusive caches and snoop filters, this approach stores coherency information in system memory as page table metadata. Because the directory resides in system memory rather than in dedicated hardware, it is effectively limitless in size. The approach also reduces the granularity of coherency tracking and eliminates the hardware penalties associated with capacity-related victimizations.
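To make the idea of a directory held as page table metadata concrete, the following is a minimal sketch rather than the patented implementation. It assumes a 4 KiB page as the tracking granularity and a three-value ownership field; the names `Owner`, `PageTableEntry`, `PageTableDirectory`, and `acquire` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from enum import Enum

PAGE_SIZE = 4096  # assumed page size; one coherency record per page


class Owner(Enum):
    """Hypothetical ownership states kept alongside the translation."""
    SHARED = 0
    CPU = 1
    GPU = 2


@dataclass
class PageTableEntry:
    frame: int                    # physical frame number
    owner: Owner = Owner.SHARED   # coherency metadata (assumed layout)


class PageTableDirectory:
    """Directory stored in ordinary memory, so it has no fixed capacity
    and never has to evict (victimize) a tracked page."""

    def __init__(self):
        self.entries = {}         # virtual page number -> PageTableEntry
        self.next_frame = 0

    def map_page(self, vaddr):
        vpn = vaddr // PAGE_SIZE
        if vpn not in self.entries:
            self.entries[vpn] = PageTableEntry(frame=self.next_frame)
            self.next_frame += 1
        return self.entries[vpn]

    def acquire(self, vaddr, agent):
        """Record `agent` as owner of the page containing `vaddr`.

        Returns True when a different agent held the page exclusively,
        which is the case where that agent would need to write back or
        invalidate its cached copies.
        """
        pte = self.map_page(vaddr)
        changed = pte.owner not in (agent, Owner.SHARED)
        pte.owner = agent
        return changed
```

In this sketch two addresses that fall in the same page share one ownership record, so a GPU access to `0x1004` after a CPU access to `0x1000` reports an ownership change, while an access to a different page does not.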
The invention pertains to graphics processing logic, focusing on data coherency management within heterogeneous processing systems. Traditional coherency tracking methods are inefficient in terms of power and die area, especially when scaled up for higher-bandwidth processing. The approach described here aims to address these inefficiencies by leveraging shared virtual memory.
By moving coherency state storage from dedicated hardware blocks to system memory, the invention creates a more efficient and scalable structure. Because the coherency state is stored as page table metadata, CPU and GPU components can cache it locally via a translation lookaside buffer (TLB). The approach applies to various processor types, including graphics and general-purpose processors.
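The local caching of coherency state through a TLB can be illustrated with a small model. This is a sketch under stated assumptions, not the mechanism claimed in the patent: the LRU replacement policy, the string owner tags, and the names `PageTable`, `Tlb`, `lookup`, and `invalidate` are all hypothetical.

```python
from collections import OrderedDict


class PageTable:
    """Authoritative table in system memory: vpn -> (frame, owner)."""

    def __init__(self):
        self.entries = {}

    def set(self, vpn, frame, owner):
        self.entries[vpn] = (frame, owner)

    def walk(self, vpn):
        return self.entries[vpn]


class Tlb:
    """Small per-agent cache of translations plus coherency metadata."""

    def __init__(self, page_table, capacity=2):
        self.page_table = page_table
        self.capacity = capacity
        self.cache = OrderedDict()   # insertion order doubles as LRU order
        self.walks = 0               # number of page table walks performed

    def lookup(self, vpn):
        if vpn in self.cache:
            self.cache.move_to_end(vpn)      # mark as most recently used
            return self.cache[vpn]
        self.walks += 1
        entry = self.page_table.walk(vpn)    # miss: fetch from memory
        self.cache[vpn] = entry
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used
        return entry

    def invalidate(self, vpn):
        """Drop a stale entry after the page table metadata changes."""
        self.cache.pop(vpn, None)
```

The point of the model is that evicting a TLB entry loses no coherency information: the page table in memory remains authoritative, and a later lookup simply walks it again. A change of ownership is modeled as an update to the page table followed by an invalidation of the cached copy.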
The system architecture includes one or more processors and one or more graphics processors, potentially forming part of a system-on-a-chip (SoC) for mobile or embedded devices. The architecture supports applications ranging from gaming platforms to wearable technology, using processors with multiple cores capable of executing diverse instruction sets. These processors may include internal and external caches to optimize performance.
The architecture supports extensive peripheral integration through high-speed I/O buses. Components such as audio controllers, wireless transceivers, and data storage devices are interconnected via an I/O controller hub, which provides the communication path between the central processing units and the peripheral devices.