Victim cache

A victim cache is a small, typically fully associative cache placed in the refill path of a CPU cache. It holds blocks recently evicted from that cache level and was originally proposed in 1990. In modern architectures, this function is typically performed by Level 3 or Level 4 caches.

Victim caching is a hardware technique, proposed by Norman Jouppi, for improving cache performance. As described in his paper:^[1]

Miss caching places a fully-associative cache between cache and its re-fill path. Misses in the cache that hit in the miss cache have a one cycle penalty, as opposed to a many cycle miss penalty without the miss cache. Victim Caching is an improvement to miss caching that loads the small fully-associative cache with victim of a miss and not the requested cache line.^[1]

A victim cache is a hardware cache designed to reduce conflict misses in direct-mapped caches while preserving their low hit latency. It sits in the refill path of a Level 1 cache, and any cache line evicted from the Level 1 cache is placed in the victim cache. As a result, the victim cache is populated only when data is evicted from the Level 1 cache. When a miss occurs in the Level 1 cache, the victim cache is checked for the missing line. If that lookup hits, the contents of the Level 1 cache line and the matching victim cache line are swapped.
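The lookup, eviction, and swap behavior described above can be sketched as a small simulation. This is a minimal illustration, not Jouppi's design: the class name `VictimCachedL1`, the cache sizes, and the LRU replacement policy for the victim cache are all assumptions made for the example, and block numbers stand in for tag and data.

```python
from collections import OrderedDict


class VictimCachedL1:
    """Toy model: direct-mapped L1 backed by a small fully associative victim cache.

    Hypothetical sketch. Sizes and the LRU victim replacement policy are assumptions.
    """

    def __init__(self, num_lines=4, victim_entries=2):
        self.num_lines = num_lines
        self.victim_entries = victim_entries
        self.l1 = [None] * num_lines   # line index -> block number currently held
        self.victim = OrderedDict()    # block number -> None, oldest entry first

    def access(self, block):
        """Return 'l1_hit', 'victim_hit', or 'miss' for one block access."""
        index = block % self.num_lines
        resident = self.l1[index]
        if resident == block:
            return "l1_hit"
        if block in self.victim:
            # Swap: the requested block moves into L1 and the displaced
            # L1 block takes its place in the victim cache.
            del self.victim[block]
            self.l1[index] = block
            if resident is not None:
                self.victim[resident] = None
            return "victim_hit"
        # Miss in both: the block comes from the next level, and the line
        # evicted from L1 (if any) is placed in the victim cache.
        self.l1[index] = block
        if resident is not None:
            if len(self.victim) >= self.victim_entries:
                self.victim.popitem(last=False)  # drop the oldest victim
            self.victim[resident] = None
        return "miss"


cache = VictimCachedL1(num_lines=4, victim_entries=2)
# Blocks 0 and 4 map to the same L1 line (0 % 4 == 4 % 4).
print([cache.access(b) for b in [0, 4, 0, 4, 0]])
```

In this trace, blocks 0 and 4 compete for the same Level 1 line. After the two initial misses, every later access is served by the victim cache through a swap rather than by the next level of the hierarchy.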

Although Jouppi originally proposed the victim cache to improve the performance of a direct-mapped Level 1 cache, modern microprocessors with multi-level cache hierarchies use a Level 3 or Level 4 cache as a victim cache for the cache level above it in the memory hierarchy. Intel's Crystal Well^[2] variant of its Haswell processors introduced an on-package Level 4 cache that serves as a victim cache to the processor's Level 3 cache.^[3] A 4–12 MB Level 3 cache is used as a victim cache in IBM's POWER5 microprocessors.

Background

As hardware architecture and technology advanced, processor performance and clock frequency improved much faster than memory cycle times, opening a significant performance gap between processor and memory. High-speed cache memory was introduced to hide this growing memory latency.

Direct-mapped caches have a faster access time than set-associative caches. However, when multiple memory blocks map to the same line of a direct-mapped cache, they evict each other whenever one of them is accessed. This issue, known as the cache-conflict problem, arises from the cache's lack of associativity. Increasing associativity mitigates the problem, but implementation complexity limits how far associativity can be raised. A victim cache is often employed to address the cache-conflict problem within these constraints.
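The cache-conflict problem can be illustrated with a short miss counter for a direct-mapped cache. This is a sketch under assumed parameters: the function name `count_direct_mapped_misses`, the 8-line cache, and the 32-byte line size are hypothetical values chosen for the example.

```python
def count_direct_mapped_misses(addresses, num_lines=8, line_size=32):
    """Count misses for a byte-address trace in a toy direct-mapped cache.

    Hypothetical parameters: 8 lines of 32 bytes each.
    """
    lines = [None] * num_lines           # line index -> block number held
    misses = 0
    for addr in addresses:
        block = addr // line_size        # which memory block the address is in
        index = block % num_lines        # the single line that block may occupy
        if lines[index] != block:
            misses += 1
            lines[index] = block         # evict whatever was there before
    return misses


# Addresses num_lines * line_size = 256 bytes apart share the same line.
conflicting = [0x0000, 0x0100] * 4
# Addresses in neighboring blocks use different lines.
non_conflicting = [0x0000, 0x0020] * 4

print(count_direct_mapped_misses(conflicting))      # 8: every access misses
print(count_direct_mapped_misses(non_conflicting))  # 2: only the first touches miss
```

The two traces have the same length and touch the same number of distinct blocks. Only the mapping of those blocks to cache lines differs, which is what separates a conflict miss from a capacity or compulsory miss.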

Implementation

Example

Performance implications

References

Related Articles