Modern shared-memory multi-core processors typically have
shared Level 2 (L2) or Level 3 (L3) caches. When multiple cores try to
access the shared cache simultaneously, cache bottlenecks and the choice of
replacement strategy become the main obstacles to improving memory
performance. This paper documents the implementation of a Dual-Port
Content Addressable Memory (DPCAM) and a modified Near-Far Access
Replacement Algorithm (NFRA), which were previously proposed as a shared
L2 cache layer in a multi-core processor. Standard Performance Evaluation
Corporation (SPEC) Central Processing Unit (CPU) 2006 benchmark
workloads are used to evaluate the benefit of the shared L2 cache layer.
Results show that the DPCAM and NFRA improve the performance of the
multi-core processor, corresponding to a higher number of concurrent
accesses to shared memory. The new architecture significantly increases
system throughput, with performance improvements of up to 8.7% across
various SPEC 2006 benchmarks. The miss rate is also reduced by about 13%,
with some exceptions in the sphinx3 and bzip2 benchmarks. These results
could open a new avenue for solving the long-standing problems with
shared caches in multi-core processors.
Authors
Allam Abumwais
Mahmoud Obaid
Pages From
4952
Pages To
4963
DOI
https://doi.org/10.32604/cmc.2023.032822
Journal Name
Computers, Materials & Continua
Volume
74
Issue
3
Keywords
Multi-core processor; shared cache; content addressable memory; dual port CAM; replacement algorithm; benchmark program
Abstract