Jenga: Responsive Tiered Memory Management without Thrashing
Abstract
A heterogeneous memory has a single address space with fast access to some addresses (a fast tier of DRAM) and slow access to other addresses (a capacity tier of CXL-attached memory or NVM). A tiered memory system aims to maximize the number of accesses to the fast tier via page migrations between the fast and capacity tiers. Unfortunately, previous tiered memory systems can perform poorly due to (1) allocating hot and cold objects in the same page and (2) abrupt changes in hotness measurements that lead to thrashing. This paper presents Jenga, a tiered memory system that addresses both problems. Jenga's memory allocator uses a novel context-based page allocation strategy. Jenga's accurate measurements of page hotness enable it to react to memory access behavior changes in a timely manner while avoiding thrashing. Compared to the best previous tiered memory system, Jenga runs memory-intensive applications 28% faster across 10 applications, when the fast tier capacity matches the working set size, at a CPU overhead of <3% of a single core and a memory overhead of <0.3%