Intel 5-level paging
Intel 5-level paging, referred to simply as 5-level paging in Intel documents, is a processor extension for the x86-64 line of processors.[1]: 11 It extends the size of virtual addresses from 48 bits to 57 bits by adding a level to x86-64's multilevel page tables, increasing the addressable virtual memory from 256 TiB to 128 PiB. The extension was first implemented in the Ice Lake processors.[2]
Technology
In the 4-level paging scheme (previously known as IA-32e paging), the low 48 bits of the 64-bit virtual memory address are divided into five parts. The lowest 12 bits contain the offset within the 4 KiB memory page, and the following 36 bits are evenly divided between four 9-bit descriptors, each selecting a 64-bit page table entry in a 512-entry page table for one of the four paging levels. This makes it possible to use bits 0 through 47 in the virtual address, for a total of 256 TiB.[3]: 4-2
5-level paging adds another 9-bit page table descriptor, making it possible to use bits 0 through 56. This multiplies the address space by 512 and increases the limit to 128 PiB.
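The bit layout described above can be illustrated with a minimal sketch. The function and constant names here are hypothetical and chosen only for illustration; the field widths (a 12-bit offset and one 9-bit index per paging level) come from the description above.

```python
# Illustrative only: names are hypothetical, field widths follow the text.
PAGE_OFFSET_BITS = 12   # offset within a 4 KiB page
INDEX_BITS = 9          # each page table has 2**9 = 512 entries


def split_virtual_address(vaddr: int, levels: int) -> tuple:
    """Return (indexes ordered from the top-level table down, page offset)."""
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    indexes = []
    for level in reversed(range(levels)):
        shift = PAGE_OFFSET_BITS + INDEX_BITS * level
        indexes.append((vaddr >> shift) & ((1 << INDEX_BITS) - 1))
    return indexes, offset


def address_space_bytes(levels: int) -> int:
    """Size of the virtual address space reachable with `levels` levels."""
    return 1 << (PAGE_OFFSET_BITS + INDEX_BITS * levels)


if __name__ == "__main__":
    for levels in (4, 5):
        print(levels, "levels:", address_space_bytes(levels), "bytes")
```

With these definitions, `address_space_bytes(4)` is 2^48 bytes (256 TiB) and `address_space_bytes(5)` is 2^57 bytes (128 PiB), a factor of 512 apart.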
With 5-level paging enabled, bits 57 through 63 must be copies of bit 56.[1]: 17 This is the same as with 4-level paging, where the high-order bits of a virtual address that do not participate in address translation must be the same as the most significant implemented bit.
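The sign-extension rule can be checked with a few lines of code. This is a sketch rather than anything taken from Intel's documentation: the function name `is_canonical` is hypothetical, and addresses are assumed to be unsigned 64-bit integers.

```python
def is_canonical(vaddr: int, va_bits: int) -> bool:
    """True if bits va_bits..63 are all copies of bit va_bits-1.

    va_bits is 48 for 4-level paging and 57 for 5-level paging.
    vaddr is assumed to be in the range 0 .. 2**64 - 1.
    """
    # Keep the most significant implemented bit and everything above it.
    upper = vaddr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    # Canonical means those bits are either all zeros or all ones.
    return upper == 0 or upper == all_ones
```

For example, under these assumptions the address 0x0000800000000000 fails the check with `va_bits=48` (bit 47 is set but bits 48 through 63 are clear) and passes it with `va_bits=57`.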
5-level paging is enabled by setting bit 12 of the CR4 register (known as LA57).[1]: 16 The bit is used only when the processor is operating in 64-bit mode, and it can be modified only when the processor is not in that mode.[1]: 16 If the bit is not set, or the 5-level paging feature is not supported, the processor uses the 4-level page table structure when operating in 64-bit mode.[3]: 4-22 This is similar to Physical Address Extension (PAE), where a third level of page tables, which allowed 36-bit addressing, was enabled by setting a bit in the CR4 register.[4][3]: 4-14
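The selection rule can be modelled as a small decision function. This is only a sketch of the rule as stated in this section: the names `CR4_LA57` and `paging_levels` are hypothetical, the CR4 value is passed in as a plain integer (reading the real register requires privileged code), and paging modes outside 64-bit mode are not modelled.

```python
from typing import Optional

CR4_LA57 = 1 << 12  # bit 12 of CR4, as described in the text


def paging_levels(cr4: int, in_64bit_mode: bool,
                  la57_supported: bool = True) -> Optional[int]:
    """Return 5 or 4 in 64-bit mode, or None outside 64-bit mode."""
    if not in_64bit_mode:
        return None  # LA57 is not used here; other paging modes apply
    if la57_supported and (cr4 & CR4_LA57):
        return 5
    return 4
```

For example, `paging_levels(1 << 12, True)` yields 5, while a cleared bit or a processor without the feature yields 4.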
Future processors may allow full 64-bit virtual address space by extending the size of page table descriptors to 12 bits (4096 page table entries) and memory offset to 16 bits (64 KiB page size) in the 4-level paging scheme or 21 bits (2 MiB page size) in the 5-level scheme.[5] Extending page table entry size from 64 to 128 bits would allow arbitrary page sizes, as additional hardware flags would change the size and operation of descriptors on lower paging levels.[5]
Drawbacks
Adding another level of indirection makes page table "walks" longer.[6] A page table walk occurs when either the processor's memory management unit or the memory management code in the operating system navigates the tree of page tables to find the page table entry corresponding to a virtual address.[7][3]: 4-22 This means that, in the worst case, the processor or the memory manager has to access physical memory six times for a single virtual memory access (five page table reads plus the access itself), rather than five for the previous iteration of x86-64 processors. This results in slightly reduced memory access speed.[8] In practice this cost is greatly mitigated by caches such as the translation lookaside buffer (TLB).[8] Future extensions may reduce page walks by limiting the virtual address space per application, with dedicated hardware flags in an extended 128-bit page table entry, and by allowing larger 64 KiB or 2 MiB page sizes with backward compatibility with 4 KiB page operations.[5]
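The worst-case access count can be reproduced with a toy model of a page walk. This is a simplified sketch, not how hardware or any operating system stores page tables: nested Python dictionaries stand in for the tables, there is no TLB or other cache, and the names `map_page` and `translate` are hypothetical.

```python
def map_page(root: dict, vaddr: int, levels: int, frame_base: int) -> None:
    """Install a mapping for the 4 KiB page containing vaddr."""
    table = root
    # Create the intermediate tables, from the top level down to level 1.
    for level in reversed(range(1, levels)):
        index = (vaddr >> (12 + 9 * level)) & 0x1FF
        table = table.setdefault(index, {})
    # The lowest-level entry holds the physical frame base address.
    table[(vaddr >> 12) & 0x1FF] = frame_base


def translate(root: dict, vaddr: int, levels: int) -> tuple:
    """Walk the tables; return (physical address, memory accesses made)."""
    accesses = 0
    entry = root
    for level in reversed(range(levels)):
        index = (vaddr >> (12 + 9 * level)) & 0x1FF
        entry = entry[index]   # one page table entry read per level
        accesses += 1
    physical = entry + (vaddr & 0xFFF)
    accesses += 1              # the data access itself
    return physical, accesses
```

In this model, translating a mapped address takes 5 accesses with `levels=4` and 6 with `levels=5`, matching the counts given above for an uncached walk.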
Implementation
5-level paging is implemented by the Ice Lake microarchitecture,[2] EPYC 9004 and 8004 Series processors,[9][10] and the Storm Peak Ryzen Threadripper PRO 7900WX series.[11]
Linux kernel 4.14 added support for 5-level paging.[12] Support for the extension was first submitted as a set of patches to the Linux kernel on 8 December 2016.[13] As was reported on the Linux kernel mailing list, the work consisted of extending the Linux memory model to use five levels rather than four.[14] This was necessary because, although Linux abstracts the details of the page tables, it still depends on having a fixed number of levels in its own representation. When an architecture supports fewer levels, Linux emulates extra levels that do nothing.[15] A similar change was previously made to extend from three levels to four.[16]
Windows 10, Windows 11, and the corresponding server versions also support this extension in their latest updates, where it is provided by a separate kernel image called ntkrla57.exe.[17]
References
- ^ a b c d "5-Level Paging and 5-Level EPT". Intel Corporation. May 2017.
- ^ a b Cutress, Ian. "Sunny Cove Microarchitecture: A Peek At the Back End". Intel's Architecture Day 2018: The Future of Core, Intel GPUs, 10nm, and Hybrid x86. Retrieved 15 October 2019.
- ^ a b c d Intel® 64 and IA-32 Architectures Software Developer's Manual. Vol. 3A. Intel Corporation.
- ^ Hudek, Ted (June 2017). "Operating Systems and PAE Support - Windows 10 hardware dev". Microsoft Learn. Retrieved 27 January 2024.
- ^ a b c US patent 9858198, Larry Seiler, "64KB page system that supports 4KB page operation", published 2016-12-29, issued 2018-01-02, assigned to Intel Corp.
- ^ "CSALT: Context Switch Aware Large TLB". MICRO-50: the 50th Annual IEEE/ACM International Symposium on Microarchitecture: proceedings. Cambridge, MA: Institute of Electrical and Electronics Engineers, IEEE Computer Society, ACM Special Interest Group on Microprogramming. 14 October 2017. p. 450. doi:10.1145/3123939.3124549. ISBN 978-1-4503-4952-9. OCLC 1032337814.
- ^ "ARM Information Center". infocenter.arm.com. Retrieved 26 April 2018.
- ^ a b Levy, Hank (Autumn 2008). "CSE 451: Operating Systems: Paging & TLBs" (PDF). University of Washington. Retrieved 26 April 2018.
- ^ "Tuning Guide for AMD EPYC™ 9004 Processors" (PDF). AMD. September 2023.
- ^ "4TH GEN AMD EPYC™ PROCESSOR ARCHITECTURE" (PDF). AMD. May 2024.
- ^ "CPUID dump for 96-Core AMD Ryzen Threadripper PRO 7995WX (Storm Peak) Zen4". GitHub. 19 October 2023.
- ^ Tung, Liam. "First Linux 4.14 release adds "very core" features, arrives in time for kernel's 26th birthday". ZDNet. Retrieved 25 April 2018.
- ^ Michael Larabel (9 December 2016). "Intel Working On 5-Level Paging To Increase Linux Virtual/Physical Address Space - Phoronix". Phoronix. Retrieved 26 April 2018.
- ^ Shutemov, Kirill A. (8 December 2016). "[RFC, PATCHv1 00/28] 5-level paging". Linux kernel mailing list (Mailing list). Retrieved 26 April 2018.
- ^ "Page Table Management". www.kernel.org. Retrieved 26 April 2018.
- ^ "Four-level page tables [LWN.net]". lwn.net. 12 October 2004. Retrieved 26 April 2018.
- ^ @aionescu (23 June 2019). "Old farts like me will remember the days of ntoskrnl.exe, ntkrnlpa.exe, ntkrnlmp.exe and ntkrpamp.exe" (Tweet) – via Twitter.