Linux Kernel

Oracle is working on multi-threaded VFIO page pinning for roughly 10x faster QEMU initialization

For those who assign VFIO devices to guest VMs, guest startup could soon be much faster thanks to a set of patches from Oracle.

Oracle engineers have been working on multi-threaded VFIO page pinning to speed up guest startup, which can have a noticeable impact for large guest VMs. The patch series providing this multi-threaded VFIO page pinning is currently out as a “Request for Comments” (RFC), and the patch cover letter explains the motivation and benefits:

Assigning a VFIO device to a guest requires pinning every page of guest memory, which is expensive for large guests, even if the memory has already been faulted in and cleared with something like qemu prealloc.

Some recent optimizations have reduced costs, but this remains a significant bottleneck for guest initialization time. Parallelize with padata to get the most out of memory bandwidth, giving up to 12x speedups for pinning VFIO pages and 10x speedups for overall qemu guest initialization. Detailed performance results are in patch 8.

The first phase of the multithreading work made deferred struct page initialization use all CPUs on x86. This is a special case because it happens during boot, when the machine is waiting for page initialization to finish and there are generally no resource controls to violate.

Page pinning, on the other hand, can be performed by a user task (the “main thread” in a job), so helper threads must honor the main thread’s resource controls that are relevant to pinning (CPU, memory) and give priority to other tasks on the system. This RFC has some but not all of the pieces needed to do so.
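The chunk-and-distribute idea described in the cover letter can be illustrated with a small user-space sketch. This is not the kernel's padata API and it pins no memory: `split_range`, `pin_chunk`, `pin_range_parallel`, and the `min_chunk` and `max_threads` parameters are hypothetical names chosen for illustration. It only shows how a page range might be divided among a capped number of helper threads, with the cap standing in for the main thread's resource limits.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start_page, nr_pages, min_chunk, max_threads):
    """Split [start_page, start_page + nr_pages) into contiguous chunks.

    At most max_threads chunks are produced, and no chunk is smaller than
    min_chunk pages unless the whole range is smaller than min_chunk.
    """
    if nr_pages <= 0:
        return []
    # Never use more helpers than the cap, nor more than the range can feed.
    nr_chunks = max(1, min(max_threads, nr_pages // min_chunk))
    base, extra = divmod(nr_pages, nr_chunks)
    chunks = []
    page = start_page
    for i in range(nr_chunks):
        # Spread the remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        chunks.append((page, size))
        page += size
    return chunks


def pin_chunk(chunk):
    """Stand-in for pinning one chunk; returns the number of pages handled."""
    _start, size = chunk
    return size


def pin_range_parallel(start_page, nr_pages, min_chunk=512, max_threads=4):
    """Hand each chunk to its own helper thread and total the pages handled."""
    chunks = split_range(start_page, nr_pages, min_chunk, max_threads)
    if not chunks:
        return 0
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return sum(pool.map(pin_chunk, chunks))
```

The minimum chunk size keeps small ranges on a single thread, where the cost of starting helpers would outweigh any gain, while the thread cap is a crude placeholder for the CPU and memory controls the cover letter says helper threads must respect.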

A 12x speedup for VFIO page pinning through multi-threading is quite a difference, and it translates to roughly a 10x speedup for overall QEMU guest initialization. This patch message has more details on the performance testing on AMD and Intel servers.

Large servers with a lot of RAM, and the large guests they host, stand to benefit the most from this multi-threaded VFIO page pinning.

Oracle has been shipping some of these patches in its downstream kernel releases for Oracle Enterprise Linux for about three years. See this patch series for the 16 initial RFC patches.
