Xen Project Hypervisor Power Management: Suspend-to-RAM on Arm Architectures
About a year ago, we started a project to lay the foundation for full-scale power management for applications involving the Xen Project Hypervisor on Arm architectures. We intend to make Xen on Arm's power management the open source reference design for other Arm hypervisors in need of power management capabilities.
Looking at Previous Examples for Initial Approach
We looked at the older ACPI-based power management for Xen on x86, which features CPU idling (cpu-idle), CPU frequency scaling (cpu-freq), and suspend-to-RAM. We also looked at the existing PSCI platform management and pass-through capabilities of Xen on Arm, which at the time offered no power management support. We decided to take a different path from x86 because we could not rely on ACPI, which is not widely used in the Arm embedded community. Xen on Arm already used PSCI for booting secondary CPUs, system shutdown, restart, and other miscellaneous platform functions, so we decided to follow that trend and base our implementation on PSCI.
Among the typical power management features, such as cpu-idle, cpu-freq, suspend-to-RAM, hibernate and others, we concluded that suspend-to-RAM would be the one best suited for our initial targets, systems-on-chip (SoCs). Most SoCs allow the CPU voltage domain to be completely powered off while the processor subsystem is suspended, with the system state preserved in RAM that is kept in self-refresh mode. This significantly cuts power consumption, often down to just tens of milliwatts.
Our Design Approach
Our solution provides a framework that is well suited for embedded applications. In our suspend-to-RAM approach, each unprivileged guest is given a chance to suspend on its own and to configure its own wake-up devices. At the same time, the privileged guest (Dom0) is considered to be a decision maker for the whole system: it can trigger the suspend of Xen, regardless of the states of the unprivileged guests.
These two features allow for different Xen embedded configurations and use-cases. They make it possible to freeze an unprivileged guest due to an ongoing suspend procedure, or to inform it about the suspend intent, giving it a chance to cooperate and suspend itself. These features are the foundation for higher level coordination mechanisms and use-case specific policies.
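The two behaviours described above can be pictured with a minimal sketch. It is only an illustration of the idea, not the actual Xen code: the types, the function dom0_request_suspend, and the two policy names are hypothetical, and the choice to wait for notified guests is one possible policy among several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names come from the Xen code base. */
enum guest_state { GUEST_RUNNING, GUEST_SUSPENDED, GUEST_FROZEN };

enum suspend_policy {
    POLICY_FREEZE,  /* pause running guests without asking them */
    POLICY_NOTIFY   /* inform running guests and let them suspend themselves */
};

struct guest {
    enum guest_state state;
    bool notified;  /* set once the suspend intent has been delivered */
};

/*
 * Called when Dom0 asks for a system suspend.
 * Returns true if the hypervisor could suspend right away, false if it
 * should first wait for the notified guests to suspend on their own.
 */
bool dom0_request_suspend(struct guest *guests, size_t n,
                          enum suspend_policy policy)
{
    bool ready = true;

    assert(guests != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (guests[i].state != GUEST_RUNNING)
            continue;  /* already suspended or frozen: nothing to do */

        if (policy == POLICY_FREEZE) {
            guests[i].state = GUEST_FROZEN;
        } else {
            guests[i].notified = true;
            ready = false;
        }
    }
    return ready;
}
```

A higher-level coordination mechanism could start with POLICY_NOTIFY and fall back to POLICY_FREEZE after a timeout, which is the kind of use-case specific policy these features are meant to enable.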
Our solution relies on the PSCI interface to allow guests to suspend themselves, and to enable the hypervisor to suspend the physical system. It further makes use of EEMI to enable guest notifications when the suspend-to-RAM procedure is initiated. EEMI stands for Embedded Energy Management Interface, and it is used to communicate with the power management controller on Xilinx devices. On the Xilinx Zynq UltraScale+ MPSoC we were able to suspend the whole application subsystem with Linux and Xen and put the MPSoC into its deep-sleep state, where it consumes only 35 mW. Resuming from this state is triggered by a wake-up interrupt that can be owned by either Dom0 or an unprivileged guest.
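To make the guest side of this more concrete, here is a rough sketch of how a virtual PSCI layer might handle a guest's SYSTEM_SUSPEND call. The function ID and return codes follow the Arm PSCI specification; everything else (the vcpu and domain structures, the vpsci_call function) is a hypothetical simplification and not the actual Xen implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Function ID and return codes as defined by the Arm PSCI specification. */
#define PSCI_1_0_FN64_SYSTEM_SUSPEND 0xC400000EU
#define PSCI_SUCCESS         0
#define PSCI_NOT_SUPPORTED (-1)
#define PSCI_DENIED        (-3)

/* Hypothetical guest model, for illustration only. */
struct vcpu { bool online; };

struct domain {
    struct vcpu *vcpus;
    size_t nr_vcpus;
    bool suspended;
    uint64_t resume_entry;    /* where the guest continues after wake-up */
    uint64_t resume_context;  /* value handed back to the guest on resume */
};

/* Handle a PSCI call issued by vCPU number `caller` of domain `d`. */
int32_t vpsci_call(struct domain *d, size_t caller, uint32_t fid,
                   uint64_t entry, uint64_t context_id)
{
    assert(d != NULL && caller < d->nr_vcpus);

    if (fid != PSCI_1_0_FN64_SYSTEM_SUSPEND)
        return PSCI_NOT_SUPPORTED;  /* this sketch handles one call only */

    /* SYSTEM_SUSPEND is denied unless every other (virtual) CPU is off. */
    for (size_t i = 0; i < d->nr_vcpus; i++)
        if (i != caller && d->vcpus[i].online)
            return PSCI_DENIED;

    d->resume_entry = entry;
    d->resume_context = context_id;
    d->suspended = true;
    return PSCI_SUCCESS;
}
```

In a real hypervisor the call would not return to the guest on success; the guest would instead restart at the saved entry point once one of its wake-up interrupts fires. The sketch returns PSCI_SUCCESS only to keep the example easy to follow.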
After the successful implementation of suspend-to-RAM, the logical next step is to introduce CPU frequency scaling and CPU idling based on the aggregate load and performance requirements of all VMs.
While an individual VM may be aware of its own performance needs, its utilization level, and the resulting CPU load, this information only applies to the virtual CPUs assigned to that guest. Since a VM is aware of neither the virtual-to-physical CPU mappings nor the other VMs and their performance needs, it is not in a position to make suitable decisions regarding the power and performance states of the SoC.
The hypervisor, on the other hand, is scheduling the virtual CPUs and needs to be aware of their utilization of the physical CPUs. Having this visibility, the hypervisor is well suited to make power management decisions concerning the frequency and idle states of the physical CPUs. In our vision, the hypervisor scheduler will become energy aware and allocate energy consumption slots to guests, rather than time slots.
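As a thought experiment, the hypervisor's advantage can be shown in a few lines: it can sum the demand of every virtual CPU mapped to a physical CPU and pick the lowest frequency that covers the total. This is only a sketch of the idea and not a design we have implemented. The vcpu_load structure, the pick_pcpu_freq function, and the notion of expressing demand in MHz are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-vCPU record: where it runs and how much it needs. */
struct vcpu_load {
    unsigned pcpu;        /* physical CPU this vCPU is mapped to */
    unsigned demand_mhz;  /* assumed performance demand of this vCPU */
};

/*
 * Return the lowest available frequency that covers the summed demand of
 * all vCPUs mapped to `pcpu`. `freqs_mhz` must be sorted in ascending
 * order; if the demand exceeds every entry, the highest one is returned.
 */
unsigned pick_pcpu_freq(const struct vcpu_load *v, size_t n, unsigned pcpu,
                        const unsigned *freqs_mhz, size_t nr_freqs)
{
    unsigned total = 0;

    assert(freqs_mhz != NULL && nr_freqs > 0);

    /* Only the hypervisor can see all guests' vCPUs on this pCPU. */
    for (size_t i = 0; i < n; i++)
        if (v[i].pcpu == pcpu)
            total += v[i].demand_mhz;

    for (size_t i = 0; i < nr_freqs; i++)
        if (freqs_mhz[i] >= total)
            return freqs_mhz[i];

    return freqs_mhz[nr_freqs - 1];
}
```

No single guest could perform this calculation, because each one sees only its own vCPUs. An energy-aware scheduler would go further and feed such aggregate figures back into its placement decisions.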
Currently, our work is focused on testing the new Xen suspend-to-RAM feature on the Xilinx Zynq UltraScale+ MPSoC. We are calling on Xen Project developers to join the Xen power management activity and to implement and test the new feature on other Arm architectures, so that we can accelerate the upstreaming effort and the accompanying cleanup.
Mirela Grujic, Principal Engineer at AGGIOS
Davorin Mista, VP Engineering and Co-Founder at AGGIOS
Stefano Stabellini, Principal Engineer at Xilinx and Xen Project Maintainer
Vojin Zivojnovic, CEO and Co-Founder at AGGIOS