Virtual CPU Topology

Michael Zhao
4 min read · Feb 25, 2022


What is CPU Topology

CPU topology describes the layout of physical CPUs in the system. On an SMP (symmetric multiprocessing) system, the following levels exist in the CPU hierarchy:

  • Socket
  • Core
  • Thread

A typical CPU topology could look like the example below.
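
Purely as an illustration (not any particular real CPU), picture a machine with 2 sockets, 2 cores per socket, and 2 threads per core:

System
├─ Socket 0
│   ├─ Core 0: Thread 0, Thread 1
│   └─ Core 1: Thread 0, Thread 1
└─ Socket 1
    ├─ Core 0: Thread 0, Thread 1
    └─ Core 1: Thread 0, Thread 1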

Checking CPU topology

On a Linux system, you can check the CPU topology with the lscpu command. On my desktop computer with an Intel CPU, the output looks like this:

~ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 39 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1

......

Pay attention to the Thread(s) per core, Core(s) per socket and Socket(s) fields. My machine has a very simple CPU topology: there is 1 socket, 8 cores in that socket, and each core has only 1 thread.
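
If you want to see where these numbers come from, the kernel also exposes the topology of every CPU under sysfs, and lscpu simply reads it. The paths below are the standard Linux sysfs CPU topology interface; the values shown are just what a machine like mine would report:

~ cat /sys/devices/system/cpu/cpu0/topology/physical_package_id
0
~ cat /sys/devices/system/cpu/cpu0/topology/core_id
0
~ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0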

A server usually has far more CPUs, arranged in a more complex layout. Here is the topology of an Arm64 server:

~ lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 224
On-line CPU(s) list: 0-223
Thread(s) per core: 4
Core(s) per socket: 28
Socket(s): 2

......

There are 2 sockets, each socket contains as many as 28 cores, and each core has 4 threads. The total CPU count comes to 2 * 28 * 4 = 224. What a large machine!
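
lscpu can also print one line per CPU, which makes it easy to see which socket and core each CPU number belongs to. On a machine like this, a truncated, purely illustrative run might look like (the exact CPU numbering depends on how the firmware enumerates the threads):

~ lscpu -e=CPU,SOCKET,CORE
CPU SOCKET CORE
0   0      0
1   0      0
2   0      0
3   0      0
4   0      1
......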

Setting VCPU topology on Cloud Hypervisor

Many modern virtual machine monitors (VMMs) can set the virtual CPU topology as you wish. In this section I will show how to configure the VCPU topology in Cloud Hypervisor.

If you have never heard of Cloud Hypervisor, here is a quick introduction in 3 sentences:

  • Cloud Hypervisor is a VMM written in Rust language.
  • It supports different architectures (x86_64 and AArch64) and different hypervisors (KVM, Hyper-V, and soon the macOS hypervisor).
  • It is a Linux Foundation project.

On Cloud Hypervisor you can set the VCPU topology with the --cpus option, like this:

--cpus boot=8,max=16,topology=2:4:1:2

You can check the help message for the meaning of each parameter:

~ cloud-hypervisor --help
......
--cpus <cpus> boot=<boot_vcpus>,max=<max_vcpus>,topology=<threads_per_core>:<cores_per_die>:<dies_per_package>:<packages>,kvm_hyperv=on|off,max_phys_bits=<maximum_number_of_physical_bits>,affinity=<list_of_vcpus_with_their_associated_cpuset>
......

The parameters of the topology argument are the ones that define the VCPU topology:

  • threads_per_core - Corresponds to Thread(s) per core in the lscpu output
  • cores_per_die and dies_per_package - The product of these two corresponds to Core(s) per socket in the lscpu output. dies_per_package is quite special: it cannot be checked directly in the virtual machine and is only reflected in the total core count of each socket. Moreover, dies_per_package is only available on the x86_64 architecture; on AArch64 its value is restricted to 1.
  • packages - Corresponds to Socket(s) in the lscpu output
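
Putting it all together, here is a sketch of a full Cloud Hypervisor command line using the topology option. The kernel, disk and cmdline values are placeholders for whatever guest image you use; only the --cpus part matters here:

~ cloud-hypervisor \
    --kernel ./vmlinux \
    --disk path=./rootfs.raw \
    --cmdline "console=hvc0 root=/dev/vda rw" \
    --memory size=4G \
    --cpus boot=16,max=16,topology=2:4:1:2

With topology=2:4:1:2, lscpu inside the guest should report Thread(s) per core: 2, Core(s) per socket: 4 (cores_per_die * dies_per_package) and Socket(s): 2, for 2 * 4 * 1 * 2 = 16 CPUs in total, matching boot=16.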

How is VCPU Topology Implemented in a VMM

In this section I will explain how VCPU topology is implemented in a virtual machine monitor on the AArch64 architecture. I will take Cloud Hypervisor as an example when I need to refer to some source code.

Before I continue, I have to say there aren't many secrets in this part, because the VMM only applies some "settings" for the topology; the real secret is in the hypervisor. For KVM, if you are interested in how it works, you need to investigate the source code in the Linux kernel.

On AArch64, there are 2 ways to boot an operating system and manage devices: Flattened Device Tree (FDT), and UEFI & ACPI. The VCPU topology is implemented differently in each case.

In the case of FDT, you need to build a cpu-map tree structure in the cpus node to describe the topology you want. If you specify the topology with topology=2:2:1:1, the cpu-map that Cloud Hypervisor adds to the FDT looks like this:

cpus {
    // Other content of the cpus node is omitted ...
    cpu-map {
        cluster0 {
            core0 {
                thread0 {
                    cpu = <&CPU0>;
                };
                thread1 {
                    cpu = <&CPU1>;
                };
            }; // core0
            core1 {
                thread0 {
                    cpu = <&CPU2>;
                };
                thread1 {
                    cpu = <&CPU3>;
                };
            }; // core1
        }; // cluster0
    }; // cpu-map
}; // cpus

The cluster node corresponds to Socket(s) in lscpu. cluster nodes can be nested, which means you can add multiple layers of clusters; all the cluster nodes in the lowest layer are counted into Socket(s) of lscpu. The core and thread nodes match the fields with the same names in lscpu.
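
A quick way to convince yourself that the guest really received this cpu-map is to look at /proc/device-tree inside the guest, where the kernel exposes the FDT it booted with as a directory tree. For the topology=2:2:1:1 example above, I would expect something like this (illustrative output; the listings also contain a few property files that I have left out):

~ ls /proc/device-tree/cpus/cpu-map/
cluster0
~ ls /proc/device-tree/cpus/cpu-map/cluster0/
core0  core1
~ ls /proc/device-tree/cpus/cpu-map/cluster0/core0/
thread0  thread1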

If you are interested in the details, please see the source code here.

In the Linux kernel, there is a document explaining how to work with cpu-map. However, when writing the source code by following that guideline, I was a bit confused. Before I began coding, I expected the socket node mentioned in the doc to match the same concept in the lscpu output, but it didn't work that way: no matter how many socket nodes I added to the FDT, the Socket(s) value reported by lscpu stayed 1. I would need to look into the kernel code to see how socket really works.

In the case of ACPI, what you need to do is add a table named the Processor Properties Topology Table (PPTT). The table is described in Section 5.2.29 of the ACPI specification version 6.3. Similar to FDT, this table depicts a tree-structured VCPU hierarchy. See here for the source code.
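
As with the FDT case, you can verify from inside an ACPI-booted guest that the PPTT was actually delivered. The kernel exposes the raw ACPI tables under sysfs, and assuming the acpica-tools package (which provides acpidump and iasl) is installed in the guest, you can dump and disassemble the table to read the processor hierarchy yourself:

~ ls /sys/firmware/acpi/tables/ | grep PPTT
PPTT
~ acpidump -n PPTT -b    # dumps the raw table into pptt.dat
~ iasl -d pptt.dat       # disassembles it into a readable pptt.dsl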

If you find anything wrong or have any questions about the VCPU topology in Cloud Hypervisor, please feel free to ask on the issue board.
