Cloud Hypervisor + GDB + Arm64 Part 5: AArch64 Address Translation Sketch

Michael Zhao
Jul 7, 2022


This article collects the essential knowledge about address translation on AArch64. It doesn’t cover every aspect of the translation. I focus only on the information relevant to what happens when we use GDB to debug the guest kernel.

To enable GDB support on the Arm64 architecture for Cloud Hypervisor (a virtual machine monitor written in Rust), we need to translate the virtual address (VA) used by the guest kernel to the guest physical address, that is, the intermediate physical address (IPA). This scenario has the following constraints:

  • Only stage 1 translation is needed.
  • Only the high address range is handled: 0xFFFF_0000_0000_0000 ~ 0xFFFF_FFFF_FFFF_FFFF, because that is where kernel space lives.
  • Only Exception Level 1 (EL1) is involved, because the guest kernel runs at that level.

Most of the content is quoted from the AArch64 reference manual [1].

Address

The aim of address translation is to convert a virtual address (VA) to a physical address (PA) or an intermediate physical address (IPA).

Address Type

  • Virtual address (VA)
    An address used in an instruction, as a data or instruction address, is a Virtual Address (VA).
  • Intermediate physical address (IPA)
    In a translation regime that provides two stages of address translation, the IPA is:
    • The output address (OA) from the stage 1 translation.
    • The input address (IA) for the stage 2 translation.
  • Physical address (PA)
    The address of a location in a physical memory map. That is, an output address from the PE to the memory system.

Address Size

The address size (for both the input and output address of the translation process) is determined by the following rules:

  • Up to 52 bits when all of the following are true:
    - FEAT_LPA2 is implemented.
    - TCR_ELx.DS==1 for the translation regime controlled by that register.
    - The 4KB or 16KB translation granule is used.
  • Up to 52 bits when both of the following are true:
    - FEAT_LVA is implemented.
    - The 64KB translation granule is used.
  • Up to 48 bits, otherwise.
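
The rules above can be written as a small decision function. This is only a sketch of the list as stated here: the function name and its parameters are my own, and the feature flags are passed in by hand instead of being read from ID registers.

```python
def max_address_bits(granule_kb, feat_lpa2=False, tcr_ds=0, feat_lva=False):
    """Return the maximum address size in bits, following the rules listed above.

    All parameters are hypothetical inputs for illustration; a real VMM would
    derive them from the ID registers and TCR_ELx.
    """
    # Rule 1: FEAT_LPA2 + TCR_ELx.DS == 1 + 4KB or 16KB granule
    if feat_lpa2 and tcr_ds == 1 and granule_kb in (4, 16):
        return 52
    # Rule 2: FEAT_LVA + 64KB granule
    if feat_lva and granule_kb == 64:
        return 52
    # Rule 3: everything else
    return 48


print(max_address_bits(4))                            # plain 4KB granule
print(max_address_bits(4, feat_lpa2=True, tcr_ds=1))  # 4KB granule with FEAT_LPA2
```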

Address Size Configuration

Determining Physical Address Size

The ID_AA64MMFR0_EL1.PARange field indicates the implemented PA size. Its encodings are listed under ID_AA64MMFR0_EL1 in the Relevant Registers chapter below.

Configuring Input Address Size

TCR_ELx.TxSZ fields specify the input address size:

  • For a stage of translation that can support two VA ranges
    The TCR_ELx has two TxSZ fields, corresponding to the two VA ranges:
    - TCR_ELx.T0SZ specifies the size for the lower VA range, translated using TTBR0_ELx.
    - TCR_ELx.T1SZ specifies the size for the upper VA range, translated using TTBR1_ELx.
  • For a stage of translation that supports only a single input address (IA) range
    - The TCR_ELx has a single T0SZ field, and IAs are translated using TTBR0_ELx.
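
For our EL1 case, both fields can be pulled out of a raw TCR_EL1 value with a few shifts. The sketch below assumes the architectural field positions T0SZ = bits [5:0] and T1SZ = bits [21:16], and uses the range size 2^(64 - TxSZ) that is discussed later in this article. The function name and the sample register value are made up for illustration.

```python
def tcr_el1_va_ranges(tcr_el1):
    """Decode T0SZ/T1SZ from a raw TCR_EL1 value and derive both VA ranges."""
    t0sz = tcr_el1 & 0x3F           # TCR_EL1.T0SZ, bits [5:0]
    t1sz = (tcr_el1 >> 16) & 0x3F   # TCR_EL1.T1SZ, bits [21:16]
    lower_end = (1 << (64 - t0sz)) - 1              # last VA of the lower range
    upper_start = (1 << 64) - (1 << (64 - t1sz))    # first VA of the upper range
    return {
        "T0SZ": t0sz,
        "T1SZ": t1sz,
        "lower": (0, lower_end),
        "upper": (upper_start, (1 << 64) - 1),
    }


# Hypothetical register value with T0SZ = T1SZ = 16 (48-bit ranges).
ranges = tcr_el1_va_ranges((16 << 16) | 16)
print(hex(ranges["upper"][0]), hex(ranges["lower"][1]))
```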

Configuring Output Address Size

TCR_ELx.{I}PS must be programmed to the maximum output address size for a stage of translation.

Translation Regime

A translation regime comprises either:

  • A single stage of address translation.
    - This maps an input VA to an output PA.
  • Two, sequential, stages of address translation, where:
    - Stage 1 maps an input VA to an output IPA.
    - Stage 2 maps an input IPA to an output PA.

Translation Table Walks

The translation table walk is the set of lookups that are required to translate the VA to the PA.

The translation result includes:

  • The required PA
  • The memory attributes for the target memory region
  • The access permissions for the target memory regions

The following diagram generally describes a stage of a 3-level lookup:

Granule Size

VMSAv8-64 supports translation granule sizes of 4KB, 16KB, and 64KB.

The memory translation granule size defines both:

  • The maximum size of a single translation table.
  • The memory page size. That is, the granularity of a translation table lookup.

Identifying supported granule sizes

For stage 1 translation:

For the stage 1 translation, the supported granule sizes can be found by checking the ID_AA64MMFR0_EL1.TGran* fields (TGran4, TGran16 and TGran64).
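
As an illustration, the fields can be decoded from a raw register value as below. The bit positions (TGran16 = [23:20], TGran64 = [27:24], TGran4 = [31:28]) and encodings are taken from my reading of the architecture manual, so please check them against [1] before relying on them. The function name and the sample values are hypothetical.

```python
# Field encodings as I understand them from the architecture manual.
# Note that TGran16 uses 0b0000 for "not supported", unlike the other two.
TGRAN4 = {0b0000: "supported", 0b0001: "supported (52-bit)", 0b1111: "not supported"}
TGRAN16 = {0b0000: "not supported", 0b0001: "supported", 0b0010: "supported (52-bit)"}
TGRAN64 = {0b0000: "supported", 0b1111: "not supported"}


def stage1_granules(id_aa64mmfr0_el1):
    """Report stage 1 support for each granule size from ID_AA64MMFR0_EL1."""
    fields = {
        "4KB": ((id_aa64mmfr0_el1 >> 28) & 0xF, TGRAN4),
        "16KB": ((id_aa64mmfr0_el1 >> 20) & 0xF, TGRAN16),
        "64KB": ((id_aa64mmfr0_el1 >> 24) & 0xF, TGRAN64),
    }
    return {name: table.get(value, "reserved") for name, (value, table) in fields.items()}


# Hypothetical register value: TGran4 = 0, TGran64 = 0, TGran16 = 1.
print(stage1_granules(0x0010_0000))
```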

For stage 2 translation

Not covered here, because only stage 1 translation is needed in our scenario.

Effect Of Granule Size On Translation

The granule size affects the following aspects of address translation:

  • Page size
  • Page address range
  • Address bits resolved in one level of lookup
  • Maximum number of entries in a translation table

The table below lists how the granule size affects everything:
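
The numbers for each granule follow from one fact: a translation table descriptor is 8 bytes, and a table is at most one granule in size. A minimal sketch that derives them (the function name is my own):

```python
def granule_properties(granule_bytes):
    """Derive the per-granule numbers, assuming 8-byte table descriptors."""
    offset_bits = granule_bytes.bit_length() - 1   # log2 of the page size
    max_entries = granule_bytes // 8               # one table fills one granule
    bits_per_level = max_entries.bit_length() - 1  # log2 of the entry count
    return {
        "page_offset_bits": offset_bits,           # offset is IA[offset_bits-1:0]
        "bits_per_level": bits_per_level,
        "max_entries": max_entries,
    }


for kb in (4, 16, 64):
    print(f"{kb}KB:", granule_properties(kb * 1024))
```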

Granule Size’s Effect On IA Breaking Down

Take 4KB for example, a 52-bit IA should be broken down in this way:
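
In code, the breakdown for the 4KB granule could look like the sketch below. It assumes 12 offset bits and 9 index bits per level, with only 4 bits (IA[51:48]) left for level -1. The function name and the sample address are made up.

```python
def split_ia_4k(ia):
    """Split a 52-bit input address into per-level indexes (4KB granule)."""
    return {
        -1: (ia >> 48) & 0xF,     # IA[51:48], only 4 bits at level -1
        0: (ia >> 39) & 0x1FF,    # IA[47:39]
        1: (ia >> 30) & 0x1FF,    # IA[38:30]
        2: (ia >> 21) & 0x1FF,    # IA[29:21]
        3: (ia >> 12) & 0x1FF,    # IA[20:12]
        "offset": ia & 0xFFF,     # IA[11:0], offset inside the 4KB page
    }


# Hypothetical address built from known indexes so the result is easy to check.
ia = (0xA << 48) | (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x56
print(split_ia_4k(ia))
```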

Granule Size’s Effect On Translation Tables

The effect of granule size on TTBR_ELx and the translation table of different levels can be interpreted from this table:

x in the table above is the index of the least significant IA bit resolved at the current translation level.

Taking the 4KB granule size and a 48-bit IA as an example, the following information is obtained from the table:

  • The translation table address can be found at TTBR_ELx[47:12].
  • The address IA[47:12] are to be broken down for each level of translation.
  • On each level, 9 bits of IA[47:12] are resolved as the index into a translation table. For level 0, x = 39 and the index is IA[47:39] (47 = 39 + 8).
  • The address IA[11:0] is the offset inside a page.

Address Translation Process

With the background knowledge introduced above, now it’s time to take a deeper look at the details of the translation process.

In this chapter I will only describe the process of a stage 1 translation with 4KB granule.

Initial Lookup Level

Before going through the translation table, the first question to answer is how many levels to look up.

Generally the rule of the initial lookup level is:

  • If the input address range is bigger, more translation levels are needed to cover it, so the initial lookup level has a lower number;
  • Otherwise, if the input address range is smaller, fewer translation levels are needed, so the initial lookup level has a higher number.

For a stage 1 translation, the required initial lookup level is determined only by the required input address range specified by the corresponding TCR_ELx.TnSZ field.

Specifically, the range size is 2^(64 - TCR_ELx.TnSZ) bytes.

When using the 4KB translation granule, the relationship between the initial lookup level and TCR_ELx.TnSZ is described by the table:

Let me give an example of what the table means. For a translation in the upper VA range, the input address lies somewhere in 0xFFFF_0000_0000_0000 ~ 0xFFFF_FFFF_FFFF_FFFF. But the configured range does not have to cover that whole span, because it is very large (2⁴⁸ or 2⁵² bytes). The actual range size is 2^(64 - TCR_ELx.T1SZ) bytes.

For example, if TCR_ELx.T1SZ = 28, the actual input address range size is 2^(64 - 28) = 2³⁶ bytes. With a 4KB granule, 12 of those 36 bits are the page offset and each level resolves up to 9 bits, so the remaining 24 bits need 3 levels of translation tables. Levels -1 and 0 are not needed, and the initial lookup level is 1. This case matches the 3rd line of the table above.
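
The same arithmetic can be expressed as a short function. This is a sketch for the 4KB granule only, and it does not check whether a given TnSZ value is actually permitted on a particular implementation (that depends on features such as FEAT_LPA2 and FEAT_TTST). The function name is my own.

```python
def initial_lookup_level_4k(tnsz):
    """Initial lookup level for stage 1 with a 4KB granule, derived from TnSZ."""
    if not 12 <= tnsz <= 48:
        raise ValueError("TnSZ outside the range considered in this sketch")
    input_bits = 64 - tnsz                 # size of the input address range
    bits_to_resolve = input_bits - 12      # IA[11:0] is the page offset
    levels_needed = -(-bits_to_resolve // 9)   # ceiling division, 9 bits per level
    return 4 - levels_needed               # the last level is always level 3


print(initial_lookup_level_4k(28))  # 1, as in the example above
print(initial_lookup_level_4k(16))  # 0, for a 48-bit range
```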

Going Through Translation Table

Once the initial lookup level is identified, it’s time to go through the translation tables to find out the PA.

A simplified version of the walk looks like this:

  • The TTBR_ELx register holds the address of the first table to go through.
  • Some bits of the VA contain the index of an entry in that table, and the entry holds the address of the next table to go through.
  • Repeat the last step until reaching the level-3 table.
  • Some bits of the VA contain the index of an entry in the level-3 table. That entry holds the address of a physical page.
  • The lowest bits of the VA (12 bits for a 4KB page) are the offset inside the physical page. Combining the address of the physical page found in the last step with this offset gives the PA.
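
The steps above can be sketched as a small software walker. Several things here are my own assumptions and not part of the description above: the function and parameter names, the `read_u64` callback standing in for guest memory access, the descriptor encoding (bit 0 = valid, bit 1 = table/page, output address in bits [47:12]), and the handling of block descriptors, which end the walk before level 3. It also assumes 4KB-aligned tables and ignores 52-bit addresses, permissions and attributes.

```python
PAGE_SHIFT = 12                       # 4KB granule: IA[11:0] is the page offset
INDEX_BITS = 9                        # 512 eight-byte descriptors per table
ADDR_MASK = 0x0000_FFFF_FFFF_F000     # descriptor bits [47:12]


def walk_stage1_4k(read_u64, ttbr, va, tnsz):
    """Translate `va`; `read_u64(addr)` returns the 64-bit descriptor at `addr`."""
    input_bits = 64 - tnsz
    ia = va & ((1 << input_bits) - 1)             # drop the all-ones/all-zeros top bits
    levels = -(-(input_bits - PAGE_SHIFT) // INDEX_BITS)   # ceiling division
    table = ttbr & ADDR_MASK                      # first table comes from TTBR_ELx
    for level in range(4 - levels, 4):
        shift = PAGE_SHIFT + INDEX_BITS * (3 - level)
        index = (ia >> shift) & ((1 << INDEX_BITS) - 1)
        desc = read_u64(table + index * 8)
        if not desc & 0b01:
            raise LookupError(f"invalid descriptor at level {level}")
        if level < 3 and desc & 0b10:
            table = desc & ADDR_MASK              # table descriptor: go one level down
            continue
        if level == 3 and not desc & 0b10:
            raise LookupError("reserved descriptor encoding at level 3")
        offset_mask = (1 << shift) - 1            # page (level 3) or block (level < 3)
        return (desc & ADDR_MASK & ~offset_mask) | (ia & offset_mask)


# Tiny fake memory: a 3-level walk (T1SZ = 28) ending in a page at 0x8000_5000.
memory = {
    0x4000_0000 + 0 * 8: 0x4000_1000 | 0b11,   # level 1, index 0 -> level 2 table
    0x4000_1000 + 1 * 8: 0x4000_2000 | 0b11,   # level 2, index 1 -> level 3 table
    0x4000_2000 + 1 * 8: 0x8000_5000 | 0b11,   # level 3, index 1 -> page
}
pa = walk_stage1_4k(lambda addr: memory.get(addr, 0), 0x4000_0000, 0xFFFF_FFF0_0020_1ABC, 28)
print(hex(pa))
```

Masking the VA down to the configured input size before indexing matters: at the initial level fewer than 9 index bits may be in use, and the sign-extension bits of an upper-range address must not leak into the index.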

Selecting TTBR_ELx

When only one VA range is supported, TTBR0_ELx must be used for address translation.

But when two VA ranges are supported, the correct TTBR_ELx register (TTBR0_ELx or TTBR1_ELx) needs to be selected:

  • TTBR0_ELx points to the initial translation table for the lower VA range:
    (48 bits VA) 0x0000_0000_0000_0000 ~ 0x0000_FFFF_FFFF_FFFF
    (52 bits VA) 0x0000_0000_0000_0000 ~ 0x000F_FFFF_FFFF_FFFF
  • TTBR1_ELx points to the initial translation table for the upper VA range:
    (48 bits VA) 0xFFFF_0000_0000_0000 ~ 0xFFFF_FFFF_FFFF_FFFF
    (52 bits VA) 0xFFF0_0000_0000_0000 ~ 0xFFFF_FFFF_FFFF_FFFF

So, which TTBR_ELx is used depends only on the VA presented for translation. The most significant bits of the VA must all be the same value and:

  • If the most significant bits of the VA are zero, then TTBR0_ELx is used.
  • If the most significant bits of the VA are one, then TTBR1_ELx is used.
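
A minimal sketch of this selection, assuming both ranges use the same VA width and that no address tagging is in use (the function name and the `va_bits` parameter are my own):

```python
def select_ttbr(va, va_bits=48):
    """Pick the TTBR for a 64-bit VA by looking at the bits above `va_bits`."""
    top = va >> va_bits                       # the most significant bits
    all_ones = (1 << (64 - va_bits)) - 1
    if top == 0:
        return "TTBR0_ELx"
    if top == all_ones:
        return "TTBR1_ELx"
    raise ValueError("most significant bits are not all the same value")


print(select_ttbr(0xFFFF_0000_0000_0000))      # kernel-space address
print(select_ttbr(0x0000_FFFF_FFFF_FFFF))      # top of the lower range
```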

Calculating Table Entry Address

Now let’s take a step further to see how the address of a table entry is calculated.

Taking the 4KB granule as an example, the table below gives all the information needed for the calculation.

  • BaseAddr: The base address for the level of lookup, as defined by:
    For the initial lookup level, the value of the appropriate TTBR_ELx.BADDR field.
    Otherwise, the translation table address returned by the previous level of lookup.
  • PAMax: The supported PA width, in bits.
  • Symbols in the calculation of stage 2 are ignored.
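
For stage 1 with the 4KB granule, my understanding is that the entry address is the base address with the 9-bit index placed in bits [11:3], because each entry is 8 bytes. A hedged sketch follows, with a made-up function name. It expects an IA already reduced to the configured input size and a 4KB-aligned table, and it ignores PAMax.

```python
def descriptor_address_4k(base_addr, ia, level):
    """Address of the table entry for `ia` at `level` (stage 1, 4KB granule)."""
    x = 12 + 9 * (3 - level)            # least significant IA bit for this level
    index = (ia >> x) & 0x1FF           # IA[x+8:x]
    return (base_addr & ~0xFFF) | (index << 3)   # 8 bytes per entry


# Hypothetical values: table at 0x4000_1000, IA 0x20_1ABC, level 2 lookup.
print(hex(descriptor_address_4k(0x4000_1000, 0x20_1ABC, 2)))
```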

Relevant Registers

This chapter briefly introduces the registers involved in our translation use case.

TCR_EL1

Translation Control Register (EL1)

The control register for stage 1 of the EL1&0 translation regime.

TTBR1_EL1

Translation Table Base Register 1 (EL1)

Holds the base address of the translation table for the initial lookup for stage 1 of the translation of an address from the higher VA range in the EL1&0 stage 1 translation regime, and other information for this translation regime.

ID_AA64MMFR0_EL1

AArch64 Memory Model Feature Register 0

Provides information about the implemented memory model and memory management support in AArch64 state.

PARange, bits [3:0] - Physical Address range supported:

  • 0b0000 32 bits, 4GB.
  • 0b0001 36 bits, 64GB.
  • 0b0010 40 bits, 1TB.
  • 0b0011 42 bits, 4TB.
  • 0b0100 44 bits, 16TB.
  • 0b0101 48 bits, 256TB.
  • 0b0110 52 bits, 4PB (for ARMv8.2-LPA only).
  • All other values are reserved.
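
The list above maps directly onto a lookup table. A small sketch (the names and the sample register value are mine):

```python
# PARange encodings from the list above: field value -> PA size in bits.
PARANGE_BITS = {
    0b0000: 32, 0b0001: 36, 0b0010: 40, 0b0011: 42,
    0b0100: 44, 0b0101: 48, 0b0110: 52,
}


def pa_bits(id_aa64mmfr0_el1):
    """Return the implemented PA size in bits from ID_AA64MMFR0_EL1.PARange."""
    parange = id_aa64mmfr0_el1 & 0xF    # PARange, bits [3:0]
    if parange not in PARANGE_BITS:
        raise ValueError(f"reserved PARange value {parange:#06b}")
    return PARANGE_BITS[parange]


print(pa_bits(0x0000_1125))   # hypothetical register value, PARange = 0b0101
```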

Reference
