ARM SMMU v2

1 Registers
Refer to the SMMU v2 datasheet.

2 SA8155 32bit SMMU v2
15-bit Stream ID; supports 95 Stream Match Registers (SMR).

2.1 FT4232
The third port of the 4-port FT4232 is used for the console, and SW4 is used for EDL mode.
reboot -f

2.2 Page 0
4KB Page
SMMU_GR0_BASE = 0x15000000
IDR0 = SMMU_GR0_BASE + 0x20
IDR1 = SMMU_GR0_BASE + 0x24

SMMU_SMR0 = SMMU_GR0_BASE + 0x800
SMMU_SMR1 = SMMU_GR0_BASE + 0x804
...
SMMU_SMR127 = SMMU_GR0_BASE + 0x9FC

SMMU_S2CR0 = SMMU_GR0_BASE + 0xC00
SMMU_S2CR1 = SMMU_GR0_BASE + 0xC04
...
SMMU_S2CR127 = SMMU_GR0_BASE + 0xDFC
SMMU_S2CRn.bit[17:16] (the TYPE field) selects how the stream is handled; setting it to 0b01 enables bypass mode.
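The offsets above follow a simple pattern (base + block offset + 4 x n), so they can be sketched in a few lines. This is only an illustration: the helper names are made up for this note, and the TYPE encodings (0b00 translation, 0b01 bypass, 0b10 fault) are taken from the SMMU v2 architecture as I understand it, so check them against the datasheet.

```python
SMMU_GR0_BASE = 0x15000000  # SA8155 value from the notes above

def smr_addr(n):
    """Address of SMMU_SMRn: 4-byte registers starting at GR0 + 0x800."""
    return SMMU_GR0_BASE + 0x800 + 4 * n

def s2cr_addr(n):
    """Address of SMMU_S2CRn: 4-byte registers starting at GR0 + 0xC00."""
    return SMMU_GR0_BASE + 0xC00 + 4 * n

# Assumed encoding of S2CR.TYPE (bit[17:16]); verify against the datasheet.
S2CR_TYPE_NAMES = {0b00: "translation", 0b01: "bypass",
                   0b10: "fault", 0b11: "reserved"}

def s2cr_type(value):
    """Decode bit[17:16] of a raw S2CR register value."""
    return S2CR_TYPE_NAMES[(value >> 16) & 0x3]

print(hex(smr_addr(127)))      # 0x150009fc
print(hex(s2cr_addr(127)))     # 0x15000dfc
print(s2cr_type(0x00010000))   # bypass
```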

2.3 Page 1
4KB Page
SMMU_GR0_BASE = 0x15000000
SMMU_CBA2R0 = SMMU_GR0_BASE + PAGESIZE + 0x800
SMMU_CBA2R1 = SMMU_GR0_BASE + PAGESIZE + 0x804
...
SMMU_CBA2R127 = SMMU_GR0_BASE + PAGESIZE + 0x9FC

2.4 Context Banks
4KB Page. In SMMU v2, each context bank has its own IRQ pin.
For comparison, x86 PCIe uses the bus number to find the corresponding root_entry in the Root table, and then uses devfn to find the corresponding context_entry in the Context table (256 entries).
SMMU_GR0_BASE = 0x15000000
SMMU_CB_BASE = SMMU_GR0_BASE + (NUMPAGE x PAGESIZE)
PAGESIZE is given by IDR1.bit31 (0 = 4KB, 1 = 64KB)
NUMPAGE = 2^(IDR1.bit[30:28] + 1)
SMMU_CB_BASE = 0x15000000 + 128 x 4KB = 0x15080000
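A minimal sketch of the base calculation above, assuming IDR1.bit31 encodes the page size (0 = 4KB, 1 = 64KB) and IDR1.bit[30:28] encodes NUMPAGE as 2^(field + 1). The function name and the sample IDR1 value 0x60000000 are hypothetical; the sample was chosen only because it reproduces the 128-page, 4KB case in these notes.

```python
def cb_base(gr0_base, idr1):
    """Return (context bank base, page size, page count) derived from IDR1."""
    pagesize = 0x10000 if (idr1 >> 31) & 0x1 else 0x1000  # bit31
    numpagendxb = (idr1 >> 28) & 0x7                      # bit[30:28]
    numpage = 1 << (numpagendxb + 1)                      # 2^(field + 1)
    return gr0_base + numpage * pagesize, pagesize, numpage

# Hypothetical IDR1 value: bit31 = 0 (4KB), bit[30:28] = 6 (128 pages).
base, pagesize, numpage = cb_base(0x15000000, 0x60000000)
print(hex(base), hex(pagesize), numpage)  # 0x15080000 0x1000 128
```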

SMMU_CBn_TTBR0 = SMMU_CB_BASE + n x PAGESIZE + 0x20
SMMU_CBn_TTBR1 = SMMU_CB_BASE + n x PAGESIZE + 0x28
SMMU_CBn_TCR = SMMU_CB_BASE + n x PAGESIZE + 0x30
SMMU_CBn_FSYNR0 = SMMU_CB_BASE + n x PAGESIZE + 0x68; bit4 (WNR) identifies whether the faulting DMA access was a read from memory (0) or a write to memory (1)
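The per-bank register addresses can be computed the same way. The sketch below uses the SA8155 values from this section; the helper names are invented for illustration, and bit4 of FSYNR0 is treated as the write-not-read flag (1 = write).

```python
SMMU_CB_BASE = 0x15080000  # computed above for SA8155
PAGESIZE = 0x1000          # 4KB per context bank

# Register offsets inside one context bank page, as listed above.
CB_REGS = {"TTBR0": 0x20, "TTBR1": 0x28, "TCR": 0x30, "FSYNR0": 0x68}

def cb_reg_addr(n, name):
    """Address of register `name` in context bank n."""
    return SMMU_CB_BASE + n * PAGESIZE + CB_REGS[name]

def fault_direction(fsynr0):
    """Return 'write' if FSYNR0 bit4 is set, otherwise 'read'."""
    return "write" if fsynr0 & (1 << 4) else "read"

print(hex(cb_reg_addr(32, "FSYNR0")))  # 0x150a0068
print(fault_direction(0x10))           # write
```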

2.5 SMMU crash debugging
2.5.1 overlayFS
mount -t overlay -o lowerdir=/,upperdir=/tmp/upper,workdir=/tmp/workdir none /

2.5.2 showcase
arm-smmu 15000000.apps-smmu: FAR = 0x00000000284d3b0f
arm-smmu 15000000.apps-smmu: PAR = 0x0000000000000000
arm-smmu 15000000.apps-smmu: FSR = 0x40000402 [TF W SS ]
cb=32, SID=0x3c0 (SA8155)
cb=33, SID=0x7c0 (SA8195)
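To read an FSR value like the one above by hand, it helps to map the set bits to flag names. This is a rough sketch: the bit positions below are the FSR flags as I recall them from the SMMU v2 register description (TF = bit1, SS = bit30, and so on) and should be checked against the datasheet. Bits not in the table (such as bit10 in this log) are simply ignored here.

```python
# Assumed FSR flag positions; verify against the SMMU v2 datasheet.
FSR_BITS = [
    (1, "TF"), (2, "AFF"), (3, "PF"), (4, "EF"), (5, "TLBMCF"),
    (6, "TLBLKF"), (7, "ASF"), (8, "UUT"), (30, "SS"), (31, "MULTI"),
]

def decode_fsr(fsr):
    """Return the names of the known flags set in a raw FSR value."""
    return [name for bit, name in FSR_BITS if fsr & (1 << bit)]

print(decode_fsr(0x40000402))  # ['TF', 'SS']
```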

2.5.3 panic_notifier_list
register_die_notifier() registers a notifier for ARM64 SMMU crashes.

2.5.4 objdump
In ARM64 assembly, x0 - x7 pass the first through eighth function arguments; any further arguments are passed on the stack.
arm_smmu_context_fault+0xcf4/0xcf8
The first value, 0xcf4, is the offset of the instruction from the entry of arm_smmu_context_fault; the second value, 0xcf8, is the total size of the function.
aarch64-poky-linux-objdump -d -S vmlinux > vmlinux.asm
aarch64-poky-linux-objdump -d -S xxx.ko > xxx.asm
gdb vmlinux
l *func_name+offset_addr (l is short for list; it shows the source line at that address)
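When working through many backtraces, the symbol+offset/size strings can be split mechanically and turned into gdb list commands. A small sketch, with a made-up helper name; it assumes the "name+0xoffset/0xsize" form shown above.

```python
import re

def parse_symbol(text):
    """Split 'func+0xoff/0xsize' into (name, offset, size)."""
    m = re.fullmatch(r"([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)",
                     text.strip())
    if m is None:
        raise ValueError("not a symbol+offset/size string: %r" % text)
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)

name, offset, size = parse_symbol("arm_smmu_context_fault+0xcf4/0xcf8")
print(name, hex(offset), hex(size))
# Command to paste into gdb after loading vmlinux:
print("list *(%s+%s)" % (name, hex(offset)))
```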

3 ARM64 memory barrier
ARM64 cores hold the data of Store instructions in a Store Buffer (not a FIFO), which is separate from the L1 Data Cache. The caches integrated in the same CPU cluster are called Inner Shareable; the caches shared by all CPU clusters are called Outer Shareable.
LD: load-load/load-store
ST: store-store
SY: Full system, reads and writes
ISH: Inner Shareable, reads and writes
ISHLD: Inner Shareable Load, load-load/load-store
ISHST: Inner Shareable Store, store-store
OSH: Outer Shareable, reads and writes

4 Linux ARM64 39-bit VA
4.1 39-bit VA Layout
For kernel addresses the upper 25 bits (VA[63:39]) are all 1, giving the prefix FFFF_FF8 (translated through TTBR1).
User space:
0x0000_0000_0000_0000
0x0000_007F_FFFF_FFFF

Kernel space:
0xFFFF_FF80_0000_0000
0xFFFF_FFFF_FFFF_FFFF
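The two ranges above can be told apart by looking only at the upper 25 bits of the address. A minimal sketch (the function name is illustrative only):

```python
VA_BITS = 39

def classify(va):
    """Classify a 64-bit virtual address under the 39-bit VA layout."""
    top = va >> VA_BITS                  # upper 25 bits, VA[63:39]
    if top == 0:
        return "user"                    # translated through TTBR0
    if top == (1 << (64 - VA_BITS)) - 1:
        return "kernel"                  # translated through TTBR1
    return "invalid"                     # falls in the hole between ranges

print(classify(0x0000007FFFFFFFFF))  # user
print(classify(0xFFFFFF8000000000))  # kernel
print(classify(0x0000008000000000))  # invalid
```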

4.2 MMU
CONFIG_ARM64_PA_BITS_48
CONFIG_ARM64_VA_BITS_39
CONFIG_ARM64_VA_BITS_48
CONFIG_PGTABLE_LEVELS
With a 3-level page table, every table is 4KB and holds 512 entries of 8 bytes each. Each level uses 9 bits of the VA to index its entry, and VA[11:0] is the byte offset within the 4KB page.

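entry offset address = entry_index * 8. Entry bit[1:0] gives the descriptor type: 0b11 is a Table descriptor (a Page descriptor at the last level), 0b01 is a Block entry, and bit0 = 0 marks the entry invalid.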
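As a worked example of the indexing above, the sketch below splits a VA into its three 9-bit indexes and the 12-bit page offset, and classifies an entry by bit[1:0]. The function names, the level numbering (1 to 3, with 3 as the last level) and the sample address 0xFFFFFF8008081000 are all assumptions made for illustration.

```python
def split_va(va):
    """Split a 39-bit VA into per-level indexes and the page offset."""
    return {
        "pgd": (va >> 30) & 0x1FF,   # VA[38:30]
        "pmd": (va >> 21) & 0x1FF,   # VA[29:21]
        "pte": (va >> 12) & 0x1FF,   # VA[20:12]
        "offset": va & 0xFFF,        # VA[11:0]
    }

def entry_offset(index):
    """Byte offset of an entry inside its 4KB table (8 bytes per entry)."""
    return index * 8

def entry_type(desc, level):
    """Classify an 8-byte entry by bit[1:0]; level 3 is the last level."""
    low = desc & 0x3
    if low & 0x1 == 0:
        return "invalid"
    if low == 0b11:
        return "page" if level == 3 else "table"
    return "block" if level < 3 else "invalid"  # 0b01 is reserved at level 3

parts = split_va(0xFFFFFF8008081000)   # hypothetical kernel VA
print(parts)                           # pgd 0, pmd 64, pte 129, offset 0
print(hex(entry_offset(parts["pte"])))  # 0x408
```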

4.3 crash
https://www.kernel.org/
mainline - summary - Clone
git log --pretty=oneline
When a VA causes a crash, the kernel prints the 8-byte entry value (which holds a physical address) found at each level of the 3-level page table for that VA.

5 Abbreviations
CBAR: Context Bank Attribute Registers
S2CR: Stream-to-Context Register
SMR: Stream Match Register
TCR: Translation Control Register
TTBR: Translation Table Base Register, a per-CPU register; the x86_64 equivalent is CR3
PGD: Page Global Directory, indexed by VA[38:30]; its base comes from TTBR0 if bit63 = 0 (user space) or from TTBR1 if bit63 = 1 (kernel space)
SMMU PRI: Page Request Interface
