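When debugging a program, and kernel modules in particular, we often need to inspect specific memory addresses, for example to check whether the value at an address has changed. Knowing how each process's address space is laid out, and especially how virtual addresses map to physical addresses, can make this kind of debugging easier. With that in mind, we looked up material online about the kernel address space and developed a kernel module to address the problem. The module can list the virtual address ranges and attributes of both kernel space and a user program's address space, and it can look up the physical address behind a given virtual address. Note that it only works on arm64 with kernels below linux-5.5; by linux-5.10 the kernel address space layout has changed considerably. The code is on GitHub: hengtianzhang/pid_page_tables: find task page tables (arm64)
To build the module, point KDIR in the Makefile at the root of the git repository to your own linux kernel source directory, then run make to produce pid_page_tables.ko. Note that this ko only works on the arm64 architecture.
Run echo kernel > /sys/kernel/debug/pid_page_tables; part of the output is shown below: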
root@xxxxx:~# insmod pid_page_tables.ko
root@xxxxx:~# echo kernel > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[Kernel space]-----
---[ Modules start ]---
0xffff000000d40000-0xffff000000d41000 4K PTE ro x SHD AF NG UXN MEM/NORMAL
0xffff000000d41000-0xffff000000d42000 4K PTE ro NX SHD AF NG UXN MEM/NORMAL
0xffff000000d42000-0xffff000000d44000 8K PTE RW NX SHD AF NG UXN MEM/NORMAL
---[ Modules end ]---
---[ vmalloc() Area ]---
0xffff000008000000-0xffff000008001000 4K PTE RW NX SHD AF NG UXN DEVICE/nGnRE
0xffff000008002000-0xffff000008004000 8K PTE RW NX SHD AF NG UXN DEVICE/nGnRE
0xffff000008005000-0xffff000008006000 4K PTE RW NX SHD AF NG UXN DEVICE/nGnRE
0xffff000008007000-0xffff000008008000 4K PTE RW NX SHD AF NG UXN DEVICE/nGnRE
...
0xffff7dfffee00000-0xffff7dfffee10000 64K PTE RW NX SHD AF NG UXN DEVICE/nGnRE
---[ PCI I/O end ]---
---[ vmemmap start ]---
0xffff7e0000000000-0xffff7e0001000000 16M PMD RW NX SHD AF NG BLK UXN MEM/NORMAL
0xffff7e0002000000-0xffff7e0004000000 32M PMD RW NX SHD AF NG BLK UXN MEM/NORMAL
0xffff7e0006000000-0xffff7e0008000000 32M PMD RW NX SHD AF NG BLK UXN MEM/NORMAL
---[ vmemmap end ]---
---[ Linear Mapping ]---
0xffff800018000000-0xffff800018100000 1M PTE RW NX SHD AF NG CON UXN MEM/NORMAL
0xffff800080000000-0xffff800084000000 64M PMD RW NX SHD AF NG CON BLK UXN MEM/NORMAL
0xffff800084000000-0xffff800084e00000 14M PMD ro NX SHD AF NG BLK UXN MEM/NORMAL
0xffff800084e00000-0xffff800084f00000 1M PTE ro NX SHD AF NG UXN MEM/NORMAL
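As shown above, after a mode is written to /sys/kernel/debug/pid_page_tables with echo, cat /sys/kernel/debug/pid_page_tables displays the attributes of every mapped range in the address space.
Run echo kernel addr > /sys/kernel/debug/pid_page_tables, where addr is a kernel virtual address; the output looks like this: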
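When comparing dumps taken at different times, it can help to turn each mapping line into structured data. The following is a minimal sketch, assuming the line format seen in the sample output (address range, size, page-table level, then attribute flags); the names parse_mapping_line and LINE_RE are hypothetical helpers and are not part of the module.

```python
import re

# Pattern for one mapping line as it appears in the sample output:
# "<start>-<end> <size><unit> <level> <attributes...>"
LINE_RE = re.compile(
    r"^(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\s+(\d+)([KMG])\s+(\S+)\s+(.*)$"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_mapping_line(line):
    """Split one mapping line into fields; return None for marker lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    start, end = int(m.group(1), 16), int(m.group(2), 16)
    return {
        "start": start,
        "end": end,
        "size": int(m.group(3)) * UNIT[m.group(4)],
        "level": m.group(5),
        "attrs": m.group(6).split(),
    }


entry = parse_mapping_line(
    "0xffff000000d42000-0xffff000000d44000 8K PTE RW NX SHD AF NG UXN MEM/NORMAL"
)
# The printed size should equal the length of the address range.
print(entry["level"], entry["attrs"], entry["size"] == entry["end"] - entry["start"])
```

Section markers such as ---[ Modules end ]--- do not match the pattern and yield None, so a caller can skip them or use them to track the current region.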
root@xxxxx:~#
root@xxxxx:~# echo kernel 0xffff00000a634000 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[Kernel space]-----
Find virt addr: 0xffff00000a634000
Find result: I/O MEM or Resvered RAM [0x0000000031000000] No Mem Map
root@xxxxx:~#
root@xxxxx:~# echo kernel 0xffff8000c0800000 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[Kernel space]-----
Find virt addr: 0xffff8000c0800000
Find result: System RAM [0x00000000c0800000] Mem Map
root@xxxxx:~# echo kernel 0xffff005 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[Kernel space]-----
Find virt addr: 0x000000000ffff005
Find result: (null)
root@xxxxx:~# echo kernel 0xffff00000a611000 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[Kernel space]-----
Find virt addr: 0xffff00000a611000
Find result: top [0x0000000033001000] No Mem Map
root@xxxxx:~#
root@xxxxx:~#
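As shown above, after a virtual address is written, cat displays the corresponding physical address and whether it belongs to memory or to a peripheral (I/O) region.
In addition, for regions such as the linear mapping and Fixmap, a virtual address may appear in the listing while no physical address is found for it. This happens either because the memory is marked no-map in the device tree and therefore lies outside the kernel's mappings, or because part of the fixmap region is a temporary mapping area used for mapping page tables; that part is only mapped down to the pmd level, so no pte can be found either.
System RAM managed by the kernel is marked Mem Map; everything else is marked No Mem Map.
Run echo pid 587 > /sys/kernel/debug/pid_page_tables; part of the output is shown below: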
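To see what "mapped only down to the pmd" means in terms of the address itself, the sketch below splits a virtual address into per-level table indices. It assumes 4K pages and a four-level, 48-bit layout (pgd, pud, pmd, pte with 9 bits each), which is consistent with the 4K PTE entries in the sample output but is not confirmed by the module source; table_indices is a hypothetical helper.

```python
PAGE_SHIFT = 12        # assumed 4K pages
BITS_PER_LEVEL = 9     # 512 entries per table with 4K pages
LEVELS = ("pgd", "pud", "pmd", "pte")


def table_indices(vaddr):
    """Return the table index at each level plus the in-page offset."""
    result = {}
    for i, name in enumerate(LEVELS):
        # pgd uses bits 47:39, pud 38:30, pmd 29:21, pte 20:12
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (len(LEVELS) - 1 - i)
        result[name] = (vaddr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    result["offset"] = vaddr & ((1 << PAGE_SHIFT) - 1)
    return result


print(table_indices(0xffff00000a634000))
```

A lookup has to find a valid entry at every level to reach a pte. If the walk stops at the pmd because no pte table was installed there, the range exists at the virtual level but there is no page-level entry from which to read a physical address.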
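The lookup results can also be checked by hand for addresses in the linear mapping. The sketch below parses a Find result line and compares it with the usual linear-map arithmetic. PAGE_OFFSET and PHYS_OFFSET here are assumptions read off the sample output (the linear mapping starts at 0xffff800000000000 and the sample lookup implies a zero physical offset); they will differ on other boards and kernel configurations, and both function names are hypothetical.

```python
import re

# Assumptions taken from the sample output only, not from the module source.
PAGE_OFFSET = 0xffff800000000000  # start of the linear mapping in the listing
PHYS_OFFSET = 0x0                 # the sample lookup implies a zero offset

RESULT_RE = re.compile(
    r"^Find result:\s+(.*?)\s+\[(0x[0-9a-fA-F]+)\]\s+(No Mem Map|Mem Map)$"
)


def parse_find_result(line):
    """Extract region kind, physical address and Mem Map flag from a result line."""
    m = RESULT_RE.match(line.strip())
    if m is None:
        return None  # e.g. "Find result: (null)" when nothing is mapped
    return {
        "kind": m.group(1),
        "phys": int(m.group(2), 16),
        "memmap": m.group(3) == "Mem Map",
    }


def linear_virt_to_phys(vaddr):
    """Linear-map translation under the assumed offsets above."""
    return (vaddr - PAGE_OFFSET) + PHYS_OFFSET


res = parse_find_result("Find result: System RAM [0x00000000c0800000] Mem Map")
print(res["kind"], res["memmap"], res["phys"] == linear_virt_to_phys(0xffff8000c0800000))
```

This arithmetic only applies to the linear mapping. Addresses in the vmalloc or module areas, such as 0xffff00000a634000 in the sample, have no fixed offset and need a real page-table walk, which is what the module performs.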
root@xxxxx:~#
root@xxxxx:~# echo pid 587 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[User space]-----
Find pid comm: a.out
Real address space distribution:
Text code: 0x0000aaaad9a1e000-0x0000aaaad9a1e99c
Data: 0x0000aaaad9a2ed78-0x0000aaaad9a2f010
Brk (heap): 0x0000aaaaf88b6000-0x0000aaaaf88d7000
Mmap (logic): 0x0000ffffb688e000-0x0000ffffb66f3000
Stack top: 0x0000ffffc59fe320
Arg: 0x0000ffffc59febab-0x0000ffffc59febb3
Env: 0x0000ffffc59febb3-0x0000ffffc59feff0
---[ Start code ]---
0x0000aaaad9a1e000-0x0000aaaad9a1f000 4K PTE USR ro NX SHD AF NG MEM/NORMAL
---[ End code ]---
---[ Start data ]---
0x0000aaaad9a2e000-0x0000aaaad9a2f000 4K PTE USR ro NX SHD AF NG UXN MEM/NORMAL
0x0000aaaad9a2f000-0x0000aaaad9a30000 4K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
---[ End data ]---
---[ Start brk (heap) ]---
0x0000aaaaf88b6000-0x0000aaaaf88b8000 8K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
---[ End brk (heap) ]---
---[ Mmap end ]---
0x0000ffffb66f3000-0x0000ffffb66f4000 4K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb66f5000-0x0000ffffb671d000 160K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb671e000-0x0000ffffb6730000 72K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb6750000-0x0000ffffb677e000 184K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb677f000-0x0000ffffb6780000 4K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb6790000-0x0000ffffb67a0000 64K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb67b0000-0x0000ffffb67c8000 96K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb67c9000-0x0000ffffb67d0000 28K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb67f0000-0x0000ffffb6800000 64K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb6810000-0x0000ffffb6820000 64K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb6853000-0x0000ffffb6856000 12K PTE USR ro NX SHD AF NG UXN MEM/NORMAL
0x0000ffffb6856000-0x0000ffffb685a000 16K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
0x0000ffffb685b000-0x0000ffffb685c000 4K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
0x0000ffffb685c000-0x0000ffffb687b000 124K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb6887000-0x0000ffffb6889000 8K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
0x0000ffffb688a000-0x0000ffffb688b000 4K PTE USR ro NX SHD AF NG MEM/NORMAL
0x0000ffffb688b000-0x0000ffffb688c000 4K PTE USR ro NX SHD AF NG UXN MEM/NORMAL
0x0000ffffb688c000-0x0000ffffb688e000 8K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
---[ Mmap base ]---
---[ Misc start ]---
0x0000ffffc59fc000-0x0000ffffc59ff000 12K PTE USR RW NX SHD AF NG UXN MEM/NORMAL
---[ Misc end ]---
root@xxxxx:~#
root@xxxxx:~#
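As shown above, the output indicates whether the task is a kernel task or a user task, prints the task name, and shows the layout of the task's segments.
Note that a kernel task has no mm_struct of its own, so the segments above do not exist for it; it shares the kernel address space and borrows the mm structure of the previous user task.
Run echo pid 587 0xaaaad9a1e000 > /sys/kernel/debug/pid_page_tables; part of the output is shown below: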
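The segment summary at the top of the per-process output is easy to reuse when deciding which address to query next. Below is a minimal sketch that reads the "name: start-end" lines and reports which segment contains a given address. It assumes the line format of the sample output; parse_segments and find_segment are hypothetical helpers. The Mmap (logic) line lists the higher address first, so each range is normalized to (low, high).

```python
import re

# Matches summary lines such as "Brk (heap): 0x...-0x..."
RANGE_RE = re.compile(r"^(.+?):\s+(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)$")


def parse_segments(lines):
    """Collect named address ranges from the summary lines."""
    segments = {}
    for line in lines:
        m = RANGE_RE.match(line.strip())
        if m:
            a, b = int(m.group(2), 16), int(m.group(3), 16)
            # Mmap is printed as base-end with base above end; normalize.
            segments[m.group(1)] = (min(a, b), max(a, b))
    return segments


def find_segment(segments, addr):
    """Return the name of the first segment containing addr, or None."""
    for name, (lo, hi) in segments.items():
        if lo <= addr < hi:
            return name
    return None


sample = [
    "Find pid comm: a.out",
    "Text code: 0x0000aaaad9a1e000-0x0000aaaad9a1e99c",
    "Data: 0x0000aaaad9a2ed78-0x0000aaaad9a2f010",
    "Brk (heap): 0x0000aaaaf88b6000-0x0000aaaaf88d7000",
    "Mmap (logic): 0x0000ffffb688e000-0x0000ffffb66f3000",
    "Stack top: 0x0000ffffc59fe320",
]
segs = parse_segments(sample)
print(find_segment(segs, 0xaaaad9a1e000), find_segment(segs, 0xaaaaf88b7000))
```

Lines without a range, such as the comm line and Stack top, are ignored by the pattern.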
root@xxxxx:~#
root@xxxxx:~# echo pid 587 0xaaaad9a1e000
pid 587 0xaaaad9a1e000
root@xxxxx:~# echo pid 587 0xaaaad9a1e000 > /sys/kernel/debug/pid_page_tables
root@xxxxx:~# cat /sys/kernel/debug/pid_page_tables
-----[User space]-----
Find pid comm: a.out
Real address space distribution:
Text code: 0x0000aaaad9a1e000-0x0000aaaad9a1e99c
Data: 0x0000aaaad9a2ed78-0x0000aaaad9a2f010
Brk (heap): 0x0000aaaaf88b6000-0x0000aaaaf88d7000
Mmap (logic): 0x0000ffffb688e000-0x0000ffffb66f3000
Stack top: 0x0000ffffc59fe320
Arg: 0x0000ffffc59febab-0x0000ffffc59febb3
Env: 0x0000ffffc59febb3-0x0000ffffc59feff0
Find virt addr: 0x0000aaaad9a1e000
Find result: System RAM [0x00000001e450e000] Mem Map
root@xxxxx:~#
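The output has the same form as for kernel space. Also note that if the virtual address you look up lies within Brk but has no corresponding physical address, the likely reason is that the application has requested the space from the kernel but not yet used it: the kernel has only reserved the virtual range and has not yet filled in a physical page, which happens through a page fault only when the application actually accesses the memory.
Run echo pid 1 mmap > /sys/kernel/debug/pid_page_tables; part of the output is shown below: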
root@a1000:/sys/kernel/debug#
root@a1000:/sys/kernel/debug#
root@a1000:/sys/kernel/debug# echo pid 1 mmap > pid_page_tables
root@a1000:/sys/kernel/debug# cat pid_page_tables
-----[User space]-----
Find pid comm: systemd
Real address space distribution:
Text code: 0x0000000000400000-0x0000000000501a8c
Data: 0x00000000005122b8-0x000000000054e458
Brk (heap): 0x000000002819c000-0x00000000282cc000
Mmap (logic): 0x0000ffffb1f81000-0x0000ffffa4000000
Stack top: 0x0000ffffc49fc980
Arg: 0x0000ffffc49fcfa8-0x0000ffffc49fcfb3
Env: 0x0000ffffc49fcfb3-0x0000ffffc49fcfed
0x0000000000400000-0x0000000000502000 Mmap file name [systemd]
Flags raw: [0x0000000000000875] READ EXEC MAYREAD MAYWRITE MAYEXEC DENYWRITE
0x0000000000512000-0x000000000054e000 Mmap file name [systemd]
Flags raw: [0x0000000000100871] READ MAYREAD MAYWRITE MAYEXEC DENYWRITE ACCOUNT
0x000000000054e000-0x000000000054f000 Mmap file name [systemd]
Flags raw: [0x0000000000100873] READ WRITE MAYREAD MAYWRITE MAYEXEC DENYWRITE ACCOUNT
0x000000002819c000-0x00000000282cc000 Anonymous mapping
Flags raw: [0x0000000000100073] READ WRITE MAYREAD MAYWRITE MAYEXEC ACCOUNT
0x0000ffffa4000000-0x0000ffffa4021000 Anonymous mapping
Flags raw: [0x0000000000200073] READ WRITE MAYREAD MAYWRITE MAYEXEC NORESERVE
0x0000ffffa4021000-0x0000ffffa8000000 Anonymous mapping
Flags raw: [0x0000000000200070] MAYREAD MAYWRITE MAYEXEC NORESERVE
0x0000ffffac000000-0x0000ffffac021000 Anonymous mapping
Flags raw: [0x0000000000200073] READ WRITE MAYREAD MAYWRITE MAYEXEC NORESERVE
0x0000ffffac021000-0x0000ffffb0000000 Anonymous mapping
Flags raw: [0x0000000000200070] MAYREAD MAYWRITE MAYEXEC NORESERVE
0x0000ffffb08a3000-0x0000ffffb08a4000 Anonymous mapping
Flags raw: [0x0000000000000070] MAYREAD MAYWRITE MAYEXEC
0x0000ffffb08a4000-0x0000ffffb10a4000 Anonymous mapping
Flags raw: [0x0000000000100073] READ WRITE MAYREAD MAYWRITE MAYEXEC ACCOUNT
root@xxxxx:~#
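As shown above, the mmap option prints detailed information about the user program's mappings, including whether each one is anonymous or file-backed, private or shared, readable or writable, and so on.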
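The Flags raw value is the vm_flags field of each mapping, so it can be decoded independently of the names the module prints. The sketch below uses the bit values from include/linux/mm.h in kernels of this era; only the bits needed for the sample output plus a few common ones are listed, and the labels for SHARED, MAYSHARE and GROWSDOWN are assumptions, since they do not appear in the sample and the module may spell them differently.

```python
# Subset of vm_flags bits (include/linux/mm.h, pre-5.5 kernels).
VM_FLAG_NAMES = [
    (0x00000001, "READ"),
    (0x00000002, "WRITE"),
    (0x00000004, "EXEC"),
    (0x00000008, "SHARED"),      # label assumed, not seen in the sample
    (0x00000010, "MAYREAD"),
    (0x00000020, "MAYWRITE"),
    (0x00000040, "MAYEXEC"),
    (0x00000080, "MAYSHARE"),    # label assumed, not seen in the sample
    (0x00000100, "GROWSDOWN"),   # label assumed, not seen in the sample
    (0x00000800, "DENYWRITE"),
    (0x00100000, "ACCOUNT"),
    (0x00200000, "NORESERVE"),
]


def decode_vm_flags(raw):
    """Return the names of all known bits set in a raw vm_flags value."""
    return [name for bit, name in VM_FLAG_NAMES if raw & bit]


for raw in (0x875, 0x100873, 0x200070):
    print(hex(raw), " ".join(decode_vm_flags(raw)))
```

A mapping without the SHARED bit is private, which is the case for every entry in the sample. A value like 0x200070 has the MAY* bits but none of READ, WRITE or EXEC, meaning the range is reserved but currently inaccessible.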