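My first employer made TV demodulator chips. It was later acquired, and although that added a TV silicon tuner business, the low-level software team still worked mainly on x86-based Windows AVStream/BDA PCTV drivers. I also indirectly wrote a little Linux V4L2/BDA PCTV driver code, which was x86-based as well.
My second employer worked on USB 3.0 Device/xHCI host IP throughout. For verification purposes I wrote some U-Boot and Linux drivers for ARM boards, but my involvement stopped at writing the code.
Recently I have started making time to learn about ARM, and I am recording what I learn as notes.
Since I am studying v8 and in fact know little about v7 either, I am a beginner on the ARM side. I am certainly not as familiar with it as I am with USB, so there will be many questions.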
The change from 32-bit to 64-bit
There are several performance gains derived from moving to a 64-bit processor.
• The A64 instruction set provides some significant performance benefits, including a
larger register pool. The additional registers and the ARM Architecture Procedure Call
Standard (AAPCS) provide a performance boost when you must pass more than four
registers in a function call. On ARMv7, this would require using the stack, whereas in
AArch64 up to eight parameters can be passed in registers.
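Has the register pool in v8 grown?
v7 registers are 32 bits wide and v8 registers are 64 bits wide, so the width has grown.
v7 exposes 16 core registers at a time (R0-R15, including SP, LR, and PC), plus banked copies of some registers for the different modes. AArch64 in v8 has 31 general-purpose registers (X0-X30), so the count has grown as well.
AAPCS?
Has the AAPCS also changed between v7 and v8? It has: AArch64 uses its own procedure call standard, AAPCS64.
The text says v8 can pass up to eight parameters in registers, whereas v7 has to use the stack once more than four parameters are passed.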
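To make the parameter-passing point concrete, here is a minimal sketch in C. The function name sum8 and the self-check helper are hypothetical, chosen only for illustration. The comments describe where the arguments would be expected to travel under the two calling conventions; the C code itself behaves the same on any target.

```c
#include <assert.h>

/* Hypothetical helper with eight int parameters.
 * AAPCS64 (AArch64): a..h are passed in W0-W7, the low halves of X0-X7.
 * 32-bit AAPCS (ARMv7): a..d are passed in R0-R3, and the caller
 * places e..h on the stack. */
int sum8(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return a + b + c + d + e + f + g + h;
}

/* Small self-check showing how the helper is called. */
void check_sum8(void)
{
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
}
```

Compiling this for both targets and comparing the generated assembly at the call site is one way to see the difference directly.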
• Wider integer registers enable code that operates on 64-bit data to work more efficiently.
A 32-bit processor might require several operations to perform an arithmetic operation on
64-bit data. A 64-bit processor might be able to perform the same task in a single
operation, typically at the same speed required by the same processor to perform a 32-bit
operation. Therefore, code that performs many 64-bit sized operations is significantly
faster.
No questions here.
v7 registers are 32 bits wide, v8 registers are 64 bits wide.
Arithmetic on 64-bit operands gets faster.
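As an illustration of "several operations", the sketch below adds two 64-bit values using only 32-bit arithmetic, roughly what a 32-bit core has to do (on ARMv7 this is typically an ADDS followed by an ADC). The type u64_pair and both function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit core. */
struct u64_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Add two 64-bit values using only 32-bit operations. */
struct u64_pair add64_on_32(struct u64_pair a, struct u64_pair b)
{
    struct u64_pair r;
    r.lo = a.lo + b.lo;                  /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);      /* 1 if the low add wrapped */
    r.hi = a.hi + b.hi + carry;          /* high halves plus carry */
    return r;
}

/* The same addition with native 64-bit integers: a single add. */
uint64_t add64_native(uint64_t a, uint64_t b)
{
    return a + b;
}
```

With a = 0x00000000FFFFFFFF and b = 1, the low halves wrap to 0 and the carry moves into the high half, matching the native result 0x0000000100000000.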
• 64-bit operation enables applications to use a larger virtual address space. While the Large
Physical Address Extension (LPAE) extends the physical address space of a 32-bit
processor to 40-bit, it does not extend the virtual address space. This means that even with
LPAE, a single application is limited to a 32-bit (4GB) address space. This is because
some of this address space is reserved for the operating system.
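The virtual address space available to an application is larger.
LPAE on v7 extends the physical address space from 32 to 40 bits, but it does not enlarge the virtual address space (part of which is reserved for the OS).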
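The address-space figures follow directly from the number of address bits. A small sketch of the arithmetic, with helper names that are my own:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with the given number of address bits (bits < 64). */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);
    return (uint64_t)1 << bits;
}

/* Convert a byte count to whole GiB (1 GiB = 2^30 bytes). */
uint64_t bytes_to_gib(uint64_t bytes)
{
    return bytes >> 30;
}
```

So 32 bits give 4 GiB, the per-application virtual limit mentioned in the text, while the 40-bit physical space under LPAE is 1024 GiB (1 TiB).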
• Software running on a 32-bit architecture might need to map some data in or out of
memory while executing. Having a larger address space, with 64-bit pointers, avoids this
problem. However, using 64-bit pointers does incur some cost. The same piece of code
typically uses more memory when running with 64-bit pointers than with 32-bit pointers.
Each pointer is stored in memory and requires eight bytes instead of four. This might
sound trivial, but can add up to a significant penalty. Furthermore, the increased usage of
memory space associated with a move to 64-bits can cause a drop in the number of
accesses that hit in the cache. This in turn can reduce performance.
The larger virtual address space also enables memory-mapping larger files. This is the
mapping of the file contents into the memory map of a thread. This can occur even though
the physical RAM might not be large enough to contain the whole file.
Software running on a 32-bit architecture might need to map some data in or out of
memory while executing. Having a larger address space, with 64-bit pointers, avoids this
problem.
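My understanding of this sentence:
A 32-bit system has at most 4GB of virtual address space.
A large application's code and data may exceed that upper limit.
A 64-bit system has a virtual address space large enough that the limit is not hit, which removes the need to
map parts of the program's data in and out of the virtual address space.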
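For comparison, with enough virtual address space a file can simply be mapped in one piece. The sketch below uses the POSIX mmap call to do that; the function name map_whole_file is hypothetical and error handling is kept minimal. On a 32-bit system a file larger than the free virtual address space could not be mapped this way and would have to be mapped window by window.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an entire file read-only.
 * Returns NULL on failure; on success *len receives the file size. */
const unsigned char *map_whole_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* Pages are brought in on demand, so the file may be larger than RAM. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    *len = (size_t)st.st_size;
    return p;
}
```

The caller releases the mapping with munmap(ptr, len) when it is done.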
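64-bit also has drawbacks:
1. Pointers grow from 32 to 64 bits and use more memory. For a handful of pointers the overhead is trivial, which was my first reaction too and left me wondering what penalty the author had in mind. The penalty appears in pointer-heavy data structures such as linked lists and trees, where much of each node is pointers and the node size can nearly double.
2. The cache hit rate drops. My first analogy was string matching, where longer strings match less often, but that is not the mechanism. The cache size is fixed, so when the same data takes more bytes, less of it fits in the cache and more accesses miss.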
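A rough way to see both drawbacks at once is to compare a list node with 32-bit links against the same node with 64-bit links. In the sketch below the links are plain integers standing in for pointers, so that both layouts can be compared on one machine. The struct names and the helper are assumptions for illustration, and any cache size passed in is just an example figure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Doubly linked list node with 32-bit links (stand-ins for 32-bit pointers). */
struct node32 {
    uint32_t next;
    uint32_t prev;
    int32_t  value;
};

/* The same node with 64-bit links (stand-ins for 64-bit pointers). */
struct node64 {
    uint64_t next;
    uint64_t prev;
    int32_t  value;
};

/* Number of whole nodes that fit in a cache of the given size. */
size_t nodes_in_cache(size_t cache_bytes, size_t node_bytes)
{
    assert(node_bytes > 0);
    return cache_bytes / node_bytes;
}
```

On common 64-bit ABIs the first node is typically 12 bytes and the second 24 bytes (the 64-bit fields also force padding after value), so a cache of a given size holds about half as many nodes. That is the capacity effect behind the lower hit rate.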