The ARMv8-A architecture introduces a number of changes that enable the design of
significantly higher-performance processor implementations.
Large physical address
This enables the processor to access beyond 4GB of physical memory.
Note: The physical address space is larger. ARMv7 was 4GB; what is it in ARMv8? Is it 2^64 bytes?
64-bit virtual addressing
This enables virtual memory beyond the 4GB limit. This is important for modern
desktop and server software using memory mapped file I/O or sparse addressing.
Note: The virtual address space is larger. ARMv7 was 4GB; has ARMv8 become 2^64 bytes?
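A rough check on the two address-space questions: as far as I know, base ARMv8-A does not expose a full 2^64-byte space. Virtual addresses are up to 48 bits per translation table base register, and the physical address size is implementation defined, up to 48 bits. The sketch below only does the arithmetic; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes addressable with `bits` address bits (only valid for bits < 64). */
uint64_t address_space_bytes(unsigned bits)
{
    assert(bits < 64);
    return UINT64_C(1) << bits;
}

/* The same size expressed in GiB (only valid for 30 <= bits < 64). */
uint64_t address_space_gib(unsigned bits)
{
    assert(bits >= 30);
    return address_space_bytes(bits) >> 30;
}
```

With 32 bits this gives the familiar 4 GiB; with 48 bits it gives 262144 GiB, that is, 256 TiB.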
Automatic event signaling
This enables power-efficient, high-performance spinlocks.
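Note: Spinlocks become high-performance and low-power (HPLP). How is this designed in hardware, and how does it show up in the instruction set?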
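To make the spinlock question more concrete, here is a minimal portable sketch using C11 atomics. It is not how any particular kernel implements its locks, and the names `demo_spinlock`, `spin_trylock`, `spin_lock` and `spin_unlock` are hypothetical. As I understand the feature, an AArch64 waiter can sleep in WFE inside the retry loop, and the unlocking store wakes it automatically because clearing the exclusive monitor generates an event, so the unlock path needs no explicit SEV. Portable C cannot express WFE, so that part appears only as a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;   /* set while some thread holds the lock */
} demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was acquired. */
bool spin_trylock(demo_spinlock *l)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

void spin_lock(demo_spinlock *l)
{
    while (!spin_trylock(l)) {
        /* Assumption: a hand-written AArch64 lock could execute WFE here
         * and be woken by the event generated when the holder releases. */
    }
}

void spin_unlock(demo_spinlock *l)
{
    /* Release ordering publishes the critical section before the unlock. */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A lock is declared as `demo_spinlock l = DEMO_SPINLOCK_INIT;` and then used with `spin_lock(&l)` and `spin_unlock(&l)` around the critical section.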
Larger register files
Thirty-one 64-bit general-purpose registers increase performance and reduce
stack use.
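Note: Thirty-one 64-bit registers improve performance and reduce stack use. How is this designed in hardware, and how does it show up in the instruction set?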
Efficient 64-bit immediate generation
There is less need for literal pools.
Note: I don't understand this. Is it related to XZR and WZR?
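My current understanding is that this is not about XZR/WZR. It refers to building constants from 16-bit pieces with MOVZ and MOVK instead of loading them from a literal pool. The sketch below models that idea only: it counts how many MOVZ/MOVK instructions a naive sequence would need and rebuilds the value chunk by chunk. It ignores MOVN and bitmask immediates, which can shorten real sequences, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of MOVZ/MOVK instructions a naive sequence needs for `value`:
 * one per non-zero 16-bit chunk, and at least one (MOVZ Xd, #0). */
unsigned mov_sequence_length(uint64_t value)
{
    unsigned count = 0;
    for (unsigned shift = 0; shift < 64; shift += 16) {
        if ((value >> shift) & 0xFFFFu)
            count++;
    }
    return count ? count : 1;
}

/* Rebuild the value the way the sequence would: start from zero (as MOVZ
 * does) and patch one 16-bit field at a time (as MOVK does). */
uint64_t mov_sequence_result(uint64_t value)
{
    uint64_t reg = 0;
    for (unsigned shift = 0; shift < 64; shift += 16) {
        uint64_t chunk = (value >> shift) & 0xFFFFu;
        reg = (reg & ~(UINT64_C(0xFFFF) << shift)) | (chunk << shift);
    }
    return reg;
}
```

In this model a small constant such as 0x1234 takes one instruction, and even a full 64-bit pattern takes at most four, with no memory load.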
Large PC-relative addressing range
A +/-4GB addressing range for efficient data addressing within shared libraries
and position-independent executables.
Note: I don't understand this.
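One way to see where the +/-4GB figure comes from is the ADRP instruction. It adds a signed 21-bit count of 4KB pages to the page of the current PC, and a following ADD (or a load/store offset) supplies the low 12 bits. The sketch below splits a target address into those two parts and checks whether it is reachable. `adrp_split` is a hypothetical helper for illustration, not an assembler API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Split `target` into the page delta an ADRP would encode and the low
 * 12 bits a following ADD would supply. Returns false if the target is
 * outside the signed 21-bit page range (roughly +/-4GB) around `pc`. */
bool adrp_split(uint64_t pc, uint64_t target, int64_t *page_delta, uint32_t *lo12)
{
    int64_t delta = (int64_t)(target >> 12) - (int64_t)(pc >> 12);
    if (delta < -(INT64_C(1) << 20) || delta >= (INT64_C(1) << 20))
        return false;
    *page_delta = delta;
    *lo12 = (uint32_t)(target & 0xFFFu);
    return true;
}
```

For code at 0x400000 referring to data at 0x401234, the page delta is 1 and the low part is 0x234. Because both values are relative to the PC, the same two instructions work wherever the shared library is loaded.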
Additional 16KB and 64KB translation granules
This reduces Translation Lookaside Buffer (TLB) miss rates and depth of page
walks.
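Note: Does this mean the Translation Lookaside Buffer granularity has become finer (smaller), so that the hit rate of translation results (in other words, their reuse rate) goes up, and the number of walk steps and the walk time go down?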
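On the granule question: the new granules are larger than 4KB, not finer. Each TLB entry then covers more memory, and each table level resolves more address bits, so a walk can need fewer levels. The sketch below estimates the number of levels from the virtual address width and the granule size, assuming 8-byte descriptors and tables that fill exactly one granule. `walk_levels` is a hypothetical name, and the formula is my own simplification.

```c
#include <assert.h>

/* Estimated number of translation table levels for a `va_bits`-wide
 * virtual address with a granule of 2^granule_shift bytes. */
unsigned walk_levels(unsigned va_bits, unsigned granule_shift)
{
    unsigned bits_per_level = granule_shift - 3;    /* 8-byte descriptors */
    unsigned index_bits = va_bits - granule_shift;  /* bits the tables resolve */
    return (index_bits + bits_per_level - 1) / bits_per_level;  /* ceiling */
}
```

For a 48-bit address this gives 4 levels with a 4KB granule (shift 12), 4 with 16KB (shift 14), and 3 with 64KB (shift 16).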
New exception model
This reduces OS and hypervisor software complexity.
Note: This presumably refers to the exception levels EL0 to EL3.
Efficient cache management
User space cache operations improve dynamic code generation efficiency. A Data
Cache Zero instruction provides fast data cache clearing.
Note: User space now has cache operations too?
Note: A data cache zero instruction: did ARMv7 not have one? What is its use case and design intent?
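A typical user-space case is a JIT compiler: after writing new instructions into a buffer, the data and instruction caches must be synchronized before the code runs. The sketch below uses the GCC/Clang builtin `__builtin___clear_cache`. On AArch64 Linux this is typically implemented with cache maintenance instructions issued directly from user space, without a system call, though that depends on the toolchain and OS. `publish_code` is a hypothetical helper; the Data Cache Zero instruction is not shown, and my understanding is that its main use is zeroing blocks of memory quickly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy freshly generated instructions into `dst` and make them visible
 * to instruction fetch. Making `dst` executable is out of scope here. */
void publish_code(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    /* GCC/Clang builtin; the range is [begin, end). */
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

A real JIT would call this on memory it has mapped as executable, and only then jump to the new code.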
Hardware-accelerated cryptography
Provides 3× to 10× better software encryption performance. This is useful for
small granule decryption and encryption too small to offload to a hardware
accelerator efficiently, for example HTTPS.
Note: Encryption and decryption now have hardware acceleration?
Load-Acquire, Store-Release instructions
Designed for the C++11, C11, and Java memory models. They improve performance of
thread-safe code by eliminating explicit memory barrier instructions.
Note: New memory access instructions, so that explicit memory barriers are not needed?
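At the C11 level the idea looks like the sketch below: one thread writes data and then sets a flag with release ordering, and another reads the flag with acquire ordering before touching the data. On AArch64, compilers typically lower these two operations to STLR and LDAR rather than a plain store or load plus a separate DMB, though the exact code depends on the compiler. The names `payload`, `ready`, `publish` and `try_consume` are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;         /* plain data, guarded by `ready` */
static atomic_bool ready;   /* zero-initialized: nothing published yet */

/* Producer: write the data, then publish it with a store-release. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: a load-acquire of the flag orders the later read of the data. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

If `try_consume` returns true, the consumer is guaranteed to see the value written before the matching `publish`, with no explicit barrier in the source.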
NEON double-precision floating-point advanced SIMD
This enables SIMD vectorization to be applied to a much wider set of algorithms,
for example, scientific computing, High Performance Computing (HPC) and
supercomputers.
Note: SIMD has been developed further?
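As far as I know, ARMv7 NEON handled floating point only in single precision, while the AArch64 Advanced SIMD registers can hold two double-precision lanes. A loop like the one below is the kind of code that benefits. It is plain C rather than NEON intrinsics, so it compiles anywhere; whether the compiler actually vectorizes it depends on the compiler and its optimization flags. `daxpy` is the conventional name for this operation, used here only as an example.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] += a * x[i] in double precision. `restrict` tells the compiler the
 * arrays do not overlap, which makes auto-vectorization easier. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```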