I’m confused about unaligned memory accesses on ARM.
My understanding was that they’re not allowed — that is,
dereferencing a 32-bit value from a pointer that’s not four-byte aligned will crash.
I’ve run into such crashes before.
But right now I’ve got a situation where the Release build of an app crashes on an unaligned access,
but the Debug build doesn’t!
It’s the exact same place in the code — a CRC32 hash implementation —
and I’ve verified that an odd address is being dereferenced (as a uint32*).
(This is a 32-bit process running on an iPhone 5, by the way.)
The only difference is in the assembly code generated.
In the debug build it’s using ldr, while in the release build it’s using ldrd.
(Apparently the optimizer is smart enough to realize that the next line of code is loading the next 4 bytes after the pointer,
so it decides to combine both lines into a single instruction.)
So I’m guessing that ldr (a 32-bit load) allows unaligned access,
but ldrd (64-bit) doesn’t?
Is there any comprehensive, up-to-date documentation about this?
I’ve found an article from 2010* stating that unaligned accesses are supported but are slower because they trigger an OS trap (as on PPC);
but it doesn’t say anything about ldrd.
There’s a StackOverflow Q&A about a crash with ldrd**, where the answer states that "ldrd needs the address to be 8-byte aligned".
But if that’s so, then I think the compiler optimization (-Ofast level) in my app was incorrect, because the value it’s dereferencing is a uint32_t*,
so there’s no expectation that it's 8-byte aligned.
I believe ARM requires 8-byte alignment for 8-byte loads (ldrd, or doubles loaded into the FPU, etc.) and will crash otherwise.
4-byte loads do not require alignment, but will be slower if the pointer is not 4-byte aligned.
They will not crash, though.
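
To see which case a given pointer falls into, you can test its alignment before dereferencing it. This is only a rough sketch: the name is_aligned is made up for illustration, and it assumes the alignment is a power of two.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if p is a multiple of `alignment` bytes.
 * Assumes `alignment` is a power of two (4 for a word, 8 for a doubleword). */
bool is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

For example, is_aligned(ptr, 4) being false for the pointer handed to the CRC32 routine would confirm that the data is not word-aligned.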
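On ARM, a two-byte or four-byte value should not be accessed directly by casting a pointer to the wider data type; it has to be accessed one byte at a time:
either assign byte by byte, or use a function such as memcpy. Before doing this, though, first determine whether the data is stored big-endian or little-endian.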
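
A minimal sketch of both approaches, for reading a 32-bit value from an address that may be unaligned. The function names are hypothetical. The memcpy version returns the value in the host's byte order, while the byte-by-byte versions assume a specific byte order in the buffer, which is why the endianness has to be known first.

```c
#include <stdint.h>
#include <string.h>

/* Copy four bytes into an aligned local; the result is in host byte order. */
uint32_t load_u32_native(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Assemble the value byte by byte, assuming the buffer is little-endian. */
uint32_t load_u32_le(const void *p)
{
    const unsigned char *b = p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Assemble the value byte by byte, assuming the buffer is big-endian. */
uint32_t load_u32_be(const void *p)
{
    const unsigned char *b = p;
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         | (uint32_t)b[3];
}
```

None of these functions dereferences a uint32_t pointer, so the alignment of p does not matter.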
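For Load/Store operations, the system defines the following three possible results for an unaligned data access.
The result of the operation is unpredictable.
The low two bits of a word address are ignored, that is, the word at (address AND 0xfffffffc) is accessed; the lowest bit of a halfword address is ignored, that is, the halfword at (address AND 0xfffffffe) is accessed.
The low two bits of a word address are ignored, and the lowest bit of a halfword address is ignored, but it is the memory system that does the "ignoring". In other words, the address value is passed to the memory system unchanged.
Which of these three behaviours applies when an unaligned data access occurs is specified by each individual instruction.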
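
To make the second case concrete, here is a small sketch of the address that would actually be accessed when the low bits are ignored. The function names are made up for illustration, and 32-bit addresses are assumed.

```c
#include <stdint.h>

/* Word access: clear the low two bits (address AND 0xfffffffc). */
uint32_t word_address(uint32_t address)
{
    return address & 0xFFFFFFFCu;
}

/* Halfword access: clear the lowest bit (address AND 0xfffffffe). */
uint32_t halfword_address(uint32_t address)
{
    return address & 0xFFFFFFFEu;
}
```

So under that behaviour a word load from 0x1003 would read the word at 0x1000, and a halfword load from 0x1003 would read the halfword at 0x1002.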