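First, please make sure you read the description of reinterpret_cast on cppreference all the way through.
Could you get through it?
Hard going, isn't it?
Did you memorize it?
Probably not.
Don't worry, very few people remember all of it. So don't use what you don't understand, and keep reinterpret_cast out of your code as much as possible, because the odds of it triggering undefined behavior are far too high.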
Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:
- AliasedType and DynamicType are similar.
- AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
- AliasedType is std::byte (since C++17), char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.
Among the common type conversions, only casting some other type to std::byte, char, or unsigned char is safe.
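As a small illustration of that byte-access exception, here is a minimal sketch that dumps the object representation of a value through unsigned char. The helper name to_hex is made up for this example, and the bytes it shows depend on the platform's type sizes and endianness.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): returns the object
// representation of obj as a lowercase hex string, one byte at a time.
template <class T>
std::string to_hex(const T& obj) {
    // Allowed by the aliasing rule: any object may be examined
    // through a glvalue of type unsigned char.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    std::string out;
    char buf[3];  // two hex digits plus the terminating '\0'
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(p[i]));
        out += buf;
    }
    return out;
}
```

Calling to_hex on an int yields 2 * sizeof(int) hex digits. Going the other way, from a byte buffer to some unrelated type, is not covered by this exception.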
Put simply, I suspect everyone has written code like the following:
struct X {
    int i;
};

void foo(char *data) {
    reinterpret_cast<X *>(data)[0].i = 3;
}
But this is undefined behavior, because
Performing a class member access that designates a non-static data member or a non-static member function on a glvalue that does not actually designate an object of the appropriate type - such as one obtained through a reinterpret_cast - results in undefined behavior:
For cases like this, C++20 provides std::bit_cast; until then we can use memcpy:
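#include <cstring>  // std::memcpy

void bar(char *data) {
    X x;
    std::memcpy(&x, data, sizeof x);  // copy the bytes into a real X object
    x.i = 3;
    std::memcpy(data, &x, sizeof x);  // copy the modified bytes back
}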
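For code that cannot rely on C++20 yet, the memcpy pattern can be wrapped in a small helper in the spirit of std::bit_cast. This is only a sketch: the names bit_cast_compat and float_bits are invented for this example, the helper is not constexpr as the real std::bit_cast is, and float_bits assumes a 32-bit IEEE-754 float.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for std::bit_cast, built on memcpy.
template <class To, class From>
To bit_cast_compat(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_default_constructible<To>::value,
                  "To must be trivially default constructible");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));  // copies the object representation
    return dst;
}

// Example use: read the bit pattern of a float without aliasing it
// as an integer (assumes float is 32-bit IEEE-754).
inline std::uint32_t float_bits(float f) {
    return bit_cast_compat<std::uint32_t>(f);
}
```

With a C++20 compiler, the body of float_bits would presumably become std::bit_cast<std::uint32_t>(f) from the <bit> header.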
Amusingly, with both gcc 9.2 and clang 9, as soon as -O is enabled, foo and bar compile to exactly the same assembly:
foo(char*):                             # @foo(char*)
        mov     dword ptr [rdi], 3
        ret
bar(char*):                             # @bar(char*)
        mov     dword ptr [rdi], 3
        ret
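This is because compilers routinely convert between plain memory operations and memcpy-style library calls as an optimization. A typical example: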
#include <cstddef>  // std::size_t

void foo(int *a, std::size_t s) {
    for (std::size_t i = 0; i < s; ++i)
        a[i] = 0;
}
The compiler generates the following assembly, in which the loop has become a tail call to memset:
foo(int*, unsigned long):
        test    rsi, rsi
        je      .L1
        lea     rdx, [0+rsi*4]
        xor     esi, esi
        jmp     memset
.L1:
        ret