C++ Memo 028: reinterpret_cast and Undefined Behavior

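First, please make sure you read the whole description of reinterpret_cast on cppreference.

Did you get through it?

Hard, isn't it?

Did you memorize it?

Probably not.

Don't worry, very few people can remember all of it. So don't use what you don't fully understand: keep reinterpret_cast out of your code as much as possible, because the odds of it triggering undefined behavior are simply too high.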

Whenever an attempt is made to read or modify the stored value of an object of type DynamicType through a glvalue of type AliasedType, the behavior is undefined unless one of the following is true:

  1. AliasedType and DynamicType are similar.
  2. AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType.
  3. AliasedType is std::byte (since C++17), char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.

Among the common type conversions, the only safe one is accessing an object of some other type as std::byte, char, or unsigned char.
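As a minimal sketch of that one safe direction, the helper below reads the bytes of an arbitrary object through an unsigned char pointer and formats them as hex. The name to_hex is hypothetical and used only for illustration; it is not part of any library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): returns the object
// representation of obj as a lowercase hex string, two digits per byte.
template <typename T>
std::string to_hex(const T& obj) {
    // Permitted by rule 3 above: any object may be examined
    // through a pointer to unsigned char.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&obj);
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out += digits[p[i] >> 4];    // high nibble
        out += digits[p[i] & 0x0f];  // low nibble
    }
    return out;
}
```

The order of the bytes in the result follows the platform's memory layout, so a multi-byte integer will print differently on little-endian and big-endian machines.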

Put simply, I suspect everyone has written code like the following at some point:

struct X {
    int i;
};

void foo(char *data) {
    reinterpret_cast<X *>(data)[0].i = 3;
}

But this is undefined behavior, because:

Performing a class member access that designates a non-static data member or a non-static member function on a glvalue that does not actually designate an object of the appropriate type - such as one obtained through a reinterpret_cast - results in undefined behavior:

For situations like this, C++20 provides std::bit_cast; for now we can use memcpy:

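#include <cstring>  // std::memcpy

void bar(char *data) {
    X x;
    std::memcpy(&x, data, sizeof x);  // copy the bytes into a real X object
    x.i = 3;
    std::memcpy(data, &x, sizeof x);  // copy the modified object back
}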
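The same memcpy pattern can be wrapped in small generic helpers. The following is only a sketch under stated assumptions: the names bit_cast_sketch and load_from_bytes are hypothetical, and the static_asserts spell out the requirements I am assuming (equal sizes, trivially copyable types). std::bit_cast in C++20 serves the same purpose as the first helper.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for C++20 std::bit_cast, built on memcpy.
template <typename To, typename From>
To bit_cast_sketch(const From& src) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    static_assert(std::is_default_constructible<To>::value, "To must be default constructible");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));  // copy the object representation
    return dst;
}

// Hypothetical helper: reads a T out of a raw byte buffer without
// ever forming a T* that points into the buffer.
// Assumes data points to at least sizeof(T) readable bytes.
template <typename T>
T load_from_bytes(const char* data) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    T value;
    std::memcpy(&value, data, sizeof(T));
    return value;
}
```

For example, bit_cast_sketch<std::uint32_t>(some_float) gives the bit pattern of a 32-bit float as an integer, and load_from_bytes<int>(data) is the read-side counterpart of what bar does.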

The amusing part is that with either gcc 9.2 or clang 9, as soon as -O is turned on, foo and bar produce exactly the same assembly:

foo(char*):                               # @foo(char*)
        mov     dword ptr [rdi], 3
        ret
bar(char*):                               # @bar(char*)
        mov     dword ptr [rdi], 3
        ret

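This is because compilers treat memcpy and memset as built-ins: a small fixed-size memcpy is lowered to plain loads and stores, and conversely, replacing hand-written memory operations with a call to one of these functions is a common optimization. A typical example: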

#include <cstddef>  // std::size_t

void foo(int *a, std::size_t s) {
    for (std::size_t i=0; i<s; ++i)
        a[i] = 0;
}

The assembly generated by the compiler is:

foo(int*, unsigned long):
        test    rsi, rsi
        je      .L1
        lea     rdx, [0+rsi*4]
        xor     esi, esi
        jmp     memset
.L1:
        ret
