The Memory Layout of float in C, Explained

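Preface

A float in C is not what most people imagine. A computer can only approximate real numbers, so float is discrete rather than continuous, and its precision and range are finite. These limits are recorded as macros in the float.h header.

Often, though, we recognize every word in the textbook and still cannot make sense of the sentence, so I wrote this post to go through the details.

Readers who have studied Computer Systems: A Programmer's Perspective (CS:APP) know how float is implemented: following the IEEE standard, it consists of a sign bit, exponent bits, and mantissa bits. This post gives a piece of code that prints the sign, exponent, and mantissa bits of a float.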

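I. The float macros in float.h

These are the macro constants in float.h that describe the implementation, range, and precision of float. The values shown come from a 64-bit system environment; we go through them one by one.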

#include <float.h>
    FLT_RADIX;      // 2;
    FLT_MANT_DIG;   // 24;
    FLT_DIG;        // 6;
    FLT_MIN_10_EXP; // (-37);
    FLT_MAX_10_EXP; // 38;
    FLT_EPSILON;    // 1.19209290e-7F;
    FLT_MAX;        // 3.40282347e+38F;
    FLT_MIN;        // 1.17549435e-38F;

FLT_RADIX 2

This is the value of the base, or radix, of the exponent representation. This is guaranteed to be a constant expression, unlike the other macros described in this section. The value is 2 on all machines we know of except the IBM 360 and derivatives.


float is implemented in binary, so its radix is 2.

In plain words, every finite float value is a sum of powers of 2.

For example, 0.5 is
2^-1,

0.25 is
2^-2,

and 0.75 is
2^-1 + 2^-2

FLT_MANT_DIG 24

This is the number of base-FLT_RADIX digits in the floating point mantissa for the float data type. The following expression yields 1.0 (even though mathematically it should not) due to the limited number of mantissa digits:


float radix = FLT_RADIX; // 2.0

1.0f + 1.0f / radix / radix / … / radix; // in float this evaluates to 1.0: the added term is 1.0 divided by 2.0 twenty-four times

where radix appears FLT_MANT_DIG times.

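Here FLT_MANT_DIG is 24, so radix appears 24 times.

Readers of CS:APP will recognize this detail: in the IEEE standard the stored mantissa of a float is 23 bits, but a normalized value has the form 1.0 + 0.N, so the leading 1 is implicit and is not stored. The 23 stored bits plus the implicit bit give 24 bits.

These 24 bits are the binary precision of float.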

FLT_DIG 6

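The decimal precision of float, counting the digits of both the integer part and the fractional part. Any decimal number with at most FLT_DIG significant digits can be converted to float and back without changing.

For example, 123.456 has 6 significant digits. Beyond that the precision is no longer guaranteed: in 123.4567, the trailing 7 is not guaranteed to be correct.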

FLT_MIN_10_EXP (-37)

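This is the smallest decimal exponent that float guarantees. On my machine it is -37, that is, 10^-37. Smaller powers of ten are not guaranteed to be representable as normalized floats.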

FLT_MAX_10_EXP 38

This is the largest decimal exponent that float guarantees. Larger powers of ten are not representable as finite floats.

FLT_EPSILON 1.19209290e-7F

This one is harder to grasp: it is the difference between 1.0 and the smallest float value greater than 1.0.

FLT_MAX 3.40282347e+38F

The largest finite value that float can represent.

FLT_MIN 1.17549435e-38F

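The smallest positive normalized value that float can represent. It is not the most negative value, which is -FLT_MAX, and smaller positive subnormal values exist below it with reduced precision.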

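II. Algorithm for printing the float memory layout

The basic idea is to define a bit-field struct that splits a 32-bit integer into three parts: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa.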

typedef struct
{
    uint32_t Mantissa : 23;
    uint32_t Exponent : 8;
    uint32_t Sign : 1;
} fltToBit;

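The fields are declared in reverse order (mantissa, exponent, sign) because the target is little-endian. Strictly speaking, the order of bit-fields within a storage unit is implementation-defined; on common little-endian platforms, compilers allocate them starting from the least significant bit.

Each integer field is then converted to a binary string, for example with itoa(), and printed. itoa() is not part of standard C, so a short hand-written conversion loop is a portable substitute.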

III. Code for printing the float memory layout

The code is fairly simple. The only part that is hard to understand is:

    fltToBit test = *(fltToBit *)&a;

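A float cannot be cast directly to fltToBit. You first take its address, cast that to a pointer to fltToBit, and then dereference it. Note that this pointer cast technically violates C's strict-aliasing rule; copying the bytes with memcpy is the well-defined way to get the same reinterpretation.

The rest of the code should be easy to follow.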

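#include <stdio.h>
#include <stdint.h>
#include <string.h>

typedef struct
{
    uint32_t Mantissa : 23;
    uint32_t Exponent : 8;
    uint32_t Sign : 1;
} fltToBit;

/* Writes the low `width` bits of value into buf as a zero-padded binary
   string. Stands in for the non-standard itoa(), and for the "%08s" style
   padding, which printf does not define for strings. */
static const char *to_binary(uint32_t value, int width, char *buf)
{
    for (int i = 0; i < width; i++)
        buf[i] = ((value >> (width - 1 - i)) & 1u) ? '1' : '0';
    buf[width] = '\0';
    return buf;
}

void print_float_as_binary(float a)
{
    char BufExponent[32] = "";
    char BufMantissa[32] = "";
    fltToBit test;

    /* Same effect as *(fltToBit *)&a, without the strict-aliasing violation. */
    memcpy(&test, &a, sizeof test);

    printf("Sign: %u\nExponent: %s\nMantissa: %s\n", (unsigned)test.Sign,
           to_binary(test.Exponent, 8, BufExponent),
           to_binary(test.Mantissa, 23, BufMantissa));
}

int main(void)
{
    const float a = 1.4142136F;
    print_float_as_binary(a);
    return 0;
}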

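Summary

float implements finite-precision floating-point numbers in a discrete way. When you have time, it is worth reading the documentation closely and working through the details, which is quite interesting.

This post introduced the IEEE implementation of float and used a data structure to print its actual internal binary value, so that readers can study it.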