Integer

The Integer class wraps a value of the primitive type int in an object.
An object of type Integer contains a single field whose type is int.
In addition, this class provides several methods for converting an int to a String and a String to an int, as well as other constants and methods useful when dealing with an int.

基于Java8

定义和属性

类定义
public final class Integer extends Number implements Comparable
Integer类继承了Number抽象类，并实现了Comparable接口，故而可以比较大小。
属性

/**
     * The value of the {@code Integer}.
     *
     * @serial
     */
    private final int value;

Integer类封装了int型变量来承载数值。

构造器

Integer提供了两个重载的构造器，一个是直接把int封装成Integer对象，另一个是讲String对象转成Integer对象。前者很简单，在内部的实现只是单纯的私有属性（int value）的赋值，后者涉及字符串转换方法parseInt，方法后面提到数据转换再详述。

    /**
     * Constructs a newly allocated {@code Integer} object that
     * represents the specified {@code int} value.
     *
     * @param   value   the value to be represented by the
     *                  {@code Integer} object.
     */
    public Integer(int value) {
        this.value = value;
    }

    /**
     * Constructs a newly allocated {@code Integer} object that
     * represents the {@code int} value indicated by the
     * {@code String} parameter. The string is converted to an
     * {@code int} value in exactly the manner used by the
     * {@code parseInt} method for radix 10.
     *
     * @param      s   the {@code String} to be converted to an
     *                 {@code Integer}.
     * @exception  NumberFormatException  if the {@code String} does not
     *               contain a parsable integer.
     * @see        java.lang.Integer#parseInt(java.lang.String, int)
     */
    public Integer(String s) throws NumberFormatException {
        this.value = parseInt(s, 10);
    }

极值

int数据的长度是四个字节，范围是-2^31~231-1。Integer是int的包装类，自然取值范围是一致的。

    //补码
    @Native public static final int   MIN_VALUE = 0x80000000;

    /**
     * A constant holding the maximum value an {@code int} can
     * have, 2³¹-1.
     */
    @Native public static final int   MAX_VALUE = 0x7fffffff;

整型转为字符串

Integer提供了很多方法将整型转为多种字符串，主要有：

public static String toString(int i, int radix);
public static String toUnsignedString(int i, int radix);
public static String toHexString(int i); // 16进制字符串表示
public static String toOctalString(int i); // 8进制字符串表示
public static String toBinaryString(int i); // 2进制字符串表示
public static String toString(int i); // 十进制字符串表示
public static String toUnsignedString(int i);  // 10进制无符号数的字符串，调用第二个方法，radix=10
public String toString();

在了解过String类实现原理后我们知道，String类实际上是封装了char数组，int数据转换成String，实际上就是把数字写到char数组里面。但是我们知道，char类型数据的长度是2个字节，也就是说单个char，只能存储-65536~65535之间的数字。那么整型是怎么存到char数组里面的呢？带着疑问往下看源码是怎么做的。

先看下toString(int i, int radix)这个方法，它是将整型转换成指定基数（多少进制）的字符串表示。本以为下面的toHexStirng/toOctalString和toBinaryString是基于这个方法的，其实不是的。

    @Test
    public void test3() {
        // 结果是一致的，但是内部并没有方法栈的重叠
        System.out.println(Integer.toString(100, 16));
        System.out.println(Integer.toHexString(100));
        System.out.println(Integer.toString(100, 2));
        System.out.println(Integer.toBinaryString(100));
        System.out.println(Integer.toString(100, 8));
        System.out.println(Integer.toOctalString(100));
    }

toString(int i, int radix)：

    // 将int数据转成指定进制的字符串
    public static String toString(int i, int radix) {
        // 支持的基数（进制）为2-36（10和数字加上26个字母）
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX)
            radix = 10;
        // 十进制直接使用其他版本的方法
        /* Use the faster version */
        if (radix == 10) {
            return toString(i);
        }
        // 定义长度为33的char数组，第一个位置放数字的符号（+、-）
        char buf[] = new char[33];
        //  是否是负数
        boolean negative = (i < 0);
        int charPos = 32;
        // 统一转成负数处理
        if (!negative) {
            i = -i;
        }

        while (i <= -radix) {
            buf[charPos--] = digits[-(i % radix)];
            i = i / radix;
        }
        buf[charPos] = digits[-i];

        if (negative) {
            buf[--charPos] = '-';
        }

        return new String(buf, charPos, (33 - charPos));
    }

重点看一下while循环那块的代码，digits是一个定义好的数组：

    /**
     * All possible chars for representing a number as a String
     */
    final static char[] digits = {
        '0' , '1' , '2' , '3' , '4' , '5' ,
        '6' , '7' , '8' , '9' , 'a' , 'b' ,
        'c' , 'd' , 'e' , 'f' , 'g' , 'h' ,
        'i' , 'j' , 'k' , 'l' , 'm' , 'n' ,
        'o' , 'p' , 'q' , 'r' , 's' , 't' ,
        'u' , 'v' , 'w' , 'x' , 'y' , 'z'
    };

位置（下标）代表了该数字的字符串表示，这是典型的空间换时间的操作。while循环体做的事情就是辗转相除法的算法，辗转除以基数取余。
实际上这里为什么要把数值统一用负数来处理并不明白。如果统一使用正数像这样是不是也可以：

        char buf[] = new char[33];
        int charPos=32;
        while (i > 0) {
            buf[charPos--] = digits[(i % radix)];
            i = i / radix;
        }
//        buf[charPos] = digits[i];
        System.out.println(buf);

toXXXString的实现
toHexStirng，toOctalString和toBinaryString的都是调用了一个private static String toUnsignedString0(int val, int shift)的私有方法，看下源码：

    /**
     * Convert the integer to an unsigned number.
     */
    private static String toUnsignedString0(int val, int shift) {
        // assert shift > 0 && shift <=5 : "Illegal shift value";
        int mag = Integer.SIZE - Integer.numberOfLeadingZeros(val);
        // 判断数字对应进制实际要占的char数组的空间
        // mag是去零后二进制串的长度，2进制1位对应一位，8进制3位二进制对应一位，16进制4位二进制对应一位
        // ((mag + (shift - 1)) / shift) = (mag+shift-1)/shift = mag/shift+1-1/shift
        /*
        *shift=1,mag/shift+1-1/shift = mag
        * shift=3,mag/3+1
        * shift=4.mag/4+1
        */
        int chars = Math.max(((mag + (shift - 1)) / shift), 1);
        char[] buf = new char[chars];

        formatUnsignedInt(val, shift, buf, 0, chars);

        // Use special constructor which takes over "buf".
        return new String(buf, true);
    }

里面的shift对应的是log2<进制数>的值，如二进制shift就是1,8进制是3,1进制是4.Integer.numberOfLeadingZeros(val)方法是计算数值的二进制表示下，左边的零的个数（从左往右知道遇到第一个1为止）。代码如下:

    public static int numberOfLeadingZeros(int i) {
        // HD, Figure 5-6
        if (i == 0)
            return 32;
        int n = 1;
        if (i >>> 16 == 0) { n += 16; i <<= 16; }
        if (i >>> 24 == 0) { n +=  8; i <<=  8; }
        if (i >>> 28 == 0) { n +=  4; i <<=  4; }
        if (i >>> 30 == 0) { n +=  2; i <<=  2; }
        n -= i >>> 31;
        return n;
    }

这个方法使用的是二分查找法来计算二进制串前面有多少个零：

i >>> 16 逻辑右移16位（就是32位的一半），也就是取到数值的高16位，如果高16位==0，说明有效数字在后16位，且前16位全为0，累加0的个数，并把低16位往前移（i <<= 16;）；如果高16位不等于0，说明高16位有非0数字，继续往下查找；
i >>> 24 逻辑右移24位，实际上是取高16位的高8位，逻辑同上
i >>> 28 取高4位
i >>> 30 取高2位
i >>> 31 取高1位，二分查找结束

计算好所需的char大小之后，使用formatUnsignedInt()把int放到char数组里面去：

     static int formatUnsignedInt(int val, int shift, char[] buf, int offset, int len) {
        int charPos = len;
        int radix = 1 << shift;
        int mask = radix - 1;
        do {
            buf[offset + --charPos] = Integer.digits[val & mask];
            val >>>= shift;
        } while (val != 0 && charPos > 0);

        return charPos;
    }

这段代码可以说是跪着看完的，Integer.digits[val & mask]的val & mask究竟是什么意思？
原来啊，这是val % radix的更高效的写法。为什么可以这样呢，因为这里的radix基数是2的n次方，val % radix 的取余操作，就是val / radix后的余数，因为radix=1<>> shift的移位操作，实际上就是把末尾的shift位给移掉了，余数就是被移掉的shift位的值，那么末尾的shift位，我在移位之前直接取怎么取呢，就是mask=raidx-1，比如radix=8，mask==7==0b0111,这样&就是取末尾的三位了。
要注意只有radix是2的n次方才能只有用。
然后这里的Integer.digits[]就很好理解了，下面的val >>>= shift就是除以radix，这边其实也只是辗转相除法。

到这，差不多这个toXXString方法就结束了，核心算法是辗转相除。

toString(int i)方法,并非内部调用toString(i,10)：

    public static String toString(int i) {
        if (i == Integer.MIN_VALUE)
            return "-2147483648";
        int size = (i < 0) ? stringSize(-i) + 1 : stringSize(i);
        char[] buf = new char[size];
        getChars(i, size, buf);
        // 这里使用的String了的特殊构造器，share=true的构造器。
        return new String(buf, true);
    }

这个stringSize()方法很有趣：

    final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
                                      99999999, 999999999, Integer.MAX_VALUE };
    // Requires positive x
    static int stringSize(int x) {
        for (int i=0; ; i++)
            if (x <= sizeTable[i])
                return i+1;
    }

它是判断数字是几位数来确认长度的。十分机智啊。

不过这个方法的重点在于getChar方法：

// 传入的i是数值，index是数值的长度，buf是存储字符的数组
    static void getChars(int i, int index, char[] buf) {
        int q, r;
        int charPos = index;
        // 符号
        char sign = 0;
        //  转成正数操作
        if (i < 0) {
            sign = '-';
            i = -i;
        }

        // Generate two digits per iteration 每次迭代生成两个数字
        while (i >= 65536) {
            q = i / 100;
        // really: r = i - (q * 100);用移位代替乘法运算 ,实际上r = i % 100         
            r = i - ((q << 6) + (q << 5) + (q << 2));
            i = q;
            // r<100
            // DigitOnes[r]是取r的个位数，即r%10;
            buf [--charPos] = DigitOnes[r];
            // DigitTens[r]是取r的十位数，即r/10;
            buf [--charPos] = DigitTens[r];
        }

        // Fall thru to fast mode for smaller numbers
        // assert(i <= 65536, i);
        for (;;) {
            q = (i * 52429) >>> (16+3); // 等价于q=i/10
            r = i - ((q << 3) + (q << 1));  // r = i-(q*10) ... 等价于r=i%10
            // 这个循环体的操作就是每次取个位上的数放到数组中
            buf [--charPos] = digits [r];
            i = q;
            if (i == 0) break;
        }
        if (sign != 0) {
            buf [--charPos] = sign;
        }
    }
    final static char [] DigitTens = {
        '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
        '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
        '2', '2', '2', '2', '2', '2', '2', '2', '2', '2',
        '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
        '4', '4', '4', '4', '4', '4', '4', '4', '4', '4',
        '5', '5', '5', '5', '5', '5', '5', '5', '5', '5',
        '6', '6', '6', '6', '6', '6', '6', '6', '6', '6',
        '7', '7', '7', '7', '7', '7', '7', '7', '7', '7',
        '8', '8', '8', '8', '8', '8', '8', '8', '8', '8',
        '9', '9', '9', '9', '9', '9', '9', '9', '9', '9',
        } ;

    final static char [] DigitOnes = {
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        } ;

getChars方法，实际上核心的算法也是辗转相除法，基数为10。先判断大数字，使用两个预定义的数组来通过空间换时间的策略提高效率，且在i>65535的循环内，依次获取两个位上的数值，然后在<=65535时，就是对10的辗转相除。
r = i - ((q << 6) + (q << 5) + (q << 2));使用移位取代乘法运算，可以提高效率。r=i - ((q << 6) + (q << 5) + (q << 2))=i-q*2^6-q*2^5-q*2^2=i-q*(64+32+4)=i=100q.
q = (i * 52429) >>> (16+3);使用乘法和移位取代除法运算，乘法的效率比除法的效率要高，2^19=534288, (i * 52429) >>> (16+3)相当于i*0.1的结果;

但是为什么要以65535为界限分成两块，以及为什么第一块直接使用除法而第二块使用乘法来取代呢？为什么选择的是52429呢？

(摘自http://www.hollischuang.com/archives/1058)
再来回答上面两个问题中，部分一和部分二中最大的区别就是部分一代码使用了除法，第二部分只使用了乘法和移位。因为乘法和移位的效率都要比除法高，所以第二部分单独使用了乘法加移位的方式来提高效率。那么为什么不都使用乘法加移位的形式呢？为什么大于num1(65536)的数字要使用除法呢？原因是int型变量最大不能超过(2^31-1)。如果使用一个太大的数字进行乘法加移位运算很容易导致溢出。那么为什么是65536这个数字呢？第二阶段用到的乘法的数字和移位的位数又是怎么来的呢？
我们再回答第二个问题。
既然我们要使用q = (i * num2) >>> (num3);的形式使用乘法和移位代替除法，那么n和m就要有这样的关系：
num2= (2^num3 /10 +1)
只有这样才能保证(i * num2) >>> (num3)结果接近于0.1。
那么52429这个数是怎么来的呢?来看以下数据：

2^10=1024, 103/1024=0.1005859375
2^11=2048, 205/2048=0.10009765625
2^12=4096, 410/4096=0.10009765625
2^13=8192, 820/8192=0.10009765625
2^14=16384, 1639/16384=0.10003662109375
2^15=32768, 3277/32768=0.100006103515625
2^16=65536, 6554/65536=0.100006103515625
2^17=131072, 13108/131072=0.100006103515625
2^18=262144, 26215/262144=0.10000228881835938
2^19=524288, 52429/524288=0.10000038146972656
2^20=1048576, 104858/1048576=0.1000003815
2^21=2097152, 209716/2097152 = 0.1000003815
2^22= 4194304, 419431/4194304= 0.1000001431

超过22的数字我就不列举了，因为如果num3越大，就会要求i比较小，因为必须保证(i * num2) >>> (num3)的过程不会因为溢出而导致数据不准确。那么是怎么敲定num1=65536,num2= 524288, num3=19的呢？这三个数字之间是有这样一个操作的：
（num1* num2）>>> num3
因为要保证该操作不能因为溢出导致数据不准确，所以num1和num2就相互约束。两个数的乘积是有一定范围的，不成超过这个范围，所以，num1增大，num2就要随之减小。
我觉得有以下几个原因：
1.52429/524288=0.10000038146972656精度足够高。
2.下一个精度较高的num2和num3的组合是419431和22。2^31/222 = 2^9 = 512。512这个数字实在是太小了。65536正好是2^16，一个整数占4个字节。65536正好占了2个字节，选定这样一个数字有利于CPU访问数据。
不知道有没有人发现，其实65536* 52429是超过了int的最大值的，一旦超过就要溢出，那么为什么还能保证（num1* num2）>>> num3能得到正确的结果呢？
这和>>>有关，因为>>>表示无符号右移，他会在忽略符号位，空位都以0补齐。
一个有符号的整数能表示的范围是-2147483648至2147483647，但是无符号的整数能表示的范围就是0-4,294,967,296（2^32）。所以，只要保证num2*num3的值不超过2^32 就可以了。65536是2^16 ,52429正好小于2^16,所以，他们的乘积在无符号向右移位就能保证数字的准确性。

toUnsignedString()
int型的无符号数范围是0~2^32,232int是存不下的，所以得使用更大的数据类型来存储->long:

    public static String toUnsignedString(int i, int radix) {
        return Long.toUnsignedString(toUnsignedLong(i), radix);
    }

字符串转int

主要方法：

public static int parseInt(String s, int radix);
public static int parseUnsignedInt(String s, int radix); // 无符号int的范围是0~2^32，需要long型来装载数据
public static Integer valueOf(String s, int radix);
public static Integer valueOf(String s);
public static Integer valueOf(int i);
public static Integer getInteger(String nm, Integer val);// 获取系统属性nm的值，val是默认值
public static Integer decode(String nm);

public static int parseInt(String s, int radix);

    public static int parseInt(String s, int radix)throws NumberFormatException{
        /*
         * WARNING: This method may be invoked early during VM initialization
         * before IntegerCache is initialized. Care must be taken to not use
         * the valueOf method.
         */

        if (s == null) {
            throw new NumberFormatException("null");
        }

        if (radix < Character.MIN_RADIX) {
            throw new NumberFormatException("radix " + radix +
                                            " less than Character.MIN_RADIX");
        }

        if (radix > Character.MAX_RADIX) {
            throw new NumberFormatException("radix " + radix +
                                            " greater than Character.MAX_RADIX");
        }

        int result = 0;
        boolean negative = false;
        int i = 0, len = s.length();
        int limit = -Integer.MAX_VALUE;
        int multmin;
        int digit;

        if (len > 0) {
            char firstChar = s.charAt(0);
            if (firstChar < '0') { // Possible leading "+" or "-"
                if (firstChar == '-') {
                    negative = true;
                    limit = Integer.MIN_VALUE;
                } else if (firstChar != '+')
                    throw NumberFormatException.forInputString(s);

                if (len == 1) // Cannot have lone "+" or "-"
                    throw NumberFormatException.forInputString(s);
                i++;
            }
            multmin = limit / radix;
            while (i < len) {
                // Accumulating negatively avoids surprises near MAX_VALUE
                digit = Character.digit(s.charAt(i++),radix);
                if (digit < 0) {
                    throw NumberFormatException.forInputString(s);
                }
                if (result < multmin) {
                    throw NumberFormatException.forInputString(s);
                }
                result *= radix;
                if (result < limit + digit) {
                    throw NumberFormatException.forInputString(s);
                }
                result -= digit;
            }
        } else {
            throw NumberFormatException.forInputString(s);
        }
        return negative ? result : -result;
    }

里面把正数变成负数处理，且使用了multmin这个变量，这两个是为了防止溢出。有一段来自Stack Overflow的解释：

multmin is used in below code:

if (result < multmin) {
    throw NumberFormatException.forInputString(s);
}

If current result is less than multmin, next generation result must overflow, so an exception is thrown:

if result < multmin, 
------>  result < limit / radix (beacause multmin = limit / radix)
------>  result * radix < limit
------>  result * radix - digit < limit (overflow).

If current result is greater than or equals multmin, we can assert result * radix >= limit not overflow, so continue check if result * radix - digit overflow with:
```
if (result < limit + digit) {
   throw NumberFormatException.forInputString(s);
 }
```

Why use negative?
Because the absolute value of Integer.MIN_VALUE(-2147483648) is greater than Integer.MAX_VALUE (2147483647).
Suppose we have a POSITIVE version, when input number start with '+', we can set limit as Integer.MAX_VALUE. But, when input number start with '-', we can not set limit as 2147483648, it's an overflow value.

valueOf(int i)
此方法有一个int数据缓存机制，先看一下代码：

    public static Integer valueOf(int i) {
        if (i >= IntegerCache.low && i <= IntegerCache.high)
            return IntegerCache.cache[i + (-IntegerCache.low)];
        return new Integer(i);
    }

如果i在[IntegerCache.low（-128） , IntegerCache.high（127）]区间中，就直接返回缓存中的对象。这个机制可以反过来解释这个现象：

    @Test
    public void testCache() {
        Integer i1 = 100;
        Integer i2 = 100;
        System.out.println(i1 == i2);  // true
    }

看一下它的字节码，看看在Integer i1 = 100;是怎么做的：

  public void testCache();
    Code:
       0: bipush        100
       2: invokestatic  #44                 // Method java/lang/Integer.valueOf:             (I)Ljava/lang/Integer;
       5: astore_1
       6: bipush        100
       8: invokestatic  #44                 // Method java/lang/Integer.valueOf:             (I)Ljava/lang/Integer;
      11: astore_2
      12: getstatic     #47                 // Field java/lang/System.out:Ljava/             io/PrintStream;
      15: aload_1
      16: aload_2
      17: if_acmpne     24
      20: iconst_1
      21: goto          25
      24: iconst_0
      25: invokevirtual #53                 // Method java/io/PrintStream.printl             n:(Z)V
      28: return
}

从字节码可以明显看到，在给Integer对象赋值的时候，调用的是valueOf方法，所以这也就解释了为什么两个对象相等了。如果不在-128~127的范围内，就不会是true了：

    @Test
    public void testCache() {
        Integer i1 = 128;
        Integer i2 = 128;
        System.out.println(i1 == i2); // false
    }

cache的实现机制：

 private static class IntegerCache {
        static final int low = -128;
        static final int high;
        // 缓存数组
        static final Integer cache[];

        static {
            // high value may be configured by property
            int h = 127;
            // 这个最大值是可以配置的 ，默认是127
            String integerCacheHighPropValue =
                sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
            if (integerCacheHighPropValue != null) {
                try {
                    int i = parseInt(integerCacheHighPropValue);
                    i = Math.max(i, 127);
                    // Maximum array size is Integer.MAX_VALUE
                    h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
                } catch( NumberFormatException nfe) {
                    // If the property cannot be parsed into an int, ignore it.
                }
            }
            high = h;

            cache = new Integer[(high - low) + 1];
            int j = low;
            for(int k = 0; k < cache.length; k++)
                cache[k] = new Integer(j++);

            // range [-128, 127] must be interned (JLS7 5.1.7)
            assert IntegerCache.high >= 127;
        }

        private IntegerCache() {}
    }

decode(String nm)
decode方法内部调用的方法栈是valueOf->parseInt(),这个方法和其他的区别不大，只是他支持解析0x等前缀的数字，而无须指定radix。

     * 
     * 
     * DecodableString:
     * 
Sign_opt DecimalNumeral
     * 
Sign_opt {@code 0x} HexDigits
     * 
Sign_opt {@code 0X} HexDigits
     * 
Sign_opt {@code #} HexDigits
     * 
Sign_opt {@code 0} OctalDigits
     *
     * 
Sign:
     * 
{@code -}
     * 
{@code +}
     * 
     *

其他方法

重写了equals方法，使得Integer的比较是内容的比较（值的比较）

总结

Integer封装int数据
赋值运算符底层调用的是valueOf方法，-128~127间的数据有缓存机制
核心方法是toString和parseInt，使用移位代替乘法运算，使用乘法代替除法运算，使用预定义数组提高效率，使用位与运算替代取模运算，这两个方法内部有很多高级而有趣的内容，值得再三推敲和学习。
使用负数来防止溢出。

再次膜拜jdk大神。

Integer

定义和属性

构造器

极值

整型转为字符串

字符串转int

其他方法

总结

你可能感兴趣的:(Integer)