Java String Source Code: Class Declaration and Constructors (Part 1)

1. The class declaration

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {...}

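The class is declared final, so it cannot be subclassed. It implements Serializable (instances can be serialized), Comparable<String> (strings can be ordered with compareTo), and CharSequence (a readable sequence of char values).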

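A variable that holds text can be declared as either CharSequence or String, but the two are different things. CharSequence is an interface that exposes only read access (length, charAt, subSequence, toString); whether the underlying characters can change depends on the implementing class. StringBuilder and StringBuffer are mutable implementations, while String is immutable.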
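To make the difference concrete, here is a minimal sketch. The class name CharSequenceDemo and the helper countUpper are made up for this illustration; only the JDK types they use are real.

```java
// Hypothetical demo class; not part of the JDK.
final class CharSequenceDemo {
    // Accepts any CharSequence: String, StringBuilder, StringBuffer, ...
    // Only the read-only methods length() and charAt() are needed.
    static int countUpper(CharSequence seq) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (Character.isUpperCase(seq.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "Hello World";
        StringBuilder sb = new StringBuilder("Hello World");
        sb.append(" Again"); // mutation comes from StringBuilder, not from CharSequence

        System.out.println(countUpper(s));   // 2
        System.out.println(countUpper(sb));  // 3
        // Comparable<String>: a positive result means s sorts after "Hello"
        System.out.println(s.compareTo("Hello") > 0);
    }
}
```

The same method works for both arguments because it depends only on the read side of the interface.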

2. The value array and the other fields

/** The value is used for character storage. */
    private final char value[];

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * Object Serialization Specification, Section 6.2, "Stream Elements"
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

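The value[] array stores the characters of the string.
The hash field caches the hash code; it defaults to 0.
The String class is special-cased in the serialization stream protocol.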
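The cached hash field can be illustrated with a small sketch. CachedHashText is a hypothetical class written for this post, not JDK code; it assumes the same "0 means not computed yet" convention as the field above and uses the 31-based polynomial documented for String.hashCode.

```java
// Hypothetical class illustrating lazy hash caching; not the JDK source.
final class CachedHashText {
    private final char[] value;
    private int hash; // defaults to 0, meaning "not computed yet"

    CachedHashText(char[] chars) {
        this.value = chars.clone(); // defensive copy keeps the content fixed
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c; // same polynomial that String.hashCode documents
            }
            hash = h; // later calls return the cached value
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHashText t = new CachedHashText("abc".toCharArray());
        System.out.println(t.hashCode() == "abc".hashCode()); // true
    }
}
```

Caching is safe only because the characters never change after construction. A string whose hash happens to be 0 is simply recomputed on every call.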

3. Constructors: the no-argument constructor and the String-argument constructor

/**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.  Note that use of this constructor is
     * unnecessary since Strings are immutable.
     */
    public String() {
        this.value = "".value;
    }

/**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

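The no-argument constructor creates an empty string. The String-argument constructor creates a copy of the given string by assigning its value and hash fields. Note that the array is not copied: because strings are immutable, both objects can safely share the same value array.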
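A short sketch of what these two constructors look like from the caller's side. The class name CopyConstructorDemo and its helper are invented for the example.

```java
// Hypothetical demo class; not part of the JDK.
final class CopyConstructorDemo {
    // Returns true when both references point to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String original = "abc";
        String copy = new String(original);

        System.out.println(copy.equals(original));      // true: same characters
        System.out.println(sameObject(copy, original)); // false: a distinct object
        System.out.println(new String().isEmpty());     // true: empty character sequence
    }
}
```

This is why the Javadoc calls both constructors unnecessary in most code: a literal or the original reference behaves the same as the copy.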

4. Constructor taking a char array

/**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
     *
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }

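When the argument is a char array, its contents are copied into value, so later changes to the caller's array do not affect the string.
Here is the implementation of Arrays.copyOf: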

 public static char[] copyOf(char[] original, int newLength) {
        char[] copy = new char[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }
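The effect of the copy can be checked with a small sketch. CharArrayCopyDemo and buildThenMutate are names made up for this example.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK.
final class CharArrayCopyDemo {
    // Builds a String from an array, then modifies the array afterwards.
    static String buildThenMutate() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);
        chars[0] = 'l'; // changes the caller's array only
        return s;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // java

        // Arrays.copyOf truncates or pads with '\u0000' depending on newLength.
        char[] shorter = Arrays.copyOf(new char[] {'a', 'b', 'c'}, 2);
        char[] longer = Arrays.copyOf(new char[] {'a', 'b', 'c'}, 5);
        System.out.println(shorter.length + " " + longer.length); // 2 5
    }
}
```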

5. Constructor taking a char array, an offset, and a count

/**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     *
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     */
    public String(char value[], int offset, int count) {
        // A negative offset is out of bounds: throw StringIndexOutOfBoundsException
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
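        // count (the number of chars to copy) <= 0 covers two cases:
        // 1. count < 0: throw StringIndexOutOfBoundsException
        // 2. count == 0 and offset <= value.length: share the empty string's array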
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
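        // If offset plus count runs past the end of the array, throw as well.
        // The check is written as a subtraction so that offset + count cannot overflow int.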
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        // All invalid cases are excluded; copy the requested range
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
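The bounds checks above can be exercised from the outside. The following is a sketch only; SubArrayConstructorDemo and its helpers are hypothetical names, and the last call shows why the check is written as value.length - count instead of offset + count.

```java
// Hypothetical demo class; not part of the JDK.
final class SubArrayConstructorDemo {
    static String slice(char[] src, int offset, int count) {
        return new String(src, offset, count);
    }

    // Returns true when the constructor rejects the given offset/count.
    static boolean throwsOutOfBounds(char[] src, int offset, int count) {
        try {
            new String(src, offset, count);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        char[] src = "hello world".toCharArray(); // length 11

        System.out.println(slice(src, 6, 5));                // world
        System.out.println(slice(src, 11, 0).isEmpty());     // true: count == 0 at the end
        System.out.println(throwsOutOfBounds(src, -1, 3));   // true: negative offset
        System.out.println(throwsOutOfBounds(src, 0, -1));   // true: negative count
        System.out.println(throwsOutOfBounds(src, 8, 5));    // true: 8 > 11 - 5

        // 1 + Integer.MAX_VALUE would wrap to a negative int, so a naive
        // "offset + count > length" test would wrongly accept this call.
        System.out.println(throwsOutOfBounds(src, 1, Integer.MAX_VALUE)); // true
    }
}
```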

This constructor uses Arrays.copyOfRange; here is its source:

 public static char[] copyOfRange(char[] original, int from, int to) {
        // The new array's length is to - from, which equals count in the constructor above
        int newLength = to - from;
        // A negative length means from > to: throw IllegalArgumentException
        if (newLength < 0)
            throw new IllegalArgumentException(from + " > " + to);
        // Allocate an array of that length, then copy with System.arraycopy
        char[] copy = new char[newLength];
        System.arraycopy(original, from, copy, 0,
                         Math.min(original.length - from, newLength));
        return copy;
    }
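A brief sketch of how copyOfRange behaves at its edges. CopyOfRangeDemo and its helper names are assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK.
final class CopyOfRangeDemo {
    // Copies the half-open range [from, to) of s into a new String.
    static String range(String s, int from, int to) {
        return new String(Arrays.copyOfRange(s.toCharArray(), from, to));
    }

    // Returns true when copyOfRange rejects from > to.
    static boolean rejectsReversedRange(char[] src, int from, int to) {
        try {
            Arrays.copyOfRange(src, from, to);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(range("hello world", 6, 11)); // world

        // "to" may exceed the source length; the extra slots stay '\u0000'.
        char[] padded = Arrays.copyOfRange("abc".toCharArray(), 1, 5);
        System.out.println(padded.length); // 4

        System.out.println(rejectsReversedRange("abc".toCharArray(), 2, 1)); // true
    }
}
```

The String constructor never reaches the padding case, because it has already verified that offset + count fits inside the array.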

While we are here, look at the System.arraycopy method:

public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

The native modifier means the method has no Java body here; the implementation is provided by the JVM in non-Java code.
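Calling System.arraycopy directly looks like this. ArrayCopyDemo and copyRange are hypothetical names used only for this sketch.

```java
// Hypothetical demo class; not part of the JDK.
final class ArrayCopyDemo {
    // Copies length chars of src, starting at index from, into a fresh array.
    static char[] copyRange(char[] src, int from, int length) {
        char[] dest = new char[length];
        // Arguments: source, source start, destination, destination start, element count
        System.arraycopy(src, from, dest, 0, length);
        return dest;
    }

    public static void main(String[] args) {
        char[] src = "abcdef".toCharArray();
        System.out.println(new String(copyRange(src, 2, 3))); // cde
    }
}
```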

6. Constructor taking int[] codePoints, int offset, int count

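// Note that the first parameter is an int array of Unicode code points
// (the numeric values assigned to characters); the constructor converts them to chars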
public String(int[] codePoints, int offset, int count) {
        // Same validation of offset and count as in the char[] constructor
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        // end is the exclusive end index of the range to convert
        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
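        // Step 1: compute the length of the new char array. A code point outside the BMP
        // (above U+FFFF) does not fit in one char and takes two (a surrogate pair),
        // so n grows by one for each such code point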
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        // Step 2: fill the new array
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                // BMP code point: it fits in a single char
                v[j] = (char)c;
            else
                // Supplementary code point: write it as a surrogate pair; j++ accounts for the second char
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
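The two passes can be observed through the public API. This is a sketch under the assumption that U+1F600 is used as the sample supplementary code point; CodePointConstructorDemo and its helpers are invented names.

```java
// Hypothetical demo class; not part of the JDK.
final class CodePointConstructorDemo {
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Returns true when the constructor rejects the code point as invalid.
    static boolean rejects(int codePoint) {
        try {
            fromCodePoints(codePoint);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // U+0041 and U+4E2D are in the BMP; U+1F600 is supplementary.
        String s = fromCodePoints(0x41, 0x1F600, 0x4E2D);

        System.out.println(s.length());                      // 4 chars
        System.out.println(s.codePointCount(0, s.length())); // 3 code points
        System.out.println(rejects(0x110000));               // true: above U+10FFFF
    }
}
```

Three code points produce four chars because pass 1 adds one extra slot for the supplementary code point.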

The source of Character.toSurrogates and the two methods it calls:

 static void toSurrogates(int codePoint, char[] dst, int index) {
        // We write elements "backwards" to guarantee all-or-nothing
        dst[index+1] = lowSurrogate(codePoint);
        dst[index] = highSurrogate(codePoint);
    }


 public static char lowSurrogate(int codePoint) {
        return (char) ((codePoint & 0x3ff) + MIN_LOW_SURROGATE);
    }

public static char highSurrogate(int codePoint) {
        return (char) ((codePoint >>> 10)
            + (MIN_HIGH_SURROGATE - (MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
    }
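The arithmetic can be followed with one concrete value. In this sketch the constants 0xD800, 0xDC00 and 0x10000 are written out in place of MIN_HIGH_SURROGATE, MIN_LOW_SURROGATE and MIN_SUPPLEMENTARY_CODE_POINT, and SurrogateMathDemo is a hypothetical class. Character.toSurrogates itself is not public, so the comparison uses the public Character.highSurrogate, Character.lowSurrogate and Character.toCodePoint.

```java
// Hypothetical demo class; not part of the JDK.
final class SurrogateMathDemo {
    // Same formula as highSurrogate above, with the constants written out.
    static char high(int codePoint) {
        return (char) ((codePoint >>> 10) + (0xD800 - (0x10000 >>> 10)));
    }

    // Same formula as lowSurrogate above: the low 10 bits plus 0xDC00.
    static char low(int codePoint) {
        return (char) ((codePoint & 0x3ff) + 0xDC00);
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        char hi = high(cp);
        char lo = low(cp);

        System.out.println(Integer.toHexString(hi)); // d83d
        System.out.println(Integer.toHexString(lo)); // de00
        // Combining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(hi, lo) == cp); // true
    }
}
```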

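To understand this constructor you need some background on Unicode encoding. See the next post for an introduction to Unicode and to the two methods Character.isBmpCodePoint and Character.isValidCodePoint.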
