2018.3.21 JDK Source Code Analysis: String

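A few words up front

Recently some people have been asking me questions about String, StringBuffer, and StringBuilder. I think the topic deserves a closer look, so I am starting this series to take a first, rough tour of how the JDK source code designs these three classes (based on JDK 1.8).

A string is simply a sequence of characters, so we can think of a string as an array of characters.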

Looking at the class declarations

Declaration of the String class (since JDK 1.0)

public final class String extends Object implements java.io.Serializable, Comparable<String>, CharSequence

Declaration of StringBuilder (since JDK 1.5)

public final class StringBuilder extends Object implements Serializable, CharSequence
public final class StringBuilder extends AbstractStringBuilder implements java.io.Serializable, CharSequence

Declaration of StringBuffer (since JDK 1.0)

public final class StringBuffer extends Object implements Serializable, CharSequence
public final class StringBuffer extends AbstractStringBuilder implements java.io.Serializable, CharSequence

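Note that the declaration shown in the documentation may differ from the one in the source code (for StringBuilder and StringBuffer above, the first line is the documentation form and the second is the source form), but this does not affect our analysis of the code structure. From the documentation or the source comments we can see that StringBuffer and String have existed since JDK 1.0, while StringBuilder was only added in JDK 1.5.

AbstractStringBuilder is an abstract class that was also added in JDK 1.5. Apart from toString(), it declares no abstract methods.

So what is the point of an abstract class with almost no abstract methods?

  • It prevents the class from being instantiated directly;
  • It can serve as an adapter (the adapter design pattern);
  • It gives several subclasses one shared implementation, so they do not each have to implement the same methods, which removes duplicated code;

Abstract classes have one more, very common use: the template method design pattern.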
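To make the points above concrete, here is a minimal sketch of an abstract base class that supplies shared behavior and leaves only toString() abstract, loosely echoing how AbstractStringBuilder relates to StringBuilder and StringBuffer. The names AbstractCharBox, PlainBox, SyncBox, and AbstractBoxDemo are hypothetical and are not JDK classes; the real JDK implementation is considerably more involved.

```java
import java.util.Arrays;

// Hypothetical classes for illustration only; none of them exist in the JDK.
abstract class AbstractCharBox {
    protected char[] value = new char[4];
    protected int count;

    // Shared implementation that every subclass inherits.
    AbstractCharBox append(char c) {
        if (count == value.length) {
            value = Arrays.copyOf(value, value.length * 2); // grow when full
        }
        value[count++] = c;
        return this;
    }

    int length() {
        return count;
    }

    // The only abstract method, echoing AbstractStringBuilder.toString().
    @Override
    public abstract String toString();
}

// Unsynchronized variant: inherits append() unchanged.
final class PlainBox extends AbstractCharBox {
    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}

// Synchronized variant: reuses the inherited logic under a lock.
final class SyncBox extends AbstractCharBox {
    @Override
    synchronized AbstractCharBox append(char c) {
        return super.append(c);
    }

    @Override
    public synchronized String toString() {
        return new String(value, 0, count);
    }
}

class AbstractBoxDemo {
    public static void main(String[] args) {
        // new AbstractCharBox() would be rejected by the compiler.
        AbstractCharBox box = new PlainBox().append('o').append('k');
        System.out.println(box + " " + box.length());
    }
}
```

Both subclasses get append() and length() for free; the only thing each one must write itself is toString().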

String source code analysis

Characteristics of String

1. Immutability

 /** The value is used for character storage. */
    private final char value[];

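We can see that String stores its content in a char array, and that the array field is declared final, so once the field has been assigned it cannot be pointed at a different array. Note that final alone does not protect the array's elements. String is immutable because the field is also private, no public method modifies the array, and the public constructors copy any array passed in.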
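A short sketch of the difference between a final array reference and real immutability. FinalArrayDemo and its method names are illustrative only; the one JDK behavior relied on is that String(char[]) copies its argument, as the constructor source later in this article shows.

```java
class FinalArrayDemo {
    // final fixes the reference only; the elements can still be overwritten.
    static String mutateFinalArray() {
        final char[] letters = {'a', 'b', 'c'};
        letters[0] = 'z';          // legal
        // letters = new char[3];  // would not compile: cannot reassign a final variable
        return new String(letters);
    }

    // String(char[]) copies the array, so later changes to the source are not visible.
    static String afterSourceMutation() {
        char[] source = {'c', 'a', 't'};
        String s = new String(source);
        source[0] = 'b';           // changes the caller's array, not the String
        return s;
    }

    public static void main(String[] args) {
        System.out.println(mutateFinalArray());    // zbc
        System.out.println(afterSourceMutation()); // cat
    }
}
```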

    /** Cache the hash code for the string */
    private int hash; // Default to 0

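The hash field caches the string's hash code. Its default value is 0; hashCode() computes the value the first time it is needed and stores it here.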
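The caching idea can be sketched as follows. CachedHashText is a hypothetical class written for this article, not JDK code; it uses the polynomial that the String.hashCode() documentation specifies, s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], and treats 0 as "not computed yet".

```java
// Hypothetical class for illustration; it only mimics the caching pattern.
final class CachedHashText {
    private final char[] value;
    private int hash; // 0 means "not computed yet"

    CachedHashText(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c;   // same polynomial as documented for String.hashCode()
            }
            hash = h;             // cache for later calls
        }
        return h;
    }
}

class HashDemo {
    public static void main(String[] args) {
        CachedHashText text = new CachedHashText("abc");
        System.out.println(text.hashCode() == "abc".hashCode());
    }
}
```

One consequence of using 0 as the marker is that a string whose hash really is 0 gets recomputed on every call, which is harmless but not free.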

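2. Cannot be subclassed

This property is the same for String, StringBuffer, and StringBuilder, because all three classes are declared with the final keyword.

Summary: what does the final keyword do?

  • A final class cannot be subclassed (so none of its methods can be overridden), which prevents any subclass from changing its meaning or implementation;
  • It defines constants: a final member variable can be assigned only once, and its value does not change afterwards;
  • Efficiency: the compiler may inline calls to final methods (modern JIT compilers can inline non-final methods too, so in practice this benefit is small);
  • Up to JDK 7, a local variable used inside an anonymous inner class had to be declared final; since JDK 8 it only has to be effectively final;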
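The last point can be illustrated with a small sketch that needs Java 8 or later. FinalKeywordDemo is an illustrative name; IntSupplier is the standard functional interface from java.util.function.

```java
import java.util.function.IntSupplier;

class FinalKeywordDemo {
    static int runWithAnonymousClass() {
        final int base = 40; // explicit final: the only accepted form up to JDK 7
        int bonus = 2;       // never reassigned, so it is effectively final (JDK 8+)

        IntSupplier supplier = new IntSupplier() {
            @Override
            public int getAsInt() {
                return base + bonus; // both locals are captured by the anonymous class
            }
        };
        // bonus++;  // uncommenting this would make the capture above a compile error
        return supplier.getAsInt();
    }

    public static void main(String[] args) {
        System.out.println(runWithAnonymousClass()); // 42
    }
}
```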

3. Serializable

String, StringBuffer, and StringBuilder all implement the java.io.Serializable interface, which marks them as capable of being serialized;

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

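serialVersionUID keeps serialized data compatible across versions of a class. During deserialization the UID recorded in the stream is compared with the UID of the local class; if they differ, InvalidClassException is thrown. Keeping the UID unchanged therefore lets data written by an older version of the class still be read after an upgrade.

There are two ways to generate a serialVersionUID:

  • One is the default 1L;
  • The other is a 64-bit hash generated from the class name, interface names, member methods, fields, and so on;

If you implement java.io.Serializable without declaring a serialVersionUID, Eclipse shows a yellow warning marker saying that no serialVersionUID has been declared, and offers both of these options as quick fixes. The warning can also be suppressed with the @SuppressWarnings("serial") annotation.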
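As a minimal sketch of the above, the following round-trips an object through Java serialization in memory and reads back the declared UID. Coordinate and SerialDemo are hypothetical names for this example; the stream classes and ObjectStreamClass.lookup() are standard java.io APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical serializable class with an explicitly declared UID.
class Coordinate implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    Coordinate(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class SerialDemo {
    // Serializes the object to a byte array and reads it back.
    static Coordinate roundTrip(Coordinate c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Coordinate) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the UID that the serialization runtime associates with the class.
    static long declaredUid() {
        return ObjectStreamClass.lookup(Coordinate.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        Coordinate copy = roundTrip(new Coordinate(3, 4));
        System.out.println(copy.x + "," + copy.y + " uid=" + declaredUid());
    }
}
```

If the UID of Coordinate were changed between writing and reading, readObject() would fail with InvalidClassException.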

The String constructors

To keep things simple, I will write my notes directly as comments in the code below.

 /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.  Note that use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * Creates a String object that represents the empty string. Note: using this constructor is unnecessary because strings are immutable.
    */
    public String() {
        this.value = "".value;
    }

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     *
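     * Creates a new string with the same content as the original. Unless an explicit copy of the original is needed, this constructor is unnecessary.
     * (As the body shows, the new object reuses original's value array and cached hash; only the String object itself is new.)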
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

    /**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
     *
     * @param  value
     *         The initial value of the string
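     * Allocates a String that represents all the characters currently in the char array. The array contents are copied, so later changes to the char array do not affect the created string.
     * This constructor uses Arrays.copyOf() to copy the array.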
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     *
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     *          
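     *  Creates a string from the subarray of the char array that starts at offset and has length count.
     *  Because an offset and a subarray length are involved, the indexes may fall outside the array.
     *  To distinguish this from an ordinary index-out-of-bounds error, StringIndexOutOfBoundsException is thrown:
     *  1. offset is less than 0: out of bounds;
     *  2. count is less than 0: out of bounds;
     *  3. count == 0 and offset <= value.length: an empty string is created;
     *  4. offset is greater than value.length - count: also out of bounds.
     *  Finally, Arrays.copyOfRange() copies the requested range.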
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the Unicode code point array
     * argument.  The {@code offset} argument is the index of the first code
     * point of the subarray and the {@code count} argument specifies the
     * length of the subarray.  The contents of the subarray are converted to
     * {@code char}s; subsequent modification of the {@code int} array does not
     * affect the newly created string.
     *
     * @param  codePoints
     *         Array that is the source of Unicode code points
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IllegalArgumentException
     *          If any invalid Unicode code point is found in {@code
     *          codePoints}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code codePoints} array
     *
     * @since  1.5
     * 
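     * Creates a string from a subarray of an int array. This is similar to the constructor that takes
     * a subarray of a char array, and the indexes must be bounds-checked in the same way.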
     */
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        // end = offset + count is the exclusive end index (one past the last element to read)
        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
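        // Loop over the requested elements and classify each code point.
        // Character.isBmpCodePoint() checks whether the code point lies in the BMP (U+0000 to U+FFFF);
        // a BMP code point fits in a single char (2 bytes).
        // A valid Unicode code point is either a BMP code point or a supplementary character.
        // Character.isValidCodePoint() checks whether the value is a valid Unicode code point.
        // A valid code point outside the BMP is a supplementary character and needs two chars (a surrogate pair), so the length grows by 1;
        // any other value is not a valid code point, and IllegalArgumentException is thrown.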
        
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
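        // Create an array of length n to hold the result.
        // i starts at the offset into the source array and j is the index into the destination array; a BMP code point is stored directly.
        // Otherwise Character.toSurrogates() is called. I could not find it in the documentation; the Character source shows it has default (package-private) access.
        // That method computes the high surrogate and the low surrogate of the code point:
        // low surrogate:  (char) ((codePoint & 0x3ff) + MIN_LOW_SURROGATE);
        // high surrogate: (char) ((codePoint >>> 10) + (MIN_HIGH_SURROGATE - (MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
        // Java strings use UTF-16: a char is a 16-bit code unit, so there are 65536 possible values.
        // Unicode defines more than 65536 code points, so the Unicode standard reserves 2048 of those values as surrogates, split into high surrogates and low surrogates:
        // '\uDC00' to '\uDFFF' are low surrogates;
        // '\uD800' to '\uDBFF' are high surrogates.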
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
   
    /* Common private utility method used to bounds check the byte array
     * and requested offset & length values used by the String(byte[],..)
     * constructors
     * Bounds check for the bytes array, similar to the bounds checks explained above.
     */
    private static void checkBounds(byte[] bytes, int offset, int length) {
        if (length < 0)
            throw new StringIndexOutOfBoundsException(length);
        if (offset < 0)
            throw new StringIndexOutOfBoundsException(offset);
        if (offset > bytes.length - length)
            throw new StringIndexOutOfBoundsException(offset + length);
    }
    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified charset.  The length of the new {@code String}
     * is a function of the charset, and hence may not be equal to the length
     * of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     *
     * Creates a String by decoding the subarray of bytes that starts at offset and has the given length, using the charset named charsetName.
     * First, if charsetName is null, a NullPointerException is thrown;
     * then the bounds are checked;
     * finally StringCoding.decode() decodes the subarray and the result is stored in this.value.
     */
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the subarray.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  1.6
     *
     * This constructor is like the previous one, except that the charset is passed as a java.nio.charset.Charset instead of a String name.
     */
    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charset, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the specified {@linkplain java.nio.charset.Charset charset}.  The
     * length of the new {@code String} is a function of the charset, and hence
     * may not be equal to the length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     *
     * Creates a String by decoding the whole byte array with the named charset.
     */
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the platform's default charset.  The length of the new
     * {@code String} is a function of the charset, and hence may not be equal
     * to the length of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and the {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     *
     * Creates a string from a subarray of bytes, decoding with the default charset.
     * The default charset is resolved as follows:
     * 1. Charset.defaultCharset().name() is used;
     * 2. if that charset cannot be used, ISO-8859-1 is the fallback.
     * The source of Charset.defaultCharset() shows that it looks up the file.encoding system property.
     * That property is a JVM setting that normally comes from the platform locale or from -Dfile.encoding;
     * it is not the encoding of the source file that contains the main method.
     */
    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset.  The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @since  JDK1.1
     *
     * Converts a byte array to a String by delegating to the constructor directly above.
     */
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string buffer argument. The contents of the
     * string buffer are copied; subsequent modification of the string buffer
     * does not affect the newly created string.
     *
     * @param  buffer
     *         A {@code StringBuffer}
     *
     * Creates a String from a StringBuffer. The new String holds a copy of the StringBuffer's content,
     * so later changes to the StringBuffer do not affect the new String.
     */
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string builder argument. The contents of the
     * string builder are copied; subsequent modification of the string builder
     * does not affect the newly created string.
     *
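     * <p> This constructor is provided to ease migration to {@code
     * StringBuilder}. Obtaining a string from a string builder via the {@code
     * toString} method is likely to run faster and is generally preferred.
     *
     * @param   builder
     *          A {@code StringBuilder}
     *
     * @since  1.5
     *
     * Creates a String from a StringBuilder. The new String holds a copy of the StringBuilder's content,
     * so later changes to the StringBuilder do not affect the new String.
     */
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

    /*
    * Package private constructor which shares value array for speed.
    * this constructor is always expected to be called with share==true.
    * a separate constructor is needed because we already have a public
    * String(char[]) constructor that makes a copy of the given char[].
    *
    * Package-private constructor. Compared with the existing String(char[]) constructor it takes an extra share parameter.
    * At present the parameter does not support false and must be true, so it clearly exists only to give the constructor a distinct overload signature.
    * String(char[]) uses Arrays.copyOf() to obtain a copy of the original array,
    * whereas this constructor assigns this.value = value directly, so the new String shares the very array that was passed in.
    * What are the benefits?
    * 1. Sharing the array saves memory.
    * 2. It is faster: a direct assignment (just pointing the reference at the array) instead of copying element by element.
    * The constructor has default access, visible inside the package rather than public. Making it public would break string immutability.
    * Before JDK 7 many methods relied on this kind of array sharing,
    * for example substring, replace, concat, and valueOf.
    * Since JDK 7 (update 6), substring no longer shares the array, because sharing there can cause memory leaks or degraded performance.
    * Consider the following scenario:
    * String someString = "...a very long string...";
    * String aPartString = someString.substring(20, 40);
    * return aPartString;
    * Here someString may be a long string read from a file, a database, or the network; this is common in web scraping (crawlers) and log analysis.
    * someString is only temporary and aPartString is what is really needed, but with sharing its internal array is the one owned by someString.
    * The someString object can be garbage collected, but its internal array cannot, because aPartString still references it.
    * That is a memory leak, and if it happens often enough it can lead to out-of-memory errors or degraded performance.
    */
    String(char[] value, boolean share) {
        // assert share : "unshared not supported";
        this.value = value;
    }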
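The trade-off between copying and sharing the backing array can be sketched with a small stand-in class. Text, its factory methods copying() and sharing(), and SharingDemo are hypothetical names; the package-private String(char[], boolean) constructor itself cannot be called from application code, so this only imitates the idea.

```java
import java.util.Arrays;

// Hypothetical stand-in for String, used only to contrast copying with sharing.
final class Text {
    private final char[] value;

    private Text(char[] value) {
        this.value = value;
    }

    // Like the public String(char[]): takes a private copy.
    static Text copying(char[] source) {
        return new Text(Arrays.copyOf(source, source.length));
    }

    // Like String(char[], boolean share): keeps the caller's array, no copy.
    static Text sharing(char[] source) {
        return new Text(source);
    }

    @Override
    public String toString() {
        return new String(value);
    }
}

class SharingDemo {
    public static void main(String[] args) {
        char[] buffer = {'a', 'b'};
        Text copied = Text.copying(buffer);
        Text shared = Text.sharing(buffer);

        buffer[0] = 'x'; // the caller still holds the array

        System.out.println(copied); // ab  (unaffected)
        System.out.println(shared); // xb  (content changed behind its back)
    }
}
```

This is why the sharing constructor stays package-private: it is only safe when the caller is trusted never to touch the array again.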

Note: the source is long and the article has a length limit, so methods marked as obsolete with the @Deprecated annotation are not covered here.
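The constructors walked through above can be exercised from the caller's side as follows. This is a small sketch with illustrative names (ConstructorDemo and its methods); the code point U+1F600 and the byte sequence for U+4E2D are sample values chosen for this example.

```java
import java.nio.charset.StandardCharsets;

class ConstructorDemo {
    // String(int[], int, int): a supplementary code point becomes two chars.
    static String fromCodePoints() {
        int[] codePoints = {0x41, 0x1F600, 0x42}; // 'A', U+1F600, 'B'
        return new String(codePoints, 0, codePoints.length);
    }

    // String(char[], int, int): offset > value.length - count is rejected.
    static boolean rejectsBadRange() {
        try {
            new String(new char[] {'a', 'b', 'c'}, 2, 5);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // String(byte[], Charset): three UTF-8 bytes decode to one char.
    static String decodeUtf8() {
        byte[] utf8 = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD}; // U+4E2D in UTF-8
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = fromCodePoints();
        System.out.println(s.length());                      // 4 chars
        System.out.println(s.codePointCount(0, s.length())); // 3 code points
        System.out.println(rejectsBadRange());               // true
        System.out.println(decodeUtf8().length());           // 1
    }
}
```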

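Other methods

The String class provides many commonly used methods, such as split and replace. Some of them involve the regular expression classes and the string formatting classes, which I do not explain in detail here. I may come back to them in a separate analysis of the regular expression implementation and show how Java performs regex matching.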
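Before reading the source of the individual methods, it may help to see how a few of them behave from the caller's side. The sketch below uses only public String methods; StringMethodsDemo, its method names, and the sample text are made up for illustration.

```java
class StringMethodsDemo {
    // 'a', U+1F600 written as a surrogate pair, 'b'
    static final String TEXT = "a\uD83D\uDE00b";

    static int codeUnits() {
        return TEXT.length();                        // counts char values
    }

    static int codePoints() {
        return TEXT.codePointCount(0, TEXT.length()); // counts code points
    }

    static int codePointAtOne() {
        return TEXT.codePointAt(1);                  // combines the surrogate pair
    }

    static boolean sameContent() {
        // contentEquals compares characters against any CharSequence
        return "abc".contentEquals(new StringBuilder("abc"));
    }

    static boolean equalsBuilder() {
        // equals is true only for another String with the same characters
        return "abc".equals(new StringBuilder("abc"));
    }

    static String copyRange() {
        char[] dst = new char[5];
        "hello world".getChars(6, 11, dst, 0);       // copies indexes 6..10
        return new String(dst);
    }

    public static void main(String[] args) {
        System.out.println(codeUnits() + " " + codePoints());
        System.out.println(Integer.toHexString(codePointAtOne()));
        System.out.println(sameContent() + " " + equalsBuilder());
        System.out.println(copyRange());
    }
}
```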

 /**
     * Returns the length of this string.
     * The length is equal to the number of Unicode
     * code units in the string.
     *
     * @return  the length of the sequence of characters represented by this
     *          object.
     * Returns the length of the string, which is simply the length of the char array (nothing difficult here).
     */
    public int length() {
        return value.length;
    }

    /**
     * Returns {@code true} if, and only if, {@link #length()} is {@code 0}.
     *
     * @return {@code true} if {@link #length()} is {@code 0}, otherwise
     * {@code false}
     *
     * @since 1.6
     * Tells whether this object is the empty string (empty if the array length is 0, non-empty otherwise).
     */
    public boolean isEmpty() {
        return value.length == 0;
    }

    /**
     * Returns the {@code char} value at the
     * specified index. An index ranges from {@code 0} to
     * {@code length() - 1}. The first {@code char} value of the sequence
     * is at index {@code 0}, the next at index {@code 1},
     * and so on, as for array indexing.
     *
     * <p>If the {@code char} value specified by the index is a
     * surrogate, the surrogate
     * value is returned.
     *
     * @param      index   the index of the {@code char} value.
     * @return     the {@code char} value at the specified index of this string.
     *             The first {@code char} value is at index {@code 0}.
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     *
     * Returns the char at the given index. Because an index is supplied, it may be out of bounds,
     * so the method first checks: if the index is negative, or greater than or equal to the array length,
     * StringIndexOutOfBoundsException is thrown; otherwise the char at that index is returned.
     */
    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

    /**
     * Returns the character (Unicode code point) at the specified
     * index. The index refers to {@code char} values
     * (Unicode code units) and ranges from {@code 0} to
     * {@link #length()}{@code  - 1}.
     *
     * <p> If the {@code char} value specified at the given index
     * is in the high-surrogate range, the following index is less
     * than the length of this {@code String}, and the
     * {@code char} value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the {@code char} value at the given index is returned.
     *
     * @param      index the index to the {@code char} values
     * @return     the code point value of the character at the
     *             {@code index}
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     * @since      1.5
     *
     * Returns the code point at the given index (the index is bounds-checked in the same way).
     * If c1 = value[index] is a high surrogate and ++index is less than the array length, the next element c2 is read and tested with isLowSurrogate.
     * If it is a low surrogate, the code point is computed from c1 and c2; otherwise c1 itself is returned as the code point.
     */
    public int codePointAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAtImpl(value, index, value.length);
    }

    /**
     * Returns the character (Unicode code point) before the specified
     * index. The index refers to {@code char} values
     * (Unicode code units) and ranges from {@code 1} to {@link
     * CharSequence#length() length}.
     *
     * <p> If the {@code char} value at {@code (index - 1)}
     * is in the low-surrogate range, {@code (index - 2)} is not
     * negative, and the {@code char} value at {@code (index -
     * 2)} is in the high-surrogate range, then the
     * supplementary code point value of the surrogate pair is
     * returned. If the {@code char} value at {@code index -
     * 1} is an unpaired low-surrogate or a high-surrogate, the
     * surrogate value is returned.
     *
     * @param     index the index following the code point that should be returned
     * @return    the Unicode code point value before the given index.
     * @exception IndexOutOfBoundsException if the {@code index}
     *            argument is less than 1 or greater than the length
     *            of this string.
     * @since     1.5
     *
     * Returns the code point of the character before the given index. If index - 1 is less than 0,
     * or greater than or equal to the array length, StringIndexOutOfBoundsException is thrown.
     * If the char at index - 1 is a low surrogate, index - 2 is not negative, and the char at index - 2 is a high surrogate,
     * the code point formed by the high and low surrogates is returned; otherwise the char value at index - 1 is returned.
     */
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBeforeImpl(value, index, 0);
    }

    /**
     * Returns the number of Unicode code points in the specified text
     * range of this {@code String}. The text range begins at the
     * specified {@code beginIndex} and extends to the
     * {@code char} at index {@code endIndex - 1}. Thus the
     * length (in {@code char}s) of the text range is
     * {@code endIndex-beginIndex}. Unpaired surrogates within
     * the text range count as one code point each.
     *
     * @param beginIndex the index to the first {@code char} of
     * the text range.
     * @param endIndex the index after the last {@code char} of
     * the text range.
     * @return the number of Unicode code points in the specified text
     * range
     * @exception IndexOutOfBoundsException if the
     * {@code beginIndex} is negative, or {@code endIndex}
     * is larger than the length of this {@code String}, or
     * {@code beginIndex} is larger than {@code endIndex}.
     * @since  1.5
     */
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
    }

    /**
     * Returns the index within this {@code String} that is
     * offset from the given {@code index} by
     * {@code codePointOffset} code points. Unpaired surrogates
     * within the text range given by {@code index} and
     * {@code codePointOffset} count as one code point each.
     *
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within this {@code String}
     * @exception IndexOutOfBoundsException if {@code index}
     *   is negative or larger then the length of this
     *   {@code String}, or if {@code codePointOffset} is positive
     *   and the substring starting with {@code index} has fewer
     *   than {@code codePointOffset} code points,
     *   or if {@code codePointOffset} is negative and the substring
     *   before {@code index} has fewer than the absolute value
     *   of {@code codePointOffset} code points.
     * @since 1.5
     */
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > value.length) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, value.length,
                index, codePointOffset);
    }

    /**
     * Copy characters from this string into dst starting at dstBegin.
     * This method doesn't perform any range checking.
     *
     * Copies all characters of this string, starting at position 0, into the destination char array starting at dstBegin; value.length characters are copied.
     * This method performs no range checking and has default (package-private) access.
     */
    void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }

    /**
     * Copies characters from this string into the destination character
     * array.
     * <p>
     * The first character to be copied is at index {@code srcBegin};
     * the last character to be copied is at index {@code srcEnd-1}
     * (thus the total number of characters to be copied is
     * {@code srcEnd-srcBegin}). The characters are copied into the
     * subarray of {@code dst} starting at index {@code dstBegin}
     * and ending at index:
     * <blockquote><pre>
     *     dstBegin + (srcEnd-srcBegin) - 1
     * </pre></blockquote>
     *
     * @param      srcBegin   index of the first character in the string
     *                        to copy.
     * @param      srcEnd     index after the last character in the string
     *                        to copy.
     * @param      dst        the destination array.
     * @param      dstBegin   the start offset in the destination array.
     * @exception IndexOutOfBoundsException If any of the following
     *            is true:
     *            <ul><li>{@code srcBegin} is negative.
     *            <li>{@code srcBegin} is greater than {@code srcEnd}
     *            <li>{@code srcEnd} is greater than the length of this
     *                string
     *            <li>{@code dstBegin} is negative
     *            <li>{@code dstBegin+(srcEnd-srcBegin)} is larger than
     *                {@code dst.length}</ul>
     *
     * Copies the characters from srcBegin (inclusive) to srcEnd (exclusive) of this string into the dst char array starting at dstBegin; srcEnd - srcBegin characters are copied.
     * This method performs range checks:
     * 1. srcBegin is negative;
     * 2. srcEnd is greater than the string length;
     * 3. srcBegin > srcEnd.
     * Each of these cases throws StringIndexOutOfBoundsException.
     */
    public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
        if (srcBegin < 0) {
            throw new StringIndexOutOfBoundsException(srcBegin);
        }
        if (srcEnd > value.length) {
            throw new StringIndexOutOfBoundsException(srcEnd);
        }
        if (srcBegin > srcEnd) {
            throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the named
     * charset, storing the result into a new byte array.
     *
     * <p> The behavior of this method when this string cannot be encoded in
     * the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetEncoder} class should be used when more control
     * over the encoding process is required.
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @return  The resultant byte array
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     *
     * Encodes the string with the named charset and returns the resulting byte array. If the charset name is null, a NullPointerException is thrown.
     */
    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the given
     * {@linkplain java.nio.charset.Charset charset}, storing the result into a
     * new byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement byte array.  The
     * {@link java.nio.charset.CharsetEncoder} class should be used when more
     * control over the encoding process is required.
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset} to be used to encode
     *         the {@code String}
     *
     * @return  The resultant byte array
     *
     * @since  1.6
     *
     * Likewise, returns the byte array encoded with the given charset. If charset is null, a NullPointerException is thrown.
     */
    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the
     * platform's default charset, storing the result into a new byte array.
     *
     * <p> The behavior of this method when this string cannot be encoded in
     * the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetEncoder} class should be used when more control
     * over the encoding process is required.
     *
     * @return  The resultant byte array
     *
     * @since      JDK1.1
     *
     * Returns the byte array encoded with the default charset (analyzed earlier).
     */
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }

    /**
     * Compares this string to the specified object.  The result is {@code
     * true} if and only if the argument is not {@code null} and is a {@code
     * String} object that represents the same sequence of characters as this
     * object.
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     *
     * Overrides Object.equals():
     * 1. if both references point to the same object, return true;
     * 2. if anObject is a String, compare the lengths first, then compare the contents in a loop; return false on the first difference and true if everything matches;
     * 3. if anObject is not a String, return false.
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

    /**
     * Compares this string to the specified {@code StringBuffer}.  The result
     * is {@code true} if and only if this {@code String} represents the same
     * sequence of characters as the specified {@code StringBuffer}. This method
     * synchronizes on the {@code StringBuffer}.
     *
     * @param  sb
     *         The {@code StringBuffer} to compare this {@code String} against
     *
     * @return  {@code true} if this {@code String} represents the same
     *          sequence of characters as the specified {@code StringBuffer},
     *          {@code false} otherwise
     *
     * @since  1.4
     *
     * Checks whether this string has the same content as a StringBuffer. As the class declaration shows,
     * StringBuffer implements CharSequence, so the argument is simply upcast and passed to contentEquals(CharSequence).
     */
    public boolean contentEquals(StringBuffer sb) {
        return contentEquals((CharSequence)sb);
    }

    /**
     * Unsynchronized content comparison. The parameter type is AbstractStringBuilder, which tells us the method is meant for use inside the class.
     * AbstractStringBuilder is the parent class of StringBuffer and StringBuilder,
     * so this method serves contentEquals(CharSequence) when the argument is a StringBuffer or a StringBuilder.
     * The comparison works as follows (assuming the call is str.nonSyncContentEquals(sb)):
     * 1. first compare the length of str with the length of sb; if they differ, return false;
     * 2. otherwise loop over the characters of str and sb; if any pair differs, return false;
     * 3. otherwise return true.
     */
    private boolean nonSyncContentEquals(AbstractStringBuilder sb) {
        char v1[] = value;
        char v2[] = sb.getValue();
        int n = v1.length;
        if (n != sb.length()) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (v1[i] != v2[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Compares this string to the specified {@code CharSequence}.  The
     * result is {@code true} if and only if this {@code String} represents the
     * same sequence of char values as the specified sequence. Note that if the
     * {@code CharSequence} is a {@code StringBuffer} then the method
     * synchronizes on it.
     *
     * @param  cs
     *         The sequence to compare this {@code String} against
     *
     * @return  {@code true} if this {@code String} represents the same
     *          sequence of char values as the specified sequence, {@code
     *          false} otherwise
     *
     * @since  1.5
     *
     * Compares this String with a CharSequence for equal content.
     * From the class declarations we can see that String, StringBuffer, and StringBuilder all implement CharSequence. The method works as follows:
     * 1. If cs is an AbstractStringBuilder, check whether it is a StringBuffer:
     *    if so, call nonSyncContentEquals inside a synchronized block on cs;
     *    if not, call nonSyncContentEquals directly.
     * 2. If cs is not an AbstractStringBuilder but is a String, call equals(Object) directly.
     * 3. If cs is neither an AbstractStringBuilder nor a String, it is some other CharSequence, so compare the lengths and then the contents directly;
     *    the logic is the same as in nonSyncContentEquals.
     */
    public boolean contentEquals(CharSequence cs) {
        // Argument is a StringBuffer, StringBuilder
        if (cs instanceof AbstractStringBuilder) {
            if (cs instanceof StringBuffer) {
                synchronized(cs) {
                    return nonSyncContentEquals((AbstractStringBuilder)cs);
                }
            } else {
                return nonSyncContentEquals((AbstractStringBuilder)cs);
            }
        }
        // Argument is a String
        if (cs instanceof String) {
            return equals(cs);
        }
        // Argument is a generic CharSequence
        char v1[] = value;
        int n = v1.length;
        if (n != cs.length()) {
            return false;
        }
        for (int i = 0; i < n; i++) {
            if (v1[i] != cs.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Compares this {@code String} to another {@code String}, ignoring case
     * considerations.  Two strings are considered equal ignoring case if they
     * are of the same length and corresponding characters in the two strings
     * are equal ignoring case.
     *
Two characters {@code c1} and {@code c2} are considered the same * ignoring case if at least one of the following is true: *

    *
  • The two characters are the same (as compared by the * {@code ==} operator) *
  • Applying the method {@link * java.lang.Character#toUpperCase(char)} to each character * produces the same result *
  • Applying the method {@link * java.lang.Character#toLowerCase(char)} to each character * produces the same result *
     *
     * @param  anotherString
     *         The {@code String} to compare this {@code String} against
     *
     * @return  {@code true} if the argument is not {@code null} and it
     *          represents an equivalent {@code String} ignoring case; {@code
     *          false} otherwise
     *
     * @see  #equals(Object)
     *
     * Compares two strings for equality, ignoring case. How it works:
     * 1. If both references point to the same object, return true;
     * 2. Otherwise, if anotherString is null, return false;
     * 3. Otherwise, if the two lengths differ, return false;
     * 4. Otherwise delegate to regionMatches with ignoreCase set to true.
     */
    public boolean equalsIgnoreCase(String anotherString) {
        return (this == anotherString) ? true
                : (anotherString != null)
                && (anotherString.value.length == value.length)
                && regionMatches(true, 0, anotherString, 0, value.length);
    }

    /**
     * Compares two strings lexicographically.
     * The comparison is based on the Unicode value of each character in
     * the strings. The character sequence represented by this
     * {@code String} object is compared lexicographically to the
     * character sequence represented by the argument string. The result is
     * a negative integer if this {@code String} object
     * lexicographically precedes the argument string. The result is a
     * positive integer if this {@code String} object lexicographically
     * follows the argument string. The result is zero if the strings
     * are equal; {@code compareTo} returns {@code 0} exactly when
     * the {@link #equals(Object)} method would return {@code true}.
     *
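The short-circuit chain in equalsIgnoreCase annotated above can be sketched with public String methods only. This is a minimal illustration, not the JDK code: the class EqualsIgnoreCaseSketch and the method equalsIgnoringCase are hypothetical names, and the first argument is assumed to be non-null.

```java
// Hypothetical sketch class; not part of the JDK.
final class EqualsIgnoreCaseSketch {

    // Same order of checks as the annotated source: identity, null,
    // length, then a case-insensitive region comparison.
    // Assumption: a is non-null (it plays the role of "this").
    static boolean equalsIgnoringCase(String a, String b) {
        return a == b
                || (b != null
                    && a.length() == b.length()
                    && a.regionMatches(true, 0, b, 0, a.length()));
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoringCase("Hello", "hELLO"));
        System.out.println(equalsIgnoringCase("Hello", null));
        System.out.println(equalsIgnoringCase("Hello", "Hello!"));
    }
}
```

Because the null check comes before any dereference of the argument, passing null yields false rather than a NullPointerException.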

* This is the definition of lexicographic ordering. If two strings are * different, then either they have different characters at some index * that is a valid index for both strings, or their lengths are different, * or both. If they have different characters at one or more index * positions, let k be the smallest such index; then the string * whose character at position k has the smaller value, as * determined by using the < operator, lexicographically precedes the * other string. In this case, {@code compareTo} returns the * difference of the two character values at position {@code k} in * the two string -- that is, the value: *

     * this.charAt(k)-anotherString.charAt(k)
     * 
* If there is no index position at which they differ, then the shorter * string lexicographically precedes the longer string. In this case, * {@code compareTo} returns the difference of the lengths of the * strings -- that is, the value: *
     * this.length()-anotherString.length()
     * 
     *
     * @param   anotherString   the {@code String} to be compared.
     * @return  the value {@code 0} if the argument string is equal to
     *          this string; a value less than {@code 0} if this string
     *          is lexicographically less than the string argument; and a
     *          value greater than {@code 0} if this string is
     *          lexicographically greater than the string argument.
     *
     * compareTo() is declared by the Comparable interface and implemented here.
     * (How to use comparators is JavaSE basics and is not covered here.)
     * The result falls into one of three cases:
     * 1. The argument equals this string: 0;
     * 2. This string is lexicographically less than the argument: a negative number,
     *    either the difference of the first pair of mismatching characters or, when
     *    one string is a prefix of the other, the difference of the lengths;
     * 3. This string is lexicographically greater than the argument: a positive
     *    number, computed the same way.
     *
     * How it works:
     * 1. Take both lengths and use the smaller one as the loop bound;
     * 2. Walk both strings in step; at the first mismatching pair, return the
     *    difference of the two characters;
     * 3. If no mismatch is found, return the difference of the two lengths.
     */
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }

    /**
     * A Comparator that orders {@code String} objects as by
     * {@code compareToIgnoreCase}. This comparator is serializable.
     *
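To make the compareTo return values described above concrete, here is a small sketch that re-implements the loop with public API calls and prints a few results. CompareToSketch and lexCompare are hypothetical names used only for this illustration.

```java
// Hypothetical sketch class; not part of the JDK.
final class CompareToSketch {

    // Same idea as the annotated source: the first mismatching pair
    // decides the result; otherwise the length difference does.
    static int lexCompare(String a, String b) {
        int lim = Math.min(a.length(), b.length());
        for (int k = 0; k < lim; k++) {
            char c1 = a.charAt(k);
            char c2 = b.charAt(k);
            if (c1 != c2) {
                return c1 - c2;
            }
        }
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        System.out.println(lexCompare("abc", "abd"));   // 'c' - 'd' = -1
        System.out.println(lexCompare("abc", "abcde")); // 3 - 5 = -2
        System.out.println(lexCompare("b", "a"));       // 'b' - 'a' = 1
        System.out.println(lexCompare("same", "same")); // 0
    }
}
```

Callers should rely only on the sign of the result; the exact magnitude is a by-product of the subtraction.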

     * Note that this Comparator does not take locale into account,
     * and will result in an unsatisfactory ordering for certain locales.
     * The java.text package provides Collators to allow
     * locale-sensitive ordering.
     *
     * @see     java.text.Collator#compare(String, String)
     * @since   1.2
     *
     * This is a comparator instance, and it is serializable. It does not take the
     * locale into account, so its ordering is unsatisfactory for some locales; the
     * java.text package provides Collator for locale-sensitive ordering.
     * CaseInsensitiveComparator is a nested class of String and, as the name says,
     * a comparator that ignores case.
     * Note: this is the second nested type we have met in the JDK; the earlier Map
     * walkthrough covered Map.Entry, which is nested inside Map.
     * The implementation is simple and close to compareTo above:
     * 1. Take both lengths and use the smaller one as the loop bound;
     * 2. Walk both strings; if c1 != c2, convert both to upper case and compare
     *    again; if they still differ, convert both to lower case and compare again;
     *    if they still differ, return c1 - c2;
     * 3. If no mismatch is found, return the difference of the two lengths.
     *
     * A natural question: String already implements Comparable, so why add a static
     * nested class CaseInsensitiveComparator?
     *
     * The two orderings differ: this comparator ignores case, and keeping it as a
     * separate Comparator lets the code be reused. String exposes it through the
     * CASE_INSENSITIVE_ORDER field, so any caller that needs a case-insensitive
     * comparison can use that field; compareToIgnoreCase is implemented with it.
     */
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);
                    if (c1 != c2) {
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            return c1 - c2;
                        }
                    }
                }
            }
            return n1 - n2;
        }

        /** Replaces the de-serialized object. */
        // Invoked when an instance is created by deserialization. Returning the
        // shared constant guarantees that a deserialized comparator is the same
        // object as CASE_INSENSITIVE_ORDER. The details are left for the design
        // patterns discussion.
        private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }

    /**
     * Compares two strings lexicographically, ignoring case
     * differences.
This method returns an integer whose sign is that of * calling {@code compareTo} with normalized versions of the strings * where case differences have been eliminated by calling * {@code Character.toLowerCase(Character.toUpperCase(character))} on * each character. *
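The sketch below exercises CASE_INSENSITIVE_ORDER in two ways suggested by the annotations above: sorting a list without regard to case, and checking that a serialization round trip hands back the very same comparator object, which is what readResolve is there for. The class and method names are hypothetical; the round trip uses plain java.io object streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
final class CaseInsensitiveOrderDemo {

    // Returns a sorted copy, ignoring case.
    static List<String> sortIgnoringCase(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    // Serializes and deserializes an object in memory.
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "apple", "Cherry");
        System.out.println(sortIgnoringCase(words));
        System.out.println("Hello".compareToIgnoreCase("hello"));
        // readResolve makes the deserialized comparator the shared constant.
        System.out.println(
                roundTrip(String.CASE_INSENSITIVE_ORDER) == String.CASE_INSENSITIVE_ORDER);
    }
}
```

With natural ordering the same list would place "Cherry" first, because upper-case letters have smaller code units than lower-case ones.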

     * Note that this method does not take locale into account,
     * and will result in an unsatisfactory ordering for certain locales.
     * The java.text package provides collators to allow
     * locale-sensitive ordering.
     *
     * @param   str   the {@code String} to be compared.
     * @return  a negative integer, zero, or a positive integer as the
     *          specified String is greater than, equal to, or less
     *          than this String, ignoring case considerations.
     * @see     java.text.Collator#compare(String, String)
     * @since   1.2
     *
     * With the comparator above understood, this method is straightforward: the
     * case-insensitive comparison simply calls compare() on CASE_INSENSITIVE_ORDER,
     * the field that holds the nested comparator class shown earlier.
     */
    public int compareToIgnoreCase(String str) {
        return CASE_INSENSITIVE_ORDER.compare(this, str);
    }

    /**
     * Tests if two string regions are equal.
     *

* A substring of this {@code String} object is compared to a substring * of the argument other. The result is true if these substrings * represent identical character sequences. The substring of this * {@code String} object to be compared begins at index {@code toffset} * and has length {@code len}. The substring of other to be compared * begins at index {@code ooffset} and has length {@code len}. The * result is {@code false} if and only if at least one of the following * is true: *

  • {@code toffset} is negative. *
  • {@code ooffset} is negative. *
  • {@code toffset+len} is greater than the length of this * {@code String} object. *
  • {@code ooffset+len} is greater than the length of the other * argument. *
  • There is some nonnegative integer k less than {@code len} * such that: * {@code this.charAt(toffset + }k{@code ) != other.charAt(ooffset + } * k{@code )} *
     *
     * @param   toffset   the starting offset of the subregion in this string.
     * @param   other     the string argument.
     * @param   ooffset   the starting offset of the subregion in the string
     *                    argument.
     * @param   len       the number of characters to compare.
     * @return  {@code true} if the specified subregion of this string
     *          exactly matches the specified subregion of the string argument;
     *          {@code false} otherwise.
     *
     * Checks whether a region of this string equals a region of another string.
     * toffset: start offset of the region in this string;
     * other: the string to compare against;
     * ooffset: start offset of the region in other;
     * len: the number of characters to compare.
     *
     * How it works:
     * 1. Check that both offsets and len describe regions that lie inside their
     *    strings; if not, return false;
     * 2. Otherwise compare the two regions character by character and return false
     *    at the first mismatch;
     * 3. Otherwise return true.
     */
    public boolean regionMatches(int toffset, String other, int ooffset,
            int len) {
        char ta[] = value;
        int to = toffset;
        char pa[] = other.value;
        int po = ooffset;
        // Note: toffset, ooffset, or len might be near -1>>>1.
        if ((ooffset < 0) || (toffset < 0)
                || (toffset > (long)value.length - len)
                || (ooffset > (long)other.value.length - len)) {
            return false;
        }
        while (len-- > 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Tests if two string regions are equal.
     *
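The range check in regionMatches above casts to long before subtracting, and the source comment hints why: the arguments may be close to Integer.MAX_VALUE. The sketch below contrasts a naive int check with a long-based one in that style. Both helper methods and the class name are hypothetical and exist only to show the overflow.

```java
// Hypothetical demo class; not part of the JDK.
final class RegionMatchesDemo {

    // Naive check: toffset + len can overflow int and wrap to a negative
    // value, so an absurdly large len may look "in range".
    static boolean naiveOutOfRange(int length, int toffset, int len) {
        return toffset + len > length;
    }

    // Check in the style of the annotated source: widen to long first.
    static boolean safeOutOfRange(int length, int toffset, int len) {
        return toffset > (long) length - len;
    }

    public static void main(String[] args) {
        String s = "Hello World";
        System.out.println(s.regionMatches(6, "World", 0, 5));
        // A region that runs past the end is rejected, not an exception.
        System.out.println(s.regionMatches(6, "World", 0, Integer.MAX_VALUE));
        System.out.println(naiveOutOfRange(s.length(), 6, Integer.MAX_VALUE));
        System.out.println(safeOutOfRange(s.length(), 6, Integer.MAX_VALUE));
    }
}
```

The naive version reports the oversized region as acceptable, which in a real implementation would lead to an out-of-bounds array access in the comparison loop.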

* A substring of this {@code String} object is compared to a substring * of the argument {@code other}. The result is {@code true} if these * substrings represent character sequences that are the same, ignoring * case if and only if {@code ignoreCase} is true. The substring of * this {@code String} object to be compared begins at index * {@code toffset} and has length {@code len}. The substring of * {@code other} to be compared begins at index {@code ooffset} and * has length {@code len}. The result is {@code false} if and only if * at least one of the following is true: *

  • {@code toffset} is negative. *
  • {@code ooffset} is negative. *
  • {@code toffset+len} is greater than the length of this * {@code String} object. *
  • {@code ooffset+len} is greater than the length of the other * argument. *
  • {@code ignoreCase} is {@code false} and there is some nonnegative * integer k less than {@code len} such that: *
         * this.charAt(toffset+k) != other.charAt(ooffset+k)
         * 
    *
  • {@code ignoreCase} is {@code true} and there is some nonnegative * integer k less than {@code len} such that: *
         * Character.toLowerCase(this.charAt(toffset+k)) !=
         Character.toLowerCase(other.charAt(ooffset+k))
         * 
    * and: *
         * Character.toUpperCase(this.charAt(toffset+k)) !=
         *         Character.toUpperCase(other.charAt(ooffset+k))
         * 
    *
     *
     * @param   ignoreCase   if {@code true}, ignore case when comparing
     *                       characters.
     * @param   toffset      the starting offset of the subregion in this
     *                       string.
     * @param   other        the string argument.
     * @param   ooffset      the starting offset of the subregion in the string
     *                       argument.
     * @param   len          the number of characters to compare.
     * @return  {@code true} if the specified subregion of this string
     *          matches the specified subregion of the string argument;
     *          {@code false} otherwise. Whether the matching is exact
     *          or case insensitive depends on the {@code ignoreCase}
     *          argument.
     *
     * This overload works like the previous regionMatches. The extra ignoreCase
     * parameter selects case-insensitive matching: besides the direct equality
     * test, a mismatching pair is compared again after case conversion.
     */
    public boolean regionMatches(boolean ignoreCase, int toffset,
            String other, int ooffset, int len) {
        char ta[] = value;
        int to = toffset;
        char pa[] = other.value;
        int po = ooffset;
        // Note: toffset, ooffset, or len might be near -1>>>1.
        if ((ooffset < 0) || (toffset < 0)
                || (toffset > (long)value.length - len)
                || (ooffset > (long)other.value.length - len)) {
            return false;
        }
        while (len-- > 0) {
            char c1 = ta[to++];
            char c2 = pa[po++];
            if (c1 == c2) {
                continue;
            }
            if (ignoreCase) {
                // If characters don't match but case may be ignored,
                // try converting both characters to uppercase.
                // If the results match, then the comparison scan should
                // continue.
                char u1 = Character.toUpperCase(c1);
                char u2 = Character.toUpperCase(c2);
                if (u1 == u2) {
                    continue;
                }
                // Unfortunately, conversion to uppercase does not work properly
                // for the Georgian alphabet, which has strange rules about case
                // conversion. So we need to make one last check before
                // exiting.
                if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                    continue;
                }
            }
            return false;
        }
        return true;
    }

    /**
     * Tests if the substring of this string beginning at the
     * specified index starts with the specified prefix.
     *
     * @param   prefix    the prefix.
     * @param   toffset   where to begin looking in this string.
* @return {@code true} if the character sequence represented by the * argument is a prefix of the substring of this object starting * at index {@code toffset}; {@code false} otherwise. * The result is {@code false} if {@code toffset} is * negative or greater than the length of this * {@code String} object; otherwise the result is the same * as the result of the expression *
     *          this.substring(toffset).startsWith(prefix)
     *          
     *
     * Tests whether the part of this string that begins at toffset starts with
     * prefix. How it works:
     * 1. Check that toffset is valid (not negative and leaving room for the whole
     *    prefix); if not, return false;
     * 2. Otherwise compare character by character and return false at the first
     *    mismatch;
     * 3. Otherwise return true.
     */
    public boolean startsWith(String prefix, int toffset) {
        char ta[] = value;
        int to = toffset;
        char pa[] = prefix.value;
        int po = 0;
        int pc = prefix.value.length;
        // Note: toffset might be near -1>>>1.
        if ((toffset < 0) || (toffset > value.length - pc)) {
            return false;
        }
        while (--pc >= 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Tests if this string starts with the specified prefix.
     *
     * @param   prefix   the prefix.
     * @return  {@code true} if the character sequence represented by the
     *          argument is a prefix of the character sequence represented by
     *          this string; {@code false} otherwise.
     *          Note also that {@code true} will be returned if the
     *          argument is an empty string or is equal to this
     *          {@code String} object as determined by the
     *          {@link #equals(Object)} method.
     * @since   1.0
     *
     * Tests whether this string begins with prefix. It calls the method above
     * with an offset of 0.
     */
    public boolean startsWith(String prefix) {
        return startsWith(prefix, 0);
    }

    /**
     * Tests if this string ends with the specified suffix.
     *
     * @param   suffix   the suffix.
     * @return  {@code true} if the character sequence represented by the
     *          argument is a suffix of the character sequence represented by
     *          this object; {@code false} otherwise. Note that the
     *          result will be {@code true} if the argument is the
     *          empty string or is equal to this {@code String} object
     *          as determined by the {@link #equals(Object)} method.
     *
     * Tests whether this string ends with suffix. The implementation reuses
     * startsWith with an offset equal to this string's length minus the length of
     * suffix, which is a neat trick. If suffix is longer than this string the
     * offset is negative and startsWith returns false.
     */
    public boolean endsWith(String suffix) {
        return startsWith(suffix, value.length - suffix.value.length);
    }

    /**
     * Returns a hash code for this string. The hash code for a
     * {@code String} object is computed as
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * 
     * using {@code int} arithmetic, where {@code s[i]} is the
     * ith character of the string, {@code n} is the length of
     * the string, and {@code ^} indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     *
     * Returns the hash code of the string; this overrides Object.hashCode().
     * The hash field defined in String (0 by default) caches the result, so the loop
     * normally runs only on the first call. A non-empty string whose hash really is
     * 0 is recomputed on every call.
     * The hash is built by iterating h = 31 * h + val[i].
     *
     * A hash code is what hash-based containers use to locate an object
     * efficiently: the container calls hashCode() when an element is stored or
     * looked up.
     *
     * Why 31 and not some other multiplier? 31 is an odd prime, and in practice it
     * spreads typical strings reasonably well. That matters because many keys with
     * the same hash make the collision chain longer and slow down lookups. It is
     * also cheap to compute, since 31 * h equals (h << 5) - h. The arithmetic is
     * done in int, so overflow simply wraps around; that is expected and harmless.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

    /**
     * Returns the index within this string of the first occurrence of
     * the specified character. If a character with value
     * {@code ch} occurs in the character sequence represented by
     * this {@code String} object, then the index (in Unicode
     * code units) of the first such occurrence is returned. For
     * values of {@code ch} in the range from 0 to 0xFFFF
     * (inclusive), this is the smallest value k such that:
     *
     * this.charAt(k) == ch
     * 
* is true. For other values of {@code ch}, it is the * smallest value k such that: *
     * this.codePointAt(k) == ch
     * 
* is true. In either case, if no such character occurs in this * string, then {@code -1} is returned. * * @param ch a character (Unicode code point). * @return the index of the first occurrence of the character in the * character sequence represented by this object, or * {@code -1} if the character does not occur. * * 取得某个字符的代码点在本字符串之中的索引值,如果没有找到返回-1; */ public int indexOf(int ch) { return indexOf(ch, 0); } /** * Returns the index within this string of the first occurrence of the * specified character, starting the search at the specified index. *
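Returning to the hashCode() recurrence h = 31 * h + val[i] annotated earlier, the sketch below recomputes it by hand and shows two facts that follow from it: multiplying by 31 is the same as a shift and a subtraction, and distinct strings can collide. HashSketch and polyHash are hypothetical names.

```java
// Hypothetical sketch class; not part of the JDK.
final class HashSketch {

    // The recurrence from the annotated source, in int arithmetic.
    // Overflow wraps around, exactly as in String.hashCode().
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 97 * 31 * 31 + 98 * 31 + 99 = 96354
        System.out.println(polyHash("abc"));
        System.out.println(polyHash("abc") == "abc".hashCode());
        // 31 * h can be rewritten as a shift and a subtraction.
        int h = 123456789;
        System.out.println(31 * h == (h << 5) - h);
        // "Aa" and "BB" both hash to 2112: a small, well-known collision.
        System.out.println(polyHash("Aa") == polyHash("BB"));
    }
}
```

The collision is a reminder that equal hash codes never prove equal strings; hash containers always fall back to equals().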

* If a character with value {@code ch} occurs in the * character sequence represented by this {@code String} * object at an index no smaller than {@code fromIndex}, then * the index of the first such occurrence is returned. For values * of {@code ch} in the range from 0 to 0xFFFF (inclusive), * this is the smallest value k such that: *

     * (this.charAt(k) == ch) {@code &&} (k >= fromIndex)
     * 
* is true. For other values of {@code ch}, it is the * smallest value k such that: *
     * (this.codePointAt(k) == ch) {@code &&} (k >= fromIndex)
     * 
* is true. In either case, if no such character occurs in this * string at or after position {@code fromIndex}, then * {@code -1} is returned. * *

* There is no restriction on the value of {@code fromIndex}. If it * is negative, it has the same effect as if it were zero: this entire * string may be searched. If it is greater than the length of this * string, it has the same effect as if it were equal to the length of * this string: {@code -1} is returned. * *

     * All indices are specified in {@code char} values
     * (Unicode code units).
     *
     * @param   ch          a character (Unicode code point).
     * @param   fromIndex   the index to start the search from.
     * @return  the index of the first occurrence of the character in the
     *          character sequence represented by this object that is greater
     *          than or equal to {@code fromIndex}, or {@code -1}
     *          if the character does not occur.
     *
     * Returns the index of the code point ch, searching from fromIndex onward.
     * How it works:
     * 1. If fromIndex < 0, set fromIndex to 0, that is, search from the start;
     * 2. If fromIndex is greater than or equal to the string length, return -1
     *    (not found);
     * 3. If ch is below the supplementary range (a BMP code point or an invalid
     *    negative value), scan the array and return the index when found, or -1
     *    otherwise;
     * 4. Otherwise call indexOfSupplementary().
     */
    public int indexOf(int ch, int fromIndex) {
        final int max = value.length;
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= max) {
            // Note: fromIndex might be near -1>>>1.
            return -1;
        }

        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return indexOfSupplementary(ch, fromIndex);
        }
    }

    /**
     * Handles (rare) calls of indexOf with a supplementary character.
     *
     * Private helper for indexOf(int ch, int fromIndex).
     *
     * How it works:
     * 1. If ch is not a valid code point, return -1;
     * 2. Otherwise split ch into its high and low surrogates and scan for two
     *    adjacent chars equal to that pair; when found, return the index of the
     *    high surrogate, otherwise return -1.
     */
    private int indexOfSupplementary(int ch, int fromIndex) {
        if (Character.isValidCodePoint(ch)) {
            final char[] value = this.value;
            final char hi = Character.highSurrogate(ch);
            final char lo = Character.lowSurrogate(ch);
            final int max = value.length - 1;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == hi && value[i + 1] == lo) {
                    return i;
                }
            }
        }
        return -1;
    }

    /**
     * Returns the index within this string of the last occurrence of
     * the specified character. For values of {@code ch} in the
     * range from 0 to 0xFFFF (inclusive), the index (in Unicode code
     * units) returned is the largest value k such that:
     *

     * this.charAt(k) == ch
     * 
* is true. For other values of {@code ch}, it is the * largest value k such that: *
     * this.codePointAt(k) == ch
     * 
     * is true. In either case, if no such character occurs in this
     * string, then {@code -1} is returned. The
     * {@code String} is searched backwards starting at the last
     * character.
     *
     * @param   ch   a character (Unicode code point).
     * @return  the index of the last occurrence of the character in the
     *          character sequence represented by this object, or
     *          {@code -1} if the character does not occur.
     *
     * This method mirrors indexOf(int ch), so it needs no further explanation.
     */
    public int lastIndexOf(int ch) {
        return lastIndexOf(ch, value.length - 1);
    }

    /**
     * Returns the index within this string of the last occurrence of
     * the specified character, searching backward starting at the
     * specified index. For values of {@code ch} in the range
     * from 0 to 0xFFFF (inclusive), the index returned is the largest
     * value k such that:
     *
     * (this.charAt(k) == ch) {@code &&} (k <= fromIndex)
     * 
* is true. For other values of {@code ch}, it is the * largest value k such that: *
     * (this.codePointAt(k) == ch) {@code &&} (k <= fromIndex)
     * 
* is true. In either case, if no such character occurs in this * string at or before position {@code fromIndex}, then * {@code -1} is returned. * *

     * All indices are specified in {@code char} values
     * (Unicode code units).
     *
     * @param   ch          a character (Unicode code point).
     * @param   fromIndex   the index to start the search from. There is no
     *          restriction on the value of {@code fromIndex}. If it is
     *          greater than or equal to the length of this string, it has
     *          the same effect as if it were equal to one less than the
     *          length of this string: this entire string may be searched.
     *          If it is negative, it has the same effect as if it were -1:
     *          -1 is returned.
     * @return  the index of the last occurrence of the character in the
     *          character sequence represented by this object that is less
     *          than or equal to {@code fromIndex}, or {@code -1}
     *          if the character does not occur before that point.
     *
     * indexOf() can be seen as searching from the left; this method searches from
     * the right. How it works:
     * 1. If ch is not below the supplementary range, call
     *    lastIndexOfSupplementary();
     * 2. Otherwise start at the smaller of fromIndex and the last index, and scan
     *    toward the beginning;
     * 3. Return the index when found, or -1 otherwise.
     */
    public int lastIndexOf(int ch, int fromIndex) {
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            int i = Math.min(fromIndex, value.length - 1);
            for (; i >= 0; i--) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return lastIndexOfSupplementary(ch, fromIndex);
        }
    }

    /**
     * Handles (rare) calls of lastIndexOf with a supplementary character.
     *
     * Private helper that searches from the right for a code point outside the BMP.
     * How it works:
     * 1. If ch is not a valid code point, return -1;
     * 2. Otherwise split ch into its high and low surrogates and scan backward for
     *    two adjacent chars equal to that pair; return the index of the high
     *    surrogate when found, or -1 otherwise.
     */
    private int lastIndexOfSupplementary(int ch, int fromIndex) {
        if (Character.isValidCodePoint(ch)) {
            final char[] value = this.value;
            char hi = Character.highSurrogate(ch);
            char lo = Character.lowSurrogate(ch);
            int i = Math.min(fromIndex, value.length - 2);
            for (; i >= 0; i--) {
                if (value[i] == hi && value[i + 1] == lo) {
                    return i;
                }
            }
        }
        return -1;
    }

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring.
     *
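The BMP and supplementary branches of the character-based indexOf and lastIndexOf methods annotated above can be observed with a string that contains one code point outside the BMP. This is only a usage sketch; the class name and the sample() helper are hypothetical, and U+1F600 is chosen arbitrarily as a supplementary code point.

```java
// Hypothetical demo class; not part of the JDK.
final class CodePointIndexDemo {

    static final int SUPPLEMENTARY = 0x1F600; // stored as a surrogate pair

    // "a", then U+1F600 as two chars (D83D DE00), then "b".
    static String sample() {
        return "a" + new String(Character.toChars(SUPPLEMENTARY)) + "b";
    }

    public static void main(String[] args) {
        String s = sample();
        System.out.println(s.length());                     // counts chars, not code points
        System.out.println(s.indexOf('b'));                 // BMP branch
        System.out.println(s.indexOf(SUPPLEMENTARY));       // supplementary branch
        System.out.println(s.lastIndexOf(SUPPLEMENTARY));
        System.out.println(Integer.toHexString(Character.highSurrogate(SUPPLEMENTARY)));
        System.out.println(Integer.toHexString(Character.lowSurrogate(SUPPLEMENTARY)));
        // A negative fromIndex behaves like 0; one past the end yields -1.
        System.out.println("abc".indexOf('a', -5));
        System.out.println("abc".indexOf('a', 3));
    }
}
```

The returned index for the supplementary code point is the position of its high surrogate, which matches the description of the private helper methods.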

The returned index is the smallest value k for which: *

     * this.startsWith(str, k)
     * 
     * If no such value of k exists, then {@code -1} is returned.
     *
     * @param   str   the substring to search for.
     * @return  the index of the first occurrence of the specified substring,
     *          or {@code -1} if there is no such occurrence.
     *
     * Like the code point variant, this calls indexOf(String str, int fromIndex).
     * The search starts at the first character; the index of the first match is
     * returned, or -1 if str does not occur.
     */
    public int indexOf(String str) {
        return indexOf(str, 0);
    }

    /**
     * Returns the index within this string of the first occurrence of the
     * specified substring, starting at the specified index.
     *

The returned index is the smallest value k for which: *

     * k >= fromIndex {@code &&} this.startsWith(str, k)
     * 
     * If no such value of k exists, then {@code -1} is returned.
     *
     * @param   str         the substring to search for.
     * @param   fromIndex   the index from which to start the search.
     * @return  the index of the first occurrence of the specified substring,
     *          starting at the specified index,
     *          or {@code -1} if there is no such occurrence.
     *
     * Searches for str starting at index fromIndex and returns the index of the
     * first occurrence, or -1 if there is none.
     */
    public int indexOf(String str, int fromIndex) {
        return indexOf(value, 0, value.length,
                str.value, 0, str.value.length, fromIndex);
    }

    /**
     * Code shared by String and AbstractStringBuilder to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   fromIndex    the index to begin searching from.
     *
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
            String target, int fromIndex) {
        return indexOf(source, sourceOffset, sourceCount,
                       target.value, 0, target.value.length,
                       fromIndex);
    }

    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
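     *
     * The overloads above all end up calling this method, layer by layer.
     * First, the parameters:
     * source: the character array being searched;
     * sourceOffset: offset of the searched string within source;
     * sourceCount: length of the searched string;
     *
     * target: the characters being searched for;
     * targetOffset: offset of the search string within target;
     * targetCount: length of the search string;
     *
     * fromIndex: the index at which the search starts.
     *
     * How it works (plain character-by-character comparison):
     * 1. Compute the range of positions in source at which target could start
     *    (the last candidate is max = sourceOffset + sourceCount - targetCount);
     * 2. Scan that range for a position whose character equals the first character
     *    of target, then compare the remaining characters one by one.
     *
     * A common question: why not use KMP, Boyer-Moore or another algorithm with a
     * better worst-case complexity?
     * According to the discussion linked below, the JDK authors consider most
     * strings to be short, so the simple implementation is likely cheaper. KMP and
     * Boyer-Moore must first build auxiliary tables, which costs time and space
     * and can outweigh the benefit for short strings. For searches in very large
     * texts, programmers tend to use dedicated data structures anyway. The
     * trade-off resembles preferring quicksort except in particular special cases.
     * https://stackoverflow.com/questions/19543547/why-jdks-string-indexof-does-not-use-kmp/
     *
     */
    static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }

    /**
     * Returns the index within this string of the last occurrence of the
     * specified substring. The last occurrence of the empty string ""
     * is considered to occur at the index value {@code this.length()}.
     *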
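The character-by-character search described above can be reduced to a few lines once the offset parameters are dropped. The following is a simplified sketch under that assumption, not the JDK routine: NaiveSearchSketch and naiveIndexOf are hypothetical names, and both arrays are searched in full.

```java
// Hypothetical sketch class; not part of the JDK.
final class NaiveSearchSketch {

    // Simplified naive search: no sourceOffset or targetOffset, whole arrays only.
    static int naiveIndexOf(char[] source, char[] target, int fromIndex) {
        if (fromIndex >= source.length) {
            return target.length == 0 ? source.length : -1;
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (target.length == 0) {
            return fromIndex;
        }
        // Last position at which target still fits into source.
        int max = source.length - target.length;
        for (int i = fromIndex; i <= max; i++) {
            int k = 0;
            while (k < target.length && source[i + k] == target[k]) {
                k++;
            }
            if (k == target.length) {
                return i; // every character of target matched
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        char[] text = "hello world".toCharArray();
        System.out.println(naiveIndexOf(text, "o w".toCharArray(), 0));
        System.out.println(naiveIndexOf(text, "o".toCharArray(), 5));
        System.out.println(naiveIndexOf(text, "xyz".toCharArray(), 0));
    }
}
```

No table is built before the scan starts, which is the property the annotation credits for good behaviour on short strings; the price is a worst case proportional to the product of the two lengths.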

The returned index is the largest value k for which: *

     * this.startsWith(str, k)
     * 
     * If no such value of k exists, then {@code -1} is returned.
     *
     * @param   str   the substring to search for.
     * @return  the index of the last occurrence of the specified substring,
     *          or {@code -1} if there is no such occurrence.
     *
     * lastIndexOf works on the same principle as indexOf(), so it is not described
     * in detail here.
     *
     */
    public int lastIndexOf(String str) {
        return lastIndexOf(str, value.length);
    }

    /**
     * Returns the index within this string of the last occurrence of the
     * specified substring, searching backward starting at the specified index.
     *

The returned index is the largest value k for which: *

     * k {@code <=} fromIndex {@code &&} this.startsWith(str, k)
     * 
     * If no such value of k exists, then {@code -1} is returned.
     *
     * @param   str         the substring to search for.
     * @param   fromIndex   the index to start the search from.
     * @return  the index of the last occurrence of the specified substring,
     *          searching backward from the specified index,
     *          or {@code -1} if there is no such occurrence.
     */
    public int lastIndexOf(String str, int fromIndex) {
        return lastIndexOf(value, 0, value.length,
                str.value, 0, str.value.length, fromIndex);
    }

    /**
     * Code shared by String and AbstractStringBuilder to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   fromIndex    the index to begin searching from.
     */
    static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
            String target, int fromIndex) {
        return lastIndexOf(source, sourceOffset, sourceCount,
                       target.value, 0, target.value.length,
                       fromIndex);
    }

    /**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     *
     * The implementation is similar to indexOf(): it also walks the characters one
     * by one (from the right, starting with the last character of target) and does
     * not use the KMP search algorithm.
     *
     */
    static int lastIndexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        /*
         * Check arguments; return immediately where possible. For
         * consistency, don't check for null str.
         */
        int rightIndex = sourceCount - targetCount;
        if (fromIndex < 0) {
            return -1;
        }
        if (fromIndex > rightIndex) {
            fromIndex = rightIndex;
        }
        /* Empty string always matches. */
        if (targetCount == 0) {
            return fromIndex;
        }

        int strLastIndex = targetOffset + targetCount - 1;
        char strLastChar = target[strLastIndex];
        int min = sourceOffset + targetCount - 1;
        int i = min + fromIndex;

    startSearchForLastChar:
        while (true) {
            while (i >= min && source[i] != strLastChar) {
                i--;
            }
            if (i < min) {
                return -1;
            }
            int j = i - 1;
            int start = j - (targetCount - 1);
            int k = strLastIndex - 1;

            while (j > start) {
                if (source[j--] != target[k--]) {
                    i--;
                    continue startSearchForLastChar;
                }
            }
            return start - sourceOffset + 1;
        }
    }

    /**
     * Returns a string that is a substring of this string. The
     * substring begins with the character at the specified index and
     * extends to the end of this string.

* Examples: *

     * "unhappy".substring(2) returns "happy"
     * "Harbison".substring(3) returns "bison"
     * "emptiness".substring(9) returns "" (an empty string)
     * 
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if
     *             {@code beginIndex} is negative or larger than the
     *             length of this {@code String} object.
     *
     * Returns the substring that starts at the given index and runs to the end.
     * How it works:
     * 1. If beginIndex < 0, throw StringIndexOutOfBoundsException;
     * 2. Compute the substring length (value.length - beginIndex); if it is
     *    negative, throw StringIndexOutOfBoundsException;
     * 3. If beginIndex is 0, return this string itself; otherwise create a new
     *    String object.
     *
     */
    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }

    /**
     * Returns a string that is a substring of this string. The
     * substring begins at the specified {@code beginIndex} and
     * extends to the character at index {@code endIndex - 1}.
     * Thus the length of the substring is {@code endIndex-beginIndex}.
     *

* Examples: *

     * "hamburger".substring(4, 8) returns "urge"
     * "smiles".substring(1, 5) returns "mile"
     * 
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @param      endIndex     the ending index, exclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if the
     *             {@code beginIndex} is negative, or
     *             {@code endIndex} is larger than the length of
     *             this {@code String} object, or
     *             {@code beginIndex} is larger than
     *             {@code endIndex}.
     *
     * Returns the substring from beginIndex (inclusive) to endIndex (exclusive).
     * How it works:
     * 1. If beginIndex is negative, throw StringIndexOutOfBoundsException;
     * 2. If endIndex is greater than the string length, throw
     *    StringIndexOutOfBoundsException;
     * 3. If the substring length (endIndex - beginIndex) is negative, throw
     *    StringIndexOutOfBoundsException;
     * 4. If beginIndex is 0 and endIndex equals the string length, return this
     *    string itself; otherwise create a new String.
     */
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

    /**
     * Returns a character sequence that is a subsequence of this sequence.
     *
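The bounds checks and the "return this" shortcut of the two substring overloads annotated above can be observed from the outside. The sketch below is a usage illustration only; SubstringDemo and throwsOutOfBounds are hypothetical names, and the identity check reflects the source shown here rather than a guarantee of the API contract.

```java
// Hypothetical demo class; not part of the JDK.
final class SubstringDemo {

    // Returns true if the action throws StringIndexOutOfBoundsException.
    static boolean throwsOutOfBounds(Runnable action) {
        try {
            action.run();
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hamburger";
        System.out.println(s.substring(4, 8));
        // beginIndex equal to the length is allowed and yields "".
        System.out.println(s.substring(9).isEmpty());
        // Assumption based on the source above: the whole-string case
        // returns the same object instead of copying.
        System.out.println(s.substring(0) == s);
        System.out.println(throwsOutOfBounds(() -> s.substring(-1)));
        System.out.println(throwsOutOfBounds(() -> s.substring(5, 2)));
        System.out.println(throwsOutOfBounds(() -> s.substring(0, 10)));
    }
}
```

StringIndexOutOfBoundsException is a subclass of IndexOutOfBoundsException, which is why the Javadoc can name the broader type while the code throws the narrower one.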

An invocation of this method of the form * *

     * str.subSequence(begin, end)
* * behaves in exactly the same way as the invocation * *
     * str.substring(begin, end)
     *
     * @apiNote
     * This method is defined so that the {@code String} class can implement
     * the {@link CharSequence} interface.
     *
     * @param   beginIndex   the begin index, inclusive.
     * @param   endIndex     the end index, exclusive.
     * @return  the specified subsequence.
     *
     * @throws  IndexOutOfBoundsException
     *          if {@code beginIndex} or {@code endIndex} is negative,
     *          if {@code endIndex} is greater than {@code length()},
     *          or if {@code beginIndex} is greater than {@code endIndex}
     *
     * @since 1.4
     * @spec JSR-51
     *
     * Returns a subsequence of the string; this is just a call to substring().
     *
     */
    public CharSequence subSequence(int beginIndex, int endIndex) {
        return this.substring(beginIndex, endIndex);
    }

    /**
     * Concatenates the specified string to the end of this string.
     *

* If the length of the argument string is {@code 0}, then this * {@code String} object is returned. Otherwise, a * {@code String} object is returned that represents a character * sequence that is the concatenation of the character sequence * represented by this {@code String} object and the character * sequence represented by the argument string.

* Examples: *

     * "cares".concat("s") returns "caress"
     * "to".concat("get").concat("her") returns "together"
     * 
     *
     * @param   str   the {@code String} that is concatenated to the end
     *                of this {@code String}.
     * @return  a string that represents the concatenation of this object's
     *          characters followed by the string argument's characters.
     *
     * String concatenation; returns the concatenated string.
     * How it works:
     * If the argument's length is 0 (the empty string ""), this string is returned as is. Note that
     * passing null throws a NullPointerException, because str.length() is called first (Java SE basics, not explained further).
     * Otherwise Arrays.copyOf() creates a buf array of length len + otherLen that holds this string's characters,
     * str.getChars(buf, len) copies the argument's characters in after them, and a new String is built from buf.
     *
     */
    public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
        char buf[] = Arrays.copyOf(value, len + otherLen);
        str.getChars(buf, len);
        return new String(buf, true);
    }

    /**
     * Returns a string resulting from replacing all occurrences of
     * {@code oldChar} in this string with {@code newChar}.
     *

* If the character {@code oldChar} does not occur in the * character sequence represented by this {@code String} object, * then a reference to this {@code String} object is returned. * Otherwise, a {@code String} object is returned that * represents a character sequence identical to the character sequence * represented by this {@code String} object, except that every * occurrence of {@code oldChar} is replaced by an occurrence * of {@code newChar}. *

* Examples: *

     * "mesquite in your cellar".replace('e', 'o')
     *         returns "mosquito in your collar"
     * "the war of baronets".replace('r', 'y')
     *         returns "the way of bayonets"
     * "sparring with a purple porpoise".replace('p', 't')
     *         returns "starring with a turtle tortoise"
     * "JonL".replace('q', 'x') returns "JonL" (no change)
     * 
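     *
     * @param   oldChar   the old character.
     * @param   newChar   the new character.
     * @return  a string derived from this string by replacing every
     *          occurrence of {@code oldChar} with {@code newChar}.
     *
     * Character replacement. Because strings are immutable, a replacement produces a new String object.
     * How it works:
     * 1. If the character to replace equals the new character, return this string;
     * 2. Search for the first occurrence of oldChar; if there is none (the index i reaches len), return this string;
     * 3. If one is found, create a buf array and copy everything before the found index into it. Then continue
     *    the loop over the remaining characters: a character equal to oldChar is replaced, any other is copied as is;
     * 4. Finally return a String built from the buf array.
     *
     */
    public String replace(char oldChar, char newChar) {
        if (oldChar != newChar) {
            int len = value.length;
            int i = -1;
            char[] val = value; /* avoid getfield opcode
                                 * This comment may puzzle some readers. Literally, it means "avoid the getfield instruction".
                                 * As described in "深入理解Java虚拟机" (Understanding the Java Virtual Machine), getfield fetches
                                 * an instance field of an object and pushes its value onto the operand stack. Copying the field
                                 * into a local variable once reduces the number of getfield operations in the loops below,
                                 * which improves performance.
                                 */

            while (++i < len) {
                if (val[i] == oldChar) {
                    break;
                }
            }
            if (i < len) {
                char buf[] = new char[len];
                for (int j = 0; j < i; j++) {
                    buf[j] = val[j];
                }
                while (i < len) {
                    char c = val[i];
                    buf[i] = (c == oldChar) ? newChar : c;
                    i++;
                }
                return new String(buf, true);
            }
        }
        return this;
    }

    /**
     * Tells whether or not this string matches the given regular expression.
     *
     *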

An invocation of this method of the form * str{@code .matches(}regex{@code )} yields exactly the * same result as the expression * *

* {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#matches(String,CharSequence) * matches(regex, str)} *
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     *
     * @return  {@code true} if, and only if, this string matches the
     *          given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     *
     * This is regular expression matching; see the chapter on the regex implementation for details.
     *
     */
    public boolean matches(String regex) {
        return Pattern.matches(regex, this);
    }

    /**
     * Returns true if and only if this string contains the specified
     * sequence of char values.
     *
     * @param s the sequence to search for
     * @return true if this string contains {@code s}, false otherwise
     * @since 1.5
     *
     * Checks whether a given substring occurs in this string and returns true or false.
     * The implementation is simple: call indexOf(); the result is true if the index is > -1, false otherwise.
     *
     */
    public boolean contains(CharSequence s) {
        return indexOf(s.toString()) > -1;
    }

    /**
     * Replaces the first substring of this string that matches the given regular expression with the
     * given replacement.
     *
     *

An invocation of this method of the form * str{@code .replaceFirst(}regex{@code ,} repl{@code )} * yields exactly the same result as the expression * *

* * {@link java.util.regex.Pattern}.{@link * java.util.regex.Pattern#compile compile}(regex).{@link * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(str).{@link * java.util.regex.Matcher#replaceFirst replaceFirst}(repl) * *
* *

     * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
     * replacement string may cause the results to be different than if it were
     * being treated as a literal replacement string; see
     * {@link java.util.regex.Matcher#replaceFirst}.
     * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
     * meaning of these characters, if desired.
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     * @param   replacement
     *          the string to be substituted for the first match
     *
     * @return  The resulting {@code String}
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     *
     * Replaces the first substring that matches. As the parameters show, the first argument is a
     * regular expression string and the second is the replacement string.
     */
    public String replaceFirst(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
    }

    /**
     * Replaces each substring of this string that matches the given regular expression with the
     * given replacement.
     *
     *

An invocation of this method of the form * str{@code .replaceAll(}regex{@code ,} repl{@code )} * yields exactly the same result as the expression * *

* * {@link java.util.regex.Pattern}.{@link * java.util.regex.Pattern#compile compile}(regex).{@link * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(str).{@link * java.util.regex.Matcher#replaceAll replaceAll}(repl) * *
* *

     * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
     * replacement string may cause the results to be different than if it were
     * being treated as a literal replacement string; see
     * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
     * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
     * meaning of these characters, if desired.
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     * @param   replacement
     *          the string to be substituted for each match
     *
     * @return  The resulting {@code String}
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     *
     * Replaces every substring that matches. The implementation details are left for the discussion of the regex implementation.
     */
    public String replaceAll(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceAll(replacement);
    }

    /**
     * Replaces each substring of this string that matches the literal target
     * sequence with the specified literal replacement sequence. The
     * replacement proceeds from the beginning of the string to the end, for
     * example, replacing "aa" with "b" in the string "aaa" will result in
     * "ba" rather than "ab".
     *
     * @param  target The sequence of char values to be replaced
     * @param  replacement The replacement sequence of char values
     * @return  The resulting string
     * @since 1.5
     *
     * This also goes through the regex classes: target is compiled with Pattern.LITERAL so that it is
     * matched literally, and the replacement is quoted with Matcher.quoteReplacement(). How the regex
     * engine itself works is not covered here.
     */
    public String replace(CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
                this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

    /**
     * Splits this string around matches of the given
     * regular expression.
     *
     *

The array returned by this method contains each substring of this * string that is terminated by another substring that matches the given * expression or is terminated by the end of the string. The substrings in * the array are in the order in which they occur in this string. If the * expression does not match any part of the input then the resulting array * has just one element, namely this string. * *

When there is a positive-width match at the beginning of this * string then an empty leading substring is included at the beginning * of the resulting array. A zero-width match at the beginning however * never produces such empty leading substring. * *

The {@code limit} parameter controls the number of times the * pattern is applied and therefore affects the length of the resulting * array. If the limit n is greater than zero then the pattern * will be applied at most n - 1 times, the array's * length will be no greater than n, and the array's last entry * will contain all input beyond the last matched delimiter. If n * is non-positive then the pattern will be applied as many times as * possible and the array can have any length. If n is zero then * the pattern will be applied as many times as possible, the array can * have any length, and trailing empty strings will be discarded. * *

The string {@code "boo:and:foo"}, for example, yields the * following results with these parameters: * *

     *
     * Regex   Limit   Result
     * :       2       {@code { "boo", "and:foo" }}
     * :       5       {@code { "boo", "and", "foo" }}
     * :       -2      {@code { "boo", "and", "foo" }}
     * o       5       {@code { "b", "", ":and:f", "", "" }}
     * o       -2      {@code { "b", "", ":and:f", "", "" }}
     * o       0       {@code { "b", "", ":and:f" }}
* *

An invocation of this method of the form * str.{@code split(}regex{@code ,} n{@code )} * yields the same result as the expression * *

     *
     * {@link java.util.regex.Pattern}.{@link
     * java.util.regex.Pattern#compile compile}(regex).{@link
     * java.util.regex.Pattern#split(java.lang.CharSequence,int) split}(str, n)
     *
     *
     *
     * @param  regex
     *         the delimiting regular expression
     *
     * @param  limit
     *         the result threshold, as described above
     *
     * @return  the array of strings computed by splitting this string
     *          around matches of the given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     *
     * String splitting. regex is the regular expression string to split on; limit caps the number of
     * elements in the resulting array (see the description above).
     * How it works:
     * 1. A fast path is taken when regex is either a single character that is not a regex metacharacter,
     *    or a two-character string whose first character is a backslash and whose second character is NOT
     *    an ASCII digit (0-9) or ASCII letter (A-Z, a-z), and in both cases the character is not a
     *    high or low surrogate. On the fast path:
     *    a. if the character never occurs, return an array that contains only this string;
     *    b. otherwise cut the string at each occurrence, store the pieces in a list, and then convert the list to a String array;
     * 2. Otherwise, use the regex split method, Pattern.compile(regex).split(this, limit).
     */
    public String[] split(String regex, int limit) {
        /* fastpath if the regex is a
         (1)one-char String and this character is not one of the
            RegEx's meta characters ".$|()[{^?*+\\", or
         (2)two-char String and the first char is the backslash and
            the second is not the ascii digit or ascii letter.
         */
        char ch = 0;
        if (((regex.value.length == 1 &&
             ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
             (regex.length() == 2 &&
              regex.charAt(0) == '\\' &&
              (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
              ((ch-'a')|('z'-ch)) < 0 &&
              ((ch-'A')|('Z'-ch)) < 0)) &&
            (ch < Character.MIN_HIGH_SURROGATE ||
             ch > Character.MAX_LOW_SURROGATE))
        {
            int off = 0;
            int next = 0;
            boolean limited = limit > 0;
            ArrayList<String> list = new ArrayList<>();
            while ((next = indexOf(ch, off)) != -1) {
                if (!limited || list.size() < limit - 1) {
                    list.add(substring(off, next));
                    off = next + 1;
                } else {    // last one
                    //assert (list.size() == limit - 1);
                    list.add(substring(off, value.length));
                    off = value.length;
                    break;
                }
            }
            // If no match was found, return this
            if (off == 0)
                return new String[]{this};

            // Add remaining segment
            if (!limited || list.size() < limit)
                list.add(substring(off, value.length));

            // Construct result
            int resultSize = list.size();
            if (limit == 0) {
                while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                    resultSize--;
                }
            }
            String[] result = new String[resultSize];
            return list.subList(0, resultSize).toArray(result);
        }
        return Pattern.compile(regex).split(this, limit);
    }

    /**
     * Splits this string around matches of the given regular expression.
     *
     *

This method works as if by invoking the two-argument {@link * #split(String, int) split} method with the given expression and a limit * argument of zero. Trailing empty strings are therefore not included in * the resulting array. * *

The string {@code "boo:and:foo"}, for example, yields the following * results with these expressions: * *

     *
     * Regex   Result
     * :       {@code { "boo", "and", "foo" }}
     * o       {@code { "b", "", ":and:f" }}
     *
     *
     * @param  regex
     *         the delimiting regular expression
     *
     * @return  the array of strings computed by splitting this string
     *          around matches of the given regular expression
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     *
     * String splitting; this simply calls the previous method with a limit of 0.
     *
     */
    public String[] split(String regex) {
        return split(regex, 0);
    }

    /**
     * Returns a new String composed of copies of the
     * {@code CharSequence elements} joined together with a copy of
     * the specified {@code delimiter}.
     *
     *
For example, *
{@code
     *     String message = String.join("-", "Java", "is", "cool");
     *     // message returned is: "Java-is-cool"
     * }
     *
     * Note that if an element is null, then {@code "null"} is added.
     *
     * @param  delimiter the delimiter that separates each element
     * @param  elements the elements to join together.
     *
     * @return a new {@code String} that is composed of the {@code elements}
     *         separated by the {@code delimiter}
     *
     * @throws NullPointerException If {@code delimiter} or {@code elements}
     *         is {@code null}
     *
     * @see java.util.StringJoiner
     * @since 1.8
     *
     * A method added in JDK 1.8 for joining strings: it joins several strings together with the given delimiter. For example,
     * String message = String.join("-", "Java", "is", "cool");
     * returns "Java-is-cool".
     *
     * How it works:
     * 1. Check whether delimiter or elements is null and throw a NullPointerException if so (this is what Objects.requireNonNull() does);
     * 2. Create a StringJoiner with the delimiter and add each string to it;
     * 3. Return the result of toString().
     *
     * The Objects utility class used here was added in JDK 1.7 and lives in the java.util package. It provides
     * helper methods for working with objects, most of which are null-safe; here requireNonNull() does the null check.
     *
     * StringJoiner is also in java.util and was added in JDK 1.8. Besides a delimiter, it can also be given a prefix and a suffix for the result.
     */
    public static String join(CharSequence delimiter, CharSequence... elements) {
        Objects.requireNonNull(delimiter);
        Objects.requireNonNull(elements);
        // Number of elements not likely worth Arrays.stream overhead.
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs: elements) {
            joiner.add(cs);
        }
        return joiner.toString();
    }

    /**
     * Returns a new {@code String} composed of copies of the
     * {@code CharSequence elements} joined together with a copy of the
     * specified {@code delimiter}.
     *
     *
For example, *
{@code
     *     List<String> strings = new LinkedList<>();
     *     strings.add("Java");strings.add("is");
     *     strings.add("cool");
     *     String message = String.join(" ", strings);
     *     //message returned is: "Java is cool"
     *
     *     Set<String> strings = new LinkedHashSet<>();
     *     strings.add("Java"); strings.add("is");
     *     strings.add("very"); strings.add("cool");
     *     String message = String.join("-", strings);
     *     //message returned is: "Java-is-very-cool"
     * }
     *
     * Note that if an individual element is {@code null}, then {@code "null"} is added.
     *
     * @param  delimiter a sequence of characters that is used to separate each
     *         of the {@code elements} in the resulting {@code String}
     * @param  elements an {@code Iterable} that will have its {@code elements}
     *         joined together.
     *
     * @return a new {@code String} that is composed from the {@code elements}
     *         argument
     *
     * @throws NullPointerException If {@code delimiter} or {@code elements}
     *         is {@code null}
     *
     * @see    #join(CharSequence,CharSequence...)
     * @see    java.util.StringJoiner
     * @since 1.8
     *
     * This method is used and implemented the same way as the previous one; only the parameter type differs.
     *
     */
    public static String join(CharSequence delimiter,
            Iterable<? extends CharSequence> elements) {
        Objects.requireNonNull(delimiter);
        Objects.requireNonNull(elements);
        StringJoiner joiner = new StringJoiner(delimiter);
        for (CharSequence cs: elements) {
            joiner.add(cs);
        }
        return joiner.toString();
    }

    /**
     * Converts all of the characters in this {@code String} to lower
     * case using the rules of the given {@code Locale}.  Case mapping is based
     * on the Unicode Standard version specified by the {@link java.lang.Character Character}
     * class. Since case mappings are not always 1:1 char mappings, the resulting
     * {@code String} may be a different length than the original {@code String}.
     *

* Examples of lowercase mappings are in the following table: *

     *
     * Language Code of Locale   Upper Case     Lower Case     Description
     * tr (Turkish)              \u0130         \u0069         capital letter I with dot above -> small letter i
     * tr (Turkish)              \u0049         \u0131         capital letter I -> small letter dotless i
     * (all)                     French Fries   french fries   lowercased all chars in String
     * (all)                     capiota capchi captheta capupsil capsigma   iota chi theta upsilon sigma   lowercased all chars in String
     *
     * @param locale use the case transformation rules for this locale
     * @return the {@code String}, converted to lowercase.
     * @see     java.lang.String#toLowerCase()
     * @see     java.lang.String#toUpperCase()
     * @see     java.lang.String#toUpperCase(Locale)
     * @since   1.1
     *
     * Converts the characters of the string to lower case. How it works:
     * 1. If locale is null, throw a NullPointerException;
     * 2. Scan for the first character that needs to be lowercased. If one is found, break out of the scan block;
     *    if there is none, return this string;
     * 3. Because strings are Unicode, code points (surrogate pairs) have to be handled, so the code below looks
     *    somewhat complicated, but the actual logic is much the same as for plain ASCII.
     *
     */
    public String toLowerCase(Locale locale) {
        if (locale == null) {
            throw new NullPointerException();
        }

        int firstUpper;
        final int len = value.length;

        /* Now check if there are any characters that need to be changed. */
        scan: {
            for (firstUpper = 0 ; firstUpper < len; ) {
                char c = value[firstUpper];
                if ((c >= Character.MIN_HIGH_SURROGATE)
                        && (c <= Character.MAX_HIGH_SURROGATE)) {
                    int supplChar = codePointAt(firstUpper);
                    if (supplChar != Character.toLowerCase(supplChar)) {
                        break scan;
                    }
                    firstUpper += Character.charCount(supplChar);
                } else {
                    if (c != Character.toLowerCase(c)) {
                        break scan;
                    }
                    firstUpper++;
                }
            }
            return this;
        }

        char[] result = new char[len];
        int resultOffset = 0;  /* result may grow, so i+resultOffset
                                * is the write location in result */

        /* Just copy the first few lowerCase characters. */
        System.arraycopy(value, 0, result, 0, firstUpper);

        String lang = locale.getLanguage();
        boolean localeDependent =
                (lang == "tr" || lang == "az" || lang == "lt");
        char[] lowerCharArray;
        int lowerChar;
        int srcChar;
        int srcCount;
        for (int i = firstUpper; i < len; i += srcCount) {
            srcChar = (int)value[i];
            if ((char)srcChar >= Character.MIN_HIGH_SURROGATE
                    && (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
                srcChar = codePointAt(i);
                srcCount = Character.charCount(srcChar);
            } else {
                srcCount = 1;
            }
            if (localeDependent ||
                srcChar == '\u03A3' || // GREEK CAPITAL LETTER SIGMA
                srcChar == '\u0130') { // LATIN CAPITAL LETTER I WITH DOT ABOVE
                lowerChar = ConditionalSpecialCasing.toLowerCaseEx(this, i, locale);
            } else {
                lowerChar = Character.toLowerCase(srcChar);
            }
            if ((lowerChar == Character.ERROR)
                    || (lowerChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
                if (lowerChar == Character.ERROR) {
                    lowerCharArray =
                            ConditionalSpecialCasing.toLowerCaseCharArray(this, i, locale);
                } else if (srcCount == 2) {
                    resultOffset += Character.toChars(lowerChar, result, i + resultOffset) - srcCount;
                    continue;
                } else {
                    lowerCharArray = Character.toChars(lowerChar);
                }

                /* Grow result if needed */
                int mapLen = lowerCharArray.length;
                if (mapLen > srcCount) {
                    char[] result2 = new char[result.length + mapLen - srcCount];
                    System.arraycopy(result, 0, result2, 0, i + resultOffset);
                    result = result2;
                }
                for (int x = 0; x < mapLen; ++x) {
                    result[i + resultOffset + x] = lowerCharArray[x];
                }
                resultOffset += (mapLen - srcCount);
            } else {
                result[i + resultOffset] = (char)lowerChar;
            }
        }
        return new String(result, 0, len + resultOffset);
    }

    /**
     * Converts all of the characters in this {@code String} to lower
     * case using the rules of the default locale. This is equivalent to calling
     * {@code toLowerCase(Locale.getDefault())}.
     *

* Note: This method is locale sensitive, and may produce unexpected * results if used for strings that are intended to be interpreted locale * independently. * Examples are programming language identifiers, protocol keys, and HTML * tags. * For instance, {@code "TITLE".toLowerCase()} in a Turkish locale * returns {@code "t\u005Cu0131tle"}, where '\u005Cu0131' is the * LATIN SMALL LETTER DOTLESS I character. * To obtain correct results for locale insensitive strings, use * {@code toLowerCase(Locale.ROOT)}. *

     * @return  the {@code String}, converted to lowercase.
     * @see     java.lang.String#toLowerCase(Locale)
     */
    public String toLowerCase() {
        return toLowerCase(Locale.getDefault());
    }

    /**
     * Converts all of the characters in this {@code String} to upper
     * case using the rules of the given {@code Locale}. Case mapping is based
     * on the Unicode Standard version specified by the {@link java.lang.Character Character}
     * class. Since case mappings are not always 1:1 char mappings, the resulting
     * {@code String} may be a different length than the original {@code String}.
     *

* Examples of locale-sensitive and 1:M case mappings are in the following table. * *

     *
     * Language Code of Locale   Lower Case      Upper Case      Description
     * tr (Turkish)              \u0069          \u0130          small letter i -> capital letter I with dot above
     * tr (Turkish)              \u0131          \u0049          small letter dotless i -> capital letter I
     * (all)                     \u00df          \u0053 \u0053   small letter sharp s -> two letters: SS
     * (all)                     Fahrvergnügen   FAHRVERGNÜGEN
     * @param locale use the case transformation rules for this locale
     * @return the {@code String}, converted to uppercase.
     * @see     java.lang.String#toUpperCase()
     * @see     java.lang.String#toLowerCase()
     * @see     java.lang.String#toLowerCase(Locale)
     * @since   1.1
     *
     * Converts to upper case. The implementation mirrors the lower-case conversion, so it is not described again.
     */
    public String toUpperCase(Locale locale) {
        if (locale == null) {
            throw new NullPointerException();
        }

        int firstLower;
        final int len = value.length;

        /* Now check if there are any characters that need to be changed. */
        scan: {
            for (firstLower = 0 ; firstLower < len; ) {
                int c = (int)value[firstLower];
                int srcCount;
                if ((c >= Character.MIN_HIGH_SURROGATE)
                        && (c <= Character.MAX_HIGH_SURROGATE)) {
                    c = codePointAt(firstLower);
                    srcCount = Character.charCount(c);
                } else {
                    srcCount = 1;
                }
                int upperCaseChar = Character.toUpperCaseEx(c);
                if ((upperCaseChar == Character.ERROR)
                        || (c != upperCaseChar)) {
                    break scan;
                }
                firstLower += srcCount;
            }
            return this;
        }

        /* result may grow, so i+resultOffset is the write location in result */
        int resultOffset = 0;
        char[] result = new char[len]; /* may grow */

        /* Just copy the first few upperCase characters. */
        System.arraycopy(value, 0, result, 0, firstLower);

        String lang = locale.getLanguage();
        boolean localeDependent =
                (lang == "tr" || lang == "az" || lang == "lt");
        char[] upperCharArray;
        int upperChar;
        int srcChar;
        int srcCount;
        for (int i = firstLower; i < len; i += srcCount) {
            srcChar = (int)value[i];
            if ((char)srcChar >= Character.MIN_HIGH_SURROGATE &&
                (char)srcChar <= Character.MAX_HIGH_SURROGATE) {
                srcChar = codePointAt(i);
                srcCount = Character.charCount(srcChar);
            } else {
                srcCount = 1;
            }
            if (localeDependent) {
                upperChar = ConditionalSpecialCasing.toUpperCaseEx(this, i, locale);
            } else {
                upperChar = Character.toUpperCaseEx(srcChar);
            }
            if ((upperChar == Character.ERROR)
                    || (upperChar >= Character.MIN_SUPPLEMENTARY_CODE_POINT)) {
                if (upperChar == Character.ERROR) {
                    if (localeDependent) {
                        upperCharArray =
                                ConditionalSpecialCasing.toUpperCaseCharArray(this, i, locale);
                    } else {
                        upperCharArray = Character.toUpperCaseCharArray(srcChar);
                    }
                } else if (srcCount == 2) {
                    resultOffset += Character.toChars(upperChar, result, i + resultOffset) - srcCount;
                    continue;
                } else {
                    upperCharArray = Character.toChars(upperChar);
                }

                /* Grow result if needed */
                int mapLen = upperCharArray.length;
                if (mapLen > srcCount) {
                    char[] result2 = new char[result.length + mapLen - srcCount];
                    System.arraycopy(result, 0, result2, 0, i + resultOffset);
                    result = result2;
                }
                for (int x = 0; x < mapLen; ++x) {
                    result[i + resultOffset + x] = upperCharArray[x];
                }
                resultOffset += (mapLen - srcCount);
            } else {
                result[i + resultOffset] = (char)upperChar;
            }
        }
        return new String(result, 0, len + resultOffset);
    }

    /**
     * Converts all of the characters in this {@code String} to upper
     * case using the rules of the default locale. This method is equivalent to
     * {@code toUpperCase(Locale.getDefault())}.
     *

* Note: This method is locale sensitive, and may produce unexpected * results if used for strings that are intended to be interpreted locale * independently. * Examples are programming language identifiers, protocol keys, and HTML * tags. * For instance, {@code "title".toUpperCase()} in a Turkish locale * returns {@code "T\u005Cu0130TLE"}, where '\u005Cu0130' is the * LATIN CAPITAL LETTER I WITH DOT ABOVE character. * To obtain correct results for locale insensitive strings, use * {@code toUpperCase(Locale.ROOT)}. *

     * @return  the {@code String}, converted to uppercase.
     * @see     java.lang.String#toUpperCase(Locale)
     *
     * This overload converts to upper case using the default Locale object.
     *
     */
    public String toUpperCase() {
        return toUpperCase(Locale.getDefault());
    }

    /**
     * Returns a string whose value is this string, with any leading and trailing
     * whitespace removed.
     *

* If this {@code String} object represents an empty character * sequence, or the first and last characters of character sequence * represented by this {@code String} object both have codes * greater than {@code '\u005Cu0020'} (the space character), then a * reference to this {@code String} object is returned. *

* Otherwise, if there is no character with a code greater than * {@code '\u005Cu0020'} in the string, then a * {@code String} object representing an empty string is * returned. *

* Otherwise, let k be the index of the first character in the * string whose code is greater than {@code '\u005Cu0020'}, and let * m be the index of the last character in the string whose code * is greater than {@code '\u005Cu0020'}. A {@code String} * object is returned, representing the substring of this string that * begins with the character at index k and ends with the * character at index m-that is, the result of * {@code this.substring(k, m + 1)}. *

     * This method may be used to trim whitespace (as defined above) from
     * the beginning and end of a string.
     *
     * @return  A string whose value is this string, with any leading and trailing white
     *          space removed, or this string if it has no leading or
     *          trailing white space.
     *
     * This method removes whitespace from both ends of the string (any character with a code less than
     * or equal to \u0020 counts as whitespace).
     * How it works:
     * 1. st advances from the left to the index of the first non-whitespace character; len retreats from
     *    the right to one past the last non-whitespace character;
     * 2. If st is still 0 and len still equals the string length, nothing was removed and a reference to this object is returned;
     * 3. Otherwise, return the substring from st to len.
     */
    public String trim() {
        int len = value.length;
        int st = 0;
        char[] val = value;    /* avoid getfield opcode: the same performance optimization as in replace(), avoiding repeated getfield operations */

        while ((st < len) && (val[st] <= ' ')) {
            st++;
        }
        while ((st < len) && (val[len - 1] <= ' ')) {
            len--;
        }
        return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    }

    /**
     * This object (which is already a string!) is itself returned.
     *
     * @return  the string itself.
     *
     * Overrides Object's toString() method; it simply returns this object.
     *
     */
    public String toString() {
        return this;
    }

    /**
     * Converts this string to a new character array.
     *
     * @return  a newly allocated character array whose length is the length
     *          of this string and whose contents are initialized to contain
     *          the character sequence represented by this string.
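     *
     * Returns the contents of the string as a character array. Note that it uses System.arraycopy() rather than Arrays.copyOf().
     *
     * The source comment says this is because of class initialization order issues. When I first saw it I did not
     * really understand why. After a lot of searching through many resources, I found that someone had asked
     * the same question on Stack Overflow:
     * https://stackoverflow.com/questions/49715328/why-doesnt-string-tochararray-use-arrays-copyof
     *
     * So what does that thread say? (The following is my own summary of the Stack Overflow question; if anything is wrong, please tell me.)
     * 1. Modify String.java so that it uses Arrays.copyOf;
     * 2. Compile the modified String.java and save it into rt.jar under the jre;
     * 3. Write a simple test class, then compile and run it;
     * 4. The result is a NullPointerException.
     *
     * From the exception we can see that
     * 1. During initialization, System.initProperties (a native method) needs to process some Strings;
     * 2. While doing so, it may call String.toCharArray() to get char arrays from those strings;
     * 3. At that point String would call Arrays.copyOf, but Arrays has not been loaded yet;
     * 4. Unlike running Java code, a native method cannot trigger class initialization, so this results in the NullPointerException.
     *
     */
    public char[] toCharArray() {
        // Cannot use Arrays.copyOf because of class initialization order issues
        char result[] = new char[value.length];
        System.arraycopy(value, 0, result, 0, value.length);
        return result;
    }

    /**
     * Returns a formatted string using the specified format string and
     * arguments.
     *
     *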

     * The locale always used is the one returned by {@link
     * java.util.Locale#getDefault() Locale.getDefault()}.
     *
     * @param  format
     *         A format string
     *
     * @param  args
     *         Arguments referenced by the format specifiers in the format
     *         string.  If there are more arguments than format specifiers, the
     *         extra arguments are ignored.  The number of arguments is
     *         variable and may be zero.  The maximum number of arguments is
     *         limited by the maximum dimension of a Java array as defined by
     *         The Java™ Virtual Machine Specification.
     *         The behaviour on a
     *         {@code null} argument depends on the conversion.
     *
     * @throws  java.util.IllegalFormatException
     *          If a format string contains an illegal syntax, a format
     *          specifier that is incompatible with the given arguments,
     *          insufficient arguments given the format string, or other
     *          illegal conditions.  For specification of all possible
     *          formatting errors, see the Details section of the
     *          formatter class specification.
     *
     * @return  A formatted string
     *
     * @see  java.util.Formatter
     * @since  1.5
     *
     * String formatting; this delegates to a method of the Formatter class.
     */
    public static String format(String format, Object... args) {
        return new Formatter().format(format, args).toString();
    }

    /**
     * Returns a formatted string using the specified locale, format string,
     * and arguments.
     *
     * @param  l
     *         The {@linkplain java.util.Locale locale} to apply during
     *         formatting.  If {@code l} is {@code null} then no localization
     *         is applied.
     *
     * @param  format
     *         A format string
     *
     * @param  args
     *         Arguments referenced by the format specifiers in the format
     *         string.  If there are more arguments than format specifiers, the
     *         extra arguments are ignored.  The number of arguments is
     *         variable and may be zero.  The maximum number of arguments is
     *         limited by the maximum dimension of a Java array as defined by
     *         The Java™ Virtual Machine Specification.
     *         The behaviour on a
     *         {@code null} argument depends on the
     *         conversion.
     *
     * @throws  java.util.IllegalFormatException
     *          If a format string contains an illegal syntax, a format
     *          specifier that is incompatible with the given arguments,
     *          insufficient arguments given the format string, or other
     *          illegal conditions.  For specification of all possible
     *          formatting errors, see the Details section of the
     *          formatter class specification
     *
     * @return  A formatted string
     *
     * @see  java.util.Formatter
     * @since  1.5
     *
     * This format() overload formats a string for a given locale. The Locale object l identifies the region,
     * format is the format string, and Object... args is a varargs parameter, which can be thought of as an array.
     * For the syntax of the format argument, refer to the Formatter class.
     *
     */
    public static String format(Locale l, String format, Object... args) {
        return new Formatter(l).format(format, args).toString();
    }

    /**
     * Returns the string representation of the {@code Object} argument.
     *
     * @param   obj   an {@code Object}.
     * @return  if the argument is {@code null}, then a string equal to
     *          {@code "null"}; otherwise, the value of
     *          {@code obj.toString()} is returned.
     * @see     java.lang.Object#toString()
     *
     *
     */
    public static String valueOf(Object obj) {
        return (obj == null) ? "null" : obj.toString();
    }

    /**
     * Returns the string representation of the {@code char} array
     * argument. The contents of the character array are copied; subsequent
     * modification of the character array does not affect the returned
     * string.
     *
     * @param   data     the character array.
     * @return  a {@code String} that contains the characters of the
     *          character array.
     */
    public static String valueOf(char data[]) {
        return new String(data);
    }

    /**
     * Returns the string representation of a specific subarray of the
     * {@code char} array argument.
     *

     * The {@code offset} argument is the index of the first
     * character of the subarray. The {@code count} argument
     * specifies the length of the subarray. The contents of the subarray
     * are copied; subsequent modification of the character array does not
     * affect the returned string.
     *
     * @param   data     the character array.
     * @param   offset   initial offset of the subarray.
     * @param   count    length of the subarray.
     * @return  a {@code String} that contains the characters of the
     *          specified subarray of the character array.
     * @exception IndexOutOfBoundsException if {@code offset} is
     *          negative, or {@code count} is negative, or
     *          {@code offset+count} is larger than
     *          {@code data.length}.
     */
    public static String valueOf(char data[], int offset, int count) {
        return new String(data, offset, count);
    }

    /**
     * Equivalent to {@link #valueOf(char[], int, int)}.
     *
     * @param   data     the character array.
     * @param   offset   initial offset of the subarray.
     * @param   count    length of the subarray.
     * @return  a {@code String} that contains the characters of the
     *          specified subarray of the character array.
     * @exception IndexOutOfBoundsException if {@code offset} is
     *          negative, or {@code count} is negative, or
     *          {@code offset+count} is larger than
     *          {@code data.length}.
     */
    public static String copyValueOf(char data[], int offset, int count) {
        return new String(data, offset, count);
    }

    /**
     * Equivalent to {@link #valueOf(char[])}.
     *
     * @param   data   the character array.
     * @return  a {@code String} that contains the characters of the
     *          character array.
     */
    public static String copyValueOf(char data[]) {
        return new String(data);
    }

    /**
     * Returns the string representation of the {@code boolean} argument.
     *
     * @param   b   a {@code boolean}.
     * @return  if the argument is {@code true}, a string equal to
     *          {@code "true"} is returned; otherwise, a string equal to
     *          {@code "false"} is returned.
     */
    public static String valueOf(boolean b) {
        return b ? "true" : "false";
    }

    /**
     * Returns the string representation of the {@code char}
     * argument.
     *
     * @param   c   a {@code char}.
     * @return  a string of length {@code 1} containing
     *          as its single character the argument {@code c}.
     */
    public static String valueOf(char c) {
        char data[] = {c};
        return new String(data, true);
    }

    /**
     * Returns the string representation of the {@code int} argument.
     *
     * The representation is exactly the one returned by the
     * {@code Integer.toString} method of one argument.
     *
     * @param   i   an {@code int}.
     * @return  a string representation of the {@code int} argument.
     * @see     java.lang.Integer#toString(int, int)
     */
    public static String valueOf(int i) {
        return Integer.toString(i);
    }

    /**
     * Returns the string representation of the {@code long} argument.
     *
     * The representation is exactly the one returned by the
     * {@code Long.toString} method of one argument.
     *
     * @param   l   a {@code long}.
     * @return  a string representation of the {@code long} argument.
     * @see     java.lang.Long#toString(long)
     */
    public static String valueOf(long l) {
        return Long.toString(l);
    }

    /**
     * Returns the string representation of the {@code float} argument.
     *
     * The representation is exactly the one returned by the
     * {@code Float.toString} method of one argument.
     *
     * @param   f   a {@code float}.
     * @return  a string representation of the {@code float} argument.
     * @see     java.lang.Float#toString(float)
     */
    public static String valueOf(float f) {
        return Float.toString(f);
    }

    /**
     * Returns the string representation of the {@code double} argument.
     *
     * The representation is exactly the one returned by the
     * {@code Double.toString} method of one argument.
     *
     * @param   d   a {@code double}.
     * @return  a string representation of the {@code double} argument.
     * @see     java.lang.Double#toString(double)
     *
     * The valueOf overloads convert values of other types into String objects.
     */
    public static String valueOf(double d) {
        return Double.toString(d);
    }

    /**
     * Returns a canonical representation for the string object.
     *
     * A pool of strings, initially empty, is maintained privately by the
     * class {@code String}.
     *
     * When the intern method is invoked, if the pool already contains a
     * string equal to this {@code String} object as determined by
     * the {@link #equals(Object)} method, then the string from the pool is
     * returned. Otherwise, this {@code String} object is added to the
     * pool and a reference to this {@code String} object is returned.
     *
     * It follows that for any two strings {@code s} and {@code t},
     * {@code s.intern() == t.intern()} is {@code true}
     * if and only if {@code s.equals(t)} is {@code true}.
     *
     * All literal strings and string-valued constant expressions are
     * interned. String literals are defined in section 3.10.5 of the
     * The Java™ Language Specification.
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     *
     * intern() is a native method, so its implementation is platform-dependent.
     * Native methods run on a dedicated native method stack and are typically
     * used to reach files, dynamic libraries and other operating-system
     * facilities. Java code cannot access low-level operating-system details
     * directly, so a method marked native has no Java body; it is implemented
     * in C or C++ instead. This is the domain of JNI.
     *
     * The method area of the JVM runtime data areas holds a constant pool
     * (since JDK 7, that is, after JDK 1.6, the string pool lives in the heap).
     * The method was designed to reuse String objects and so save memory.
     */
    public native String intern();
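
The contract in the Javadoc above can be checked with a small sketch. The class name InternDemo and its two helper methods are made up for illustration and are not part of the JDK; the only library calls used are String.toCharArray(), the String(char[]) constructor, equals() and intern().

```java
class InternDemo {
    /** Builds a new String at runtime, so the result is a distinct object and not the pooled literal. */
    static String runtimeCopy(String s) {
        return new String(s.toCharArray());
    }

    /** True only when both references point to the very same object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "jdk";            // literals are interned automatically
        String copy = runtimeCopy("jdk");  // equal contents, different object

        System.out.println(sameObject(literal, copy));          // false
        System.out.println(literal.equals(copy));               // true
        System.out.println(sameObject(literal, copy.intern())); // true
    }
}
```

Because copy.intern() returns the pooled instance, comparing interned strings with == gives the same answer as equals(), which is exactly the "if and only if" statement in the Javadoc.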

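Going back to the valueOf and format methods listed earlier, the following sketch exercises three behaviours stated in their Javadoc: valueOf(char[]) copies the array, valueOf(Object) turns a null reference into the string "null", and format(Locale, ...) applies the given locale. The class ValueOfDemo and its method names are hypothetical and exist only for this example.

```java
import java.util.Locale;

class ValueOfDemo {
    /** valueOf(char[]) copies the array, so a later write to it does not change the String. */
    static String copyThenMutate() {
        char[] data = {'a', 'b', 'c'};
        String s = String.valueOf(data);
        data[0] = 'z';
        return s;
    }

    /** valueOf(Object) maps a null reference to the four-character string "null". */
    static String nullObject() {
        Object obj = null;
        return String.valueOf(obj);
    }

    /** An explicit Locale decides the decimal separator, whatever the default locale is. */
    static String formatted(Locale locale) {
        return String.format(locale, "%.2f", 3.14159);
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutate());          // abc
        System.out.println(nullObject());              // null
        System.out.println(formatted(Locale.US));      // 3.14
        System.out.println(formatted(Locale.GERMANY)); // 3,14
    }
}
```

Note that the null is passed through a variable of type Object on purpose: a bare String.valueOf(null) would select the char[] overload and throw a NullPointerException.
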
Reprinted from: https://www.cnblogs.com/xiaojiaxuezhang/p/8621093.html
