Java Source Code: A Walkthrough of the String Class

The String class is one of the most important classes in Java because it is used throughout Java programs. This post walks through its source code and explains it.

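String is a special class in the Java language because it is the class that represents strings. It is declared final, which means it cannot be subclassed. Internally, a string is stored as an array in which every element is a char value, and those chars make up the string. (This holds for JDK 8 and earlier; since JDK 9 the backing store is a byte[] plus an encoding flag.) The listing below is an abridged excerpt of the String source in its older layout, which still carries the offset and count fields: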


public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];
    /** The offset is the first index of the storage that is used. */
    private final int offset;
    /** The count is the number of characters in the String. */
    private final int count;
    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.
     */
    public String() {
        this.value = "".toCharArray();
        this.offset = 0;
        this.count = 0;
    }

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the newly
     * created string is a copy of the argument string.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.offset = original.offset;
        this.count = original.count;
        this.hash = original.hash;
    }

    /**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
     *
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
        this.offset = 0;
        this.count = value.length;
    }

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     *
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".toCharArray();
                this.offset = 0;
                this.count = 0;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset + count);
        this.offset = 0;
        this.count = count;
    }

    /**
     * Allocates a new {@code String} containing characters constructed from
     * an array of 8-bit integer values. Each character c in the resulting
     * string is constructed from the corresponding component b in the
     * byte array such that:
     *
     *     c == (char)(((hibyte & 0xff) << 8) | (b & 0xff))
     *
     * @param  ascii
     *         The bytes to be converted to characters
     *
     * @param  hibyte
     *         The top 8 bits of each 16-bit Unicode code unit
     *
     * @deprecated  This method does not properly convert bytes into characters.
     *              As of JDK 1.1, the preferred way to do this is via the
     *              {@code String} constructors that take a character-encoding
     *              name or that use the platform's default encoding.
     * @see  String#String(byte[], int, int, java.lang.String)
     * @see  String#String(byte[], java.lang.String)
     * @see  String#String(byte[])
     */
    @Deprecated
    public String(byte ascii[], int hibyte) {
        // Delegates to the four-argument (byte[], int, int, int) constructor,
        // which is not shown in this excerpt.
        this(ascii, hibyte, 0, ascii.length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the platform's default charset. The length of the new
     * {@code String} is a function of the charset, and hence may not be equal
     * to the length of the byte array.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     */
    public String(byte bytes[]) {
        // Delegates to the (byte[], int, int) constructor, which is not shown
        // in this excerpt.
        this(bytes, 0, bytes.length);
    }

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the Unicode code point array argument. The {@code offset} argument
     * is the index of the first code point of the subarray and the
     * {@code count} argument specifies the length of the subarray. The
     * contents of the subarray are converted to {@code char}s; subsequent
     * modification of the {@code int} array does not affect the newly created
     * string.
     *
     * @param  codePoints
     *         Array that is the source of Unicode code points
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IllegalArgumentException
     *          If any invalid Unicode code point is found in {@code codePoints}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          elements outside the bounds of the {@code codePoints} array
     *
     * @since  1.5
     */
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0 || count < 0 || offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException();
        }
        // Pass 1: compute the exact number of chars needed. Each supplementary
        // code point takes two chars (a surrogate pair).
        final int end = offset + count;
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException();
        }
        final char[] v = new char[n];
        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }
        this.value = v;
        this.offset = 0;
        this.count = n;
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string buffer argument. The contents of the
     * string buffer are copied; subsequent modification of the string buffer
     * does not affect the newly created string.
     *
     * @param  original
     *         A {@code StringBuffer}
     */
    public String(StringBuffer original) {
        synchronized(original) {
            this.value = Arrays.copyOf(original.getValue(), original.length());
            this.offset = 0;
            this.count = original.length();
        }
    }

    // The remaining constructors and methods are omitted from this excerpt.
    // Note that java.lang.String declares public constructors taking a
    // StringBuffer and a StringBuilder, but none taking a CharSequence.
}
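
Two behaviors documented in the constructors above are easy to check from the outside: the `char[]` constructor copies its argument, and the `int[]` code point constructor emits two chars for a code point outside the Basic Multilingual Plane. The following is a minimal sketch; the class and method names are made up for illustration and are not part of the JDK.

```java
class StringConstructorDemo {
    // Builds a String from the array, then mutates the array afterwards.
    static String copyThenMutate(char[] source) {
        String s = new String(source);   // the contents are copied here
        source[0] = 'J';                 // changes the caller's array only
        return s;
    }

    // Builds a String from a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(new int[] { codePoint }, 0, 1);
    }

    public static void main(String[] args) {
        char[] chars = { 'h', 'e', 'l', 'l', 'o' };
        System.out.println(copyThenMutate(chars));   // hello
        System.out.println(new String(chars));       // Jello

        String supplementary = fromCodePoint(0x1F600);   // outside the BMP
        System.out.println(supplementary.length());      // 2 (a surrogate pair)
        System.out.println(
                supplementary.codePointCount(0, supplementary.length())); // 1
    }
}
```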

The String class represents an immutable sequence of characters. The rest of this post breaks the class down as follows:

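1. Source layout

In the JDK source tree, the String class lives in the `java.lang` package, in the file `String.java`.

2. Common methods

(1) Constructors

String provides several constructors that build a string from different kinds of input. The most commonly used ones take a byte array or a char array. The byte-array forms decode the bytes using either the platform's default charset or the `Charset` passed in.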

```java
public String(byte[] bytes)
public String(byte[] bytes, Charset charset)
public String(byte[] bytes, int offset, int length)
public String(byte[] bytes, int offset, int length, Charset charset)
public String(char[] value)
public String(char[] value, int offset, int count)
public String(String original)
```
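
As a rough illustration of the byte-array and char-array constructors listed above, the sketch below decodes UTF-8 bytes with an explicit charset and builds a string from part of a char array. The class and method names are hypothetical; only the `String` constructors and `StandardCharsets.UTF_8` come from the JDK.

```java
import java.nio.charset.StandardCharsets;

class StringFromBytesDemo {
    // Decodes a whole byte array as UTF-8.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decodes only `length` bytes starting at `offset` as UTF-8.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The second character is an accented e, which takes two bytes in UTF-8.
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytes.length);                // 6
        System.out.println(decodeUtf8(bytes).length());  // 5
        System.out.println(decodeUtf8(bytes, 3, 3));     // llo

        char[] chars = { 'a', 'b', 'c', 'd' };
        System.out.println(new String(chars, 1, 2));     // bc
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which can differ between machines.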

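(2) String operations

String provides a range of methods for working with strings. Because strings are immutable, the methods that appear to modify a string return a new String instead. For example:

- `length()`: returns the length of the string, counted in `char` units
- `charAt(int index)`: returns the `char` at the given index
- `substring(int beginIndex, int endIndex)`: returns the substring made of the characters from `beginIndex` through `endIndex - 1`
- `concat(String str)`: returns a new string with the argument appended to the end of this string
- `trim()`: returns a string with leading and trailing whitespace removed (any character with code U+0020 or below)
- `toLowerCase()`: returns a string with all characters converted to lowercase, using the rules of the default locale
- `toUpperCase()`: returns a string with all characters converted to uppercase, using the rules of the default locale
- `replace(char oldChar, char newChar)`: returns a new string in which every occurrence of `oldChar` is replaced with `newChar`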
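
The operations listed above can be tried out with a short program such as the one below. It is only a sketch: the class name and the `shout` helper are invented for this example.

```java
class StringOperationsDemo {
    // Chains three operations; each call returns a new String.
    static String shout(String raw) {
        return raw.trim().toUpperCase().concat("!");
    }

    public static void main(String[] args) {
        String padded = "  Hello, World  ";
        String trimmed = padded.trim();

        System.out.println(trimmed.length());            // 12
        System.out.println(trimmed.charAt(4));           // o
        System.out.println(trimmed.substring(7, 12));    // World
        System.out.println(trimmed.concat("!"));         // Hello, World!
        System.out.println(trimmed.toLowerCase());       // hello, world
        System.out.println(trimmed.toUpperCase());       // HELLO, WORLD
        System.out.println(trimmed.replace('l', 'L'));   // HeLLo, WorLd

        // None of the calls above changed the original string.
        System.out.println(padded.length());             // 16
        System.out.println(shout("  hey "));             // HEY!
    }
}
```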

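(3) String comparison

String provides several methods for comparing strings. For example:

- `equals(Object anObject)`: returns true if the argument is a String with the same sequence of characters
- `equalsIgnoreCase(String anotherString)`: compares two strings for equality, ignoring case
- `compareTo(String anotherString)`: compares two strings lexicographically and returns a negative integer, zero, or a positive integer
- `startsWith(String prefix)`: tests whether this string starts with the given prefix
- `endsWith(String suffix)`: tests whether this string ends with the given suffix
- `contains(CharSequence s)`: tests whether this string contains the given character sequence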
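
A small sketch of the comparison methods follows. It also shows why `equals` rather than `==` should be used to compare contents. The class name and the `sign` helper are assumptions made for this example.

```java
class StringComparisonDemo {
    // Reduces compareTo's result to -1, 0, or 1 so the ordering is easy to read.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        String a = "apple";
        String b = new String("apple");   // equal contents, distinct object

        System.out.println(a == b);                         // false
        System.out.println(a.equals(b));                    // true
        System.out.println("APPLE".equalsIgnoreCase(a));    // true

        System.out.println(sign("apple", "banana"));        // -1
        System.out.println(sign("apple", "apple"));         // 0
        System.out.println(sign("pear", "apple"));          // 1

        System.out.println(a.startsWith("app"));            // true
        System.out.println(a.endsWith("le"));               // true
        System.out.println(a.contains("ppl"));              // true
    }
}
```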

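(4) String conversion

String provides several static `valueOf` overloads that convert values of other data types to strings. For example:

- `valueOf(int i)`: converts an int to a string
- `valueOf(boolean b)`: converts a boolean to a string
- `valueOf(char c)`: converts a char to a string
- `valueOf(double d)`: converts a double to a string
- `valueOf(float f)`: converts a float to a string
- `valueOf(long l)`: converts a long to a string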
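
The `valueOf` overloads can be exercised as in the sketch below. The class name and the `describe` helper are hypothetical.

```java
class ValueOfDemo {
    // Joins the string forms of three differently typed values.
    static String describe(int count, boolean active, double ratio) {
        return String.valueOf(count) + "/" + String.valueOf(active)
                + "/" + String.valueOf(ratio);
    }

    public static void main(String[] args) {
        System.out.println(String.valueOf(42));              // 42
        System.out.println(String.valueOf(true));            // true
        System.out.println(String.valueOf('x'));             // x
        System.out.println(String.valueOf(3.5));             // 3.5
        System.out.println(String.valueOf(2.5f));            // 2.5
        System.out.println(String.valueOf(123456789012L));   // 123456789012
        System.out.println(describe(3, false, 0.5));         // 3/false/0.5
    }
}
```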

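(5) Other methods

- `intern()`: returns the canonical representation of the string from the string pool, so that two strings with equal contents yield the same reference after interning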
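
The effect of `intern()` shows up when comparing references. In the sketch below (class and helper names invented for illustration), a string built at run time is a different object from the literal until it is interned.

```java
class InternDemo {
    // Reference comparison, deliberately using == instead of equals.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "java";   // string literals are interned automatically
        String built = new String(new char[] { 'j', 'a', 'v', 'a' });

        System.out.println(built.equals(literal));                  // true
        System.out.println(sameReference(built, literal));          // false
        System.out.println(sameReference(built.intern(), literal)); // true
    }
}
```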

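3. Implementation

String stores its characters in a `char[]` (in JDK 8 and earlier), and String objects are immutable: once a String has been created, its content cannot change.

String objects live on the heap. Because a String is immutable, its hash code never changes, so it can be computed once and cached in the `hash` field. This saves repeated work, for example when strings are used as keys in hash-based collections.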
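
The caching idea can be sketched with a small stand-in class. `CachedHashText` below is hypothetical and is not the JDK's implementation; it only mirrors the pattern of a `hash` field that defaults to 0 and is filled in on first use, using the 31-based polynomial that `String.hashCode()` is documented to compute. The `computations` counter exists only to make the caching visible.

```java
final class CachedHashText {
    private final char[] value;
    private int hash;           // 0 means "not computed yet"
    private int computations;   // demo-only counter, not something String has

    CachedHashText(String text) {
        this.value = text.toCharArray();
    }

    int cachedHashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            computations++;
            for (char c : value) {
                h = 31 * h + c;   // s[0]*31^(n-1) + ... + s[n-1]
            }
            hash = h;
        }
        return h;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        CachedHashText text = new CachedHashText("hello");
        System.out.println(text.cachedHashCode() == "hello".hashCode()); // true
        text.cachedHashCode();                                           // served from the cache
        System.out.println(text.computations());                         // 1
    }
}
```

A string whose hash happens to be 0 is simply recomputed on every call in this sketch, which is harmless but not cached.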

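String has a private field, `value`, that holds the character array of the string. The field is declared `final`, but that only prevents it from being reassigned to a different array; it does not make the elements of the array read-only. The contents stay unchanged because the array is private, the constructors copy any array passed in from outside, and no public method writes to it or hands it out.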
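
The difference between a `final` reference and immutable contents can be seen in a few lines. This is a minimal sketch with made-up class and method names.

```java
class FinalVersusImmutableDemo {
    // `final` fixes the reference, not the elements of the array.
    static String writeThroughFinalReference() {
        final char[] letters = { 'a', 'b', 'c' };
        letters[0] = 'z';           // compiles: the elements are still writable
        // letters = new char[3];   // would not compile: reassignment is forbidden
        return new String(letters);
    }

    // toCharArray() returns a copy, so writing to it cannot reach the String.
    static String writeToCopy(String s) {
        char[] copy = s.toCharArray();
        copy[0] = 'z';              // changes the copy only
        return s;
    }

    public static void main(String[] args) {
        System.out.println(writeThroughFinalReference());   // zbc
        System.out.println(writeToCopy("abc"));             // abc
    }
}
```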

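All methods of String are safe to call from multiple threads. This is not because Java methods are synchronized by default: they are not, and a method is synchronized only when it is declared with the `synchronized` keyword. String is thread-safe because it is immutable. No method changes the state that other threads can observe, so several threads can call methods on the same String object at the same time without interfering with each other.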
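
As a rough demonstration, the sketch below shares one String among several threads without any locking and checks that every thread sees the same result. The class and method names are assumptions for this example, and a passing run illustrates the property rather than proving it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedStringDemo {
    // Runs the same read-only work on `shared` from several threads and
    // returns how many of them saw the expected result.
    static int countConsistentReads(String shared, int threads) {
        String expected = shared.toUpperCase(Locale.ROOT);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Boolean> task =
                        () -> shared.toUpperCase(Locale.ROOT).equals(expected);
                results.add(pool.submit(task));
            }
            int consistent = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    consistent++;
                }
            }
            return consistent;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countConsistentReads("shared text", 4));   // 4
    }
}
```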
