JDK Source Code Study and Analysis (Part 1)

1. java.lang.String

    1 Basic definition

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
     /** The value is used for character storage. */
    private final char value[];

    /** The offset is the first index of the storage that is used. */
    private final int offset;

    /** The count is the number of characters in the String. */
    private final int count;

    /** Cache the hash code for the string */
    private int hash; // Default to 0

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

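As the declaration shows, the String class implements the Serializable interface, so it can be serialized. It is also declared final, so it cannot be subclassed, which protects its behavior as one of the JDK's fundamental classes.

Its contents are stored in a char array held in a final field. The class never modifies that array after construction and never hands it out, so a String's value cannot change once it has been initialized. Even with String s1 = "s1"; s1 = "s2"; the first string is not modified; s1 is simply made to refer to a different string object. The hash field caches the string's hash code and defaults to 0, meaning it has not been computed yet.

In short, String is implemented on top of a char array.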
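The reassignment point can be checked with a few lines of code. The following is a minimal sketch; the class and method names (StringImmutabilityDemo, aliasAfterReassign, concatReturnsNewObject) are made up for illustration and are not part of the JDK.

```java
// Sketch: reassigning a String variable changes the reference, not the object.
class StringImmutabilityDemo {

    /** Returns what 'alias' still sees after 's1' is pointed at another literal. */
    static String aliasAfterReassign() {
        String s1 = "s1";
        String alias = s1; // second reference to the same String object
        s1 = "s2";         // s1 now refers to a different object
        return alias;      // the original object is untouched
    }

    /** A "modifying" method such as concat returns a new object instead. */
    static boolean concatReturnsNewObject() {
        String a = "abc";
        String b = a.concat("d");
        return a != b && a.equals("abc") && b.equals("abcd");
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassign());     // s1
        System.out.println(concatReturnsNewObject()); // true
    }
}
```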

2 Constructors

    

  public String(String original) {
        int size = original.count;
        char[] originalValue = original.value;
        char[] v;
        if (originalValue.length > size) {
            // The array representing the String is bigger than the new
            // String itself.  Perhaps this constructor is being called
            // in order to trim the baggage, so make a copy of the array.
            int off = original.offset;
            v = Arrays.copyOfRange(originalValue, off, off+size);
        } else {
            // The array representing the String is the same
            // size as the String, so no point in making a copy.
            v = originalValue;
        }
        this.offset = 0;
        this.count = size;
        this.value = v;
    }

 
    public String(char value[]) {
        int size = value.length;
        this.offset = 0;
        this.count = size;
        this.value = Arrays.copyOf(value, size);
    }


    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.offset = 0;
        this.count = count;
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }

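 Here we only need to look at these three constructors; the no-argument constructor is not analyzed. Besides the copy constructor, the other two both take a char array as a parameter.

One takes the length of the char array directly and then calls Arrays.copyOf, which in turn relies on System.arraycopy: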

 public static char[] copyOf(char[] original, int newLength) {
        char[] copy = new char[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

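The other constructor differs in that it also takes an offset and a count. After validating them, it calls Arrays.copyOfRange, which likewise relies on System.arraycopy underneath.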
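Because both char-array constructors copy their input, the resulting String is isolated from later changes to the caller's array. The sketch below illustrates this through the public API only; the class and method names are hypothetical.

```java
// Sketch: the char[] constructors copy their input, so the String is independent of it.
class StringConstructorCopyDemo {

    /** Later writes to the source array do not show up in the String. */
    static String buildThenMutate() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source);
        source[0] = 'z'; // changes only the caller's array
        return s;
    }

    /** The (char[], offset, count) form copies only the requested slice. */
    static String slice() {
        char[] source = {'h', 'e', 'l', 'l', 'o'};
        return new String(source, 1, 3);
    }

    /** A slice that runs past the end of the array is rejected. */
    static boolean rejectsBadRange() {
        try {
            new String(new char[] {'a', 'b'}, 1, 5);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // abc
        System.out.println(slice());           // ell
        System.out.println(rejectsBadRange()); // true
    }
}
```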

3 Common methods

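 length() and isEmpty() are both based on the count field: length() returns count, and isEmpty() checks whether count is 0.

 charAt() checks that the index is in range and then returns the char at position index + offset in the array.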

 

 public int length() {
        return count;
    }

    /**
     * Returns true if, and only if, {@link #length()} is 0.
     *
     * @return true if {@link #length()} is 0, otherwise
     * false
     *
     * @since 1.6
     */
    public boolean isEmpty() {
        return count == 0;
    }

    /**
     * Returns the char value at the
     * specified index. An index ranges from 0 to
     * length() - 1. The first char value of the sequence
     * is at index 0, the next at index 1,
     * and so on, as for array indexing.
     *
     *
     * If the char value specified by the index is a
     * surrogate, the surrogate value is returned.
     *
     * @param      index   the index of the char value.
     * @return     the char value at the specified index of this string.
     *             The first char value is at index 0.
     * @exception  IndexOutOfBoundsException  if the index
     *             argument is negative or not less than the length of this
     *             string.
     */
    public char charAt(int index) {
        if ((index < 0) || (index >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index + offset];
    }
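To make the role of offset and count concrete, here is a small stand-alone model of the same three fields. CharSlice is a hypothetical class written for this note, not a JDK type; it only mirrors how length(), isEmpty() and charAt() read value, offset and count in the source quoted in this section.

```java
// Sketch: a hypothetical window over a char array, mirroring value/offset/count.
final class CharSlice {
    private final char[] value; // backing storage, possibly larger than the window
    private final int offset;   // index of the first char that belongs to the window
    private final int count;    // number of chars in the window

    CharSlice(char[] value, int offset, int count) {
        if (offset < 0 || count < 0 || offset > value.length - count) {
            throw new StringIndexOutOfBoundsException("offset " + offset + ", count " + count);
        }
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    int length() {
        return count;
    }

    boolean isEmpty() {
        return count == 0;
    }

    char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index + offset]; // translate window index to array index
    }

    public static void main(String[] args) {
        CharSlice world = new CharSlice("hello world".toCharArray(), 6, 5);
        System.out.println(world.length());  // 5
        System.out.println(world.charAt(0)); // w
    }
}
```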

codePointAt()

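Code point and code unit are terms from the Unicode standard. The core of the Unicode standard is a coded character set
that assigns a unique number to every character. The standard always writes these numbers in hexadecimal with a U+ prefix;
for example, the character "A" is encoded as U+0041.
A code point is any number that the coded character set can use. The character set defines a range of valid code points,
but not every code point in that range necessarily has a character assigned to it. The valid Unicode code point range is U+0000 to U+10FFFF.
Unicode 4.0 assigns characters to 96,382 of these more than one million code points.
A code unit is the storage unit of a particular encoding. A Java char is one 16-bit UTF-16 code unit, so a code point above U+FFFF takes two chars (a surrogate pair).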
    public int codePointAt(int index) {
        if ((index < 0) || (index >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAtImpl(value, offset + index, offset + count);
    }

So this method returns the code point that starts at the given char index.
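The difference between chars and code points only shows up with characters above U+FFFF. The sketch below uses U+1F600 as an arbitrary example of such a character; the class and method names are hypothetical.

```java
// Sketch: char indices versus code points for a string containing a surrogate pair.
class CodePointDemo {
    // "a", then U+1F600 stored as the surrogate pair D83D DE00, then "b"
    static final String TEXT = "a\uD83D\uDE00b";

    /** Number of UTF-16 code units (chars). */
    static int charLength() {
        return TEXT.length();
    }

    /** Number of Unicode code points. */
    static int codePointLength() {
        return TEXT.codePointCount(0, TEXT.length());
    }

    /** Index 1 holds the high surrogate, so the whole pair is decoded. */
    static int codePointAtPairStart() {
        return TEXT.codePointAt(1);
    }

    /** Index 2 holds the low surrogate alone, which is returned as-is. */
    static int codePointInsidePair() {
        return TEXT.codePointAt(2);
    }

    public static void main(String[] args) {
        System.out.println(charLength());                                // 4
        System.out.println(codePointLength());                           // 3
        System.out.println(Integer.toHexString(codePointAtPairStart())); // 1f600
        System.out.println(Integer.toHexString(codePointInsidePair()));  // de00
    }
}
```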

Next come two of the more important methods:

equals() and compareTo()

public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = count;
            if (n == anotherString.count) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = offset;
                int j = anotherString.offset;
                while (n-- != 0) {
                    if (v1[i++] != v2[j++])
                        return false;
                }
                return true;
            }
        }
        return false;
    }

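equals() first checks whether the two references are the same object, then whether the argument is a String. Only if the lengths (count) match does it walk both value arrays and compare them char by char.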
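The three checks in equals() (identity, type, then content) can be observed from the outside. This is a small sketch with hypothetical class and method names:

```java
// Sketch: equals() compares content, while == compares references.
class StringEqualsDemo {

    /** Two distinct objects with the same chars are equal but not identical. */
    static boolean sameContentDifferentObjects() {
        String a = new String("abc");
        String b = new String("abc");
        return a != b && a.equals(b);
    }

    /** A non-String argument fails the instanceof check, whatever its content. */
    static boolean nonStringIsNeverEqual() {
        Object sb = new StringBuilder("abc");
        return !"abc".equals(sb);
    }

    /** null is not an instance of String, so the result is false, not an exception. */
    static boolean nullIsNotEqual() {
        return !"abc".equals(null);
    }

    public static void main(String[] args) {
        System.out.println(sameContentDifferentObjects()); // true
        System.out.println(nonStringIsNeverEqual());       // true
        System.out.println(nullIsNotEqual());              // true
    }
}
```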

 public int compareTo(String anotherString) {
        int len1 = count;
        int len2 = anotherString.count;
        int n = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;
        int i = offset;
        int j = anotherString.offset;

        if (i == j) {
            int k = i;
            int lim = n + i;
            while (k < lim) {
                char c1 = v1[k];
                char c2 = v2[k];
                if (c1 != c2) {
                    return c1 - c2;
                }
                k++;
            }
        } else {
            while (n-- != 0) {
                char c1 = v1[i++];
                char c2 = v2[j++];
                if (c1 != c2) {
                    return c1 - c2;
                }
            }
        }
        return len1 - len2;
    }

 

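compareTo() likewise walks the two arrays: it returns the difference between the first pair of chars that differ, or the difference in lengths if one string is a prefix of the other.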
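The return value is therefore a char difference or a length difference, not just -1, 0 or 1. The sketch below restates the loop on plain char arrays; compareChars is a hypothetical helper written for this note, shown next to the real String.compareTo for comparison.

```java
// Sketch: the compareTo() rule restated on plain char arrays.
class CompareToDemo {

    /** Hypothetical helper: first differing char decides, otherwise the lengths do. */
    static int compareChars(char[] a, char[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            if (a[k] != b[k]) {
                return a[k] - b[k];
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // First difference at index 2: 'p' (112) minus 'r' (114)
        System.out.println("apple".compareTo("apricot")); // -2
        // "abc" is a prefix of "abcde": 3 minus 5
        System.out.println("abc".compareTo("abcde"));     // -2
        // Uppercase sorts before lowercase: 'A' (65) minus 'a' (97)
        System.out.println(compareChars("Apple".toCharArray(), "apple".toCharArray())); // -32
    }
}
```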

The remaining methods are not covered here.

 

 

 
