Java interview question explained: what is the default sort order for Chinese characters in Java?

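Yesterday a friend told me about an interview question from the technical director of a company in Xi'an:

What is the default sort order for Chinese characters in Java?

My friend thought for a moment and answered that it is ASCII order, but the technical director rejected that; in his view the order follows the GB2312 row-cell code (区位码) of the Chinese characters. In fact neither of them is right: by default strings are ordered by Unicode value. Here is what the JDK 1.6 documentation says about the compareTo method of class String: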

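compareTo

public int compareTo(String anotherString)

Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The character sequence represented by this String object is compared lexicographically to the character sequence represented by the argument string. The result is a negative integer if this String object lexicographically precedes the argument string. The result is a positive integer if this String object lexicographically follows the argument string. The result is zero if the strings are equal; compareTo returns 0 exactly when the equals(Object) method would return true.

This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two char values at position k in the two strings, that is, the value:

 this.charAt(k)-anotherString.charAt(k)

If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings, that is, the value:

 this.length()-anotherString.length()

Specified by:

compareTo in interface Comparable<String>

Parameters:

anotherString - the String to be compared.

Returns:

the value 0 if the argument string is equal to this string; a value less than 0 if this string is lexicographically less than the string argument; and a value greater than 0 if this string is lexicographically greater than the string argument.
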
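To make the documented rule concrete, here is a minimal sketch that re-implements it and prints the result next to what String.compareTo returns. The class name CompareToSketch and the helper lexCompare are hypothetical names chosen for this illustration; they are not part of the JDK. The Chinese characters are written as Unicode escapes so the file compiles under any source encoding.

```java
public class CompareToSketch {
    // Hypothetical helper: follows the rule quoted from the String.compareTo documentation.
    public static int lexCompare(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int k = 0; k < n; k++) {
            char x = a.charAt(k);
            char y = b.charAt(k);
            if (x != y) {
                // Smallest index k where the strings differ: return the char difference.
                return x - y;
            }
        }
        // No differing index: the shorter string comes first.
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        String[][] pairs = {
            { "apple", "apricot" },   // differ at k = 2: 'p' (112) - 'r' (114) = -2
            { "abc", "abcd" },        // no differing index: 3 - 4 = -1
            { "t", "\u4E09" },        // 't' (116) - U+4E09 (19977) = -19861
            { "\u4E09", "\u554A" }    // U+4E09 (19977) - U+554A (21834) = -1857
        };
        for (String[] p : pairs) {
            // Both columns should agree for every pair.
            System.out.println(lexCompare(p[0], p[1]) + " " + p[0].compareTo(p[1]));
        }
    }
}
```

The last two pairs show why ASCII letters sort before Chinese characters and why the order among Chinese characters follows their Unicode values rather than any other code.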

So when are strings actually sorted by the row-cell code of the Chinese characters? Here is some test code:

package test;

 

import java.io.UnsupportedEncodingException;

import java.text.Collator;

import java.text.RuleBasedCollator;

import java.util.Locale;

 

public class Sort2 {

    /**

     * @param args

     */

    public static void main(String[] args) {

       String[] test = new String[] { "", "", "", "t", "", "", "", "[", "", "", "", "", "", "", "", "", "", "", "", "", "", "",

              "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",

              "", "", "", "", "", "", "", "", "2", "0", "H", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "",

              "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "1", "", "", "", "", "", "",

              "", "", "", "", "", "", "", "", "", "", "", "", "", "", "8", "0", "H", "", "", "", "", "", "", "", "", "",

              "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "A", "0",

              "H", "", "", "", "", "", "", "", "", "" };

       java.util.Arrays.sort(test);

       System.out.println("============Default sort:");

       for (String key : test) {

           System.out.print(" GB2312: " + key + " = " + getString(key));

           //Unicode

           System.out.print("   Unicode : ");

           char[] s3 = key.toCharArray();

           for (char c : s3) {

              int d = c;

              System.out.print(c + " = " + d + " , ");

           }

           System.out.println();

       }

       System.out.println("============Sorted with the Chinese collator:");

       java.util.Arrays.sort(test, (RuleBasedCollator) Collator.getInstance(Locale.CHINA));

       for (String key : test) {

           System.out.print(" GB2312: " + key + " = " + getString(key));

           //Unicode

           System.out.print("   Unicode : ");

           char[] s3 = key.toCharArray();

           for (char c : s3) {

              int d = c;

              System.out.print(c + " = " + d + " , ");

           }

           System.out.println();

       }

       java.util.Arrays.sort(test);

    }

 

    // Convert Chinese characters to their GB2312 row-cell code (区位码)

    public static String getString(String chinese) {

       byte[] bs;

       String s = "";

       try {

           bs = chinese.getBytes("GB2312");

           for (int i = 0; i < bs.length; i++) {

              int a = Integer.parseInt(bytes2HexString(bs[i]), 16);

              int f = (a - 0x80 - 0x20);

              s += (f < 10 ? ("0" + f) : f) + "";

           }

       } catch (UnsupportedEncodingException e) {

           e.printStackTrace();

       }

       return s;

    }

 

    public static String bytes2HexString(byte b) {

       return bytes2HexString(new byte[] { b });

    }

 

    public static String bytes2HexString(byte[] b) {

       String ret = "";

       for (int i = 0; i < b.length; i++) {

           String hex = Integer.toHexString(b[i] & 0xFF);

           if (hex.length() == 1) {

              hex = '0' + hex;

           }

           ret += hex.toUpperCase();

       }

       return ret;

    }

}

 

 

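The test shows that the default sort is by Unicode value. The elements are sorted by other rules only when a Comparator implementation for that element type is passed in; here the comparator is Collator, the class that performs locale-sensitive String comparison.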
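For contrast, here is a minimal sketch of what an ordering by row-cell code would look like if you supplied it yourself as a Comparator. This is not the Collator approach used in the test code above; the class name QuweiSortSketch and its methods are hypothetical names for this illustration. It assumes the running JDK provides the GB2312 charset, which the test code above relies on as well. The three sample characters are U+4E09, U+554A and U+6C49, whose row-cell codes are 4093, 1601 and 2626.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class QuweiSortSketch {
    // Assumption: the GB2312 charset is available in this JDK.
    private static final Charset GB2312 = Charset.forName("GB2312");

    // Hypothetical helper: compares two strings by their GB2312 bytes, read as unsigned values.
    // For two-byte GB2312 characters this gives the same order as the row-cell code.
    // Characters GB2312 cannot encode are replaced by '?' and are not ordered meaningfully.
    public static int compareGb2312(String a, String b) {
        byte[] x = a.getBytes(GB2312);
        byte[] y = b.getBytes(GB2312);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    // Natural order: String.compareTo, i.e. by Unicode (char) value.
    public static String[] sortByUnicode(String[] in) {
        String[] out = in.clone();
        Arrays.sort(out);
        return out;
    }

    // Explicit comparator: order by GB2312 bytes.
    public static String[] sortByQuwei(String[] in) {
        String[] out = in.clone();
        Arrays.sort(out, QuweiSortSketch::compareGb2312);
        return out;
    }

    public static void main(String[] args) {
        String[] words = { "\u4E09", "\u554A", "\u6C49" };
        // By Unicode value: U+4E09 (19977), U+554A (21834), U+6C49 (27721)
        System.out.println(Arrays.toString(sortByUnicode(words)));
        // By row-cell code: U+554A (1601), U+6C49 (2626), U+4E09 (4093)
        System.out.println(Arrays.toString(sortByQuwei(words)));
    }
}
```

The two orders differ for the same input, which is the point of the interview question: Arrays.sort never consults GB2312 unless a comparator that does so is passed in.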

Below are the results from two environments: English XP SP3 (Chinese locale) with JDK 1.6, and Chinese XP SP3 (Chinese locale) with JDK 1.5. Both printed the same sequence. Each distinct string is listed once, in output order, together with the number of times it occurs in the array.

============Default sort:

String   GB2312 (as printed)   Unicode   Occurrences
0        0-112                 48        3
1        0-111                 49        1
2        0-110                 50        1
8        0-104                 56        1
A        0-95                  65        1
H        0-88                  72        3
[        0-69                  91        1
t        0-44                  116       1
、       0102                  12289     1
。       0103                  12290     2
三       4093                  19977     1
两       3329                  20004     4
个       2486                  20010     4
为       4610                  20026     3
之       5414                  20043     1
交       2927                  20132     2
位       4627                  20301     4
作       5587                  20316     1
六       3389                  20845     2
关       2556                  20851     1
内       3658                  20869     4
分       2354                  20998     4
别       1780                  21035     4
到       2129                  21040     3
制       5438                  21046     4
加       2851                  21152     4
区       3988                  21306     3
十       4214                  21313     4
即       2820                  21363     1
后       2683                  21518     2
和       2645                  21644     1
啊       1601                  21834     1
国       2590                  22269     4
字       5554                  23383     7
对       2252                  23545     3
应       5106                  24212     3
得       2135                  24471     3
我       4650                  25105     1
换       2727                  25442     4
是       4239                  26159     1
最       5578                  26368     1
机       2790                  26426     4
标       1774                  26631     4
汉       2626                  27721     3
测       1866                  27979     1
浏       6815                  27983     1
的       2136                  30340     8
皙       8010                  30361     1
码       3475                  30721     13
系       4721                  31995     1
者       5363                  32773     1
节       2958                  33410     4
转       5510                  36716     2
进       2988                  36827     4
镂       7946                  38210     1
间       2868                  38388     1
高       2463                  39640     1
(       0308                  65288     4
)       0309                  65289     3 (the listing breaks off here)

For the ASCII strings, the value in the GB2312 column is not a real row-cell code: getString subtracts 0xA0 from the single byte, so '0' (48) is printed as "0" followed by -112. The rows are in ascending Unicode order, while the row-cell codes in the second column are clearly not in ascending order.