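To turn a string into an array of strings according to some pattern, String.split(String) is the first method that comes to mind.
String.split is indeed convenient, but in performance-critical applications its cost becomes noticeable.
java.util.StringTokenizer can be used in place of String.split(), and it brings some performance gain.
The following example compares the cost of the two, along with a hand-written indexOf loop.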
import java.util.StringTokenizer;
import java.util.Vector;

public class SplitBenchmark {
    public static void main(String[] args) {
        String str = "abc";
        StringBuilder buffer = new StringBuilder();
        // prepare the string: "abc," repeated 10000 times
        for (int i = 0; i < 10000; i++) {
            buffer.append(str).append(",");
        }
        str = buffer.toString();

        // java.util.StringTokenizer
        long curTime = System.currentTimeMillis();
        for (int m = 0; m < 1000; m++) {
            StringTokenizer token = new StringTokenizer(str, ",");
            String[] array2 = new String[token.countTokens()];
            int i = 0;
            while (token.hasMoreTokens()) {
                array2[i++] = token.nextToken();
            }
        }
        System.out.println("java.util.StringTokenizer : " + (System.currentTimeMillis() - curTime));

        // String.split()
        curTime = System.currentTimeMillis();
        for (int m = 0; m < 1000; m++) {
            String[] array = str.split(",");
        }
        System.out.println("String.split : " + (System.currentTimeMillis() - curTime));

        // Vector & indexOf
        curTime = System.currentTimeMillis();
        for (int n = 0; n < 1000; n++) {
            Vector<String> vector = new Vector<String>();
            int index = 0, offset = 0;
            // search from offset so that a delimiter at position 0 is not skipped
            while ((index = str.indexOf(",", offset)) != -1) {
                vector.addElement(str.substring(offset, index));
                offset = index + 1;
            }
            String[] array3 = vector.toArray(new String[0]);
        }
        System.out.println("Vector & indexOf : " + (System.currentTimeMillis() - curTime));
    }
}

Output (elapsed milliseconds for 1000 iterations each):
java.util.StringTokenizer : 1407
String.split : 2546
Vector & indexOf : 1094
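The difference is obvious: in this test, StringTokenizer was nearly twice as fast as String.split().
Searching step by step with indexOf improved performance by roughly another 25%. Evidently, the closer the code is to the low-level primitives, the better it performs.
That said, this only really matters when performance requirements are high. For ordinary applications, String.split() is good enough.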
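Before swapping one approach for another, it is worth checking that they agree on edge cases, because they are not drop-in replacements for each other. The following is a minimal sketch of the difference; the class name SplitSemantics and the helper viaTokenizer are made up for this illustration, and the input "a,,b," is an arbitrary example.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

public class SplitSemantics {

    // Hypothetical helper: collects StringTokenizer tokens into an array,
    // the same way the benchmark loop does.
    static String[] viaTokenizer(String s, String delims) {
        StringTokenizer st = new StringTokenizer(s, delims);
        String[] out = new String[st.countTokens()];
        for (int i = 0; st.hasMoreTokens(); i++) {
            out[i] = st.nextToken();
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a,,b,";
        // split(",") keeps the empty token in the middle but drops trailing empty tokens
        System.out.println(Arrays.toString(input.split(",")));          // [a, , b]
        // a negative limit keeps trailing empty tokens as well
        System.out.println(Arrays.toString(input.split(",", -1)));      // [a, , b, ]
        // StringTokenizer never returns empty tokens
        System.out.println(Arrays.toString(viaTokenizer(input, ",")));  // [a, b]
    }
}
```

If empty fields carry meaning in the data (for example in CSV-like input), this difference probably matters more than the raw timing.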
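One more note:
When scanning with String.indexOf(), collecting the results in an ArrayList or a Vector (the two perform about the same) is not the best option either.
A faster approach is to first scan the string to count the delimiters and then store the tokens in a String[]; this is about twice as fast as using a Vector.
To go further still, a better scanning algorithm is needed.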
public static String[] split(String s, String delimiter) {
    if (s == null) {
        return null;
    }
    int delimiterLength;
    int stringLength = s.length();
    if (delimiter == null || (delimiterLength = delimiter.length()) == 0) {
        return new String[] {s};
    }

    // a two pass solution is used because a one pass solution would
    // require the possible resizing and copying of memory structures
    // In the worst case it would have to be resized n times with each
    // resize having a O(n) copy leading to an O(n^2) algorithm.
    int count;
    int start;
    int end;

    // Scan s and count the tokens.
    count = 0;
    start = 0;
    while ((end = s.indexOf(delimiter, start)) != -1) {
        count++;
        start = end + delimiterLength;
    }
    count++;

    // allocate an array to return the tokens,
    // we now know how big it should be
    String[] result = new String[count];

    // Scan s again, but this time pick out the tokens
    count = 0;
    start = 0;
    while ((end = s.indexOf(delimiter, start)) != -1) {
        result[count] = s.substring(start, end);
        count++;
        start = end + delimiterLength;
    }
    end = stringLength;
    result[count] = s.substring(start, end);

    return result;
}
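For the common case of a single-character delimiter, the same two-pass idea can be sketched with indexOf(char). This is only an illustration under that assumption, not a measured improvement; the names TwoPassCharSplit and splitOnChar are hypothetical.

```java
import java.util.Arrays;

public class TwoPassCharSplit {

    // Hypothetical helper: two-pass split on a single character.
    static String[] splitOnChar(String s, char delimiter) {
        if (s == null) {
            return null;
        }
        // Pass 1: count delimiters so the result array can be sized exactly.
        int count = 1;
        for (int i = s.indexOf(delimiter); i != -1; i = s.indexOf(delimiter, i + 1)) {
            count++;
        }
        // Pass 2: cut out the tokens between consecutive delimiters.
        String[] result = new String[count];
        int start = 0;
        for (int k = 0; k < count - 1; k++) {
            int end = s.indexOf(delimiter, start);
            result[k] = s.substring(start, end);
            start = end + 1;
        }
        result[count - 1] = s.substring(start); // text after the last delimiter
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnChar("abc,abc,", ','))); // [abc, abc, ]
        System.out.println(Arrays.toString(splitOnChar("abc", ',')));      // [abc]
        // For comparison, String.split drops the trailing empty token.
        System.out.println(Arrays.toString("abc,abc,".split(",")));        // [abc, abc]
    }
}
```

Like the two-pass split method, this sketch keeps empty tokens, including a trailing one, so its result can be one element longer than what String.split(",") returns for the same input.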