Thinking in Java, 4th Edition: Reading Notes 7, Chapter 13: Strings

Chapter contents

1. Basic use of String
2. The string concatenation operator +
3. The Object method toString
4. Common String methods
5. Formatted output of Strings
6. Regular expressions

13.1 Immutable Strings

// See also an earlier post of mine on pass-by-value in Java:
// https://blog.csdn.net/u011109881/article/details/80458946
// Immutable String
public class Immutable {
  public static String upcase(String s) {
    // s is a local copy of the reference; its scope is limited to this method
    return s.toUpperCase();
  }
  public static void main(String[] args) {
    String q = "howdy";
    System.out.println(q); // howdy
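    // The reference q is copied into the parameter; the original String is untouched
    String qq = upcase(q); // a new String holding the upper-case text is created

    System.out.println(qq); // HOWDY
    System.out.println(q); // the old value has not changed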
  }
} /* Output:
howdy
HOWDY
howdy
*///:~

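13.2 Overloaded "+" vs. StringBuilder

Because String is immutable, no reference to a String can change its value, but immutability can cause efficiency problems. The + and += operators can be used to concatenate strings, for example:

// javap -c Concatenation disassembles the compiled class
// Before running javap -c Concatenation, compile with javac Concatenation.java to generate the class file,
// otherwise javap reports: class not found: Concatenation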
public class Concatenation {
	void test() {
	}

	public static void main(String[] args) {
		String mango = "mango";
		String s = "abc" + mango + "def" + 47;
		System.out.println(s);
	}
}
/**

What the code above actually does:

D:\code\javatest\13string\src\strings>javap -c Concatenation
Warning: Binary file Concatenation contains strings.Concatenation
Compiled from "Concatenation.java"
public class strings.Concatenation {
  public strings.Concatenation();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  void test();
    Code:
       0: return

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #2                  // String mango
       2: astore_1
       3: new           #3                  // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
      10: ldc           #5                  // String abc
      12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      15: aload_1
      16: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      19: ldc           #7                  // String def
      21: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      24: bipush        47
      26: invokevirtual #8                  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
      29: invokevirtual #9                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      32: astore_2
      33: getstatic     #10                 // Field java/lang/System.out:Ljava/io/PrintStream;
      36: aload_2
      37: invokevirtual #11                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      40: return
}
 * **/
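// Although we never used StringBuilder in the program, the compiler used it for us: StringBuilder.append is called 4 times to build the final string.
// In other words, the + operator concatenates strings through StringBuilder.
// (This is the output of an older javac; since JDK 9 the compiler emits an invokedynamic call instead.)

So what is the difference between String and StringBuilder? Can we use String freely because Java optimizes for us and uses StringBuilder internally? Look at the next example.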

// String vs. StringBuilder
// Two methods that each concatenate all strings in a String array.
public class WhitherStringBuilder {
  public String implicit(String[] fields) { // concatenates with String +=
    String result = "";
    for(int i = 0; i < fields.length; i++)
      result += fields[i];
    return result;
  }
  public String explicit(String[] fields) { // concatenates with an explicit StringBuilder
    StringBuilder result = new StringBuilder();
    for(int i = 0; i < fields.length; i++)
      result.append(fields[i]);
    return result.toString();
  }
} ///:~
/**
javap -c WhitherStringBuilder.class
Compiled from "WhitherStringBuilder.java"
public class strings.WhitherStringBuilder {
  public strings.WhitherStringBuilder();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public java.lang.String implicit(java.lang.String[]);
    Code:
       0: ldc           #2                  // String
       2: astore_2
       3: iconst_0
       4: istore_3
       5: iload_3
       6: aload_1
       7: arraylength
       8: if_icmpge     38
      11: new           #3                  // class java/lang/StringBuilder
      14: dup
      15: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
      18: aload_2
      19: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      22: aload_1
      23: iload_3
      24: aaload
      25: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      28: invokevirtual #6                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      31: astore_2
      32: iinc          3, 1
      35: goto          5
      38: aload_2
      39: areturn

  public java.lang.String explicit(java.lang.String[]);
    Code:
       0: new           #3                  // class java/lang/StringBuilder
       3: dup
       4: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
       7: astore_2
       8: iconst_0
       9: istore_3
      10: iload_3
      11: aload_1
      12: arraylength
      13: if_icmpge     30
      16: aload_2
      17: aload_1
      18: iload_3
      19: aaload
      20: invokevirtual #5                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      23: pop
      24: iinc          3, 1
      27: goto          10
      30: aload_2
      31: invokevirtual #6                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      34: areturn
}

D:\code\javatest\13string\src\strings>
*/
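// In the first method, offsets 5-35 form the loop; the StringBuilder is constructed inside the loop, so a new one is created on every iteration.
// In the second method, offsets 10-27 form the loop; the StringBuilder is constructed outside the loop, so it is created only once.
For simple concatenation, String is fine, but when concatenating inside a loop, prefer StringBuilder.
For example: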

import java.util.Random;

public class UsingStringBuilder {
  public static Random rand = new Random(47);
  public String toString() {
    StringBuilder result = new StringBuilder("[");
    for(int i = 0; i < 25; i++) {
      result.append(rand.nextInt(100));
      result.append(", ");
    }
    result.delete(result.length()-2, result.length());
    result.append("]");
    return result.toString();
  }
  public static void main(String[] args) {
    UsingStringBuilder usb = new UsingStringBuilder();
    System.out.println(usb);
  }
} /* Output:
[58, 55, 93, 61, 61, 29, 68, 0, 22, 7, 88, 28, 51, 89, 9, 78, 98, 61, 20, 58, 16, 40, 11, 22, 4]
*///:~

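StringBuilder was introduced in Java SE 5. Besides append, it offers methods such as insert, replace, substring and reverse. Before Java SE 5, string concatenation used StringBuffer, which StringBuilder has since replaced for this purpose. StringBuilder is not thread-safe while StringBuffer is, so StringBuffer carries more overhead and string concatenation is faster from Java SE 5 on.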
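The in-place editing methods mentioned above can be sketched as follows. This is a minimal illustration; the class name StringBuilderOps and the sample text are made up for the example and do not come from the book.

```java
class StringBuilderOps {
  // Applies a few in-place edits to one StringBuilder and returns the result.
  static String demo() {
    StringBuilder sb = new StringBuilder("hello world");
    sb.insert(0, ">> ");        // ">> hello world"
    sb.replace(3, 8, "HELLO");  // replaces indices 3..7: ">> HELLO world"
    sb.reverse();               // "dlrow OLLEH >>"
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo()); // dlrow OLLEH >>
  }
}
```

Every call modifies the same underlying buffer, so no intermediate String objects are created until toString() is called.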

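13.3 Unintended Recursion

Every container class is an Object and therefore has a toString method. ArrayList's toString, for example, iterates over all elements and calls each element's toString; it does so by inheriting toString from its indirect superclass AbstractCollection.

The toString method of AbstractCollection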
    public String toString() {
        Iterator<E> it = iterator();
        if (! it.hasNext())
            return "[]";

        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for (;;) {
            E e = it.next();
            sb.append(e == this ? "(this Collection)" : e);
            if (! it.hasNext())
                return sb.append(']').toString();
            sb.append(',').append(' ');
        }
    }

Suppose we want toString to print the object's address. We might write:

import java.util.*;

public class InfiniteRecursion {
  public String toString() {
    return " InfiniteRecursion address: " + this + "\n";
  }
  public static void main(String[] args) {
    List<InfiniteRecursion> v =
      new ArrayList<InfiniteRecursion>();
    for(int i = 0; i < 10; i++)
      v.add(new InfiniteRecursion());
    System.out.println(v);
  }
} ///:~

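This causes unintended recursion.
At " InfiniteRecursion address: " + this an automatic type conversion takes place.
" InfiniteRecursion address: " is a String, so the compiler expects a String after the + operator. Since InfiniteRecursion is not a String, its own toString method is called, which evaluates the same expression again. The recursion never ends and the program fails with a java.lang.StackOverflowError.
The correct approach is to call super.toString(). The superclass of InfiniteRecursion is Object, and Object's toString returns the class name followed by @ and the hexadecimal hash code, which is what is usually meant by the object's "address".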
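A minimal sketch of that fix is shown below. The class name SafeAddress is made up for this illustration; the only change from InfiniteRecursion is that super.toString() is used in place of this.

```java
import java.util.ArrayList;
import java.util.List;

class SafeAddress {
  @Override
  public String toString() {
    // super.toString() is Object.toString(): class name + "@" + hex hash code.
    // It does not call this.toString(), so there is no recursion.
    return " SafeAddress address: " + super.toString() + "\n";
  }

  public static void main(String[] args) {
    List<SafeAddress> v = new ArrayList<>();
    for (int i = 0; i < 3; i++)
      v.add(new SafeAddress());
    System.out.println(v); // the hash codes differ from run to run
  }
}
```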

13.4 Operations on Strings

[Figures 1 and 2: tables of String methods]
These are only some of String's methods, and only a few are used regularly. Skim them and look them up again when needed.
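As a rough illustration of the handful of methods that come up most often, here is a small sketch. The selection of methods and the class name CommonStringOps are my own choice, not a list from the book.

```java
class CommonStringOps {
  // Applies several frequently used String methods to s and joins the results with "|".
  static String summarize(String s) {
    String t = s.trim();                            // strip leading and trailing whitespace
    return t.length()                               // number of chars
      + "|" + t.indexOf("in")                       // index of the first "in", or -1
      + "|" + t.substring(t.lastIndexOf(' ') + 1)   // text after the last space
      + "|" + t.toUpperCase()
      + "|" + t.contains("Java")
      + "|" + t.equalsIgnoreCase("thinking in java");
  }

  public static void main(String[] args) {
    System.out.println(summarize("  Thinking in Java  "));
    // 16|2|Java|THINKING IN JAVA|true|true
  }
}
```

None of these calls changes the original String; each returns a new value, in line with the immutability described in 13.1.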

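13.5 Formatted Output

13.5.1 Formatted Output in C

C cannot concatenate strings with + the way Java does, but it has its own way of producing output, for example:
printf("ROW 1:[%d %f]\n", x, y);
At run time x is inserted at the position of %d and y at the position of %f. The placeholder also states the type of the inserted value: %d says x is an integer and %f says y is a floating-point number (float or double).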

13.5.2 System.out.format()

	System.out.format("ROW 1:[%d %f]\n",x,y);
	System.out.printf("ROW 1:[%d %f]\n",x,y);
	System.out.print("ROW 1:["+x+" "+y+"]\n");

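These three are roughly equivalent (for a double y, %f always prints six decimal places, while + uses the default Double.toString form). If you are used to C you can use the C style; being used to Java, I still prefer the third.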

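13.5.3 The Formatter Class

Formatter formats text and sends the result to a destination of your choice.

import java.io.PrintStream;
import java.util.Formatter;

public class Turtle {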
  private String name;
  private Formatter f;
  public Turtle(String name, Formatter f) {
    this.name = name;
    this.f = f;
  }
  public void move(int x, int y) {
    f.format("%s The Turtle is at (%d,%d)\n", name, x, y);
  }
  public static void main(String[] args) {
    PrintStream outAlias = System.out;
    Turtle tommy = new Turtle("Tommy",
      new Formatter(System.err));
    Turtle terry = new Turtle("Terry",
      new Formatter(outAlias));
    tommy.move(0,0);
    terry.move(4,8);
    tommy.move(3,4);
    terry.move(2,5);
    tommy.move(3,3);
    terry.move(3,3);
  }
}

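13.5.4 Format Specifiers (strings such as "%-15s" in the example)

The general form is %[argument_index$][flags][width][.precision]conversion. In "%-15.15s" the - flag left-justifies, the width pads to at least 15 characters and the precision truncates to at most 15; in "%10.2f" the precision is the number of decimal places.

import java.util.Formatter;

public class Receipt {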
  private double total = 0;
  private Formatter f = new Formatter(System.out);
  public void printTitle() {
    f.format("%-15s %5s %10s\n", "Item", "Qty", "Price");
    f.format("%-15s %5s %10s\n", "----", "---", "-----");
  }
  public void print(String name, int qty, double price) {
    f.format("%-15.15s %5d %10.2f\n", name, qty, price);
    total += price;
  }
  public void printTotal() {
    f.format("%-15s %5s %10.2f\n", "Tax", "", total*0.06);
    f.format("%-15s %5s %10s\n", "", "", "-----");
    f.format("%-15s %5s %10.2f\n", "Total", "",
      total * 1.06);
  }
  public static void main(String[] args) {
    Receipt receipt = new Receipt();
    receipt.printTitle();
    receipt.print("Jack's Magic Beans", 4, 4.25);
    receipt.print("Princess Peas", 3, 5.1);
    receipt.print("Three Bears Porridge", 1, 14.29);
    receipt.printTotal();
  }
} /* Output:
Item              Qty      Price
----              ---      -----
Jack's Magic Be     4       4.25
Princess Peas       3       5.10
Three Bears Por     1      14.29
Tax                         1.42
                           -----
Total                      25.06
*///:~

13.5.5 Formatter Conversions

[Figure 3: image from the original post]

import java.math.BigInteger;
import java.util.Formatter;

public class Conversion {
  public static void main(String[] args) {
    Formatter f = new Formatter(System.out);

    char u = 'a';
    System.out.println("u = 'a'");
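    f.format("s: %s\n", u); // u printed as a String
    // f.format("d: %d\n", u); // u cannot be printed as a decimal integer
    f.format("c: %c\n", u); // u printed as a Unicode character
    f.format("b: %b\n", u); // u printed as a boolean: anything other than null or Boolean false prints true, even 0, unlike some other languages
    // f.format("f: %f\n", u); // u cannot be printed as a floating-point number
    // f.format("e: %e\n", u); // u cannot be printed in scientific notation
    // f.format("x: %x\n", u); // u cannot be printed as a hexadecimal integer
    f.format("h: %h\n", u); // hash code of u in hexadecimal (61 is the code point of 'a')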

    int v = 121;
    System.out.println("v = 121");
    f.format("d: %d\n", v);
    f.format("c: %c\n", v);
    f.format("b: %b\n", v);
    f.format("s: %s\n", v);
    // f.format("f: %f\n", v);
    // f.format("e: %e\n", v);
    f.format("x: %x\n", v);
    f.format("h: %h\n", v);

    BigInteger w = new BigInteger("50000000000000");
    System.out.println(
      "w = new BigInteger(\"50000000000000\")");
    f.format("d: %d\n", w);
    // f.format("c: %c\n", w);
    f.format("b: %b\n", w);
    f.format("s: %s\n", w);
    // f.format("f: %f\n", w);
    // f.format("e: %e\n", w);
    f.format("x: %x\n", w);
    f.format("h: %h\n", w);

    double x = 179.543;
    System.out.println("x = 179.543");
    // f.format("d: %d\n", x);
    // f.format("c: %c\n", x);
    f.format("b: %b\n", x);
    f.format("s: %s\n", x);
    f.format("f: %f\n", x);
    f.format("e: %e\n", x);
    // f.format("x: %x\n", x);
    f.format("h: %h\n", x);

    Conversion y = new Conversion();
    System.out.println("y = new Conversion()");
    // f.format("d: %d\n", y);
    // f.format("c: %c\n", y);
    f.format("b: %b\n", y);
    f.format("s: %s\n", y);
    // f.format("f: %f\n", y);
    // f.format("e: %e\n", y);
    // f.format("x: %x\n", y);
    f.format("h: %h\n", y);

    boolean z = false;
    System.out.println("z = false");
    // f.format("d: %d\n", z);
    // f.format("c: %c\n", z);
    f.format("b: %b\n", z);
    f.format("s: %s\n", z);
    // f.format("f: %f\n", z);
    // f.format("e: %e\n", z);
    // f.format("x: %x\n", z);
    f.format("h: %h\n", z);
  }
} /* Output: (Sample)
u = 'a'
s: a
c: a
b: true
h: 61
v = 121
d: 121
c: y
b: true
s: 121
x: 79
h: 79
w = new BigInteger("50000000000000")
d: 50000000000000
b: true
s: 50000000000000
x: 2d79883d2000
h: 8842a1a7
x = 179.543
b: true
s: 179.543
f: 179.543000
e: 1.795430e+02
h: 1ef462c
y = new Conversion()
b: true
s: Conversion@9cab16
h: 9cab16
z = false
b: false
s: false
h: 4d5
*///:~

13.5.6 String.format()

String.format() takes the same arguments as Formatter.format(), but instead of writing to a destination it returns a String, which makes it easy to obtain formatted text.
public class DatabaseException extends Exception {
  public DatabaseException(int transactionID, int queryID,
    String message) {
    super(String.format("(t%d, q%d) %s", transactionID,
        queryID, message)); // the arguments replace %d, %d and %s in order
  }
  public static void main(String[] args) {
    try {
      throw new DatabaseException(3, 7, "Write failed");
    } catch(Exception e) {
      System.out.println(e);
    }
  }
} /* Output:
DatabaseException: (t3, q7) Write failed
*///:~

Here is the internal implementation of String.format. It still relies on Formatter.format:

public static String format(String format, Object... args) {
    return new Formatter().format(format, args).toString();
}

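13.6.1 Regular Expressions

A regular expression tests whether a string satisfies certain conditions.
For example, \d stands for one digit.
In Java the backslash "\" is handled a little differently from other languages.
In a Java string literal \ starts an escape sequence such as \n or \t; a sequence with no meaning, such as \a, is a compile error.
So to store \a we must write "\\a", whose stored value is \a. Likewise "\\n" stores the two characters \n, while "\n" stores a line break.
To match a literal backslash with a regular expression, the regex itself must be \\, which in Java source is written "\\\\".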
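The escaping rules above can be checked with a short sketch. The class name BackslashEscapes and the helper hasBackslash are made up for this illustration.

```java
class BackslashEscapes {
  // True if s contains at least one literal backslash.
  // The regex is .*\\.* ; each regex backslash is doubled again in Java source.
  static boolean hasBackslash(String s) {
    return s.matches(".*\\\\.*");
  }

  public static void main(String[] args) {
    System.out.println("\\a".length());         // 2: a backslash and 'a'
    System.out.println("\\n".length());         // 2: a backslash and 'n'
    System.out.println("\n".length());          // 1: a line break
    System.out.println(hasBackslash("a\\b"));   // true
    System.out.println(hasBackslash("ab"));     // false
    System.out.println("42".matches("\\d+"));   // true: regex \d+ written as "\\d+"
  }
}
```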

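Zero or one occurrence is written with "?".
For example, an optional minus sign is "-?".
\d matches a single digit.
One or more occurrences are written with +.
For example, one or more digits is \d+ (written "\\d+" in Java source).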

A simple use of regular expressions

public class IntegerMatch {
  public static void main(String[] args) {
    System.out.println("-1234".matches("-?\\d+")); // an optional minus sign followed by one or more digits
    System.out.println("5678".matches("-?\\d+"));
    System.out.println("+911".matches("-?\\d+"));
    System.out.println("+911".matches("(-|\\+)?\\d+")); // the regex is (-|\+)?, written (-|\\+)? in Java: an optional leading + or -
  }
} /* Output:
true
true
false
true
*///:~

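The split() method of String
Purpose: splits a String around matches of a regular expression; the matched parts are removed from the result.
Example:

import java.util.Arrays;

public class Splitting {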
  public static String knights =
    "Then, when you have found the shrubbery, you must " +
    "cut down the mightiest tree in the forest... " +
    "with... a herring!";
  public static void split(String regex) {
    System.out.println(
      Arrays.toString(knights.split(regex)));
  }
  public static void main(String[] args) {
    split(" "); // Doesn't have to contain regex chars; splits on single spaces
    split("\\W+"); // Non-word characters: \W is a non-word character (such as the commas here), so this splits on runs of one or more of them
    split("n\\W+"); // 'n' followed by one or more non-word characters
  }
} /* Output:
[Then,, when, you, have, found, the, shrubbery,, you, must, cut, down, the, mightiest, tree, in, the, forest..., with..., a, herring!]
[Then, when, you, have, found, the, shrubbery, you, must, cut, down, the, mightiest, tree, in, the, forest, with, a, herring]
[The, whe, you have found the shrubbery, you must cut dow, the mightiest tree i, the forest... with... a herring!]
*///:~

An overloaded version of String.split takes a limit that caps how many pieces the string is split into.
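A brief sketch of the two-argument split(regex, limit) follows. The class name SplitLimit, the helper splitAtMost and the sample inputs are made up for this illustration.

```java
import java.util.Arrays;

class SplitLimit {
  // Splits s around regex with the given limit and renders the pieces as text.
  static String splitAtMost(String s, String regex, int limit) {
    return Arrays.toString(s.split(regex, limit));
  }

  public static void main(String[] args) {
    // A positive limit n yields at most n pieces; the last piece keeps the rest.
    System.out.println(splitAtMost("a,b,c,d", ",", 2));   // [a, b,c,d]
    // Limit 0 behaves like split(regex): trailing empty strings are dropped.
    System.out.println(splitAtMost("a,b,,", ",", 0));     // [a, b]
    // A negative limit keeps trailing empty strings.
    System.out.println(splitAtMost("a,b,,", ",", -1));    // [a, b, , ]
  }
}
```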
Replacing String content with regular expressions
Example:

public class Replacing {
	  public static String knights =
			    "Then, when you have found the shrubbery, you must " +
			    "cut down the mightiest tree in the forest... " +
			    "with... a herring!";
  public static void main(String[] args) {
    System.out.println(knights.replaceFirst("f\\w+", "located"));
    System.out.println(knights.replaceAll("shrubbery|tree|herring","banana"));
  }
} /* Output:
Then, when you have located the shrubbery, you must cut down the mightiest tree in the forest... with... a herring!
Then, when you have found the banana, you must cut down the mightiest banana in the forest... with... a banana!
*///:~

13.6.2 Creating Regular Expressions

[Figure 4: image from the original post]
[Figure 5: image from the original post]
Boundary matchers
[Figure 6: image from the original post]
Some special characters and regex rules
Example: using regular expressions

public class Rudolph {
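	// An array of four patterns; check whether "Rudolph" matches each of them
  public static void main(String[] args) {
	  // The four patterns mean: 1. Rudolph  2. r or R followed by udolph  3. r or R, then one of aeiou, then a lowercase letter, then ol followed by any characters
	  // 4. R followed by any characters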
    for(String pattern : new String[]{ "Rudolph",
      "[rR]udolph", "[rR][aeiou][a-z]ol.*", "R.*" })
      System.out.println("Rudolph".matches(pattern));
  }
} /* Output:
true
true
true
true
*///:~

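13.6.3 Quantifiers

Quantifier table
[Figure 7: image from the original post]
The book's explanation is not very clear; see this online reference:
https://blog.csdn.net/zfq642773391/article/details/5506618

Greedy is the most common mode. It first swallows the whole string and tries to match; if that fails it gives back one character from the right end and tries again, until a match is found or the whole string has been given back. Because it always starts from the longest possible match, it is called greedy.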

Matcher m=Pattern.compile("a.*b")
              .matcher("a====b=========b=====");
while(m.find()){
      System.out.println(m.group());
		}


Output:
a====b=========b

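Reluctant is the opposite of greedy. It starts from the shortest possible match: it swallows one character from the left, tries to match, and swallows one more if that fails, until a match is found or the whole string has been swallowed. Because it always starts from the smallest match, it is called reluctant (or lazy).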

Matcher m=Pattern.compile("a.*?b")
                  .matcher("a====b=========b=====");
while(m.find()){
        System.out.println(m.group());
		}

Output:
a====b

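Possessive starts like greedy: it swallows the whole string and tries to match. If that attempt fails, it does not give characters back from the right end, so the match simply fails. Because it is greedy but never backtracks, it is called possessive.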

Matcher m=Pattern.compile("a.*+b")
                   .matcher("a====b=========b=====");
while(m.find()){
	System.out.println(m.group());
		}

Output:

(no match found)
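The three fragments above can be combined into one runnable sketch for a side-by-side comparison. The class name QuantifierModes and the helper firstMatch are made up for this illustration; the patterns and the input string are the ones used above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModes {
  // Returns the first match of regex in input, or null if there is none.
  static String firstMatch(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.find() ? m.group() : null;
  }

  public static void main(String[] args) {
    String input = "a====b=========b=====";
    System.out.println(firstMatch("a.*b", input));   // greedy:     a====b=========b
    System.out.println(firstMatch("a.*?b", input));  // reluctant:  a====b
    System.out.println(firstMatch("a.*+b", input));  // possessive: null (no match)
  }
}
```

In the possessive case .*+ consumes everything after the a and never gives anything back, so the trailing b can never match.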

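13.6.4 Pattern and Matcher

Pattern: Pattern.compile() compiles a regular expression given as a String into a Pattern object.
Matcher: the matcher() method of a Pattern object creates a Matcher, which offers many methods for matching and replacing.
An example for testing regular expressions (to pass args to main in Eclipse: Run -> Run Configurations -> Arguments -> Program arguments)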

// Allows you to easily try out regular expressions.
// {Args: abcabcabcdefabc "abc+" "(abc)+" "(abc){2,}" } // these are the input arguments; all quantifiers here are greedy
// abc+ means ab followed by one or more c
// (abc)+ means one or more abc
// (abc){2,} means abc repeated at least twice
import java.util.regex.*;

public class TestRegularExpression {
  public static void main(String[] args) {
    if(args.length < 2) {
      System.out.println("Usage:\njava TestRegularExpression " +
        "characterSequence regularExpression+");
      System.exit(0);
    }
    System.out.println("Input: \"" + args[0] + "\""); // print argument 0 in quotes
    for(String arg : args) { // compile each argument (including argument 0 itself) into a Pattern and match it against argument 0
      System.out.println("Regular expression: \"" + arg + "\"");
      Pattern p = Pattern.compile(arg);
      Matcher m = p.matcher(args[0]);
      while(m.find()) {
        System.out.println("Match \"" + m.group() + "\" at positions " +
          m.start() + "-" + (m.end() - 1));
      }
    }
  }
} /* Output:
Input: "abcabcabcdefabc"
Regular expression: "abcabcabcdefabc"
Match "abcabcabcdefabc" at positions 0-14
Regular expression: "abc+"
Match "abc" at positions 0-2
Match "abc" at positions 3-5
Match "abc" at positions 6-8
Match "abc" at positions 12-14
Regular expression: "(abc)+"
Match "abcabcabc" at positions 0-8
Match "abc" at positions 12-14
Regular expression: "(abc){2,}"
Match "abcabcabc" at positions 0-8
*///:~

Some Pattern methods
public static boolean matches(String regex, CharSequence input) // checks whether the whole input matches regex

Some Matcher methods

public boolean matches() {
    return match(from, ENDANCHOR);
}

public boolean lookingAt() {
    return match(from, NOANCHOR);
}
public boolean find() 
public boolean find(int start)
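The difference between these methods can be sketched as follows: matches() requires the whole input to match, lookingAt() only requires a match at the beginning, and find() accepts a match anywhere. The class name MatchKinds and the helper check are made up for this illustration.

```java
import java.util.regex.Pattern;

class MatchKinds {
  // Reports "matches lookingAt find" for regex against input.
  static String check(String regex, String input) {
    Pattern p = Pattern.compile(regex);
    boolean whole = p.matcher(input).matches();      // entire input must match
    boolean prefix = p.matcher(input).lookingAt();   // match must start at index 0
    boolean anywhere = p.matcher(input).find();      // match may start anywhere
    return whole + " " + prefix + " " + anywhere;
  }

  public static void main(String[] args) {
    System.out.println(check("\\d+", "123"));     // true true true
    System.out.println(check("\\d+", "123abc"));  // false true true
    System.out.println(check("\\d+", "abc123"));  // false false true
  }
}
```

A fresh Matcher is created for each call so that the state left by one method does not affect the next.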

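The most commonly used are matches() and the no-argument find().
find can locate multiple matches within one string.
Example: find() vs. find(i)

import java.util.regex.*;

public class Finding {
	public static void main(String[] args) {
		Matcher m = Pattern.compile("\\w+") // word characters: [a-zA-Z_0-9]
				.matcher("Evening is full of the linnet's wings"); // scans forward; text already matched is not searched again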
		while (m.find())
			System.out.println(m.group() + " ");
		System.out.println("=============================");
		int i = 0;
		while (m.find(i)) { // find(i) resets the matcher and searches from index i
			System.out.println(m.group() + " " + "(" + i + ")");
			i++;
		}
	}
} /*
Evening 
is 
full 
of 
the 
linnet 
s 
wings 
=============================
Evening (0)
vening (1)
ening (2)
ning (3)
ing (4)
ng (5)
g (6)
is (7)
is (8)
s (9)
full (10)
full (11)
ull (12)
ll (13)
l (14)
of (15)
of (16)
f (17)
the (18)
the (19)
he (20)
e (21)
linnet (22)
linnet (23)
innet (24)
nnet (25)
net (26)
et (27)
t (28)
s (29)
s (30)
wings (31)
wings (32)
ings (33)
ngs (34)
gs (35)
s (36)
 */// :~

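Groups in regular expressions
A Matcher gives access to the groups of its pattern.
For example:

For the regular expression A(B(C))D
group(0) is ABCD
group(1) is BC
group(2) is C
group() without an argument is group(0)
groupCount() returns the number of capturing groups, not counting group 0; here it is 2
Matcher has two more methods, start and end
start returns the start index of the group found in the previous match
end returns the index of the last matched character of that group plus 1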
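The group numbering and the start/end indices described above can be sketched like this. The class name GroupDemo, the helper describe and the input "xxABCDxx" are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
  // Lists every group of the first match of A(B(C))D as text[start,end).
  static String describe(String input) {
    Matcher m = Pattern.compile("A(B(C))D").matcher(input);
    if (!m.find())
      return "no match";
    StringBuilder sb = new StringBuilder("groupCount=" + m.groupCount());
    for (int i = 0; i <= m.groupCount(); i++)
      sb.append(String.format(" g%d=%s[%d,%d)", i, m.group(i), m.start(i), m.end(i)));
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(describe("xxABCDxx"));
    // groupCount=2 g0=ABCD[2,6) g1=BC[3,5) g2=C[4,5)
  }
}
```

The loop runs up to and including groupCount() because group 0 (the whole match) is not part of that count.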

Example

import java.util.regex.*;

public class Groups {
  static public final String POEM =
    "Twas brillig, and the slithy toves\n" +
    "Did gyre and gimble in the wabe.\n" +
    "All mimsy were the borogoves,\n" +
    "And the mome raths outgrabe.\n\n" +
    "Beware the Jabberwock, my son,\n" +
    "The jaws that bite, the claws that catch.\n" +
    "Beware the Jubjub bird, and shun\n" +
    "The frumious Bandersnatch.";
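  public static void main(String[] args) {
	  // \S+ matches one or more non-whitespace characters, \s+ one or more whitespace characters, (?m) is the MULTILINE flag, $ matches the end of a line
	  // The whole regex captures the last three words of each line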
    Matcher m =
      Pattern.compile("(?m)(\\S+)\\s+((\\S+)\\s+(\\S+))$")
        .matcher(POEM);
    // The groups of the regex are
    //group(0)->(\\S+)\\s+((\\S+)\\s+(\\S+))
    //group(1)->(\\S+)
    //group(2)->((\\S+)\\s+(\\S+))
    //group(3)->(\\S+)
    //group(4)->(\\S+)
    while(m.find()) {
      for(int j = 0; j <= m.groupCount(); j++)
    	  System.out.println("[" + m.group(j) + "]"); // after each match, print every group from group(0) to group(groupCount())
      System.out.println();
    }
  }
} /* Output:
[the slithy toves]
[the]
[slithy toves]
[slithy]
[toves]

[in the wabe.]
[in]
[the wabe.]
[the]
[wabe.]

[were the borogoves,]
[were]
[the borogoves,]
[the]
[borogoves,]

[mome raths outgrabe.]
[mome]
[raths outgrabe.]
[raths]
[outgrabe.]

[Jabberwock, my son,]
[Jabberwock,]
[my son,]
[my]
[son,]

[claws that catch.]
[claws]
[that catch.]
[that]
[catch.]

[bird, and shun]
[bird,]
[and shun]
[and]
[shun]

[The frumious Bandersnatch.]
[The]
[frumious Bandersnatch.]
[frumious]
[Bandersnatch.]
*///:~

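The book's example for start and end is not very good, so it is skipped here.
Pattern flags
[Figure 8: image from the original post]
[Figure 9: image from the original post]
Pattern flags can be combined with the "|" operator.

import java.util.regex.*;

public class ReFlags {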
  public static void main(String[] args) {
    Pattern p =  Pattern.compile("^java",
      Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
    Matcher m = p.matcher(
      "java has regex\nJava has regex\n" +
      "JAVA has pretty good regular expressions\n" +
      "Regular expressions are in Java");
    while(m.find())
      System.out.println(m.group());
  }
} /* Output:
java
Java
JAVA
*///:~

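13.6.6 Replace Operations

13.6.7 The reset Method

reset(CharSequence) lets an existing Matcher be applied to a new character sequence (without an argument, it resets the matcher to the start of the current sequence).

import java.util.regex.*;

public class Resetting {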
  public static void main(String[] args) throws Exception {
    Matcher m = Pattern.compile("[frb][aiu][gx]")
      .matcher("fix the rug with bags");
    while(m.find())
      System.out.print(m.group() + " ");
    System.out.println();
    m.reset("fix the rig with rags");
    while(m.find())
      System.out.print(m.group() + " ");
  }
} /* Output:
fix rug bag
fix rig rag
*///:~

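13.6.8 Regular Expressions and Java I/O

13.7

Skipped.

13.8 StringTokenizer

(Can mostly be skipped; it has been superseded by regular expressions and Scanner.)

import java.util.*;

public class ReplacingStringTokenizer {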
  public static void main(String[] args) {
    String input = "But I'm not dead yet! I feel happy!";
    StringTokenizer stoke = new StringTokenizer(input);
    while(stoke.hasMoreElements())
      System.out.print(stoke.nextToken() + " ");
    System.out.println();
    System.out.println(Arrays.toString(input.split(" ")));
    Scanner scanner = new Scanner(input);
    while(scanner.hasNext())
      System.out.print(scanner.next() + " ");
  }
} /* Output:
But I'm not dead yet! I feel happy!
[But, I'm, not, dead, yet!, I, feel, happy!]
But I'm not dead yet! I feel happy!
*///:~

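Regular expressions can get quite complex, and what Thinking in Java covers is only the basics. There are dedicated books for going deeper, but I think most programmers only need these basics; complex regular expressions can be looked up when needed, for example to validate ID card numbers, phone numbers or email addresses.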
