Using Regular Expressions in Groovy

Because Groovy is based on Java, you can use Java's regular expression package with Groovy. Simply put import java.util.regex.* at the top of your Groovy source code. Any Java code using regular expressions will then automatically work in your Groovy code too.
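As a minimal sketch of that claim, the following is ordinary Java-style regex code pasted into a Groovy script with only the semicolons dropped. The sample text, the pattern, and the variable names are invented for illustration.

```groovy
import java.util.regex.*

// Java-style usage: compile a Pattern, then bind it to a subject string.
// Backslashes must be doubled because this is a double-quoted string.
Pattern pattern = Pattern.compile("\\d+")
Matcher matcher = pattern.matcher("order 66 shipped")

// find() looks for the next match anywhere in the subject string.
String firstNumber = matcher.find() ? matcher.group() : null
println firstNumber
```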

Using verbose Java code to work with regular expressions in Groovy wouldn't be very groovy. Groovy has a number of language features that make code using regular expressions a lot more concise. You can mix the Groovy-specific syntax with regular Java code. It's all based on the java.util.regex package, which you'll need to import whenever you refer to its classes, such as Pattern and Matcher, by name.

Groovy Strings

Java has only one string style. Strings are placed between double quotes. Double quotes and backslashes in strings must be escaped with backslashes. That yields a forest of backslashes in literal regular expressions.

Groovy has five string styles. Strings can be placed between single quotes, double quotes, triple single quotes, and triple double quotes. Using triple single or double quotes allows the string to span multiple lines, which is handy for free-spacing regular expressions. Unfortunately, all four of these string styles require backslashes to be escaped.
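A short sketch of a free-spacing regular expression in a triple-single-quoted string follows. The date format and the variable names are assumptions chosen for the example; the (?x) flag is the standard java.util.regex switch for free-spacing mode. Note that every backslash is doubled.

```groovy
// (?x) makes the regex engine ignore whitespace and treat # as a comment start.
// Backslashes are doubled because this string style processes escapes.
def datePattern = '''(?x)
    (\\d{4})   # year
    -
    (\\d{2})   # month
    -
    (\\d{2})   # day
'''

// String.matches() requires the whole subject to match the regex.
def isDate = '2024-05-17'.matches(datePattern)
def isNotDate = '2024-5-17'.matches(datePattern)
```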

The fifth string style is provided specifically for regular expressions. The string is placed between forward slashes, and only forward slashes (not backslashes) in the string need to be escaped. This is indeed a string style. Both /hello/ and "hello" are literal instances of java.lang.String. In Groovy versions before 1.8, strings delimited with forward slashes could not span lines, so you couldn't use them for free-spacing regular expressions. Later versions lift that restriction.
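To make the difference concrete, here is a small sketch that writes the same regular expression in a double-quoted string and in a slash-delimited string. The regexes and variable names are made up for the example.

```groovy
// The same regex twice: doubled backslashes in quotes, single ones in slashes.
def quoted = "\\d+\\.\\d+"
def slashy = /\d+\.\d+/

// Both literals are plain java.lang.String instances with identical content.
def sameText = (quoted == slashy)
def isString = (slashy instanceof String)

// Only the forward slash itself needs a backslash in this style.
def pathRegex = /\/usr\/\w+/
def pathMatches = '/usr/local'.matches(pathRegex)
```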

Groovy Patterns and Matchers

To actually use a string as a regular expression, you need to instantiate the java.util.regex.Pattern class. To actually use that pattern on a string, you need to instantiate the java.util.regex.Matcher class. You use these classes in Groovy just like you do in Java. But Groovy does provide some special syntax that allows you to create those instances with much less typing.

To create a Pattern instance, simply place a tilde before the string with your regular expression. The string can use any of Groovy's five string styles. When assigning this pattern to a variable, make sure to leave a space between the assignment operator and the tilde.

Pattern myRegex = ~/regex/
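A slightly fuller sketch of the tilde syntax, assuming a comma-separated sample string invented for this example, shows the resulting Pattern being used with the standard Pattern.split() method:

```groovy
import java.util.regex.Pattern

// The tilde turns the slash-delimited string into a compiled Pattern.
// Note the space between = and ~.
Pattern commaSplitter = ~/\s*,\s*/

// Pattern.split() is ordinary java.util.regex API and returns a String[].
String[] fields = commaSplitter.split('red , green,blue')
println fields.toList()
```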

You won't actually instantiate patterns this way very often. The only time you need the Pattern instance is to split a string, which requires you to call Pattern.split(). To find regex matches or to search-and-replace with a regular expression, you need a Matcher instance that binds the pattern to a string. In Groovy, you can create this instance directly from the literal string with your regular expression using the =~ operator. No space between the = and ~ this time.

Matcher myMatcher = "subject" =~ /regex/

This is short for:

Matcher myMatcher = Pattern.compile(/regex/).matcher("subject")
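Once you have the Matcher, you work with it through the usual java.util.regex methods. The following sketch uses an invented file name and invented variable names to show finding a match, reading its capturing groups, and doing a search-and-replace.

```groovy
import java.util.regex.Matcher

// =~ binds the regex on the right to the subject string on the left.
Matcher versionMatcher = 'groovy-4.0.21.jar' =~ /(\d+)\.(\d+)/

def major = null
def minor = null
// find() advances to the first match; group(n) returns capturing group n.
if (versionMatcher.find()) {
    major = versionMatcher.group(1)
    minor = versionMatcher.group(2)
}

// Search-and-replace through Matcher.replaceAll().
def masked = ('a1b22c333' =~ /\d+/).replaceAll('#')
```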

Finally, the ==~ operator is a quick way to test whether a regex can match a string entirely. myString ==~ /regex/ is equivalent to myString.matches(/regex/). To find partial matches, you need to use the Matcher.
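The difference between a whole-string test and a partial match can be sketched as follows. The subject string and variable names are chosen purely for illustration.

```groovy
// ==~ yields a boolean: true only if the regex matches the entire string.
def wholeMatch = 'groovy' ==~ /gr\w+/

// The regex matches only the middle of the string, so ==~ yields false.
def partialViaExactOp = 'groovy' ==~ /oo/

// A Matcher created with =~ can find that partial match.
def partialViaMatcher = ('groovy' =~ /oo/).find()

// String.matches() gives the same answer as ==~.
def viaMatches = 'groovy'.matches(/gr\w+/)
```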

Further Reading

If you'd like a more detailed overview of all the functionality offered by the java.util.regex package, you may want to get yourself a copy of "Java Regular Expressions: Taming the java.util.regex Engine", written by Mehran Habibi and published by Apress. Though this book doesn't mention Groovy at all, it is the most detailed guide to the java.util.regex package, which is what you're using with Groovy. Groovy only adds some syntactic shortcuts, which are all explained on this web page.

My review of the book Java Regular Expressions
