2. Add the libicucore.dylib library to the project.
3. Every NSString object can now call the methods that RegexKitLite provides.
NSString *email = @"user@example.com"; // placeholder: the address on the source page was obfuscated to "[email protected]"
[email isMatchedByRegex:@"\\b([a-zA-Z0-9%_.+\\-]+)@([a-zA-Z0-9.\\-]+?\\.[a-zA-Z]{2,6})\\b"];
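This returns YES, which shows that the string is in email format. Note that the regular-expression syntax RegexKitLite uses differs slightly from the one described on the wiki.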
NSString *searchString = @"http://www.example.com:8080/index.html";
NSString *regexString = @"\\bhttps?://[a-zA-Z0-9\\-.]+(?::(\\d+))?(?:(?:/[a-zA-Z0-9\\-._?,'+\\&%$=~*!():@\\\\]*)+)?";
// Capture group 1 is the optional port number after the host name.
NSInteger portInteger = [[searchString stringByMatching:regexString capture:1L] integerValue];
NSLog(@"portInteger: '%ld'", (long)portInteger);
// 2008-10-15 08:52:52.500 host_port[8021:807] portInteger: '8080'
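The code above is an example of pulling part of an HTTP URL (here, the port number) out of a string.
Below are some commonly used regular-expression constructs. They come straight from the RegexKitLite website; I am copying them here in case you are too lazy to look them up.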
Character                   Description
\a                          Match a BELL, \u0007.
\A                          Match at the beginning of the input. Differs from ^ in that \A will not match after a new-line within the input.
\b, outside of a [Set]      Match if the current position is a word boundary. Boundaries occur at the transitions between word \w and non-word \W characters, with combining marks ignored. See also: RKLUnicodeWordBoundaries.
\b, within a [Set]          Match a BACKSPACE, \u0008.
\B                          Match if the current position is not a word boundary.
\cx                         Match a Control-x character.
\d                          Match any character with the Unicode General Category of Nd (Number, Decimal Digit).
\D                          Match any character that is not a decimal digit.
\e                          Match an ESCAPE, \u001B.
\E                          Terminates a \Q…\E quoted sequence.
\f                          Match a FORM FEED, \u000C.
\G                          Match if the current position is at the end of the previous match.
\n                          Match a LINE FEED, \u000A.
\N{Unicode Character Name}  Match the named Unicode Character.
\p{Unicode Property Name}   Match any character with the specified Unicode Property.
\P{Unicode Property Name}   Match any character not having the specified Unicode Property.
\Q                          Quotes all following characters until \E.
\r                          Match a CARRIAGE RETURN, \u000D.
\s                          Match a white space character. White space is defined as [\t\n\f\r\p{Z}].
\S                          Match a non-white space character.
\t                          Match a HORIZONTAL TABULATION, \u0009.
\uhhhh                      Match the character with the hex value hhhh.
\Uhhhhhhhh                  Match the character with the hex value hhhhhhhh. Exactly eight hex digits must be provided, even though the largest Unicode code point is \U0010ffff.
\w                          Match a word character. Word characters are [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}].
\W                          Match a non-word character.
\x{h…}                      Match the character with hex value hhhh. From one to six hex digits may be supplied.
\xhh                        Match the character with two digit hex value hh.
\X                          Match a Grapheme Cluster.
\Z                          Match if the current position is at the end of input, but before the final line terminator, if one exists.
\z                          Match if the current position is at the end of input.
\n (n = 1, 2, …)            Back Reference. Match whatever the nth capturing group matched. n must be a number ≥ 1 and ≤ total number of capture groups in the pattern. Note: octal escapes, such as \012, are not supported.
[pattern]                   Match any one character from the set. See ICU Regular Expression Character Classes for a full description of what may appear in the pattern.
.                           Match any character.
^                           Match at the beginning of a line.
$                           Match at the end of a line.
\                           Quotes the following character. Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . /
Operators

Operator            Description
|                   Alternation. A|B matches either A or B.
*                   Match zero or more times. Match as many times as possible.
+                   Match one or more times. Match as many times as possible.
?                   Match zero or one times. Prefer one.
{n}                 Match exactly n times.
{n,}                Match at least n times. Match as many times as possible.
{n,m}               Match between n and m times. Match as many times as possible, but not more than m.
*?                  Match zero or more times. Match as few times as possible.
+?                  Match one or more times. Match as few times as possible.
??                  Match zero or one times. Prefer zero.
{n}?                Match exactly n times.
{n,}?               Match at least n times, but no more than required for an overall pattern match.
{n,m}?              Match between n and m times. Match as few times as possible, but not less than n.
*+                  Match zero or more times. Match as many times as possible when first encountered, do not retry with fewer even if overall match fails. Possessive match.
++                  Match one or more times. Possessive match.
?+                  Match zero or one times. Possessive match.
{n}+                Match exactly n times. Possessive match.
{n,}+               Match at least n times. Possessive match.
{n,m}+              Match between n and m times. Possessive match.
(…)                 Capturing parentheses. Range of input that matched the parenthesized subexpression is available after the match.
(?:…)               Non-capturing parentheses. Groups the included pattern, but does not provide capturing of matching text. Somewhat more efficient than capturing parentheses.
(?>…)               Atomic-match parentheses. First match of the parenthesized subexpression is the only one tried; if it does not lead to an overall pattern match, back up the search for a match to a position before the (?> .
(?#…)               Free-format comment (?#comment).
(?=…)               Look-ahead assertion. True if the parenthesized pattern matches at the current input position, but does not advance the input position.
(?!…)               Negative look-ahead assertion. True if the parenthesized pattern does not match at the current input position. Does not advance the input position.
(?<=…)              Look-behind assertion. True if the parenthesized pattern matches text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators).
(?<!…)              Negative look-behind assertion. True if the parenthesized pattern does not match text preceding the current input position, with the last character of the match being the input character just before the current position. Does not alter the input position. The length of possible strings matched by the look-behind pattern must not be unbounded (no * or + operators).
(?ismwx-ismwx:…)    Flag settings. Evaluate the parenthesized expression with the specified flags enabled or -disabled.
(?ismwx-ismwx)      Flag settings. Change the flag settings. Changes apply to the portion of the pattern following the setting. For example, (?i) changes to a case insensitive match.

See also: Regular Expression Options
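Also pay attention to escape characters: when you copy a pattern from the site in Safari, it is converted (escaped) for you automatically, which is a thoughtful touch.
The site also provides a conversion tool (link), which I have tested in Safari. The download may be a little slow, so be patient.
Source: http://www.minroad.com/?p=85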
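To make the escaping rule and a few of the table entries concrete, here is a minimal sketch. It assumes RegexKitLite.h and RegexKitLite.m are already in the project and that the project links against libicucore. The helper name YearOfFirstDate and the date format are made up for illustration; only stringByMatching:capture: comes from RegexKitLite. Every backslash that ICU should see has to be written twice inside an Objective-C string literal.

```objc
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

// Hypothetical helper (not part of RegexKitLite): returns the year of the
// first YYYY-MM-DD date found in text.
static NSString *YearOfFirstDate(NSString *text) {
    // The regex as ICU sees it:  \b(\d{4})-(\d{2})-(\d{2})\b
    //   \b      word boundary
    //   \d{4}   exactly four decimal digits
    //   (...)   capturing parentheses; capture 1 is the year
    // In the string literal below, each backslash is doubled.
    NSString *regex = @"\\b(\\d{4})-(\\d{2})-(\\d{2})\\b";
    return [text stringByMatching:regex capture:1L];
}
```

Called as YearOfFirstDate(@"released 2008-10-15, patched 2009-01-02"), this should give back the string 2008, since capture 1 of the first match is the four-digit group.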
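Today I finally solved the string-filtering problem. It was not hard; I just did not know how to handle some of the wildcards. Here is a summary of how to use regular expressions on iOS.

The iOS SDK did not originally provide a public API for regular expressions (NSRegularExpression only arrived in iOS 4.0), so I used RegexKitLite. Download the library, unzip it, and add the two files RegexKitLite.h and RegexKitLite.m to your project.

Because RegexKitLite uses the ICU library, the project has to be dynamically linked against /usr/lib/libicucore.dylib, which means adding a flag to Other Linker Flags. In Xcode 3.2, open Project -> Edit Project Settings, type "linker" into the search box, and find the "Other Linker Flags" option. In Xcode 4, the "Other Linker Flags" option is under Basic in Build Settings. Double-click the value field, and when the + button appears, enter -licucore in the edit field.

Next, add the libicucore.dylib library itself. In Xcode 3.2, right-click the Frameworks folder and choose to add an existing framework. In Xcode 4, look for an option called something like "Links Libraries" (presumably the "Link Binary With Libraries" build phase), which is where the other frameworks are listed. Click the + button below the list, type libicucore.dylib, and click Add.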
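After that, add #import "RegexKitLite.h" to every file that needs the regex methods, and you can call them from there.
All of RegexKitLite's methods operate through NSString, and there are quite a lot of them. Here are a few commonly used ones:
- (NSArray *)captureComponentsMatchedByRegex:(NSString *)regex;
Matches the string against the given regex and returns the capture groups of the first match as an NSArray.
- (NSArray *)arrayOfCaptureComponentsMatchedByRegex:(NSString *)regex;
Like the method above, this returns the capture groups matched by the regex, but for every match in the string rather than only the first.
- (BOOL)isMatchedByRegex:(NSString *)regex;
Tests whether the string matches the regex. Very handy for validating data.
For example:
NSString *email = @"user@example.com"; // placeholder: the address on the source page was obfuscated to "[email protected]"
[email isMatchedByRegex:@"\\b([a-zA-Z0-9%_.+\\-]+)@([a-zA-Z0-9.\\-]+?\\.[a-zA-Z]{2,6})\\b"];
- (NSString *)stringByMatching:(NSString *)regex;
Returns the first complete substring that matches the regex.
- (NSString *)stringByReplacingOccurrencesOfRegex:(NSString *)regex withString:(NSString *)replacement;
Replaces every part of the string that matches the regex with the replacement string.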
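The difference between the two capture-component methods is easier to see side by side. The sketch below assumes RegexKitLite is set up as described above; the helper names FirstKeyValuePair and AllKeyValuePairs and the key=value input format are invented for illustration.

```objc
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

// Assumed pattern for this sketch: a word, an equals sign, then digits.
// As ICU sees it:  (\w+)=(\d+)
static NSString * const kKeyValueRegex = @"(\\w+)=(\\d+)";

// Hypothetical helper: capture groups of the FIRST match only.
// Element 0 is the whole match, followed by capture 1, capture 2, ...
static NSArray *FirstKeyValuePair(NSString *text) {
    return [text captureComponentsMatchedByRegex:kKeyValueRegex];
}

// Hypothetical helper: one array of capture groups per match,
// for EVERY match in the string.
static NSArray *AllKeyValuePairs(NSString *text) {
    return [text arrayOfCaptureComponentsMatchedByRegex:kKeyValueRegex];
}
```

For the input @"width=640, height=480", FirstKeyValuePair should return the three strings width=640, width, 640, while AllKeyValuePairs should return two such arrays, one for width and one for height.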
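When I was doing the filtering, I knew which method to call but did not understand what the wildcards meant, so I still could not get it right. In the end my director worked out the pattern for me after I told him which method to use. It looks like this: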
coupon_tips = [coupon_tips stringByReplacingOccurrencesOfRegex:@"<.*?>" withString:@""];
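This replaces every < > pair in the NSString *coupon_tips, together with whatever sits between the brackets, with an empty string, which has the effect of deleting the tags and their contents.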
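The part of that pattern that tripped me up is the ? after .*, which makes the match lazy. As a rough sketch of why it matters (assuming RegexKitLite is set up as described earlier; the helper names and the sample string are made up for illustration):

```objc
#import <Foundation/Foundation.h>
#import "RegexKitLite.h"

// Hypothetical helper: lazy pattern, as used for coupon_tips.
// "<.*?>" stops at the FIRST ">" after each "<", so each tag is
// removed on its own and the text between tags survives.
static NSString *StripTags(NSString *html) {
    return [html stringByReplacingOccurrencesOfRegex:@"<.*?>" withString:@""];
}

// Hypothetical helper, for comparison only: greedy pattern.
// "<.*>" runs from the first "<" to the LAST ">" on the line,
// so the text between tags is removed as well.
static NSString *StripTagsGreedy(NSString *html) {
    return [html stringByReplacingOccurrencesOfRegex:@"<.*>" withString:@""];
}
```

With the input @"<b>50% off</b> until <i>Friday</i>", StripTags should leave "50% off until Friday", whereas StripTagsGreedy should leave an empty string, because the greedy match swallows everything from the first < to the last >.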
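The next task is to write the favorites navigation. I thought about it for a long time today and still have not come up with a reasonable approach, but I am sure I will figure it out.
Reposted from http://hi.baidu.com/belival/blog/item/3b71699710bd1c53d0135e38.html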