ATL Regular Expressions: CAtlRegExp

/* I recently took a quick look at the regular expression support in ATL and found it awkward to use. Its regular expression syntax is quite limited, and some inputs could not be matched. Admittedly, I may simply not know regular expressions well enough to write the right pattern. Either way, here is how to use it, for the record. */

CAtlRegExp<> regexp;  // the template arguments can be left empty
REParseError status = regexp.Parse("{\\d\\d\\d\\d\\d\\d?}");  // pass in the pattern
if (REPARSE_ERROR_OK != status)  // REPARSE_ERROR_OK means the pattern parsed successfully
{
    return;  // the pattern failed to parse
}

CAtlREMatchContext<> mc;  // again no template arguments; holds the match results
BOOL matched = regexp.Match(input.c_str(), &mc);  // search the input for a match
if (matched)  // TRUE means a matching string was found
{
    const CAtlREMatchContext<>::RECHAR* szStart = 0;  // start of one match group's text
    const CAtlREMatchContext<>::RECHAR* szEnd = 0;    // end of one match group's text
    mc.GetMatch(0, &szStart, &szEnd);  // get the start and end pointers of a match group by index; 0 is the first group

    // To retrieve every match group, a loop like the following can be used:
    /*
    for (UINT nGroupIndex = 0; nGroupIndex < mc.m_uNumGroups; ++nGroupIndex)
    {
        const CAtlREMatchContext<>::RECHAR* szStart = 0;
        const CAtlREMatchContext<>::RECHAR* szEnd = 0;
        mc.GetMatch(nGroupIndex, &szStart, &szEnd);
        ptrdiff_t nLength = szEnd - szStart;
        printf("%u: \"%.*s\"\n", nGroupIndex, static_cast<int>(nLength), szStart);
    }
    */

    if (szStart == NULL) return;
    ptrdiff_t nLength = szEnd - szStart;
    string startstr(szStart, nLength);  // copy out the matched text
}

/*
.    Matches any single character.
[ ]  Indicates a character class. Matches any character inside the brackets (for example, [abc] matches "a", "b", and "c").
^    If this metacharacter occurs at the start of a character class, it negates the character class. A negated character class matches any character except those inside the brackets (for example, [^abc] matches all characters except "a", "b", and "c"). If ^ is at the beginning of the regular expression, it matches the beginning of the input (for example, ^[abc] will only match input that begins with "a", "b", or "c").
-    In a character class, indicates a range of characters (for example, [0-9] matches any of the digits "0" through "9").
?    Indicates that the preceding expression is optional: it matches once or not at all (for example, [0-9][0-9]? matches "2" and "12").
+    Indicates that the preceding expression matches one or more times (for example, [0-9]+ matches "1", "13", "666", and so on).
*    Indicates that the preceding expression matches zero or more times.
??, +?, *?  Non-greedy versions of ?, +, and *. These match as little as possible, unlike the greedy versions which match as much as possible. Example: given the input "<abc><def>", <.*?> matches "<abc>" while <.*> matches "<abc><def>".
( )  Grouping operator. Example: (\d+,)*\d+ matches a list of numbers separated by commas (such as "1" or "1,23,456").
{ }  Indicates a match group. The actual text in the input that matches the expression inside the braces can be retrieved through the CAtlREMatchContext object.
\    Escape character: interpret the next character literally (for example, [0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by a plus character). Also used for abbreviations (such as \a for any alphanumeric character; see table below). If \ is followed by a number n, it matches the nth match group (starting from 0). Example: <{.*?}>.*?</\0> matches "<head>Contents</head>". Note that in C++ string literals, two backslashes must be used: "\\+", "\\a", "<{.*?}>.*?</\\0>".
$    At the end of a regular expression, this character matches the end of the input. Example: [0-9]$ matches a digit at the end of the input.
|    Alternation operator: separates two expressions, exactly one of which matches (for example, T|the matches "The" or "the").
!    Negation operator: the expression following ! does not match the input. Example: a!b matches "a" not followed by "b".

Abbreviations

CAtlRegExp can handle abbreviations, such as \d instead of [0-9]. The abbreviations are provided by the character traits class passed in the CharTraits parameter. The predefined character traits classes provide the following abbreviations.
Abbreviation  Matches
\a            Any alphanumeric character: ([a-zA-Z0-9])
\b            White space (blank): ([ \\t])
\c            Any alphabetic character: ([a-zA-Z])
\d            Any decimal digit: ([0-9])
\h            Any hexadecimal digit: ([0-9a-fA-F])
\n            Newline: (\r|(\r?\n))
\q            A quoted string: (\"[^\"]*\")|(\'[^\']*\')
\w            A simple word: ([a-zA-Z]+)
\z            An integer: ([0-9]+)
*/
