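1. Using NSCharacterSet
NSCharacterSet represents a set of Unicode characters. Used well, it makes many string operations much simpler.
For example, given "abcdefghijklmnfwafajkfjawifa", suppose we need to remove every "f" and "a". A first attempt might look like this: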
NSString *str = @"abcdefghijklmnfwafajkfjawifa";
NSInteger length = str.length;
NSString *removeStr = @"af";
NSMutableString *resultStr = [NSMutableString string];
for (int i = 0; i < length; i++) {
    NSString *indexStr = [str substringWithRange:NSMakeRange(i, 1)];
    if (![removeStr containsString:indexStr]) {
        [resultStr appendString:indexStr];
    }
}
NSLog(@"%@", resultStr);
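Work out the time complexity if you like: every iteration allocates a one-character substring and then searches removeStr for it. It is tedious code to read and to maintain.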
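With NSCharacterSet, the same task becomes:
NSString *str = @"abcdefghijklmnfwafajkfjawifa";
NSCharacterSet *set = [NSCharacterSet characterSetWithCharactersInString:@"af"];
// Split at every "a" or "f", then join the remaining pieces back together.
NSString *resultStr = [[str componentsSeparatedByCharactersInSet:set] componentsJoinedByString:@""];
NSLog(@"%@", resultStr); // bcdeghijklmnwjkjwi
Note that stringByTrimmingCharactersInSet: is not the right method here. It removes characters in the set only from the beginning and the end of the string, so every "f" and "a" in the middle would survive. It is the right tool for jobs such as stripping leading and trailing whitespace.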
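Creating an NSCharacterSet:
Besides building a set from a string of your own, you can use the following class methods to get a ready-made character set directly.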
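/** Commonly used convenience sets */
+ controlCharacterSet //control characters
+ whitespaceCharacterSet //spaces and tabs
+ whitespaceAndNewlineCharacterSet //spaces, tabs and newlines
+ decimalDigitCharacterSet //decimal digits (0-9, plus decimal digits from other scripts)
+ letterCharacterSet //all letters
+ lowercaseLetterCharacterSet //lowercase letters
+ uppercaseLetterCharacterSet //uppercase letters
+ alphanumericCharacterSet //all letters (either case) and digits
+ punctuationCharacterSet //punctuation
+ newlineCharacterSet //newline characters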
/** URL-related convenience sets */
+ URLUserAllowedCharacterSet
+ URLPasswordAllowedCharacterSet
+ URLHostAllowedCharacterSet
+ URLPathAllowedCharacterSet //characters allowed in the path component
+ URLQueryAllowedCharacterSet //characters allowed in the query component
+ URLFragmentAllowedCharacterSet
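As a rough illustration of how these ready-made sets are typically used, the sketch below trims whitespace, checks that a string contains only digits, and percent-encodes the same input with two of the URL sets. IsAllDigits and Encode are hypothetical helper names introduced only for this example; they are not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if s is non-empty and every character is in decimalDigitCharacterSet.
static BOOL IsAllDigits(NSString *s) {
    if (s.length == 0) return NO;
    NSCharacterSet *nonDigits = [[NSCharacterSet decimalDigitCharacterSet] invertedSet];
    return [s rangeOfCharacterFromSet:nonDigits].location == NSNotFound;
}

// Hypothetical helper: percent-encode s, leaving the characters in `allowed` untouched.
static NSString *Encode(NSString *s, NSCharacterSet *allowed) {
    return [s stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}

int main(void) {
    @autoreleasepool {
        // Trimming only affects the two ends of the string.
        NSString *trimmed = [@"  \n 12345 \t" stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSLog(@"[%@] all digits: %d", trimmed, IsAllDigits(trimmed)); // [12345] all digits: 1

        // "?" is allowed in a query but not in a path, so the two sets give different results.
        NSLog(@"%@", Encode(@"a b?c", [NSCharacterSet URLPathAllowedCharacterSet]));  // a%20b%3Fc
        NSLog(@"%@", Encode(@"a b?c", [NSCharacterSet URLQueryAllowedCharacterSet])); // a%20b?c
    }
    return 0;
}
```

Searching for a character from the inverted set is one common way to express "contains nothing but these characters" without writing a loop.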
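2. rangeOfComposedCharacterSequencesForRange: and rangeOfComposedCharacterSequenceAtIndex:
In NSString, length counts UTF-16 code units. An ASCII letter or a commonly used Chinese character has a length of 1, but an emoji usually has a length of 2 or more (sequences built with modifiers or joiners are longer still). Cutting a string with substringToIndex: can therefore split an emoji in the middle and produce garbled text.
These two methods return the range of the complete composed character sequence(s) covering a given range or index, which avoids that problem.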
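A minimal sketch of safe truncation built on rangeOfComposedCharacterSequenceAtIndex: follows. SafePrefix is a hypothetical helper name chosen for this example, and the emoji is written as a \U escape so the source stays ASCII.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: return at most maxLength UTF-16 units of s,
// never cutting through a composed character sequence.
static NSString *SafePrefix(NSString *s, NSUInteger maxLength) {
    if (s.length <= maxLength) return s;
    // Range of the whole sequence that contains the unit at index maxLength.
    NSRange r = [s rangeOfComposedCharacterSequenceAtIndex:maxLength];
    // If that sequence starts before maxLength, the cut would fall inside it,
    // so back up to where the sequence starts.
    return [s substringToIndex:r.location];
}

int main(void) {
    @autoreleasepool {
        NSString *s = @"ab\U0001F600cd"; // length is 6: the emoji takes 2 UTF-16 units
        NSLog(@"%@", SafePrefix(s, 3));  // ab (index 3 is inside the emoji)
        NSLog(@"%@", SafePrefix(s, 4));  // ab followed by the whole emoji
    }
    return 0;
}
```

The helper may return fewer units than requested. That is the trade-off for never emitting half of a surrogate pair.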
URL encoding is a good example of this kind of string handling.
First, a quick review of the URL encoding rules:
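According to the percent-encoding rules in RFC 3986, published in 2005: unreserved ASCII characters in a URL (letters, digits, "-", ".", "_", "~") are not encoded. A reserved character that appears as data rather than as a delimiter is replaced by "%" followed by the two hexadecimal digits of its ASCII code. A non-ASCII character is first converted to bytes (UTF-8 in practice), and each byte is replaced by "%" followed by its two hexadecimal digits. Because the scheme is "%" plus the byte value, it is known as "percent-encoding".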
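Reserved characters in a URL: ! * ' ( ) ; : @ & = + $ , / ? # [ ]
Of these, the general delimiters are : / ? # [ ] @, and "/" and "?" may appear unescaped inside a query string.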
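To make the byte-level rule concrete, here is a minimal hand-written encoder. It is a sketch for illustration only and assumes UTF-8; it escapes every byte that is not an unreserved character, which is stricter than most real call sites need. PercentEncodeUTF8 is a hypothetical name, and in production code stringByAddingPercentEncodingWithAllowedCharacters: is the usual choice.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: percent-encode every UTF-8 byte of s that is not an
// RFC 3986 unreserved character (A-Z a-z 0-9 - . _ ~).
static NSString *PercentEncodeUTF8(NSString *s) {
    // Yields nil (and therefore an empty result) if s cannot be converted to UTF-8.
    NSData *data = [s dataUsingEncoding:NSUTF8StringEncoding];
    const uint8_t *bytes = data.bytes;
    NSMutableString *out = [NSMutableString string];
    for (NSUInteger i = 0; i < data.length; i++) {
        uint8_t b = bytes[i];
        BOOL unreserved = (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') ||
                          (b >= '0' && b <= '9') ||
                          b == '-' || b == '.' || b == '_' || b == '~';
        if (unreserved) {
            [out appendFormat:@"%c", b];     // copied as-is
        } else {
            [out appendFormat:@"%%%02X", b]; // "%" plus two hex digits
        }
    }
    return out;
}

int main(void) {
    @autoreleasepool {
        // U+4E2D is E4 B8 AD in UTF-8, so one character becomes three escapes.
        NSLog(@"%@", PercentEncodeUTF8(@"\u4e2d/a b")); // %E4%B8%AD%2Fa%20b
    }
    return 0;
}
```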
The encoding function provided by AFNetworking:
/**
Returns a percent-escaped string following RFC 3986 for a query string key or value.
RFC 3986 states that the following characters are "reserved" characters.
- General Delimiters: ":", "#", "[", "]", "@", "?", "/"
- Sub-Delimiters: "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "="
In RFC 3986 - Section 3.4, it states that the "?" and "/" characters should not be escaped to allow
query strings to include a URL. Therefore, all "reserved" characters with the exception of "?" and "/"
should be percent-escaped in the query string.
- parameter string: The string to be percent-escaped.
- returns: The percent-escaped string.
*/
NSString * AFPercentEscapedStringFromString(NSString *string) {
    static NSString * const kAFCharactersGeneralDelimitersToEncode = @":#[]@"; // does not include "?" or "/" due to RFC 3986 - Section 3.4
    static NSString * const kAFCharactersSubDelimitersToEncode = @"!$&'()*+,;=";

    NSMutableCharacterSet * allowedCharacterSet = [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    [allowedCharacterSet removeCharactersInString:[kAFCharactersGeneralDelimitersToEncode stringByAppendingString:kAFCharactersSubDelimitersToEncode]];

    // FIXME: https://github.com/AFNetworking/AFNetworking/pull/3028
    // return [string stringByAddingPercentEncodingWithAllowedCharacters:allowedCharacterSet];

    static NSUInteger const batchSize = 50;

    NSUInteger index = 0;
    NSMutableString *escaped = @"".mutableCopy;

    while (index < string.length) {
        NSUInteger length = MIN(string.length - index, batchSize);
        NSRange range = NSMakeRange(index, length);

        // To avoid breaking up composed character sequences such as emoji
        range = [string rangeOfComposedCharacterSequencesForRange:range];

        NSString *substring = [string substringWithRange:range];
        NSString *encoded = [substring stringByAddingPercentEncodingWithAllowedCharacters:allowedCharacterSet];
        [escaped appendString:encoded];

        index += range.length;
    }

    return escaped;
}
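The effect of removing the delimiters from the allowed set can be seen by encoding the same value twice. The sketch below rebuilds the narrowed set in a hypothetical helper, QueryValueAllowedSet, using the same two character strings as the function above. It leaves out the batching loop, so it only illustrates the character set, not the whole function.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: URLQueryAllowedCharacterSet minus the delimiters
// that the function above chooses to escape.
static NSCharacterSet *QueryValueAllowedSet(void) {
    NSMutableCharacterSet *set = [[NSCharacterSet URLQueryAllowedCharacterSet] mutableCopy];
    [set removeCharactersInString:@":#[]@!$&'()*+,;="];
    return set;
}

int main(void) {
    @autoreleasepool {
        NSString *value = @"a+b&c=d?e/f";

        // With the stock query set, "&" and "=" pass through, so a value
        // containing them would be misread as extra key/value pairs.
        NSString *plain = [value stringByAddingPercentEncodingWithAllowedCharacters:
                           [NSCharacterSet URLQueryAllowedCharacterSet]];
        NSLog(@"%@", plain);    // a+b&c=d?e/f

        // With the narrowed set, only "?" and "/" are left unescaped.
        NSString *narrowed = [value stringByAddingPercentEncodingWithAllowedCharacters:
                              QueryValueAllowedSet()];
        NSLog(@"%@", narrowed); // a%2Bb%26c%3Dd?e/f
    }
    return 0;
}
```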