字符串查找通常有四种方式,暴力查找,KMP查找,BoyerMoore查找以及RabinKarp算法查找,查找最简单的莫过于暴力查找,如果内容是“FlyElephant”,需要查找的内容是“Elephant”,上面的四种方式会给出不同的方案~
暴力查找
最简单的暴力查找:
+(BOOL)simpleMatchPattern:(NSString *)text pattern:(NSString *)pattern{
NSInteger cnlen=text.length;
NSInteger ptlen=pattern.length;
for (NSInteger i=0; i<=cnlen-ptlen; i++) {
NSInteger j;
for (j=0; j
遍历内容的长度,获取索引m,然后索引m...m+匹配长度,逐一比较,这个算法可以稍微改进一下:
+(NSInteger)simpleBackMatchPattern:(NSString *)text pattern:(NSString *)pattern{
NSInteger i,j;
NSInteger cnlen=text.length;
NSInteger ptlen=pattern.length;
for (i=0,j=0; i
KMP算法
KMP是knuth,Morris和Pratt发明的算法,暴力查找的每次只是循环是固定的加1,假设内容中只有A,B两种你字符,假设查找的字符串是BAAAABAAB,模式字符串为BAAB,我们发现前两个匹配,第三个不匹配,暴力中,我们的循环会从索引1开始继续比较,但是匹配我们知道其他的全是A,所以我们应该从第二个字母B开始匹配,减少回退的成本,实现代码:
static const NSInteger LetterCount=256;
@interface KMPSearch()
@property (strong,nonatomic) NSString *pattern;
@property (strong,nonatomic) NSMutableArray *dfa;
@end
@implementation KMPSearch
-(instancetype)initWithPattern:(NSString *)pattern{
self=[super init];
if (self) {
self.pattern=pattern;
[self setupData];
}
return self;
}
-(NSInteger)search:(NSString *)text{
NSInteger i,j;
NSInteger plen=self.pattern.length;
NSInteger tlen=text.length;
for (i=0,j=0; i
BoyerMooer算法
之前的暴力查找和KMP都是从左向右扫描文本对比实现的,但是BoyerMoore是从右向组走扫描实现的,具体实现:
static const NSInteger LetterCount=256;
@interface BoyerMoore()
@property (strong,nonatomic) NSString *pattern;
@property (strong,nonatomic) NSMutableArray *rightArr;
@end
@implementation BoyerMoore
-(instancetype)initWithPattern:(NSString *)pattern{
self=[super init];
if (self) {
self.pattern=pattern;
[self setup];
}
return self;
}
-(NSInteger)search:(NSString *)text{
NSInteger tlen=text.length;
NSInteger plen=self.pattern.length;
NSInteger skip;
for (NSInteger i=0; i<=tlen-plen; i+=skip) {
skip=0;
for (NSInteger j=plen-1;j>=0;j=j-1) {
if ([self.pattern characterAtIndex:j]!=[text characterAtIndex:(i+j)]) {
NSInteger index=[text characterAtIndex:(i+j)];
skip=j-[self.rightArr[index] integerValue];
break;
}
}
if (skip==0) {
return i;//匹配成功
}
}
return -1;//匹配失败
}
-(void)setup{
self.rightArr=[[NSMutableArray alloc]init];
for (NSInteger i=0; i
Rabin-Karp指纹字符串查找
Rabin和Karp这种算法和前三种走了完全的不同的方向,基于散列的查找,也就是每一个字符串都有一个对应的Hash值,如果Hash值相同的话,那么匹配成功,基本效率保持在O(n).
static const NSInteger Q=997;
static const NSInteger LetterCount=10;
@interface RabinKarp()
@property (strong,nonatomic) NSString *pattern;
@property (assign,nonatomic) NSInteger pCount;
@property (assign,nonatomic) NSInteger RM;
@property (assign,nonatomic) NSInteger pathHash;
@end
@implementation RabinKarp
-(instancetype)initWithPattern:(NSString *)pattern{
self=[super init];
if (self) {
self.pattern=pattern;
self.pCount=self.pattern.length;
[self setup];
}
return self;
}
-(NSInteger)search:(NSString *)text{
NSInteger tlen=text.length;
NSInteger txtHash=[self hash:text length:self.pCount];
//开头匹配成功
if (self.pathHash==txtHash) {
return 0;
}
for (NSInteger j=self.pCount; j
测试代码:
//1.暴力查找
NSString *text=@"She is a girl";
NSString *pattern=@"girl";
if ([SearchMatch simpleMatchPattern:text pattern:pattern]) {
NSLog(@"%@ is contains %@",text,pattern);
}else{
NSLog(@"%@ is not contains %@",text,pattern);
}
//2.简单回退
text=@"FlyElephant";
pattern=@"Elem";
NSInteger matchIndex=[SearchMatch simpleBackMatchPattern:text pattern:pattern];
if (matchIndex>0) {
NSLog(@"%@ contains %@ ,index %ld",text,pattern,matchIndex);
}else{
NSLog(@"%@ is not contains %@",text,pattern);
}
//3.KMP查找
NSString *kmpText=@"BCBAABACAABABACAA";
NSString *kmpPattern=@"ABABAC";
kmpText=@"AAACCBBDEF";
kmpPattern=@"A";
KMPSearch *kmp=[[KMPSearch alloc]initWithPattern:kmpPattern];
NSInteger result=[kmp search:kmpText];
if (result>=0) {
NSLog(@"%@ contains %@ ,Match Begin index:%ld",kmpText,kmpPattern,result);
}else{
NSLog(@"%@ not contains %@ ",kmpText,kmpPattern);
}
//4.BoyerMoore查找
NSString *bmText=@"FlyElephant";
NSString *bmPattern=@"Fly";
BoyerMoore *bm=[[BoyerMoore alloc]initWithPattern:bmPattern];
NSInteger bmResult=[bm search:bmText];
if (bmResult>=0) {
NSLog(@"%@ contains %@,Begin Index:%ld",bmText,bmPattern,bmResult);
}else{
NSLog(@"%@ not contains %@",bmText,bmPattern);
}
//5.RabinKarp 查找
NSString *rmText=@"3141592653589793";
NSString *rmPattern=@"26535";
RabinKarp *rm=[[RabinKarp alloc]initWithPattern:rmPattern];
NSInteger rmResult=[rm search:rmText];
if (rmResult>=0) {
NSLog(@"%@ contains %@,Begin Index:%ld",rmText,rmPattern,rmResult);
}else{
NSLog(@"%@ not contains %@",rmText,rmPattern);
}