AC 自动机

AC自动机

AC自动机是一个经典的多模式串匹配算法,它可以实现对主串的一次扫描来匹配多个模式串的功能。
实现AC自动机就是在Trie树的基础上加上了类似KMP的next数组。

class ACNode{
    public int data;
    public ACNode[] children = new ACNode[26];
    public boolean isEndingChar = false;
    public int length = -1;
    public ACNode fail;

    public ACNode(char data) {
        this.data = data;
    }
}

AC自动机的构造分成两步:

  1. 构造Trie树
  2. 在Trie树的基础上构建fail指针(相当于KMP的next数组)
    Trie树的构造可以参考Trie树;
    下面是fail指针的构建
public class AC {
    private ACNode root = new ACNode('/');
    public void buildFailurePointer(){
        Queue queue = new LinkedList();
        root.fail = null;
        queue.add(root);
        while(!queue.isEmpty()){
            ACNode p = queue.remove();
            for(int i = 0; i < 26; i++){
                ACNode pc = p.children[i];
                if(p.children[i] == null){
                    continue;
                }
                if(p == root){
                    pc.fail = root;
                }
                else{
                    ACNode q = p.fail;
                    while(q != null){
                        ACNode qc = q.children[pc.data - 'a'];
                        if(qc != null){
                            pc.fail = qc;
                            break;
                        }
                        q = q.fail;
                    }
                    if(q == null){
                        pc.fail = root;
                    }
                }
                queue.add(pc);
            }
        }
    }
    public void match(char[] text){
        int n = text.length;
        ACNode p = root;
        for(int i = 0; i < n; i++){
            int index = text[i] - 'a';
            while(p.children[index] == null && p != root){
                p = p.fail;
            }
            p = p.children[index];
            if(p == null){
                p = root;
            }
            ACNode temp = p;
            while(temp != root){
                if(temp.isEndingChar){
                    int pos = i - temp.length + 1;
                    System.out.println("matched from index " + pos + "; length " +temp.length);
                }
                temp = temp.fail;
            }
        }
    }
}

你可能感兴趣的:(AC 自动机)