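A trie is a common-prefix tree (the same kind of structure as the tree used by frequent-pattern growth, one of the association-rule mining algorithms); see Wikipedia for the precise definition.
The figure above shows a trie. Intuitively, a structure like this should work well as an index, and in practice it does :).
A trie whose node values are 0/1 is called a bitwise trie. A bitwise trie is a binary search trie.
Now let's look at how nedtrie implements it.
First, the basic usage (code segment in test.c):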
#include <assert.h>
#include <stdio.h>
#include "nedtrie.h"

//define tree node struct
typedef struct foo_s foo_t;
struct foo_s {
  NEDTRIE_ENTRY(foo_s) link;
  size_t key;
};

//initial tree root
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_s);
static foo_tree_t footree;

size_t fookeyfunct(const foo_t *r)
{
  return r->key;
}

NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s))

int main(void)
{
  foo_t a, b, c, *r;
  NEDTRIE_INIT(&footree);
  a.key=2;
  NEDTRIE_INSERT(foo_tree_s, &footree, &a);
  b.key=6;
  NEDTRIE_INSERT(foo_tree_s, &footree, &b);
  r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
  assert(r==&b);
  c.key=5;
  r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
  assert(r==&b); /* NFIND finds next largest. Invert the key function (i.e. 1-key) to find next smallest. */
  NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
  NEDTRIE_FOREACH(r, foo_tree_s, &footree)
  {
    printf("%p, %u\n", (void *) r, (unsigned) r->key);
  }
  return 0;
}
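As the example shows, nedtrie behaves like a hash tree: the hash of the data is used as the key, the key is an integer, and the binary form of that integer is the path under which the data is stored. For example, if hash("hello world") = 1111b, then "hello world" is stored in the nedtrie along the path root-1-1-1-1-value.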
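To make the "binary form is the path" idea concrete, here is a minimal sketch that spells out the path for an integer key. The helper `key_path` and its `nbits` parameter are hypothetical names made up for this illustration and are not part of nedtrie; the sketch only mirrors the root-1-1-1-1-value notation used above and assumes the bits are consumed from most significant to least significant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a nedtrie API): write the root-to-value path for
   the low `nbits` bits of `key`, most significant bit first.
   `out` must point to a buffer of `outlen` bytes, with outlen > 0. */
static void key_path(size_t key, unsigned nbits, char *out, size_t outlen)
{
    size_t used;

    assert(outlen > 0);
    assert(nbits <= sizeof(size_t) * 8);

    snprintf(out, outlen, "root");
    for (unsigned i = nbits; i-- > 0; ) {
        /* Each bit selects the 0 or 1 branch at that depth. */
        used = strlen(out);
        snprintf(out + used, outlen - used, "-%u",
                 (unsigned)((key >> i) & 1u));
    }
    used = strlen(out);
    snprintf(out + used, outlen - used, "-value");
}
```

Calling `key_path(15, 4, buf, sizeof buf)` fills `buf` with the path for the key 1111b from the example above.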
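Like many C data structures, nedtrie uses macros to implement generics, so it supports multiple data types. struct foo_s is the data type of both the internal nodes and the leaf nodes of the trie, and NEDTRIE_ENTRY is defined as: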
#define NEDTRIE_ENTRY(type) \
struct { \
  struct type *trie_parent;            /* parent element */ \
  struct type *trie_child[2];          /* my children based on whether they are zero or one. */ \
  struct type *trie_prev, *trie_next;  /* my siblings of identical key to me. */ \
}
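The following sketch shows how the macro embeds the link fields into a user-defined node type. It uses a local copy of the NEDTRIE_ENTRY definition quoted above so that it compiles without nedtrie.h; `attach_child` is a hypothetical helper written only for this illustration and is not how nedtrie itself links nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the macro as quoted above, so the sketch is self-contained. */
#define NEDTRIE_ENTRY(type) \
struct { \
  struct type *trie_parent; \
  struct type *trie_child[2]; \
  struct type *trie_prev, *trie_next; \
}

/* The user's node type: the macro expands to an anonymous struct member
   named `link`, whose pointers all refer back to struct foo_s. */
struct foo_s {
    NEDTRIE_ENTRY(foo_s) link;
    size_t key;
};

/* Hypothetical helper: hang `child` under `parent` on the 0 or 1 branch. */
static void attach_child(struct foo_s *parent, struct foo_s *child, unsigned bit)
{
    assert(bit < 2);
    parent->link.trie_child[bit] = child;
    child->link.trie_parent = parent;
}
```

Because the macro takes the struct tag as a parameter, the same definition can be reused for any node type, which is how the generic behaviour is obtained without void pointers.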
foo_tree_s is the data type of the root node; NEDTRIE_HEAD defines its structure, as follows:
#define NEDTRIE_HEAD2(name, type) \
struct name { \
  size_t count; \
  type *triebins[NEDTRIE_INDEXBINS];  /* each containing (1<<x)<=bitscanrev(x)<(1<<(x+1)) */ \
  int nobbledir; \
}
#define NEDTRIE_HEAD(name, type) NEDTRIE_HEAD2(name, struct type)
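The comment on triebins can be read as saying that bin x holds the keys k with (1<<x) <= k < (1<<(x+1)), that is, the bin is chosen by the position of the key's highest set bit. That reading is an assumption based only on the comment. The sketch below computes such an index with a portable loop; `highest_set_bit` is a hypothetical stand-in for a bit-scan-reverse operation, not nedtrie's own function, and it leaves the key 0 out of scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "bit scan reverse": returns the index x of the
   highest set bit of `key`, so that (1<<x) <= key < (1<<(x+1)).
   The key must be nonzero; how key 0 is binned is not covered here. */
static unsigned highest_set_bit(size_t key)
{
    unsigned x = 0;

    assert(key != 0);
    while (key >>= 1)   /* shift right until no bits remain */
        ++x;
    return x;
}
```

Under this reading, the keys 2 and 6 from the usage example would fall into bins 1 and 2 respectively.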