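Suffix Trees

On pongba's discussion group I came across an Amazon interview question: find the longest palindrome in a given string. For example, given the input XMADAMYX, the output is MADAM. The popular solution to this question uses a suffix tree. The coolest thing about this data structure is that it efficiently solves a whole pile of hard string programming problems: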

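Checking whether a text T contains a substring P (with complexity comparable to the popular KMP algorithm).

Finding the longest repeated substring of a text T. For example, in abcdabcefda both abc and da occur more than once, and the longest repeated substring is abc.

Finding the longest common substring of two strings S1 and S2. Note that this is not the LCS (longest common subsequence) that is so often used as a dynamic programming example. For instance, the longest common substring of acdfg and akdfc is df, whereas their LCS is adf.

Ziv-Lempel lossless compression.

And then there is the longest palindrome that this interview question asks for.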
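The difference between "longest common substring" and LCS in the list above is easy to blur, so here is a minimal sketch that computes both for the example strings. It uses plain dynamic programming rather than a suffix tree, purely to pin down the two definitions; the function names are made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Longest common substring: the matched characters must be contiguous.
// dp[i][j] = length of the longest common suffix of a[0..i) and b[0..j).
std::string longestCommonSubstring(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> dp(a.size() + 1, std::vector<int>(b.size() + 1, 0));
    int best = 0;
    std::size_t bestEnd = 0;  // end index (exclusive) of the best match in a
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
                if (dp[i][j] > best) {
                    best = dp[i][j];
                    bestEnd = i;
                }
            }
        }
    }
    return a.substr(bestEnd - best, best);
}

// Longest common subsequence: characters may be skipped in either string.
// dp[i][j] = LCS length of a[0..i) and b[0..j).
int lcsLength(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> dp(a.size() + 1, std::vector<int>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1])
                dp[i][j] = dp[i - 1][j - 1] + 1;
            else
                dp[i][j] = std::max(dp[i - 1][j], dp[i][j - 1]);
        }
    }
    return dp[a.size()][b.size()];
}
```

For acdfg and akdfc, the first function finds the contiguous match df, while the second reports a length of 3, matching the subsequence adf. Both run in O(|S1| * |S2|) time; a generalized suffix tree can find the longest common substring in linear time.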

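Suffix trees are presumably also widely used in bioinformatics. After all, matching and selecting bases essentially comes down to manipulating extremely long strings over {C, T, A, G, U}*.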

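Although the suffix tree is conceptually independent of the Trie, I find that deriving it from the Trie is natural and concise, so let me briefly explain the Trie first. The word "Trie" comes from "retrieval", which tells you that its main use is string lookup. Word histories tend to be odd, though: Trie is usually pronounced "try" rather than "tree".

A Trie is a simple but practical data structure, typically used to implement dictionary lookup. When we build an AJAX search box that responds to user input as it is typed, a Trie is where we start. Who said learning data structures is useless? In essence, a Trie is a tree that stores multiple strings. Each edge between adjacent nodes stands for one character, so every branch of the tree spells out a substring, and a leaf stands for a complete string. Unlike an ordinary tree, strings with the same prefix share the same branch. An example makes this clearest. Given the words inn, int, at, age, adv, and ant, we get the following Trie: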

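We can see that:

Each edge corresponds to one letter.

Each node corresponds to a prefix. A leaf corresponds to the longest prefix, that is, the word itself.

The words inn and int have the common prefix "in", so they share the branch on the left, root -> i -> in. Likewise, at, age, adv, and ant have the common prefix "a", so they share the edge from the root to node "a".

Lookup is very simple. To find int, for example, just follow the path i -> in -> int.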

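The basic algorithm for building a Trie is also simple: insert the letters of each word into the Trie one at a time. Before inserting, check whether the prefix already exists. If it does, share it; otherwise create the corresponding node and edge. To insert the word add, for example, the steps are:

Look at the prefix "a" and find that edge a already exists, so follow edge a to node a.

Look at the prefix "d" of the remaining string "dd" and find that an edge d already leaves node a, so follow edge d to node ad.

Look at the last character "d". This time no edge d leaves node ad, so create the child node add under node ad and label the edge ad -> add with d.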
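The insertion and lookup steps above can be sketched in a few lines of C++. This is only an illustrative version, assuming C++14 and a `std::map` per node for the outgoing edges; the names `TrieNode`, `Trie`, `insert`, `contains`, and `hasPrefix` are made up for this example.

```cpp
#include <map>
#include <memory>
#include <string>

// A hypothetical minimal Trie node: one outgoing edge per letter.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isWord = false;  // true if a stored word ends exactly at this node
};

class Trie {
public:
    // Insert a word letter by letter, following existing edges and creating
    // the missing ones. Returns how many new nodes had to be created.
    int insert(const std::string& word) {
        TrieNode* node = &root_;
        int created = 0;
        for (char c : word) {
            std::unique_ptr<TrieNode>& child = node->children[c];
            if (!child) {
                child = std::make_unique<TrieNode>();
                ++created;
            }
            node = child.get();
        }
        node->isWord = true;
        return created;
    }

    // True only if the exact word was inserted earlier.
    bool contains(const std::string& word) const {
        const TrieNode* node = walk(word);
        return node != nullptr && node->isWord;
    }

    // True if some stored word starts with the given prefix.
    bool hasPrefix(const std::string& prefix) const {
        return walk(prefix) != nullptr;
    }

private:
    // Follow the edges spelled by s; nullptr means the path does not exist.
    const TrieNode* walk(const std::string& s) const {
        const TrieNode* node = &root_;
        for (char c : s) {
            auto it = node->children.find(c);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node;
    }

    TrieNode root_;
};
```

After inserting inn, int, at, age, adv, and ant, a call to `insert("add")` creates exactly one new node, just as in the walk-through: the edges a and d are reused and only the final d is new.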

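Time for another plug. The graph drawing tool Graphviz is quite good, and the DSL it uses is very simple. The figure above was made with it. Three steps are enough:

Implement the Trie data structure. Nothing fancy is needed here: 10 lines of code and a hash table will do.

Translate that structure into Graphviz's DSL. A simple depth-first traversal will do.

Run the Graphviz command, and the figure is generated.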
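As a rough sketch of the second step, a depth-first walk can print one DOT edge per Trie edge. This is not the code that produced the original figure; `Node`, `insertWord`, `emitEdges`, and `toDot` are hypothetical names, and each node is simply named after the prefix that leads to it (so a stored word spelled "root" would clash with the root's name).

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical minimal Trie node, just enough to draw.
struct Node {
    std::map<char, std::unique_ptr<Node>> children;
};

void insertWord(Node& root, const std::string& word) {
    Node* node = &root;
    for (char c : word) {
        std::unique_ptr<Node>& child = node->children[c];
        if (!child) child = std::make_unique<Node>();
        node = child.get();
    }
}

// Depth-first walk: emit the edge to each child, then descend into it.
void emitEdges(const Node& node, const std::string& prefix, std::ostringstream& out) {
    const std::string parentName = prefix.empty() ? "root" : prefix;
    for (const auto& entry : node.children) {
        const std::string childName = prefix + entry.first;
        out << "  \"" << parentName << "\" -> \"" << childName
            << "\" [label=\"" << entry.first << "\"];\n";
        emitEdges(*entry.second, childName, out);
    }
}

// Wrap the edges in a DOT digraph.
std::string toDot(const Node& root) {
    std::ostringstream out;
    out << "digraph trie {\n";
    emitEdges(root, "", out);
    out << "}\n";
    return out.str();
}
```

Writing the returned text to a file such as trie.dot and running `dot -Tpng trie.dot -o trie.png` should then render the picture.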

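For 20 extra minutes of work, I avoided the self-torture of drawing and laying out figures by hand. I can also try out all kinds of examples freely without worrying about the tedium of redrawing, so why not?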

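With the Trie in hand, the suffix tree is easy to understand. First, the definition of a suffix. Given a string S = S1 S2 ... Si ... Sn of length n and an integer i with 1 <= i <= n, the substring Si Si+1 ... Sn is a suffix of S. Take the string S = XMADAMYX as an example. Its length is 8, so S[1..8], S[2..8], ... , S[8..8] are all suffixes of S, and we usually count the empty string as a suffix too. For a suffix S[i..n], we say that the suffix starts at i. Altogether we have the following suffixes:

S[1..8], XMADAMYX, the string itself, starting at position 1

S[2..8], MADAMYX, starting at position 2

S[3..8], ADAMYX, starting at position 3

S[4..8], DAMYX, starting at position 4

S[5..8], AMYX, starting at position 5

S[6..8], MYX, starting at position 6

S[7..8], YX, starting at position 7

S[8..8], X, starting at position 8

The empty string, written $.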
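The list above can be reproduced mechanically. Here is a small sketch, with a made-up function name, that returns every non-empty suffix together with its 1-based starting position (the empty suffix is left out):

```cpp
#include <string>
#include <utility>
#include <vector>

// Returns (start position, suffix) pairs, with positions counted from 1
// as in the text. The empty suffix is not included.
std::vector<std::pair<int, std::string>> suffixes(const std::string& s) {
    std::vector<std::pair<int, std::string>> result;
    for (std::size_t i = 0; i < s.size(); ++i) {
        result.emplace_back(static_cast<int>(i) + 1, s.substr(i));
    }
    return result;
}
```

Note that materializing every suffix this way takes O(n^2) characters in total, which is exactly the blow-up a suffix tree avoids by storing positions into the original string.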

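A suffix tree, then, is a compressed Trie that contains all the suffixes of a string. Adding the suffixes above to a Trie gives the following structure:

Looking closely at the figure above, we can see plenty of places worth compressing. For example, the branches marked with blue boxes are all single chains with no forks, so there is no need to represent them with separate nodes and edges. If we allow an edge to carry more than one letter, such an unbranched path can be compressed into a single edge. In addition, the edges already carry enough information about the suffixes, so we no longer need to label the nodes with strings. We only need to label each leaf with the starting position of its suffix. This gives the following figure:

This structure loses some suffixes. For example, the suffix X has disappeared from the figure above, because it happens to be a prefix of the string XMADAMYX. To avoid this, we also require that no suffix be a prefix of another suffix. Solving this is actually quite simple: append to the string a terminator that occurs nowhere else in it, namely the $ used above for the empty suffix. For example, before processing XMADAMYX we first turn it into XMADAMYX$, and then we get a proper suffix tree.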
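To see concretely which suffixes get "lost" and why the terminator helps, here is a brute-force sketch (the function name is hypothetical) that lists every suffix that is also a prefix of a longer suffix. Such a suffix ends inside the path of another suffix, so it never gets a leaf of its own.

```cpp
#include <string>
#include <vector>

// Returns every suffix of s that is a proper prefix of some longer suffix
// of s. In a suffix tree these suffixes have no leaf of their own.
std::vector<std::string> suffixesWithoutLeaf(const std::string& s) {
    std::vector<std::string> hidden;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::string suffix = s.substr(i);
        // Longer suffixes are exactly those that start earlier than i.
        for (std::size_t j = 0; j < i; ++j) {
            if (s.compare(j, suffix.size(), suffix) == 0) {
                hidden.push_back(suffix);
                break;
            }
        }
    }
    return hidden;
}
```

For XMADAMYX the only hidden suffix is X. For XMADAMYX$ the list is empty, because every suffix now ends in $ and $ occurs nowhere else.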

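So what do suffix trees have to do with the longest palindrome? We first need two simple concepts:

Lowest common ancestor, LCA. In a suffix tree, the LCA of two nodes (or more) spells out their longest common prefix. For example, in the figure below the common ancestors of node 7 and node 10 include node 1, but the lowest common ancestor is node 5. An LCA query costs O(1), which is rare these days. The price is an O(n) preprocessing pass over the suffix tree.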
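To make the definition concrete, here is a naive LCA sketch that only needs a parent pointer per node, similar in spirit to the `father` field in the C++ listing at the end of this post. It is deliberately not the O(1) method mentioned above: each query walks up the tree, so it costs time proportional to the depth. The array layout (root marked by a parent of -1) and the function names are assumptions for this example.

```cpp
#include <vector>

// parent[v] is the parent of node v; the root has parent -1.
int depthOf(const std::vector<int>& parent, int v) {
    int depth = 0;
    while (parent[v] != -1) {
        v = parent[v];
        ++depth;
    }
    return depth;
}

// Naive lowest common ancestor: lift the deeper node until both nodes are
// at the same depth, then lift both together until they meet.
int lowestCommonAncestor(const std::vector<int>& parent, int a, int b) {
    int da = depthOf(parent, a);
    int db = depthOf(parent, b);
    while (da > db) { a = parent[a]; --da; }
    while (db > da) { b = parent[b]; --db; }
    while (a != b) {
        a = parent[a];
        b = parent[b];
    }
    return a;
}
```

With the hypothetical parent array {-1, 0, 0, 1, 1, 3}, nodes 5 and 4 share the ancestors 1 and 0, and the lowest of these is 1.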

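Generalized suffix tree. A traditional suffix tree holds all the suffixes of a single word. A generalized suffix tree stores all the suffixes of any number of words. For example, the figure below is the generalized suffix tree of the words XMADAMYX and XYMADAMX. Note that we have to tell the suffixes of different words apart, so each leaf pairs a suffix position with a distinct special symbol.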

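With these concepts, finding the longest palindrome becomes relatively simple. The breakthrough is to look at the radius of a palindrome rather than at the palindrome itself. The radius is the string you get by folding the palindrome in half. For example, the palindrome MADAM has the radius MAD, the radius length is 3, and the center of the radius is the letter D. Obviously the longest palindrome has the longest radius, and its two radii are equal. Take MADAM again: reading left from the center D gives the radius DAM, and reading right from D gives the radius DAM as well. The two must be equal. Because MADAM is the longest palindrome in the word XMADAMYX, we can be sure that the string DAMX, read leftward from D, and the substring DAMYX, read rightward from D, share the longest common prefix DAM. This is the key to the palindrome problem. Now that we have suffix trees, how do we turn the string DAMX, read leftward from D, into a suffix? At this point the answer should be obvious: reverse the word XMADAMYX. We have thus turned the problem of finding a palindrome into the problem of finding the LCA of two suffixes. Of course, we also need to know which pairs of suffixes to query. That is simple too: given a string S of length n, if the longest palindrome is centered at position i (counting from 1), then the suffix read rightward from i is exactly S(i), and the string read leftward from i is exactly the suffix S'(n-i+1) of S', the reversal of S. With this intuition, the algorithm practically writes itself: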

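Build the generalized suffix tree of S and its reversal S', and preprocess it so that an LCA query costs O(1). This step costs O(N), where N is the length of the word S.

For each position i of the word (from 1 to N), compute LCA(S(i), S'(N-i+1)) and LCA(S(i+1), S'(N-i+1)). We query twice because we have to handle both odd-length and even-length palindromes. Every i is examined once, so this step is O(N).

Take the deepest LCA found. That gives us the center i of the palindrome and the length of its radius, and therefore the longest palindrome itself. The total complexity is O(N).

Take the figure above as an example, where the leaf labels evidently count positions from 0: for i = 3, LCA(3$, 4#) is DAM, which is exactly the longest radius. Of course, this is only an intuitive account.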
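The reduction can be tried out without building any tree. The sketch below follows the same plan (reverse the string, then for every center compare a suffix of S with a suffix of the reversal), but it replaces the O(1) LCA query with a naive character-by-character scan, so it runs in O(N^2) rather than O(N). Indices are 0-based here, and the function names are made up for illustration.

```cpp
#include <string>

// Length of the longest common prefix of a[i..] and b[j..], by naive scan.
// In the real algorithm this is the O(1) LCA query on the generalized
// suffix tree of S and its reversal.
static std::size_t commonPrefix(const std::string& a, std::size_t i,
                                const std::string& b, std::size_t j) {
    std::size_t k = 0;
    while (i + k < a.size() && j + k < b.size() && a[i + k] == b[j + k]) ++k;
    return k;
}

std::string longestPalindrome(const std::string& s) {
    const std::size_t n = s.size();
    const std::string r(s.rbegin(), s.rend());  // S', the reversal of S
    std::size_t bestStart = 0, bestLen = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Odd length, centered on s[i]: the rightward string is s[i..] and
        // the leftward string is r[n-1-i..]. k >= 1 since both start with s[i].
        std::size_t k = commonPrefix(s, i, r, n - 1 - i);
        if (2 * k - 1 > bestLen) {
            bestLen = 2 * k - 1;
            bestStart = i + 1 - k;
        }
        // Even length, centered between s[i] and s[i+1]: the rightward
        // string is s[i+1..] and the leftward string is r[n-1-i..].
        k = commonPrefix(s, i + 1, r, n - 1 - i);
        if (2 * k > bestLen) {
            bestLen = 2 * k;
            bestStart = i + 1 - k;
        }
    }
    return s.substr(bestStart, bestLen);
}
```

For XMADAMYX the best center is index 3 (the letter D), where the common prefix of DAMYX and DAMX is DAM, giving MADAM. Swapping `commonPrefix` for an LCA lookup on the preprocessed tree is what brings the whole thing down to linear time.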

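This post only sketches the basic idea of suffix trees. To write usable code, you need to know at least the following as well:

An O(n) algorithm for building a suffix tree. Whether you go with Peter Weiner's "algorithm of the year" from 1973, Edward McCreight's improved algorithm from 1976, E. Ukkonen's greatly simplified algorithm from 1995, or the further simplified linear-time algorithm of Juha Kärkkäinen and Peter Sanders from 2003 (which builds a suffix array, a closely related structure), the choice is yours.

The data structures used to implement a suffix tree, such as the common child-plus-sibling list, Directed

Ways to reduce the space a suffix tree takes, such as storing the positions needed to read a substring instead of the substring itself, and the Directed Acyclic Word Graph, usually abbreviated DAWG.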

2. The uses of suffix trees can be roughly summarized as follows

A C++ implementation:

//
// Suffix tree creation
//
// Mark Nelson, updated December, 2006
//
// This code has been tested with Borland C++ and
// Microsoft Visual C++.
//
// This program asks you for a line of input, then
// creates the suffix tree corresponding to the given
// text. Additional code is provided to validate the
// resulting tree after creation.
//
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <cstring>
#include <cassert>
#include <string>

using std::cout;
using std::cin;
using std::cerr;
using std::setw;
using std::flush;
using std::endl;

//
// When a new tree is added to the table, we step
// through all the currently defined suffixes from
// the active point to the end point.  This structure
// defines a Suffix by its final character.
// In the canonical representation, we define that last
// character by starting at a node in the tree, and
// following a string of characters, represented by
// first_char_index and last_char_index.  The two indices
// point into the input string.  Note that if a suffix
// ends at a node, there are no additional characters
// needed to characterize its last character position.
// When this is the case, we say the node is Explicit,
// and set first_char_index > last_char_index to flag
// that.
//

class Suffix {
     public :
         int origin_node;
         int first_char_index;
         int last_char_index;
        Suffix( int node, int start, int stop )
            : origin_node( node ),
              first_char_index( start ),
              last_char_index( stop ){};
         int Explicit(){ return first_char_index > last_char_index; }
         int Implicit(){ return last_char_index >= first_char_index; }
         void Canonize();
};

//
// The suffix tree is made up of edges connecting nodes.
// Each edge represents a string of characters starting
// at first_char_index and ending at last_char_index.
// Edges can be inserted and removed from a hash table,
// based on the Hash() function defined here.  The hash
// table indicates an unused slot by setting the
// start_node value to -1.
//

class Edge {
     public :
         int first_char_index;
         int last_char_index;
         int end_node;
         int start_node;
         void Insert();
         void Remove();
        Edge();
        Edge( int init_first_char_index,
               int init_last_char_index,
               int parent_node );
         int SplitEdge( Suffix &s );
         static Edge Find( int node, int c );
         static int Hash( int node, int c );
};

//
//   Each node stores its suffix link: every suffix in the
//   tree that ends at a particular node can find the next
//   smaller suffix by following the suffix_node link to a
//   new node.  This version also records each node's parent
//   (father) and, for leaves, a leaf index (leaf_index).
//   Nodes are stored in a simple array.
//
class Node {
     public :
         int suffix_node;
         int father;
         int leaf_index;
        Node() { suffix_node = -1;
                father=-1;
                leaf_index=-1;}
         static int Count;
         static int Leaf;
};

//
// The maximum input string length this program
// will handle is defined here.  A suffix tree
// can have as many as 2N edges/nodes.  The edges
// are stored in a hash table, whose size is also
// defined here.
//
const int MAX_LENGTH = 1000;
const int HASH_TABLE_SIZE = 2179;   // A prime roughly 10% larger

//
// This is the hash table where all the currently
// defined edges are stored.  You can dump out
// all the currently defined edges by iterating
// through the table and finding edges whose start_node
// is not -1.
//

Edge Edges[ HASH_TABLE_SIZE ];

//
// The array of defined nodes.  The count is 1 at the
// start because the initial tree has the root node
// defined, with no children.
//

int Node::Count = 1;
int Node::Leaf = 1;
Node Nodes[ MAX_LENGTH * 2 ];

//
// The input buffer and character count.  Please note that N
// is the length of the input string -1, which means it
// denotes the maximum index in the input buffer.
//

char T[ MAX_LENGTH ];
int N;

//
// Necessary forward references
//
void validate();
int walk_tree( int start_node, int last_char_so_far );

//
// The default ctor for Edge just sets start_node
// to the invalid value.  This is done to guarantee
// that the hash table is initially filled with unused
// edges.
//

Edge::Edge()
{
    start_node = -1;
}

//
// I create new edges in the program while walking up
// the set of suffixes from the active point to the
// endpoint.  Each time I create a new edge, I also
// add a new node for its end point.  The node entry
// is already present in the Nodes[] array, and its
// suffix node is set to -1 by the default Node() ctor,
// so I don't have to do anything with it at this point.
//

Edge::Edge( int init_first, int init_last, int parent_node )
{
    first_char_index = init_first;
    last_char_index = init_last;
    start_node = parent_node;
    end_node = Node::Count++;
    Nodes[end_node].father=start_node;
}

//
// Edges are inserted into the hash table using this hashing
// function.
//

int Edge::Hash( int node, int c )
{
     return ( ( node << 8 ) + c ) % HASH_TABLE_SIZE;
}

//
// A given edge gets a copy of itself inserted into the table
// with this function.  It uses a linear probe technique, which
// means in the case of a collision, we just step forward through
// the table until we find the first unused slot.
//

void Edge::Insert()
{
     int i = Hash( start_node, T[ first_char_index ] );
     while ( Edges[ i ].start_node != -1 )
        i = ( i + 1 ) % HASH_TABLE_SIZE;
    Edges[ i ] = * this;
}

//
// Removing an edge from the hash table is a little more tricky.
// You have to worry about creating a gap in the table that will
// make it impossible to find other entries that have been inserted
// using a probe.  Working around this means that after setting
// an edge to be unused, we have to walk ahead in the table,
// filling in gaps until all the elements can be found.
//
// Knuth, Sorting and Searching, Algorithm R, p. 527
//

void Edge::Remove()
{
     int i = Hash( start_node, T[ first_char_index ] );
     while ( Edges[ i ].start_node != start_node ||
            Edges[ i ].first_char_index != first_char_index )
        i = ( i + 1 ) % HASH_TABLE_SIZE;
     for ( ; ; ) {
        Edges[ i ].start_node = -1;
         int j = i;
         for ( ; ; ) {
            i = ( i + 1 ) % HASH_TABLE_SIZE;
             if ( Edges[ i ].start_node == -1 )
                 return;
             int r = Hash( Edges[ i ].start_node, T[ Edges[ i ].first_char_index ] );
             if ( i >= r && r > j )
                 continue;
             if ( r > j && j > i )
                 continue;
             if ( j > i && i >= r )
                 continue;
             break;
        }
        Edges[ j ] = Edges[ i ];
    }
}

//
// The whole reason for storing edges in a hash table is that it
// makes this function fairly efficient.  When I want to find a
// particular edge leading out of a particular node, I call this
// function.  It locates the edge in the hash table, and returns
// a copy of it.  If the edge isn't found, the edge that is returned
// to the caller will have start_node set to -1, which is the value
// used in the hash table to flag an unused entry.
//

Edge Edge::Find( int node, int c )
{
     int i = Hash( node, c );
     for ( ; ; ) {
         if ( Edges[ i ].start_node == node )
             if ( c == T[ Edges[ i ].first_char_index ] )
                 return Edges[ i ];
         if ( Edges[ i ].start_node == -1 )
             return Edges[ i ];
        i = ( i + 1 ) % HASH_TABLE_SIZE;
    }
}

//
// When a suffix ends on an implicit node, adding a new character
// means I have to split an existing edge.  This function is called
// to split an edge at the point defined by the Suffix argument.
// The existing edge loses its parent, as well as some of its leading
// characters.  The newly created edge descends from the original
// parent, and now has the existing edge as a child.
//
// Since the existing edge is getting a new parent and starting
// character, its hash table entry will no longer be valid.  That's
// why it gets removed at the start of the function.  After the parent
// and start char have been recalculated, it is re-inserted.
//
// The number of characters stolen from the original node and given
// to the new node is equal to the number of characters in the suffix
// argument, which is last - first + 1;
//

int Edge::SplitEdge( Suffix &s )
{
    Remove();
    // A local Edge is enough here: Insert() stores a copy in the
    // hash table, so nothing needs to be allocated with new.
    Edge new_edge( first_char_index,
                   first_char_index + s.last_char_index - s.first_char_index,
                   s.origin_node );
    new_edge.Insert();
    Nodes[ new_edge.end_node ].suffix_node = s.origin_node;
    first_char_index += s.last_char_index - s.first_char_index + 1;
    start_node = new_edge.end_node;
    Insert();
    Nodes[ end_node ].father = start_node;
    return new_edge.end_node;
}

//
// This routine prints out the contents of the suffix tree
// at the end of the program by walking through the
// hash table and printing out all used edges.  It
// would be really great if I had some code that will
// print out the tree in a graphical fashion, but I don't!
//

void dump_edges( int current_n )
{
    cout << " Start  End  Suf  First Last  String\n";
     for ( int j = 0 ; j < HASH_TABLE_SIZE ; j++ ) {
        Edge *s = Edges + j;
         if ( s->start_node == -1 )
             continue;
        cout << setw( 5 ) << s->start_node << " "
             << setw( 5 ) << s->end_node << " "
             << setw( 3 ) << Nodes[ s->end_node ].suffix_node << " "
             << setw( 5 ) << s->first_char_index << " "
             << setw( 6 ) << s->last_char_index << "  ";
         int top;
         if ( current_n > s->last_char_index )
            top = s->last_char_index;
         else
            top = current_n;
         for ( int l = s->first_char_index ;
                  l <= top;
                  l++ )
            cout << T[ l ];
        cout << "\n";
    }
}

//
// A suffix in the tree is denoted by a Suffix structure
// that denotes its last character.  The canonical
// representation of a suffix for this algorithm requires
// that the origin_node be the closest node to the end
// of the tree.  To force this to be true, we have to
// slide down every edge in our current path until we
// reach the final node.

void Suffix::Canonize()
{
     if ( !Explicit() ) {
        Edge edge = Edge::Find( origin_node, T[ first_char_index ] );
         int edge_span = edge.last_char_index - edge.first_char_index;
         while ( edge_span <= ( last_char_index - first_char_index ) ) {
            first_char_index = first_char_index + edge_span + 1;
            origin_node = edge.end_node;
             if ( first_char_index <= last_char_index ) {
               edge = Edge::Find( edge.end_node, T[ first_char_index ] );
                edge_span = edge.last_char_index - edge.first_char_index;
            };
        }
    }
}

//
// This routine constitutes the heart of the algorithm.
// It is called repetitively, once for each of the prefixes
// of the input string.  The prefix in question is denoted
// by the index of its last character.
//
// At each prefix, we start at the active point, and add
// a new edge denoting the new last character, until we
// reach a point where the new edge is not needed due to
// the presence of an existing edge starting with the new
// last character.  This point is the end point.
//
// Luckily for us, the end point just happens to be the
// active point for the next pass through the tree.  All
// we have to do is update its last_char_index to indicate
// that it has grown by a single character, and then this
// routine can do all its work one more time.
//

void AddPrefix( Suffix &active, int last_char_index )
{
     int parent_node;
     int last_parent_node = -1;

     for ( ; ; ) {
        Edge edge;
        parent_node = active.origin_node;
//
// Step 1 is to try and find a matching edge for the given node.
// If a matching edge exists, we are done adding edges, so we break
// out of this big loop.
//
         if ( active.Explicit() ) {
            edge = Edge::Find( active.origin_node, T[ last_char_index ] );
             if ( edge.start_node != -1 )
                 break;
        } else { // implicit node, a little more complicated
            edge = Edge::Find( active.origin_node, T[ active.first_char_index ] );
             int span = active.last_char_index - active.first_char_index;
             if ( T[ edge.first_char_index + span + 1 ] == T[ last_char_index ] )
                 break;
            parent_node = edge.SplitEdge( active );
        }
//
// We didn't find a matching edge, so we create a new one, add
// it to the tree at the parent node position, and insert it
// into the hash table.  When we create a new node, it also
// means we need to create a suffix link to the new node from
// the last node we visited.
//
        // Insert() stores a copy in the hash table, so a local Edge suffices.
        Edge new_edge( last_char_index, N, parent_node );
        new_edge.Insert();
        Nodes[ new_edge.end_node ].leaf_index = Node::Leaf++;
         if ( last_parent_node > 0 )
            Nodes[ last_parent_node ].suffix_node = parent_node;
        last_parent_node = parent_node;
//
// This final step is where we move to the next smaller suffix
//
         if ( active.origin_node == 0 )
            active.first_char_index++;
         else
            active.origin_node = Nodes[ active.origin_node ].suffix_node;
        active.Canonize();
    }
     if ( last_parent_node > 0 )
        Nodes[ last_parent_node ].suffix_node = parent_node;
    active.last_char_index++;   // Now the endpoint is the next active point
    active.Canonize();
};

int main()
{
    cout << "Normally, suffix trees require that the last\n"
         << "character in the input string be unique.  If\n"
         << "you don't do this, your tree will contain\n"
         << "suffixes that don't end in leaf nodes.  This is\n"
         << "often a useful requirement. You can build a tree\n"
         << "in this program without meeting this requirement,\n"
         << "but the validation code will flag it as being an\n"
         << "invalid tree\n\n";
    cout << "Enter string: " << flush;
    cin.getline( T, MAX_LENGTH - 1 );
    N = strlen( T ) - 1;
//
// The active point is the first non-leaf suffix in the
// tree.  We start by setting this to be the empty string
// at node 0.  The AddPrefix() function will update this
// value after every new prefix is added.
//
    Suffix active( 0, 0, -1 );   // The initial active prefix
     for ( int i = 0 ; i <= N ; i++ )
        AddPrefix( active, i );


    // Print each node's index, parent, and leaf index.
    for ( int i = 0 ; i < Node::Count ; i++ )
        cout << i << "   " << Nodes[ i ].father << "   " << Nodes[ i ].leaf_index << endl;
//
// Once all N prefixes have been added, the resulting table
// of edges is printed out, and a validation step is
// optionally performed.
//
//    dump_edges( N );
//    cout << "Would you like to validate the tree?"
//         << flush;
//    std::string s;
//     std::getline( cin, s );
//    if ( s.size() > 0 && s[ 0 ] == 'Y' || s[ 0 ] == 'y' )
//        validate();
    return 0;
};

//
// The validation code consists of two routines.  All it does
// is traverse the entire tree.  walk_tree() calls itself
// recursively, building suffix strings up as it goes.  When
// walk_tree() reaches a leaf node, it checks to see if the
// suffix derived from the tree matches the suffix starting
// at the same point in the input text.  If so, it tags that
// suffix as correct in the GoodSuffixes[] array.  When the tree
// has been traversed, every entry in the GoodSuffixes array should
// have a value of 1.
//
// In addition, the BranchCount[] array is updated while the tree is
// walked as well.  Every count in the array has the
// number of child edges emanating from that node.  If the node
// is a leaf node, the value is set to -1.  When the routine
// finishes, every node should be a branch or a leaf.  The number
// of leaf nodes should match the number of suffixes (the length)
// of the input string.  The total number of branches from all
// nodes should match the node count.
//

char CurrentString[ MAX_LENGTH ];
char GoodSuffixes[ MAX_LENGTH ];
int BranchCount[ MAX_LENGTH * 2 ] = { 0 };  // int, so the -1 leaf marker works even where char is unsigned

void validate()
{
     for ( int i = 0 ; i < N ; i++ )
        GoodSuffixes[ i ] = 0;
    walk_tree( 0, 0 );
     int error = 0;
    for ( int i = 0 ; i < N ; i++ )
         if ( GoodSuffixes[ i ] != 1 ) {
            cout << "Suffix " << i << " count wrong!\n";
            error++;
        }
     if ( error == 0 )
        cout << "All Suffixes present!\n";
     int leaf_count = 0;
     int branch_count = 0;
    for ( int i = 0 ; i < Node::Count ; i++ ) {
         if ( BranchCount[ i ] == 0 )
            cout << "Logic error on node "
                 << i
                 << ", not a leaf or internal node!\n";
         else if ( BranchCount[ i ] == -1 )
            leaf_count++;
         else
            branch_count += BranchCount[ i ];
    }
    cout << "Leaf count : "
         << leaf_count
         << ( leaf_count == ( N + 1 ) ? " OK" : " Error!" )
         << "\n";
    cout << "Branch count : "
         << branch_count
         << ( branch_count == (Node::Count - 1) ? " OK" : " Error!" )
         << endl;
}

int walk_tree( int start_node, int last_char_so_far )
{
     int edges = 0;
     for ( int i = 0 ; i < 256 ; i++ ) {
        Edge edge = Edge::Find( start_node, i );
         if ( edge.start_node != -1 ) {
             if ( BranchCount[ edge.start_node ] < 0 )
                cerr << "Logic error on node "
                     << edge.start_node
                     << '\n';
            BranchCount[ edge.start_node ]++;
            edges++;
             int l = last_char_so_far;
             for ( int j = edge.first_char_index ; j <= edge.last_char_index ; j++ )
                CurrentString[ l++ ] = T[ j ];
            CurrentString[ l ] = '\0';
             if ( walk_tree( edge.end_node, l ) ) {
                 if ( BranchCount[ edge.end_node ] > 0 )
                        cerr << "Logic error on node "
                             << edge.end_node
                             << "\n";
                BranchCount[ edge.end_node ]--;
            }
        }
    }
//
// If this node didn't have any child edges, it means we
// are at a leaf node, and can check on this suffix.  We
// check to see if it matches the input string, then tick
// off its entry in the GoodSuffixes list.
//
     if ( edges == 0 ) {
        cout << "Suffix : ";
         for ( int m = 0 ; m < last_char_so_far ; m++ )
            cout << CurrentString[ m ];
        cout << "\n";
        GoodSuffixes[ strlen( CurrentString ) - 1 ]++;
        cout << "comparing: " << ( T + N - strlen( CurrentString ) + 1 )
             << " to " << CurrentString << endl;
         if ( strcmp(T + N - strlen(CurrentString) + 1, CurrentString ) != 0 )
            cout << "Comparison failure!\n";
         return 1;
    } else
         return 0;
}
Reposted from: http://www.cppblog.com/superKiki/archive/2010/10/29/131786.aspx

/* 2013年3月15日15:16:24 malloc 就memory(内存) allocate(分配)的缩写本程序没有实际含义，只是理解使用 */ # include <stdio.h> # include <malloc.h> int main(void) { int i = 5; //分配了4个字节静态分配 int * p
Objective-C编码规范[译] dcj3sjt126com 代码规范
原文链接 : The official raywenderlich.com Objective-C style guide 原文作者 : raywenderlich.com Team 译文出自 : raywenderlich.com Objective-C编码规范译者 : Sam Lau
0.性能优化-目录 frank1234 性能优化
从今天开始笔者陆续发表一些性能测试相关的文章，主要是对自己前段时间学习的总结，由于水平有限，性能测试领域很深，本人理解的也比较浅，欢迎各位大咖批评指正。主要内容包括：一、性能测试指标吞吐量、TPS、响应时间、负载、可扩展性、PV、思考时间 http://frank1234.iteye.com/blog/2180305 二、性能测试策略生产环境相同基准测试预热等 htt
Java父类取得子类传递的泛型参数Class类型 happyqing java 泛型父类子类 Class
import java.lang.reflect.ParameterizedType; import java.lang.reflect.Type; import org.junit.Test; abstract class BaseDao<T> { public void getType() { //Class<E> clazz =
跟我学SpringMVC目录汇总贴、PDF下载、源码下载 jinnianshilongnian springMVC
----广告-------------------------------------------------------------- 网站核心商详页开发掌握Java技术，掌握并发/异步工具使用，熟悉spring、ibatis框架；掌握数据库技术，表设计和索引优化，分库分表/读写分离；了解缓存技术，熟练使用如Redis/Memcached等主流技术；了解Ngin
the HTTP rewrite module requires the PCRE library 流浪鱼 rewrite
./configure: error: the HTTP rewrite module requires the PCRE library. 模块依赖性Nginx需要依赖下面3个包 1. gzip 模块需要 zlib 库 ( 下载: http://www.zlib.net/ ) 2. rewrite 模块需要 pcre 库 ( 下载: http://www.pcre.org/ ) 3. s
第12章 Ajax（中） onestopweb Ajax
index.html <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/
Optimize query with Query Stripping in Web Intelligence blueoxygen BO
http://wiki.sdn.sap.com/wiki/display/BOBJ/Optimize+query+with+Query+Stripping+in+Web+Intelligence and a very straightfoward video http://www.sdn.sap.com/irj/scn/events?rid=/library/uuid/40ec3a0c-936
Java开发者写SQL时常犯的10个错误 tomcat_oracle java sql
1、不用PreparedStatements 　　有意思的是，在JDBC出现了许多年后的今天，这个错误依然出现在博客、论坛和邮件列表中，即便要记住和理解它是一件很简单的事。开发者不使用PreparedStatements的原因可能有如下几个：　　他们对PreparedStatements不了解　　他们认为使用PreparedStatements太慢了　　他们认为写Prepar
世纪互联与结盟有感阿尔萨斯
10月10日，世纪互联与（Foxcon）签约成立合资公司，有感。全球电子制造业巨头（全球500强企业）与世纪互联共同看好IDC、云计算等业务在中国的增长空间，双方迅速果断出手，在资本层面上达成合作，此举体现了全球电子制造业巨头对世纪互联IDC业务的欣赏与信任，另一方面反映出世纪互联目前良好的运营状况与广阔的发展前景。众所周知，精于电子产品制造（世界第一），对于世纪互联而言，能够与结盟