InnoDB-AHI

Purpose

  • AHI targets leaf pages; its purpose is to cut the cost of descending the B-tree to locate a record.

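How it works

AHI maps key (index_id + fields + bytes) to value (the physical address of a record), so a lookup can jump straight to the record.
Two details matter here:

  • How fields and bytes are chosen.
  • A key (index_id + fields + bytes) identifies exactly one record; when several records share the same prefix, either the leftmost or the rightmost one is chosen.
    In the figure below, assume fields is 2 and bytes is 0. Keys are then created for (a,a), (a,b), (a,c) and (a,d) (ignoring records that span pages).
    Where duplicates exist, key (a,b) points to record (a,b,1) when left_side is true, and to record (a,b,3) otherwise.


    [Figure: image.png]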
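
The leftmost/rightmost choice can be modeled in a few lines. This is a minimal sketch, not InnoDB code: Record, prefix_key and build_page_hash are hypothetical names, records are plain string columns, the key is the concatenated prefix instead of a fold value, and index_id is left out because only one index is modeled.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One index record, reduced to its column values (hypothetical model type).
using Record = std::vector<std::string>;

// Concatenate the first n_fields columns into a lookup key.
// The '\0' separator keeps ("ab","c") distinct from ("a","bc").
std::string prefix_key(const Record& rec, std::size_t n_fields) {
  std::string key;
  for (std::size_t f = 0; f < n_fields && f < rec.size(); ++f) {
    key += rec[f];
    key += '\0';
  }
  return key;
}

// Build key -> record position for one page whose records are in index order.
// left_side == true keeps the leftmost record of each equal-prefix run,
// left_side == false keeps the rightmost one.
std::map<std::string, std::size_t> build_page_hash(
    const std::vector<Record>& page, std::size_t n_fields, bool left_side) {
  std::map<std::string, std::size_t> hash;
  for (std::size_t pos = 0; pos < page.size(); ++pos) {
    const std::string key = prefix_key(page[pos], n_fields);
    if (left_side) {
      hash.emplace(key, pos);  // emplace never overwrites: first record wins
    } else {
      hash[key] = pos;         // assignment overwrites: last record wins
    }
  }
  return hash;
}
```

With the page (a,a,1), (a,b,1), (a,b,2), (a,b,3), (a,c,1), (a,d,1) and n_fields = 2, the sketch produces four keys, and key (a,b) resolves to position 1 or position 3 depending on left_side.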

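How it is used

Overall flow

In btr_cur_search_to_nth_level:

  • If certain conditions hold, btr_search_guess_on_hash is called to position the cursor, and btr_search_check_guess then validates the record it found.
  • If AHI cannot be used or the guess fails, search_loop descends the tree level by level.
  • When the search finishes and the final level is the leaf level, btr_search_info_update is called to update the AHI statistics.

Code: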
btr_cur_search_to_nth_level(
    dict_index_t*  index,  /*!< in: index */
    ulint      level,  /*!< in: the tree level of search */
    ulint      mode,   /*!< in: PAGE_CUR_L, ...;
                Inserts should always be made using
                PAGE_CUR_LE to search the position! */
    ...)
{
    /* Use of AHI is disabled for intrinsic table as these tables re-use
    the index-id and AHI validation is based on index-id. */
    if (rw_lock_get_writer(btr_get_search_latch(index))
        == RW_LOCK_NOT_LOCKED
        && latch_mode <= BTR_MODIFY_LEAF
        && info->last_hash_succ
        && !index->disable_ahi
        && !estimate
# ifdef PAGE_CUR_LE_OR_EXTENDS
        && mode != PAGE_CUR_LE_OR_EXTENDS
# endif /* PAGE_CUR_LE_OR_EXTENDS */
        && !dict_index_is_spatial(index)
        /* If !has_search_latch, we do a dirty read of
        btr_search_enabled below, and btr_search_guess_on_hash()
        will have to check it again. */
        && UNIV_LIKELY(btr_search_enabled)
        && !modify_external
        && btr_search_guess_on_hash(index, info, tuple, mode,
                    latch_mode, cursor,
                    has_search_latch, mtr)) {

        /* Search using the hash index succeeded */

        ut_ad(cursor->up_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_GE);
        ut_ad(cursor->up_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_LE);
        ut_ad(cursor->low_match != ULINT_UNDEFINED
              || mode != PAGE_CUR_LE);
        btr_cur_n_sea++;

        DBUG_VOID_RETURN;
    }
  // Start from the root page of the index (space_id, page_no)
  space = dict_index_get_space(index);
  page_no = dict_index_get_page(index);

search_loop:
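  // Descend level by level until the requested level is reached; this is usually 0 (the leaf level)
  // This walkthrough ignores the Change Buffer path
  // Fetch the index page from the Buffer Pool or from disk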
  block = buf_page_get_gen(space, zip_size, page_no, rw_latch, guess, buf_mode,
        file, line, mtr);
    
  // Search the index page for the record that satisfies the given tuple under the requested
  // mode (e.g. PAGE_CUR_L) and store the result in page_cursor, which is a simple struct:
  //   struct page_cur_t{
  //     byte*       rec;    /*!< pointer to a record on page */
  //     buf_block_t*    block;  /*!< pointer to the block containing rec */
  //   };
  page_cur_search_with_match(block, index, tuple, page_mode, &up_match, &up_bytes,
        &low_match, &low_bytes, page_cursor);

  if (level != height) {
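    // Not yet at the requested level: read the child page_no stored in the node pointer record under page_cursor
    // Note: the value in an internal node is a pointer (page_no) to a child node (internal or leaf)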
    node_ptr = page_cur_get_rec(page_cursor);
    /* Go to the child node */
    page_no = btr_node_ptr_get_child_page_no(node_ptr, offsets);
        
    // Continue the search one level down
    goto search_loop;
  }

  // Requested level reached; update the AHI statistics before the function returns
  if (btr_search_enabled && !index->disable_ahi) {
    btr_search_info_update(index, cursor);
  }
}

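Positioning and validity checking

Positioning: btr_search_guess_on_hash
  • The search tuple supplied by the caller must contain at least as many prefix columns as the prefix AHI was built with. One complication: the n_fields in the index's search_info may already differ from the curr_n_fields the block's hash was built with, but we do not yet know which block the lookup will land on, so the fold is always computed from search_info's n_fields and then used to probe AHI.
  • Probing AHI requires an S lock on btr_search_latch.
  • If the lookup misses AHI, btr_search_info::last_hash_succ is set to false. Subsequent lookups then skip AHI until a later round of search-info analysis turns it back on (btr_search_failure).
  • The record pointer returned by AHI must still be checked against the current search mode to confirm that it is the correct position (btr_search_check_guess).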
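
The gating described in these bullets can be sketched as a toy model. Everything here is an assumption for illustration: SearchInfo, HashIndex, fold_key and guess_on_hash are hypothetical names, the hash is a std::unordered_map keyed by the concatenated prefix rather than a fold value, latching is omitted, and whether the real function touches last_hash_succ when the tuple is too short is not modeled faithfully (the sketch leaves it unchanged).

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using Tuple = std::vector<std::string>;
using HashIndex = std::unordered_map<std::string, std::size_t>;

// Per-index search info, reduced to the two members the bullets mention.
struct SearchInfo {
  std::size_t n_fields = 1;    // recommended prefix length
  bool last_hash_succ = true;  // cleared on a miss, which disables AHI lookups
};

// Stand-in for the fold computation: join the first n_fields columns.
std::string fold_key(const Tuple& t, std::size_t n_fields) {
  std::string key;
  for (std::size_t f = 0; f < n_fields && f < t.size(); ++f) {
    key += t[f];
    key += '\0';
  }
  return key;
}

// Returns true and sets *pos_out when the hash yields a candidate position.
bool guess_on_hash(SearchInfo& info, const HashIndex& ahi, const Tuple& tuple,
                   std::size_t* pos_out) {
  if (!info.last_hash_succ) {
    return false;  // the caller-side condition: AHI is skipped after a miss
  }
  if (tuple.size() < info.n_fields) {
    return false;  // tuple has fewer columns than the recommended prefix
  }
  // The key is always computed from the index-level n_fields.
  HashIndex::const_iterator it = ahi.find(fold_key(tuple, info.n_fields));
  if (it == ahi.end()) {
    info.last_hash_succ = false;  // miss: later lookups skip AHI
    return false;
  }
  *pos_out = it->second;  // candidate only; it still has to be validated
  return true;
}
```

A hit only yields a candidate position; the validation step is a separate check, as the last bullet says.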

Validating the record: btr_search_check_guess

Whether a record is valid depends heavily on the search mode; see the comments for details.

btr_search_check_guess(
    btr_cur_t*  cursor,
    ibool       can_only_compare_to_cursor_rec,
    const dtuple_t* tuple,
    ulint       mode,
    mtr_t*      mtr)
{
    rec_t*      rec;
    ulint       n_unique;
    ulint       match;
    int     cmp;
    mem_heap_t* heap        = NULL;
    ulint       offsets_[REC_OFFS_NORMAL_SIZE];
    ulint*      offsets     = offsets_;
    ibool       success     = FALSE;
    rec_offs_init(offsets_);
    n_unique = dict_index_get_n_unique_in_tree(cursor->index);
    rec = btr_cur_get_rec(cursor);
    ut_ad(page_rec_is_user_rec(rec));
    match = 0;
    offsets = rec_get_offsets(rec, cursor->index, offsets,
                  n_unique, &heap);
    cmp = cmp_dtuple_rec_with_match(tuple, rec, offsets, &match);
    if (mode == PAGE_CUR_GE) {
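        // cmp > 0 means tuple is greater than rec. rec may be the leftmost of several records with the same prefix: e.g. AHI key (a,b), tuple (a,b,c), rec (a,b,a). In that case the guess is rejected.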
        if (cmp > 0) {
            goto exit_func;
        }
        cursor->up_match = match;
        if (match >= n_unique) {
            success = TRUE;
            goto exit_func;
        }
    } else if (mode == PAGE_CUR_LE) {
        if (cmp < 0) {
            goto exit_func;
        }
        cursor->low_match = match;
    } else if (mode == PAGE_CUR_G) {
        if (cmp >= 0) {
            goto exit_func;
        }
    } else if (mode == PAGE_CUR_L) {
        if (cmp <= 0) {
            goto exit_func;
        }
    }
    if (can_only_compare_to_cursor_rec) {
        /* Since we could not determine if our guess is right just by
        looking at the record under the cursor, return FALSE */
        goto exit_func;
    }
    match = 0;
    // The record needs a further check. For PAGE_CUR_G and PAGE_CUR_GE, check whether rec is the leftmost record that satisfies the AHI lookup key; otherwise check whether it is the rightmost one.
    if ((mode == PAGE_CUR_G) || (mode == PAGE_CUR_GE)) {
        rec_t*  prev_rec;
        ut_ad(!page_rec_is_infimum(rec));
        prev_rec = page_rec_get_prev(rec);
        if (page_rec_is_infimum(prev_rec)) {
            success = btr_page_get_prev(page_align(prev_rec), mtr)
                == FIL_NULL;
            goto exit_func;
        }
        offsets = rec_get_offsets(prev_rec, cursor->index, offsets,
                      n_unique, &heap);
        cmp = cmp_dtuple_rec_with_match(
            tuple, prev_rec, offsets, &match);
        if (mode == PAGE_CUR_GE) {
            success = cmp > 0;
        } else {
            success = cmp >= 0;
        }
        goto exit_func;
    } else {
        rec_t*  next_rec;
        ut_ad(!page_rec_is_supremum(rec));
        next_rec = page_rec_get_next(rec);
        if (page_rec_is_supremum(next_rec)) {
            if (btr_page_get_next(page_align(next_rec), mtr)
                == FIL_NULL) {
                cursor->up_match = 0;
                success = TRUE;
            }
            goto exit_func;
        }
        offsets = rec_get_offsets(next_rec, cursor->index, offsets,
                      n_unique, &heap);
        cmp = cmp_dtuple_rec_with_match(
            tuple, next_rec, offsets, &match);
        if (mode == PAGE_CUR_LE) {
            success = cmp < 0;
            cursor->up_match = match;
        } else {
            success = cmp <= 0;
        }
    }
exit_func:
    if (UNIV_LIKELY_NULL(heap)) {
        mem_heap_free(heap);
    }
    return(success);
}
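
The mode-dependent logic is easier to follow on a reduced model. This sketch is an assumption-laden simplification, not the InnoDB function: records are sorted integers on a single page that stands for the whole index, so hitting the page edge counts as success, and the n_unique shortcut for PAGE_CUR_GE is omitted. Mode, cmp3 and check_guess are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

enum class Mode { G, GE, L, LE };

// Three-way compare in the same direction as cmp_dtuple_rec_with_match:
// positive when tuple > rec, negative when tuple < rec.
int cmp3(int tuple, int rec) { return (tuple > rec) - (tuple < rec); }

// Is position pos the place a real search for tuple in this mode would stop?
bool check_guess(const std::vector<int>& recs, std::size_t pos, int tuple,
                 Mode mode) {
  if (pos >= recs.size()) return false;
  const int cmp = cmp3(tuple, recs[pos]);

  // Step 1: the guessed record itself must satisfy the mode.
  if (mode == Mode::GE && cmp > 0) return false;   // need rec >= tuple
  if (mode == Mode::LE && cmp < 0) return false;   // need rec <= tuple
  if (mode == Mode::G && cmp >= 0) return false;   // need rec >  tuple
  if (mode == Mode::L && cmp <= 0) return false;   // need rec <  tuple

  // Step 2: the neighbor must NOT satisfy the mode, otherwise the guess
  // is not the first (G/GE) or last (L/LE) qualifying record.
  if (mode == Mode::G || mode == Mode::GE) {
    if (pos == 0) return true;  // no previous record in this model
    const int prev = cmp3(tuple, recs[pos - 1]);
    return mode == Mode::GE ? prev > 0 : prev >= 0;
  }
  if (pos + 1 == recs.size()) return true;  // no next record in this model
  const int next = cmp3(tuple, recs[pos + 1]);
  return mode == Mode::LE ? next < 0 : next <= 0;
}
```

For recs = {1, 2, 2, 2, 3} and tuple 2, only position 1 passes for GE and only position 3 passes for LE, which matches the leftmost/rightmost distinction in the comments.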

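How AHI is built

fields + bytes

As noted above, the key consists of index_id + fields + bytes. How are fields and bytes determined?

What fields and bytes mean

See https://www.jianshu.com/p/0cdd573a8232

Struct members used to determine fields and bytes

Three structs take part in determining fields and bytes: btr_cur_t (the cursor used during a tree search), btr_search_t (the search info kept for each index) and buf_block_t (the block control struct). The information in btr_cur_t is updated while the B-tree search positions the cursor. After positioning, btr_search_t is updated from btr_cur_t and records information about searches on the B-tree; buf_block_t is then updated from btr_search_t and records search information for that block.

Each index object maintains index->search_info, of type btr_search_t.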

/** The search info struct in an index */
struct btr_search_t{

    ...

    ulint   n_fields;   /*!< recommended prefix length for hash search:
                number of full fields */
    ulint   n_bytes;    /*!< recommended prefix: number of bytes in
                an incomplete field
                @see BTR_PAGE_MAX_REC_SIZE */
    ibool   left_side;  /*!< TRUE or FALSE, depending on whether
                the leftmost record of several records with
                the same prefix should be indexed in the
                hash index */

    ...

};

Related members of the block control struct (buf_block_t)

struct buf_block_t{
    
    ...

    volatile ulint  n_bytes;    /*!< recommended prefix length for hash
                    search: number of bytes in
                    an incomplete last field */
    volatile ulint  n_fields;   /*!< recommended prefix length for hash
                    search: number of full fields */
    volatile bool   left_side;  /*!< true or false, depending on
                    whether the leftmost record of several
                    records with the same prefix should be
                    indexed in the hash index */
    ...
};

The tree cursor

struct btr_cur_t {

    ...

    ulint       up_match;   /*!< If the search mode was PAGE_CUR_LE,
                    the number of matched fields to the
                    the first user record to the right of
                    the cursor record after
                    btr_cur_search_to_nth_level;
                    for the mode PAGE_CUR_GE, the matched
                    fields to the first user record AT THE
                    CURSOR or to the right of it;
                    NOTE that the up_match and low_match
                    values may exceed the correct values
                    for comparison to the adjacent user
                    record if that record is on a
                    different leaf page! (See the note in
                    row_ins_duplicate_error_in_clust.) */
    ulint       up_bytes;   /*!< number of matched bytes to the
                    right at the time cursor positioned;
                    only used internally in searches: not
                    defined after the search */
    ulint       low_match;  /*!< if search mode was PAGE_CUR_LE,
                    the number of matched fields to the
                    first user record AT THE CURSOR or
                    to the left of it after
                    btr_cur_search_to_nth_level;
                    NOT defined for PAGE_CUR_GE or any
                    other search modes; see also the NOTE
                    in up_match! */
    ulint       low_bytes;  /*!< number of matched bytes to the
                    left at the time cursor positioned;
                    only used internally in searches: not
                    defined after the search */
    ulint       n_fields;   /*!< prefix length used in a hash
                    search if hash_node != NULL */
    ulint       n_bytes;    /*!< hash prefix bytes if hash_node !=
                    NULL */

    ...
    
};
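
When fields and bytes are determined

As described in the overall flow, when the search finishes and the final level is the leaf level, btr_search_info_update is called to update the AHI statistics.
By then cursor->{up_match, up_bytes, low_match, low_bytes} are all set.
The index's search info is first updated from cursor->{up_match, up_bytes, low_match, low_bytes}.
The call path is btr_search_info_update->btr_search_info_update_slow->btr_search_info_update_hash.
btr_search_t->{n_fields,n_bytes,left_side} is updated in two cases.

  • btr_search_t->n_hash_potential is 0: the search info is being initialized for the first time, or the previous search could not single out one record from its search condition.
  • As the code shows, cmp <= 0 means that the record described by cursor->low_match, cursor->low_bytes is the rightmost record equal to the search condition within the prefix info->n_fields, info->n_bytes. If info recommends building AHI on the leftmost record of each equal-prefix run, the recommendation no longer fits this search and a new one must be generated. (TODO: add a figure.)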
    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->low_match, cursor->low_bytes);
    if (info->left_side ? cmp <= 0 : cmp > 0) {
        goto set_new_recomm;
    }
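
The comparison used here is lexicographic on (fields, bytes) pairs. The sketch below restates the test on plain values so it can be tried in isolation; pair_cmp is modeled on ut_pair_cmp, while Recommendation, LowMatch and recommendation_is_stale are hypothetical names introduced only for this example.

```cpp
// Lexicographic compare of (a_h, a_l) against (b_h, b_l):
// -1 if a < b, 0 if equal, 1 if a > b. Modeled on ut_pair_cmp.
int pair_cmp(unsigned long a_h, unsigned long a_l,
             unsigned long b_h, unsigned long b_l) {
  if (a_h != b_h) return a_h < b_h ? -1 : 1;
  if (a_l != b_l) return a_l < b_l ? -1 : 1;
  return 0;
}

// The index-level recommendation (info->n_fields, n_bytes, left_side).
struct Recommendation {
  unsigned long n_fields;
  unsigned long n_bytes;
  bool left_side;
};

// How far the search tuple matched the record on the low side of the cursor.
struct LowMatch {
  unsigned long low_match;
  unsigned long low_bytes;
};

// Mirrors: if (info->left_side ? cmp <= 0 : cmp > 0) goto set_new_recomm;
bool recommendation_is_stale(const Recommendation& r, const LowMatch& m) {
  const int cmp = pair_cmp(r.n_fields, r.n_bytes, m.low_match, m.low_bytes);
  // left_side:  stale when the low record already matches the whole prefix,
  //             so the leftmost record with that prefix is not the target.
  // right_side: stale when the low record does not match the whole prefix.
  return r.left_side ? cmp <= 0 : cmp > 0;
}
```

For example, with a recommendation of (2, 0, left_side = true), a search whose low_match is 2 marks it stale, while a low_match of 1 does not.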

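The new values of info->{n_fields,n_bytes,left_side} are produced by the algorithm below. It chooses {info->n_fields, info->n_bytes, info->left_side} so that, without exceeding the number of columns in the unique index, the cost of computing the key is as low as possible. index->search_info->left_side then decides whether the leftmost or the rightmost record of each equal-prefix run on a data page is stored.
See the comments for details.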

set_new_recomm:
    /* We have to set a new recommendation; skip the hash analysis
    for a while to avoid unnecessary CPU time usage when there is no
    chance for success */
    info->hash_analysis = 0;
    cmp = ut_pair_cmp(cursor->up_match, cursor->up_bytes,
              cursor->low_match, cursor->low_bytes);
    if (cmp == 0) {
        // cmp == 0 means the search condition cannot single out one record, e.g. a search for = b that positions low at a and up at c.
        info->n_hash_potential = 0;
        /* For extra safety, we set some sensible values here */
        info->n_fields = 1;
        info->n_bytes = 0;
        info->left_side = TRUE;
    } else if (cmp > 0) {
        // cmp > 0: (up_match, up_bytes) is greater than (low_match, low_bytes)
        info->n_hash_potential = 1;
        if (cursor->up_match >= n_unique) {
            // n_unique fields are already enough to identify one record
            info->n_fields = n_unique;
            info->n_bytes = 0;

        } else if (cursor->low_match < cursor->up_match) {
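            // Why +1: if low_match = 1 and up_match = 3, setting n_fields to 2 is already enough to land on the up_match record. E.g. for search condition (a,b,c), with low_match covering (a) and up_match covering (a,b,c), the prefix (a,b) is enough to reach the up_match record. Likewise, when low_match = 0, n_fields = 1 is enough.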
            info->n_fields = cursor->low_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->low_match;
            info->n_bytes = cursor->low_bytes + 1;
        }

        info->left_side = TRUE;
    } else {
        info->n_hash_potential = 1;

        if (cursor->low_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;
        } else if (cursor->low_match > cursor->up_match) {

            info->n_fields = cursor->up_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->up_match;
            info->n_bytes = cursor->up_bytes + 1;
        }

        info->left_side = FALSE;
    }
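
To try the algorithm on concrete numbers, here is a self-contained restatement on plain values. It follows the branches of the snippet one for one; Recomm and new_recommendation are hypothetical names, and pair_cmp is a local stand-in for ut_pair_cmp.

```cpp
// Local stand-in for ut_pair_cmp: lexicographic compare of two pairs.
int pair_cmp(unsigned long a_h, unsigned long a_l,
             unsigned long b_h, unsigned long b_l) {
  if (a_h != b_h) return a_h < b_h ? -1 : 1;
  if (a_l != b_l) return a_l < b_l ? -1 : 1;
  return 0;
}

struct Recomm {
  unsigned long n_fields;
  unsigned long n_bytes;
  bool left_side;
  unsigned long n_hash_potential;
};

// Same branch structure as the set_new_recomm section.
Recomm new_recommendation(unsigned long low_match, unsigned long low_bytes,
                          unsigned long up_match, unsigned long up_bytes,
                          unsigned long n_unique) {
  Recomm r = {1, 0, true, 0};  // the "sensible values" used when cmp == 0
  const int cmp = pair_cmp(up_match, up_bytes, low_match, low_bytes);
  if (cmp == 0) {
    return r;  // no prefix separates the two neighbors: potential stays 0
  }
  r.n_hash_potential = 1;
  if (cmp > 0) {  // tuple matches the up record further: index leftmost
    r.left_side = true;
    if (up_match >= n_unique) {
      r.n_fields = n_unique;
    } else if (low_match < up_match) {
      r.n_fields = low_match + 1;  // one more field than the low record shares
    } else {
      r.n_fields = low_match;      // same field count: extend by one byte
      r.n_bytes = low_bytes + 1;
    }
  } else {        // tuple matches the low record further: index rightmost
    r.left_side = false;
    if (low_match >= n_unique) {
      r.n_fields = n_unique;
    } else if (low_match > up_match) {
      r.n_fields = up_match + 1;
    } else {
      r.n_fields = up_match;
      r.n_bytes = up_bytes + 1;
    }
  }
  return r;
}
```

With low_match = 1, up_match = 3 and n_unique = 4, the result is n_fields = 2, n_bytes = 0, left_side = true, which is the "+1" case from the comment.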

The complete function:

static
void
btr_search_info_update_hash(
    btr_search_t*   info,
    btr_cur_t*  cursor)
{
    dict_index_t*   index = cursor->index;
    ulint       n_unique;
    int     cmp;

    ut_ad(!rw_lock_own(btr_get_search_latch(index), RW_LOCK_S));
    ut_ad(!rw_lock_own(btr_get_search_latch(index), RW_LOCK_X));

    if (dict_index_is_ibuf(index)) {
        /* So many deletes are performed on an insert buffer tree
        that we do not consider a hash index useful on it: */

        return;
    }

    n_unique = dict_index_get_n_unique_in_tree(index);

    if (info->n_hash_potential == 0) {

        goto set_new_recomm;
    }

    /* Test if the search would have succeeded using the recommended
    hash prefix */

    if (info->n_fields >= n_unique && cursor->up_match >= n_unique) {
increment_potential:
        info->n_hash_potential++;

        return;
    }

    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->low_match, cursor->low_bytes);

    if (info->left_side ? cmp <= 0 : cmp > 0) {

        goto set_new_recomm;
    }

    cmp = ut_pair_cmp(info->n_fields, info->n_bytes,
              cursor->up_match, cursor->up_bytes);

    if (info->left_side ? cmp <= 0 : cmp > 0) {

        goto increment_potential;
    }

set_new_recomm:
    /* We have to set a new recommendation; skip the hash analysis
    for a while to avoid unnecessary CPU time usage when there is no
    chance for success */

    info->hash_analysis = 0;

    cmp = ut_pair_cmp(cursor->up_match, cursor->up_bytes,
              cursor->low_match, cursor->low_bytes);
    if (cmp == 0) {
        // cmp == 0 means the search condition cannot single out one record, e.g. a search for = b that positions low at a and up at c.
        info->n_hash_potential = 0;

        /* For extra safety, we set some sensible values here */

        info->n_fields = 1;
        info->n_bytes = 0;

        info->left_side = TRUE;

    } else if (cmp > 0) {
        info->n_hash_potential = 1;

        if (cursor->up_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;

        } else if (cursor->low_match < cursor->up_match) {

            info->n_fields = cursor->low_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->low_match;
            info->n_bytes = cursor->low_bytes + 1;
        }

        info->left_side = TRUE;
    } else {
        info->n_hash_potential = 1;

        if (cursor->low_match >= n_unique) {

            info->n_fields = n_unique;
            info->n_bytes = 0;
        } else if (cursor->low_match > cursor->up_match) {

            info->n_fields = cursor->up_match + 1;
            info->n_bytes = 0;
        } else {
            info->n_fields = cursor->up_match;
            info->n_bytes = cursor->up_bytes + 1;
        }

        info->left_side = FALSE;
    }
}

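Once the index-level n_fields and n_bytes recommendation is in place, how is it applied at the block level?
The call path is btr_search_info_update->btr_search_info_update_slow->btr_search_update_block_hash_info.
Although AHI produces its recommendation per index, the key:value mappings are ultimately built per block. The block-level members record search statistics for that block, and AHI is built on the block once certain conditions are met.
Open question: after a block's hash has been built, how does InnoDB behave if the index's recommended fields and bytes change?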
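
A rough sketch of the block-level decision follows. It is written from memory of the MySQL 5.7 logic in btr_search_update_block_hash_info and should be checked against the source: the thresholds 16 and 100 (standing in for BTR_SEARCH_PAGE_BUILD_LIMIT and BTR_SEARCH_BUILD_LIMIT) are assumptions, and Prefix, IndexInfo, BlockInfo and should_build_block_hash are hypothetical names. The update of last_hash_succ is left out.

```cpp
#include <cstddef>

// Assumed thresholds; verify against the InnoDB source before relying on them.
constexpr std::size_t kPageBuildLimit = 16;
constexpr std::size_t kBuildLimit = 100;

struct Prefix {
  std::size_t n_fields;
  std::size_t n_bytes;
  bool left_side;
  bool operator==(const Prefix& o) const {
    return n_fields == o.n_fields && n_bytes == o.n_bytes &&
           left_side == o.left_side;
  }
};

struct IndexInfo {
  Prefix recomm;                 // index-level recommendation
  std::size_t n_hash_potential;  // consecutive searches it would have served
};

struct BlockInfo {
  Prefix recomm = {1, 0, true};  // prefix the block is counting helps for
  Prefix built = {0, 0, true};   // prefix the existing hash was built with
  bool has_hash = false;         // whether a hash exists on this block
  std::size_t n_hash_helps = 0;  // searches the recommended prefix would help
  std::size_t n_recs = 0;        // number of records on the page
};

// Update the block statistics; return true when the block should be (re)built.
bool should_build_block_hash(const IndexInfo& info, BlockInfo& block) {
  if (block.n_hash_helps > 0 && info.n_hash_potential > 0 &&
      block.recomm == info.recomm) {
    block.n_hash_helps++;
  } else {
    // The recommendation changed: start counting again under the new prefix.
    block.n_hash_helps = 1;
    block.recomm = info.recomm;
  }
  if (block.n_hash_helps > block.n_recs / kPageBuildLimit &&
      info.n_hash_potential >= kBuildLimit) {
    return !block.has_hash || block.n_hash_helps > 2 * block.n_recs ||
           !(block.recomm == block.built);
  }
  return false;
}
```

If this reading of the source is right, a changed index recommendation resets the block's counter, and the block is rebuilt with the new prefix only after the counter passes the thresholds again.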

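How frequent AHI rebuilds are avoided

  • Index level (btr_search_t): how to avoid generating new recommendations too often.
  • Block level (buf_block_t): how to decide whether a block is worth building.
    Index level first: btr_search_t has a member hash_analysis. It is reset to 0 whenever a new recommendation is generated, and for the next BTR_SEARCH_HASH_ANALYSIS searches on that index no new recommendation is attempted.
    For the block level, see https://juejin.cn/post/6844903536765976590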
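
The index-level throttle amounts to a counter check in front of the expensive analysis. A minimal sketch, assuming the counter is incremented on every search and compared against the limit: the value 17 for BTR_SEARCH_HASH_ANALYSIS is an assumption to verify against the source, and IndexSearchInfo and should_analyze are hypothetical names.

```cpp
// Assumed value of BTR_SEARCH_HASH_ANALYSIS; check the InnoDB source.
constexpr unsigned long kHashAnalysis = 17;

struct IndexSearchInfo {
  unsigned long hash_analysis = 0;  // reset to 0 when a new recommendation is set
};

// Called once per finished leaf search. Returns true when the slow path
// (recommendation analysis) should run for this search.
bool should_analyze(IndexSearchInfo& info) {
  info.hash_analysis++;
  return info.hash_analysis >= kHashAnalysis;
}
```

After a reset, the first 16 searches skip the analysis under this assumption; every later search runs it until the next reset.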

AHI concurrency control

https://juejin.cn/post/6844903536765976590
https://developer.aliyun.com/article/41046
http://mysql.taobao.org/monthly/2015/09/01/
https://www.jianshu.com/p/0cdd573a8232
