CMU15445 2020 B+TREE: Brief Notes

CMU15445 2020 B+TREE

    • Preparation
    • Checkpoint 1 notes
    • Checkpoint 2 notes
      • Deletion
      • Iterator
      • Concurrency

Lab page
My CMU15445 2021 blog post

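Preparation

After finishing the 2021 edition of 15445, I wanted to try the 2020 B+ tree project. I cloned the repository and installed the dependencies following the steps of the 2020 C++ Primer assignment, but the code that gets pulled is already the latest 2021 version, so the repository has to be rolled back to an earlier commit. Using Project 1 as the check, I rolled back to a version that still contains buffer_pool_manager.cpp.
commit: f92ef74d8fb0d20d2038495b83b3ad0535b25f2c
The commit history was browsed with the Git Graph extension for VS Code.
After the rollback, cmake reports the following error: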

fatal: invalid reference: master
CMake Error at googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:40 (message):
  Failed to checkout tag: 'master'

Fix: in build_support/gtest_CMakeLists.txt.in, change master to main.
The complete procedure is:

git init    # initialize a repository in an empty directory
git remote add public https://github.com/cmu-db/bustub.git
git fetch public
git merge public/master
git reset --hard f92ef74d8fb0d20d2038495b83b3ad0535b25f2c
sudo ./build_support/packages.sh
mkdir build
cd build
# edit build_support/gtest_CMakeLists.txt.in: change master to main
cmake ..
make

For 2020, the buffer pool project only requires implementing the LRU replacement policy and the buffer pool manager. The files involved are:

src/include/buffer/lru_replacer.h
src/buffer/lru_replacer.cpp
src/include/buffer/buffer_pool_manager.h
src/buffer/buffer_pool_manager.cpp

For 2021, by contrast, you implement the LRU replacement policy, the buffer pool manager instance and the parallel buffer pool manager. The files involved are:

src/include/buffer/lru_replacer.h
src/buffer/lru_replacer.cpp
src/include/buffer/buffer_pool_manager_instance.h
src/buffer/buffer_pool_manager_instance.cpp
src/include/buffer/parallel_buffer_pool_manager.h
src/buffer/parallel_buffer_pool_manager.cpp

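Copying over my 2021 code and making small changes to how AllocatePage and DeallocatePage are used was enough to pass the local tests.
To run the online tests, add the 2020 course on Gradescope with the entry code 5VX7JZ.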

All of the source code for the projects are available on Github. There is a Gradescope submission site available to non-CMU students (Entry Code: 5VX7JZ). We will make the auto-grader for each assignment available to non-CMU students on Gradescope after their due date for CMU students. In exchange for making this available to the public, we ask that you do not make your project implementations public on Github or other source code repositories.

References:
CMU15445 lab0 C++ PRIMER
googletest pull fails
FAQ

Checkpoint 1 notes

I had heard that this lab is hard, so I first watched the following lectures on Bilibili:

  • Lecture #07: Trees Indexes I
  • Lecture #08: Trees Indexes II
  • Lecture #09: Index Concurrency Control

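Then I studied B+ trees systematically using the notes linked from the course schedule. This was actually the first time I watched the 15445 lectures.
At the start, do not rush to implement the methods of the individual page classes. Implement the methods of the b_plus_tree class first, so that you learn what each page method is for, and then implement those one by one. In fact, the APIs of the classes already reveal the rough flow of each operation.
In template methods, variables declared with auto get no autocompletion in VS Code, which is very inconvenient, so I used explicit types wherever possible.
In checkpoint 1 I did not yet know when it was safe to unpin, so I simply recorded every page obtained through new or fetch and released them all just before the operation ended. (This is the wrong approach.)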

class BPlusTree {
  // ids of every page created or fetched during the current operation
  std::vector<page_id_t> dirty_id_;
};

// helper functions: each Create/Fetch API records the page id it touches
InternalPage *CreateInternalPage(page_id_t *page_id, const page_id_t &parent_id);

InternalPage *FetchInternalPage(const page_id_t &page_id);

LeafPage *CreateLeafPage(page_id_t *page_id, const page_id_t &parent_id);

LeafPage *FetchLeafPage(const page_id_t &page_id);

BPlusTreePage *FetchTreePage(const page_id_t &page_id);

INDEX_TEMPLATE_ARGUMENTS
void BPLUSTREE_TYPE::ReleaseAllPage() {
  for (page_id_t id : dirty_id_) {
    buffer_pool_manager_->UnpinPage(id, true);
  }
  dirty_id_.clear();
}

Checkpoint 2 will probably require changing this, but it does not matter for now: get the basic logic working first, then think about optimization.
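One common alternative to "record everything and release at the end" is a scope guard that unpins a page as soon as it goes out of scope. The following is only a minimal sketch of that idea: ToyPool and PinGuard are hypothetical names, not part of BusTub, and the pool here tracks nothing but pin counts.

```cpp
#include <cassert>
#include <unordered_map>

using page_id_t = int;

// Hypothetical stand-in for a buffer pool: it only tracks pin counts.
class ToyPool {
 public:
  void Fetch(page_id_t id) { ++pins_[id]; }
  void Unpin(page_id_t id) {
    assert(pins_[id] > 0);  // unpinning an unpinned page is a bug
    --pins_[id];
  }
  int PinCount(page_id_t id) const {
    auto it = pins_.find(id);
    return it == pins_.end() ? 0 : it->second;
  }

 private:
  std::unordered_map<page_id_t, int> pins_;
};

// Pins a page on construction and unpins it when the guard leaves scope.
class PinGuard {
 public:
  PinGuard(ToyPool *pool, page_id_t id) : pool_(pool), id_(id) { pool_->Fetch(id_); }
  ~PinGuard() { pool_->Unpin(id_); }
  PinGuard(const PinGuard &) = delete;
  PinGuard &operator=(const PinGuard &) = delete;

 private:
  ToyPool *pool_;
  page_id_t id_;
};

// Returns the pin count observed while the guard is alive.
int PinCountInsideScope(ToyPool *pool, page_id_t id) {
  PinGuard guard(pool, id);
  return pool->PinCount(id);
}
```

With a guard like this, each function that fetches a page also owns its unpin, so no page stays pinned longer than the scope that needs it.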
For reasons I did not understand at first, submitting to Gradescope produced the errors below, so I tested with the local grading_b_plus_tree_checkpoint_1_test.cpp instead.

/autograder/bustub/test/storage/b_plus_tree_insert_test.cpp:127:64: error: no member named 'end' in 'bustub::BPlusTree<bustub::GenericKey<8>, bustub::RID, bustub::GenericComparator<8> >'; did you mean 'End'? [clang-diagnostic-error]
  for (auto iterator = tree.Begin(index_key); iterator != tree.end(); ++iterator) {
                                                               ^~~
                                                               End
/autograder/bustub/src/include/storage/index/b_plus_tree.h:67:22: note: 'End' declared here
  INDEXITERATOR_TYPE End();
                     ^

 Checking: /autograder/bustub/test/storage/grading_b_plus_tree_checkpoint_1_test.cpp
 Checking: /autograder/bustub/test/storage/grading_b_plus_tree_checkpoint_2_sequential_test.cpp
 Checking: /autograder/bustub/test/storage/grading_b_plus_tree_memory_test.cpp
 Checking: /autograder/bustub/test/storage/tmp_tuple_page_test.cpp
 Checking: /autograder/bustub/test/type/type_test.cpp
/autograder/bustub/test/storage/grading_b_plus_tree_checkpoint_2_sequential_test.cpp:60:18: error: invalid range expression of type 'bustub::BPlusTree<bustub::GenericKey<8>, bustub::RID, bustub::GenericComparator<8> >'; no viable 'begin' function available [clang-diagnostic-error]
  for (auto pair : tree) {

/autograder/bustub/test/storage/grading_b_plus_tree_checkpoint_2_sequential_test.cpp:135:57: error: no member named 'isEnd' in 'bustub::IndexIterator<bustub::GenericKey<8>, bustub::RID, bustub::GenericComparator<8> >'; did you mean 'IsEnd'? [clang-diagnostic-error]
  for (auto iterator = tree.Begin(index_key); !iterator.isEnd(); ++iterator) {
                                                        ^~~~~
                                                        IsEnd
/autograder/bustub/src/include/storage/index/index_iterator.h:31:8: note: 'IsEnd' declared here
  bool IsEnd();

/autograder/bustub/build/googletest-src/googletest/include/gtest/gtest.h:1358:11: error: The left operand of '==' is a garbage value [clang-analyzer-core.UndefinedBinaryOperatorResult,-warnings-as-errors]
  if (lhs == rhs) {

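Solution: it turned out to be enough to define all of begin, end, isEnd, Begin, End and IsEnd; there is no need to worry about the naming-style check. The "The left operand of '==' is a garbage value" error can be resolved by adding src/include/storage/page/tmp_tuple_page.h to the submission archive. Note that tmp_tuple_page.h itself may need to be modified.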

// tmp_tuple_page.h
#pragma once

#include "storage/page/page.h"
#include "storage/table/tmp_tuple.h"
#include "storage/table/tuple.h"

namespace bustub {

// To pass the test cases for this class, you must follow the existing TmpTuplePage format and implement the
// existing functions exactly as they are! It may be helpful to look at TablePage.
// Remember that this task is optional, you get full credit if you finish the next task.

/**
 * TmpTuplePage format:
 *
 * Sizes are in bytes.
 * | PageId (4) | LSN (4) | FreeSpace (4) | (free space) | TupleSize2 | TupleData2 | TupleSize1 | TupleData1 |
 *
 * We choose this format because DeserializeExpression expects to read Size followed by Data.
 */
class TmpTuplePage : public Page {
 public:
  void Init(page_id_t page_id, uint32_t page_size) {
    memcpy(GetData(), &page_id, sizeof(page_id_t));
    memcpy(GetData() + sizeof(page_id_t), &page_size, sizeof(uint32_t));
  }

  auto GetTablePageId() -> page_id_t { return INVALID_PAGE_ID; }

  auto Insert(const Tuple &tuple, TmpTuple *out) -> bool { return false; }

 private:
  static_assert(sizeof(page_id_t) == 4);
};

}  // namespace bustub

# proj2.sh
 zip project2-submission.zip \
    src/include/buffer/lru_replacer.h \
    src/buffer/lru_replacer.cpp \
    src/include/buffer/buffer_pool_manager.h \
    src/buffer/buffer_pool_manager.cpp \
    src/include/storage/page/b_plus_tree_page.h \
    src/storage/page/b_plus_tree_page.cpp \
    src/include/storage/page/b_plus_tree_internal_page.h \
    src/storage/page/b_plus_tree_internal_page.cpp \
    src/include/storage/page/b_plus_tree_leaf_page.h \
    src/storage/page/b_plus_tree_leaf_page.cpp \
    src/include/storage/index/b_plus_tree.h \
    src/storage/index/b_plus_tree.cpp \
    src/include/storage/index/index_iterator.h \
    src/storage/index/index_iterator.cpp \
    src/include/storage/page/tmp_tuple_page.h 

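
While implementing, it helps to follow a visualization site, B+ Tree Visualization (usfca.edu), which makes the required code steps easier to work out. One detail I find quite neat: the first key of an internal node is unused, so when a split moves half of the key/value pairs into the new node, the key that lands in that unused first slot is exactly the one to insert into the parent.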

    InternalPage *new_inner = Split(parent_page);
    KeyType comp = new_inner->KeyAt(0);
    InsertIntoParent(parent_page, comp, new_inner, transaction);
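
To make the "unused first key" trick concrete, here is a small sketch that models an internal node as a vector of (key, child) pairs. SplitInternal and Entry are names I made up for the illustration; the real page classes work on a fixed array inside the page instead.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Entry = std::pair<int, int>;  // (key, child page id); the key in slot 0 is unused

// Moves the upper half of `node` into a new node and returns it.
// Slot 0 of the returned node then holds the key to push up into the parent.
std::vector<Entry> SplitInternal(std::vector<Entry> *node) {
  std::size_t mid = node->size() / 2;
  std::vector<Entry> right(node->begin() + static_cast<std::ptrdiff_t>(mid), node->end());
  node->resize(mid);
  return right;
}
```

For a node holding (_, 10), (5, 11), (9, 12), (14, 13), the new node starts with (9, 12). Key 9 is no longer needed inside the new node, because its slot 0 key is ignored, so it can move up to the parent as the separator.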

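Problems encountered:
1. Downloading gtest through git kept failing in various ways. Possible fixes:

  1. Turn off the system proxy and reset the git proxy settings:
git config --global --unset http.proxy 
git config --global --unset https.proxy
  2. Edit the hosts file to add GitHub hostname-to-IP mappings.
  3. Restart the network or the machine.

Related link: https://github.com/521xueweihan/GitHub520
2. Defining a helper method that creates a new leaf node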

INDEX_TEMPLATE_ARGUMENTS
BPlusTree<KeyType, ValueType, KeyComparator>::LeafPage *BPLUSTREE_TYPE::CreateLeafPage(page_id_t *page_id, const page_id_t &parent_id) {
  Page *page = buffer_pool_manager_->NewPage(page_id);
  if (page == nullptr) {
    throw Exception(ExceptionType::OUT_OF_MEMORY, "out memory when create leaf page");
  }
  LeafPage *leaf_page = reinterpret_cast<LeafPage *>(page->GetData());
  leaf_page->Init(*page_id, parent_id, leaf_max_size_);
  return leaf_page;
}

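This fails with: need 'typename' before *** because *** is a dependent scope
The reason is that, inside a template, the compiler cannot tell whether the name BPlusTree<KeyType, ValueType, KeyComparator>::LeafPage refers to a member variable or to a type.
I therefore defined macros to use as the function return type: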

#define INTERNAL_PAGE_TYPE typename BPlusTree<KeyType, ValueType, KeyComparator>::InternalPage
#define LEAF_PAGE_TYPE typename BPlusTree<KeyType, ValueType, KeyComparator>::LeafPage

See also: A brief analysis of the compile error "need 'typename' before *** because *** is a dependent scope"
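
The same situation can be reproduced in a few lines. This sketch uses a hypothetical Tree template in place of BPlusTree and assumes C++17, the standard BusTub builds with; the out-of-class return type names a member of a class that depends on K, so it needs the typename keyword.

```cpp
#include <cassert>
#include <type_traits>

template <typename K>
struct Tree {
  using LeafPage = K;  // nested type, standing in for BPlusTree::LeafPage
  LeafPage *MakeLeaf(LeafPage *storage);
};

// Out-of-class definition: Tree<K>::LeafPage depends on K, so `typename`
// tells the compiler that the name is a type rather than a data member.
template <typename K>
typename Tree<K>::LeafPage *Tree<K>::MakeLeaf(LeafPage *storage) {
  return storage;
}
```

The parameter can be written as plain LeafPage because, after the Tree<K>:: qualifier of the function name, lookup already happens inside the class scope. A trailing return type would avoid the keyword for the same reason.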

3. In CopyNFrom I wanted to copy the entries with a direct memcpy call, but it was rejected with the following error:

CopyNFrom
memcpy(&array_, items, size * sizeof(MappingType));
error: undefined behavior, source object type 'std::pair, int>' is not TriviallyCopyable [bugprone-undefined-memory-manipulation,-warnings-as-errors]

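In other words, this std::pair instantiation is not a trivially copyable type, so copying it with memcpy has undefined behavior.
The blog post "A Brief Discussion of C++ Undefined Behavior" explains undefined behavior very well:

Undefined behavior is behavior that the standard does not explicitly specify, that C++ implementations are not required to document, and on whose concrete outcome the standard places no restrictions. From the point of view of the abstract machine, undefined behavior resembles unspecified behavior in that it describes a non-deterministic state transition: starting from an initial state and executing a program that contains undefined behavior, the abstract machine may end up in any state whatsoever. The standard places no restrictions on that final state. Classic examples of undefined behavior include out-of-bounds array indexing, dereferencing a null or dangling pointer, and signed integer overflow or underflow.

In summary, the fundamental reason for specifying undefined behavior is that the real world is simply too complex. Undefined behavior is the product of an irreconcilable tension between an extremely concise language design and an extremely complicated real world.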
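
A way to sidestep the warning is to copy element by element with std::copy, which uses the type's own copy assignment. The sketch below is an illustration under my own assumptions: Item uses std::string as the key so that it is certainly not trivially copyable, and the destination is an array of already constructed objects, unlike the raw page buffer in the lab.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

using Item = std::pair<std::string, int>;  // hypothetical entry type

static_assert(!std::is_trivially_copyable<Item>::value,
              "memcpy on this type would be undefined behavior");

// Copies `size` items one by one through their copy assignment operator,
// which is well defined whether or not the type is trivially copyable.
void CopyNFrom(Item *dest, const Item *items, int size) {
  std::copy(items, items + size, dest);
}
```

For types that really are trivially copyable, standard library implementations are free to lower std::copy to a memmove internally, so the element-wise form need not cost anything.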

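4. The default-argument problem
The BPlusTree constructor uses macros for the default sizes of both leaf and internal nodes, but for internal nodes MappingType should not be std::pair<KeyType, ValueType>; it should be std::pair<KeyType, page_id_t>.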

#define MappingType std::pair<KeyType, ValueType>

#define B_PLUS_TREE_LEAF_PAGE_TYPE BPlusTreeLeafPage<KeyType, ValueType, KeyComparator>
#define LEAF_PAGE_HEADER_SIZE 28
#define LEAF_PAGE_SIZE ((PAGE_SIZE - LEAF_PAGE_HEADER_SIZE) / sizeof(MappingType))

#define B_PLUS_TREE_INTERNAL_PAGE_TYPE BPlusTreeInternalPage<KeyType, ValueType, KeyComparator>
#define INTERNAL_PAGE_HEADER_SIZE 24
#define INTERNAL_PAGE_SIZE ((PAGE_SIZE - INTERNAL_PAGE_HEADER_SIZE) / (sizeof(MappingType)))
explicit BPlusTree(std::string name, BufferPoolManager *buffer_pool_manager, const KeyComparator &comparator,
                     int leaf_max_size = LEAF_PAGE_SIZE, int internal_max_size = INTERNAL_PAGE_SIZE);

So I defined constants instead:

  static const int LEAF_PAGE_MAX_SIZE = (PAGE_SIZE - LEAF_PAGE_HEADER_SIZE) / sizeof(std::pair<KeyType, ValueType>);
  static const int INTERNAL_PAGE_MAX_SIZE = (PAGE_SIZE - INTERNAL_PAGE_HEADER_SIZE) / (sizeof(std::pair<KeyType, page_id_t>));
  explicit BPlusTree(std::string name, BufferPoolManager *buffer_pool_manager, const KeyComparator &comparator,
                     int leaf_max_size = LEAF_PAGE_MAX_SIZE, int internal_max_size = INTERNAL_PAGE_MAX_SIZE);
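
The effect of the entry type on the default sizes can be checked with a few constants. The layouts below are assumptions made for the illustration (an 8-byte key, an 8-byte record id, a 4-byte page id and a 4096-byte page); only the header sizes 28 and 24 come from the macros above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Stand-in types with assumed layouts, not the real BusTub classes.
struct Key8 { char data[8]; };
struct Rid { std::int32_t page_id; std::uint32_t slot; };
using page_id_t = std::int32_t;

constexpr std::size_t kPageSize = 4096;  // assumed page size
constexpr std::size_t kLeafHeader = 28;
constexpr std::size_t kInternalHeader = 24;

// Each node type divides by the size of the entry it actually stores.
constexpr std::size_t kLeafMax = (kPageSize - kLeafHeader) / sizeof(std::pair<Key8, Rid>);
constexpr std::size_t kInternalMax =
    (kPageSize - kInternalHeader) / sizeof(std::pair<Key8, page_id_t>);
```

Because an internal entry stores a page id instead of a full record id, it is smaller, so an internal page can hold more entries than a leaf page. Sharing one MappingType macro for both hides that difference.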


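Checkpoint 2 notes

Deletion

When I started on deletion I really had no idea what I was writing, nor how the parameters of the provided interfaces were meant to be used, especially the double pointers in Coalesce. Only later did I learn that 15445 has a textbook, Database System Concepts, which already gives pseudocode for insertion and deletion. With the pseudocode in hand the implementation becomes straightforward. The Chinese edition contains some errors, though, so it is better to check against the English edition.
Database-System-Concepts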
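Points to watch:
Root node: handle the case where the target node is the root first.
Leaf nodes: mind the relative position of this and recipient, and update the next page id.
Internal nodes: handle the middle key carefully when moving entries so that the invariant holds, and update the parent id of every child that is moved (a helper function can wrap this step).

Invariant: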
* Store n indexed keys and n+1 child pointers (page_id) within internal page.
* Pointer PAGE_ID(i) points to a subtree in which all keys K satisfy:
* K(i) <= K < K(i+1).

Parent node: update or delete the corresponding key.
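
The middle-key handling for internal nodes is easiest to see on a toy model. The sketch below borrows one child from the right sibling: the separator from the parent moves down into the left node, and the right node's next key moves up. The vector-based Entry model and the function name are my own simplification, and updating the moved child's parent id is not modeled.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Entry = std::pair<int, int>;  // (key, child page id); the key in slot 0 is unused

// Moves the first child of `right` to the end of `left`.
// `separator` is the parent key between the two nodes. The return value is
// the new separator that the caller must write back into the parent.
int MoveFirstToEndOf(std::vector<Entry> *left, std::vector<Entry> *right, int separator) {
  assert(right->size() >= 2);
  // The moved child keeps its lower bound, which is the old separator.
  left->push_back({separator, right->front().second});
  right->erase(right->begin());
  // The key now sitting in slot 0 of `right` becomes the new separator.
  return right->front().first;
}
```

With left = (_, 10), (5, 11), right = (_, 12), (14, 13), (20, 14) and separator 9, child 12 moves to the left node under key 9 and the parent's separator becomes 14, so K(i) <= K < K(i+1) still holds for every child.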

Iterator

To avoid visiting a leaf node repeatedly, the iterator reads all entries of the leaf in one go.

// add your own private member variables here
BufferPoolManager *buffer_pool_manager_;
Page *page_;
LeafPage *leaf_;
std::vector<MappingType> data_;
int index_;

Page *page = FindLeafPage(key, false);
LeafPage *leaf = reinterpret_cast<LeafPage *>(page->GetData());
int index = leaf->KeyIndex(key, comparator_);
return INDEXITERATOR_TYPE(buffer_pool_manager_, page, leaf->GetAllItem(), index);

INDEX_TEMPLATE_ARGUMENTS
INDEXITERATOR_TYPE::IndexIterator(BufferPoolManager *buffer_pool_manager, Page *page, std::vector<MappingType> &&data,int index)
    : buffer_pool_manager_(buffer_pool_manager), page_(page), data_(std::move(data)), index_(index) {
  leaf_ = reinterpret_cast<LeafPage *>(page_->GetData());
}
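
The snapshot idea can be sketched without the buffer pool. In this simplified model a leaf is a struct with a next pointer, and the iterator copies one leaf's items at a time. Leaf, SnapshotIterator and Collect are hypothetical names, and pinning or latching of pages is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Leaf {
  std::vector<int> items;
  const Leaf *next = nullptr;
};

class SnapshotIterator {
 public:
  explicit SnapshotIterator(const Leaf *leaf) { Load(leaf); }
  bool IsEnd() const { return index_ >= data_.size() && next_ == nullptr; }
  int operator*() const { return data_[index_]; }
  SnapshotIterator &operator++() {
    ++index_;
    if (index_ >= data_.size() && next_ != nullptr) {
      Load(next_);  // current snapshot exhausted: copy the next leaf
    }
    return *this;
  }

 private:
  // Copies the items of `leaf`, skipping over empty leaves.
  void Load(const Leaf *leaf) {
    while (leaf != nullptr) {
      data_ = leaf->items;
      next_ = leaf->next;
      index_ = 0;
      if (!data_.empty()) {
        return;
      }
      leaf = next_;
    }
  }

  std::vector<int> data_;
  const Leaf *next_ = nullptr;
  std::size_t index_ = 0;
};

// Walks the whole leaf chain and returns every item in order.
std::vector<int> Collect(const Leaf *first) {
  std::vector<int> out;
  for (SnapshotIterator it(first); !it.IsEnd(); ++it) {
    out.push_back(*it);
  }
  return out;
}
```

The trade-off is that the iterator sees each leaf as it was when it was copied, so later changes to that leaf are not visible until the next leaf is loaded.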

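Concurrency

For this part I mainly referred to:
CMU 15445 Project2 B+TREE | A brief talk about B+ trees
CMU 15445 Project 2C: implementing a concurrent B+ tree index

In practice I used only two ideas: the virtual root node from the first post, and the assertions on unpin and delete from the second.

Virtual root node:

ReaderWriterLatch virtual_root_;  // virtual root node
// release the latches on the recorded nodes and unpin them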
INDEX_TEMPLATE_ARGUMENTS
void BPLUSTREE_TYPE::ReleaseAncestorsPage(std::stack<Page *> *ancestors, bool is_dirty) {
  while (!ancestors->empty()) {
    Page *page = ancestors->top();
    if (page != nullptr) {
      BPlusTreePage *tree_page = reinterpret_cast<BPlusTreePage *>(page->GetData());
      page_id_t page_id = tree_page->GetPageId();
      page->WUnlatch();
      buffer_pool_manager_->UnpinPage(page_id, is_dirty);
    } else {  // a null entry stands for the virtual root node
      virtual_root_.WUnlock();
    }

    ancestors->pop();
  }
}

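Imagine a virtual root node sitting above the real root. A read first takes the virtual root's read latch and then accesses the root; once it holds the root's read latch, it releases the virtual root. An insert first takes the virtual root's write latch and then accesses the root, and releases everything together once a child is known to be safe. A null pointer can be pushed onto the stack as the marker for the virtual root, which is a very neat idea.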
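
The null-pointer marker can be tried out in isolation. This sketch replaces the real reader-writer latch with a hypothetical CountingLatch that only counts, so the release order can be checked without threads; it says nothing about actual thread safety.

```cpp
#include <cassert>
#include <stack>

// Hypothetical latch that only counts how often it is held.
struct CountingLatch {
  int held = 0;
  void WLock() { ++held; }
  void WUnlock() {
    assert(held > 0);
    --held;
  }
};

struct Node {
  CountingLatch latch;
};

// Releases every latch recorded on the stack.
// A null entry stands for the virtual root above the real root.
void ReleaseAncestors(std::stack<Node *> *ancestors, CountingLatch *virtual_root) {
  while (!ancestors->empty()) {
    Node *node = ancestors->top();
    if (node != nullptr) {
      node->latch.WUnlock();
    } else {
      virtual_root->WUnlock();
    }
    ancestors->pop();
  }
}
```

Treating the virtual root as just another stack entry means the "child is safe, release all ancestors" step needs no special case for the root.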

// read path
virtual_root_.RLock();  // take the virtual root's read latch
if (IsEmpty()) {
  virtual_root_.RUnlock();
  return false;
}
// walk down the tree to find the target leaf
page_id_t page_id;
page_id_t parent_id;
Page *page;
Page *parent;
BPlusTreePage *tree_page;

page_id = root_page_id_;
page = buffer_pool_manager_->FetchPage(page_id);
page->RLatch();
virtual_root_.RUnlock();
tree_page = reinterpret_cast<BPlusTreePage *>(page->GetData());
 
// insert path
virtual_root_.WLock();  // take the virtual root's write latch
if (IsEmpty()) {
  StartNewTree(key, value);
  virtual_root_.WUnlock();
  return true;
}
// virtual_root_ is unlocked inside this function
return InsertIntoLeaf(key, value, transaction);


// delete path
virtual_root_.WLock();  // take the virtual root's write latch
if (IsEmpty()) {
  virtual_root_.WUnlock();
  return;
}
// walk down the tree to find the target leaf
page_id_t page_id;
Page *page;
BPlusTreePage *tree_page;
std::stack<Page *> ancestors;

ancestors.emplace(nullptr);  // push the marker for the virtual root node
page_id = root_page_id_;
page = buffer_pool_manager_->FetchPage(page_id);
page->WLatch();
tree_page = reinterpret_cast<BPlusTreePage *>(page->GetData());

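The second post has more analysis than I cared to read; what I mainly borrowed is its use of assertions to verify that the implementation is correct.

In UnpinPageImpl:
if (page.pin_count_ <= 0) {
  printf("unpin page failed  id:%d  count:%d\n", page_id, page.pin_count_);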
  assert(false);
  return false;
}
In DeletePageImpl:
if (delete_page.pin_count_ != 0) {
  printf("delete page failed  id:%d  count:%d\n", page_id, delete_page.pin_count_);
  assert(false);
  return false;
}

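My earlier implementation paid little attention to unpinning: the helper functions recorded every created or fetched page and unpinned them all at the end. That dug a lot of pitfalls for this part.

Lookup and insertion caused no major trouble, but deletion may delete pages. I therefore changed the signature of CoalesceOrRedistribute to take a std::stack<Page *> *ancestors parameter. Each time a node is deleted, the current page is popped from the stack, meaning that unlatching and unpinning it is no longer the responsibility of Remove but of AdjustRoot or Coalesce. (Coalesce may swap the two nodes, so it does not necessarily delete the current page and may delete its sibling instead; handing the responsibility to that function makes the code easier to write.) Whenever you call fetch or create, be explicit about which function is responsible for the unpin.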

  ancestors.emplace(page);
  // perform the deletion, reorganizing nodes if necessary
  LeafPage *leaf = reinterpret_cast<LeafPage *>(tree_page);
  int size = leaf->RemoveAndDeleteRecord(key, comparator_);
  // below the minimum size: redistribute or coalesce
  if (size < leaf->GetMinSize()) {
    // if a page is deleted, some entries are popped; CoalesceOrRedistribute unlatches, unpins and deletes them
    CoalesceOrRedistribute(leaf, &ancestors, transaction);
  }
  ReleaseAncestorsPage(&ancestors, true);

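Printing some information whenever a page is created or deleted makes it easier to follow what the program is doing; BPlusTree's Print method can also dump the contents of the B+ tree. Once latching is implemented, run the single-threaded tests first to confirm that the code is basically correct.

Problems encountered:

1. After the basic implementation, running the sequential test showed that the root node's pin_count was always 1 at the time it was deleted.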

  std::vector<RID> rids;
  for (auto key : keys) {
    rids.clear();
    index_key.SetFromInteger(key);
    tree.GetValue(index_key, &rids);
    EXPECT_EQ(rids.size(), 1);

    int64_t value = key & 0xFFFFFFFF;
    EXPECT_EQ(rids[0].GetSlotNum(), value);
  }
// pin count == 0 at this point
  int64_t start_key = 1;
  int64_t current_key = start_key;
  for (auto pair : tree) {
    (void)pair;
    current_key = current_key + 1;
  }
  EXPECT_EQ(current_key, keys.size() + 1);
// pin count == 1 at this point
  int64_t remove_scale = 9900;
  std::vector<int64_t> remove_keys;
  for (int64_t key = 1; key < remove_scale; key++) {
    remove_keys.push_back(key);
  }
  // std::random_shuffle(remove_keys.begin(), remove_keys.end());
  for (auto key : remove_keys) {
    index_key.SetFromInteger(key);
    tree.Remove(index_key, transaction);
  }

  start_key = 9900;
  current_key = start_key;
  int64_t size = 0;
  index_key.SetFromInteger(start_key);
  for (auto pair : tree) {
    (void)pair;
    current_key = current_key + 1;
    size = size + 1;
  }

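The pin count was 0 during GetValue but had become 1 by the time Remove ran. I eventually realized that for (auto pair : tree) implicitly calls the begin method, and that FindLeafPage fetched pages with FetchInternalPage but forgot to unpin them.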
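
That hidden call is easy to observe on a toy class. The sketch below is an assumption-laden illustration, not the lab's iterator: ToyTree wraps a vector, offers Begin/End in the project's naming style plus the lowercase forwards that range-based for looks up, and counts how often Begin runs.

```cpp
#include <cassert>
#include <utility>
#include <vector>

class ToyTree {
 public:
  explicit ToyTree(std::vector<int> keys) : keys_(std::move(keys)) {}

  // CamelCase API in the style of the project headers.
  std::vector<int>::const_iterator Begin() const {
    ++begin_calls_;
    return keys_.begin();
  }
  std::vector<int>::const_iterator End() const { return keys_.end(); }

  // Lowercase forwards: range-based for looks up begin() and end().
  std::vector<int>::const_iterator begin() const { return Begin(); }
  std::vector<int>::const_iterator end() const { return End(); }

  int BeginCalls() const { return begin_calls_; }

 private:
  std::vector<int> keys_;
  mutable int begin_calls_ = 0;
};

// Counts the keys with a range-based for loop, which calls begin() itself.
int CountKeys(const ToyTree &tree) {
  int n = 0;
  for (int key : tree) {
    (void)key;
    ++n;
  }
  return n;
}
```

Any page that Begin pins on the way down therefore gets pinned by the loop as well, even though no call to Begin appears in the test source.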

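2. The is_dirty flag passed to unpin
In the sequential test, deleting keys in order gave correct results, but deleting them in shuffled order gave wrong results.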

std::random_shuffle(remove_keys.begin(), remove_keys.end());
for (auto key : remove_keys) {
  index_key.SetFromInteger(key);
  tree.Remove(index_key, transaction);
}

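It turned out that ReleaseAncestorsPage always passed false to UnpinPage. When ReleaseAncestorsPage is called because a child is safe, is_dirty should indeed be false, since the current operation does not modify those nodes. When it is called after an insertion or deletion, however, is_dirty must be true.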
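
Why a wrong is_dirty flag loses data can be shown with a toy pool that has a single frame. Everything here is hypothetical (OneFramePool, WriteThenReload, an int as page content); the only point is that a frame is written back on eviction only if some unpin marked it dirty.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical one-frame buffer pool over an in-memory "disk".
class OneFramePool {
 public:
  int *Fetch(int page_id) {
    if (cached_id_ != page_id) {
      Evict();
      cached_id_ = page_id;
      value_ = disk_[page_id];  // pages that were never written read as 0
      dirty_ = false;
    }
    return &value_;
  }
  // The dirty flag is sticky: a later unpin with false must not clear it.
  void Unpin(bool is_dirty) { dirty_ = dirty_ || is_dirty; }

 private:
  void Evict() {
    if (cached_id_ != -1 && dirty_) {
      disk_[cached_id_] = value_;  // write back only dirty frames
    }
  }

  std::unordered_map<int, int> disk_;
  int cached_id_ = -1;
  int value_ = 0;
  bool dirty_ = false;
};

// Writes `v` into page `id`, unpins with the given flag, forces an eviction
// by touching another page, then reads page `id` back.
int WriteThenReload(OneFramePool *pool, int id, int v, bool is_dirty) {
  *pool->Fetch(id) = v;
  pool->Unpin(is_dirty);
  pool->Fetch(id + 1);  // evicts page `id`
  pool->Unpin(false);
  return *pool->Fetch(id);
}
```

This matches the symptom described above: as long as a modified page stays in memory nothing goes wrong, and the loss only shows once the access pattern causes that page to be evicted and re-read.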

INDEX_TEMPLATE_ARGUMENTS
bool BPLUSTREE_TYPE::InsertIntoLeaf(const KeyType &key, const ValueType &value, Transaction *transaction) {
  // walk down the tree to find the target leaf
  page_id_t page_id;
  Page *page;
  BPlusTreePage *tree_page;
  std::stack<Page *> ancestors;

  ancestors.emplace(nullptr);  // push the marker for the virtual root node
  page_id = root_page_id_;
  page = buffer_pool_manager_->FetchPage(page_id);
  page->WLatch();
  tree_page = reinterpret_cast<BPlusTreePage *>(page->GetData());

  while (!tree_page->IsLeafPage()) {
    ancestors.emplace(page);  // push the parent
    InternalPage *inner = reinterpret_cast<InternalPage *>(tree_page);
    page_id = inner->Lookup(key, comparator_);
    page = buffer_pool_manager_->FetchPage(page_id);
    page->WLatch();
    tree_page = reinterpret_cast<BPlusTreePage *>(page->GetData());
    if (tree_page->GetSize() < tree_page->GetMaxSize() - 1) {  // child is safe: unlatch and unpin the ancestors
      ReleaseAncestorsPage(&ancestors, false);                 // ancestors were not modified, so is_dirty is false
    }
  }
  ancestors.emplace(page);  // push the target leaf
  LeafPage *leaf = reinterpret_cast<LeafPage *>(tree_page);
  int old_size = leaf->GetSize();
  int new_size = leaf->Insert(key, value, comparator_);
  if (new_size >= leaf_max_size_) {  // leaf is full: split it
    LeafPage *new_leaf = Split(leaf);
    KeyType comp = new_leaf->KeyAt(0);
    InsertIntoParent(leaf, comp, new_leaf, transaction);
    buffer_pool_manager_->UnpinPage(new_leaf->GetPageId(), true);  // unpin the new leaf
  }
  ReleaseAncestorsPage(&ancestors, true);  // unlatch and unpin
  return new_size > old_size;
}

3. The node-safety check

// output of Print
Internal Page: 3 parent: -1
0: 1,128: 2,255: 4,382: 5,509: 6,638: 7,766: 8,

Leaf Page: 1 parent: 3 next: 2
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,

Leaf Page: 2 parent: 3 next: 4
128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,

Leaf Page: 4 parent: 3 next: 5
255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326,327,328,329,330,331,332,333,334,335,336,337,338,339,340,341,342,343,344,345,346,347,348,349,350,351,352,353,354,355,356,357,358,359,360,361,362,363,364,365,366,367,368,369,370,371,372,373,374,375,376,377,378,379,380,381,512,

Leaf Page: 5 parent: 3 next: 6
382,383,384,385,386,387,388,389,390,391,392,393,394,395,396,397,398,399,400,401,402,403,404,405,406,407,408,409,410,411,412,413,414,415,416,417,418,419,420,421,422,423,424,425,426,427,428,429,430,431,432,433,434,435,436,437,438,439,440,441,442,443,444,445,446,447,448,449,450,451,452,453,454,455,456,457,458,459,460,461,462,463,464,465,466,467,468,469,470,471,472,473,474,475,476,477,478,479,480,481,482,483,484,485,486,487,488,489,490,491,492,493,494,495,496,497,498,499,500,501,502,503,504,505,506,507,508,632,

Leaf Page: 6 parent: 3 next: 7
509,510,511,513,514,515,516,517,518,519,520,521,522,523,524,525,526,527,528,529,530,531,532,533,534,535,536,537,538,539,540,541,542,543,544,545,546,547,548,549,550,551,552,553,554,555,556,557,558,559,560,561,562,563,564,565,566,567,568,569,570,571,572,573,574,575,576,577,578,579,580,581,582,583,584,585,586,587,588,589,590,591,592,593,594,595,596,597,598,599,600,601,602,603,604,605,606,607,608,609,610,611,612,613,614,615,616,617,618,619,620,621,622,623,624,625,626,627,628,629,630,631,633,634,635,636,637,749,

Leaf Page: 7 parent: 3 next: 8
638,639,640,641,642,643,644,645,646,647,648,649,650,651,652,653,654,655,656,657,658,659,660,661,662,663,664,665,666,667,668,669,670,671,672,673,674,675,676,677,678,679,680,681,682,683,684,685,686,687,688,689,690,691,692,693,694,695,696,697,698,699,700,701,702,703,704,705,706,707,708,709,710,711,712,713,714,715,716,717,718,719,720,721,722,723,724,725,726,727,728,729,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,746,747,748,750,751,752,753,754,755,756,757,758,759,760,761,762,763,764,765,

Leaf Page: 8 parent: 3 next: -1
766,767,768,769,770,771,772,773,774,775,776,777,778,779,780,781,782,783,784,785,786,787,788,789,790,791,792,793,794,795,796,797,798,799,800,801,802,803,804,805,806,807,808,809,810,811,812,813,814,815,816,817,818,819,820,821,822,823,824,825,826,827,828,829,830,831,832,833,834,835,836,837,838,839,840,841,842,843,844,845,846,847,848,849,850,851,852,853,854,855,856,857,858,859,860,861,862,863,864,865,866,867,868,869,870,871,872,873,874,875,876,877,878,879,880,881,882,883,884,885,886,887,888,889,890,891,892,893,894,895,896,897,898,899,900,901,902,903,904,905,906,907,908,909,910,911,912,913,914,915,916,917,918,919,920,921,922,923,924,925,926,927,928,929,930,931,932,933,934,935,936,937,938,939,940,941,942,943,944,945,946,947,948,949,950,951,952,953,954,955,956,957,958,959,960,961,962,963,964,965,966,967,968,969,970,971,972,973,974,975,976,977,978,979,980,981,982,983,984,985,986,987,988,989,990,991,992,993,994,995,996,997,998,999,

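Under concurrent execution, some keys (for example 512) always ended up in a leaf that holds smaller keys. Printing the B+ tree at every split showed that the error always occurred around a split. Checking InsertIntoLeaf then revealed that the code deciding whether a node is safe was wrong: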

if (tree_page->GetSize() < tree_page->GetMaxSize()) {
  ReleaseAncestorsPage(&ancestors, false);
}

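If the current node's size is exactly max_size - 1, inserting one more element triggers a split, which in turn modifies the parent. The parent's latch must therefore not be released in that case.
The fix: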

if (tree_page->GetSize() < tree_page->GetMaxSize()-1) {
  ReleaseAncestorsPage(&ancestors, false);
}
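
The boundary case is worth pinning down as a tiny predicate. This sketch assumes the split policy used in this post, where a node splits as soon as its size reaches max_size after the insert; IsInsertSafe is a name I chose for the illustration.

```cpp
#include <cassert>

// A node is safe for insertion if adding one entry cannot trigger a split,
// assuming a split happens once the size reaches max_size.
bool IsInsertSafe(int size, int max_size) { return size + 1 < max_size; }
```

With max_size 5, a node of size 3 is safe, while a node of size 4 is not: the insert brings it to 5, it splits, and the parent gets modified, so the parent's latch has to be kept.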

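4. The unpin and delete assertions fail in MixTest
I later found that the Page's id and the tree page's id sometimes differ (in the failing run the Page reported id 0), so unpin now always uses the tree page's page id.
That did not fully fix the problem, though: the Page's data looks odd, with id = 0 and is_dirty equal to 123.
I have not found the cause yet and will look into it when I have time. All tests still pass regardless.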
