1.Slice
// Slice is the basic container RocksDB uses to pass data around.
// It has just two member variables, data_ and size_
// (a pointer to the key/value bytes and their length), plus a handful of helper functions.
class Slice {
public:
// Create an empty slice.
Slice() : data_(""), size_(0) { }
// Create a slice that refers to d[0,n-1].
Slice(const char* d, size_t n) : data_(d), size_(n) { }
// Create a slice that refers to the contents of "s"
/* implicit */
Slice(const std::string& s) : data_(s.data()), size_(s.size()) { }
// Create a slice that refers to s[0,strlen(s)-1]
/* implicit */
Slice(const char* s) : data_(s), size_(strlen(s)) { }
// Create a single slice from SliceParts using buf as storage.
// buf must exist as long as the returned Slice exists.
Slice(const struct SliceParts& parts, std::string* buf);
// Return a pointer to the beginning of the referenced data
const char* data() const { return data_; }
// Return the length (in bytes) of the referenced data
size_t size() const { return size_; }
// Return true iff the length of the referenced data is zero
bool empty() const { return size_ == 0; }
// Return the ith byte in the referenced data.
// REQUIRES: n < size()
char operator[](size_t n) const {
assert(n < size());
return data_[n];
}
// Change this slice to refer to an empty array
void clear() { data_ = ""; size_ = 0; }
// Drop the first "n" bytes from this slice.
void remove_prefix(size_t n) {
assert(n <= size());
data_ += n;
size_ -= n;
}
// Return a string that contains the copy of the referenced data.
std::string ToString(bool hex = false) const;
// Three-way comparison. Returns value:
// < 0 iff "*this" < "b",
// == 0 iff "*this" == "b",
// > 0 iff "*this" > "b"
int compare(const Slice& b) const;
// Return true iff "x" is a prefix of "*this"
bool starts_with(const Slice& x) const {
return ((size_ >= x.size_) &&
(memcmp(data_, x.data_, x.size_) == 0));
}
// Compare two slices and returns the first byte where they differ
size_t difference_offset(const Slice& b) const;
// private: make these public for rocksdbjni access
const char* data_;
size_t size_;
// Intentionally copyable
};
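A quick usage sketch of the public API (hypothetical example code, not from the RocksDB sources) that also shows the main pitfall: a Slice never owns its bytes.

#include <cassert>
#include <string>
#include "rocksdb/slice.h"

int main() {
  std::string owner = "user:1234";
  rocksdb::Slice s(owner);      // refers to owner's bytes; nothing is copied
  assert(s.size() == 9);
  assert(s.starts_with("user:"));

  s.remove_prefix(5);           // the slice now refers to "1234"
  assert(s.ToString() == "1234");

  // Pitfall: Slice does not own its data. If `owner` is destroyed or
  // mutated while `s` is still alive, `s` dangles.
  return 0;
}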
2.WriteOptions
struct WriteOptions {
// Default: false
bool sync; // whether the write must be synced to stable storage before returning
// If true, writes will not first go to the write ahead log,
// and the write may be lost after a crash.
bool disableWAL; // whether to skip writing the WAL
// The option is deprecated. It's not used anymore.
uint64_t timeout_hint_us; // deadline hint for when this write must complete (deprecated)
// If true and if user is trying to write to column families that don't exist
// (they were dropped), ignore the write (don't return an error). If there
// are multiple writes in a WriteBatch, other writes will succeed.
// Default: false
bool ignore_missing_column_families;
WriteOptions()
: sync(false),
disableWAL(false),
timeout_hint_us(0),
ignore_missing_column_families(false) {}
};
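A short sketch of how these options are typically set (hypothetical example; `db` is an open rocksdb::DB*, and returned Statuses are ignored for brevity):

#include "rocksdb/db.h"

// Sketch: a durable write vs. a fast, non-durable write.
void WriteExamples(rocksdb::DB* db) {
  rocksdb::WriteOptions durable;
  durable.sync = true;    // fsync/fdatasync the WAL before returning
  db->Put(durable, "k1", "v1");

  rocksdb::WriteOptions fast;
  fast.disableWAL = true; // skip the WAL; the write may be lost on a crash
  db->Put(fast, "k2", "v2");
}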
3.WriteBatch
// On the write path RocksDB batches updates through the WriteBatch class.
// WriteBatch has a single member variable, a string holding a serialized
// sequence of records. The string is built in a fixed format, and records are
// parsed back out one by one in that same format when they are consumed.
// This member, rep_, stores every record of the batch operation in the
// following format:
// WriteBatch::rep_ :=
// sequence: fixed64
// count: fixed32
// data: record[count]
// record :=
// kTypeValue varstring varstring
// kTypeDeletion varstring
// kTypeSingleDeletion varstring
// kTypeMerge varstring varstring
// kTypeColumnFamilyValue varint32 varstring varstring
// kTypeColumnFamilyDeletion varint32 varstring
// kTypeColumnFamilySingleDeletion varint32 varstring
// kTypeColumnFamilyMerge varint32 varstring varstring
// varstring :=
// len: varint32
// data: uint8[len]
// The string thus begins with a 12-byte header: an 8-byte sequence number
// followed by a 4-byte record count, which is why the class defines
// static const size_t kHeader = 12;
// as the minimum length of rep_. After the header come the records, one by one:
// an insert record is kTypeValue + key length + key + value length + value;
// a delete record is kTypeDeletion + key length + key.
// The write and delete operations of this class are implemented by delegating
// to the static methods of WriteBatchInternal.
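To make the layout concrete, here is a minimal standalone sketch (not RocksDB code) that decodes the 12-byte header. It assumes a little-endian host, which matches the fixed-width encoding that RocksDB's EncodeFixed64/EncodeFixed32 helpers produce on common platforms:

#include <cstdint>
#include <cstring>
#include <string>

// Decode the WriteBatch header: an 8-byte sequence number followed by a
// 4-byte record count, both little-endian fixed-width integers.
bool DecodeBatchHeader(const std::string& rep,
                       uint64_t* sequence, uint32_t* count) {
  static const size_t kHeader = 12;
  if (rep.size() < kHeader) return false;
  std::memcpy(sequence, rep.data(), 8);   // fixed64 at offset 0
  std::memcpy(count, rep.data() + 8, 4);  // fixed32 at offset 8
  return true;
}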
class WriteBatch : public WriteBatchBase {
public:
explicit WriteBatch(size_t reserved_bytes = 0);
~WriteBatch();
using WriteBatchBase::Put;
// Store the mapping "key->value" in the database.
void Put(ColumnFamilyHandle* column_family, const Slice& key,
const Slice& value) override;
void Put(const Slice& key, const Slice& value) override {
Put(nullptr, key, value);
}
// Variant of Put() that gathers output like writev(2). The key and value
// that will be written to the database are concatenations of arrays of
// slices.
void Put(ColumnFamilyHandle* column_family, const SliceParts& key,
const SliceParts& value) override;
void Put(const SliceParts& key, const SliceParts& value) override {
Put(nullptr, key, value);
}
using WriteBatchBase::Delete;
// If the database contains a mapping for "key", erase it. Else do nothing.
void Delete(ColumnFamilyHandle* column_family, const Slice& key) override;
void Delete(const Slice& key) override { Delete(nullptr, key); }
// variant that takes SliceParts
void Delete(ColumnFamilyHandle* column_family,
const SliceParts& key) override;
void Delete(const SliceParts& key) override { Delete(nullptr, key); }
using WriteBatchBase::SingleDelete;
// If the database contains a mapping for "key", erase it. Expects that the
// key was not overwritten. Else do nothing.
void SingleDelete(ColumnFamilyHandle* column_family,
const Slice& key) override;
void SingleDelete(const Slice& key) override { SingleDelete(nullptr, key); }
// variant that takes SliceParts
void SingleDelete(ColumnFamilyHandle* column_family,
const SliceParts& key) override;
void SingleDelete(const SliceParts& key) override {
SingleDelete(nullptr, key);
}
// ... (Merge, PutLogData, Iterate, Data and other members elided)
};
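Typical usage of the public API, as a sketch (hypothetical example code): several updates are staged in one batch and applied atomically with a single Write call.

#include "rocksdb/db.h"
#include "rocksdb/write_batch.h"

// Sketch: move a value from one key to another, atomically.
rocksdb::Status AtomicMove(rocksdb::DB* db,
                           const rocksdb::Slice& from,
                           const rocksdb::Slice& to) {
  std::string value;
  rocksdb::Status s = db->Get(rocksdb::ReadOptions(), from, &value);
  if (!s.ok()) return s;

  rocksdb::WriteBatch batch;
  batch.Delete(from);   // appended to rep_ as a kTypeDeletion record
  batch.Put(to, value); // appended to rep_ as a kTypeValue record
  return db->Write(rocksdb::WriteOptions(), &batch); // one atomic write
}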
4.WriteBatchInternal
This class's job is to manipulate the WriteBatch's underlying string: get/set the sequence number, get/set the record count, insert a WriteBatch into the memtables, and so on. All of its member functions are declared static:
class WriteBatchInternal {
public:
// WriteBatch methods with column_family_id instead of ColumnFamilyHandle*
static void Put(WriteBatch* batch, uint32_t column_family_id,
const Slice& key, const Slice& value);
static void Put(WriteBatch* batch, uint32_t column_family_id,
const SliceParts& key, const SliceParts& value);
static void Delete(WriteBatch* batch, uint32_t column_family_id,
const SliceParts& key);
static void Delete(WriteBatch* batch, uint32_t column_family_id,
const Slice& key);
static void SingleDelete(WriteBatch* batch, uint32_t column_family_id,
const SliceParts& key);
static void SingleDelete(WriteBatch* batch, uint32_t column_family_id,
const Slice& key);
static void Merge(WriteBatch* batch, uint32_t column_family_id,
const Slice& key, const Slice& value);
static void Merge(WriteBatch* batch, uint32_t column_family_id,
const SliceParts& key, const SliceParts& value);
// Return the number of entries in the batch.
static int Count(const WriteBatch* batch);
// Set the count for the number of entries in the batch.
static void SetCount(WriteBatch* batch, int n);
// Return the sequence number for the start of this batch.
static SequenceNumber Sequence(const WriteBatch* batch);
// Store the specified number as the sequence number for the start of
// this batch.
static void SetSequence(WriteBatch* batch, SequenceNumber seq);
// Returns the offset of the first entry in the batch.
// This offset is only valid if the batch is not empty.
static size_t GetFirstOffset(WriteBatch* batch);
static Slice Contents(const WriteBatch* batch) {
return Slice(batch->rep_);
}
static size_t ByteSize(const WriteBatch* batch) {
return batch->rep_.size();
}
static void SetContents(WriteBatch* batch, const Slice& contents);
// Inserts batch entries into memtable
// If dont_filter_deletes is false AND options.filter_deletes is true,
// then --> Drops deletes in batch if db->KeyMayExist returns false
// If ignore_missing_column_families == true, updates in the WriteBatch
// that reference a non-existing column family are ignored.
// However, if ignore_missing_column_families == false, any update
// referencing a non-existing column family will return an
// InvalidArgument() failure.
//
// If log_number is non-zero, the memtable will be updated only if
// memtables->GetLogNumber() >= log_number
static Status InsertInto(const WriteBatch* batch,
ColumnFamilyMemTables* memtables,
bool ignore_missing_column_families = false,
uint64_t log_number = 0, DB* db = nullptr,
const bool dont_filter_deletes = true);
static void Append(WriteBatch* dst, const WriteBatch* src);
};
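Count, SetCount and Append reduce to fixed-width reads and writes on rep_. A standalone sketch of the idea follows (hypothetical helper names; the real code uses RocksDB's EncodeFixed32/DecodeFixed32, and a little-endian host is assumed here to match that encoding):

#include <cstdint>
#include <cstring>
#include <string>

static const size_t kHeader = 12;  // 8-byte sequence + 4-byte count

uint32_t BatchCount(const std::string& rep) {
  uint32_t n;
  std::memcpy(&n, rep.data() + 8, sizeof(n));  // fixed32 at offset 8
  return n;
}

void SetBatchCount(std::string* rep, uint32_t n) {
  // rep must already hold at least the 12-byte header
  std::memcpy(&(*rep)[8], &n, sizeof(n));
}

// Append: concatenate src's records after dst's and sum the two counts.
// src's 12-byte header is skipped, so its sequence number is discarded.
void AppendBatch(std::string* dst, const std::string& src) {
  SetBatchCount(dst, BatchCount(*dst) + BatchCount(src));
  dst->append(src.data() + kHeader, src.size() - kHeader);
}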
5.MemTableInserter
// This class inserts normal key-value records and deletion records into the
// memtable. An instance of it is passed as the handler to the function that
// parses rep_ (WriteBatch::Iterate).
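To see how such a handler is driven, here is a sketch against the public API: WriteBatch::Iterate() parses rep_ record by record and invokes one Handler callback per record. WriteBatchInternal::InsertInto() does exactly this, with a MemTableInserter as the handler. The toy handler below just prints each record:

#include <iostream>
#include "rocksdb/write_batch.h"

// A toy handler that prints each record instead of inserting it into a
// memtable (hypothetical example code).
class PrintingHandler : public rocksdb::WriteBatch::Handler {
 public:
  void Put(const rocksdb::Slice& key, const rocksdb::Slice& value) override {
    std::cout << "Put(" << key.ToString() << ", " << value.ToString() << ")\n";
  }
  void Delete(const rocksdb::Slice& key) override {
    std::cout << "Delete(" << key.ToString() << ")\n";
  }
};

int main() {
  rocksdb::WriteBatch batch;
  batch.Put("k", "v");
  batch.Delete("old");
  PrintingHandler handler;
  batch.Iterate(&handler);  // parses rep_ and dispatches record by record
  return 0;
}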
class MemTableInserter : public WriteBatch::Handler {
public:
SequenceNumber sequence_;
ColumnFamilyMemTables* cf_mems_;
bool ignore_missing_column_families_;
uint64_t log_number_;
DBImpl* db_;
const bool dont_filter_deletes_;
MemTableInserter(SequenceNumber sequence, ColumnFamilyMemTables* cf_mems,
bool ignore_missing_column_families, uint64_t log_number,
DB* db, const bool dont_filter_deletes)
: sequence_(sequence),
cf_mems_(cf_mems),
ignore_missing_column_families_(ignore_missing_column_families),
log_number_(log_number),
db_(reinterpret_cast<DBImpl*>(db)),
dont_filter_deletes_(dont_filter_deletes) {
assert(cf_mems);
if (!dont_filter_deletes_) {
assert(db_);
}
}
bool SeekToColumnFamily(uint32_t column_family_id, Status* s) {
// We are only allowed to call this from a single-threaded write thread
// (or while holding DB mutex)
bool found = cf_mems_->Seek(column_family_id);
if (!found) {
if (ignore_missing_column_families_) {
*s = Status::OK();
} else {
*s = Status::InvalidArgument(
"Invalid column family specified in write batch");
}
return false;
}
if (log_number_ != 0 && log_number_ < cf_mems_->GetLogNumber()) {
// This is true only in recovery environment (log_number_ is always 0 in
// non-recovery, regular write code-path)
// * If log_number_ < cf_mems_->GetLogNumber(), this means that column
// family already contains updates from this log. We can't apply updates
// twice because of update-in-place or merge workloads -- ignore the
// update
*s = Status::OK();
return false;
}
return true;
}
virtual Status PutCF(uint32_t column_family_id, const Slice& key,
const Slice& value) override {
Status seek_status;
// if the column family passed in cannot be found in the memtables, bump the sequence and return
if (!SeekToColumnFamily(column_family_id, &seek_status)) {
++sequence_;
return seek_status;
}
// fetch the memtable for this column family
MemTable* mem = cf_mems_->GetMemTable();
auto* moptions = mem->GetMemTableOptions();
// if the memtable does not support in-place updates, simply append the record
if (!moptions->inplace_update_support) {
mem->Add(sequence_, kTypeValue, key, value);
// otherwise update the record in place
} else if (moptions->inplace_callback == nullptr) {
mem->Update(sequence_, key, value);
RecordTick(moptions->statistics, NUMBER_KEYS_UPDATED);
} else {
// or update it through the user-supplied callback
if (mem->UpdateCallback(sequence_, key, value)) {
} else {
// if the key is not in the memtable, fetch it from the SSTs, update, and add
// key not found in memtable. Do sst get, update, add
SnapshotImpl read_from_snapshot;
read_from_snapshot.number_ = sequence_;
ReadOptions ropts;
ropts.snapshot = &read_from_snapshot;
std::string prev_value;
std::string merged_value;
auto cf_handle = cf_mems_->GetColumnFamilyHandle();
if (cf_handle == nullptr) {
cf_handle = db_->DefaultColumnFamily();
}
// call DB::Get at the snapshot to fetch the key's previous value
Status s = db_->Get(ropts, cf_handle, key, &prev_value);
char* prev_buffer = const_cast<char*>(prev_value.c_str());
uint32_t prev_size = static_cast<uint32_t>(prev_value.size());
auto status = moptions->inplace_callback(s.ok() ? prev_buffer : nullptr,
s.ok() ? &prev_size : nullptr,
value, &merged_value);
if (status == UpdateStatus::UPDATED_INPLACE) {
// prev_value is updated in-place with final value.
mem->Add(sequence_, kTypeValue, key, Slice(prev_buffer, prev_size));
RecordTick(moptions->statistics, NUMBER_KEYS_WRITTEN);
} else if (status == UpdateStatus::UPDATED) {
// merged_value contains the final value.
mem->Add(sequence_, kTypeValue, key, Slice(merged_value));
RecordTick(moptions->statistics, NUMBER_KEYS_WRITTEN);
}
}
}
// Since all Puts are logged in transaction logs (if enabled), always bump
// sequence number. Even if the update eventually fails and does not result
// in memtable add/update.
sequence_++;
cf_mems_->CheckMemtableFull();
return Status::OK();
}
virtual Status DeleteCF(uint32_t column_family_id,
const Slice& key) override {
Status seek_status;
if (!SeekToColumnFamily(column_family_id, &seek_status)) {
++sequence_;
return seek_status;
}
MemTable* mem = cf_mems_->GetMemTable();
auto* moptions = mem->GetMemTableOptions();
if (!dont_filter_deletes_ && moptions->filter_deletes) {
SnapshotImpl read_from_snapshot;
read_from_snapshot.number_ = sequence_;
ReadOptions ropts;
ropts.snapshot = &read_from_snapshot;
std::string value;
auto cf_handle = cf_mems_->GetColumnFamilyHandle();
if (cf_handle == nullptr) {
cf_handle = db_->DefaultColumnFamily();
}
if (!db_->KeyMayExist(ropts, cf_handle, key, &value)) {
RecordTick(moptions->statistics, NUMBER_FILTERED_DELETES);
return Status::OK();
}
}
mem->Add(sequence_, kTypeDeletion, key, Slice());
sequence_++;
cf_mems_->CheckMemtableFull();
return Status::OK();
}
virtual Status SingleDeleteCF(uint32_t column_family_id,
const Slice& key) override {
Status seek_status;
if (!SeekToColumnFamily(column_family_id, &seek_status)) {
++sequence_;
return seek_status;
}
MemTable* mem = cf_mems_->GetMemTable();
auto* moptions = mem->GetMemTableOptions();
if (!dont_filter_deletes_ && moptions->filter_deletes) {
SnapshotImpl read_from_snapshot;
read_from_snapshot.number_ = sequence_;
ReadOptions ropts;
ropts.snapshot = &read_from_snapshot;
std::string value;
auto cf_handle = cf_mems_->GetColumnFamilyHandle();
if (cf_handle == nullptr) {
cf_handle = db_->DefaultColumnFamily();
}
if (!db_->KeyMayExist(ropts, cf_handle, key, &value)) {
RecordTick(moptions->statistics, NUMBER_FILTERED_DELETES);
return Status::OK();
}
}
mem->Add(sequence_, kTypeSingleDeletion, key, Slice());
sequence_++;
cf_mems_->CheckMemtableFull();
return Status::OK();
}
virtual Status MergeCF(uint32_t column_family_id, const Slice& key,
const Slice& value) override {
Status seek_status;
if (!SeekToColumnFamily(column_family_id, &seek_status)) {
++sequence_;
return seek_status;
}
MemTable* mem = cf_mems_->GetMemTable();
auto* moptions = mem->GetMemTableOptions();
bool perform_merge = false;
if (moptions->max_successive_merges > 0 && db_ != nullptr) {
LookupKey lkey(key, sequence_);
// Count the number of successive merges at the head
// of the key in the memtable
size_t num_merges = mem->CountSuccessiveMergeEntries(lkey);
if (num_merges >= moptions->max_successive_merges) {
perform_merge = true;
}
}
if (perform_merge) {
// 1) Get the existing value
std::string get_value;
// Pass in the sequence number so that we also include previous merge
// operations in the same batch.
SnapshotImpl read_from_snapshot;
read_from_snapshot.number_ = sequence_;
ReadOptions read_options;
read_options.snapshot = &read_from_snapshot;
auto cf_handle = cf_mems_->GetColumnFamilyHandle();
if (cf_handle == nullptr) {
cf_handle = db_->DefaultColumnFamily();
}
db_->Get(read_options, cf_handle, key, &get_value);
Slice get_value_slice = Slice(get_value);
// 2) Apply this merge
auto merge_operator = moptions->merge_operator;
assert(merge_operator);
std::deque<std::string> operands;
operands.push_front(value.ToString());
std::string new_value;
bool merge_success = false;
{
StopWatchNano timer(Env::Default(), moptions->statistics != nullptr);
PERF_TIMER_GUARD(merge_operator_time_nanos);
merge_success = merge_operator->FullMerge(
key, &get_value_slice, operands, &new_value, moptions->info_log);
RecordTick(moptions->statistics, MERGE_OPERATION_TOTAL_TIME,
timer.ElapsedNanos());
}
if (!merge_success) {
// Failed to merge!
RecordTick(moptions->statistics, NUMBER_MERGE_FAILURES);
// Store the delta in memtable
perform_merge = false;
} else {
// 3) Add value to memtable
mem->Add(sequence_, kTypeValue, key, new_value);
}
}
if (!perform_merge) {
// Add merge operator to memtable
mem->Add(sequence_, kTypeMerge, key, value);
}
sequence_++;
cf_mems_->CheckMemtableFull();
return Status::OK();
}
};
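The inplace_callback invoked in PutCF above is a user hook configured via Options (inplace_update_support plus inplace_callback). Below is a hypothetical callback whose signature matches the call site above; it is illustrative only, not RocksDB's own code:

#include <cstring>
#include <string>
#include "rocksdb/options.h"

// Hypothetical in-place update hook matching the call site in PutCF above.
// If the new value fits in the existing buffer, overwrite it in place;
// otherwise hand the final value back through *merged_value.
rocksdb::UpdateStatus MyInplaceCallback(char* existing_value,
                                        uint32_t* existing_value_size,
                                        rocksdb::Slice delta_value,
                                        std::string* merged_value) {
  if (existing_value != nullptr &&
      delta_value.size() <= *existing_value_size) {
    std::memcpy(existing_value, delta_value.data(), delta_value.size());
    *existing_value_size = static_cast<uint32_t>(delta_value.size());
    return rocksdb::UpdateStatus::UPDATED_INPLACE;
  }
  merged_value->assign(delta_value.data(), delta_value.size());
  return rocksdb::UpdateStatus::UPDATED;
}

It would be installed with options.inplace_update_support = true and options.inplace_callback = MyInplaceCallback.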
Tracing a write from the C API: rocksdb_put() ends up in DB::Put().
rocksdb_put(db, writeoptions, key, strlen(key), value, strlen(value) + 1, &err);
// which calls
SaveError(errptr, db->rep->Put(options->rep, Slice(key, keylen), Slice(val, vallen)));
// which in turn calls
db->rep->Put(options->rep, Slice(key, keylen), Slice(val, vallen))
Status DB::Put(const WriteOptions& opt, ColumnFamilyHandle* column_family,
const Slice& key, const Slice& value) {
// Pre-allocate size of write batch conservatively.
// 8 bytes are taken by header, 4 bytes for count, 1 byte for type,
// and we allocate 11 extra bytes for key length, as well as value length.
WriteBatch batch(key.size() + value.size() + 24); // build the batch; 8 + 4 + 1 + 11 = 24 bytes of overhead
batch.Put(column_family, key, value);
return Write(opt, &batch); // write the batch
}
// First the WriteBatch is populated:
void WriteBatch::Put(ColumnFamilyHandle* column_family, const Slice& key,
const Slice& value) {
WriteBatchInternal::Put(this, GetColumnFamilyID(column_family), key, value);
}
// which actually delegates to WriteBatchInternal::Put:
void WriteBatchInternal::Put(WriteBatch* b, uint32_t column_family_id,
const Slice& key, const Slice& value) {
// increment the batch's record count
WriteBatchInternal::SetCount(b, WriteBatchInternal::Count(b) + 1);
// records in the default column family (id 0) omit the column family id
if (column_family_id == 0) {
b->rep_.push_back(static_cast<char>(kTypeValue)); // append the record type
} else {
b->rep_.push_back(static_cast<char>(kTypeColumnFamilyValue)); // append the record type
PutVarint32(&b->rep_, column_family_id);
}
PutLengthPrefixedSlice(&b->rep_, key); // append key length and key bytes
PutLengthPrefixedSlice(&b->rep_, value); // append value length and value bytes
}
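PutLengthPrefixedSlice writes a varint32 length followed by the raw bytes. A self-contained sketch of that encoding (it mirrors RocksDB's varint format), plus the resulting record bytes for Put("key", "value") in the default column family (kTypeValue is 0x1 in RocksDB's ValueType enum):

#include <cstdint>
#include <string>

// Sketch of varint32 encoding: 7 bits per byte, high bit set on every
// byte except the last.
void PutVarint32(std::string* dst, uint32_t v) {
  while (v >= 128) {
    dst->push_back(static_cast<char>(v | 128));
    v >>= 7;
  }
  dst->push_back(static_cast<char>(v));
}

void PutLengthPrefixedSlice(std::string* dst, const std::string& s) {
  PutVarint32(dst, static_cast<uint32_t>(s.size()));
  dst->append(s);
}

// After Put("key", "value") in the default column family, the record
// appended to rep_ is:
//   0x01 0x03 'k' 'e' 'y' 0x05 'v' 'a' 'l' 'u' 'e'
//   type len  ---key---   len  -----value-----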
// The filled WriteBatch is then written to the database through DB::Write:
virtual Status Write(const WriteOptions& options, WriteBatch* updates) = 0;

Status DBImpl::Write(const WriteOptions& write_options, WriteBatch* my_batch) {
return WriteImpl(write_options, my_batch, nullptr);
}
// the actual write is implemented in WriteImpl
Status DBImpl::WriteImpl(const WriteOptions& write_options,
WriteBatch* my_batch, WriteCallback* callback) {
if (my_batch == nullptr) {
return Status::Corruption("Batch is nullptr!");
}
if (write_options.timeout_hint_us != 0) {
return Status::InvalidArgument("timeout_hint_us is deprecated");
}
Status status;
bool callback_failed = false;
bool xfunc_attempted_write = false;
/* cross-functional test hook: a test may intercept and perform the write itself */
XFUNC_TEST("transaction", "transaction_xftest_write_impl",
xf_transaction_write1, xf_transaction_write, write_options,
db_options_, my_batch, callback, this, &status,
&xfunc_attempted_write);
if (xfunc_attempted_write) {
// Test already did the write
return status;
}
PERF_TIMER_GUARD(write_pre_and_post_process_time);
WriteThread::Writer w; // construct the write task
w.batch = my_batch; // the data to write
w.sync = write_options.sync; // whether to fsync/fdatasync the WAL (transaction log)
w.disableWAL = write_options.disableWAL; // whether to skip the WAL
w.in_batch_group = false;
// in_batch_group is the interesting field. Internally, RocksDB batches user
// writes as aggressively as it can, using a queue of WriteThread::Writer*
// inside write_thread_. When the task at the head of the queue is about to be
// processed, BuildBatchGroup() gathers the writes queued behind it into one
// BatchGroup and commits them to the database in a single shot.
w.done = false; // set once the write has completed
w.has_callback = (callback != nullptr) ? true : false;
if (!write_options.disableWAL) {
// record the number of Write calls that request WAL
RecordTick(stats_, WRITE_WITH_WAL);
}
// stopwatch for write timing; used for DB statistics
StopWatch write_sw(env_, db_options_.statistics.get(), DB_WRITE);
// Enqueue the write task @w on the write queue and sleep on mutex_ until:
// 1) the write had a timeout set and it expired; or
// 2) every task ahead of @w has finished and @w is at the head of the queue; or
// 3) @w was completed by another writer thread.
// Case 3 means the task was merged into a preceding WriteBatchGroup, in which
// case @w is marked in_batch_group. Interestingly, if JoinBatchGroup() wakes
// up on a timeout and finds in_batch_group set, it goes back to waiting:
// another thread has already claimed the task for a BatchGroup that is about
// to be written to the database.
write_thread_.JoinBatchGroup(&w); // enqueue the Writer carrying our batch
if (w.done) {
// write was done by someone else, no need to grab mutex
RecordTick(stats_, WRITE_DONE_BY_OTHER);
return w.status;
}
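JoinBatchGroup and EnterAsBatchGroupLeader implement a leader/follower group commit. A simplified, hypothetical sketch of the pattern follows (RocksDB's actual WriteThread is more elaborate: it handles timeouts and tracks a last_writer):

#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <vector>

struct Writer {
  std::string batch;  // stand-in for a WriteBatch
  bool done = false;
};

class GroupCommitQueue {
 public:
  // Block until this writer is either completed by another leader or
  // reaches the head of the queue. Returns true if the caller became
  // the leader and must commit the group itself.
  bool Join(Writer* w) {
    std::unique_lock<std::mutex> lock(mu_);
    queue_.push_back(w);
    while (!w->done && queue_.front() != w) {
      cv_.wait(lock);
    }
    return !w->done;
  }

  // Leader only: snapshot the currently queued writers as one group.
  std::vector<Writer*> CollectGroup() {
    std::unique_lock<std::mutex> lock(mu_);
    return std::vector<Writer*>(queue_.begin(), queue_.end());
  }

  // Leader only: after committing the merged batch, mark the group done
  // and wake the followers; the next queued writer becomes the new leader.
  void CompleteGroup(const std::vector<Writer*>& group) {
    std::unique_lock<std::mutex> lock(mu_);
    for (Writer* w : group) {
      w->done = true;
      queue_.pop_front();  // group members are a prefix of the queue
    }
    cv_.notify_all();
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<Writer*> queue_;
};

A writer calls Join(); on true it concatenates every batch from CollectGroup(), performs a single WAL append and memtable insert, then calls CompleteGroup(). On false, its batch was already committed by another leader, which corresponds to the WRITE_DONE_BY_OTHER path above.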
// else we are the leader of the write batch group
WriteContext context;
mutex_.Lock();
// if the WAL is enabled, count the write in the DB stats
if (!write_options.disableWAL) {
default_cf_internal_stats_->AddDBStats(InternalStats::WRITE_WITH_WAL, 1);
}
// we were not absorbed into another group: we write our own batch as group leader
RecordTick(stats_, WRITE_DONE_BY_SELF);
default_cf_internal_stats_->AddDBStats(InternalStats::WRITE_DONE_BY_SELF, 1);
// Once reaches this point, the current writer "w" will try to do its write
// job. It may also pick up some of the remaining writers in the "writers_"
// when it finds suitable, and finish them in the same write batch.
// This is how a write job could be done by the other writer.
assert(!single_column_family_mode_ ||
versions_->GetColumnFamilySet()->NumberOfColumnFamilies() == 1);
// compute the maximum total WAL size; if max_total_wal_size is 0 it defaults
// to 4x the total memtable budget
uint64_t max_total_wal_size = (db_options_.max_total_wal_size == 0)
? 4 * max_total_in_memory_state_
: db_options_.max_total_wal_size;
if (UNLIKELY(!single_column_family_mode_) &&
alive_log_files_.begin()->getting_flushed == false &&
total_log_size_ > max_total_wal_size) {
// If there are multiple column families, the memtable backed by the oldest
// live WAL has not yet been flushed to disk, and the total log size exceeds
// the configured maximum, then new memtables must be allocated and the old
// immutable memtables flushed to disk.
uint64_t flush_column_family_if_log_file = alive_log_files_.begin()->number; // number of the oldest live WAL
alive_log_files_.begin()->getting_flushed = true;
Log(InfoLogLevel::INFO_LEVEL, db_options_.info_log,
"Flushing all column families with data in WAL number %" PRIu64
". Total log size is %" PRIu64 " while max_total_wal_size is %" PRIu64,
flush_column_family_if_log_file, total_log_size_, max_total_wal_size);
// no need to refcount because drop is happening in write thread, so can't
// happen while we're in the write thread
for (auto cfd : *versions_->GetColumnFamilySet()) {
if (cfd->IsDropped()) {
continue;
}
// every column family whose log number is <= the oldest live WAL's number
// must switch to a new memtable
if (cfd->GetLogNumber() <= flush_column_family_if_log_file) {
// allocate a new memtable for this column family
status = SwitchMemtable(cfd, &context);
if (!status.ok()) {
break;
}
cfd->imm()->FlushRequested();
// schedule the pending flush
SchedulePendingFlush(cfd);
}
}
// schedule flushes or compactions
MaybeScheduleFlushOrCompaction();
}
/* otherwise, check whether the write buffer has exceeded its limit and needs a flush */
else if (UNLIKELY(write_buffer_.ShouldFlush())) {
Log(InfoLogLevel::INFO_LEVEL, db_options_.info_log,
"Flushing all column families. Write buffer is using %" PRIu64
" bytes out of a total of %" PRIu64 ".",
write_buffer_.memory_usage(), write_buffer_.buffer_size());
// no need to refcount because drop is happening in write thread, so can't
// happen while we're in the write thread
// flush every column family whose memtable is non-empty
for (auto cfd : *versions_->GetColumnFamilySet()) {
if (cfd->IsDropped()) {
continue;
}
if (!cfd->mem()->IsEmpty()) {
status = SwitchMemtable(cfd, &context);
if (!status.ok()) {
break;
}
cfd->imm()->FlushRequested();
SchedulePendingFlush(cfd);
}
}
// schedule flushes or compactions
MaybeScheduleFlushOrCompaction();
}
if (UNLIKELY(status.ok() && !bg_error_.ok())) {
status = bg_error_;
}
/* flush_scheduler_ has column families waiting to be flushed */
if (UNLIKELY(status.ok() && !flush_scheduler_.Empty())) {
status = ScheduleFlushes(&context);
}
/* write_controller_ decides whether writes must be stopped or delayed */
if (UNLIKELY(status.ok()) &&
(write_controller_.IsStopped() || write_controller_.NeedsDelay())) {
PERF_TIMER_STOP(write_pre_and_post_process_time);
PERF_TIMER_GUARD(write_delay_time);
// We don't know the size of the current batch, so we always use the size
// of the previous one. This may create a fairness issue: expiration
// might hit smaller writes while larger writes go through.
// Can optimize it if it becomes an issue.
status = DelayWrite(last_batch_group_size_);
PERF_TIMER_START(write_pre_and_post_process_time);
}
uint64_t last_sequence = versions_->LastSequence();
WriteThread::Writer* last_writer = &w;
autovector<WriteBatch*> write_batch_group;
/* decide whether the WAL and the WAL directory need to be synced */
bool need_log_sync = !write_options.disableWAL && write_options.sync;
bool need_log_dir_sync = need_log_sync && !log_dir_synced_;
// become the batch group leader; if syncing, first wait for in-flight WAL syncs
if (status.ok()) {
// as leader, gather the writers queued behind us into the batch group
last_batch_group_size_ = write_thread_.EnterAsBatchGroupLeader(
&w, &last_writer, &write_batch_group);
if (need_log_sync) {
while (logs_.front().getting_synced) {
log_sync_cv_.Wait();
}
for (auto& log : logs_) {
assert(!log.getting_synced);
log.getting_synced = true;
}
}
// Add to log and apply to memtable. We can release the lock
// during this phase since &w is currently responsible for logging
// and protects against concurrent loggers and concurrent writes
// into memtables
mutex_.Unlock();
if (callback != nullptr) {
// If this write has a validation callback, check to see if this write
// is able to be written. Must be called on the write thread.
status = callback->Callback(this);
callback_failed = true;
}
} else {
mutex_.Unlock();
}
// At this point the mutex is unlocked
// from here on: merge the group, write the WAL, then the memtables
if (status.ok()) {
// merge every WriteBatch in write_batch_group into a single batch,
// 'updates' (named to distinguish it from my_batch above)
WriteBatch* updates = nullptr;
if (write_batch_group.size() == 1) {
updates = write_batch_group[0];
} else {
updates = &tmp_batch_;
for (size_t i = 0; i < write_batch_group.size(); ++i) {
// append this group member to the merged batch
WriteBatchInternal::Append(updates, write_batch_group[i]);
}
}
// the merged batch starts at the version set's last sequence number + 1
const SequenceNumber current_sequence = last_sequence + 1;
// stamp the sequence number into the batch header
WriteBatchInternal::SetSequence(updates, current_sequence);
// number of records in the merged batch
int my_batch_count = WriteBatchInternal::Count(updates);
// advance last_sequence past this batch: e.g. if last_sequence was 100 and
// the batch holds 3 records, they get sequence numbers 101-103 and
// last_sequence becomes 103
last_sequence += my_batch_count;
const uint64_t batch_size = WriteBatchInternal::ByteSize(updates);
// Record statistics
RecordTick(stats_, NUMBER_KEYS_WRITTEN, my_batch_count);
RecordTick(stats_, BYTES_WRITTEN, batch_size);
if (write_options.disableWAL) {
flush_on_destroy_ = true;
}
PERF_TIMER_STOP(write_pre_and_post_process_time);
uint64_t log_size = 0;
// WAL enabled: append the batch to the log
if (!write_options.disableWAL) {
PERF_TIMER_GUARD(write_wal_time);
// the log entry is simply the batch's serialized contents (rep_)
Slice log_entry = WriteBatchInternal::Contents(updates);
// append the record to the newest log file
status = logs_.back().writer->AddRecord(log_entry);
total_log_size_ += log_entry.size();
// account the entry's size against the newest live log file
alive_log_files_.back().AddSize(log_entry.size());
log_empty_ = false;
log_size = log_entry.size();
RecordTick(stats_, WAL_FILE_BYTES, log_size);
/* sync the WAL if requested */
if (status.ok() && need_log_sync) {
RecordTick(stats_, WAL_FILE_SYNCED);
StopWatch sw(env_, stats_, WAL_FILE_SYNC_MICROS);
// It's safe to access logs_ with unlocked mutex_ here because:
// - we've set getting_synced=true for all logs,
// so other threads won't pop from logs_ while we're here,
// - only writer thread can push to logs_, and we're in
// writer thread, so no one will push to logs_,
// - as long as other threads don't modify it, it's safe to read
// from std::deque from multiple threads concurrently.
for (auto& log : logs_) {
status = log.writer->file()->Sync(db_options_.use_fsync);
if (!status.ok()) {
break;
}
}
if (status.ok() && need_log_dir_sync) {
// We only sync WAL directory the first time WAL syncing is
// requested, so that in case users never turn on WAL sync,
// we can avoid the disk I/O in the write code path.
status = directories_.GetWalDir()->Fsync();
}
}
}
// everything above handled the WAL
if (status.ok()) {
PERF_TIMER_GUARD(write_memtable_time);
/**************** now insert the WriteBatch into the memtables ***********/
// InsertInto replays the merged batch into the memtables
status = WriteBatchInternal::InsertInto(
updates, column_family_memtables_.get(),
write_options.ignore_missing_column_families, 0, this, false);
// A non-OK status here indicates iteration failure (either in-memory
// writebatch corruption (very bad), or the client specified invalid
// column family). This will later on trigger bg_error_.
//
// Note that existing logic was not sound. Any partial failure writing
// into the memtable would result in a state that some write ops might
// have succeeded in memtable but Status reports error for all writes.
SetTickerCount(stats_, SEQUENCE_NUMBER, last_sequence);
}
// cleanup and bookkeeping
PERF_TIMER_START(write_pre_and_post_process_time);
if (updates == &tmp_batch_) {
tmp_batch_.Clear();
}
mutex_.Lock();
// internal stats
default_cf_internal_stats_->AddDBStats(
InternalStats::BYTES_WRITTEN, batch_size);
default_cf_internal_stats_->AddDBStats(InternalStats::NUMBER_KEYS_WRITTEN,
my_batch_count);
if (!write_options.disableWAL) {
if (write_options.sync) {
default_cf_internal_stats_->AddDBStats(InternalStats::WAL_FILE_SYNCED,
1);
}
default_cf_internal_stats_->AddDBStats(
InternalStats::WAL_FILE_BYTES, log_size);
}
if (status.ok()) {
versions_->SetLastSequence(last_sequence);
}
} else {
// Operation failed. Make sure the mutex is held for the cleanup code below.
mutex_.Lock();
}
if (db_options_.paranoid_checks && !status.ok() && !callback_failed &&
!status.IsBusy() && bg_error_.ok()) {
bg_error_ = status; // stop compaction & fail any further writes
}
mutex_.AssertHeld();
if (need_log_sync) {
MarkLogsSynced(logfile_number_, need_log_dir_sync, status);
}
uint64_t writes_for_other = write_batch_group.size() - 1;
if (writes_for_other > 0) {
default_cf_internal_stats_->AddDBStats(InternalStats::WRITE_DONE_BY_OTHER,
writes_for_other);
if (!write_options.disableWAL) {
default_cf_internal_stats_->AddDBStats(InternalStats::WRITE_WITH_WAL,
writes_for_other);
}
}
mutex_.Unlock();
write_thread_.ExitAsBatchGroupLeader(&w, last_writer, status);
return status;
}
References
http://blog.csdn.net/wang_xijue/article/details/46521605
http://kernelmaker.github.io/Rocksdb_Study_4