Hadoop 2.2.0: fsimage and edit log configuration

While running Hadoop 2.2.0, I noticed that the edits_* files under ${dfs.namenode.checkpoint.edits.dir} kept accumulating, so I collected the relevant configuration settings here.


Hadoop 2.2.0 has the following settings related to fsimage and edit logs:

<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file://${hadoop.tmp.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
      name node should store the temporary images to merge.
      If this is a comma-delimited list of directories then the image is
      replicated in all of the directories for redundancy.
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.edits.dir</name>
  <value>${dfs.namenode.checkpoint.dir}</value>
  <description>Determines where on the local filesystem the DFS secondary
      name node should store the temporary edits to merge.
      If this is a comma-delimited list of directories then the edits are
      replicated in all of the directories for redundancy.
      Default value is same as dfs.namenode.checkpoint.dir
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
  <description>The Secondary NameNode or CheckpointNode will create a checkpoint
  of the namespace every 'dfs.namenode.checkpoint.txns' transactions, regardless
  of whether 'dfs.namenode.checkpoint.period' has expired.
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.check.period</name>
  <value>60</value>
  <description>The SecondaryNameNode and CheckpointNode will poll the NameNode
  every 'dfs.namenode.checkpoint.check.period' seconds to query the number
  of uncheckpointed transactions.
  </description>
</property>

<property>
  <name>dfs.namenode.checkpoint.max-retries</name>
  <value>3</value>
  <description>The SecondaryNameNode retries failed checkpointing. If the 
  failure occurs while loading fsimage or replaying edits, the number of
  retries is limited by this variable. 
  </description>
</property>
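
To make the interaction of these triggers concrete, here is a minimal sketch of an hdfs-site.xml override. Going by the descriptions above, a checkpoint is started when either the period has elapsed or the transaction threshold has been reached, whichever comes first. The values below are illustrative assumptions, not recommendations.

```xml
<configuration>
  <!-- Assumed value: checkpoint at least every 30 minutes (default is 3600 seconds) -->
  <property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>1800</value>
  </property>
  <!-- Assumed value: also checkpoint once 500000 uncheckpointed transactions accumulate -->
  <property>
    <name>dfs.namenode.checkpoint.txns</name>
    <value>500000</value>
  </property>
</configuration>
```

Since the transaction count is only polled every dfs.namenode.checkpoint.check.period seconds (60 by default), a count-based checkpoint may lag the threshold by up to that interval.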


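The following two settings (the ones that were puzzling me) control how much fsimage and edit log history is retained. By default, 2 fsimage checkpoints are kept, along with 1,000,000 extra edit transactions beyond those needed to recover from the oldest retained checkpoint.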

<property>
  <name>dfs.namenode.num.checkpoints.retained</name>
  <value>2</value>
  <description>The number of image checkpoint files that will be retained by
  the NameNode and Secondary NameNode in their storage directories. All edit
  logs necessary to recover an up-to-date namespace from the oldest retained
  checkpoint will also be retained.
  </description>
</property>

<property>
  <name>dfs.namenode.num.extra.edits.retained</name>
  <value>1000000</value>
  <description>The number of extra transactions which should be retained
  beyond what is minimally necessary for a NN restart. This can be useful for
  audit purposes or for an HA setup where a remote Standby Node may have
  been offline for some time and need to have a longer backlog of retained
  edits in order to start again.
  Typically each edit is on the order of a few hundred bytes, so the default
  of 1 million edits should be on the order of hundreds of MBs or low GBs.

  NOTE: Fewer extra edits may be retained than value specified for this setting
  if doing so would mean that more segments would be retained than the number
  configured by dfs.namenode.max.extra.edits.segments.retained.
  </description>
</property>
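
As a sketch of how the retention settings above might be lowered in hdfs-site.xml when old edits_* files take up too much space: the values are illustrative assumptions only, and whether this alone shrinks the files under ${dfs.namenode.checkpoint.edits.dir} on the secondary side should be verified on your own cluster.

```xml
<configuration>
  <!-- Keep the default of 2 fsimage checkpoints -->
  <property>
    <name>dfs.namenode.num.checkpoints.retained</name>
    <value>2</value>
  </property>
  <!-- Assumed value: retain 100000 extra transactions instead of 1000000.
       At an assumed ~200 bytes per edit, that is roughly 20 MB of extra edits. -->
  <property>
    <name>dfs.namenode.num.extra.edits.retained</name>
    <value>100000</value>
  </property>
</configuration>
```

Lowering the extra-edits count trades disk space against the backlog available for auditing or for a Standby Node that has been offline for a while, as the description notes.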


For the following setting, see the companion article "Hadoop 2.2.0: how fsimage and edit logs are processed".

<property>
  <name>dfs.namenode.max.extra.edits.segments.retained</name>
  <value>10000</value>
  <description>The maximum number of extra edit log segments which should be retained
  beyond what is minimally necessary for a NN restart. When used in conjunction with
  dfs.namenode.num.extra.edits.retained, this configuration property serves to cap
  the number of extra edits files to a reasonable value.
  </description>
</property>
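
According to the descriptions, the number of extra edits actually kept is bounded by both dfs.namenode.num.extra.edits.retained and this segment cap: fewer extra edits may be retained if keeping them would require more segments than the cap allows. A minimal sketch of tightening the cap in hdfs-site.xml, with an assumed, purely illustrative value:

```xml
<configuration>
  <!-- Assumed value: allow at most 1000 extra edit log segments (default is 10000) -->
  <property>
    <name>dfs.namenode.max.extra.edits.segments.retained</name>
    <value>1000</value>
  </property>
</configuration>
```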
