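While running Hadoop 2.2.0, I noticed that the edits_* files under the ${dfs.namenode.checkpoint.edits.dir} directory kept piling up, so I put together the relevant configuration here.
Hadoop 2.2.0 has the following settings related to the fsimage and edit logs: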
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file://${hadoop.tmp.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
    name node should store the temporary images to merge. If this is a
    comma-delimited list of directories then the image is replicated in
    all of the directories for redundancy.
  </description>
</property>
<property>
  <name>dfs.namenode.checkpoint.edits.dir</name>
  <value>${dfs.namenode.checkpoint.dir}</value>
  <description>Determines where on the local filesystem the DFS secondary
    name node should store the temporary edits to merge. If this is a
    comma-delimited list of directories then the edits are replicated in
    all of the directories for redundancy. Default value is same as
    dfs.namenode.checkpoint.dir
  </description>
</property>
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.
  </description>
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
  <description>The Secondary NameNode or CheckpointNode will create a
    checkpoint of the namespace every 'dfs.namenode.checkpoint.txns'
    transactions, regardless of whether 'dfs.namenode.checkpoint.period'
    has expired.
  </description>
</property>
<property>
  <name>dfs.namenode.checkpoint.check.period</name>
  <value>60</value>
  <description>The SecondaryNameNode and CheckpointNode will poll the
    NameNode every 'dfs.namenode.checkpoint.check.period' seconds to query
    the number of uncheckpointed transactions.
  </description>
</property>
<property>
  <name>dfs.namenode.checkpoint.max-retries</name>
  <value>3</value>
  <description>The SecondaryNameNode retries failed checkpointing. If the
    failure occurs while loading fsimage or replaying edits, the number of
    retries is limited by this variable.
  </description>
</property>
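Taken together, the descriptions above say that a checkpoint is triggered when either dfs.namenode.checkpoint.period seconds have elapsed or dfs.namenode.checkpoint.txns transactions have accumulated, with the transaction count polled every dfs.namenode.checkpoint.check.period seconds. As a rough sketch only, an override in hdfs-site.xml that checkpoints more often might look like the following. The values 1800 and 500000 are illustrative assumptions, not recommendations from the defaults file.

```xml
<!-- hdfs-site.xml: illustrative overrides; the values are assumptions -->
<configuration>
  <!-- Checkpoint at least every 30 minutes instead of every hour -->
  <property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>1800</value>
  </property>
  <!-- ...or as soon as 500,000 uncheckpointed transactions accumulate -->
  <property>
    <name>dfs.namenode.checkpoint.txns</name>
    <value>500000</value>
  </property>
</configuration>
```

More frequent checkpoints mean fewer edits have to be replayed on top of the newest fsimage, but they do not by themselves reduce how many extra edits are retained on disk.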
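There are also the following two settings (the ones that were puzzling me), which control how many fsimage checkpoints and how many edit log transactions are retained. By default, 2 fsimage files are kept, along with 1,000,000 extra edit transactions beyond what a NameNode restart minimally requires.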
<property>
  <name>dfs.namenode.num.checkpoints.retained</name>
  <value>2</value>
  <description>The number of image checkpoint files that will be retained
    by the NameNode and Secondary NameNode in their storage directories.
    All edit logs necessary to recover an up-to-date namespace from the
    oldest retained checkpoint will also be retained.
  </description>
</property>
<property>
  <name>dfs.namenode.num.extra.edits.retained</name>
  <value>1000000</value>
  <description>The number of extra transactions which should be retained
    beyond what is minimally necessary for a NN restart. This can be useful
    for audit purposes or for an HA setup where a remote Standby Node may
    have been offline for some time and need to have a longer backlog of
    retained edits in order to start again. Typically each edit is on the
    order of a few hundred bytes, so the default of 1 million edits should
    be on the order of hundreds of MBs or low GBs. NOTE: Fewer extra edits
    may be retained than value specified for this setting if doing so would
    mean that more segments would be retained than the number configured by
    dfs.namenode.max.extra.edits.segments.retained.
  </description>
</property>
For the following setting, see the article "Processing logic for fsimage and edit logs in Hadoop 2.2.0":
<property>
  <name>dfs.namenode.max.extra.edits.segments.retained</name>
  <value>10000</value>
  <description>The maximum number of extra edit log segments which should
    be retained beyond what is minimally necessary for a NN restart. When
    used in conjunction with dfs.namenode.num.extra.edits.retained, this
    configuration property serves to cap the number of extra edits files
    to a reasonable value.
  </description>
</property>
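If the goal is simply to keep fewer edits_* files around, one option is to lower the two retention limits in hdfs-site.xml. The snippet below is a minimal sketch under stated assumptions: the values 100000 and 100 are made up for illustration, and it assumes that no audit process or offline Standby node depends on a longer backlog of edits, which is the use case the descriptions above mention for the larger defaults.

```xml
<!-- hdfs-site.xml: illustrative overrides; the values are assumptions -->
<configuration>
  <!-- Keep at most 100,000 extra transactions beyond what a restart needs -->
  <property>
    <name>dfs.namenode.num.extra.edits.retained</name>
    <value>100000</value>
  </property>
  <!-- Cap the number of extra edit log segments (files) that are kept -->
  <property>
    <name>dfs.namenode.max.extra.edits.segments.retained</name>
    <value>100</value>
  </property>
</configuration>
```

Whichever limit is reached first applies, per the NOTE in the description of dfs.namenode.num.extra.edits.retained: fewer extra edits may be kept than configured if keeping them would exceed the segment cap. Edits needed to recover from the oldest retained checkpoint are kept regardless of these two settings.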