The secondarynamenode configuration is easy to overlook: as long as jps shows everything running, people rarely give it a second thought, until the namenode runs into trouble and they remember there is also a secondary namenode. Configuring it takes two steps:
1. Add the secondarynamenode machine to the cluster configuration file conf/masters.
2. Add or modify the following property in hdfs-site.xml:
<property>
  <name>dfs.http.address</name>
  <value>{your_namenode_ip}:50070</value>
  <description>
    The address and the base port where the dfs namenode web ui will listen on.
    If the port is 0 then the server will start on a free port.
  </description>
</property>
With both items in place, start the cluster. Then log in to the secondarynamenode machine and check whether the fs.checkpoint.dir directory (set in core-site.xml, default ${hadoop.tmp.dir}/dfs/namesecondary) is in sync with the namenode.
If the second item is not configured, the secondary namenode's sync directory stays empty forever, and its log shows the following error:
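A quick way to do that check is to look for the merged image files in the checkpoint directory. A minimal sketch, assuming the default fs.checkpoint.dir path and a Hadoop 1.x directory layout (the helper name `check_checkpoint` is just illustrative, adjust the path to your own setting):

```shell
# Sketch: report whether a secondary namenode checkpoint directory has
# been populated. After a successful checkpoint, Hadoop 1.x writes
# fsimage and edits under <fs.checkpoint.dir>/current.
check_checkpoint() {
  if [ -f "$1/current/fsimage" ] && [ -f "$1/current/edits" ]; then
    echo "populated"
  else
    echo "empty"
  fi
}

# Default fs.checkpoint.dir; substitute your configured value.
check_checkpoint /tmp/hadoop-hadoop/dfs/namesecondary
```

If this prints "empty" long after fs.checkpoint.period has elapsed, the checkpoint is most likely failing, which is exactly the symptom described next.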
2011-06-09 11:06:41,430 INFO org.apache.hadoop.hdfs.server.common.Storage: Recovering storage directory /tmp/hadoop-hadoop/dfs/namesecondary from failed checkpoint.
2011-06-09 11:06:41,433 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint:
2011-06-09 11:06:41,434 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:529)
        at java.net.Socket.connect(Socket.java:478)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
        at sun.net.www.http.HttpClient.New(HttpClient.java:323)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
        at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:151)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:256)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:313)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
        at java.lang.Thread.run(Thread.java:662)
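The Connection refused here means the secondary namenode could not reach the namenode's HTTP port (it fetches fsimage and edits over HTTP). Before digging into the configuration, it can help to probe that port directly from the secondarynamenode host. A rough sketch, where `your_namenode_ip` is a placeholder for the value in dfs.http.address and `check_http_port` is just an illustrative helper:

```shell
# Probe the namenode web UI port from the secondary namenode host.
# Prints "reachable" if an HTTP connection can be opened, else "unreachable".
check_http_port() {
  if curl -s -o /dev/null --max-time 5 "http://$1:$2/"; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# Substitute the host and port from dfs.http.address in hdfs-site.xml.
check_http_port your_namenode_ip 50070
```

If this prints "unreachable", the problem is connectivity (wrong address, firewall, or the namenode web UI bound to a different interface) rather than the checkpoint logic itself.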
Related core-site.xml properties you may need:
<property>
  <name>fs.checkpoint.period</name>
  <value>300</value>
  <description>The number of seconds between two periodic checkpoints.
  </description>
</property>

<property>
  <name>fs.checkpoint.dir</name>
  <value>${hadoop.tmp.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
    name node should store the temporary images to merge.
    If this is a comma-delimited list of directories then the image is
    replicated in all of the directories for redundancy.
  </description>
</property>
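Besides the period, Hadoop 1.x also triggers a checkpoint when the edits log grows past fs.checkpoint.size bytes, regardless of how much time has passed. If you want to tune that as well, the property looks like this (the value shown is the stock default, 64 MB, not a recommendation):

```xml
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
  <description>The size of the current edit log (in bytes) that triggers
    a periodic checkpoint even if the fs.checkpoint.period hasn't expired.
  </description>
</property>
```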