The Hama cluster appeared to start normally, so I ran the example:
$ bin/hama jar hama-examples-0.6.0.jar pi
13/03/21 01:37:41 INFO bsp.BSPJobClient: Running job: job_201303210137_0001
13/03/21 01:37:44 INFO bsp.BSPJobClient: Current supersteps number: 0
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO sync.ZKSyncClient: Initializing ZK Sync Client
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At iir455-199/10.77.30.199:61002
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZooKeeperSyncClientImpl: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/peers
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:262)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:270)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:250)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:338)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:169)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201303210137_0001/peers/iir455-199:61002
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:282)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.registerTask(ZooKeeperSyncClientImpl.java:270)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.register(ZooKeeperSyncClientImpl.java:250)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.initializeSyncService(BSPPeerImpl.java:338)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:169)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: Starting SocketReader
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server Responder: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server listener on 61002: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO message.HadoopMessageManagerImpl: BSPPeer address:iir455-199 port:61002
attempt_201303210137_0001_000007_0: 13/03/21 01:37:16 INFO ipc.Server: IPC Server handler 0 on 61002: starting
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 ERROR sync.ZKSyncClient: Error checking zk path /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.isExists(ZKSyncClient.java:108)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:262)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 ERROR sync.ZKSyncClient: Error creating zk path /bsp/job_201303210137_0001/sync/-1
attempt_201303210137_0001_000007_0: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /bsp
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
attempt_201303210137_0001_000007_0:     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.createZnode(ZKSyncClient.java:135)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZKSyncClient.writeNode(ZKSyncClient.java:282)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:99)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
attempt_201303210137_0001_000007_0: 13/03/21 01:37:17 FATAL bsp.GroomServer: SyncError from child
attempt_201303210137_0001_000007_0: org.apache.hama.bsp.sync.SyncException
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.sync.ZooKeeperSyncClientImpl.enterBarrier(ZooKeeperSyncClientImpl.java:137)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.doFirstSync(BSPPeerImpl.java:345)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.BSPPeerImpl.<init>(BSPPeerImpl.java:233)
attempt_201303210137_0001_000007_0:     at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1262)
13/03/21 01:37:47 INFO bsp.BSPJobClient: Job failed.
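The ZooKeeper in use is not the one bundled with Hama; it is a separate 3-node ensemble. Its logs (usually found in the home directory of the user who started ZooKeeper) also looked normal, so I tested ZooKeeper directly, and the connection worked: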
$ bin/zkCli.sh -server ***:2181
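The same reachability check can also be scripted, which is handy when several hosts or ports need to be tried. The following is only a minimal sketch: the function name port_is_open is made up for illustration, and it only tests whether a TCP connection can be opened, not whether a healthy ZooKeeper server is answering.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration only. A successful connect means
    something is listening on that port; it does not prove the listener
    is a working ZooKeeper server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call (the host name is a placeholder, not taken from the post):
#   port_is_open("zk-host.example", 2181)
```

Running it against each ZooKeeper host with the port that Hama is actually configured to use shows quickly whether the BSP tasks have anything to connect to.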
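The problem turned up when I looked at Hama's configuration files. Hama has two XML configuration files, hama-site.xml and hama-default.xml; settings in the former override the defaults in the latter.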
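I had set hama.zookeeper.quorum in hama-site.xml but had not set the ZooKeeper port, assuming Hama's default would match ZooKeeper's own. It does not: Hama's default ZooKeeper port is 21810, not 2181, so the BSP tasks were presumably trying to reach the external ensemble on a port where nothing was listening, which would explain the ConnectionLoss errors. The fix is to add the following to hama-site.xml: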
<property>
  <name>hama.zookeeper.property.clientPort</name>
  <value>2181</value>
  <description>Property from ZooKeeper's config zoo.cfg.
  The port at which the clients will connect.
  </description>
</property>
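To make the override rule concrete, here is a small sketch that merges two Hadoop-style configuration documents the way described above, with site values winning over defaults. It is not Hama's own loading code. The XML strings are cut-down stand-ins rather than the real files, the quorum host names zk1, zk2, zk3 are placeholders, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def effective_config(default_xml, site_xml):
    """Merge the two files: entries from the site file override defaults."""
    merged = read_properties(default_xml)
    merged.update(read_properties(site_xml))
    return merged


# Stand-in for hama-default.xml, reduced to the one property that matters.
DEFAULT_XML = """<configuration>
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>21810</value>
  </property>
</configuration>"""

# Stand-in for hama-site.xml before the fix: quorum set, port missing.
SITE_BEFORE = """<configuration>
  <property>
    <name>hama.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
</configuration>"""

# Stand-in for hama-site.xml after the fix: port set explicitly.
SITE_AFTER = """<configuration>
  <property>
    <name>hama.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
  <property>
    <name>hama.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>"""

PORT_KEY = "hama.zookeeper.property.clientPort"

if __name__ == "__main__":
    print("before:", effective_config(DEFAULT_XML, SITE_BEFORE)[PORT_KEY])
    print("after: ", effective_config(DEFAULT_XML, SITE_AFTER)[PORT_KEY])
```

With the port missing from the site file, the merged value falls back to 21810; once the property is added, it becomes 2181.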
After restarting Hama, the pi example ran successfully:
[iir@iir455-200 hama-0.6.0]$ bin/hama jar hama-examples-0.6.0.jar pi
13/03/21 01:47:41 INFO bsp.BSPJobClient: Running job: job_201303210147_0001
13/03/21 01:47:44 INFO bsp.BSPJobClient: Current supersteps number: 0
13/03/21 01:47:50 INFO bsp.BSPJobClient: Current supersteps number: 1
13/03/21 01:47:50 INFO bsp.BSPJobClient: The total number of supersteps: 1
13/03/21 01:47:50 INFO bsp.BSPJobClient: Counters: 6
13/03/21 01:47:50 INFO bsp.BSPJobClient:   org.apache.hama.bsp.JobInProgress$JobCounter
13/03/21 01:47:50 INFO bsp.BSPJobClient:     SUPERSTEPS=1
13/03/21 01:47:50 INFO bsp.BSPJobClient:     LAUNCHED_TASKS=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:   org.apache.hama.bsp.BSPPeerImpl$PeerCounter
13/03/21 01:47:50 INFO bsp.BSPJobClient:     SUPERSTEP_SUM=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TIME_IN_SYNC_MS=7313
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_SENT=21
13/03/21 01:47:50 INFO bsp.BSPJobClient:     TOTAL_MESSAGES_RECEIVED=21
Estimated value of PI is 3.1463428571428564
Job Finished in 10.379 seconds