CDH4 HA test

Scenario:
      NameNode (NN) HA was configured successfully, but the client throws exceptions during an HA failover.

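Error analysis
      The problem lies in the shell command run on behalf of the user: on the server side, the group lookup for user peter via `id` fails because that user does not exist on the host (see the server log).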
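
The server-side stack trace passes through ShellBasedUnixGroupsMapping.getUnixGroups and Shell.execCommand, which suggests that groups are resolved by running a shell command on the server host. The following is a minimal Python sketch of that idea, not Hadoop's actual code. The function name `get_unix_groups`, the `runner` parameter, and the exact command `id -Gn <user>` are assumptions for illustration; the log only shows that `id` was invoked and failed.

```python
import subprocess


def get_unix_groups(user, runner=subprocess.run):
    """Return the Unix group names of `user`, or [] if the lookup fails.

    Hypothetical helper: the command `id -Gn <user>` is an assumption,
    and `runner` exists only so the command can be stubbed out.
    """
    result = runner(["id", "-Gn", user], capture_output=True, text=True)
    if result.returncode != 0:
        # `id` exits non-zero when the user is unknown on this host,
        # which corresponds to the "No groups available" situation.
        return []
    # On success, stdout is a whitespace-separated list of group names.
    return result.stdout.split()
```

If this reading is right, the lookup depends on the account existing on the host that performs it, so a user that exists only on the client machine would come back with no groups.
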
Logs:

Client
2012-08-01 14:37:07,798 WARN  ipc.Client (Client.java:run(787)) - Unexpected error reading responses on connection Thread[IPC Client (1333933549) connection to bigdata-3/172.16.206.206:9000 from peter,5,main]
java.lang.NullPointerException
at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:852)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:781)
2012-08-01 14:37:07,807 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(118)) - Exception while invoking complete of class ClientNamenodeProtocolTranslatorPB. Trying to fail over immediately.
2012-08-01 14:37:07,970 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(118)) - Exception while invoking complete of class ClientNamenodeProtocolTranslatorPB after 1 fail over attempts. Trying to fail over after sleeping for 713ms.
2012-08-01 14:37:08,686 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(118)) - Exception while invoking complete of class ClientNamenodeProtocolTranslatorPB after 2 fail over attempts. Trying to fail over after sleeping for 1596ms.
2012-08-01 14:37:10,286 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(118)) - Exception while invoking complete of class ClientNamenodeProtocolTranslatorPB after 3 fail over attempts. Trying to fail over after sleeping for 2974ms.
2012-08-01 14:37:13,262 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(118)) - Exception while invoking complete of class ClientNamenodeProtocolTranslatorPB after 4 fail over attempts. Trying to fail over after sleeping for 7861ms.
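
The sleep intervals in the client log above (713 ms, 1596 ms, 2974 ms, 7861 ms) roughly double with each failover attempt, which looks like exponential backoff with random jitter. The sketch below shows one way such delays could be computed. The function name `failover_sleep_ms`, the 500 ms base, the 15000 ms cap, and the 0.5x to 1.5x jitter range are all assumptions chosen for illustration; none of them appear in the log.

```python
import random


def failover_sleep_ms(attempt, base_ms=500, cap_ms=15000, rng=random.random):
    """Return a jittered backoff delay in ms for a 1-based failover attempt.

    Hypothetical helper: base_ms, cap_ms and the jitter range are
    assumed values, not settings read from the cluster.
    """
    # Double the base delay on every attempt, but never exceed the cap.
    exponential = min(base_ms * (2 ** attempt), cap_ms)
    # Scale by a random factor in [0.5, 1.5) so clients do not retry in lockstep.
    return int(exponential * (rng() + 0.5))


if __name__ == "__main__":
    for attempt in range(1, 5):
        print(attempt, failover_sleep_ms(attempt))
```

Under these assumed parameters the delay for attempts 1 to 4 falls in the ranges 500-1500, 1000-3000, 2000-6000 and 4000-12000 ms, and each of the four logged sleep times falls inside the corresponding range.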

Server
2012-08-01 14:54:45,614 WARN org.apache.hadoop.security.UserGroupInformation: No groups available for user peter
2012-08-01 14:54:45,619 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/peter/FS/100wan/1413. BP-283690147-172.16.206.206-1343792626658 blk_-6816230619303558443_3866{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.206.209:50010|RBW], ReplicaUnderConstruction[172.16.206.206:50010|RBW]]}
2012-08-01 14:54:46,529 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.206.206:50010 is added to blk_-6816230619303558443_3866{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.206.209:50010|RBW], ReplicaUnderConstruction[172.16.206.206:50010|RBW]]} size 0
2012-08-01 14:54:46,529 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* addStoredBlock: blockMap updated: 172.16.206.209:50010 is added to blk_-6816230619303558443_3866{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[172.16.206.209:50010|RBW], ReplicaUnderConstruction[172.16.206.206:50010|RBW]]} size 0
2012-08-01 14:54:46,531 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/peter/FS/100wan/1413 is closed by DFSClient_NONMAPREDUCE_-1368488343_1
2012-08-01 14:54:46,540 WARN org.apache.hadoop.security.ShellBasedUnixGroupsMapping: got exception trying to get groups for user peter
org.apache.hadoop.util.Shell$ExitCodeException: id: peter: No such user

at org.apache.hadoop.util.Shell.runCommand(Shell.java:261)
at org.apache.hadoop.util.Shell.run(Shell.java:188)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:381)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:467)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:450)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:86)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:55)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:88)
at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1116)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.<init>(FSPermissionChecker.java:51)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4259)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4236)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1579)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1514)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:408)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:200)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42590)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:427)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:916)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1692)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1688)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1686)
