Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
14/10/11 17:19:49 INFO mapreduce.Job: Task Id : attempt_1413018622391_0001_r_000002_2, Status : FAILED
Container [pid=11406,containerID=container_1413018622391_0001_01_000023] is running beyond virtual memory limits. Current usage: 178.8 MB of 1 GB physical memory used; 5.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1413018622391_0001_01_000023 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 11406 10814 11406 11406 (bash) 0 0 108642304 302 /bin/bash -c /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx5000m -Djava.io.tmpdir=/home/test/hadoop/data/tmp/nm-local-dir/usercache/test/appcache/application_1413018622391_0001/container_1413018622391_0001_01_000023/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/test/hadoop/app/hadoop-2.5.1/logs/userlogs/application_1413018622391_0001/container_1413018622391_0001_01_000023 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 172.16.2.193 60022 attempt_1413018622391_0001_r_000002_2 23 1>/home/test/hadoop/app/hadoop-2.5.1/logs/userlogs/application_1413018622391_0001/container_1413018622391_0001_01_000023/stdout 2>/home/test/hadoop/app/hadoop-2.5.1/logs/userlogs/application_1413018622391_0001/container_1413018622391_0001_01_000023/stderr
|- 11411 11406 11406 11406 (java) 367 12 6029000704 45458 /opt/java/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx5000m -Djava.io.tmpdir=/home/test/hadoop/data/tmp/nm-local-dir/usercache/test/appcache/application_1413018622391_0001/container_1413018622391_0001_01_000023/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/test/hadoop/app/hadoop-2.5.1/logs/userlogs/application_1413018622391_0001/container_1413018622391_0001_01_000023 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 172.16.2.193 60022 attempt_1413018622391_0001_r_000002_2 23
<name>mapreduce.job.maps</name>
<name>mapreduce.job.reduces</name>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<name>mapred.child.java.opts</name>
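The numbers in the dump explain the kill: the container was granted 1 GB of physical memory, which at the default yarn.nodemanager.vmem-pmem-ratio of 2.1 allows 2.1 GB of virtual memory, yet the child JVM was launched with -Xmx5000m and its 5.7 GB virtual footprint blew past that limit. The properties above are the usual knobs in mapred-site.xml (note that the mapreduce.tasktracker.* entries are MRv1 settings and have no effect under YARN). A minimal sketch that brings the child heap back under the container size; the -Xmx value here is illustrative, not from the original job:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx800m</value>
</property>

Alternatively, raise the container allocation itself (mapreduce.reduce.memory.mb and friends) so it matches the heap the job actually needs.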
java.io.IOException: Incompatible clusterIDs in /home/hadoop/hadoop-2.5.1/data: namenode clusterID = CID-dedc085d-ec1c-4b07-89cb-5924880d2682; datanode clusterID = CID-ff0e415d-0734-4ae5-b828-6af6c6843ec4
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:477)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:226)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:254)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:975)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:946)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:278)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:812)
    at java.lang.Thread.run(Thread.java:745)
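This error shows up when the NameNode has been reformatted while the DataNodes kept their old storage directories, so the cached clusterID no longer matches. Two common fixes, sketched under the assumption that the directory is the one named in the error and holds no blocks worth keeping:

# Option 1: wipe the DataNode storage dir and let it re-register (destroys local block data)
rm -rf /home/hadoop/hadoop-2.5.1/data
# Option 2: overwrite the stale clusterID with the NameNode's value from the error message
sed -i 's/^clusterID=.*/clusterID=CID-dedc085d-ec1c-4b07-89cb-5924880d2682/' /home/hadoop/hadoop-2.5.1/data/current/VERSION

Restart the DataNode afterwards.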
[test@x197 hadoop]$ hdfs dfs -chmod 755 /
14/10/25 14:53:43 WARN retry.RetryInvocationHandler: Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setPermission over /172.16.2.197:9000. Not retrying because retries (11) exceeded maximum allowed (10)
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RetriableException): org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot set permission for /. Name node is in safe mode.
The reported blocks 414 needs additional 3 blocks to reach the threshold 0.9990 of total blocks 417. The number of live datanodes 40 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1276)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermissionInt(FSNamesystem.java:1624)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1607)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:579)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setPermission(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Caused by: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot set permission for /. Name node is in safe mode.
The reported blocks 414 needs additional 3 blocks to reach the threshold 0.9990 of total blocks 417. The number of live datanodes 40 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1272)
    ... 13 more
    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy14.setPermission(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setPermission(ClientNamenodeProtocolTranslatorPB.java:314)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy15.setPermission(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setPermission(DFSClient.java:2163)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1236)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1232)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setPermission(DistributedFileSystem.java:1232)
    at org.apache.hadoop.fs.FsShellPermissions$Chmod.processPath(FsShellPermissions.java:103)
    at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
    at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
    at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
    at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
    at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
    at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
    at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
chmod: changing permissions of '/': org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot set permission for /. Name node is in safe mode.
hdfs dfsadmin -safemode leave
(the older form, hadoop dfsadmin -safemode leave, still works in Hadoop 2.x but is deprecated in favor of hdfs dfsadmin)
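Safe mode normally lifts itself once 99.9% of blocks have been reported; the log above shows only 3 of 417 blocks were still missing, so forcing it off is low-risk. The current state can be confirmed first with:

hdfs dfsadmin -safemode get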
JMX enabled by default
Using config: /home/test/hadoop/app/zookeeper-3.4.6/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
If you look at the logs, they show a connection error, i.e. some of the machines are failing to connect to one another.
Connection problems generally come in three flavors: 1) the firewall; 2) SSH; 3) configuration, namely the number in /data/myid.
When a similar problem came up before, it was the last case. This time all three were checked and none of them turned out to be the cause; in the end, the faulty node among the three ZooKeeper machines was swapped out for a different server and everything worked. The exact reason remains unclear.
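For the record, all three cases can be checked quickly from the shell. A sketch, assuming default ZooKeeper ports and using x196 as a stand-in for whichever peer is failing:

# 1) firewall: is the service reachable on the client and quorum ports (2181, 2888, 3888 by default)?
service iptables status
echo stat | nc x196 2181
# 2) ssh: do the hostnames in zoo.cfg resolve and connect?
ssh x196 hostname
# 3) myid: does each node's /data/myid match its server.N entry in zoo.cfg?
cat /data/myid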
ERROR [main] regionserver.HRegionServerCommandLine: Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:66)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:85)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2489)
date -s "2014-10-13 09:56:33"
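Setting the date by hand works as a one-off; this abort is the classic symptom of clock skew, since the HMaster rejects any RegionServer whose clock differs from its own by more than hbase.master.maxclockskew (30 seconds by default), and the RegionServer then exits. Keeping all nodes on NTP prevents a recurrence; a sketch, assuming the public NTP pool is reachable (substitute your site's NTP server if not):

ntpdate cn.pool.ntp.org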
2014-10-11 19:51:37,875 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server x196,60020,1413028275179: Initialization of RS failed. Hence aborting RS.
java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:388)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:269)
    at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:151)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:752)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:715)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:848)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:409)
    ... 7 more
Caused by: java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:69)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:837)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:640)
    ... 12 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: cluster
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:231)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
    at org.apache.hadoop.hbase.util.DynamicClassLoader.<init>(DynamicClassLoader.java:104)
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.<clinit>(ProtobufUtil.java:201)
    ... 17 more
Caused by: java.net.UnknownHostException: cluster
    ... 31 more
2014-10-11 19:51:37,880 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2014-10-11 19:51:37,881 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Initialization of RS failed. Hence aborting RS.
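The tell-tale line is "UnknownHostException: cluster": "cluster" is not a real host but the HDFS HA nameservice name (hbase.rootdir points at something like hdfs://cluster/hbase), and the RegionServer cannot resolve it because the hdfs-site.xml that defines dfs.nameservices is not on HBase's classpath. The usual fix is to copy the Hadoop client configs into HBase's conf directory; a sketch, assuming the install paths that appear in the logs above and the default etc/hadoop config location:

cp /home/test/hadoop/app/hadoop-2.5.1/etc/hadoop/core-site.xml /home/test/hadoop/app/hadoop-2.5.1/etc/hadoop/hdfs-site.xml /home/test/hadoop/app/hbase-0.98.6.1-hadoop2/conf/

Then restart the RegionServer: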
hbase-daemon.sh start regionserver
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/test/hadoop/app/hbase-0.98.6.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/test/hadoop/app/hadoop-2.5.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
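This SLF4J warning is harmless but noisy: both HBase (slf4j-log4j12-1.6.4) and Hadoop (slf4j-log4j12-1.7.5) ship a log4j binding, and SLF4J picks one arbitrarily. A sketch of the usual cleanup, assuming Hadoop's newer binding should win (move rather than delete, so it can be restored):

mv /home/test/hadoop/app/hbase-0.98.6.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar /home/test/hadoop/app/hbase-0.98.6.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar.bak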