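The conclusion first:
1. Configure DNS (details omitted); both forward and reverse resolution must be configured.
2. Set the following two parameters on the cluster: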
hadoop.security.dns.interface
hadoop.security.dns.nameserver
The process:
2019-03-13 23:38:41,147 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:yarn/[email protected] (auth:KERBEROS) cause:java.io.IOException: java.lang.IllegalArgumentException: Server has invalid Kerberos principal: yarn/[email protected], expecting: yarn/[email protected]
2019-03-13 23:38:41,152 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider: Failing over to rm127
2019-03-13 23:38:41,155 WARN org.apache.hadoop.ipc.Client: Failed to connect to server: node0/192.167.1.246:8031: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
at org.apache.hadoop.ipc.Client.call(Client.java:1480)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy39.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy40.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:275)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:209)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:329)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:563)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:609)
2019-03-13 23:38:41,167 INFO org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking registerNodeManager of class ResourceTrackerPBClientImpl over rm127 after 1 fail over attempts. Trying to fail over after sleeping for 255ms.
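This first stack trace does not tell us much: it comes from a callback invoked while the NodeManager service starts, and it is hard to see from reading the code statically exactly which code is involved. The hint in the log is clear enough, though: the principal was put together incorrectly. So we started at the places that call loginUserFromKeytab and traced downward to find where it goes wrong. There are many login call sites, however, and in the end we did not find the exact one. That is not our focus, so we skip it here and will look into it later when there is time. The key code is here: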
String principalName = SecurityUtil.getServerPrincipal(principalConfig,
hostname);
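For context: in Hadoop configuration a service principal is normally written with the _HOST placeholder (for example yarn/_HOST@REALM), and getServerPrincipal fills in the hostname it is given. That is why a wrongly resolved hostname surfaces as a wrong principal in the log. Below is a minimal standalone sketch of that substitution, not Hadoop's actual code. The names PrincipalSketch and substituteHost and the realm EXAMPLE.COM are made up for illustration, and the lower-casing of the hostname is an assumption about Hadoop's behaviour.

```java
import java.util.Locale;

// Hypothetical illustration only; this is not Hadoop's SecurityUtil.
final class PrincipalSketch {
    private static final String HOST_PLACEHOLDER = "_HOST";

    // Replaces the _HOST component of "service/_HOST@REALM" with the given hostname.
    // Principals without the placeholder are returned unchanged.
    static String substituteHost(String principalConfig, String hostname) {
        if (principalConfig == null) {
            return null;
        }
        String[] parts = principalConfig.split("[/@]");
        if (parts.length != 3 || !HOST_PLACEHOLDER.equals(parts[1])) {
            return principalConfig;
        }
        // Assumption: the hostname is lower-cased before it goes into the principal.
        return parts[0] + "/" + hostname.toLowerCase(Locale.ROOT) + "@" + parts[2];
    }

    public static void main(String[] args) {
        // Two different answers to "what is my hostname" give two different principals.
        System.out.println(substituteHost("yarn/_HOST@EXAMPLE.COM", "node0"));
        System.out.println(substituteHost("yarn/_HOST@EXAMPLE.COM", "node0.example.com"));
    }
}
```

If the two sides of a connection resolve different names for the same server, the principals no longer match, which is the kind of mismatch the "Server has invalid Kerberos principal" message reports.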
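The getServerPrincipal call was found under the yarn directory; DataNode does it slightly differently. So let's look at how this is implemented, starting with where the hostname comes from: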
public static String getLocalHostName(@Nullable Configuration conf)
    throws UnknownHostException {
  if (conf != null) {
    String dnsInterface = conf.get(HADOOP_SECURITY_DNS_INTERFACE_KEY);
    String nameServer = conf.get(HADOOP_SECURITY_DNS_NAMESERVER_KEY);
    if (dnsInterface != null) {
      return DNS.getDefaultHost(dnsInterface, nameServer, true);
    } else if (nameServer != null) {
      throw new IllegalArgumentException(HADOOP_SECURITY_DNS_NAMESERVER_KEY +
          " requires " + HADOOP_SECURITY_DNS_INTERFACE_KEY + ". Check your" +
          "configuration.");
    }
  }
  // Fallback to querying the default hostname as we did before.
  return InetAddress.getLocalHost().getCanonicalHostName();
}
A DNS server can be specified through configuration, using the two parameters mentioned at the beginning. So we tried that, and it still did not work.
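To make the branching in getLocalHostName easier to follow, here is a small JDK-only sketch of the same decision, with a plain Map standing in for Hadoop's Configuration. The names DnsConfigSketch and hostnameSource and the returned description strings are hypothetical; only the two property names come from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of which lookup path the two parameters select.
final class DnsConfigSketch {
    static final String DNS_INTERFACE_KEY = "hadoop.security.dns.interface";
    static final String DNS_NAMESERVER_KEY = "hadoop.security.dns.nameserver";

    // Returns a description of where the hostname would come from.
    static String hostnameSource(Map<String, String> conf) {
        if (conf != null) {
            String dnsInterface = conf.get(DNS_INTERFACE_KEY);
            String nameServer = conf.get(DNS_NAMESERVER_KEY);
            if (dnsInterface != null) {
                String via = (nameServer != null)
                        ? "nameserver " + nameServer
                        : "no explicit nameserver";
                return "reverse lookup of the addresses on " + dnsInterface + ", " + via;
            } else if (nameServer != null) {
                // A nameserver alone is rejected: the interface must be set as well.
                throw new IllegalArgumentException(
                        DNS_NAMESERVER_KEY + " requires " + DNS_INTERFACE_KEY);
            }
        }
        return "fallback: InetAddress.getLocalHost().getCanonicalHostName()";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(hostnameSource(conf));
        conf.put(DNS_INTERFACE_KEY, "bond1");
        conf.put(DNS_NAMESERVER_KEY, "192.167.1.246");
        System.out.println(hostnameSource(conf));
    }
}
```

The point to take away is that setting only the nameserver is an error, and that once the interface is set the result depends on a reverse lookup rather than on the local hostname.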
Since we have found the method that supplies the hostname for the principal, let's fetch that value by hand:
import java.io.IOException;
import java.net.InetAddress;
import java.net.UnknownHostException;
import org.apache.hadoop.net.DNS;

public class Address {
  public static void main(String[] args) throws IOException {
    try {
      InetAddress inetAddress = InetAddress.getLocalHost(); // the local host
      String hostName = inetAddress.getHostName(); // local host name
      String canonicalHostName = inetAddress.getCanonicalHostName(); // fully qualified domain name for this IP address
      String[] addresses = DNS.getIPs("bond1"); // IP addresses bound to the bond1 interface
      for (int ctr = 0; ctr < addresses.length; ctr++) {
        System.out.println("IP address on bond1: " + addresses[ctr]);
      }
      String x = InetAddress.getByName(addresses[0]).getHostAddress();
      System.out.println("First IP address on bond1: " + x);
      // Hostname for bond1 as resolved through the given nameserver
      String s = DNS.getDefaultHost("bond1", "192.167.1.246");
      System.out.println("DNS.getDefaultHost: " + s);
      byte[] address = inetAddress.getAddress(); // raw IP address bytes
      String hostAddress = inetAddress.getHostAddress(); // local IP address as text
      boolean reachable = inetAddress.isReachable(2000); // reachable within 2 seconds?
      System.out.println(inetAddress.toString());
      System.out.println("Host name: " + hostName);
      System.out.println("Fully qualified domain name for this IP address: " + canonicalHostName);
      // Java bytes are signed, so mask every octet with 0xFF before printing
      System.out.println("Raw IP address: " + (address[0] & 0xFF) + "." + (address[1] & 0xFF)
          + "." + (address[2] & 0xFF) + "." + (address[3] & 0xFF));
      System.out.println("IP address: " + hostAddress);
      System.out.println("Reachable: " + reachable);
    } catch (UnknownHostException e) {
      e.printStackTrace();
    }
  }
}
DNS.getDefaultHost did not return the value we expected.
Reading further, we found a key piece of code inside it:
public static String[] getHosts(String strInterface,
                                @Nullable String nameserver,
                                boolean tryfallbackResolution)
    throws UnknownHostException {
  final List<String> hosts = new Vector<String>();
  final List<InetAddress> addresses =
      getIPsAsInetAddressList(strInterface, true);
  for (InetAddress address : addresses) {
    try {
      hosts.add(reverseDns(address, nameserver));
    } catch (NamingException ignored) {
    }
  }
  if (hosts.isEmpty() && tryfallbackResolution) {
    for (InetAddress address : addresses) {
      final String canonicalHostName = address.getCanonicalHostName();
      // Don't use the result if it looks like an IP address.
      if (!InetAddresses.isInetAddress(canonicalHostName)) {
        hosts.add(canonicalHostName);
      }
    }
  }
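So reverse resolution is required, and it turned out we had not configured it. Once reverse resolution was configured, the settings took effect.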
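Whether a reverse record exists can be checked from a node without any Hadoop jars. The sketch below builds the in-addr.arpa name for an IPv4 address and asks a nameserver for its PTR record through the JDK's JNDI DNS provider, which, as far as we can tell, is roughly what reverseDns does internally. The names PtrLookupSketch, reverseName and lookupPtr are made up for illustration, and the IP address and nameserver passed on the command line are yours to supply.

```java
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper: checks whether an IPv4 address has a PTR record.
final class PtrLookupSketch {
    // "192.167.1.246" becomes "246.1.167.192.in-addr.arpa"
    static String reverseName(String ipv4) {
        String[] p = ipv4.split("\\.");
        if (p.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        return p[3] + "." + p[2] + "." + p[1] + "." + p[0] + ".in-addr.arpa";
    }

    // Queries the PTR record; a null nameserver means the system default is used.
    static String lookupPtr(String ipv4, String nameserver) throws NamingException {
        DirContext ctx = new InitialDirContext();
        try {
            String url = "dns://" + (nameserver == null ? "" : nameserver)
                    + "/" + reverseName(ipv4);
            Attributes attrs = ctx.getAttributes(url, new String[] {"PTR"});
            Attribute ptr = attrs.get("PTR");
            if (ptr == null) {
                throw new NamingException("no PTR record for " + ipv4);
            }
            String host = ptr.get().toString();
            // DNS answers are usually fully qualified with a trailing dot; drop it.
            return host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        if (args.length < 1) {
            System.out.println("usage: java PtrLookupSketch <ipv4> [nameserver]");
            return;
        }
        String nameserver = args.length > 1 ? args[1] : null;
        System.out.println(reverseName(args[0]) + " -> " + lookupPtr(args[0], nameserver));
    }
}
```

Run it with the node's IP address and the nameserver you configured. If it throws a NamingException, the reverse zone is most likely missing the record, which matches the situation described above.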
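The debugging could have been done with IntelliJ remote debugging, but the production environment is network-isolated and setting up a test environment is a hassle, so we wrote code by hand to simulate the process instead.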