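Note: This article is about troubleshooting DNS. I could find almost no source online (including the Red Hat Knowledgebase) that describes the error discussed here directly and offers a quick, reliable fix, so I wrote this up after nearly an hour of troubleshooting and reading documentation.
Yesterday I set up a DNS server on Red Hat Enterprise Linux Server in a non-production environment for testing, and quickly ran into a strange error.
When I ran "service named status", the first line of the output was the following: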
[root@localhost ~]# service named status
rndc: connect failed: 127.0.0.1#953: connection refused
named (pid 6207) is running...
[root@localhost ~]#
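As most readers know, rndc is the utility used to control the named process: it connects to the DNS server's control channel, which listens on port 953 by default, and can tell named to reload its configuration. So what could cause this error?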
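Since "connection refused" usually means that nothing is accepting connections on the port, one quick check is whether any process is listening on 953. The following is a minimal sketch, not part of BIND: the function name has_listener is hypothetical, and it assumes the socket listing comes from "netstat -tln" or "ss -ltn", whose local-address column ends in ":<port>".

```shell
#!/bin/sh
# Sketch (hypothetical helper): report whether a TCP listener exists on a port.
# Reads a socket listing, as printed by `netstat -tln` or `ss -ltn`, on stdin.
has_listener() {
    port="$1"
    # Look for an address column that ends in ":<port>" followed by whitespace.
    if grep -Eq ":${port}[[:space:]]"; then
        echo "listening"
    else
        echo "not listening"
    fi
}
```

Typical use would be "netstat -tln | has_listener 953". If named failed to open its control channel, this should print "not listening", which is consistent with the rndc error above.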
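My troubleshooting approach:
First, identify the problem by reading the command output carefully. For example, I looked at the status of all services in detail.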
[root@localhost gdd]# service --status-all
abrtd (pid 2371) is running...
abrt-dump-oops (pid 2379) is running...
acpid (pid 2111) is running...
atd (pid 5396) is running...
auditd (pid 1833) is running...
automount (pid 2195) is running...
avahi-daemon (pid 2016) is running...
Usage: /etc/init.d/bluetooth {start|stop}
certmonger is stopped
Stopped
cgred is stopped
Frequency scaling enabled using ondemand governor
crond (pid 2423) is running...
cupsd (pid 2086) is running...
dnsmasq is stopped
dovecot is stopped
Usage: /etc/init.d/firstboot {start|stop}
hald (pid 2120) is running...
I don't know of any running hsqldb server.
httpd (pid 6595) is running...
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all ::/0 ::/0 state RELATED,ESTABLISHED
2 ACCEPT icmpv6 ::/0 ::/0
3 ACCEPT all ::/0 ::/0
4 ACCEPT tcp ::/0 ::/0 state NEW tcp dpt:22
5 REJECT all ::/0 ::/0 reject-with icmp6-adm-prohibited
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all ::/0 ::/0 reject-with icmp6-adm-prohibited
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
IPsec stopped
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:953
5 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:53
6 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:443
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
8 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Table: mangle
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
Chain INPUT (policy ACCEPT)
num target prot opt source destination
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
Table: nat
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
irqbalance (pid 1895) is running...
Kdump is operational
started
qpidd is stopped
matahari-qmf-hostd is stopped
matahari-qmf-networkd is stopped
matahari-qmf-serviced is stopped
matahari-qmf-sysconfigd is stopped
Checking for mcelog
mcelog is stopped
mdmonitor is stopped
messagebus (pid 1993) is running...
mysqld is stopped
rndc: connect failed: 127.0.0.1#953: connection refused
named is stopped
No open transaction
netconsole module not loaded
Configured devices:
lo eth0
Currently active devices:
lo eth0
NetworkManager (pid 2004) is running...
rpc.svcgssd is stopped
rpc.mountd is stopped
nfsd is stopped
rpc.rquotad is stopped
rpc.statd (pid 2037) is running...
nmbd is stopped
ntpd (pid 2243) is running...
oddjobd is stopped
portreserve (pid 1851) is running...
master (pid 2347) is running...
postmaster is stopped
Process accounting is disabled.
qpidd (pid 2390) is running...
quota_nld is stopped
rdisc is stopped
restorecond (pid 10836) is running...
rhnsd (pid 2445) is running...
rhsmcertd (pid 2457 2456) is running...
rngd is stopped
rpcbind (pid 1909) is running...
rpc.gssd is stopped
rpc.idmapd (pid 2076) is running...
rpc.svcgssd is stopped
rsyslogd (pid 1858) is running...
sandbox is stopped
saslauthd is stopped
sfcb is not running, but pid file exists
smartd is stopped
smbd is stopped
snmpd is stopped
snmptrapd is stopped
spamd is stopped
spice-vdagentd is stopped
openssh-daemon (pid 2233) is running...
sssd is stopped
CIM server (2470) is runningtomcat6 is stopped [ OK ]
vsftpd is stopped
wdaemon is stopped
Webmin (pid 2498) is running
wpa_supplicant (pid 2020) is running...
ypbind is stopped
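Clearly, these two lines in the output above
rndc: connect failed: 127.0.0.1#953: connection refused
named is stopped
indicate an error.
Next I ran named in the foreground with the -g option, which sends all of its log messages to the terminal. The output was as follows: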
[root@localhost ~]# named -g
28-Mar-2012 13:27:58.722 starting BIND 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2 -g
28-Mar-2012 13:27:58.722 built with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-libtool' '--localstatedir=/var' '--enable-threads' '--enable-ipv6' '--with-pic' '--disable-static' '--disable-openssl-version-check' '--with-dlz-ldap=yes' '--with-dlz-postgres=yes' '--with-dlz-mysql=yes' '--with-dlz-filesystem=yes' '--with-gssapi=yes' '--disable-isc-spnego' '--with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' 'CPPFLAGS= -DDIG_SIGCHASE'
28-Mar-2012 13:27:58.722 adjusted limit on open files from 1024 to 1048576
28-Mar-2012 13:27:58.722 found 2 CPUs, using 2 worker threads
28-Mar-2012 13:27:58.723 using up to 4096 sockets
28-Mar-2012 13:27:58.734 loading configuration from '/etc/named.conf'
28-Mar-2012 13:27:58.735 reading built-in trusted keys from file '/etc/named.iscdlv.key'
28-Mar-2012 13:27:58.736 using default UDP/IPv4 port range: [1024, 65535]
28-Mar-2012 13:27:58.737 using default UDP/IPv6 port range: [1024, 65535]
28-Mar-2012 13:27:58.740 listening on IPv4 interface lo, 127.0.0.1#53
28-Mar-2012 13:27:58.744 binding TCP socket: address in use
28-Mar-2012 13:27:58.744 listening on IPv6 interface lo, ::1#53
28-Mar-2012 13:27:58.745 binding TCP socket: address in use
28-Mar-2012 13:27:58.747 could not open file '/var/run/named/named.pid': Permission denied
28-Mar-2012 13:27:58.747 generating session key for dynamic DNS
28-Mar-2012 13:27:58.747 could not open file '/var/run/named/session.key': Permission denied
28-Mar-2012 13:27:58.747 could not create /var/run/named/session.key
28-Mar-2012 13:27:58.747 failed to generate session key for dynamic DNS: permission denied
28-Mar-2012 13:27:58.753 using built-in trusted-keys for view _default
28-Mar-2012 13:27:58.754 set up managed keys zone for view _default, file 'dynamic/managed-keys.bind'
28-Mar-2012 13:27:58.754 automatic empty zone: 127.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 254.169.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 2.0.192.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 100.51.198.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 113.0.203.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 255.255.255.255.IN-ADDR.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: D.F.IP6.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 8.E.F.IP6.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: 9.E.F.IP6.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: A.E.F.IP6.ARPA
28-Mar-2012 13:27:58.754 automatic empty zone: B.E.F.IP6.ARPA
28-Mar-2012 13:27:58.755 automatic empty zone: 8.B.D.0.1.0.0.2.IP6.ARPA
28-Mar-2012 13:27:58.759 none:0: open: /etc/rndc.key: file not found
28-Mar-2012 13:27:58.760 couldn't add command channel 127.0.0.1#953: file not found
28-Mar-2012 13:27:58.760 none:0: open: /etc/rndc.key: file not found
28-Mar-2012 13:27:58.760 couldn't add command channel ::1#953: file not found
28-Mar-2012 13:27:58.760 ignoring config file logging statement due to -g option
28-Mar-2012 13:27:58.761 zone 0.in-addr.arpa/IN: loaded serial 0
28-Mar-2012 13:27:58.762 zone 1.0.0.127.in-addr.arpa/IN: loaded serial 0
28-Mar-2012 13:27:58.764 zone 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN: loaded serial 0
28-Mar-2012 13:27:58.765 zone localhost.localdomain/IN: loaded serial 0
28-Mar-2012 13:27:58.766 zone localhost/IN: loaded serial 0
28-Mar-2012 13:27:58.766 managed-keys-zone ./IN: loading from master file dynamic/managed-keys.bind failed: permission denied
28-Mar-2012 13:27:58.766 dynamic/managed-keys.bind.jnl: open: permission denied
28-Mar-2012 13:27:58.766 managed-keys-zone ./IN: journal rollforward failed: unexpected error
28-Mar-2012 13:27:58.767 running
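The "/etc/rndc.key: file not found" and "permission denied" messages in this output suggest that the cause is probably either a permissions problem or a configuration file error, so I checked each in turn.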
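Because the "named -g" output is long, it can help to filter it down to the lines that look like failures. The sketch below assumes the log has been saved to a file or is piped in; the function name named_log_problems and the pattern list are my own illustrative choices, not anything defined by BIND, and the patterns cover only the kinds of messages seen in the output above.

```shell
#!/bin/sh
# Sketch (hypothetical helper): keep only the lines of a `named -g` log that
# look like failures. Reads the files given as arguments, or stdin if none.
named_log_problems() {
    grep -Ei "permission denied|file not found|address in use|couldn't add command channel" "$@"
}
```

Since "named -g" writes its log to stderr, one way to use this is to capture the output with "named -g 2> /tmp/named-g.log", stop named with Ctrl+C, and then run "named_log_problems /tmp/named-g.log". The path /tmp/named-g.log is only an example.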
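First, it is not a permissions problem. Moving or creating files and directories while logged in as root can easily cause permission problems, so I checked all of the DNS-related configuration files and directories. The listings are shown below for reference in case you hit a similar error.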
[root@localhost ~]# ls /var/named/ -al
total 40
drwxr-x---. 6 root named 4096 Mar 28 13:05 .
drwxr-xr-x. 28 root root 4096 Mar 28 13:44 ..
drwxr-x---. 6 root named 4096 Mar 28 13:05 chroot
drwxrwx---. 2 named named 4096 Mar 28 13:23 data
drwxrwx---. 2 named named 4096 Mar 28 15:24 dynamic
-rw-r-----. 1 root named 1892 Feb 18 2008 named.ca
-rw-r-----. 1 root named 152 Dec 15 2009 named.empty
-rw-r-----. 1 root named 152 Jun 21 2007 named.localhost
-rw-r-----. 1 root named 168 Dec 15 2009 named.loopback
drwxrwx---. 2 named named 4096 Dec 20 23:53 slaves
[root@localhost ~]# ls /var/named/chroot/ -al
total 24
drwxr-x---. 6 root named 4096 Mar 28 13:05 .
drwxr-x---. 6 root named 4096 Mar 28 13:05 ..
drwxr-x---. 2 root named 4096 Mar 28 13:05 dev
drwxr-x---. 4 root named 4096 Mar 28 14:32 etc
drwxr-xr-x. 3 root root 4096 Mar 28 13:05 usr
drwxr-x---. 6 root named 4096 Mar 28 13:05 var
[root@localhost ~]# ls /var/named/chroot/etc/ -al
total 40
drwxr-x---. 4 root named 4096 Mar 28 14:32 .
drwxr-x---. 6 root named 4096 Mar 28 13:05 ..
-rw-r--r--. 1 root root 405 Oct 19 22:00 localtime
drwxr-x---. 2 root named 4096 Dec 20 23:53 named
-rw-r-----. 1 root named 1259 Mar 28 14:31 named.conf
-rw-r--r--. 1 root named 2544 Dec 20 23:53 named.iscdlv.key
-rw-r-----. 1 root named 931 Jun 21 2007 named.rfc1912.zones
-rw-r--r--. 1 root named 487 Dec 20 23:53 named.root.key
drwxr-xr-x. 3 root root 4096 Mar 28 13:05 pki
-rw-------. 1 root root 479 Mar 27 23:46 rndc.conf
[root@localhost ~]# ls /var/named/chroot/var -al
total 24
drwxr-x---. 6 root named 4096 Mar 28 13:05 .
drwxr-x---. 6 root named 4096 Mar 28 13:05 ..
drwxrwx---. 2 named named 4096 Dec 20 23:53 log
drwxr-x---. 6 root named 4096 Mar 28 13:05 named
drwxr-x---. 3 root named 4096 Mar 28 13:05 run
drwxrwx---. 2 named named 4096 Dec 20 23:53 tmp
[root@localhost ~]# ls /etc/named* -al
-rw-r-----. 1 root named 1259 Mar 28 14:31 /etc/named.conf
-rw-r-----. 1 root root 930 Mar 28 13:41 /etc/named.conf.backup
-rw-r--r--. 1 root named 2544 Dec 20 23:53 /etc/named.iscdlv.key
-rw-r-----. 1 root named 931 Jun 21 2007 /etc/named.rfc1912.zones
-rw-r--r--. 1 root named 487 Dec 20 23:53 /etc/named.root.key
/etc/named:
total 16
drwxr-x---. 2 root named 4096 Dec 20 23:53 .
drwxr-xr-x. 131 root root 12288 Mar 28 14:32 ..
[root@localhost ~]# ls /etc/rndc.* -al
-rw-------. 1 root root 479 Mar 27 23:46 /etc/rndc.conf
-rw-------. 1 root root 479 Mar 28 13:42 /etc/rndc.conf.backup
-rw-------. 1 root root 479 Mar 27 23:10 /etc/rndc.conf.original
-rw-------. 1 root root 479 Mar 27 23:46 /etc/rndc.conf.original_1_error_secret
-rw-------. 1 root root 510 Mar 27 23:43 /etc/rndc.key.removed_no_need
-rw-------. 1 root root 511 Mar 27 23:50 /etc/rndc.key.removed_no_need_1
[root@localhost ~]#
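Comparing these with my earlier backups showed nothing wrong with the permissions.
PS: If you do run into a problem of this kind, use the following commands to fix it.
su -
chown -R root:named /directory/directory/file
So if permissions are not the problem, could the iptables rules be wrong?
Checking the iptables configuration shows the following: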
[root@localhost ~]# service iptables status
Table: nat
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Table: mangle
Chain PREROUTING (policy ACCEPT)
num target prot opt source destination
Chain INPUT (policy ACCEPT)
num target prot opt source destination
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
num target prot opt source destination
Table: filter
Chain INPUT (policy ACCEPT)
num target prot opt source destination
1 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
2 ACCEPT icmp -- 0.0.0.0/0 0.0.0.0/0
3 ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
4 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:953
5 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:53
6 ACCEPT tcp -- 10.0.0.0/8 0.0.0.0/0 tcp dpt:443
7 ACCEPT tcp -- 0.0.0.0/0 0.0.0.0/0 state NEW tcp dpt:22
8 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
num target prot opt source destination
1 REJECT all -- 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
num target prot opt source destination
[root@localhost ~]#
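Clearly the iptables configuration is not the problem. Besides, if an iptables rule were blocking access, the error message would not look like the one above: a REJECT rule with icmp-host-prohibited, as used here, would not normally produce "connection refused", which typically means that nothing is listening on the port.
In the end I concluded that the problem was probably in the /etc/named.conf configuration file.
So I checked the configuration files; the commands and output were as follows: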
[root@localhost ~]# named-checkconf /etc/named.conf
[root@localhost ~]# named-checkconf -t /var/named/chroot/
[root@localhost ~]#
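This shows that the syntax is fine. That made me wonder whether /etc/named.conf or /etc/rndc.conf was misconfigured in some other way. A freshly installed DNS server should not have a problem with the key itself, so I went through /etc/named.conf and indeed found nothing wrong. Then I looked at /etc/rndc.conf and finally found the problem.
Its contents are as follows: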
[root@localhost ~]# cat /etc/rndc.conf
# Start of rndc.conf
key "rndc-key" {
algorithm hmac-md5;
secret "cK1Bt77B8kL9uLpxy4GDTg==";
};
options {
default-key "rndc-key";
default-server 127.0.0.1;
default-port 953;
};
# End of rndc.conf
# Use with the following in named.conf, adjusting the allow list as needed:
# key "rndc-key" {
# algorithm hmac-md5;
# secret "cK1Bt77B8kL9uLpxy4GDTg==";
# };
#
# controls {
# inet 127.0.0.1 port 953
# allow { 127.0.0.1; } keys { "rndc-key"; };
# };
# End of named.conf
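The comments at the end make it clear: to use rndc with this rndc.conf, the matching key and controls statements must be configured in /etc/named.conf.
So /etc/named.conf needs to be changed from the first listing below to the second.
First listing: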
[root@localhost ~]# cat /etc/named.conf
//
// named.conf
//
// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
options {
listen-on port 53 { 127.0.0.1; };
listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
allow-query { localhost; };
recursion yes;
dnssec-enable yes;
dnssec-validation yes;
dnssec-lookaside auto;
/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";
};
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
zone "." IN {
type hint;
file "named.ca";
};
include "/etc/named.rfc1912.zones";
Second listing:
[root@localhost ~]# cat /etc/named.conf
//
// named.conf
//
// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//
options {
listen-on port 53 { 127.0.0.1; };
listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
allow-query { localhost; };
recursion yes;
dnssec-enable yes;
dnssec-validation yes;
dnssec-lookaside auto;
/* Path to ISC DLV key */
bindkeys-file "/etc/named.iscdlv.key";
};
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
zone "." IN {
type hint;
file "named.ca";
};
include "/etc/named.rfc1912.zones";
# Add line to enable named working with "/etc/rndc.conf"
# Use with the following in named.conf, adjusting the allow list as needed:
key "rndc-key" {
algorithm hmac-md5;
secret "cK1Bt77B8kL9uLpxy4GDTg==";
};
controls {
inet 127.0.0.1 port 953
allow { 127.0.0.1; } keys { "rndc-key"; };
};
# End of named.conf
[root@localhost ~]#
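The key in /etc/named.conf has to carry the same secret as the one in /etc/rndc.conf, and a copy-and-paste slip is easy to miss by eye. The following is a rough sketch of a comparison, assuming each file contains a single active key. The function names active_secrets and secrets_match are hypothetical, and the comment handling is deliberately simple: it skips lines that begin with # or // but does not understand /* ... */ blocks.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): compare the active secret in two BIND-style
# configuration files, ignoring lines that start with a # or // comment.
active_secrets() {
    # Drop comment lines, then print the quoted value after the word "secret".
    grep -Ev '^[[:space:]]*(#|//)' "$1" |
        sed -n 's/.*secret[[:space:]]*"\([^"]*\)".*/\1/p'
}

secrets_match() {
    a=$(active_secrets "$1")
    b=$(active_secrets "$2")
    # An empty result means no active key was found, which counts as a mismatch.
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}
```

Run as root (both files are readable only by root or the named group), for example "secrets_match /etc/rndc.conf /etc/named.conf". With the first listing this would be expected to report a mismatch, because that version of named.conf has no key statement at all.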
Finally, restart the named daemon:
su -
service named restart
service named status
Output like the following means the problem is fixed.
[root@localhost ~]# service named status
version: 9.7.3-P3-RedHat-9.7.3-8.P3.el6_2.2
CPUs found: 2
worker threads: 2
number of zones: 19
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running
named (pid 11918) is running...
[root@localhost ~]#
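If this check needs to be repeated, for instance after each configuration change on a test machine, the status text can be classified automatically. This is only a sketch based on the two outputs shown in this article. The function name named_status_verdict is hypothetical, and the strings it looks for are taken from the output above, so other BIND versions may word them differently.

```shell
#!/bin/sh
# Sketch (hypothetical helper): classify `service named status` text on stdin.
named_status_verdict() {
    text=$(cat)
    case "$text" in
        # The rndc error is tested first: named may be running while rndc fails.
        *"rndc: connect failed"*) echo "control channel unreachable" ;;
        *"server is up and running"*) echo "ok" ;;
        *) echo "unknown" ;;
    esac
}
```

It would be used as "service named status 2>&1 | named_status_verdict".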