Infrastructure Errors

Consolidated error log

I. rsync


1. not a regular file

By default scp copies only regular files, much like cp.
Add the -r option to copy a directory.

[12:37 root@backup ~]# scp /etc 172.16.1.31:nfs01/tmp/
root@172.16.1.31's password: 
/etc: not a regular file

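2. cannot delete non-empty directory

The error is caused by symbolic links (symlinks). The destination has probably been backed up to before, so paths such as etc/init.d already exist there as non-empty directories and rsync cannot replace them with the new symlinks.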

[12:39 root@backup ~]# rsync -av /etc 172.16.1.31:/tmp/
root@172.16.1.31's password: 
sending incremental file list
cannot delete non-empty directory: etc/init.d
could not make way for new symlink: etc/init.d
cannot delete non-empty directory: etc/rc0.d
could not make way for new symlink: etc/rc0.d
.....
..

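3. Connection refused

Check that the target server can be pinged and that its address is correct. "Connection refused" usually means nothing is listening on the port (SSH port 22 here). Note that the address in this example, 176.16.1.31, differs from the 172.16.1.x addresses used everywhere else and may be a typo.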

[12:41 root@backup ~]# rsync -avz /etc 176.16.1.31:/tmp
ssh: connect to host 176.16.1.31 port 22: Connection refused
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]

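4. auth failed on module data

Authentication for the data module failed (a password problem).

Possible causes:
1. The password is wrong.
2. The password file does not exist.
3. The password file has the wrong permissions.
4. The directory for the data module has not been created (this normally shows up as "chdir failed"; see item 10).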

[root 16:15 @ backup ~]# rsync -avz /etc/hosts [email protected]::data
Password: 
@ERROR: auth failed on module data
rsync error: error starting client-server protocol (code 5) at main.c(1648) [sender=3.1.2]
Cause: the password file permissions had not been changed to 600.
[root 16:25 @ backup ~]# ll /etc/rsync.password 
-rw-r--r-- 1 root root 20 May 20 16:05 /etc/rsync.password

Fix: change the file permissions to 600.
[root 16:25 @ backup ~]# chmod 600  /etc/rsync.password 
[root 16:25 @ backup ~]# ll /etc/rsync.password 
-rw------- 1 root root 20 May 20 16:05 /etc/rsync.password
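
The manual check above can be wrapped in a small helper. This is a minimal sketch, not part of rsync: the function name check_secret_file and its output strings are made up for illustration, and it assumes GNU stat (Linux).

```shell
#!/bin/sh
# check_secret_file FILE: report whether FILE looks usable as an rsync
# password file. Hypothetical helper; prints "ok", "missing" or "bad-mode:<mode>".
check_secret_file() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing"
        return 1
    fi
    # stat -c is the GNU coreutils form; %a prints the octal permission bits.
    mode=$(stat -c '%a' "$file")
    if [ "$mode" != "600" ]; then
        echo "bad-mode:$mode"
        return 1
    fi
    echo "ok"
}

# Example (path taken from the transcript above):
# check_secret_file /etc/rsync.password
```

Run it on both the client and the server, since each side has its own password file.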

5. Unknown module 'data'

The client has no permission to access the module.

[17:45 root@nfs01 ~]# rsync -avz /etc/hostname [email protected]::data --password-file=/etc/rsync.password
@ERROR: Unknown module 'data'
rsync error: error starting client-server protocol (code 5) at main.c(1648) [sender=3.1.2]

On the server, check rsyncd.conf to see which hosts are allowed to connect:

hosts allow = 172.16.1.0/24
#hosts deny = 0.0.0.0/32

6. secrets file must be owned by root

A password file problem. The hint is in the rsync daemon log:

2019/05/20 16:52:32 [15755] secrets file must be owned by root when running as root (see strict modes)
2019/05/20 16:52:32 [15755] auth failed on module data from backup (172.16.1.41) for rsync_backup: ignoring secrets file

 secrets file must be owned by root when running as root (see strict modes)

When rsync runs as root, the password file must be owned by root.

[root@backup ~]# ll /etc/rsync.password 
-rw------- 1 rsync rsync 20 May 20 16:49 /etc/rsync.password

7. read error: Connection reset by peer

Caused by an error in the configuration file.

[root@backup ~]# rsync -avz /etc/hosts [email protected]::data
sending incremental file list
rsync: read error: Connection reset by peer (104)
rsync error: error in socket IO (code 10) at io.c(785) [sender=3.1.2]

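8. password file must not be other-accessible

The password file given with --password-file can be read by other users; restrict it with chmod 600.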

rsync -avz /etc/sysconfig/ [email protected]::backup --password-file=/etc/rsync.password  
ERROR: password file must not be other-accessible
rsync error: syntax or usage error (code 1) at authenticate.c(196) [sender=3.1.2]

9. Unknown module 'data'

@ERROR: Unknown module 'data'

The server log shows that the client address was denied:

2019/05/20 17:45:46 [10514] rsync denied on module data from UNKNOWN (10.0.0.31)

10. @ERROR: chdir failed

The backup directory for the data module has not been created.

[20:35 root@backup ~]# rsync -avz /etc/hosts [email protected]::data
Password: 
@ERROR: chdir failed
rsync error: error starting client-server protocol (code 5) at main.c(1648) [sender=3.1.2]

Fix:

[20:35 root@backup ~]# mkdir -p /data
[20:38 root@backup ~]# chown rsync.rsync /data/
[20:39 root@backup ~]# ll -d /data/
drwxr-xr-x 2 rsync rsync 6 May 20 20:38 /data/
[20:39 root@backup ~]# rsync -avz /etc/hosts [email protected]::data
Password: 
sending incremental file list
hosts

sent 140 bytes  received 43 bytes  52.29 bytes/sec
total size is 158  speedup is 0.86

11. invalid uid rsync

The receiving server probably does not have the rsync virtual user.

[09:40 root@nfs01 ~]# rsync -avz /etc/profile [email protected]::data --password-file=/etc/rsync.password 
@ERROR: invalid uid rsync
rsync error: error starting client-server protocol (code 5) at main.c(1648) [sender=3.1.2]

Check on the receiving server:

[09:41 root@backup ~]# id rsync
id: rsync: no such user   \\ the user does not exist

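12. getcwd: cannot access parent directories

getcwd cannot determine the current working directory. This usually happens when you cd into a directory and that directory is then removed; running certain service scripts from there then fails with the getcwd error. Simply cd to any directory that actually exists and run the command again.

[root@nfs01 /backup/172.16.1.31]# sh /tmp/rsync.sh 
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory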
/usr/bin/tar: Removing leading `/' from member names
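
A script can guard against this before doing any work. The sketch below is only an illustration: ensure_cwd is a made-up helper name, and it relies on $PWD, the path the shell itself records for the current directory.

```shell
#!/bin/sh
# ensure_cwd: if the directory the shell believes it is in no longer exists
# (for example it was removed from another terminal), move to / instead.
# Hypothetical helper; a directory recreated under the same name is not detected.
ensure_cwd() {
    if [ ! -d "$PWD" ]; then
        cd / || return 1
    fi
}

# Example: call it at the top of a backup script.
# ensure_cwd
```

Calling it first means later commands such as tar start from a directory that exists.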

13. failed to create pid file /var/run/rsyncd.pid: File exists

The rsync daemon fails to start.

[13:25 root@backup ~]# rsync --daemon
[13:28 root@backup ~]# failed to create pid file /var/run/rsyncd.pid: File exists

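Fix:

Make sure no rsync daemon is still running, remove the leftover pid file with rm -rf /var/run/rsyncd.pid, then start the rsync service again (rsync --daemon).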

[13:28 root@backup ~]# rm -rf /var/run/rsyncd.pid 
[13:29 root@backup ~]# rsync --daemon
[13:30 root@backup ~]# netstat -lutup|grep rsync
tcp        0      0 0.0.0.0:rsync           0.0.0.0:*               LISTEN      8585/rsync          
tcp6       0      0 [::]:rsync              [::]:*                  LISTEN      8585/rsync       
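
Before deleting a pid file it is worth confirming that it really is stale. A minimal sketch, assuming the script runs as root as in the transcript; pidfile_is_stale is a hypothetical helper, not an rsync feature.

```shell
#!/bin/sh
# pidfile_is_stale FILE: succeed (exit 0) when FILE exists but the PID stored
# in it does not belong to a running process, so the file is safe to remove.
# kill -0 sends no signal; it only tests whether the process can be signalled.
# As a non-root user it also fails for processes owned by someone else.
pidfile_is_stale() {
    file=$1
    [ -f "$file" ] || return 1          # no pid file at all: nothing is stale
    pid=$(cat "$file")
    case $pid in
        ''|*[!0-9]*) return 0 ;;        # empty or non-numeric content: treat as stale
    esac
    if kill -0 "$pid" 2>/dev/null; then
        return 1                        # process is alive: do not delete the file
    fi
    return 0
}

# Example (path taken from the error message above):
# pidfile_is_stale /var/run/rsyncd.pid && rm -f /var/run/rsyncd.pid
```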

14. auth failed on module backup

[13:36 root@backup ~]# rsync -avz /etc/hosts [email protected]::backup
Password: 
@ERROR: auth failed on module backup
rsync error: error starting client-server protocol (code 5) at main.c(1648) [sender=3.1.2]

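Check that the password files on both the server and the client have permissions 600.
Check that the password is correct.

Another nasty pitfall is a stray space in the password; delete the space and authentication works.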
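
Stray whitespace is hard to see in an editor, so a quick scan helps. This is a sketch under the assumption that a password never legitimately contains whitespace; has_stray_whitespace is a made-up helper name.

```shell
#!/bin/sh
# has_stray_whitespace FILE: succeed when any line of FILE contains a space,
# a tab or a carriage return (the last one appears when the file was edited
# on Windows). grep reads line by line, so the newline itself never matches.
has_stray_whitespace() {
    grep -q '[[:space:]]' "$1"
}

# Example (path taken from the transcripts above):
# has_stray_whitespace /etc/rsync.password && echo "whitespace found"
```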


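II. NFS service

1. access denied by server

/etc/exports only allows the 172.16.1.0/24 internal network, but the mount here uses 10.0.0.7, an address on the external network.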

[16:43 root@web01 ~]# mount -t nfs 10.0.0.7:/upload/ /video/
mount.nfs: access denied by server while mounting 10.0.0.7:/upload/

2. Checking the export list fails

[16:51 root@nfs01 ~]# showmount -e 172.16.1.31
clnt_create: RPC: Program not registered

Fix: restart the NFS service.

[16:51 root@nfs01 ~]# systemctl restart nfs
[16:52 root@nfs01 ~]# showmount -e 172.16.1.31
Export list for 172.16.1.31:
/upload 172.16.1.0/24

3. wrong fs type

Wrong file system type: the nfs file system type is not recognized
because nfs-utils is not installed.

[16:51 root@nfs01 ~]# mount -t nfs 172.16.1.31:/data /mnt/
mount: wrong fs type, bad option, bad superblock on 172.16.1.31:/data,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

4. Read-only file system

Creating a file in the directory where the NFS share is mounted fails (the share is mounted on /opt in this transcript):

[09:33 root@nfs01 /]# df -h
172.16.1.31:/data   19G  1.7G   18G   9% /opt
[09:33 root@nfs01 /]# touch /opt/1.txt
touch: cannot touch ‘/opt/1.txt’: Read-only file system

Cause: the share is exported read-only (ro) in /etc/exports. Change ro to rw and reload the NFS service.

[09:33 root@nfs01 /]# vim /etc/exports
#share /data
/data           172.16.1.0/24(ro,sync,all_squash)
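
To spot read-only shares quickly, the exports file can be scanned for the ro option. A rough sketch only: list_readonly_exports is a hypothetical helper, and the pattern assumes the simple "path client(options)" layout used above.

```shell
#!/bin/sh
# list_readonly_exports FILE: print the exported path of every non-comment
# line whose option list contains "ro" as a whole option.
# The pattern accepts "(ro," "(ro)" ",ro," and ",ro)" but not "root_squash".
list_readonly_exports() {
    awk '!/^[ \t]*#/ && /\((ro|[^)]*,ro)[,)]/ { print $1 }' "$1"
}

# Example:
# list_readonly_exports /etc/exports
```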

III. Ansible batch management

1. "src and content are mutually exclusive"

The src and content parameters of the copy module conflict with each other; use only one of them.

[22:08 root@m01 ~]# ansible all -m copy -a 'src=/etc/hostname dest=/tmp/lidao.txt content="oldboy.com"'
172.16.1.7 | FAILED! => {
    "changed": false, 
    "msg": "src and content are mutually exclusive"
}
172.16.1.41 | FAILED! => {
    "changed": false, 
    "msg": "src and content are mutually exclusive"
}
172.16.1.31 | FAILED! => {
    "changed": false, 
    "msg": "src and content are mutually exclusive"
}

2. state is mounted but all of the following are missing: fstype

The state is mounted, but the required fstype parameter is missing.
Add fstype=nfs.

[14:59 root@m01 /etc/ansible]# ansible 172.16.1.7 -m mount -a ' src=172.16.1.31:/nfs path=/mnt state=mounted'
172.16.1.7 | FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    }, 
    "changed": false, 
    "msg": "state is mounted but all of the following are missing: fstype"
}

3. Could not match supplied host pattern

This IP has not been added to the host list (inventory).

[12:24 root@m01 /etc/ansible]# ansible 172.16.1.8 -m yum -a "name=tree name=lrzsz state=present"
 [WARNING]: Could not match supplied host pattern, ignoring: 172.16.1.8

 [WARNING]: No hosts matched, nothing to do
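
A quick way to confirm the diagnosis is to look for the address in the inventory file. The sketch below assumes a plain INI-style inventory with one host per line; host_in_inventory is a made-up helper and does not understand host ranges or group patterns. Running ansible all --list-hosts shows what Ansible itself resolves.

```shell
#!/bin/sh
# host_in_inventory HOST FILE: succeed when HOST is the first field of a line
# in FILE. Lines such as "[web]" or "#172.16.1.8" never match, because their
# first field is not the bare host name.
host_in_inventory() {
    awk -v h="$1" '$1 == h { found = 1 } END { exit !found }' "$2"
}

# Example (default inventory path; adjust if yours differs):
# host_in_inventory 172.16.1.8 /etc/ansible/hosts || echo "not in inventory"
```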

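4. Wrong variable format

The variable is written in the wrong format; it needs spaces inside the braces or double quotes around the expression.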

We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:

    with_items:
      - {{ foo }}

Should be written as:

    with_items:
      - "{{ foo }}"

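Error message glossary

/etc: not a regular file

The path is not a regular file (here it is a directory).

ssh: connect to host 176.16.1.31 port 22: Connection refused

The connection to the host's port was refused.

@ERROR: auth failed on module data

Authentication failed for the module named data.

secrets file must be owned by root when running as root (see strict modes)

When rsync runs as root, the password file must be owned by root (and should have permissions 600).

@ERROR: Unknown module 'data'

The module is unknown.

Name or service not known

The host name or service cannot be resolved.

password mismatch

The passwords do not match.

permission denied

Permission is denied.

remote command not found

The command was not found on the remote host.

wrong fs type

The file system type is wrong.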

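SSH errors

Background

The goal was to generate a key pair on m01 and copy the public key to the backup server, so that m01 could manage backup without a password.

The steps were as follows: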

[root@m01 ~]# ssh-keygen -t dsa 
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa): 
/root/.ssh/id_dsa already exists.
Overwrite (y/n)? 
[root@m01 ~]# ssh-copy-id -i .ssh/id_dsa.pub 172.16.1.41
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: ".ssh/id_dsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@172.16.1.41's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh '172.16.1.41'"
and check to make sure that only the key(s) you wanted were added.
Problem: running a command over ssh still prompts for a password
[root@m01 ~]# ssh 172.16.1.41 hostname
root@172.16.1.41's password: 
backup
Troubleshooting

1. Checked the backup server: the public key had been copied over.

[root@backup ~]# ll .ssh/
total 4
-rw------- 1 root root 598 May 27 19:28 authorized_keys
2. Copying the public key to web01 in the same way gave passwordless login
[root@m01 ~]# ssh 172.16.1.7 hostname
web01

So the problem is on the backup side.

3. Checked root's home directory and found the ownership wrong (rsync instead of root)
[root@backup ~]# ll -d /root
dr-xr-x---. 4 rsync rsync 262 May 27 19:03 /root

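Then I remembered that this was a fault left over from setting up the rsync service.

4. After changing the owner of the home directory back to root, passwordless login worked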
[root@m01 ~]# ssh 172.16.1.41 hostname
backup
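
With StrictModes enabled (the sshd default), sshd ignores authorized_keys when the home directory or the key files are owned by the wrong user. The ownership check from step 3 can be sketched as below; owner_mismatch is a hypothetical helper and assumes GNU stat.

```shell
#!/bin/sh
# owner_mismatch PATH USER: succeed when PATH is NOT owned by USER.
# Returns 2 when PATH cannot be inspected at all.
owner_mismatch() {
    owner=$(stat -c '%U' "$1") || return 2
    [ "$owner" != "$2" ]
}

# Example: report every key-related path under root's home with a wrong owner.
# for p in /root /root/.ssh /root/.ssh/authorized_keys; do
#     owner_mismatch "$p" root && echo "wrong owner: $p"
# done
```

Any path it reports can be corrected with chown, as was done for /root in step 4.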

Nginx errors

1. Connection refused: the Nginx service is not running
2. unexpected "}" in /etc/nginx/nginx.conf: syntax error in the configuration file
3. Using a domain name on Linux or Windows: the hosts file does not resolve the domain
4. is not terminated by: the directive does not end with ...
5. 404 Not Found: the site directory may not have been created, or the site directory is wrong in nginx.conf
6. 403 Forbidden: permission denied (case 1)
7. 403 Forbidden: the index file does not exist (case 2)
8. Cannot assign requested address: the requested IP address cannot be assigned
9. 304 Not Modified: (informational) the client used its browser cache
10. Address already in use
11. Connection refused
12. conflicting server name
13. 500 Internal Server Error
14. Error establishing a database connection
15. Can't connect to local MySQL server through socket: cannot connect to the local database
16. Unit is masked: the nginx service has been removed
17. bind() to 10.0.0.4:80 failed (99: Cannot assign requested address): binding the IP 10.0.0.4 on this machine failed

Error: Connection refused

[root@web01 /etc/nginx]# curl www.yawei.com
curl: (7) Failed connect to www.yawei.com:80; Connection refused

Check whether the service is running.

Address already in use
Nginx is already running.

[root@web01 ~]# nginx
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] still could not bind()

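Running the nginx command starts the Nginx service, so running it again while Nginx is already up fails to bind port 80.

Errors when starting or restarting Nginx

To see the detailed Nginx error, check the syntax with nginx -t.
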
[root@web-204 html]# systemctl start nginx
Job for nginx.service failed because the control process exited with error code. See "systemctl status nginx.service" and "journalctl -xe" for details.

[root@web-204 html]# nginx -t
nginx: [emerg] unexpected "}" in /etc/nginx/nginx.conf:49
nginx: configuration file /etc/nginx/nginx.conf test failed
# the braces { } are not balanced

Using a domain name on Linux or Windows

The hosts file (on Linux or Windows) has no entry for the domain.

[root@web01 /etc/nginx]# curl blog.yawei.com
Found.


# the server_name line does not end with ";"

[root@wed01 ~]# nginx -t
nginx: [emerg] directive "server_name" is not terminated by ";" in /etc/nginx/nginx.conf:43
nginx: configuration file /etc/nginx/nginx.conf test failed


 31    # include /etc/nginx/conf.d/*.conf;
 32     server   {
 33         listen       80;
 34         server_name  www.yawei.com;
 35         location / {
 36         root   /usr/share/nginx/html/www;
 37         index  index.html index.htm;
 38         }
 39     }
 40     server   {
 41         listen       80;
 42         server_name  blog.yawei.com
 43         location / {
 44         root   /usr/share/nginx/html/blog;
 45         index  index.html index.htm;
 46         }
 47     }
 48 
 49 }

Connection refused: the Nginx service is not running

Port 80 is not listening.
Restart the Nginx service: systemctl restart nginx

[17:10 root@web01 /etc/nginx/conf.d]# curl www.yawei.com
curl: (7) Failed connect to www.yawei.com:80; Connection refused
[17:10 root@web01 /etc/nginx/conf.d]# systemctl restart nginx
[17:10 root@web01 /etc/nginx/conf.d]# curl www.yawei.com
www.oldboy.com

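To see the detailed Nginx error, check the syntax with nginx -t.

An error about a curly brace usually means a syntax error in the configuration file:
the "}" has no matching "{".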


The server_name line does not end with a semicolon ";"

 40     server   {
 41         listen       80;
 42         server_name  blog.yawei.com   \\ missing the terminating ;
 43         location / {
 44         root   /usr/share/nginx/html/blog;
 45         index  index.html index.htm;
 46         }
 47     }
 48 
 49 }

Error: 404 Not Found

The site directory may not have been created, or the site directory is wrong in nginx.conf.
[root@web01 ~]# cat /usr/share/nginx/html/{www,blog}/index.html
www.oldboy.com
blog.oldboy.com
[root@web01 ~]# curl blog.yawei.com

404 Not Found

404 Not Found

* * *
nginx/1.16.0

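Error: 403 Forbidden, permission denied (case 1)

To reproduce, remove read permission on the index file for everyone except the owner (chmod 600 below), then change it back to 644 after testing: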

[09:34 root@web01 ~]# ll  /usr/share/nginx/html/www/
total 4
-rw-r--r-- 1 root root 15 Jun  5 09:00 index.html
[09:35 root@web01 ~]# chmod 600 index.html
[09:35 root@web01 ~]#  ls -l index.html 
-rw------- 1 root root 15 Jun  5 09:00 index.html
[09:35 root@web01 ~]# curl www.yawei.com

403 Forbidden

403 Forbidden


nginx/1.16.0

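Error: 403 Forbidden, index file missing (case 2)

The index file does not exist; by default Nginx looks for the index file.

To reproduce, move the index file somewhere else: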

[16:35 root@web01 ~]# mv /usr/share/nginx/html/www/index.html /tmp/
[16:35 root@web01 ~]# curl www.yawei.com

403 Forbidden

403 Forbidden


nginx/1.16.0
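
The two 403 cases above can be told apart with a short check. This is a sketch under stated assumptions: explain_403 is a made-up helper, it assumes GNU stat, and it treats "readable by others" as the test because the Nginx worker in these notes does not own the file. It does not check permissions on the parent directories.

```shell
#!/bin/sh
# explain_403 DOCROOT [INDEX]: print a likely reason for a 403 on DOCROOT.
# INDEX defaults to index.html.
explain_403() {
    index="$1/${2:-index.html}"
    if [ ! -e "$index" ]; then
        echo "index-missing"
        return
    fi
    mode=$(stat -c '%a' "$index")
    # The last octal digit holds the permissions for "other"; bit 4 is read.
    other=${mode#"${mode%?}"}
    if [ $((other & 4)) -eq 0 ]; then
        echo "index-not-world-readable"
    else
        echo "index-looks-fine"
    fi
}

# Example (site directory taken from the transcripts above):
# explain_403 /usr/share/nginx/html/www
```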

Cannot assign requested address

The local machine does not have the IP 10.0.0.9, so binding to 10.0.0.9:80 fails with "Cannot assign requested address".

[root@web01 /etc/nginx]# nginx -t 
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: [emerg] bind() to 10.0.0.9:80 failed (99: Cannot assign requested address)
nginx: configuration file /etc/nginx/nginx.conf test failed

Fix 1: add the IP address:

[root@web01 ~]# ip addr add 10.0.0.9/24 dev eth0 label eth0:1   \\ add this IP

304 Not Modified: the client used its browser cache

The /var/log/nginx/access.log log shows status 304.


Address already in use

Nginx has already been started and is using port 80.
The message only tells you that Nginx is already running; the running service itself is not broken.


Connection refused

[16:25 root@web01 ~]# curl www.yawei.com
curl: (7) Failed connect to www.yawei.com:80; Connection refused
[16:25 root@web01 ~]# systemctl restart nginx
[16:25 root@web01 ~]# curl www.yawei.com
www.oldboy.com

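conflicting server name

Domain name conflict: two or more virtual hosts use the same server_name.
Fix: compress the default default.conf so that it no longer matches a *.conf include.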

 [root 12:27:22 @web01 conf.d]# vim 01-www.conf 

server {
      listen    80;
      server_name     www.yawei.com;  \\ duplicate domain name
      #charset koi8-r;
      access_log  /var/log/nginx/access_www.log  main;

      location / {
          root   /usr/share/nginx/html/www;
          index  index.html index.htm;
                }
     }

[root 12:35:02 @web01 conf.d]# cat default.conf
server {
    listen       80;
    server_name  www.yawei.com;   \\ duplicate domain name
    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }
}

[16:52 root@web01 /etc/nginx/conf.d]# gzip  default.conf
[16:52 root@web01 /etc/nginx/conf.d]# ll
total 16
-rw-r--r-- 1 root root 219 Jun  5 12:56 01-www.conf
-rw-r--r-- 1 root root 240 Jun  5 12:55 02-blog.conf
-rw-r--r-- 1 root root 488 Apr 23 22:34 default.conf.gz  \\ compressed
-rw-r--r-- 1 root root 123 Jun  5 12:41 status.conf
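
Duplicate names are easier to find with a scan than by opening each file. The sketch below is an illustration only: duplicate_server_names is a hypothetical helper, it reads plain *.conf files in one directory, and it ignores the listen address, so a name reused on different ports is also reported.

```shell
#!/bin/sh
# duplicate_server_names DIR: print each name that appears more than once in
# server_name directives across DIR/*.conf.
duplicate_server_names() {
    awk '$1 == "server_name" {
             for (i = 2; i <= NF; i++) {
                 name = $i
                 last = (name ~ /;/)       # the field holding ";" ends the directive
                 sub(/;.*$/, "", name)     # drop the ";" and anything after it
                 if (name != "") print name
                 if (last) break
             }
         }' "$1"/*.conf | sort | uniq -d
}

# Example (directory taken from the transcript above):
# duplicate_server_names /etc/nginx/conf.d
```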

500 Internal Server Error

Basic authentication is configured with auth_basic_user_file,
but the password file has the wrong permissions.

Change it to 600:

[09:45 root@web01 /etc/nginx]# chmod 600 htpasswd 
[09:46 root@web01 /etc/nginx]# ll htpasswd 
-rw------- 1 nginx nginx 45 Jun  6 09:30 htpasswd

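Error establishing a database connection

The database connection cannot be established.
The newly deployed WordPress blog cannot be opened.
Change the owner and group of the blog site directory to nginx: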

[root@web01 ~]#  chown  -R nginx.nginx /usr/share/nginx/html/blog/
[root@web01 ~]# ll -d /usr/share/nginx/html/blog/
drwxr-xr-x 5 nginx nginx 4096 Jun  6 21:03 /usr/share/nginx/html/blog/

[root@web02 ~]# mount -t  nfs 172.16.1.31:/data/www /application/nginx/html/wordpress/wp-content/uploads/
mount: wrong fs type, bad option, bad superblock on 172.16.1.31:/data/www,
       missing codepage or helper program, or other error
       (for several filesystems (e.g. nfs, cifs) you might
       need a /sbin/mount.<type> helper program)

       In some cases useful info is found in syslog - try
       dmesg | tail or so.

NFS is not installed on this machine.

15. Can't connect to local MySQL server through socket

Cannot connect to the local database:
the database service is not running.

[root@web01 html]# systemctl restart mariadb.service 

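16. Unit is masked: the nginx service has been removed

The contents of /usr/sbin/nginx are gone.
Copy them over from another machine.
If the unit itself is masked in systemd, systemctl unmask nginx removes the mask.

17. bind() to 10.0.0.4:80 failed (99: Cannot assign requested address)

Modify the kernel parameter so that Nginx can bind to an IP address that is not configured on the local machine: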

[root@lb01 nginx]# tail -1 /etc/sysctl.conf 
net.ipv4.ip_nonlocal_bind = 1
[root@lb01 nginx]# sysctl -p  # apply the change
net.ipv4.ip_nonlocal_bind = 1

[root@lb02 ~]# tail -1 /etc/sysctl.conf 
net.ipv4.ip_nonlocal_bind = 1
[root@lb02 ~]# sysctl -p  # apply the change
net.ipv4.ip_nonlocal_bind = 1

Then restart Nginx and it works.

[root@lb01 nginx]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[root@lb01 nginx]# systemctl restart nginx
