ansible常见错误解析

背景

由于工作中经常用到ansible,所以整理了常用的ansible错误及原因分析,方便自己也方便别人参考。

1.shell 模块常见错误

1.1 使用shell遇到"msg": "non-zero return code"

ansible 脚本如下:

- name: Check the weblogic without wc
  shell: "ps -ef|grep weblogic|grep -v grep"
  register: check_weblogic0
  ignore_errors: true

ansible 返回错误:

TASK [Check the weblogic without wc] *********************************************************************************************************************************************************************************************************fatal: [robin.org.cn]: FAILED! => {"changed": true, "cmd": "ps -ef|grep weblogic|grep -v grep", "delta": "0:00:00.036565", "end": "2020-02-23 18:08:03.100106", "msg": "non-zero return code", "rc": 1, "start": "2020-02-23 18:08:03.063541", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring

ok: [robin.org.cn] => {
    "msg": {
        "changed": true,
        "cmd": "ps -ef|grep weblogic|grep -v grep",
        "delta": "0:00:00.036565",
        "end": "2020-02-23 18:08:03.100106",
        "failed": true,
        "msg": "non-zero return code",
        "rc": 1,
        "start": "2020-02-23 18:08:03.063541",
        "stderr": "",
        "stderr_lines": [],
        "stdout": "",
        "stdout_lines": []
    }
}

原因分析:

当使用shell模块并且返回为空的时候,ansible就会认为shell脚本出错了,rc就返回1。

解决方案:

在shell命令末尾增加cat,将返回的内容通过管道传递给cat,使用cat返回的rc始终为0. 最好的解决方式,无论你要获取整个返回内容或者返回行数。

- name: Check the weblogic without wc but use cat
  shell: "ps -ef|grep weblogic|grep -v grep|cat"
  register: check_weblogic1
  ignore_errors: true

- name: print the check_weblogic1
  debug:
    msg: "{{ check_weblogic1 }}"

在shell命令末尾增加wc -l,计算返回的行数,保证shell返回始终不为空。

- name: Check the weblogic with wc
  shell: "ps -ef|grep weblogic|grep -v grep|wc -l"
  register: check_weblogic2
  ignore_errors: true

- name: print the check_weblogic2
  debug:
    msg: "{{ check_weblogic2.stdout|int }}"

在脚本最后面增加ignore_errors: true,最不推荐的方式,除非暂时没找到根本原因,应急。

- name: Check the weblogic without wc
  shell: "ps -ef|grep weblogic|grep -v grep"
  register: check_weblogic0
  ignore_errors: true

1.2 使用shell遇到"/bin/sh: pip: command not found"错误

- name: pip insall web.py
  shell: "pip install requests"
  async: 200
  poll: 0
  register: pip_install_status
TASK [YUM - check on fire and forget task] ***************************************************************************************************************************************************************************************************fatal: [gcp2.lhh.pub]: FAILED! => {"ansible_job_id": "943105447304.15557", "attempts": 1, "changed": true, "cmd": "pip install requests", "delta": "0:00:00.003629", "end": "2020-02-24 15:05:40.012909", "finished": 1, "msg": "non-zero return code", "rc": 127, "start": "2020-02-24 15:05:40.009280", "stderr": "/bin/sh: pip: command not found", "stderr_lines": ["/bin/sh: pip: command not found"], "stdout": "", "stdout_lines": []}

原因分析

使用ansible的shell模块的时候,默认不会完全使用当前shell的环境变量PATH,只是导入最基本的几个目录,请看我使用echo $PATH打印的结果。


TASK [echo teh result2] **********************************************************************************************************************************************************************************************************************ok: [gcp.lhh.pub] => {
    "msg": {
        "ansible_job_id": "479995747105.21088",
        "attempts": 1,
        "changed": true,
        "cmd": "echo $PATH",
        "delta": "0:00:00.005452",
        "end": "2020-02-24 15:16:30.108684",
        "failed": false,
        "finished": 1,
        "rc": 0,
        "start": "2020-02-24 15:16:30.103232",
        "stderr": "",
        "stderr_lines": [],
        "stdout": "/sbin:/bin:/usr/sbin:/usr/bin",
        "stdout_lines": [
            "/sbin:/bin:/usr/sbin:/usr/bin"
        ]
    }
}

解决方案

如果要使用的命令在/sbin:/bin:/usr/sbin:/usr/bin没有,则使用命令的绝对路径或者事先导入常用的所有环境变量。

 - name: import all the path and run pip insall web.py
   shell: "export PATH=$PATH:/usr/bin/:/usr/local/bin/;pip install web.py"
   async: 20
   poll: 10
   register: rst1

2.copy模块常见错误

2.1 使用copy模块,遇到Remote copy does not support recursive copy of directory

ansible all -m copy -a 'src=/root/ansible/file1 dest=/etc/cc/file1 remote_src=yes backup=yes mode=0755'

TASK [cp files below folder4 to bak1] *************************************************************
ok: [localhost] => (item=subfile1)
ok: [localhost] => (item=subfile2)
failed: [localhost] (item=subfolder1) => {"changed": false, "item": "subfolder1", "msg": "Remote copy does not support recursive copy of directory: /apps/ansible-test/folder4/subfolder1"}
        to retry, use: --limit @/apps/ansible-test/test-cp.retry

PLAY RECAP ****************************************************************************************
localhost                  : ok=3    changed=1    unreachable=0    failed=1

原因分析:

如果在远程机器上执行copy,相当于在远端机器本机执行cp命令,remote_src: true。对于asible 2.6,只支持copy单个文件,不允许递归copy。对于ansible 2.8 已经支持递归复制。详见官方说明:https://docs.ansible.com/ansi...

解决方案:

使用ansible 2.8 或者 使用linux shell cp -rf实现递归复制。
ansible all -m shell -a 'cp -rf /root/ansible/* /etc/cc/file1'

你可能感兴趣的:(ansible)