Checking network connectivity with Ansible (multiple subnets, multiple IPs)

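When operating a cloud platform, a single unreachable IP on one node can sometimes cause a platform failure. A common case is a Ceph OSD node whose storage network becomes unreachable, which takes all of its OSDs down. To check the connectivity of every network on the platform quickly, I wrote a playbook that uses Ansible's built-in facts, and I am recording it here.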

1 Each host has three NICs

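Before running the check, it can help to confirm what the gathered facts contain for each NIC. The following is a minimal sketch, not part of the original post. It assumes the interfaces are named eth0, eth1 and eth2 (the names used by the playbook in section 2) and that each of them has an IPv4 address.

```yaml
---
# Sketch: print the IPv4 address that Ansible's facts report for each NIC.
# Assumption: every host has eth0, eth1 and eth2, each with an IPv4 address.
- hosts: all
  gather_facts: yes
  tasks:
    - name: show the IPv4 address of each interface
      debug:
        msg: "{{ inventory_hostname }} {{ item }} {{ hostvars[inventory_hostname]['ansible_' + item].ipv4.address }}"
      with_items:
        - eth0
        - eth1
        - eth2
```

The lookup `hostvars[host]['ansible_' + interface].ipv4.address` is the same expression the check playbook uses to build its ping targets.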


2 The playbook

---
- hosts: all
  #vars_prompt:
  #  - name: share_user
  #    prompt: "input share_user"
  tasks:
    - block:
        - name: restart acpid service
          service: name=acpid state=restarted

        # Ping the IPv4 address of every listed interface on every host in the inventory.
        - name: ping the ip of each interface
          shell: |
              ping -c 2  "{{ hostvars[item[0]]['ansible_' + item[1]].ipv4.address }}"
          register: netinfo
          ignore_errors: yes
          with_nested:
            - "{{ groups['all'] }}"
            - ["eth0","eth1","eth2"]

        #- debug:
        #    var: netinfo
        # Append one line per failed ping (non-zero return code) to /root/noping.txt.
        - name: record the ips that cannot be pinged
          shell: echo "ip {{item.cmd}} is not ok" >>/root/noping.txt
          with_items:
            - "{{ netinfo.results }}"
          when: item.rc != 0
      delegate_to: localhost
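
Because the play targets `hosts: all` and the block is delegated to localhost, the nested loop is executed once for every host in the inventory, and each pass pings the same set of addresses. The following is a possible variant, not the author's version, that adds `run_once: true` so the loop runs a single time. It prints the failed commands with `debug` instead of writing a file, and it assumes the same interface names as above.

```yaml
---
# Sketch of a run-once variant (an assumption, not the original playbook).
- hosts: all
  tasks:
    - block:
        - name: ping every interface address of every host
          shell: ping -c 2 "{{ hostvars[item[0]]['ansible_' + item[1]].ipv4.address }}"
          register: netinfo
          ignore_errors: yes
          with_nested:
            - "{{ groups['all'] }}"
            - ["eth0", "eth1", "eth2"]

        # Only results with a non-zero return code are reported.
        - name: show the ping commands that failed
          debug:
            msg: "{{ item.cmd }} failed"
          with_items:
            - "{{ netinfo.results }}"
          when: item.rc != 0
      delegate_to: localhost
      run_once: true
```

Facts are still gathered from every host at the start of the play, so `hostvars` holds the interface addresses of all hosts even though the tasks run only once.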



3 Testing

3.1 Shut down the eth2 NIC on minion1

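The exact command used for this step is not preserved in the text. One way to simulate the fault from Ansible itself is sketched below. It assumes that "minion1" is the host's inventory name and that Ansible does not connect to the host through eth2.

```yaml
---
# Sketch only: take eth2 down on minion1 to simulate a network fault.
# Assumptions: "minion1" is the inventory name, and the management
# connection does not go through eth2.
- hosts: minion1
  become: yes
  gather_facts: no
  tasks:
    - name: bring eth2 down
      command: ip link set eth2 down
```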

    

3.2 Run the check playbook


  

3.3 Test results



A supplementary playbook that uses the ansible_all_ipv4_addresses fact

---
- hosts: all
  become: yes
  become_user: root
  become_method: sudo
  tasks:
    - block:
        - name: check the net connection (ping)
          shell: ping -c 2 {{ item }}
          register: netResult
          ignore_errors: yes
          with_items:
            - "{{ ansible_all_ipv4_addresses }}"
          when: item != '240.0.0.1'          # filter out IPs that should not be checked
        - debug:
            var: netResult
        - name: record the unreachable ip addresses
          shell: |
            echo "ip {{ item['item'] }} is unreachable" >>/root/noPing.txt
          with_items:
            - "{{ netResult.results }}"
          when: item.item != '240.0.0.1' and item.rc != 0
      delegate_to: localhost
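
When more than one address has to be excluded, repeating the comparison in both `when` clauses becomes awkward. The following sketch, which is an assumption and not part of the original post, moves the exclusions into a list and removes them with Ansible's `difference` filter before the loop starts. The variable name `skip_ips` is hypothetical.

```yaml
---
# Sketch: exclude several addresses using a list and the difference filter.
# Assumption: skip_ips is a hypothetical variable introduced for illustration.
- hosts: all
  become: yes
  become_user: root
  become_method: sudo
  vars:
    skip_ips:
      - 240.0.0.1
  tasks:
    - block:
        - name: check the net connection (ping)
          shell: ping -c 2 {{ item }}
          register: netResult
          ignore_errors: yes
          with_items: "{{ ansible_all_ipv4_addresses | difference(skip_ips) }}"

        # Excluded addresses never enter the loop, so only rc needs checking.
        - name: record the unreachable ip addresses
          shell: echo "ip {{ item.item }} is unreachable" >> /root/noPing.txt
          with_items: "{{ netResult.results }}"
          when: item.rc != 0
      delegate_to: localhost
```

Since the excluded addresses are filtered out before the loop, no result is marked as skipped, and the second task does not need to repeat the exclusion.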

