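Installing and Configuring Ansible for Operations Automation
I. Introduction to Ansible
Ansible is a relatively new operations tool written in Python. It combines the strengths of many established operations tools and supports batch system configuration, batch application deployment, and batch command execution.
Test environment:
ansible CentOS 6.6 x86_64 hostname:ansible
web1 CentOS 6.6 x86_64 hostname:web1
web2 CentOS 6.6 x86_64 hostname:web2
1. Install Ansible on the server; nothing needs to be installed on the clients.
First install the EPEL repository: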
#rpm -ivh http://fr2.rpmfind.net/linux/epel/6/x86_64/epel-release-6-8.noarch.rpm
Install Ansible:
#yum install ansible -y
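2. Establish SSH trust between the server and the clients in bulk
If you manage many clients, use a script to distribute the key (not covered in detail here).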
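The article does not show the distribution script, so the following is only a minimal sketch of what one could look like. It assumes a plain text file with one client per line, root as the remote user, and a DRY_RUN=1 switch that prints the commands instead of running them; the function name push_keys and the file name in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: push the server's public key to every host listed in a file.
# Assumptions (not from the article): one host per line in the file,
# root as the remote user, and a DRY_RUN=1 switch that only prints commands.
push_keys() {
    hosts_file=$1
    pubkey=${2:-$HOME/.ssh/id_rsa.pub}
    # Read the list on fd 3 so ssh-copy-id keeps the terminal on stdin.
    while IFS= read -r host <&3 || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh-copy-id -i $pubkey root@$host"
        else
            ssh-copy-id -i "$pubkey" "root@$host"
        fi
    done 3< "$hosts_file"
}

# Example: DRY_RUN=1 push_keys /root/clients.txt
```

Each host still prompts for its root password once; after that, key-based login should work for that host.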
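Without SSH trust between the server and the clients the setup is arguably more secure, but you have to enter the remote server's password on every run, for example: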
#ansible 192.168.1.20 -m ping -k
SSH password: --enter the remote server's root password
192.168.1.20 | success >> {
"changed": false,
"ping": "pong"
}
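Note: when SSH key trust is not configured between the server and the clients, add the -k option when running ansible and supply the password of the root (default) account.
Some people prefer to connect as a regular user and use sudo to obtain root privileges.
Format (-s enables sudo in Ansible 1.x; later releases use -b/--become instead):
#ansible webservers -m ping -u ansible -s
I use the root user directly.
PS: on one occasion, after installing Ansible on a CentOS 6.5 x86_64 system, running
#ansible --version --failed with the following error: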
Traceback (most recent call last):
File "/usr/bin/ansible", line 36, in <module>
from ansible.runner import Runner
ImportError: No module named ansible.runner
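After a lot of searching on Google without success, I finally found an article that suggested changing the Python interpreter on the first line of /usr/bin/ansible (Ansible requires Python 2.6 or later)
from /usr/bin/python to /usr/bin/python2.6. After that change the error went away.
(The cause: Python 2.7 had previously been compiled and installed on this server, and the first line of /usr/bin/ansible invoked /usr/bin/python. Presumably that path now resolved to the 2.7 build, which could not see the ansible package installed for the system Python 2.6.)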
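The interpreter change described above can also be scripted. The following is a minimal sketch, not something the article itself uses; fix_shebang is a hypothetical helper name, and it only touches the first line of the file when that line starts with "#!".

```shell
#!/bin/sh
# Sketch: rewrite the interpreter on the first line of a script.
# fix_shebang is a hypothetical helper name, not part of Ansible.
fix_shebang() {
    script=$1
    interp=$2
    tmp=$(mktemp) || return 1
    # Replace only line 1, and only if it is a "#!" line.
    if sed "1s|^#!.*|#!$interp|" "$script" > "$tmp"; then
        cat "$tmp" > "$script"   # overwrite in place, keeping mode and owner
    fi
    rm -f "$tmp"
}

# Example: fix_shebang /usr/bin/ansible /usr/bin/python2.6
```

Running head -n 1 /usr/bin/ansible before and after is a quick way to confirm which interpreter the script will use.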
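3. Configure SSH trust between the Ansible server and the clients (when there are only a few clients):
On the server:
#ssh-keygen -t rsa --generate the key pair; press Enter at every prompt
{ssh-keygen -t rsa -P '' --sets an empty passphrase}
Copy the key to the client:
[root@ansible ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.16.29.193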
The authenticity of host '172.16.29.193 (172.16.29.193)' can't be established.
RSA key fingerprint is 0d:2c:da:c7:2b:2c:38:d3:28:bc:78:65:f4:dc:af:4f.
Are you sure you want to continue connecting (yes/no)? yes --type yes
Warning: Permanently added '172.16.29.193' (RSA) to the list of known hosts.
root@172.16.29.193's password: --enter the root password of server 172.16.29.193
Now try logging into the machine, with "ssh 'root@172.16.29.193'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Check the key on the client:
[root@web1 ~]# ls ~/.ssh/authorized_keys
/root/.ssh/authorized_keys
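Test SSH login from the server to the client:
#ssh 172.16.29.193
The login succeeds.
(## Writing the trust file manually: distribute /root/.ssh/id_rsa_storm1.pub to the other servers and run the following commands on every server: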
# cat /root/.ssh/id_rsa_storm1.pub >> /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/authorized_keys)
4. Server-side configuration:
[root@ansible ansible]# cat hosts
[web]
10.0.90.24
10.0.90.25
[host1]
172.16.29.193
Note:
hosts file definition: if passwordless SSH login from the server to the clients is not configured, the hosts file can be written as follows:
[webhosts]
172.16.10.22 ansible_ssh_user=root ansible_ssh_pass=mima
172.16.10.33 ansible_ssh_user=root ansible_ssh_pass=mima
Explanation:
#ansible_ssh_user=root is the SSH login user
#ansible_ssh_pass=mima is the SSH login password
A quick test:
[root@ansible ansible]# ansible 172* -m shell -a "hostname"
172.16.29.193 | success | rc=0 >>
guang
[root@ansible ansible]# ansible host1 -m shell -a "hostname"
172.16.29.193 | success | rc=0 >>
guang
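Both forms work (a host pattern or a group name). For simple commands like these, the shell module can also be replaced with command: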
[root@ansible ~]# ansible host1 -m command -a 'date'
172.16.29.193 | success | rc=0 >>
Wed Jun 10 22:37:20 CST 2015
The default module is command, so "-m command" can be omitted.
For example:
ansible host1 -m command -a "uptime" is equivalent to ansible host1 -a "uptime"
#ansible host1 -m service -a "name=httpd state=restarted"
5. Managing system users with Ansible
First generate a password hash:
#openssl passwd -1 -salt 12345678
Password: --enter the password; the hashed string is then printed
Create a user:
#ansible web -m user -a 'name=test1 comment="add a test user" password="$1$12345678$qT.Vr20lsSaufZbuk4JIb."'
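Delete a user:
#ansible web -m user -a "name=test1 state=absent" --deleting the user this way does not remove the user's home directory
#ansible web -m user -a "name=test1 state=absent remove=yes" --deleting the user this way also removes the user's home directory
Running Ansible as a regular user: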
#su - test
$ansible webservers -m ping -u test1 --sudo
Create users in bulk with a yml file and add them to the wheel group. To skip the wheel group, remove groups=wheel (no password is set):
#cat add_user.yml
---
- hosts: web
  remote_user: root
  gather_facts: true
  tasks:
    - name: Add several users
      user: name={{ item }} state=present groups=wheel
      with_items:
        - testuser1
        - testuser2
Run:
#ansible-playbook add_user.yml
Delete users in bulk: --this also removes the users' home directories and their wheel group membership
#cat del_user.yml
---
- hosts: web
  remote_user: root
  gather_facts: true
  tasks:
    - name: del several users
      user: name={{ item }} state=absent remove=yes
      with_items:
        - testuser1
        - testuser2
Run:
#ansible-playbook del_user.yml
Create users in bulk and set a password:
Generate the password hash with openssl as shown earlier.
#cat add_user.yml
---
- hosts: web
  remote_user: root
  gather_facts: true
  tasks:
    - name: Add several users
      user: name={{ item }} state=present password="$1$1234567$IElhfIqK0wF7y.p/fYkzb/"
      with_items:
        - testuser1
        - testuser2
Run:
#ansible-playbook add_user.yml
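II. A brief introduction to commonly used Ansible modules:
The usage of each module can be viewed with ansible-doc MOD,
for example: ansible-doc copy
To list all supported modules, use ansible-doc -l:
#ansible-doc -l
The most common form of the ansible command:
ansible <host-pattern> -m MOD -a 'MOD_ARGS'
1. Remote command modules
command, script, shell
For example: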
ansible host1 -m command -a "free -m"
ansible host1 -m script -a "/home/test.sh 12 34"
ansible host1 -m shell -a "/home/test.sh"
For example, run the following on the server:
[root@ansible ~]# ansible web -m shell -a "/root/test.sh 3 4 "
10.0.90.25 | success | rc=0 >>
7
20151119-171933
10.0.90.24 | success | rc=0 >>
7
20151119-171933
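Note: test.sh is located in the /root directory on the client servers.
2. copy module
Copies files from the server to the target hosts, similar to scp.
-m copy -a "arguments"
For example: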
[root@ansible ~]# ansible host1 -m copy -a "src=/root/php-5.5.24-1.ele.el6.x86_64.rpm dest=/usr/local/src owner=root group=root mode=0755"
Check that the file exists on the client:
[root@ansible ~]# ansible host1 -m shell -a "ls -l /usr/local/src"
172.16.29.193 | success | rc=0 >>
total 10264
-rw-r--r--. 1 root root 10507544 May 30 02:40 php-5.5.24-1.ele.el6.x86_64.rpm
3. stat module:
Retrieves status information about a remote file, including atime, ctime, mtime, md5, uid, gid, and so on.
#ansible host1 -m stat -a "path=/etc/sysctl.conf"
The setup module gathers information (facts) about the client server. The output is too long to list here:
#ansible host1 -m setup
4. get_url module
Downloads the specified URL onto the remote host; supports sha256sum file verification.
For example:
#ansible host1 -m get_url -a "url=http://www.baidu.com dest=/tmp/index.html mode=0440 force=yes"
172.16.29.193 | success >> {
"changed": true,
"checksum": "8bc43056c39fbb882cf5d7b0391d70b6e84096c6",
"dest": "/tmp/index.html",
"gid": 0,
"group": "root",
"md5sum": "324aa881293b385d2c0b355cf752cff9",
"mode": "0440",
"msg": "OK (unknown bytes)",
"owner": "root",
"secontext": "unconfined_u:object_r:user_tmp_t:s0",
"sha256sum": "",
"size": 93299,
"src": "/tmp/tmp3WI5fE",
"state": "file",
"uid": 0,
"url": "http://www.baidu.com"
}
5. yum module
Package management on Linux platforms; the common modules are yum and apt.
For example:
ansible host1 -m yum -a "name=vsftpd state=latest"
Ubuntu family:
ansible host1 -m apt -a "pkg=vsftpd state=latest"
Some usages of the yum module:
- name: install the latest version of Apache
  yum: name=httpd state=latest
- name: remove the Apache package
  yum: name=httpd state=absent
- name: install the latest version of Apache from the testing repo
  yum: name=httpd enablerepo=testing state=present
- name: install one specific version of Apache
  yum: name=httpd-2.2.29-1.4.amzn1 state=present
- name: upgrade all packages
  yum: name=* state=latest
- name: install the nginx rpm from a remote repo
  yum: name=http://nginx.org/packages/centos/6/noarch/RPMS/nginx-release-centos-6-0.el6.ngx.noarch.rpm state=present
- name: install nginx rpm from a local file
  yum: name=/usr/local/src/nginx-release-centos-6-0.el6.ngx.noarch.rpm state=present
- name: install the 'Development tools' package group
  yum: name="@Development tools" state=present
- name: install the 'Gnome desktop' environment group
  yum: name="@^gnome-desktop-environment" state=present
6. cron module
Configures crontab on the remote host.
#ansible host1 -m cron -a "name='crontab test' minute=0 hour=5,2 job='ls -alh > /dev/null'"
172.16.29.193 | success >> {
"changed": true,
"jobs": [
"crontab test"
]
}
Resulting crontab entry:
#Ansible: crontab test
0 5,2 * * * ls -alh > /dev/null
A simple yml example that adds a cron job:
#cat add_cron.yml
---
- hosts: web_crontab
  remote_user: root
  gather_facts: True
  tasks:
    - name: add ntp server cron job
      cron: name="local network ntpserver" minute="*/12" hour="*" job="/usr/sbin/ntpdate 10.0.18.1 > /root/ntp.log"
Run:
#ansible-playbook add_cron.yml
Remove a cron job:
[root@ansible roles]# cat del_cron.yml
---
- hosts: web
  remote_user: root
  gather_facts: false
  tasks:
    - name: del an old crontab job
      cron: name="local network ntpserver" state=absent
Run:
#ansible-playbook del_cron.yml
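A small case study: suppose a db backup script needs to be pushed to all db servers and added to crontab so that it runs every minute. This requires yml files laid out in the playbook (role) structure: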
#tree cronjob/
cronjob/
└── tasks
├── crontest.yml
└── main.yml
The yml file is as follows:
#cat test-cron.yml
---
- name: cron jobs test
  hosts: "{{ host }}"
  remote_user: "{{ user }}"
  gather_facts: True
  roles:
    - cronjob
crontest.yml in the tasks directory:
#cat crontest.yml
#copy cron job to client server
- copy: src=/tmp/test_time.sh dest=/usr/local/src/test_time.sh owner=root group=root mode=0755
#add cron job
- cron: name="test time jobs" minute="*/1" hour="*" job="/usr/local/src/test_time.sh >> /tmp/time.log"
main.yml in the tasks directory:
#cat main.yml
- include: crontest.yml
Run:
ansible-playbook test-cron.yml --extra-vars "host=172.16.29.193 user=root" --run against a single host
Resulting crontab on the client:
#Ansible: test time jobs
*/1 * * * * /usr/local/src/test_time.sh >> /tmp/time.log
The layout above is somewhat elaborate; the same thing consolidated into one simple yml file:
#cat test-cron.yml
---
- hosts: host1
  remote_user: root
  gather_facts: True
  tasks:
    - name: copy cron job file to client server
      copy: src=/tmp/test_time.sh dest=/usr/local/src/test_time.sh owner=root group=root mode=0755
    - name: add cron job
      cron: name="test time jobs" minute="*/2" hour="*" job="/usr/local/src/test_time.sh >> /tmp/time.log"
Run:
ansible-playbook test-cron.yml
Result:
#Ansible: test time jobs
*/2 * * * * /usr/local/src/test_time.sh >> /tmp/time.log
Further tasks can be added as needed:
  tasks:
    - name: ensure nginx is at the latest version
      yum: pkg=nginx state=latest
    - name: write the nginx config file
      template: src=/home/test/ansible/nginx.conf dest=/etc/nginx/nginx.conf
      notify:
        - restart nginx
    - name: ensure nginx is running
      service: name=nginx state=started
  handlers:
    - name: restart nginx
      service: name=nginx state=restarted
Run the playbook with 10 parallel processes:
ansible-playbook nginx.yml -f 10
7. mount module
Manages partition mounts on the remote host.
For example:
ansible host1 -m mount -a "name=/mnt/data src=/dev/sd0 fstype=ext3 opts=ro state=present"
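8. service module
Manages system services on the remote host.
For example:
ansible host1 -m service -a "name=httpd state=restarted"
9. sysctl module
Configures sysctl on remote Linux hosts.
For example:
sysctl: name=kernel.panic value=3 sysctl_file=/etc/sysctl.conf checks=before reload=yes
The following examples are written as they would appear in a yml file: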
- sysctl: name=net.ipv4.tcp_rmem 'value=4096 87380 16777216' state=present
- sysctl: name=net.ipv4.tcp_wmem 'value=4096 65536 16777216' state=present
- sysctl: name=net.ipv6.conf.lo.disable_ipv6 value=1 state=present
10. file module
[root@ansible ~]# ansible host1 -m shell -a "ls -l /root/test"
172.16.29.193 | success | rc=0 >>
-rw-r--r--. 1 test test 12 May 30 03:05 /root/test
The permissions are 644, and both the owner and the group are test.
Change the permissions to 755 and the owner and group to root:
[root@ansible ~]# ansible host1 -m file -a "dest=/root/test mode=755 owner=root group=root"
172.16.29.193 | success >> {
"changed": true,
"gid": 0,
"group": "root",
"mode": "0755",
"owner": "root",
"path": "/root/test",
"secontext": "unconfined_u:object_r:admin_home_t:s0",
"size": 12,
"state": "file",
"uid": 0
}
[root@ansible ~]# ansible host1 -m shell -a "ls -l /root/test"
172.16.29.193 | success | rc=0 >>
-rwxr-xr-x. 1 root root 12 May 30 03:05 /root/test
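III. Introduction to playbooks
A playbook is a different mode from running the ansible command line. It is powerful and flexible: a simple configuration management and multi-host deployment system, unlike any existing mode,
and a suitable foundation for deploying complex applications. Playbooks allow customized configuration, execute the specified steps in order, and support both synchronous and asynchronous operation.
1. Write a simple playbook that deletes a file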
[root@ansible ~]# cat del_test.yml
---
- hosts: host1
  remote_user: root
  tasks:
    - name: delete /root/test
      shell: rm -rf /root/test
Run:
[root@ansible ~]# ansible-playbook del_test.yml
PLAY [host1] ******************************************************************
GATHERING FACTS ***************************************************************
ok: [172.16.29.193]
TASK: [delete /root/test] *****************************************************
changed: [172.16.29.193]
PLAY RECAP ********************************************************************
172.16.29.193 : ok=2 changed=1 unreachable=0 failed=0
Check again:
[root@ansible ~]# ansible host1 -m shell -a "ls -l /root/test"
172.16.29.193 | FAILED | rc=2 >>
ls: cannot access /root/test: No such file or directory
The file no longer exists, so the deletion succeeded.
2. Use the template module to test file transfer
[root@ansible ~]# cat copyfile.yml
---
- hosts: host1
  remote_user: root
  tasks:
    - name: copy local server file to client /tmp/
      template: src=/root/housekeeping.sh dest=/tmp/
Run:
[root@ansible ~]# ansible-playbook copyfile.yml
PLAY [host1] ******************************************************************
GATHERING FACTS ***************************************************************
ok: [172.16.29.193]
TASK: [copy local server file to client /tmp/] ********************************
changed: [172.16.29.193]
PLAY RECAP ********************************************************************
172.16.29.193 : ok=2 changed=1 unreachable=0 failed=0
Check whether the copy succeeded:
[root@ansible ~]# ansible host1 -m shell -a 'ls -l /tmp'
172.16.29.193 | success | rc=0 >>
total 28
-rw-r--r--. 1 root root 378 Jun 10 22:21 housekeeping.sh
3. Updating multiple items at once
[root@ansible ~]# cat multi_copy.yml
---
- hosts: host1
  remote_user: root
  gather_facts: false
  tasks:
    - name: copy local server file to client /tmp
      template: src=/root/install.log dest=/tmp/test-{{item}}
      with_items:
        - install.log-1
        - install.log-2
        - install.log-3
Run:
#ansible-playbook multi_copy.yml
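Conditional deletion:
[root@ansible ~]# cat delete.yml
---
- hosts: host1
  remote_user: root
  gather_facts: true
  tasks:
    - name: if the system is in the RedHat family (e.g. CentOS), rm /tmp/test-install.log-3
      shell: rm -f /tmp/test-install.log-3
      when: ansible_os_family == "RedHat"
Run:
#ansible-playbook delete.yml
Note: the command shell: rm -f /tmp/test-install.log-3 runs only when the condition when: ansible_os_family == "RedHat" holds, and gather_facts must be true because the condition relies on gathered facts.
For Debian family systems such as Ubuntu (ansible_os_family reports "Debian" there, not "Ubuntu"):
when: ansible_os_family == "Debian"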
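4. Checking yml files with ansible-playbook
a. --check mode: a dry run that only reports what would happen and does not apply changes
[root@ansible ~]# ansible-playbook copyfile.yml --check
b. --syntax-check mode: checks whether the yml syntax is correct
[root@ansible ~]# ansible-playbook copyfile.yml --syntax-check
If the syntax is correct, the output is as follows; if there is a syntax error, an error message is shown: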
playbook: copyfile.yml
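When a directory holds several playbooks, the same syntax check can be looped over all of them. This is a minimal sketch rather than anything the article prescribes: check_playbooks is a hypothetical helper, and the ANSIBLE_PLAYBOOK variable is an assumption added here so that a different binary path can be substituted.

```shell
#!/bin/sh
# Sketch: run "ansible-playbook --syntax-check" on every *.yml file in a directory.
# check_playbooks and the ANSIBLE_PLAYBOOK override are assumptions of this example.
check_playbooks() {
    dir=${1:-.}
    cmd=${ANSIBLE_PLAYBOOK:-ansible-playbook}
    failed=0
    for f in "$dir"/*.yml; do
        [ -e "$f" ] || continue          # no matching files at all
        if "$cmd" "$f" --syntax-check >/dev/null 2>&1; then
            echo "OK   $f"
        else
            echo "FAIL $f"
            failed=1
        fi
    done
    return "$failed"                     # non-zero if any file failed
}

# Example: check_playbooks /root
```

The non-zero return value makes the function usable as a gate before running any of the playbooks for real.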
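Note:
Reducing ansible-playbook run time
By default a playbook gathers facts from the clients. If your configuration does not use facts, you can turn this off to reduce the run time.
Test script: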
#cat shell.yml
---
- hosts: host1
  remote_user: root
  #gather_facts: False
  tasks:
    - name: echo hi
      shell: echo "hi"
real 0m2.491s
user 0m0.439s
sys 0m0.124s
With fact gathering disabled:
#cat shell.yml
---
- hosts: host1
  remote_user: root
  gather_facts: False
  tasks:
    - name: echo hi
      shell: echo "hi"
real 0m1.746s
user 0m0.314s
sys 0m0.084s
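Reference: http://www.jb51.net/article/52154.htm
Note: I am still learning and exploring Ansible; corrections and suggestions are welcome.
Addendum:
By default Ansible uses key authentication. For servers that log in with a password, either uncomment ask_pass = True in the ansible.cfg configuration file or add -k when running the command (-k, --ask-pass: ask for SSH password).
If the client is not in known_hosts, a prompt like the following appears: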
#ansible web1 -m shell -a "hostname"
paramiko: The authenticity of host 'web1' can't be established.
The ssh-rsa key fingerprint is 59b5a079b9680998ad40c56166be96cc.
Are you sure you want to continue connecting (yes/no)?
If the prompt above appears, the client is not in known_hosts.
To fix this, edit ansible.cfg and uncomment the line #host_key_checking = False.
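Uncommenting a setting such as host_key_checking or ask_pass can be done by hand or scripted. The following is a minimal sketch under stated assumptions: uncomment_setting is a hypothetical helper, and the example path /etc/ansible/ansible.cfg assumes the default package location.

```shell
#!/bin/sh
# Sketch: uncomment a "#key = value" line in an ini-style file such as ansible.cfg.
# uncomment_setting is a hypothetical helper; the example path is an assumption.
uncomment_setting() {
    cfg=$1
    key=$2
    tmp=$(mktemp) || return 1
    # Drop the leading "#" (and any spaces after it) only on the line for this key.
    if sed "s/^#[[:space:]]*\($key[[:space:]]*=\)/\1/" "$cfg" > "$tmp"; then
        cat "$tmp" > "$cfg"              # overwrite in place, keeping mode and owner
    fi
    rm -f "$tmp"
}

# Example: uncomment_setting /etc/ansible/ansible.cfg host_key_checking
```

For a one-off run, exporting ANSIBLE_HOST_KEY_CHECKING=False in the shell has the same effect without editing the file. Either way, disabling host key checking trades some protection against impersonated hosts for convenience.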