1. Environment
Nodes used in this installation: 3 masters, 3 nodes, and 1 lb (HAProxy)
192.118.80.187 master01
192.118.80.188 master02
192.118.80.189 master03
192.118.80.190 node01
192.118.80.191 node02
192.118.80.192 node03
192.118.80.181 lb
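2. Installation steps
1) Configure the hostname and /etc/hosts (all machines; on each machine run only the hostnamectl line that matches it)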
hostnamectl set-hostname master01.company.com
hostnamectl set-hostname master02.company.com
hostnamectl set-hostname master03.company.com
hostnamectl set-hostname node01.company.com
hostnamectl set-hostname node02.company.com
hostnamectl set-hostname node03.company.com
hostnamectl set-hostname lb.company.com
vim /etc/hosts
192.118.80.187 paas.company.com
192.118.80.187 master01.company.com
192.118.80.188 master02.company.com
192.118.80.189 master03.company.com
192.118.80.190 node01.company.com
192.118.80.191 node02.company.com
192.118.80.192 node03.company.com
192.118.80.181 lb.company.com
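The same entries can be appended by a script instead of being typed on every machine. The following is a minimal sketch, assuming the addresses listed above and the cluster hostname paas.company.com used later in the inventory; render_hosts is a hypothetical helper name, not part of any tool.

```shell
# Print the /etc/hosts entries for the cluster, one "IP name" pair per line.
# render_hosts is a hypothetical helper; the optional argument overrides the domain.
render_hosts() {
  domain="${1:-company.com}"
  while read -r ip name; do
    echo "$ip $name.$domain"
  done <<EOF
192.118.80.187 paas
192.118.80.187 master01
192.118.80.188 master02
192.118.80.189 master03
192.118.80.190 node01
192.118.80.191 node02
192.118.80.192 node03
192.118.80.181 lb
EOF
}

# Usage (as root, once per machine):
#   render_hosts >> /etc/hosts
```

Keeping the list in one place makes it less likely that a name is mistyped on one machine only.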
2) Install the dependency packages (all machines)
yum install wget git net-tools bind-utils yum-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct -y
yum update -y
3) Set SELinux to enforcing (all machines)
vim /etc/selinux/config
SELINUX=enforcing
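The edit can also be made without opening an editor. This is a sketch that assumes GNU sed (the default on CentOS/RHEL); set_selinux_enforcing is a hypothetical helper name. It rewrites only the SELINUX= line and leaves SELINUXTYPE= untouched.

```shell
# Set SELINUX=enforcing in an SELinux config file, whatever the current value is.
# set_selinux_enforcing is a hypothetical helper; the path argument is optional.
set_selinux_enforcing() {
  config="${1:-/etc/selinux/config}"
  sed -i 's/^SELINUX=.*/SELINUX=enforcing/' "$config"
}

# Usage (as root), followed by the reboot in the next step:
#   set_selinux_enforcing
#   grep '^SELINUX=' /etc/selinux/config
```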
4) Reboot (all machines)
systemctl reboot
5) Prepare master01 (Ansible runs from master01)
yum install ansible pyOpenSSL -y
6) Generate an SSH key and set up passwordless SSH from master01 to every node
ssh-keygen -t rsa
for host in master01.company.com master02.company.com master03.company.com node01.company.com node02.company.com node03.company.com lb.company.com
do
ssh-copy-id -i ~/.ssh/id_rsa.pub $host
done
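Before running any playbook it can help to confirm that key-based login really works for every host. The sketch below assumes the OpenSSH client; BatchMode=yes makes ssh fail instead of prompting for a password. check_ssh and the SSH_BIN override are hypothetical names introduced for illustration.

```shell
# Try a non-interactive login to each host and report the result.
# check_ssh is a hypothetical helper; SSH_BIN lets a different client be substituted.
check_ssh() {
  failed=0
  for host in "$@"; do
    if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  # Non-zero when at least one host could not be reached without a password.
  return "$failed"
}

# Usage:
#   check_ssh master01.company.com master02.company.com master03.company.com \
#             node01.company.com node02.company.com node03.company.com lb.company.com
```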
7) Install etcd on the three masters (etcd is hosted on the masters)
yum install -y etcd
systemctl enable etcd && systemctl start etcd
8) On the Ansible host, download and unpack the latest 3.10.x release of openshift-ansible
wget https://github.com/openshift/openshift-ansible/archive/openshift-ansible-3.10357-1.tar.gz
tar zxvf openshift-ansible-3.10357-1.tar.gz
9) Configure the Ansible inventory (/etc/ansible/hosts)
vim /etc/ansible/hosts
# Create an OSEv3 group that contains the masters, nodes, etcd, and lb groups.
# The lb group lets Ansible configure HAProxy as the load balancing solution.
# Comment lb out if your load balancer is pre-configured.
[OSEv3:children]
masters
nodes
etcd
# Since we are providing a pre-configured LB VIP, no need for this group
lb
# Set variables common for all OSEv3 hosts
[OSEv3:vars]
ansible_ssh_user=root
#deployment_type=openshift-enterprise
#openshift_deployment_type=openshift-enterprise
openshift_deployment_type=origin
openshift_disable_check=docker_image_availability,docker_storage,memory_availability,package_version,disk_availability
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# Native HA with External LB VIPs
openshift_master_default_subdomain=svc.company.com
openshift_master_cluster_method=native
openshift_master_cluster_hostname=paas.company.com
openshift_master_cluster_public_hostname=paas.company.com
openshift_enable_excluders=false
debug_level=2
# host group for masters
[masters]
master01.company.com
master02.company.com
master03.company.com
# host group for etcd
[etcd]
master01.company.com
master02.company.com
master03.company.com
# Since we are providing a pre-configured LB VIP, no need for this group
[lb]
lb.company.com
# host group for nodes, includes region info
[nodes]
master[01:03].company.com openshift_node_group_name='node-config-master'
node01.company.com openshift_node_group_name='node-config-compute'
node02.company.com openshift_node_group_name='node-config-compute'
node03.company.com openshift_node_group_name='node-config-infra'
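A quick consistency check on the inventory can catch a child group that is listed under [OSEv3:children] but never defined. This is a minimal sketch using awk; missing_groups is a hypothetical helper name, and the check covers only this one kind of mistake.

```shell
# Print every group named under [OSEv3:children] that has no [group] section.
# missing_groups is a hypothetical helper; it prints nothing when all groups exist.
missing_groups() {
  awk '
    /^\[/ {
      # Section header: remember its name without the brackets.
      section = substr($0, 2, index($0, "]") - 2)
      defined[section] = 1
      next
    }
    /^[ \t]*(#|$)/ { next }                    # skip comments and blank lines
    section == "OSEv3:children" { wanted[$1] = 1 }
    END { for (g in wanted) if (!(g in defined)) print g }
  ' "${1:-/etc/ansible/hosts}"
}

# Usage:
#   missing_groups /etc/ansible/hosts
```

Running `ansible all --list-hosts` afterwards shows which hostnames Ansible actually expands from the inventory, which is a convenient way to verify the master range pattern.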
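10) Run the ansible-playbook pre-checks and the actual installation
(1) Pre-installation checks
cd %openshift-ansible_home%  # the directory unpacked in step 8
ansible-playbook ./playbooks/prerequisites.yml  # pre-installation checks
(2) After the pre-checks finish, change the Docker options (on every master and node machine)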
vim /etc/sysconfig/docker
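Replace the existing OPTIONS line with the one below. Point the mirror at an intranet registry mirror so that repeated installs do not download the images from the public internet every time.
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=False --registry-mirror=https://docker.mirrors.ustc.edu.cn'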
systemctl restart docker
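Because the same line has to be changed on six machines, the edit is easier to repeat as a script. The sketch below assumes GNU sed and a mirror URL that contains no `|` or `&` characters; set_docker_options is a hypothetical helper name. It replaces an existing OPTIONS line or appends one if the file has none.

```shell
# Write the OPTIONS line used in this guide into a Docker sysconfig file.
# set_docker_options is a hypothetical helper: $1 = mirror URL, $2 = file (optional).
set_docker_options() {
  mirror="$1"
  config="${2:-/etc/sysconfig/docker}"
  options="--selinux-enabled --log-driver=journald --signature-verification=False --registry-mirror=$mirror"
  if grep -q '^OPTIONS=' "$config"; then
    # Replace the existing line in place.
    sed -i "s|^OPTIONS=.*|OPTIONS='$options'|" "$config"
  else
    echo "OPTIONS='$options'" >> "$config"
  fi
}

# Usage (as root, on every master and node):
#   set_docker_options https://docker.mirrors.ustc.edu.cn
#   systemctl restart docker
```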
(3) Run the installation
ansible-playbook ./playbooks/deploy_cluster.yml  # run the installation
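11) Errors encountered and how they were fixed
a) One or more required container images are not available. Default registries searched: docker.io. Failed connecting to: docker.io
This is an image pull problem.
Fix: configure the registry mirror address on the 6 machines (masters and nodes) and restart the docker service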
vim /etc/sysconfig/docker
Replace the existing OPTIONS line with the following:
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=False --registry-mirror=https://docker.mirrors.ustc.edu.cn'
systemctl restart docker
b) fatal: [master2]: FAILED! => {"changed": false, "msg": "SELinux is disabled on this host."}
Set SELinux to enforcing on the affected machine.
c) Unable to connect to the server: dial tcp: lookup paas.dev.insaiccorp.com on 10.118.80.187:53: no such host
The cluster domain name cannot be resolved. Map the cluster domain name to one of the master machines in /etc/hosts.
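A small check can tell whether the cluster name is present in a hosts file before the playbook is run again. This is a sketch; has_host_entry is a hypothetical helper name, and it inspects only the file, not DNS.

```shell
# Succeed if NAME appears as a hostname or alias on a non-comment line of a hosts file.
# has_host_entry is a hypothetical helper: $1 = name, $2 = hosts file (optional).
has_host_entry() {
  name="$1"
  hosts_file="${2:-/etc/hosts}"
  awk -v name="$name" '
    { sub(/#.*/, "") }                                        # drop comments
    { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }   # field 1 is the IP
    END { exit found ? 0 : 1 }
  ' "$hosts_file"
}

# Usage (as root):
#   has_host_entry paas.company.com || echo "192.118.80.187 paas.company.com" >> /etc/hosts
```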
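d) FAILED - RETRYING: Check status of node image pre-pull (20 retries left).
Wait for the affected node to finish pulling the images. When the pull is too slow, the playbook reports a failure once its retry window runs out.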
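e) Assorted errors after repeated install attempts: uninstall first, then reinstall
Uninstall command: ansible-playbook ./playbooks/adhoc/uninstall.yml
For example, the following error: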
fatal: [master1.insaictest.com]: FAILED! => {"changed": true, "finished": false, "msg": "Timed out accepting certificate signing requests. Failing as requested.", "nodes": [{"client_accepted": false, "csrs": {"csr-4thjr": {"apiVersion": "certificates.k8s.io/v1beta1", "kind": "CertificateSigningRequest", "metadata": {"creationTimestamp": "2018-08-28T01:54:10Z", "generateName": "csr-", "name": "csr-4thjr", "namespace": "", "resourceVersion": "83986", "selfLink": "/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/csr-4thjr", "uid": "41f46440-aa65-11e8-a94c-005056b81bfe"}, "spec": {"groups": ["system:masters", "system:cluster-admins", "system:authenticated"],
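Repeated installs leave inconsistent certificates, so the nodes fail to authenticate against the masters. As an alternative to uninstalling, the certificates can be regenerated: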
ansible-playbook ./playbooks/redeploy-certificates.yml
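f) Ansible reports that its version is too old
The version installed by default is 2.4.2, while installing OpenShift requires at least 2.4.3.
Uninstall Ansible:
yum list | grep ansible
yum remove ansible.noarch
Download the 2.6.3 source tarball from GitHub and unpack it:
wget https://github.com/ansible/ansible/archive/v2.6.3.tar.gz
tar zxvf v2.6.3.tar.gz
Install Ansible from source:
cd ansible-2.6.3 && python setup.py install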
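Whether the installed Ansible meets the 2.4.3 minimum can be checked with a version comparison. The sketch below assumes GNU sort, whose -V option orders version strings; version_ge is a hypothetical helper name. The usage comment also assumes that the first line of `ansible --version` has the form `ansible 2.x.y`, as it does for the 2.4 to 2.6 releases discussed here.

```shell
# Succeed if version $1 is at least version $2.
# version_ge is a hypothetical helper built on GNU "sort -V".
version_ge() {
  # The smaller of the two versions sorts first; $1 >= $2 exactly when that is $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   installed=$(ansible --version | head -n 1 | awk '{print $2}')
#   version_ge "$installed" 2.4.3 || echo "Ansible $installed is too old"
```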
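12) Integrate AD/LDAP after the installation completes
Step 1: back up the file /etc/origin/master/master-config.yaml
Step 2: edit it: vim /etc/origin/master/master-config.yaml
Find the oauthConfig section and replace the contents of its identityProviders block with the following: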
identityProviders:
- name: company_auth_provider
  challenge: true
  login: true
  mappingMethod: claim
  provider:
    apiVersion: v1
    kind: LDAPPasswordIdentityProvider
    attributes:
      id:
      - dn
      email:
      name:
      - cn
      preferredUsername:
      - sAMAccountName
    bindDN: "cn=Administrator,cn=Users,dc=,dc=com"
    bindPassword: "<password>"
    insecure: true
    url: "ldap://ou=,dc=,dc=com?sAMAccountName"
Step 3: restart the API and the controllers
master-restart api
master-restart controllers
Step 4: repeat the steps above on all three masters
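The backup in step 1 can be scripted so that every edit of master-config.yaml leaves a timestamped copy next to the original. This is a minimal sketch; backup_file is a hypothetical helper name, and the `.bak.<timestamp>` suffix is an assumption rather than an OpenShift convention.

```shell
# Copy a file to "<file>.bak.<timestamp>", keeping mode and mtime, and print the new path.
# backup_file is a hypothetical helper.
backup_file() {
  src="$1"
  dest="$src.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$src" "$dest" && echo "$dest"
}

# Usage (as root, on each master before editing):
#   backup_file /etc/origin/master/master-config.yaml
```

If the masters fail to come back after the restart in step 3, copying the printed backup path over master-config.yaml restores the previous configuration.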