Installing Red Hat OpenShift 3.11 on OpenPOWER

Cluster planning

Operating system requirements

IBM POWER9: RHEL-ALT 7.5 with the "Minimal" installation option and the latest packages from the Extras channel.

IBM POWER8: RHEL 7.5 with the "Minimal" installation option and the latest packages from the Extras channel.

Master :

Minimum 4 vCPU (additional vCPUs are strongly recommended).

Minimum 16 GB RAM (additional memory is strongly recommended, especially if etcd is co-located on masters).

Minimum 40 GB hard disk space for the file system containing /var/.

Minimum 1 GB hard disk space for the file system containing /usr/local/bin/.

Minimum 1 GB hard disk space for the file system containing the system’s temporary directory.

Masters with a co-located etcd require a minimum of 4 cores. Two-core systems do not work.

Nodes:

NetworkManager 1.0 or later.

1 vCPU.

Minimum 8 GB RAM.

Minimum 15 GB hard disk space for the file system containing /var/.

Minimum 1 GB hard disk space for the file system containing /usr/local/bin/.

Minimum 1 GB hard disk space for the file system containing the system’s temporary directory.

An additional minimum 15 GB unallocated space per system running containers for Docker’s storage back end; see Configuring Docker Storage. Additional space might be required, depending on the size and number of containers that run on the node.
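Before provisioning, it is worth checking each machine against these minimums. A quick sanity check (the mount points below are inferred from the requirements above):

# nproc          # vCPU count; masters with co-located etcd need at least 4

# free -g        # total memory in GB

# df -h /var /usr/local/bin /tmp     # free space on the relevant file systems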

Lab cluster

Master   172.XX.XX.175

Node1    172.XX.XX.182

Node2    172.XX.XX.183

Installation steps

1 Enable Security-Enhanced Linux (SELinux) on all of the nodes

       a. vi /etc/selinux/config

                set SELINUX=enforcing and SELINUXTYPE=targeted

        b. touch /.autorelabel; reboot
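After the reboot, you can confirm the SELinux state (a quick check, not an explicit step in the original procedure):

# getenforce

Enforcing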

2 Ensuring host access

Set up passwordless SSH login from the master to every node.

2.1 Generate an SSH key on the host you run the installation playbook on:

# ssh-keygen

2.2 Distribute the key to the other cluster hosts. You can use a bash loop:

# for host in master.openshift.example.com \
    node1.openshift.example.com \
    node2.openshift.example.com; \
    do ssh-copy-id -i ~/.ssh/id_rsa.pub $host; \
    done
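To verify that passwordless login works from the master, a quick loop such as the following (our suggestion, not part of the original steps) should print each hostname without prompting for a password:

# for host in master.openshift.example.com node1.openshift.example.com node2.openshift.example.com; do ssh $host hostname; done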

3 Update the network interface configuration

        In /etc/sysconfig/network-scripts/ifcfg-ethxx

                a. Make sure that: NM_CONTROLLED=yes

                b. Add following entries:

                        DNS1=

                        DNS2=

                        DOMAIN=

                (You can get DNS values from: /etc/sysconfig/network-scripts/ifcfg-bootnet and /etc/resolv.conf)

If neither file contains DNS values, set DNS1= to this host's own IP address.

                (You can get the DOMAIN value with this command: domainname -d)
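Putting these together, a finished ifcfg file might contain entries like the following (the DNS addresses and domain are illustrative placeholders; substitute your own values):

NM_CONTROLLED=yes

DNS1=172.XX.XX.1

DNS2=172.XX.XX.2

DOMAIN=openshift.example.com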

4 Configure /etc/hosts on every host

[root@node1 network-scripts]# cat /etc/hosts

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

172.xx.xx.175  master.openshift.example.com

172.xx.xx.182  node1.openshift.example.com

172.xx.xx.183  node2.openshift.example.com
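Each host's own hostname should match its FQDN in /etc/hosts. If it does not, you can set it with hostnamectl (an extra step we assume here; it is not spelled out in the original):

# hostnamectl set-hostname master.openshift.example.com     # use the node1/node2 names on the nodes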

5 Configure a yum proxy

If the hosts cannot reach the Internet directly, configure a proxy server:

vi /etc/yum.conf

set proxy=http://xx.xx.xx.xx:xxxx
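For example, the relevant part of /etc/yum.conf would look like this (the address and port are placeholders; yum also accepts proxy_username= and proxy_password= in the same section if your proxy requires authentication):

[main]

proxy=http://xx.xx.xx.xx:xxxx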

6 Registering hosts (a Red Hat subscription is required)

Run the following on every host:

# subscription-manager register --username= --password=

# subscription-manager refresh

# subscription-manager list --available --matches '*OpenShift*'

# subscription-manager attach --pool=
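The pool ID for the attach command comes from the list output above; to pull out just the ID lines (a convenience, not in the original steps):

# subscription-manager list --available --matches '*OpenShift*' | grep 'Pool ID'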

6.1 Enable the yum repositories

For on-premise installations on IBM POWER8 servers, run the following command:

# subscription-manager repos \

    --enable="rhel-7-for-power-le-rpms" \

    --enable="rhel-7-for-power-le-extras-rpms" \

    --enable="rhel-7-for-power-le-optional-rpms" \

    --enable="rhel-7-server-ansible-2.6-for-power-le-rpms" \

    --enable="rhel-7-for-power-le-ose-3.11-rpms" \

    --enable="rhel-7-for-power-le-fast-datapath-rpms" \

    --enable="rhel-7-server-for-power-le-rhscl-rpms"

For on-premise installations on IBM POWER9 servers, run the following command:

# subscription-manager repos \

    --enable="rhel-7-for-power-9-rpms" \

    --enable="rhel-7-for-power-9-extras-rpms" \

    --enable="rhel-7-for-power-9-optional-rpms" \

    --enable="rhel-7-server-ansible-2.6-for-power-9-rpms" \

    --enable="rhel-7-server-for-power-9-rhscl-rpms" \

    --enable="rhel-7-for-power-9-ose-3.11-rpms"

7 Install base packages

7.1 Run on every host

# yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct

# yum -y update

# reboot

# yum install atomic-openshift-excluder-3.11.141*

Now install a container engine:

To install CRI-O:

# yum -y install cri-o

To install Docker:

# yum -y install docker
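Whichever engine you install, enable and start its service before running the installer (a step we assume here; the prerequisites playbook can also configure this). Note that the CRI-O systemd unit is named crio:

# systemctl enable docker && systemctl start docker

(or, for CRI-O: # systemctl enable crio && systemctl start crio)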

7.2 Run on the master

# yum -y install openshift-ansible

# yum install atomic-openshift atomic-openshift-clients atomic-openshift-hyperkube atomic-openshift-node flannel glusterfs-fuse  (this command is optional)

# yum install cockpit-docker cockpit-kubernetes

7.3 Run on the nodes

# yum install atomic-openshift atomic-openshift-node flannel glusterfs-fuse  (this command is optional)

8 Install OpenShift (run these steps on the master node)

8.1 Pre-installation checks

$ cd /usr/share/ansible/openshift-ansible

$ ansible-playbook -i host.311 playbooks/prerequisites.yml

8.2 Run the installation

$ cd /usr/share/ansible/openshift-ansible

$ ansible-playbook -i host.311 playbooks/deploy_cluster.yml
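When the playbook completes, a basic health check from the master (our suggested verification, not part of the original steps):

# oc get nodes

All three hosts should report a Ready status.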

9 Example inventory file (1 master + 2 nodes)

[root@master openshift-ansible]# ls

ansible.cfg  host.311  inventory  playbooks  roles

[root@master openshift-ansible]# cat host.311

# Create an OSEv3 group that contains the masters, nodes, and etcd groups

[OSEv3:children]

masters

nodes

etcd

# Set variables common for all OSEv3 hosts

[OSEv3:vars]

# SSH user, this user should allow ssh based auth without requiring a password

ansible_ssh_user=root

openshift_deployment_type=openshift-enterprise

# If ansible_ssh_user is not root, ansible_become must be set to true

#ansible_become=true

openshift_master_default_subdomain=master.openshift.example.com

debug_level=2

# default selectors for router and registry services

# openshift_router_selector='node-role.kubernetes.io/infra=true'

# openshift_registry_selector='node-role.kubernetes.io/infra=true'

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider

#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]

openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]

openshift_master_htpasswd_users={'my-rhel-icp-admin': '$apr1$6eO/grkf$9jRafb0tw/2KQEAejT8Lc.'}

# the hash above is the htpasswd-encrypted form of: S3cure-icp-wordP*s?

openshift_disable_check=memory_availability,disk_availability,docker_image_availability

openshift_master_cluster_hostname=master.openshift.example.com

openshift_master_cluster_public_hostname=master.openshift.example.com

# false

#ansible_service_broker_install=false

#openshift_enable_service_catalog=false

#template_service_broker_install=false

#openshift_logging_install_logging=false

# registry passwd

oreg_url=registry.redhat.io/openshift3/ose-${component}:${version}

oreg_auth_user=****@xxx

oreg_auth_password=*******

openshift_http_proxy=http://xxx.xxx.xxx.xxx:3130

#openshift_https_proxy=https://xx.xxx.xxx.xxx:3130

openshift_no_proxy=".openshift.example.com"

# docker config

openshift_docker_additional_registries=registry.redhat.io

#openshift_docker_insecure_registries

#openshift_docker_blocked_registries

openshift_docker_options="--log-driver json-file --log-opt max-size=1M --log-opt max-file=3"

# openshift_cluster_monitoring_operator_install=false

# openshift_metrics_install_metrics=true

# openshift_enable_unsupported_configurations=True

#openshift_logging_es_nodeselector='node-role.kubernetes.io/infra: "true"'

#openshift_logging_kibana_nodeselector='node-role.kubernetes.io/infra: "true"'

# host group for masters

[masters]

master.openshift.example.com  openshift_public_hostname="master.openshift.example.com"

# host group for etcd

[etcd]

master.openshift.example.com  openshift_public_hostname="master.openshift.example.com"

# host group for nodes, includes region info

[nodes]

master.openshift.example.com openshift_public_hostname="master.openshift.example.com"  openshift_node_group_name='node-config-master-infra'

node1.openshift.example.com openshift_public_hostname="node1.openshift.example.com" openshift_node_group_name='node-config-compute'

node2.openshift.example.com openshift_public_hostname="node2.openshift.example.com" openshift_node_group_name='node-config-compute'
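Before running the playbooks against this inventory, you can confirm Ansible can reach every host (a suggested check, using the host.311 file shown above):

# ansible -i host.311 OSEv3 -m ping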

10 Errors that may occur during installation

10.1 If openshift_cluster_monitoring_operator_install is enabled, the master must be given openshift_node_group_name='node-config-master-infra'.

See https://github.com/vorburger/opendaylight-coe-kubernetes-openshift/issues/5

10.2 When installing behind a proxy, openshift_no_proxy must also be set.

See https://github.com/openshift/openshift-ansible/issues/11365

10.3 "FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created" (https://github.com/openshift/openshift-ansible/issues/10427)

10.3.1 One reported cause: /etc/sysconfig/network-scripts/ifcfg-eth0 (on CentOS) contains the flag NM_CONTROLLED=no.

10.3.2 A workaround reported in the same issue thread:

I have the same issue, but what I did was:

Add NM_CONTROLLED=yes to ifcfg-eth0 on all my nodes.

Verify my pods with: $ oc get pods --all-namespaces

$ oc describe pod cluster-monitoring-operator-WXYZ-ASDF -n openshift-monitoring ==> the last part of this output showed the reason the pod did not start; I had this message:

Warning  FailedCreatePodSandBox  1h                  kubelet, infra-openshift-nuuptech  Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "70719b9ee2bb9c54fc1d866a6134b229b3c1c151148c9558ea0a4ef8cb66526a" network for pod "cluster-monitoring-operator-67579f5cb5-gxmwc": NetworkPlugin cni failed to set up pod "cluster-monitoring-operator-67579f5cb5-gxmwc_openshift-monitoring" network:failed to find plugin "bridge" in path [/opt/cni/bin], failed to clean up sandbox container "70719b9ee2bb9c54fc1d866a6134b229b3c1c151148c9558ea0a4ef8cb66526a" network for pod "cluster-monitoring-operator-67579f5cb5-gxmwc": NetworkPlugin cni failed to teardown pod "cluster-monitoring-operator-67579f5cb5-gxmwc_openshift-monitoring" network: failed to find plugin "bridge" in path [/opt/cni/bin]]

Searching for the key error, failed to find plugin "bridge" in path [/opt/cni/bin], led to the following solution:

Normally the only file in /etc/cni/net.d should be 80-openshift-network.conf, but I had three files:

$ ls -l /etc/cni/net.d

-rw-r--r--. 1 root root 294 Mar 12 16:46 100-crio-bridge.conf

-rw-r--r--. 1 root root  54 Mar 12 16:46 200-loopback.conf

-rw-r--r--. 1 root root  83 May 15 16:15 80-openshift-network.conf

Red Hat suggests deleting the extra files and keeping only 80-openshift-network.conf, but I just moved 100-crio-bridge.conf and 200-loopback.conf to another directory. After doing that I rebooted all my nodes, then on the master node I ran playbooks/openshift-monitoring/config.yml again, and it worked.
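As a concrete sketch of that workaround (the backup directory name is our own choice; the inventory file follows the example in section 9):

# mkdir -p /root/cni-backup

# mv /etc/cni/net.d/100-crio-bridge.conf /etc/cni/net.d/200-loopback.conf /root/cni-backup/

# reboot

Then, on the master:

# cd /usr/share/ansible/openshift-ansible

# ansible-playbook -i host.311 playbooks/openshift-monitoring/config.yml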


11 Creating a login user after installation

Because admin cannot log in to the web console directly, you need to create a user.

11.1 Use htpasswd to create a user dev with password dev

htpasswd -b /etc/origin/master/htpasswd dev dev

11.2 Grant the dev user the cluster-admin role so it can access all projects in the cluster

# oc login -u system:admin

# htpasswd -b /etc/origin/master/htpasswd dev dev

# oc adm policy add-cluster-role-to-user cluster-admin dev

[root@master openshift-ansible]# oc get clusterrolebindings |grep dev

cluster-admin-0                                                            /cluster-admin                                                          dev

11.3 Visit https://master.openshift.example.com:8443

Log in with username dev and password dev.
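The same credentials also work from the command line (standard oc login usage):

# oc login https://master.openshift.example.com:8443 -u dev -p dev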

12 Uninstalling OpenShift

ansible-playbook -i host.311 /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
