OCP 4.5 Air Gap Installation with Static IP

In OpenShift Container Platform 4.5, you can perform an installation that does not require an active internet connection to obtain software components. You complete an air-gapped installation only on infrastructure that you provision, not infrastructure that the installation program provisions, so your platform choices are limited. Typically, this approach applies to installations on bare metal hardware or on VMware vSphere.

Note: Restricted network installations always use user-provisioned infrastructure.

Requirements

Entitlement requirement

To download the pull secret for pulling the OpenShift images, you need a Red Hat subscription with an OpenShift entitlement.

Machine requirement

In this runbook, we focus on installation with x86_64 machines. The machines can be bare metal servers or VMs provisioned with VMware vSphere ESXi. VMware vSphere ESXi 6.7U2 or later is recommended.

Note:

To maintain high availability of your cluster, use separate physical hosts for these cluster machines.

The bootstrap and control plane machines must use Red Hat Enterprise Linux CoreOS (RHCOS) as the operating system. For easier maintenance and upgrades, it is recommended to use RHCOS as the operating system for the compute (worker) machines too.

The disks of the master and worker nodes should be SSDs. To ensure that the storage partition has good disk I/O performance, run the disk latency test and the disk throughput test:

Disk latency test

dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync

The result must be better than or comparable to: 4096000 bytes (4.1 MB, 3.9 MiB) copied, 1.5625 s, 2.5 MB/s

Disk throughput test

dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync

The result must be better than or comparable to: 1073741824 bytes (1.1 GB) copied, 5.14444 s, 209 MB/s
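If you want to script these checks, the MB/s figure can be extracted from dd's summary line. A minimal sketch; the TEST_PATH default and the helper names are assumptions to adapt:

```shell
#!/bin/sh
# Helpers (a sketch) to run the two dd checks and pull out the MB/s figure.
# TEST_PATH is an assumption -- point it at the storage partition to verify.
TEST_PATH=${TEST_PATH:-/tmp}

parse_throughput() {
  # dd prints "... copied, 1.5625 s, 2.5 MB/s"; keep the last two fields
  awk '/copied/ {print $(NF-1), $NF}'
}

latency_test() {
  dd if=/dev/zero of="$TEST_PATH/testfile" bs=4096 count=1000 oflag=dsync 2>&1 | parse_throughput
  rm -f "$TEST_PATH/testfile"
}

throughput_test() {
  dd if=/dev/zero of="$TEST_PATH/testfile" bs=1G count=1 oflag=dsync 2>&1 | parse_throughput
  rm -f "$TEST_PATH/testfile"
}

# Usage:
#   latency_test      # expect >= 2.5 MB/s
#   throughput_test   # expect >= 209 MB/s
```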

Network connectivity requirements

10 Gbps network bandwidth between nodes

Prepare and set a static IP address for each node.

Make sure the following ports are open on the load balancer node: 6443/tcp, 22623/tcp, 443/tcp, 80/tcp, and 123/udp.

User-provisioned DNS requirements

All node names should be resolvable by the DNS server.

The following DNS records are also required and must be resolvable both by clients external to the cluster and from all the nodes within the cluster. In each record, <cluster_name> is the cluster name and <base_domain> is the cluster base domain that you specify in the install-config.yaml file. A complete DNS record takes the form: <component>.<cluster_name>.<base_domain>.

The following DNS A/AAAA or CNAME records must point to the load balancer:

*.apps.<cluster_name>.<base_domain> IN A load_balancer_IP

api.<cluster_name>.<base_domain> IN A load_balancer_IP

api-int.<cluster_name>.<base_domain> IN A load_balancer_IP

The following DNS A/AAAA and SRV records must point to the control plane machines, respectively.

etcd-0.<cluster_name>.<base_domain> IN A master1_IP

etcd-1.<cluster_name>.<base_domain> IN A master2_IP

etcd-2.<cluster_name>.<base_domain> IN A master3_IP

_etcd-server-ssl._tcp.<cluster_name>.<base_domain> IN SRV 0 10 2380 etcd-0.<cluster_name>.<base_domain>.

_etcd-server-ssl._tcp.<cluster_name>.<base_domain> IN SRV 0 10 2380 etcd-1.<cluster_name>.<base_domain>.

_etcd-server-ssl._tcp.<cluster_name>.<base_domain> IN SRV 0 10 2380 etcd-2.<cluster_name>.<base_domain>.
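Before installing, it is worth confirming that every required name resolves. A minimal sketch, assuming the example cluster name and base domain used later in this runbook; the required_records helper and the dig loop are illustrative:

```shell
#!/bin/sh
# Build the list of required cluster DNS names (a sketch; values from this example).
CLUSTER_NAME=${CLUSTER_NAME:-coc-g1}
BASE_DOMAIN=${BASE_DOMAIN:-cdl.ibm.com}

required_records() {
  # test.apps stands in for the wildcard *.apps record
  for host in api api-int test.apps etcd-0 etcd-1 etcd-2; do
    echo "$host.$CLUSTER_NAME.$BASE_DOMAIN"
  done
}

required_records

# To actually resolve each name (requires dig and the DNS server configured):
#   required_records | while read -r name; do
#     dig +short "$name" | grep -q . || echo "MISSING: $name"
#   done
```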

An example installation

Topology

To complete an air gap installation, you must create a registry that mirrors the contents of the OpenShift Container Platform registry and contains the installation media. In most air gap scenarios, even the bastion node cannot access the internet, so you may need to create this registry on a mirror host at the customer site that can access both the internet and your closed network; a download server at the customer site is recommended. Alternatively, you can set up a download server on your own machine and then copy the mirrored contents of the OpenShift Container Platform registry and the installation media to the bastion node at the customer site.

Installation environment

Download server: 9.30.56.172

VMs provisioned with VMware vSphere ESXi 6.7:

coc-g1-bastion     rhel7.6   8vcpu,  32G, 150G + 500G + 500G    9.123.120.101

coc-g1-bootstrap   rhcos     4vcpu,  16G, 120G                  9.123.120.102

coc-g1-master01    rhcos     4vcpu,  16G, 300G                  9.123.120.103

coc-g1-master02    rhcos     4vcpu,  16G, 300G                  9.123.120.104

coc-g1-master03    rhcos     4vcpu,  16G, 300G                  9.123.120.105

coc-g1-worker01    rhcos     16vcpu, 64G, 300G                  9.123.120.106

coc-g1-worker02    rhcos     16vcpu, 64G, 300G                  9.123.120.107

coc-g1-worker03    rhcos     16vcpu, 64G, 300G                  9.123.120.115

Gateway: 9.123.120.1

DNS: 9.0.148.50

Base domain: cdl.ibm.com

Cluster name: coc-g1

DNS Records:

api IN A 9.123.120.101

api-int IN A 9.123.120.101

*.apps IN A 9.123.120.101

etcd-0 IN A 9.123.120.103

etcd-1 IN A 9.123.120.104

etcd-2 IN A 9.123.120.105

_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-0.coc-g1.cdl.ibm.com.

_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-1.coc-g1.cdl.ibm.com.

_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-2.coc-g1.cdl.ibm.com.

coc-g1-bastion  IN A  9.123.120.101

coc-g1-bootstrap  IN A  9.123.120.102

coc-g1-master01  IN A  9.123.120.103

coc-g1-master02  IN A  9.123.120.104

coc-g1-master03  IN A  9.123.120.105

coc-g1-worker01  IN A  9.123.120.106

coc-g1-worker02  IN A  9.123.120.107

coc-g1-worker03  IN A  9.123.120.115

Download assets for installation

Assumption: All the nodes including the bastion node can't connect to the internet.

You need to do this on an internet-connected server with Red Hat Enterprise Linux 7.6+ installed. Let's call it the download server here.

Connect to the download server

ssh [email protected]

Stop the firewall (if active)

systemctl stop firewalld

systemctl disable firewalld

Install required packages

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

yum install -y epel-release-latest-7.noarch.rpm

yum -y install wget podman httpd-tools jq --nogpgcheck

Update the /etc/hosts file in the download server with the registry server information

If you are using a separate download server (full air-gapped install), you can avoid having to regenerate the self-signed certificate and tweak the JSON files by temporarily adding the registry server to the /etc/hosts file.

In the example below, the jhwcx1.fyre.ibm.com server is connected to the internet and is used to download all packages and images; the coc-g1-bastion.coc-g1.cdl.ibm.com server is the registry server (bastion node), which is accessible from the OpenShift cluster. In reality, the coc-g1-bastion.coc-g1.cdl.ibm.com server has IP address 9.123.120.101.

[root@jhwcx1 ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.11.17.215   jhwcx1.fyre.ibm.com   jhwcx1

10.11.17.215   coc-g1-bastion.coc-g1.cdl.ibm.com

Set environment variables

Make sure you adapt the variables below to your environment.

export REGISTRY_SERVER=coc-g1-bastion.coc-g1.cdl.ibm.com

export REGISTRY_PORT=5000

export LOCAL_REGISTRY="${REGISTRY_SERVER}:${REGISTRY_PORT}"

export EMAIL="[email protected]"

export REGISTRY_USER="admin"

export REGISTRY_PASSWORD="passw0rd"

export OCP_RELEASE="4.5.6"

export RHCOS_RELEASE="4.5.6"

export LOCAL_REPOSITORY="ocp4/openshift4"

export PRODUCT_REPO="openshift-release-dev"

export LOCAL_SECRET_JSON="/ocp4_downloads/ocp4_install/ocp_pullsecret.json"

export RELEASE_NAME="ocp-release"

Prepare OpenShift download directory

mkdir -p /ocp4_downloads/{clients,dependencies,ocp4_install}

mkdir -p /ocp4_downloads/registry/{auth,certs,data,images}

Retrieve OpenShift client and CoreOS downloads

cd /ocp4_downloads/clients

wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/openshift-client-linux.tar.gz

or

wget http://9.111.98.221/repo/openshift-client-linux.tar.gz

wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/openshift-install-linux.tar.gz

or

wget http://9.111.98.221/repo/openshift-install-linux.tar.gz


cd /ocp4_downloads/dependencies

wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.5.6-x86_64-installer.x86_64.iso

or

wget http://9.111.98.221/repo/rhcos-4.5.6-x86_64-installer.x86_64.iso

wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz

or

wget http://9.111.98.221/repo/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz

Install OpenShift client

tar xvzf /ocp4_downloads/clients/openshift-client-linux.tar.gz -C /usr/local/bin

Generate certificate

cd /ocp4_downloads/registry/certs

openssl req -newkey rsa:4096 -nodes -sha256 -keyout registry.key -x509 -days 365 -out registry.crt -subj "/C=US/ST=/L=/O=/CN=$REGISTRY_SERVER"
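Before wiring the certificate into the registry, you can sanity-check its subject and validity window. A small sketch; the cert_info helper name is illustrative:

```shell
#!/bin/sh
# Quick sanity check (a sketch): print the subject CN and validity window of a cert.
cert_info() {
  openssl x509 -in "$1" -noout -subject -dates
}
# Usage, from /ocp4_downloads/registry/certs:
#   cert_info registry.crt    # the subject CN should match $REGISTRY_SERVER
```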

Create password for registry

Change the password to something more secure if you want to.

htpasswd -bBc /ocp4_downloads/registry/auth/htpasswd $REGISTRY_USER $REGISTRY_PASSWORD

Download registry image

cd /ocp4_downloads/registry/images

wget http://9.111.98.221/repo/registry2.tar

podman load -i registry2.tar

Alternatively:

podman pull docker.io/library/registry:2

podman save -o /ocp4_downloads/registry/images/registry2.tar docker.io/library/registry:2

Download NFS provisioner image

cd /ocp4_downloads/registry/images

wget http://9.111.98.221/repo/nfs-client-provisioner.tar

Alternatively:

podman pull quay.io/external_storage/nfs-client-provisioner:latest

podman save -o /ocp4_downloads/registry/images/nfs-client-provisioner.tar quay.io/external_storage/nfs-client-provisioner:latest

Create registry pod

podman run --name mirror-registry --publish $REGISTRY_PORT:5000 --detach --volume /ocp4_downloads/registry/data:/var/lib/registry:z --volume /ocp4_downloads/registry/auth:/auth:z --volume /ocp4_downloads/registry/certs:/certs:z --env "REGISTRY_AUTH=htpasswd" --env "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" --env REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key docker.io/library/registry:2

Add certificate to trusted store

cp -f /ocp4_downloads/registry/certs/registry.crt /etc/pki/ca-trust/source/anchors/

update-ca-trust

Check if you can connect to the registry

curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/_catalog

Output should be:

{"repositories":[]}

Create pull secret file

Create file /tmp/ocp_pullsecret.json and insert the contents of the pull secret you retrieved from:

https://cloud.redhat.com/openshift/install/metal/user-provisioned

Generate air-gapped pull secret

The air-gapped pull secret will be used when installing OpenShift.

AUTH=$(echo -n "$REGISTRY_USER:$REGISTRY_PASSWORD" | base64 -w0)

CUST_REG='{"%s": {"auth":"%s", "email":"%s"}}\n'

printf "$CUST_REG" "$LOCAL_REGISTRY" "$AUTH" "$EMAIL" > /tmp/local_reg.json

jq --argjson authinfo "$(cat /tmp/local_reg.json)" '.auths += $authinfo' /tmp/ocp_pullsecret.json > /ocp4_downloads/ocp4_install/ocp_pullsecret.json

cat /ocp4_downloads/ocp4_install/ocp_pullsecret.json | jq

The contents of the /ocp4_downloads/ocp4_install/ocp_pullsecret.json should be something like this:

{

  "auths": {

    "cloud.openshift.com": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "quay.io": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "registry.connect.redhat.com": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "registry.redhat.io": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "coc-g1-bastion.coc-g1.cdl.ibm.com:5000": {

      "auth": "YWRtaW46cGFzc3cwcmQ=",

      "email":"[email protected]"

    }

  }

}

Mirror registry

Start a screen session.

The screen command launches a terminal in the background which can be detached from and then reconnected to. This is especially useful when you log in to the system remotely. You can start a screen, kick off a command, detach from the screen, and log out. You can then log in later and reattach to the screen and see the program running.

screen

This takes 5-10 minutes to complete.

oc adm -a ${LOCAL_SECRET_JSON} release mirror --from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 --to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} --to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}

Output should be something like this:

Success

Update image: coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4:4.5.6

Mirror prefix: coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

To use the new mirrored repository to install, add the following section to the install-config.yaml:

imageContentSources:

- mirrors:

  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

  source: quay.io/openshift-release-dev/ocp-release

- mirrors:

  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:

apiVersion: operator.openshift.io/v1alpha1

kind: ImageContentSourcePolicy

metadata:

  name: example

spec:

  repositoryDigestMirrors:

  - mirrors:

    - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

    source: quay.io/openshift-release-dev/ocp-release

  - mirrors:

    - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

Note:

Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.

To create the installation program that is based on the content that you mirrored, extract it and pin it to the release.

cd /ocp4_downloads/clients

oc adm -a ${LOCAL_SECRET_JSON} release extract --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}" --loglevel=10

cp openshift-install /usr/local/bin/

openshift-install -h 

openshift-install version 


Note:

To ensure that you use the correct images for the version of OpenShift Container Platform that you selected, you must extract the installation program from the mirrored content.

You must perform this step on a machine with an active internet connection.

Package up downloads directory and ship it to the Bastion node

If your registry server cannot connect to the internet, you will have to create a tarball and send it to the registry server.

Stop the registry 

podman rm -f mirror-registry

Remove the registry server from the /etc/hosts file

You can now remove the registry server entry from the /etc/hosts file on the download server.

Tar the downloads directory on the download server. This creates a tarball of ~5 GB.

tar czf /tmp/ocp4_downloads.tar.gz /ocp4_downloads

Send the tar ball to the registry server (bastion node)

How the tarball is shipped depends on how the registry server can be reached: scp, some kind of shared folder, or plain USB sticks may have to be used.

Download RHEL RPMs and ship it to the Bastion node

Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems

https://access.redhat.com/solutions/3176811

RHEL 7 RPM downloads for creating local repository

https://docs.openshift.com/container-platform/3.11/install/disconnected_install.html#disconnected-syncing-repos

Alternatively:

cd /tmp

wget http://9.111.98.221/repo/rhel-7-server.tgz

wget http://9.111.98.221/repo/rhel-7-server-extras.tgz

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

Send the tarball to the registry server (bastion node). Suppose you send these files to the /ocp4_downloads/ folder on the bastion node.

How the tarball is shipped depends on how the bastion node can be reached: scp, some kind of shared folder, or plain USB sticks may have to be used.

======================Day1 Milestone=============================================

Extend Storage for Bastion node

Log on to the bastion node as root

pvcreate /dev/sdb

vgextend rhel /dev/sdb

lvextend -l +100%FREE /dev/rhel/root

xfs_growfs /

df -h

Note: this step is not needed for group 1 and group 2 because it has already been done.

Serve the registry on Bastion node

Log on to the bastion node as root

Enable required repositories on the bastion node

You must install certain packages on the bastion node (and optionally NFS) for the installation. These will come from Red Hat Enterprise Linux and EPEL repositories. 

Make sure the following repositories are available from the Bastion node or the satellite server in use for the infrastructure:

rhel-server-rpms - Red Hat Enterprise Linux Server (RPMs)

As the bastion node is offline, we can refer to the following link for creating the local repository.

Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems:

https://access.redhat.com/solutions/3176811

tar -xvf rhel-7-server.tgz

tar -xvf rhel-7-server-extras.tgz

(This is only for Ring Cloud in Lab.)

systemctl stop puppet

systemctl disable puppet

vi /etc/yum.repos.d/local.repo

[rhel-7-server-rpms]

name=rhel-7-server-rpms

baseurl=file:///ocp4_downloads/rhel-7-server-rpms

enabled=1

gpgcheck=0

[rhel-7-server-extras-rpms]

name=rhel-7-server-extras-rpms

baseurl=file:///ocp4_downloads/rhel-7-server-extras-rpms

enabled=1

gpgcheck=0

For EPEL, you need the following repository:

epel/x86_64 - Extra Packages for Enterprise Linux - x86_64

If you don't have this repository configured yet, you can do as follows for RHEL 8:

yum install -y /ocp4_downloads/epel-release-latest-8.noarch.rpm

For RHEL 7, do the following:

yum install -y /ocp4_downloads/epel-release-latest-7.noarch.rpm

mv /etc/yum.repos.d/itaas* /tmp

rm -rf /var/cache/yum

yum repolist

Make sure the rhel-7-server-rpms and rhel-7-server-extras-rpms repositories are listed.

Change SELinux to permissive on the Bastion node

Some services such as httpd, nginx, and haproxy require special settings to run under SELinux. As we're focusing on the installation steps, set SELinux to permissive mode.

sed -i 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/selinux/config

setenforce 0

cat /etc/selinux/config

sestatus

getenforce

Stop and disable firewalld on the Bastion node

systemctl stop firewalld

systemctl disable firewalld


Install required packages

yum -y install wget podman httpd httpd-tools jq net-tools tree bind-utils nfs-utils screen python3 yum-utils chrony --nogpgcheck

podman images

podman version

Check and modify the /etc/hosts file

Check the /etc/hosts file and ensure that it has the correct entry for the registry server, such as:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

9.123.120.101   coc-g1-bastion.coc-g1.cdl.ibm.com   coc-g1-bastion

Untar the tar ball on the registry server

scp [email protected]:/tmp/ocp4_downloads.tar.gz /tmp/ocp4_downloads.tar.gz

tar xzf /tmp/ocp4_downloads.tar.gz -C /

Set environment variables

export REGISTRY_SERVER="coc-g1-bastion.coc-g1.cdl.ibm.com"

export REGISTRY_PORT=5000

export LOCAL_REGISTRY="${REGISTRY_SERVER}:${REGISTRY_PORT}"

export REGISTRY_USER="admin"

export REGISTRY_PASSWORD="passw0rd"

export LOCAL_REPOSITORY="ocp4/openshift4"

export LOCAL_SECRET_JSON="/ocp4_downloads/ocp4_install/ocp_pullsecret.json"

Create registry pod on the bastion node

podman load -i /ocp4_downloads/registry/images/registry2.tar

podman run --name mirror-registry --publish $REGISTRY_PORT:5000 --detach --volume /ocp4_downloads/registry/data:/var/lib/registry:z --volume /ocp4_downloads/registry/auth:/auth:z --volume /ocp4_downloads/registry/certs:/certs:z --env "REGISTRY_AUTH=htpasswd" --env "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" --env REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key docker.io/library/registry:2

Add certificate to trusted store on the new registry server

/usr/bin/cp -f /ocp4_downloads/registry/certs/registry.crt /etc/pki/ca-trust/source/anchors/

update-ca-trust

Check if you can connect to the registry

curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/_catalog

The output should be as follows (as the registry now has content):

{"repositories":["ocp4/openshift4"]}

curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/ocp4/openshift4/tags/list | jq

Create systemd unit file to ensure the registry is started after reboot

podman generate systemd mirror-registry -n > /etc/systemd/system/container-mirror-registry.service

systemctl enable container-mirror-registry.service

systemctl daemon-reload

Create push secret for the registry

podman login --authfile push-secret.json coc-g1-bastion.coc-g1.cdl.ibm.com:5000

Input the user name and password created:

Username: admin

Password: passw0rd

Login Succeeded!


Check the created Push Secret:

cat push-secret.json

Get the pull secret from RedHat's official site

Go to https://cloud.redhat.com/openshift/install/metal

Then download the pull secret by clicking 'Download pull secret', or copy it by clicking 'Copy pull secret', and save it to a text file pull-secret.txt.

Note: You need a Red Hat account to download the pull secret.

Merge the push secret and the pull secret

Upload the pull secret file pull-secret.txt to the same directory as the push secret file push-secret.json.

Merge pull-secret.txt and push-secret.json into a new file pull-push_secret.json.

cat pull-secret.txt | jq . > pull-push_secret.json

cat pull-push_secret.json

Before the two closing braces (that is, starting at the third line from the end), add the content of push-secret.json to pull-push_secret.json.

It would be like this:

[root@coc-g1-bastion ~]# 

cat pull-push_secret.json

{

  "auths": {

    "cloud.openshift.com": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "quay.io": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "registry.connect.redhat.com": {

      "auth": "xxx",

      "email":"[email protected]"

    },

    "registry.redhat.io": {

      "auth": "xxx",

      "email":"[email protected]"

    },

   "coc-g1-bastion.coc-g1.cdl.ibm.com:5000": {

        "auth": "YWRtaW46cGFzc3cwcmQ="

   }

  }

}

Check if the modification is correct.

cat pull-push_secret.json | jq
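If you prefer not to edit the file by hand, the two auths maps can also be merged with jq, which is installed above. A sketch; the merge_secrets helper is illustrative, and entries from the second file win on key conflicts:

```shell
#!/bin/sh
# Merge the auths sections of two secret files into one (a sketch; requires jq).
merge_secrets() {
  # $1 = pull secret, $2 = push secret, $3 = output file
  jq -s '{auths: (.[0].auths + .[1].auths)}' "$1" "$2" > "$3"
}
# Usage:
#   merge_secrets pull-secret.txt push-secret.json pull-push_secret.json
```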

Config Http Server

yum install -y httpd

systemctl enable httpd

systemctl start httpd

ln -s /ocp4_downloads /var/www/html/ocp4

chown -R apache: /var/www/html/

chmod -R 755 /var/www/html/


vi /etc/httpd/conf/httpd.conf

From: 

Listen 80

To be:

Listen 81

Note: it's recommended to use port 81 instead of port 80 if you will configure both HAProxy and the HTTP server on the bastion node.

systemctl restart httpd

Check that we can list entries:

curl -L -s http://${REGISTRY_SERVER}:81/ocp4 --list-only

Output should be an HTML directory listing titled "Index of /ocp4".

Prepare the installation files

cd /ocp4_downloads/clients

tar xvzf /ocp4_downloads/clients/openshift-client-linux.tar.gz -C /usr/local/bin

tar xvzf /ocp4_downloads/clients/openshift-install-linux.tar.gz -C /usr/local/bin


Generate SSH key for SSH access to cluster nodes

ssh-keygen -t rsa -b 4096 -N ''

eval "$(ssh-agent -s)"

ssh-add ~/.ssh/id_rsa

Note: it is ~/.ssh/id_rsa.pub, rather than ~/.ssh/id_rsa, that will be specified in install-config.yaml.

Set up Load Balancer/HAProxy

yum install -y haproxy 

systemctl enable haproxy

systemctl start haproxy

systemctl status haproxy

vi /etc/haproxy/haproxy.cfg

For details, please copy a haproxy.cfg file from a working environment.

haproxy template: https://ibm.box.com/s/nco6340m1t4yrk945w4xm4bgtgqsqn2c

systemctl restart haproxy

Create installation configuration file

mkdir /ibm

cd /ibm

mkdir -p installation_directory

cd installation_directory

vi install-config.yaml 

You can follow this install-config template:

https://ibm.box.com/s/6ivavyiaxm6uwf6qo9dcime36gtm7guv

Note:

1) It is ~/.ssh/id_rsa.pub, rather than ~/.ssh/id_rsa, that will be specified in install-config.yaml.

2) The pullSecret should be specified with the push secret we created for accessing the local image registry.

3) The additionalTrustBundle should be specified with the certificate we created for accessing the local image registry.

In this installation case, it should be specified with the content of the certificate file /ocp4_downloads/registry/certs/registry.crt, and it should be one line. If there are multiple lines in the certificate file, combine them into one line.

4) The content of the imageContentSources should be from the imageContentSources section of the output when doing mirror registry.

imageContentSources:

- mirrors:

  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

  source: quay.io/openshift-release-dev/ocp-release

- mirrors:

  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4

  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

Note:

We'd better back up the install-config.yaml file, as it will be deleted during the installation.

cp install-config.yaml /ibm/install-config.yaml 

cd ..

If the certificates and ignition files have expired (after 24 hours) and OCP still hasn't been installed, start from here again:

Generate the Kubernetes manifest and Ignition config files

Generate manifests

openshift-install create manifests --dir=installation_directory

Example output:

INFO Consuming Install Config from target directory

WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings

For --dir, specify the installation directory that contains the install-config.yaml file you created.

Because you create your own compute machines later in the installation process, you can safely ignore this warning.

Note:

Modify the installation_directory/manifests/cluster-scheduler-02-config.yml Kubernetes manifest file to prevent pods from being scheduled on the control plane machines:

Open the installation_directory/manifests/cluster-scheduler-02-config.yml file.

Locate the mastersSchedulable parameter and set its value to false.

Save and exit the file.

Generate ignition files

cd /ibm

openshift-install create ignition-configs --dir=installation_directory

The following files are generated in the directory:

.

├── auth

│   ├── kubeadmin-password

│   └── kubeconfig

├── bootstrap.ign

├── master.ign

├── metadata.json

└── worker.ign

Note:

1. The certificates in the ignition files will expire after 24 hours.

2. For each install, you need to back up the installation_directory and then delete it.

Configuring time synchronization service with chrony

Configure chrony server

1. Install the chrony time service

yum install -y chrony

systemctl enable chronyd

systemctl start chronyd

systemctl status chronyd

2. Configure chrony

The UDP port number 123 needs to be open in the firewall in order to allow the client access.

firewall-cmd --permanent --zone=public --add-port=123/udp

firewall-cmd --reload

vi /etc/chrony.conf

For the details, please refer to this template and modify the server and allow sections accordingly. It is recommended to take the Bastion node as the chrony server.

https://ibm.box.com/s/vgo11wdax74ubx6hlfvkjckxyouy64vx

systemctl restart chronyd

3. Verify

Run the chronyc tracking command to check chrony tracking.

chronyc tracking

Some of the important fields in the output are:

Reference ID: This is the reference ID and name (or IP address) if available, of the server to which the computer is currently synchronized.

Stratum: The stratum indicates how many hops away from a computer with an attached reference clock you are.

Ref time: This is the time (UTC) at which the last measurement from the reference source was processed.

Configure chrony client

1. Create the contents of the chrony.conf file and encode it as base64.

For example:

$ cat << EOF | base64 

server 9.123.120.101 iburst

driftfile /var/lib/chrony/drift

makestep 1.0 3

rtcsync

logdir /var/log/chrony

EOF

Example output (the encoded string for your chrony.conf will differ):

ICAgIHNlcnZlciBjbG9jay5yZWRoYXQuY29tIGlidXJzdAogICAgZHJpZnRmaWxlIC92YXIvb

GliL2Nocm9ueS9kcmlmdAogICAgbWFrZXN0ZXAgMS4wIDMKICAgIHJ0Y3N5bmMKICAgIGxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK

2. Create the MachineConfig files, replacing the base64 string with the one you just created.

Refer to this template for adding the chrony.conf to master nodes.  Make sure you modify the base64 value accordingly.

99-masters-chrony-configuration.yaml

https://ibm.box.com/s/dzzdojnmhi34lrzn7l686m521uxqmup0

Refer to this template for adding the chrony.conf to worker nodes. Make sure you modify the base64 value accordingly.

99-workers-chrony-configuration.yaml

https://ibm.box.com/s/klfbzti9ph1dy0t52bycr1w4dfdm97k7

3. Make a backup copy of the configuration files.

4. Add these files to the installation_directory/openshift directory.
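The encode-and-embed steps above can be sketched end to end as follows. This is illustrative: the MachineConfig layout matches the 4.5-era worker template referenced above, but verify it against your template before use, and for masters change the role label and names from worker to master. The chrony server IP is from this example environment.

```shell
#!/bin/sh
# Sketch: base64-encode a chrony.conf and embed it in a worker MachineConfig.
CHRONY_B64=$(base64 -w0 <<'EOF'
server 9.123.120.101 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
)

cat > 99-workers-chrony-configuration.yaml <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-workers-chrony-configuration
spec:
  config:
    ignition:
      version: 2.2.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,$CHRONY_B64
        filesystem: root
        mode: 420
        path: /etc/chrony.conf
EOF
```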

Prepare CoreOS Installation Files

mkdir /ibm/coreos_inst

cp /ibm/installation_directory/*.ign /ibm/coreos_inst

cp /ocp4_downloads/dependencies/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz /ibm/coreos_inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz

ln -s /ibm/coreos_inst /var/www/html/inst

chown -R apache: /var/www/html/

chmod -R 755 /var/www/html/

chown -R apache: /var/www/html/inst

chmod -R 755 /var/www/html/inst

Install Operating System

You need VMware vSphere ESXi administrator privileges to perform the CoreOS operating system installation. If you can't get administrator privileges from the customer, you may co-work with the customer's IT administrator on the installation.

Log in to the VMware vSphere ESXi web console

Log in to the VMware vSphere ESXi web console with the administrator credentials.

Upload the CoreOS ISO image to the datastore

Upload the rhcos-4.5.6-x86_64-installer.x86_64.iso image to the datastore.

Edit Settings for selecting the ISO image as the boot option

Select the uploaded rhcos-4.5.6-x86_64-installer.x86_64.iso image.

Start the installation

Power on the VM

You will see the RHCOS boot screen.

Press the Tab key and then enter the following kernel boot parameters for the bootstrap node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/bootstrap.ign ip=9.123.120.102::9.123.120.1:255.255.255.0:coc-g1-bootstrap.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

Press the Enter key to kick off the CoreOS installation.

Note:

When CoreOS is installed successfully, the node will restart and then pause at the login screen. There is no need to log in to it.

You can ignore the following message, which may occur repeatedly after CoreOS is installed:

"kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)"

Repeat the CoreOS installation steps for the other cluster nodes.

The only difference when installing CoreOS on the other nodes is the kernel boot parameters.

Enter the following parameters respectively when installing CoreOS on the other cluster nodes.

CoreOS Install for the master01 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.103::9.123.120.1:255.255.255.0:coc-g1-master01.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

CoreOS Install for the master02 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.104::9.123.120.1:255.255.255.0:coc-g1-master02.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

CoreOS Install for the master03 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.105::9.123.120.1:255.255.255.0:coc-g1-master03.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

CoreOS Install for the worker01 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.106::9.123.120.1:255.255.255.0:coc-g1-worker01.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

CoreOS Install for the worker02 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.107::9.123.120.1:255.255.255.0:coc-g1-worker02.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50

CoreOS Install for the worker03 node:

coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.115::9.123.120.1:255.255.255.0:coc-g1-worker03.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
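The entries above differ only in the Ignition file, the node IP, and the hostname; the `ip=` argument packs `ip=<node-ip>::<gateway>:<netmask>:<hostname>:<interface>:none` into colon-separated fields. As a sketch (the `make_boot_args` helper is ours, not part of any OpenShift tooling), the line for each node can be generated instead of hand-typed, which avoids typos in those fields:

```shell
# Hypothetical helper: build the boot line for a node from the three values
# that vary. Fixed values (HTTP server, gateway, netmask, domain, NIC, DNS)
# are taken from this runbook's environment.
make_boot_args() {
  ign="$1"; addr="$2"; host="$3"
  echo "coreos.inst.install_dev=sda" \
       "coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz" \
       "coreos.inst.ignition_url=http://9.123.120.101:81/inst/${ign}.ign" \
       "ip=${addr}::9.123.120.1:255.255.255.0:${host}.coc-g1.cdl.ibm.com:ens192:none" \
       "nameserver=9.0.148.50"
}

make_boot_args bootstrap 9.123.120.102 coc-g1-bootstrap
make_boot_args master 9.123.120.103 coc-g1-master01
```

Generate the line on the bastion node and compare it against what you type at the boot prompt.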


Reinstall CoreOS from a previous installation failure (Optional)

Force BIOS setup

Configure the boot order and make CD-ROM Drive the first boot option.

Exit and save the changes. 

Then restart the VM and perform the CoreOS installation again.

Monitor the bootstrap installation

openshift-install --dir=installation_directory wait-for bootstrap-complete --log-level=debug

Gather logs (if the above command failed):

./openshift-install gather bootstrap --bootstrap 9.123.120.102 --master 9.123.120.103

======================Day2 Milestone=============================================

Logging in to the cluster

export KUBECONFIG=/ibm/installation_directory/auth/kubeconfig

oc whoami

Approving the CSRs for your machines

https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#installation-approve-csrs_installing-bare-metal

oc get csr

oc adm certificate approve <csr_name>

Approve all:

oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
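Node CSRs usually arrive in two waves: client CSRs first, then serving CSRs once the client ones are approved, so a single pass of the command above can miss some. A minimal sketch, assuming it is acceptable to poll until the queue drains (`approve_pending_csrs` is our name, not an `oc` subcommand):

```shell
# Sketch: repeatedly approve Pending CSRs until none remain, so both
# waves of node CSRs get handled.
approve_pending_csrs() {
  while oc get csr --no-headers 2>/dev/null | grep -q Pending; do
    for csr in $(oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}'); do
      oc adm certificate approve "$csr"
    done
    sleep 5
  done
}
```

Run it once after the workers boot; it returns when no CSR is left Pending.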

Configure the Operators that are not available.

watch -n5 oc get clusteroperators

Configure image registry

oc describe configs.imageregistry.operator.openshift.io

oc edit configs.imageregistry.operator.openshift.io/cluster

Change from

 managementState: Removed 

to be

 managementState: Managed 

Set up NFS server in Bastion node

systemctl enable rpcbind

systemctl enable nfs-server

systemctl start rpcbind

systemctl start nfs-server

firewall-cmd --zone=public --add-port=111/tcp --permanent

firewall-cmd --zone=public --add-port=111/udp --permanent

firewall-cmd --zone=public --add-port=2049/tcp --permanent

firewall-cmd --zone=public --add-port=2049/udp --permanent

firewall-cmd --zone=public --add-port=892/tcp --permanent

firewall-cmd --zone=public --add-port=662/udp --permanent

firewall-cmd --reload

pvcreate /dev/sdc

vgcreate nfs /dev/sdc

lvcreate -L 199G -n lv_image_registry nfs

mkfs.xfs -f -n ftype=1 -i size=512 -n size=8192 /dev/nfs/lv_image_registry

Set up Image Registry with NFS Storage

mkdir -p /opt/IBM/Cloud/OpenShift/PV/images

vi /etc/fstab

/dev/nfs/lv_image_registry /opt/IBM/Cloud/OpenShift/PV/images xfs defaults,noatime 1 2

mount /dev/nfs/lv_image_registry /opt/IBM/Cloud/OpenShift/PV/images

vi /etc/exports

# For IBM RedHat OpenShift private image registry

/opt/IBM/Cloud/OpenShift/PV/images *(rw,sync,no_wdelay,no_root_squash,insecure,fsid=0)

Check if the configuration succeeds:

exportfs -rav

Create Storage Class image-registry-sc

Dynamic NFS provisioning: 

https://medium.com/faun/openshift-dynamic-nfs-persistent-volume-using-nfs-client-provisioner-fcbb8c9344e

wget http://9.111.98.221/repo/kubernetes-incubator.zip

Create PVC:

vi image-registry-pvc.yaml

apiVersion: v1

kind: PersistentVolumeClaim

metadata:

    name: image-registry-pvc

    namespace: openshift-image-registry

spec:

    accessModes:

        - ReadWriteMany

    resources:

        requests:

            storage: 100Gi

    storageClassName: image-registry-sc

You can use this template directly.

https://ibm.box.com/s/a5h167t86he3jy5o046g4pgv698vx4tn

oc apply -f image-registry-pvc.yaml
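The `image-registry-sc` class only provisions volumes dynamically if the nfs-client provisioner from the link above is installed. Without it, the claim can be bound by a hand-made static PV; a minimal sketch, assuming the bastion (the HTTP server at 9.123.120.101) is also the NFS server, using the export path created earlier (the PV name `image-registry-pv` is ours):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: image-registry-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: image-registry-sc
  nfs:
    server: 9.123.120.101
    path: /opt/IBM/Cloud/OpenShift/PV/images
```

Save it as image-registry-pv.yaml and apply it with oc apply -f; `oc get pvc -n openshift-image-registry` should then show the claim as Bound.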


Edit the image registry operator

oc edit configs.imageregistry.operator.openshift.io

change to be as follows:

  managementState: Managed

 ...............

    write:

      maxInQueue: 0

      maxRunning: 0

      maxWaitInQueue: 0s

  storage:
    pvc:
      claim: image-registry-pvc


Approve the CSR

oc get csr -o name | xargs oc adm certificate approve

Check if the image-registry pod restarted successfully.

oc get pod -n openshift-image-registry

oc rsh image-registry-xxx-xxx 

df -h | grep registry

mount | grep registry 

Completing installation on user-provisioned infrastructure

Confirm that all the cluster components are online:

watch -n5 oc get clusteroperators

When all of the cluster Operators are AVAILABLE, you can complete the installation.

Monitor for cluster operator completion (worker nodes)

openshift-install --dir=installation_directory wait-for install-complete 

[root@coc-g1-bastion ibm]# openshift-install --dir=installation_directory wait-for install-complete

INFO Waiting up to 30m0s for the cluster at https://api.coc-g1.cdl.ibm.com:6443 to initialize...

INFO Waiting up to 10m0s for the openshift-console route to be created...

INFO Install complete!

INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/ibm/installation_directory/auth/kubeconfig'

INFO Access the OpenShift web-console here: https://console-openshift-console.apps.coc-g1.cdl.ibm.com

INFO Login to the console with user: "kubeadmin", and password: "F55t5-XeSoj-ospVr-BZTuV"

INFO Time elapsed: 0s

Expose and access the image registry

oc policy add-role-to-user registry-viewer kubeadmin

oc policy add-role-to-user registry-editor kubeadmin

oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge

HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')

podman login -u kubeadmin -p $(oc whoami -t) --tls-verify=false $HOST

[ Also see https://docs.openshift.com/container-platform/4.3/registry/securing-exposing-registry.html on exposing the registry ]

OpenShift Console Log In

oc login https://api.coc-g1.cdl.ibm.com:6443 -u kubeadmin -p F55t5-XeSoj-ospVr-BZTuV

======================Day3 Milestone=============================================

References

1. "kernel: SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)"

https://access.redhat.com/solutions/3348951

2. How to customize ignition

https://github.com/ashcrow/filetranspiler

3. OpenShift install failed with "x509: certificate has expired or is not yet valid"

https://github.com/openshift/installer/issues/1955

echo | openssl s_client -connect api.coc.cdl.ibm.com:6443 | openssl x509 -noout -text
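The same x509 date check can be rehearsed offline when no cluster endpoint is reachable. A sketch assuming only that `openssl` is installed (the CN and file paths are throwaway examples): generate a short-lived self-signed certificate and print its validity window, the same notBefore/notAfter fields you would inspect in the command above:

```shell
# Create a throwaway self-signed certificate valid for 1 day, then print
# its validity window -- the fields that reveal an expired or
# not-yet-valid certificate during installation.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=api.example.test" \
  -keyout /tmp/test.key -out /tmp/test.crt -days 1 2>/dev/null
openssl x509 -noout -dates -in /tmp/test.crt
```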

4. DHCP introduction

https://www.howtogeek.com/404891/what-is-dhcp-dynamic-host-configuration-protocol/

isoinfo -d -i rhcos-4.3.8-x86_64-installer.x86_64.iso | awk '/Volume id/ { print $3 }'

5. How to change boot order for VMWare vm

Edit Settings->Virtual Machine Options-> Boot Options -> Check the 'Enter BIOS Mode For Boot' -> Restart the Virtual Machine->Enter BIOS Settings->Boot->Shift+/_ for adjusting the boot order

6. How to add OpenShift 4 RHCOS Worker Nodes in UPI in new installations (< 24 hours)

https://access.redhat.com/solutions/4246261

7. Adding worker nodes to the OCP 4 UPI cluster existing 24+ hours

https://access.redhat.com/solutions/4799921

8. Authentication Operator in Unknown state post installation

https://access.redhat.com/solutions/4685861

9. Cannot see logs in console, and oc logs, oc exec, etc. give "tls internal server error"

https://access.redhat.com/solutions/4307511

10. openshift4-vsphere-static-ip

https://shanna-chan.blog/2019/07/26/openshift4-vsphere-static-ip/

11. https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-bare-metal.html

12. https://github.com/RedHatOfficial/ocp4-helpernode/blob/master/docs/quickstart.md

13. https://ibm.box.com/s/sm1puxxmuc8nlnjdjgyngnvvnkha758a

14. https://github.ibm.com/PrivateCloud-analytics/CEA-Zen/wiki/How-to-install-Portworx-2.5.0.1-on-RedHat-OpenShift-4.3-System

15. https://github.ibm.com/PrivateCloud-analytics/CEA-Zen/wiki/How-to-install-RHOS-Metrics-Server-for-managing-platform-resources

16. https://www.openshift.com/blog/openshift-4-2-vsphere-install-quickstart

17. https://www.openshift.com/blog/openshift-4-2-vsphere-install-with-static-ips

18. Obtaining OpenShift Container Platform packages

https://docs.openshift.com/container-platform/3.11/install/disconnected_install.html#disconnected-syncing-repos

19. https://medium.com/@zhimin.wen/airgap-disconnected-installation-of-openshift-4-2-abd7794fc7fe

20. https://www.cnblogs.com/wandering-star/p/12722609.html

21. https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html

22. https://access.redhat.com/documentation/en-us/openshift_container_platform/4.1/html/architecture/architecture-rhcos

23. https://www.openshift.com/blog/openshift-4.x-installation-quick-overview

24. [RHCOS] Increasing partition size

https://access.redhat.com/solutions/4608041

https://coreos.com/os/docs/latest/adding-disk-space.html

25. Create Users on OpenShift 4

https://medium.com/kubelancer-private-limited/create-users-on-openshift-4-dc5cfdf85661

26. https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/release.txt

27. https://github.com/RedHatOfficial/ocp4-helpernode

Troubleshooting

CoreOS installation hung

From bastion node

ssh core@9.123.120.102 (the bootstrap server)

journalctl -b -f -u release-image.service -u bootkube.service
