OCP 4.5 Air Gap Installation with Static IP
In OpenShift Container Platform 4.5, you can perform an installation that does not require an active internet connection to obtain software components. You complete an air-gapped installation only on infrastructure that you provision, not infrastructure that the installation program provisions, so your platform choices are limited. Typically, this approach applies to installations on bare metal hardware or on VMware vSphere.
Note: Restricted network installations always use user-provisioned infrastructure.
Requirements
Entitlement requirement
To download the pull secret for pulling the OpenShift images, you need a Red Hat subscription with an OpenShift entitlement.
Machine requirement
In this runbook, we focus on installation with x86_64 machines. The machines can be bare metal servers or VMs provisioned with VMware vSphere ESXi. VMware vSphere ESXi 6.7U2+ is recommended.
Note:
To maintain high availability of your cluster, use separate physical hosts for these cluster machines.
The bootstrap and control plane machines must use Red Hat Enterprise Linux CoreOS (RHCOS) as the operating system. For easier maintenance and upgrades, it is recommended to use RHCOS as the operating system for the compute (worker) machines too.
The disks of the master and worker nodes should be SSDs. To ensure that the storage partition has good disk I/O performance, run the disk latency test and the disk throughput test:
Disk latency test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=4096 count=1000 oflag=dsync
The result must be comparable to or better than: 4096000 bytes (4.1 MB, 3.9 MiB) copied, 1.5625 s, 2.5 MB/s
Disk throughput test
dd if=/dev/zero of=/PVC_mount_path/testfile bs=1G count=1 oflag=dsync
The result must be comparable to or better than: 1073741824 bytes (1.1 GB) copied, 5.14444 s, 209 MB/s
Network connectivity requirements
10 Gbps network bandwidth between nodes
Prepare and set static IP address for each node.
Make sure the following ports are open on the load balancer node: 6443/tcp, 22623/tcp, 443/tcp, 80/tcp, and 123/udp.
User-provisioned DNS requirements
All node names should be resolvable by the DNS server.
The following DNS records are also required and must be resolvable both by clients external to the cluster and from all the nodes within the cluster. In each record, <cluster_name> is the cluster name and <base_domain> is the base domain that you specify in install-config.yaml.
The following DNS A/AAAA or CNAME records must point to the load balancer for the control plane machines:
*.apps.<cluster_name>.<base_domain>.
api.<cluster_name>.<base_domain>.
api-int.<cluster_name>.<base_domain>.
The following DNS A/AAAA and SRV records must point to the control plane machines, respectively:
etcd-0.<cluster_name>.<base_domain>.
etcd-1.<cluster_name>.<base_domain>.
etcd-2.<cluster_name>.<base_domain>.
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-0.<cluster_name>.<base_domain>.
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-1.<cluster_name>.<base_domain>.
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-2.<cluster_name>.<base_domain>.
An example showcasing the installation
Topology
To complete an air-gap installation, you must create a registry that mirrors the contents of the OpenShift Container Platform registry and contains the installation media. For most air-gap installation scenarios, even the bastion node cannot access the internet, so you may need to create this registry on a mirror host at the customer site that can access both the internet and your closed network; a download server at the customer site is recommended. Alternatively, you can set up a download server on your own machine and then copy the mirrored contents of the OpenShift Container Platform registry and the installation media to the bastion node at the customer site.
Installation environment
Download server: 9.30.56.172
VMs provisioned with VMware vSphere ESXi 6.7:
coc-g1-bastion    rhel7.6   8 vCPU, 32G RAM, 150G + 500G + 500G disk   9.123.120.101
coc-g1-bootstrap  rhcos     4 vCPU, 16G RAM, 120G disk                 9.123.120.102
coc-g1-master01   rhcos     4 vCPU, 16G RAM, 300G disk                 9.123.120.103
coc-g1-master02   rhcos     4 vCPU, 16G RAM, 300G disk                 9.123.120.104
coc-g1-master03   rhcos     4 vCPU, 16G RAM, 300G disk                 9.123.120.105
coc-g1-worker01   rhcos     16 vCPU, 64G RAM, 300G disk                9.123.120.106
coc-g1-worker02   rhcos     16 vCPU, 64G RAM, 300G disk                9.123.120.107
coc-g1-worker03   rhcos     16 vCPU, 64G RAM, 300G disk                9.123.120.115
Gateway: 9.123.120.1
DNS: 9.0.148.50
Base domain: cdl.ibm.com
Cluster name: coc-g1
DNS Records:
api IN A 9.123.120.101
api-int IN A 9.123.120.101
*.apps IN A 9.123.120.101
etcd-0 IN A 9.123.120.103
etcd-1 IN A 9.123.120.104
etcd-2 IN A 9.123.120.105
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-0.coc-g1.cdl.ibm.com.
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-1.coc-g1.cdl.ibm.com.
_etcd-server-ssl._tcp IN SRV 0 10 2380 etcd-2.coc-g1.cdl.ibm.com.
coc-g1-bastion IN A 9.123.120.101
coc-g1-bootstrap IN A 9.123.120.102
coc-g1-master01 IN A 9.123.120.103
coc-g1-master02 IN A 9.123.120.104
coc-g1-master03 IN A 9.123.120.105
coc-g1-worker01 IN A 9.123.120.106
coc-g1-worker02 IN A 9.123.120.107
coc-g1-worker03 IN A 9.123.120.115
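Once the records are in place, you can verify resolution from the bastion node. This is a quick sanity check using dig (provided by bind-utils), assuming the example domain above; each A query should return the IPs listed, and the SRV query should return the three etcd records:
dig +short api.coc-g1.cdl.ibm.com
dig +short api-int.coc-g1.cdl.ibm.com
dig +short test.apps.coc-g1.cdl.ibm.com
dig +short -t srv _etcd-server-ssl._tcp.coc-g1.cdl.ibm.com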
Download assets for installation
Assumption: none of the nodes, including the bastion node, can connect to the internet.
You need to do this on an internet-connected server with Red Hat Enterprise Linux 7.6+ installed. We call it the download server here.
Connect to the download server
Stop the firewall (if active)
systemctl stop firewalld
systemctl disable firewalld
Install required packages
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y epel-release-latest-7.noarch.rpm
yum -y install wget podman httpd-tools jq --nogpgcheck
Update the /etc/hosts file on the download server with the registry server information
If you are using a separate download server (full air-gap install), you can avoid regenerating the self-signed certificate and tweaking the JSON files by temporarily adding the registry server to the /etc/hosts file.
In the example below, the jhwcx1.fyre.ibm.com server is connected to the internet and is used to download all packages and images; the coc-g1-bastion.coc-g1.cdl.ibm.com server is the registry server (bastion node), which is accessible from the OpenShift cluster. In reality, the coc-g1-bastion.coc-g1.cdl.ibm.com server has IP address 9.123.120.101.
[root@jhwcx1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.11.17.215 jhwcx1.fyre.ibm.com jhwcx1
10.11.17.215 coc-g1-bastion.coc-g1.cdl.ibm.com
Set environment variables
Make sure you adapt the variables below to your environment.
export REGISTRY_SERVER=coc-g1-bastion.coc-g1.cdl.ibm.com
export REGISTRY_PORT=5000
export LOCAL_REGISTRY="${REGISTRY_SERVER}:${REGISTRY_PORT}"
export EMAIL="[email protected]"
export REGISTRY_USER="admin"
export REGISTRY_PASSWORD="passw0rd"
export OCP_RELEASE="4.5.6"
export RHCOS_RELEASE="4.5.6"
export LOCAL_REPOSITORY="ocp4/openshift4"
export PRODUCT_REPO="openshift-release-dev"
export LOCAL_SECRET_JSON="/ocp4_downloads/ocp4_install/ocp_pullsecret.json"
export RELEASE_NAME="ocp-release"
Prepare OpenShift download directory
mkdir -p /ocp4_downloads/{clients,dependencies,ocp4_install}
mkdir -p /ocp4_downloads/registry/{auth,certs,data,images}
Retrieve OpenShift client and CoreOS downloads
cd /ocp4_downloads/clients
wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/openshift-client-linux.tar.gz
or
wget http://9.111.98.221/repo/openshift-client-linux.tar.gz
wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/openshift-install-linux.tar.gz
or
wget http://9.111.98.221/repo/openshift-install-linux.tar.gz
cd /ocp4_downloads/dependencies
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.5.6-x86_64-installer.x86_64.iso
or
wget http://9.111.98.221/repo/rhcos-4.5.6-x86_64-installer.x86_64.iso
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz
or
wget http://9.111.98.221/repo/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz
Install OpenShift client
tar xvzf /ocp4_downloads/clients/openshift-client-linux.tar.gz -C /usr/local/bin
Generate certificate
cd /ocp4_downloads/registry/certs
openssl req -newkey rsa:4096 -nodes -sha256 -keyout registry.key -x509 -days 365 -out registry.crt -subj "/C=US/ST=/L=/O=/CN=$REGISTRY_SERVER"
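You can inspect the generated certificate to confirm the subject and validity dates (standard openssl usage):
openssl x509 -in registry.crt -noout -subject -dates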
Create password for registry
Change the password to something more secure if you want to.
htpasswd -bBc /ocp4_downloads/registry/auth/htpasswd $REGISTRY_USER $REGISTRY_PASSWORD
Download registry image
cd /ocp4_downloads/registry/images
wget http://9.111.98.221/repo/registry2.tar
podman load -i registry2.tar
Alternatively:
podman pull docker.io/library/registry:2
podman save -o /ocp4_downloads/registry/images/registry2.tar docker.io/library/registry:2
Download NFS provisioner image
cd /ocp4_downloads/registry/images
wget http://9.111.98.221/repo/nfs-client-provisioner.tar
Alternatively:
podman pull quay.io/external_storage/nfs-client-provisioner:latest
podman save -o /ocp4_downloads/registry/images/nfs-client-provisioner.tar quay.io/external_storage/nfs-client-provisioner:latest
Create registry pod
podman run --name mirror-registry --publish $REGISTRY_PORT:5000 --detach --volume /ocp4_downloads/registry/data:/var/lib/registry:z --volume /ocp4_downloads/registry/auth:/auth:z --volume /ocp4_downloads/registry/certs:/certs:z --env "REGISTRY_AUTH=htpasswd" --env "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" --env REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key docker.io/library/registry:2
Add certificate to trusted store
cp -f /ocp4_downloads/registry/certs/registry.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust
Check if you can connect to the registry
curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/_catalog
Output should be:
{"repositories":[]}
Create pull secret file
Create file /tmp/ocp_pullsecret.json and insert the contents of the pull secret you retrieved from:
https://cloud.redhat.com/openshift/install/metal/user-provisioned
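After pasting the pull secret, verify that the file is well-formed JSON before proceeding (a quick check using jq, which was installed earlier):
jq . /tmp/ocp_pullsecret.json > /dev/null && echo "pull secret is valid JSON"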
Generate air-gapped pull secret
The air-gapped pull secret will be used when installing OpenShift.
AUTH=$(echo -n "$REGISTRY_USER:$REGISTRY_PASSWORD" | base64 -w0)
CUST_REG='{"%s": {"auth":"%s", "email":"%s"}}\n'
printf "$CUST_REG" "$LOCAL_REGISTRY" "$AUTH" "$EMAIL" > /tmp/local_reg.json
jq --argjson authinfo "$(cat /tmp/local_reg.json)" '.auths += $authinfo' /tmp/ocp_pullsecret.json > /ocp4_downloads/ocp4_install/ocp_pullsecret.json
cat /ocp4_downloads/ocp4_install/ocp_pullsecret.json | jq
The contents of /ocp4_downloads/ocp4_install/ocp_pullsecret.json should look something like this:
{
"auths": {
"cloud.openshift.com": {
"auth": "xxx",
"email":"[email protected]"
},
"quay.io": {
"auth": "xxx",
"email":"[email protected]"
},
"registry.connect.redhat.com": {
"auth": "xxx",
"email":"[email protected]"
},
"registry.redhat.io": {
"auth": "xxx",
"email":"[email protected]"
},
"coc-g1-bastion.coc-g1.cdl.ibm.com:5000": {
"auth": "YWRtaW46cGFzc3cwcmQ=",
"email":"[email protected]"
}
}
}
Mirror registry
Start a screen session.
The screen command launches a terminal session in the background that can be detached from and reconnected to. This is especially useful when you log in to the system remotely: you can start a screen session, kick off a command, detach, and log out, then log in later, reattach, and see the program still running.
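For example (standard screen usage; the session name "mirror" is only an illustration):
screen -S mirror      # start a named session
# ... run the long-running command ...
# press Ctrl-a d to detach
screen -r mirror      # reattach later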
screen
This takes 5-10 minutes to complete.
oc adm -a ${LOCAL_SECRET_JSON} release mirror --from=quay.io/${PRODUCT_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 --to=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY} --to-release-image=${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}
Output should be something like this:
Success
Update image: coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4:4.5.6
Mirror prefix: coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
To use the new mirrored repository to install, add the following section to the install-config.yaml:
imageContentSources:
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
Note:
Record the entire imageContentSources section from the output of the previous command. The information about your mirrors is unique to your mirrored repository, and you must add the imageContentSources section to the install-config.yaml file during installation.
To create the installation program that is based on the content that you mirrored, extract it and pin it to the release.
cd /ocp4_downloads/clients
oc adm -a ${LOCAL_SECRET_JSON} release extract --command=openshift-install "${LOCAL_REGISTRY}/${LOCAL_REPOSITORY}:${OCP_RELEASE}" --loglevel=10
cp openshift-install /usr/local/bin/
openshift-install -h
openshift-install version
Note:
To ensure that you use the correct images for the version of OpenShift Container Platform that you selected, you must extract the installation program from the mirrored content.
You must perform this step on a machine with an active internet connection.
Package up the downloads directory and ship it to the bastion node
If your registry server cannot connect to the internet, you will have to create a tarball and send it to the registry server.
Stop the registry
podman rm -f mirror-registry
Remove the registry server from the /etc/hosts file
You can now remove the registry server entry from the /etc/hosts file on the download server.
Tar the downloads directory on the download server. This creates a tarball of ~5 GB.
tar czf /tmp/ocp4_downloads.tar.gz /ocp4_downloads
Send the tarball to the registry server (bastion node)
How the tarball is shipped depends on how the registry server can be reached: scp, a shared folder, or a USB stick may have to be used.
Download RHEL RPMs and ship them to the bastion node
Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems
https://access.redhat.com/solutions/3176811
RHEL 7 RPM downloads for creating local repository
https://docs.openshift.com/container-platform/3.11/install/disconnected_install.html#disconnected-syncing-repos
Alternatively:
cd /tmp
wget http://9.111.98.221/repo/rhel-7-server.tgz
wget http://9.111.98.221/repo/rhel-7-server-extras.tgz
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Send the files to the registry server (bastion node). Suppose you send these files to the /ocp4_downloads/ folder on the bastion node.
How the files are shipped depends on how the bastion node can be reached: scp, a shared folder, or a USB stick may have to be used.
======================Day1 Milestone=============================================
Extend Storage for Bastion node
Log on to the bastion node as root
pvcreate /dev/sdb
vgextend rhel /dev/sdb
lvextend -l +100%FREE /dev/rhel/root
xfs_growfs /
df -h
Note: this step is not needed for group1 and group2 because it has already been done.
Serve the registry on Bastion node
Log on to the bastion node as root
Enable required repositories on the bastion node
You must install certain packages on the bastion node (and optionally NFS packages) for the installation. These come from the Red Hat Enterprise Linux and EPEL repositories.
Make sure the following repository is available from the bastion node or the satellite server in use for the infrastructure:
rhel-server-rpms - Red Hat Enterprise Linux Server (RPMs)
As the bastion node is offline, refer to the following link for creating the local repository.
Creating a Local Repository and Sharing With Disconnected/Offline/Air-gapped Systems:
https://access.redhat.com/solutions/3176811
cd /ocp4_downloads
tar -xvf rhel-7-server.tgz
tar -xvf rhel-7-server-extras.tgz
(This is only for Ring Cloud in Lab.)
systemctl stop puppet
systemctl disable puppet
vi /etc/yum.repos.d/local.repo
[rhel-7-server-rpms]
name=rhel-7-server-rpms
baseurl=file:///ocp4_downloads/rhel-7-server-rpms
enabled=1
gpgcheck=0
[rhel-7-server-extras-rpms]
name=rhel-7-server-extras-rpms
baseurl=file:///ocp4_downloads/rhel-7-server-extras-rpms
enabled=1
gpgcheck=0
For EPEL, you need the following repository:
epel/x86_64 - Extra Packages for Enterprise Linux - x86_64
If you don't have this repository configured yet, you can do so as follows for RHEL 8:
yum install -y /ocp4_downloads/epel-release-latest-8.noarch.rpm
For RHEL-7, do the following:
yum install -y /ocp4_downloads/epel-release-latest-7.noarch.rpm
mv /etc/yum.repos.d/itaas* /tmp
rm -rf /var/cache/yum
yum repolist
Make sure the rhel-7-server-rpms and rhel-7-server-extras-rpms repositories are listed.
Change SELinux to permissive on the bastion node
Some services such as httpd, nginx, and haproxy require special settings to run under SELinux. As we are focusing on the installation steps, set SELinux to permissive mode.
sed -i 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/selinux/config
setenforce 0
cat /etc/selinux/config
sestatus
getenforce
Stop and disable firewalld on the Bastion node
systemctl stop firewalld
systemctl disable firewalld
Install required packages
yum -y install wget podman httpd httpd-tools jq net-tools tree bind-utils nfs-utils screen python3 yum-utils chrony --nogpgcheck
podman images
podman version
Check and modify the /etc/hosts file
Check the /etc/hosts file and ensure that it has the correct entry for the registry server, such as:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
9.123.120.101 coc-g1-bastion.coc-g1.cdl.ibm.com coc-g1-bastion
Untar the tarball on the registry server
scp root@<download_server>:/tmp/ocp4_downloads.tar.gz /tmp/ocp4_downloads.tar.gz
tar xzf /tmp/ocp4_downloads.tar.gz -C /
Set environment variables
export REGISTRY_SERVER="coc-g1-bastion.coc-g1.cdl.ibm.com"
export REGISTRY_PORT=5000
export LOCAL_REGISTRY="${REGISTRY_SERVER}:${REGISTRY_PORT}"
export REGISTRY_USER="admin"
export REGISTRY_PASSWORD="passw0rd"
export LOCAL_REPOSITORY="ocp4/openshift4"
export LOCAL_SECRET_JSON="/ocp4_downloads/ocp4_install/ocp_pullsecret.json"
Create registry pod on the bastion node
podman load -i /ocp4_downloads/registry/images/registry2.tar
podman run --name mirror-registry --publish $REGISTRY_PORT:5000 --detach --volume /ocp4_downloads/registry/data:/var/lib/registry:z --volume /ocp4_downloads/registry/auth:/auth:z --volume /ocp4_downloads/registry/certs:/certs:z --env "REGISTRY_AUTH=htpasswd" --env "REGISTRY_AUTH_HTPASSWD_REALM=Registry Realm" --env REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd --env REGISTRY_HTTP_TLS_CERTIFICATE=/certs/registry.crt --env REGISTRY_HTTP_TLS_KEY=/certs/registry.key docker.io/library/registry:2
Add certificate to trusted store on the new registry server
/usr/bin/cp -f /ocp4_downloads/registry/certs/registry.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust
Check if you can connect to the registry
curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/_catalog
The output should be as follows (as the registry now has content):
{"repositories":["ocp4/openshift4"]}
curl -u $REGISTRY_USER:$REGISTRY_PASSWORD https://${LOCAL_REGISTRY}/v2/ocp4/openshift4/tags/list | jq
Create systemd unit file to ensure the registry is started after reboot
podman generate systemd mirror-registry -n > /etc/systemd/system/container-mirror-registry.service
systemctl daemon-reload
systemctl enable container-mirror-registry.service
Create push secret for the registry
podman login --authfile push-secret.json coc-g1-bastion.coc-g1.cdl.ibm.com:5000
Enter the user name and password created earlier:
Username: admin
Password: passw0rd
Login Succeeded!
Check the created Push Secret:
cat push-secret.json
Get the pull secret from Red Hat's official site
Go to https://cloud.redhat.com/openshift/install/metal
Then download the pull secret by clicking 'Download pull secret', or copy it by clicking 'Copy pull secret', and save it to a text file pull-secret.txt.
Note: You need a Red Hat account to download the pull secret.
Merge the push secret and the pull secret
Upload the pull secret file pull-secret.txt to the same directory as the push secret file push-secret.json.
Merge pull-secret.txt and push-secret.json into a new file pull-push_secret.json.
cat pull-secret.txt | jq . > pull-push_secret.json
cat pull-push_secret.json
Then add the registry entry from push-secret.json to the end of the auths section of pull-push_secret.json (before the two closing braces).
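Alternatively, jq's recursive object merge operator (*) can produce the combined file in one step, assuming both input files are valid JSON:
jq -s '.[0] * .[1]' pull-secret.txt push-secret.json > pull-push_secret.json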
It would be like this:
[root@coc-g1-bastion ~]#
cat pull-push_secret.json
{
"auths": {
"cloud.openshift.com": {
"auth": "xxx",
"email":"[email protected]"
},
"quay.io": {
"auth": "xxx",
"email":"[email protected]"
},
"registry.connect.redhat.com": {
"auth": "xxx",
"email":"[email protected]"
},
"registry.redhat.io": {
"auth": "xxx",
"email":"[email protected]"
},
"coc-g1-bastion.coc-g1.cdl.ibm.com:5000": {
"auth": "YWRtaW46cGFzc3cwcmQ="
}
}
}
Check if the modification is correct.
cat pull-push_secret.json | jq
Configure the HTTP server
yum install -y httpd
systemctl enable httpd
systemctl start httpd
ln -s /ocp4_downloads /var/www/html/ocp4
chown -R apache: /var/www/html/
chmod -R 755 /var/www/html/
vi /etc/httpd/conf/httpd.conf
From:
Listen 80
To be:
Listen 81
Note: it is recommended to use port 81 instead of port 80 if you will configure both HAProxy and the HTTP server on the bastion node, because HAProxy needs port 80 for ingress traffic.
systemctl restart httpd
Check that we can list entries:
curl -L -s http://${REGISTRY_SERVER}:81/ocp4 --list-only
Output should be:
[root@coc-g1-bastion ocp4_downloads]# curl -L -s http://${REGISTRY_SERVER}:81/ocp4 --list-only
Index of /ocp4
............
Prepare the installation files
cd /ocp4_downloads/clients
tar xvzf /ocp4_downloads/clients/openshift-client-linux.tar.gz -C /usr/local/bin
tar xvzf /ocp4_downloads/clients/openshift-install-linux.tar.gz -C /usr/local/bin
Generate SSH key for SSH access to cluster nodes
ssh-keygen -t rsa -b 4096 -N ''
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
Note: it is ~/.ssh/id_rsa.pub, not ~/.ssh/id_rsa, that will be specified in install-config.yaml.
Set up Load Balancer/HAProxy
yum install -y haproxy
systemctl enable haproxy
systemctl start haproxy
systemctl status haproxy
vi /etc/haproxy/haproxy.cfg
For details, copy a haproxy.cfg file from a working environment.
haproxy template: https://ibm.box.com/s/nco6340m1t4yrk945w4xm4bgtgqsqn2c
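If the template is not accessible, the following is a minimal haproxy.cfg sketch, assuming the bastion node (9.123.120.101) acts as the load balancer and using the node IPs from the topology above; the bootstrap entries can be removed once bootstrapping completes:
defaults
    mode tcp
    option tcplog
    timeout connect 10s
    timeout client 1m
    timeout server 1m
frontend api
    bind *:6443
    default_backend api
backend api
    balance roundrobin
    server bootstrap 9.123.120.102:6443 check
    server master01 9.123.120.103:6443 check
    server master02 9.123.120.104:6443 check
    server master03 9.123.120.105:6443 check
frontend machine-config
    bind *:22623
    default_backend machine-config
backend machine-config
    balance roundrobin
    server bootstrap 9.123.120.102:22623 check
    server master01 9.123.120.103:22623 check
    server master02 9.123.120.104:22623 check
    server master03 9.123.120.105:22623 check
frontend ingress-http
    bind *:80
    default_backend ingress-http
backend ingress-http
    balance roundrobin
    server worker01 9.123.120.106:80 check
    server worker02 9.123.120.107:80 check
    server worker03 9.123.120.115:80 check
frontend ingress-https
    bind *:443
    default_backend ingress-https
backend ingress-https
    balance roundrobin
    server worker01 9.123.120.106:443 check
    server worker02 9.123.120.107:443 check
    server worker03 9.123.120.115:443 check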
systemctl restart haproxy
Create installation configuration file
mkdir /ibm
cd /ibm
mkdir -p installation_directory
cd installation_directory
vi install-config.yaml
You can follow this install-config template:
https://ibm.box.com/s/6ivavyiaxm6uwf6qo9dcime36gtm7guv
Note:
1) It is ~/.ssh/id_rsa.pub, not ~/.ssh/id_rsa, that is specified in install-config.yaml.
2) The pullSecret should be the push secret we created for accessing the local image registry.
3) The additionalTrustBundle should be the certificate we created for accessing the local image registry.
In this installation case, it should be the content of the certificate file /ocp4_downloads/registry/certs/registry.crt, as one line; if the certificate file contains multiple lines, combine them into one line.
4) The content of imageContentSources should come from the imageContentSources section of the output of the registry mirroring step.
imageContentSources:
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
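For reference, here is a minimal install-config.yaml sketch for this example, based on the standard OCP 4.5 bare-metal UPI layout (values in angle brackets are placeholders to fill in; the Box template above may differ in details):
apiVersion: v1
baseDomain: cdl.ibm.com
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: coc-g1
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '<one-line content of pull-push_secret.json>'
sshKey: '<content of ~/.ssh/id_rsa.pub>'
additionalTrustBundle: |
  <content of /ocp4_downloads/registry/certs/registry.crt>
imageContentSources:
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - coc-g1-bastion.coc-g1.cdl.ibm.com:5000/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev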
Note:
Back up the install-config.yaml file, because it is deleted during the installation.
cp install-config.yaml /ibm/install-config.yaml
cd ..
The certificates embedded in the ignition files expire after 24 hours. If they have expired and OCP is still not installed, start again from here:
Generate the Kubernetes manifest and Ignition config files
Generate manifests
openshift-install create manifests --dir=installation_directory
Example output:
INFO Consuming Install Config from target directory
WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings
Because you create your own compute machines later in the installation process, you can safely ignore this warning.
Note:
To prevent pods from being scheduled on the control plane machines, modify the manifests/cluster-scheduler-02-config.yml file in the installation directory:
Open the manifests/cluster-scheduler-02-config.yml file.
Locate the mastersSchedulable parameter and set its value to false.
Save and exit the file.
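For example, the same change can be made non-interactively (a sketch, assuming the file contains the mastersSchedulable: true line generated above):
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' installation_directory/manifests/cluster-scheduler-02-config.yml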
Generate ignition files
cd /ibm
openshift-install create ignition-configs --dir=installation_directory
The following files are generated in the directory:
.
├── auth
│ ├── kubeadmin-password
│ └── kubeconfig
├── bootstrap.ign
├── master.ign
├── metadata.json
└── worker.ign
Note:
1. The certificates in the ignition files expire after 24 hours.
2. For each install, back up installation_directory and then delete it.
Configuring time synchronization service with chrony
Configure chrony server
1.Installing the chrony time service
yum install -y chrony
systemctl enable chronyd
systemctl start chronyd
systemctl status chronyd
2.Configuring Chrony
The UDP port number 123 needs to be open in the firewall in order to allow the client access.
firewall-cmd --permanent --zone=public --add-port=123/udp
firewall-cmd --reload
vi /etc/chrony.conf
For the details, refer to this template and modify the server and allow sections accordingly. It is recommended to use the bastion node as the chrony server.
https://ibm.box.com/s/vgo11wdax74ubx6hlfvkjckxyouy64vx
systemctl restart chronyd
3.Verify
Run the chronyc tracking command to check chrony tracking.
chronyc tracking
Some of the important fields in the output are:
Reference ID: This is the reference ID and name (or IP address) if available, of the server to which the computer is currently synchronized.
Stratum: The stratum indicates how many hops away from a computer with an attached reference clock you are.
Ref time: This is the time (UTC) at which the last measurement from the reference source was processed.
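You can also list the configured time sources and their reachability:
chronyc sources -v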
Configure chrony client
1.Create the contents of the chrony.conf file and encode it as base64.
For example:
$ cat << EOF | base64
server 9.123.120.101 iburst
driftfile /var/lib/chrony/drift
makestep 1.0 3
rtcsync
logdir /var/log/chrony
EOF
Example output (this sample was generated from a configuration that uses clock.redhat.com; your output will differ):
ICAgIHNlcnZlciBjbG9jay5yZWRoYXQuY29tIGlidXJzdAogICAgZHJpZnRmaWxlIC92YXIvb
GliL2Nocm9ueS9kcmlmdAogICAgbWFrZXN0ZXAgMS4wIDMKICAgIHJ0Y3N5bmMKICAgIGxvZ2RpciAvdmFyL2xvZy9jaHJvbnkK
2.Create the MachineConfig files, replacing the base64 string with the one you just created yourself. A minimal sketch of the master MachineConfig is shown after this list.
Refer to this template for adding the chrony.conf to master nodes. Make sure you modify the base64 value accordingly.
99-masters-chrony-configuration.yaml
https://ibm.box.com/s/dzzdojnmhi34lrzn7l686m521uxqmup0
Refer to this template for adding the chrony.conf to worker nodes. Make sure you modify the base64 value accordingly.
99-workers-chrony-configuration.yaml
https://ibm.box.com/s/klfbzti9ph1dy0t52bycr1w4dfdm97k7
3.Make a backup copy of the configuration files.
4.Add these files to the openshift directory under the installation directory (installation_directory/openshift).
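If the Box templates are not accessible, the following is a minimal sketch of 99-masters-chrony-configuration.yaml following the documented OCP 4.5 format (replace <BASE64_STRING> with the output from step 1; for the worker variant, change the role label and the name from master to worker):
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-masters-chrony-configuration
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 2.2.0
    networkd: {}
    passwd: {}
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf-8;base64,<BASE64_STRING>
        filesystem: root
        mode: 420
        path: /etc/chrony.conf
  osImageURL: ""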
Prepare CoreOS Installation Files
mkdir /ibm/coreos_inst
cp /ibm/installation_directory/*.ign /ibm/coreos_inst
cp /ocp4_downloads/dependencies/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz /ibm/coreos_inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz
ln -s /ibm/coreos_inst /var/www/html/inst
chown -R apache: /var/www/html/
chmod -R 755 /var/www/html/
chown -R apache: /var/www/html/inst
chmod -R 755 /var/www/html/inst
Install Operating System
You need VMware vSphere ESXi administrator privileges to perform the CoreOS operating system installation. If you cannot get administrator privileges from the customer, you can work with the customer's IT administrator on the installation.
Log in to the VMware vSphere ESXi web console
Log in to the VMware vSphere ESXi web console with the administrator credentials.
Upload the CoreOS ISO image to the datastore
Upload the rhcos-4.5.6-x86_64-installer.x86_64.iso image to the datastore.
Edit Settings for selecting the ISO image as the boot option
Select the uploaded rhcos-4.5.6-x86_64-installer.x86_64.iso image.
Start the installation
Power on the VM
You will see the RHCOS boot screen.
Press the Tab key, then append the following entry to the kernel boot parameters for the bootstrap node. The ip= parameter format is ip=<ip address>::<gateway>:<netmask>:<hostname>:<interface>:none.
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/bootstrap.ign ip=9.123.120.102::9.123.120.1:255.255.255.0:coc-g1-bootstrap.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
Press the Enter key to kick off the CoreOS installation.
Note:
When CoreOS is installed successfully, the node restarts and pauses at the login screen. There is no need to log in.
You can ignore the following message, which may occur repeatedly after CoreOS is installed:
"kernel: SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue)"
Repeat the CoreOS installation steps for the other cluster nodes.
The only difference when installing CoreOS on the other nodes is the kernel boot parameter entry.
Enter the following entries respectively when installing CoreOS on the other cluster nodes.
CoreOS Install for the master01 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.103::9.123.120.1:255.255.255.0:coc-g1-master01.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
CoreOS Install for the master02 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.104::9.123.120.1:255.255.255.0:coc-g1-master02.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
CoreOS Install for the master03 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/master.ign ip=9.123.120.105::9.123.120.1:255.255.255.0:coc-g1-master03.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
CoreOS Install for the worker01 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.106::9.123.120.1:255.255.255.0:coc-g1-worker01.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
CoreOS Install for the worker02 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.107::9.123.120.1:255.255.255.0:coc-g1-worker02.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
CoreOS Install for the worker03 node:
coreos.inst.install_dev=sda coreos.inst.image_url=http://9.123.120.101:81/inst/rhcos-4.5.6-x86_64-metal.x86_64.raw.gz coreos.inst.ignition_url=http://9.123.120.101:81/inst/worker.ign ip=9.123.120.115::9.123.120.1:255.255.255.0:coc-g1-worker03.coc-g1.cdl.ibm.com:ens192:none nameserver=9.0.148.50
Reinstall CoreOS from a previous installation failure (Optional)
Force BIOS setup
Configure the boot order and make CD-ROM Drive the first boot option.
Exit and save the changes.
Then restart the VM and perform the CoreOS installation again.
Monitor the bootstrap installation
openshift-install --dir=installation_directory wait-for bootstrap-complete --log-level=debug
Gather logs (if the above command failed):
./openshift-install gather bootstrap --bootstrap 9.123.120.102 --master 9.123.120.103
======================Day2 Milestone=============================================
Logging in to the cluster
export KUBECONFIG=/ibm/installation_directory/auth/kubeconfig
oc whoami
Approving the CSRs for your machines
https://docs.openshift.com/container-platform/4.1/installing/installing_bare_metal/installing-bare-metal.html#installation-approve-csrs_installing-bare-metal
oc get csr
oc adm certificate approve <csr_name>
Or approve all pending CSRs at once:
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
Configure the Operators that are not available.
watch -n5 oc get clusteroperators
Configure image registry
oc describe configs.imageregistry.operator.openshift.io
oc edit configs.imageregistry.operator.openshift.io/cluster
Change
managementState: Removed
to
managementState: Managed
Set up NFS server in Bastion node
systemctl enable rpcbind
systemctl enable nfs-server
systemctl start rpcbind
systemctl start nfs-server
firewall-cmd --zone=public --add-port=111/tcp --permanent
firewall-cmd --zone=public --add-port=111/udp --permanent
firewall-cmd --zone=public --add-port=2049/tcp --permanent
firewall-cmd --zone=public --add-port=2049/udp --permanent
firewall-cmd --zone=public --add-port=892/tcp --permanent
firewall-cmd --zone=public --add-port=662/udp --permanent
firewall-cmd --reload
pvcreate /dev/sdc
vgcreate nfs /dev/sdc
lvcreate -L 199G -n lv_image_registry nfs
mkfs.xfs -f -n ftype=1 -i size=512 -n size=8192 /dev/nfs/lv_image_registry
Set up Image Registry with NFS Storage
mkdir -p /opt/IBM/Cloud/OpenShift/PV/images
vi /etc/fstab
/dev/nfs/lv_image_registry /opt/IBM/Cloud/OpenShift/PV/images xfs defaults,noatime 1 2
mount /dev/nfs/lv_image_registry /opt/IBM/Cloud/OpenShift/PV/images
vi /etc/exports
# For IBM RedHat OpenShift private image registry
/opt/IBM/Cloud/OpenShift/PV/images *(rw,sync,no_wdelay,no_root_squash,insecure,fsid=0)
Check if the configuration succeeds.
exportfs -rav
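You can also confirm that the export is visible (showmount is part of nfs-utils, installed earlier):
showmount -e localhost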
Create Storage Class image-registry-sc
Dynamic NFS provisioning:
https://medium.com/faun/openshift-dynamic-nfs-persistent-volume-using-nfs-client-provisioner-fcbb8c9344e
wget http://9.111.98.221/repo/kubernetes-incubator.zip
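After deploying the nfs-client-provisioner per the article above, create the storage class. A minimal sketch follows; the provisioner value below is illustrative and must match the PROVISIONER_NAME configured in your provisioner deployment:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: image-registry-sc
provisioner: example.io/nfs
parameters:
  archiveOnDelete: "false"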
Create PVC:
vi image-registry-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-pvc
  namespace: openshift-image-registry
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: image-registry-sc
You can use this template directly.
https://ibm.box.com/s/a5h167t86he3jy5o046g4pgv698vx4tn
oc apply -f image-registry-pvc.yaml
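Verify that the PVC was created and becomes Bound (it stays Pending until the provisioner serves it):
oc get pvc -n openshift-image-registry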
Edit the imageregistry operator
oc edit configs.imageregistry.operator.openshift.io
Change it as follows:
managementState: Managed
...............
write:
  maxInQueue: 0
  maxRunning: 0
  maxWaitInQueue: 0s
storage:
  pvc:
    claim: image-registry-pvc
Approve the CSR
oc get csr -o name | xargs oc adm certificate approve
Check if the image-registry pod restarted successfully.
oc get pod -n openshift-image-registry
oc rsh image-registry-xxx-xxx
df -h | grep registry
mount | grep registry
Completing installation on user-provisioned infrastructure
Confirm that all the cluster components are online:
watch -n5 oc get clusteroperators
When all of the cluster Operators are AVAILABLE, you can complete the installation.
Monitor for cluster operator completion (worker nodes)
openshift-install --dir=installation_directory wait-for install-complete
[root@coc-g1-bastion ibm]# openshift-install --dir=installation_directory wait-for install-complete
INFO Waiting up to 30m0s for the cluster at https://api.coc-g1.cdl.ibm.com:6443 to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/ibm/installation_directory/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.coc-g1.cdl.ibm.com
INFO Login to the console with user: "kubeadmin", and password: "F55t5-XeSoj-ospVr-BZTuV"
INFO Time elapsed: 0s
Expose and access the image registry
oc policy add-role-to-user registry-viewer kubeadmin
oc policy add-role-to-user registry-editor kubeadmin
oc patch configs.imageregistry.operator.openshift.io/cluster --patch '{"spec":{"defaultRoute":true}}' --type=merge
HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')
podman login -u kubeadmin -p $(oc whoami -t) --tls-verify=false $HOST
[ Also see: https://docs.openshift.com/container-platform/4.3/registry/securing-exposing-registry.html on exposing the registry ]
OpenShift Console Log In
oc login https://api.coc-g1.cdl.ibm.com:6443 -u kubeadmin -p F55t5-XeSoj-ospVr-BZTuV
======================Day3 Milestone=============================================
References
1. "kernel: SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue)":
https://access.redhat.com/solutions/3348951
2. How to customize ignition:
https://github.com/ashcrow/filetranspiler
3. OpenShift install failed with "x509: certificate has expired or is not yet valid":
https://github.com/openshift/installer/issues/1955
echo | openssl s_client -connect api.coc.cdl.ibm.com:6443 | openssl x509 -noout -text
4. DHCP introduction:
https://www.howtogeek.com/404891/what-is-dhcp-dynamic-host-configuration-protocol/
isoinfo -d -i rhcos-4.3.8-x86_64-installer.x86_64.iso | awk '/Volume id/ { print $3 }'
5. How to change the boot order for a VMware VM:
Edit Settings -> Virtual Machine Options -> Boot Options -> check 'Enter BIOS Mode For Boot' -> restart the virtual machine -> enter BIOS settings -> Boot -> use Shift+/_ to adjust the boot order
6. How to add OpenShift 4 RHCOS worker nodes in UPI in new installations (< 24 hours):
https://access.redhat.com/solutions/4246261
7. Adding worker nodes to an OCP 4 UPI cluster existing 24+ hours:
https://access.redhat.com/solutions/4799921
8. Authentication Operator in Unknown state post installation:
https://access.redhat.com/solutions/4685861
9. Cannot see logs in console; oc logs, oc exec, etc. give "tls internal server error":
https://access.redhat.com/solutions/4307511
10. openshift4-vsphere-static-ip:
https://shanna-chan.blog/2019/07/26/openshift4-vsphere-static-ip/
11. https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-bare-metal.html
12. https://github.com/RedHatOfficial/ocp4-helpernode/blob/master/docs/quickstart.md
13. https://ibm.box.com/s/sm1puxxmuc8nlnjdjgyngnvvnkha758a
14. https://github.ibm.com/PrivateCloud-analytics/CEA-Zen/wiki/How-to-install-Portworx-2.5.0.1-on-RedHat-OpenShift-4.3-System
15. https://github.ibm.com/PrivateCloud-analytics/CEA-Zen/wiki/How-to-install-RHOS-Metrics-Server-for-managing-platform-resources
16. https://www.openshift.com/blog/openshift-4-2-vsphere-install-quickstart
17. https://www.openshift.com/blog/openshift-4-2-vsphere-install-with-static-ips
18. Obtaining OpenShift Container Platform packages:
https://docs.openshift.com/container-platform/3.11/install/disconnected_install.html#disconnected-syncing-repos
19. https://medium.com/@zhimin.wen/airgap-disconnected-installation-of-openshift-4-2-abd7794fc7fe
20. https://www.cnblogs.com/wandering-star/p/12722609.html
21. https://docs.openshift.com/container-platform/4.3/installing/installing_bare_metal/installing-restricted-networks-bare-metal.html
22. https://access.redhat.com/documentation/en-us/openshift_container_platform/4.1/html/architecture/architecture-rhcos
23. https://www.openshift.com/blog/openshift-4.x-installation-quick-overview
24. [RHCOS] Increasing partition size:
https://access.redhat.com/solutions/4608041
https://coreos.com/os/docs/latest/adding-disk-space.html
25. Create users on OpenShift 4:
https://medium.com/kubelancer-private-limited/create-users-on-openshift-4-dc5cfdf85661
26. https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.5.6/release.txt
27. https://github.com/RedHatOfficial/ocp4-helpernode
Troubleshooting
CoreOS installation hung
From the bastion node, SSH to the bootstrap server:
ssh [email protected]
journalctl -b -f -u release-image.service -u bootkube.service