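This walkthrough follows Red Hat's official documentation for installing OpenShift 4.3 on bare metal. I only have one PC with 64 GB of RAM, running the free edition of VMware vSphere 6.7 for this test, so I tried installing with half the minimum memory that the official OCP documentation requires. My notes follow.
The installation process described in the official Red Hat documentation is as follows:
The servers are planned as follows: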
Hostname | vcpu | ram | hdd | ip | fqdn |
---|---|---|---|---|---|
misc/lb | 4 | 8g | 120g | 192.168.128.30 | misc.ocptest.ipincloud.com/lb.ocptest.ipincloud.com |
bootstrap | 4 | 8g | 120g | 192.168.128.31 | bootstrap.ocptest.ipincloud.com |
master1 | 4 | 8g | 120g | 192.168.128.32 | master1.ocptest.ipincloud.com |
master2 | 4 | 8g | 120g | 192.168.128.33 | master2.ocptest.ipincloud.com |
master3 | 4 | 8g | 120g | 192.168.128.34 | master3.ocptest.ipincloud.com |
worker1 | 2 | 4g | 120g | 192.168.128.35 | worker1.ocptest.ipincloud.com |
worker2 | 2 | 4g | 120g | 192.168.128.36 | worker2.ocptest.ipincloud.com |
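As a quick sanity check on the plan above, the RAM column can be summed and compared with the 64 GB host. This is only a small sketch; the helper name `ram_total` is made up for illustration, and the figures are copied from the table.

```shell
#!/usr/bin/env bash
# Sum the RAM column (in GB) of the planned VMs to check that it fits the host.
# ram_total is a hypothetical helper; the per-VM figures come from the table.
ram_total() {
  local total=0 gb
  for gb in "$@"; do
    total=$(( total + gb ))
  done
  echo "$total"
}

# misc/lb, bootstrap and master1-3 at 8 GB each; worker1-2 at 4 GB each.
planned=$(ram_total 8 8 8 8 8 4 4)
echo "planned: ${planned} GB of 64 GB"
```

The remainder is what is left for the hypervisor itself, and the bootstrap VM's share is freed once bootstrapping completes.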
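The API server and ingress share a single load balancer, misc/lb.
The DNS records to configure are listed below; ocptest is the cluster name and ipincloud.com is the base domain. To apply these settings, edit the corresponding templates under tasks/ in the ansible playbook.
See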
https://github.com/scwang18/ocp4-upi-helpernode.git
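Component | DNS record | Description |
---|---|---|
Kubernetes API | api.ocptest.ipincloud.com | This DNS record points to the load balancer for the control plane nodes. It must be resolvable from outside the cluster and from every node in the cluster. |
Kubernetes API | api-int.ocptest.ipincloud.com | This DNS record points to the load balancer for the control plane nodes. It must be resolvable from every node in the cluster. |
Routes | *.apps.ocptest.ipincloud.com | A wildcard DNS record that points to the ingress load balancer. It must be resolvable from outside the cluster and from every node in the cluster. |
etcd | etcd-<index>.ocptest.ipincloud.com | DNS records that point to the etcd nodes. They must be resolvable from every node in the cluster. |
etcd | _etcd-server-ssl._tcp.ocptest.ipincloud.com | etcd serves on port 2380, so each etcd node needs an SRV DNS record with priority 0, weight 10, and port 2380, as in the table below. |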
The following records are required. The bootstrap node uses them to configure etcd service resolution automatically when it creates the etcd servers.
_service._proto.name. | TTL | class | SRV | priority | weight | port | target. |
---|---|---|---|---|---|---|---|
_etcd-server-ssl._tcp. | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-0. |
_etcd-server-ssl._tcp. | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-1. |
_etcd-server-ssl._tcp. | 86400 | IN | SRV | 0 | 10 | 2380 | etcd-2. |
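The three SRV rows follow one pattern, so they can be generated rather than typed. The sketch below prints them in zone-file form. It assumes the names are fully qualified with the cluster name and base domain (the table shows them truncated), and `etcd_srv_records` is a hypothetical helper, not part of the playbook.

```shell
#!/usr/bin/env bash
# Print one SRV record per etcd member in zone-file syntax.
# Assumption: names are qualified as <name>.<cluster>.<base domain>.
etcd_srv_records() {
  local cluster=$1 domain=$2 count=$3 i
  for (( i = 0; i < count; i++ )); do
    printf '_etcd-server-ssl._tcp.%s.%s. 86400 IN SRV 0 10 2380 etcd-%d.%s.%s.\n' \
      "$cluster" "$domain" "$i" "$cluster" "$domain"
  done
}

etcd_srv_records ocptest ipincloud.com 3
```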
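With a passwordless SSH private key you can log in to the master nodes as the core user to debug the installation and perform disaster recovery on the cluster.
(1) Run the following command on the misc node to create the SSH key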
ssh-keygen -t rsa -b 4096 -N ''
The command above creates two files, id_rsa and id_rsa.pub, under ~/.ssh/.
(2) Start the ssh-agent process and add the passwordless private key to it
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
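In the next step, when installing OCP, the SSH public key has to be supplied in the installer's configuration file.
Because we provision the resources manually, the SSH public key has to be placed on each cluster node so that this machine can log in to the nodes without a password.
# Copy the id_rsa.pub file just generated in ~/.ssh to the ~/.ssh directory of the cluster node you want to log in to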
scp ~/.ssh/id_rsa.pub [email protected]:~/.ssh/
# Then run the following command on the cluster node to append the public key to ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
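You need to register an account on the Red Hat website to download the evaluation installer. The details of the download process are omitted here.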
https://cloud.redhat.com/openshift/install/metal/user-provisioned
rm -rf /data/pkg
mkdir -p /data/pkg
cd /data/pkg
# OCP installer
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux-4.3.0.tar.gz
# OCP client
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-4.3.0.tar.gz
# RHCOS installer ISO
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer.iso
# RHCOS BIOS raw image
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-metal.raw.gz
# If you install from the ISO file, the two files below do not need to be downloaded
# RHCOS installer kernel, used for iPXE installation
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-kernel
# RHCOS initramfs image, used for iPXE installation
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-initramfs.img
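These images are large, and a truncated download only shows up much later as a failed install, so it can be worth checking each file against a known SHA-256 digest. The helper below is a minimal sketch: `verify_sha256` is a made-up name, and where the expected digest comes from (for example a checksum list published next to the files) is an assumption the original does not cover.

```shell
#!/usr/bin/env bash
# Succeed only when the file's SHA-256 matches the expected hex digest.
# verify_sha256 is a hypothetical helper, not part of any OpenShift tooling.
verify_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" 2>/dev/null | awk '{ print $1 }')
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Hypothetical usage; replace <expected-digest> with a digest you trust:
#   verify_sha256 /data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz "<expected-digest>" \
#     || echo "checksum mismatch, download again" >&2
```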
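A helper-node preparation tool, adapted from Wang Zheng's scripts, makes it easy to bring up the LB, DHCP, PXE, DNS, and HTTP services on the helper machine.
(1) Install ansible and git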
yum -y install ansible git
(2) Clone the playbook from GitHub
cd /data/pkg
git clone https://github.com/scwang18/ocp4-upi-helpernode.git
(3) Edit the playbook's variables file
Adjust the variables file to match your own network plan
[root@centos75 pkg]# cd /data/pkg/ocp4-upi-helpernode/
[root@centos75 ocp4-upi-helpernode]# cat vars-static.yaml
[root@misc pkg]# cat vars-static.yaml
---
staticips: true
named: true
helper:
  name: "helper"
  ipaddr: "192.168.128.30"
  networkifacename: "ens192"
dns:
  domain: "ipincloud.com"
  clusterid: "ocptest"
  forwarder1: "192.168.128.30"
  forwarder2: "192.168.128.30"
registry:
  name: "registry"
  ipaddr: "192.168.128.30"
yum:
  name: "yum"
  ipaddr: "192.168.128.30"
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.128.31"
masters:
  - name: "master1"
    ipaddr: "192.168.128.32"
  - name: "master2"
    ipaddr: "192.168.128.33"
  - name: "master3"
    ipaddr: "192.168.128.34"
workers:
  - name: "worker1"
    ipaddr: "192.168.128.35"
  - name: "worker2"
    ipaddr: "192.168.128.36"
force_ocp_download: false
ocp_bios: "file:///data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz"
ocp_initramfs: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-initramfs.img"
ocp_install_kernel: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-kernel"
ocp_client: "file:///data/pkg/openshift-client-linux-4.3.0.tar.gz"
ocp_installer: "file:///data/pkg/openshift-install-linux-4.3.0.tar.gz"
ocp_filetranspiler: "file:///data/pkg/filetranspiler-master.zip"
registry_server: "registry.ipincloud.com:8443"
[root@misc pkg]#
(4) Run the ansible installation
ansible-playbook -e @vars-static.yaml tasks/main.yml
# On a machine with unrestricted internet access, package the required image files
#rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
# This script does not work well, so do not download it; use the modified version below
# wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.3/scripts/build.dist.sh
yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip
pip3 install yq
# https://blog.csdn.net/ffzhihua/article/details/85237411
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# to download the pull-secret.json, open following link
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"xxxxxxxxxxx"}}}
EOF
Create the build.dist.sh file
#!/usr/bin/env bash
set -e
set -x
var_date=$(date '+%Y-%m-%d')
echo $var_date
# The following does not need to run every time
#cat << EOF >> /etc/hosts
#127.0.0.1 registry.ipincloud.com
#EOF
#mkdir -p /etc/crts/
#cd /etc/crts
#openssl req \
# -newkey rsa:2048 -nodes -keyout ipincloud.com.key \
# -x509 -days 3650 -out ipincloud.com.crt -subj \
# "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ipincloud.com"
#cp /etc/crts/ipincloud.com.crt /etc/pki/ca-trust/source/anchors/
#update-ca-trust extract
systemctl stop docker-distribution
rm -rf /data/registry
mkdir -p /data/registry
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    layerinfo: inmemory
  filesystem:
    rootdirectory: /data/registry
  delete:
    enabled: true
http:
  addr: :8443
  tls:
    certificate: /etc/crts/ipincloud.com.crt
    key: /etc/crts/ipincloud.com.key
EOF
systemctl restart docker
systemctl enable docker-distribution
systemctl restart docker-distribution
build_number_list=$(cat << EOF
4.3.0
EOF
)
mkdir -p /data/ocp4
cd /data/ocp4
install_build() {
BUILDNUMBER=$1
echo ${BUILDNUMBER}
mkdir -p /data/ocp4/${BUILDNUMBER}
cd /data/ocp4/${BUILDNUMBER}
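# Download and install the OpenShift client and installer. Only needed on the first run; the ansible initialization of the helper machine has already done this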
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/release.txt
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
# Extract the installer and client into a directory on the PATH. Only needed on the first run
#tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
#tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
export OCP_RELEASE=${BUILDNUMBER}
export LOCAL_REG='registry.ipincloud.com:8443'
export LOCAL_REPO='ocp4/openshift4'
export UPSTREAM_REPO='openshift-release-dev'
export LOCAL_SECRET_JSON="/data/pull-secret.json"
export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE}
export RELEASE_NAME="ocp-release"
oc adm release mirror -a ${LOCAL_SECRET_JSON} \
--from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \
--to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \
--to=${LOCAL_REG}/${LOCAL_REPO}
}
while read -r line; do
install_build $line
done <<< "$build_number_list"
cd /data/ocp4
#wget -O ocp4-upi-helpernode-master.zip https://github.com/wangzheng422/ocp4-upi-helpernode/archive/master.zip
# Commented out below, because the quay.io/wangzheng422 repository uses registry v1, which cannot coexist with v2
#podman pull quay.io/wangzheng422/filetranspiler
#podman save quay.io/wangzheng422/filetranspiler | pigz -c > filetranspiler.tgz
#podman pull docker.io/library/registry:2
#podman save docker.io/library/registry:2 | pigz -c > registry.tgz
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# The following commands take 2-3 hours to run; be patient...
# build operator catalog
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
--appregistry-endpoint https://quay.io/cnr \
--appregistry-org redhat-operators \
--to=${LOCAL_REG}/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
${LOCAL_REG}/ocp4-operator/redhat-operators:v1 \
${LOCAL_REG}/operator
#cd /data
#tar cf - registry/ | pigz -c > registry.tgz
#cd /data
#tar cf - ocp4/ | pigz -c > ocp4.tgz
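Run the build.dist.sh script
There is a huge pitfall here. Pulling the images from quay.io into the local registry transfers more than 5 GB, which usually does not finish in one go and fails. Every time you rerun build.dist.sh after a failure, it deletes the existing registry and starts from scratch, which wastes a lot of time. The deletion is unnecessary: oc adm release mirror automatically skips images that already exist. A lesson learned the hard way.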
bash build.dist.sh
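Since oc adm release mirror skips images that are already present, one way around the pitfall is to retry only the mirror command and leave /data/registry alone. The wrapper below is a generic sketch; `retry` is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
#!/usr/bin/env bash
# Re-run a command until it succeeds or the attempt budget is used up.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if (( n >= attempts )); then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${n} failed, retrying in ${delay}s" >&2
    sleep "$delay"
    n=$(( n + 1 ))
  done
}

# Hypothetical usage with the mirror command from build.dist.sh:
#   retry 5 30 oc adm release mirror -a /data/pull-secret.json \
#     --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 \
#     --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 \
#     --to=registry.ipincloud.com:8443/ocp4/openshift4
```

For this to help, the `rm -rf /data/registry` line in the script has to be skipped on reruns, otherwise each attempt starts from an empty registry again.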
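When oc adm release mirror finishes, it has built the local image repository from the official one. Record the information it returns, especially the imageContentSources section, which goes into install-config.yaml later.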
Success
Update image: registry.ipincloud.com:8443/ocp4/openshift4:4.3.0
Mirror prefix: registry.ipincloud.com:8443/ocp4/openshift4
To use the new mirrored repository to install, add the following section to the install-config.yaml:
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
The following commands do not need to be run; build.dist.sh has already executed them
oc adm release mirror -a /data/pull-secret.json --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 --to=registry.ipincloud.com:8443/ocp4/openshift4
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
--appregistry-endpoint https://quay.io/cnr \
--appregistry-org redhat-operators \
--to=registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1 \
registry.ipincloud.com:8443/operator
# If oc adm catalog mirror does not succeed, it generates a mapping.txt file. Delete the lines that failed from that file, then run it as follows
oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt
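Removing the failed lines by hand gets tedious when there are many of them. The sketch below produces the cleaned file with grep. It assumes you have collected the failing image references, one per line, in a separate file; `filter_mapping` and both file names in the usage comment other than mapping-ok.txt are hypothetical.

```shell
#!/usr/bin/env bash
# Print every line of the mapping file that does not mention a failed image.
# $1: mapping file, $2: file listing failed image references, one per line.
filter_mapping() {
  local mapping=$1 failed=$2
  # -F: fixed strings, -v: invert the match, -f: read patterns from a file
  grep -v -F -f "$failed" "$mapping"
}

# Hypothetical usage:
#   filter_mapping mapping.txt failed-images.txt > /data/mapping-ok.txt
#   oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt
```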
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/ocp4/openshift4/nfs-client-provisioner:latest
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/quay.io/external_storage/nfs-client-provisioner:latest
# Look up the image's sha256 digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}'
# Delete the image manifest by digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/sha256:022ea0b0d69834b652a4c53655d78642ae23f0324309097be874fb58d09d2919
# Reclaim the image storage space
podman exec -it mirror-registry /bin/registry garbage-collect /etc/docker/registry/config.yml
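The digest lookup above relies on the field position in curl's verbose output. A slightly more robust variant, sketched here, reads plain response headers and matches the header name case-insensitively. `digest_from_headers` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Read HTTP response headers on stdin and print the Docker-Content-Digest value.
digest_from_headers() {
  # Strip carriage returns, then split each header at the first ": ".
  tr -d '\r' | awk -F': ' 'tolower($1) == "docker-content-digest" { print $2 }'
}

# Hypothetical usage against the local registry (HEAD request, headers only):
#   curl -sI -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
#     https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest \
#     | digest_from_headers
```

The printed value can then be used in place of the hard-coded sha256 in the DELETE request.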
(1) Create the installer directory
rm -rf /data/install
mkdir -p /data/install
cd /data/install
(2) Customize the install-config.yaml file
[root@misc data]# cat /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"omitted"}}}
cat ~/.ssh/id_rsa.pub
[root@misc crts]# cat /etc/crts/ipincloud.com.crt
-----BEGIN CERTIFICATE-----
xxx omitted
-----END CERTIFICATE-----
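A production environment does not need a direct connection to the internet; you can configure a proxy for the cluster in install-config.yaml.
For this test, to speed up downloads from the internet, I set up a v2ray server on AWS beforehand, with the misc server acting as the v2ray client. The setup is described in a separate article.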
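When you retry repeatedly, with install-config.yaml living in the install directory, you must run rm -rf install rather than rm -rf install/*. The latter leaves the hidden file .openshift_install_state.json behind, which can cause: x509: certificate has expired or is not yet valid.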
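The difference between the two rm forms is easy to reproduce in a scratch directory, because the shell's `*` glob does not match dotfiles by default. The sketch below shows the leftover file and a small reset helper; `reset_install_dir` is a hypothetical name, and the demo runs in a temporary directory rather than /data/install.

```shell
#!/usr/bin/env bash
# Recreate an installer working directory from scratch, dotfiles included.
reset_install_dir() {
  local dir=$1
  rm -rf -- "$dir"
  mkdir -p -- "$dir"
}

# Demonstration in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/install"
touch "$demo/install/install-config.yaml" \
      "$demo/install/.openshift_install_state.json"

rm -rf "$demo"/install/*        # the glob skips the hidden state file
echo "left after rm -rf install/*: $(ls -A "$demo/install")"

reset_install_dir "$demo/install"
echo "left after reset_install_dir: $(ls -A "$demo/install")"
```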
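In the documentation and in blog examples, the cidr in install-config.yaml is in the 10.x range. Because I had not read the documentation closely, I took it to be the node network, which caused the most baffling error of the whole process: no matches for kind MachineConfig.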
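The cidr under clusterNetwork is the pod network, not the node network, and the two should not overlap. A small check like the one below can catch the mix-up before the installer runs. It is only a sketch for IPv4: `ip_to_int` and `cidr_overlaps` are hypothetical helpers, and 192.168.128.0/24 is the node network implied by the addresses and the 255.255.255.0 netmask used in this walkthrough.

```shell
#!/usr/bin/env bash
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed when two IPv4 CIDR blocks overlap, i.e. when both network
# addresses agree on the shorter of the two prefixes.
cidr_overlaps() {
  local net1=${1%/*} len1=${1#*/} net2=${2%/*} len2=${2#*/}
  local len=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$net1") & mask ) == ( $(ip_to_int "$net2") & mask ) ))
}

node_net="192.168.128.0/24"
for net in 10.128.0.0/14 172.30.0.0/16; do
  if cidr_overlaps "$net" "$node_net"; then
    echo "$net overlaps the node network $node_net"
  else
    echo "$net is clear of the node network $node_net"
  fi
done
```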
The final file is as follows:
[root@centos75 install]# vi install-config.yaml
apiVersion: v1
baseDomain: ipincloud.com
proxy:
  httpProxy: http://192.168.128.30:8001
  httpsProxy: http://192.168.128.30:8001
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocptest
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{"omitted'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  omitted; note that these lines must be indented by two spaces
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
(3) Back up the customized install-config.yaml so that it can be reused later
cd /data/install
cp install-config.yaml ../install-config.yaml.20200205
(1) Generate the Kubernetes manifests
openshift-install create manifests --dir=/data/install
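Note: when specifying the directory that contains install-config.yaml, use an absolute path.
(2) Edit manifests/cluster-scheduler-02-config.yml to prevent pods from being scheduled on the control plane nodes
According to the official Red Hat installation documentation, Kubernetes does not support an ingress load balancer reaching pods on control plane nodes.
a. Open manifests/cluster-scheduler-02-config.yml
b. Find the mastersSchedulable parameter and set it to False
c. Save and exit.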
vi /data/install/manifests/cluster-scheduler-02-config.yml
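The same edit can be scripted, which helps when the manifests are regenerated on every retry. This is a sketch assuming GNU sed (for `-i`) and a manifest that contains a single `mastersSchedulable:` line; `disable_master_scheduling` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Set mastersSchedulable to false in place, keeping the line's indentation.
# Assumes GNU sed; on other sed implementations -i takes a suffix argument.
disable_master_scheduling() {
  local file=$1
  sed -i 's/^\([[:space:]]*mastersSchedulable:\).*/\1 false/' "$file"
}

# Hypothetical usage:
#   disable_master_scheduling /data/install/manifests/cluster-scheduler-02-config.yml
```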
(3) Create the Ignition configuration files
Note: once the Ignition configuration files have been created, install-config.yaml is deleted, so be sure to back it up first.
openshift-install create ignition-configs --dir=/data/install
(4) Copy the Ignition configuration files to the HTTP server directory for use during installation
cd /data/install
\cp -f bootstrap.ign /var/www/html/ignition/bootstrap.ign
\cp -f master.ign /var/www/html/ignition/master1.ign
\cp -f master.ign /var/www/html/ignition/master2.ign
\cp -f master.ign /var/www/html/ignition/master3.ign
\cp -f worker.ign /var/www/html/ignition/worker1.ign
\cp -f worker.ign /var/www/html/ignition/worker2.ign
cd /var/www/html/ignition/
chmod 755 *.ign
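The six copies follow a simple rule: each node gets the Ignition file for its role under its own name. A loop version is sketched below; `publish_ignition` is a hypothetical helper, the node names are the ones from the server plan, and the 755 mode mirrors the chmod above.

```shell
#!/usr/bin/env bash
# Copy role-level Ignition files to per-node names in the HTTP directory.
# $1: directory with bootstrap.ign, master.ign, worker.ign; $2: target directory.
publish_ignition() {
  local src=$1 dest=$2 node role
  mkdir -p "$dest"
  cp -f "$src/bootstrap.ign" "$dest/bootstrap.ign"
  for node in master1 master2 master3 worker1 worker2; do
    role=${node%[0-9]}            # master1 -> master, worker2 -> worker
    cp -f "$src/${role}.ign" "$dest/${node}.ign"
  done
  chmod 755 "$dest"/*.ign
}

# Hypothetical usage:
#   publish_ignition /data/install /var/www/html/ignition
```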
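This completes the required configuration files; the next step is creating the nodes.
The installation requires modified boot parameters, which otherwise have to be typed in by hand. Doing that on every machine is tedious and error-prone, so we use genisoimage to build a customized installation image for each machine.
# Install the image-building tools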
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd
# Set environment variables
export NGINX_DIRECTORY=/data/pkg
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
# Create a temporary directory to hold intermediate files
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}
# Extract the ISO content using guestfish (to avoid sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
-m /dev/sda tar-out / - | tar xvf -
# Define the function that modifies the boot configuration files
modify_cfg(){
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
# Append the appropriate image and ignition URLs
sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
# Shorten the boot menu timeout
sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
done
}
# Set the URL, gateway, DNS, and other boot parameters common to all ISOs
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"
# Variables for the bootstrap node
NODE="bootstrap"
IP="192.168.128.31"
FQDN="bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Variables for the master1 node
NODE="master1"
IP="192.168.128.32"
FQDN="master1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Variables for the master2 node
NODE="master2"
IP="192.168.128.33"
FQDN="master2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Variables for the master3 node
NODE="master3"
IP="192.168.128.34"
FQDN="master3"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Variables for the worker1 node
NODE="worker1"
IP="192.168.128.35"
FQDN="worker1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Variables for the worker2 node
NODE="worker2"
IP="192.168.128.36"
FQDN="worker2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg
# Build a separate installation image for each node
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in bootstrap master1 master2 master3 worker1 worker2; do
# Put each node's own grub.cfg and isolinux.cfg in place
for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
/bin/cp -f $(pwd)/${node}_${file##*/} ${file}
done
# Build the ISO image
genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
-eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table \
-eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
-o ${NGINX_DIRECTORY}/${node}.iso .
done
# Clean up the intermediate files
cd
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}
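The six near-identical variable blocks differ only in node name and IP, so the kernel arguments can also be produced from a small table. The sketch below prints the string that modify_cfg appends after coreos.inst=yes, using the same values as the script. `kernel_args` is a hypothetical helper and it only prints the arguments; it does not touch grub.cfg or isolinux.cfg.

```shell
#!/usr/bin/env bash
# Build the kernel arguments appended for one node.
# Args: url node ip gateway netmask interface dns [biosmode]
kernel_args() {
  local url=$1 node=$2 ip=$3 gateway=$4 netmask=$5 iface=$6 dns=$7 biosmode=${8:-bios}
  printf 'coreos.inst.install_dev=sda coreos.inst.image_url=%s/install/%s.raw.gz coreos.inst.ignition_url=%s/ignition/%s.ign ip=%s::%s:%s:%s:%s:none:%s nameserver=%s\n' \
    "$url" "$biosmode" "$url" "$node" \
    "$ip" "$gateway" "$netmask" "$node" "$iface" "$dns" "$dns"
}

# One line per node: name and IP, taken from the server plan.
while read -r node ip; do
  kernel_args "http://192.168.128.30:8080" "$node" "$ip" \
    "192.168.128.254" "255.255.255.0" "ens192" "192.168.128.30"
done <<'EOF'
bootstrap 192.168.128.31
master1 192.168.128.32
master2 192.168.128.33
master3 192.168.128.34
worker1 192.168.128.35
worker2 192.168.128.36
EOF
```

Keeping the node list in one place also avoids the kind of copy-and-paste slip that is easy to make when the same block is repeated six times.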
(1) Copy the customized ISO files to the VMware ESXi host, ready for installing the nodes
[root@misc pkg]# scp bootstrap.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp m*.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp w*.iso [email protected]:/vmfs/volumes/hdd/iso
(2) Create the masters as planned and set them to boot from the ISO to install
openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
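Notes:
Manually edit the etcd YAML file on the master1 node and append the --initial-cluster-state=existing parameter to the end of the exec etcd command. After you delete the failing pod, the system automatically recreates the etcd pod and it recovers.
Once it starts normally, change this back; otherwise machine-config will never finish.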
#
[root@master1 /]# vi /etc/kubernetes/manifests/etcd-member.yaml
exec etcd \
--initial-advertise-peer-urls=https://${ETCD_IPV4_ADDRESS}:2380 \
--cert-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.crt \
--key-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.key \
--trusted-ca-file=/etc/ssl/etcd/ca.crt \
--client-cert-auth=true \
--peer-cert-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.crt \
--peer-key-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.key \
--peer-trusted-ca-file=/etc/ssl/etcd/ca.crt \
--peer-client-cert-auth=true \
--advertise-client-urls=https://${ETCD_IPV4_ADDRESS}:2379 \
--listen-client-urls=https://0.0.0.0:2379 \
--listen-peer-urls=https://0.0.0.0:2380 \
--listen-metrics-urls=https://0.0.0.0:9978 \
--initial-cluster-state=existing
[root@master1 /]# crictl pods
POD ID CREATED STATE NAME NAMESPACE ATTEMPT
c4686dc3e5f4f 38 minutes ago Ready etcd-member-master1.ocptest.ipincloud.com openshift-etcd 5
[root@master1 /]# crictl rmp xxx
[root@misc install]# openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocptest.ipincloud.com:6443...
INFO API v1.16.2 up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
[root@misc install]#
(3) Install the workers
[root@misc redhat-operators-manifests]# openshift-install --dir=/data/install wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the cluster at https://api.ocptest.ipincloud.com:6443 to initialize...
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
INFO Access the OpenShift web-console here:
https://console-openshift-console.apps.ocptest.ipincloud.com
INFO Login to the console with user: kubeadmin, password: pubmD-8Baaq-IX36r-WIWWf
List the CSRs awaiting approval
[root@misc ~]# oc get csr
NAME AGE REQUESTOR CONDITION
csr-7lln5 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-d48xk 69m system:node:master1.ocptest.ipincloud.com Approved,Issued
csr-f2g7r 69m system:node:master2.ocptest.ipincloud.com Approved,Issued
csr-gbn2n 69m system:node:master3.ocptest.ipincloud.com Approved,Issued
csr-hwxwx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-ppgxx 13m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-wg874 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-zkp79 70m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
[root@misc ~]#
Approve them
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
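If jq is not installed, the pending requests can also be picked out of the plain table output shown above, where the last column is the condition. This is a sketch only; `pending_csrs` is a hypothetical helper and it depends on the column layout of oc get csr staying as shown.

```shell
#!/usr/bin/env bash
# Read `oc get csr` table output on stdin and print the names of pending CSRs.
pending_csrs() {
  # Skip the header row; the last field is the CONDITION column.
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Hypothetical usage (xargs -r skips the approve call when nothing is pending):
#   oc get csr | pending_csrs | xargs -r oc adm certificate approve
```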
(3) Start NFS on misc
bash /data/pkg/ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
# Check the status
oc get pods -n nfs-provisioner
(4) Use NFS as the storage for the OCP internal registry
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"storage":{"pvc":{"claim":""}}}}' --type=merge
oc get clusteroperator image-registry
(1) Configure a regular administrator account
# Create the admin htpasswd file on the misc machine
mkdir -p ~/auth
htpasswd -bBc ~/auth/admin-passwd admin scwang18
# Copy it to the local machine
mkdir -p ~/auth
scp -P 20030 [email protected]:/root/auth/admin-passwd ~/auth/
# On the OAuth Details page, add an Identity Provider of type HTPasswd and upload the admin-passwd file.
https://console-openshift-console.apps.ocptest.ipincloud.com
# Grant the new admin user cluster administrator privileges
oc adm policy add-cluster-role-to-user cluster-admin admin