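The rise of Kubernetes has not only accelerated the adoption of containers but also fueled the popularity of cloud-native technology. The financial industry is likewise seeing more and more systems move to the cloud. To make our bank's later move to the cloud easier and quicker to pick up, I have started studying and experimenting with k8s and cloud-native technology. Hands-on experiments are the fastest way to master a subject, so I chose to start with installation and deployment: build a Kubernetes cluster of my own, then study it with concrete questions in mind. Later I will also work through books for a more systematic understanding, and I hope I can stick with it. The rest of this post builds a k8s cluster offline on RHEL7.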
IP | Host name | Role |
---|---|---|
172.16.131.83 | k8s-master | master (control-plane) node |
172.16.131.84 | k8s-node1 | worker node 1 |
172.16.131.85 | k8s-node2 | worker node 2 |
172.16.131.86 | k8s-node3 | worker node 3 |
172.16.131.87 | registry-harbor | image registry |
172.16.131.88 | k8s-zhongzhuan | internet-connected relay host |
1) On every k8s node, add the cluster host names to /etc/hosts:
cp /etc/hosts /etc/hosts_`date +%y%m%d`
echo "
172.16.131.83 k8s-master
172.16.131.84 k8s-node1
172.16.131.85 k8s-node2
172.16.131.86 k8s-node3
172.16.131.87 registry-harbor
" >> /etc/hosts
2) Configure kernel parameters:
echo "fs.file-max = 6815744
kernel.sem = 10000 10240000 10000 1024
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 751619276800
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 16777216
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.wmem_default = 16777216
fs.aio-max-nr = 6194304
vm.dirty_ratio=20
vm.dirty_background_ratio=3
vm.dirty_writeback_centisecs=100
vm.dirty_expire_centisecs=500
vm.min_free_kbytes=524288
net.core.netdev_max_backlog = 30000
net.core.netdev_budget = 600
#vm.nr_hugepages =
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
net.ipv4.ipfrag_time = 60
net.ipv4.ipfrag_low_thresh = 6291456
net.ipv4.ipfrag_high_thresh = 8388608
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0">> /etc/sysctl.conf && sysctl -p
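The two net.bridge.* keys above only exist once the br_netfilter kernel module is loaded; without it, sysctl -p reports them as unknown. A minimal sketch that keeps the Kubernetes-specific settings in their own drop-in files follows. The function name, the file names and the extra net.ipv4.ip_forward line are assumptions added for illustration and are not part of the original parameter list.

```shell
#!/usr/bin/env bash
# write_k8s_kernel_config MODULES_FILE SYSCTL_FILE
# Writes a modules-load entry for br_netfilter and the sysctl keys that
# Kubernetes networking relies on. Both target paths are passed in, so the
# function can be tried against scratch files before touching /etc.
write_k8s_kernel_config() {
  local modules_file="$1" sysctl_file="$2"
  printf 'br_netfilter\n' > "$modules_file"
  cat > "$sysctl_file" <<'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
# Assumption: not in the original list; kubeadm's preflight checks expect it.
net.ipv4.ip_forward = 1
EOF
}

# On a real node (as root), assuming the conventional drop-in locations:
#   write_k8s_kernel_config /etc/modules-load.d/k8s.conf /etc/sysctl.d/k8s.conf
#   modprobe br_netfilter && sysctl --system
```

Loading the module before applying the settings avoids the "No such file or directory" errors for the bridge keys.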
3) Configure user resource limits:
cp /etc/security/limits.conf /etc/security/limits_`date +"%Y%m%d_%H%M%S"`.conf
echo "
* soft nproc 655350
* hard nproc 655350
* soft nofile 655360
* hard nofile 655360
* soft stack 102400
* hard stack 327680
* soft memlock -1
* hard memlock -1" >>/etc/security/limits.conf
4) Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
5) Disable SELinux:
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
6) Disable transparent huge pages:
[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled
[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
grep transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
grep redhat_transparent_hugepage /etc/rc.d/rc.local 1>/dev/null || echo '[ -f /sys/kernel/mm/redhat_transparent_hugepage/enabled ] && echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local
[ -x /etc/rc.d/rc.local ] || chmod +x /etc/rc.d/rc.local
7) Disable swap:
swapoff -a
sed -i 's/.*swap.*/#&/' /etc/fstab
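The sed expression above prefixes every line that contains the word swap, including lines that are already commented out. A slightly more careful variant, sketched below, only touches active swap entries. The function name is hypothetical, and it prints the result instead of editing in place so the output can be reviewed first.

```shell
#!/usr/bin/env bash
# comment_active_swap FSTAB
# Prints FSTAB with every active swap entry commented out. Lines that
# already start with "#" are left alone, so repeated runs do not stack
# extra "#" characters.
comment_active_swap() {
  sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Review first, then replace the file (as root):
#   comment_active_swap /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
#   cp /tmp/fstab.new /etc/fstab
```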
8) Configure passwordless SSH (see the appendix for the contents of sshUserSetup.sh):
sh sshUserSetup.sh -user root -hosts "k8s-master k8s-node1 k8s-node2 k8s-node3"
9) Synchronize the clocks (the other nodes sync against the master):
On the master:
vi /etc/ntp.conf
#server 0.rhel.pool.ntp.org iburst
#server 1.rhel.pool.ntp.org iburst
#server 2.rhel.pool.ntp.org iburst
#server 3.rhel.pool.ntp.org iburst
server 127.127.1.0
fudge 127.127.1.0 stratum 10
On the other machines:
crontab -e
*/2 * * * * /usr/sbin/ntpdate 172.16.131.83
date && ssh k8s-node1 date && ssh k8s-node2 date && ssh k8s-node3 date
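Starting with Kubernetes 1.24, dockershim was removed from the kubelet, so Docker Engine no longer works directly as the container runtime; in my deployment, initialization failed to pull images from the private registry. There are two ways around this: option 1, deploy cri-dockerd alongside the Docker runtime; option 2, use containerd as the container runtime. This guide takes option 1.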
1) Install the required tools (the local yum source is sufficient):
yum install -y yum-utils device-mapper-persistent-data lvm2 wget
2) Install EPEL (requires a CentOS 7 repo)
Fetch the Aliyun CentOS 7 repo file:
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
3) Edit CentOS-Base.repo and replace every $releasever with the version number 7 (the substitution is typed at the vi command prompt):
vi /etc/yum.repos.d/CentOS-Base.repo
:%s/$releasever/7/g
4) Clean and rebuild the yum cache:
yum clean all&& yum makecache fast
5) Install epel-release.noarch:
yum install -y epel-release.noarch
6) Add the Docker repo:
yum-config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
or
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
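7) Enable the yum repository:
yum-config-manager --enable docker-ce-nightly
(List the installable Docker versions: yum list docker-ce --showduplicates | sort -r)
Note: listing the installable Docker versions may fail with an error like the following: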
https://mirrors.aliyun.com/docker-ce/linux/centos/7Server/x86_64/stable/repodata/7cc100684a6630e5382cf07c92483acecdff60eb94243af9acb95654c2913d70-primary.sqlite.bz2: [Errno 14] HTTPS Error 404 - Not Found
Trying other mirror.
The main cause is that $releasever in the repo configuration cannot be resolved. In that case, do the following:
vi /etc/yum.repos.d/docker-ce.repo
:%s/$releasever/7/g
8) Clean and rebuild the yum cache:
yum clean all&& yum makecache fast
9) Download the packages for the chosen Docker version:
mkdir -p /app/soft/docker
cd /app/soft/docker
yumdownloader --resolve docker-ce-23.0.1
10) Package them:
cd /app/soft
tar -cvzf docker_v23.0.1_offline_pkg.tar.gz docker
11) Copy docker_v23.0.1_offline_pkg.tar.gz to the offline machines:
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.86:/app/soft/
scp -rp docker_v23.0.1_offline_pkg.tar.gz 172.16.131.87:/app/soft/
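The same copy-to-every-node pattern recurs for each offline package in this guide. It can be wrapped in a small loop, as in the sketch below. The function name and the RUN switch are hypothetical; by default the function only prints the commands so they can be checked before anything is transferred.

```shell
#!/usr/bin/env bash
# distribute PKG DEST HOST...
# Prints one scp command per host. Set RUN=1 in the environment to
# execute the copies instead of printing them.
distribute() {
  local pkg="$1" dest="$2" host
  shift 2
  for host in "$@"; do
    if [ "${RUN:-0}" = "1" ]; then
      scp -p "$pkg" "${host}:${dest}"
    else
      echo "scp -p $pkg ${host}:${dest}"
    fi
  done
}

# Dry run for the Docker package and the five offline hosts of this guide.
distribute docker_v23.0.1_offline_pkg.tar.gz /app/soft/ \
  172.16.131.83 172.16.131.84 172.16.131.85 172.16.131.86 172.16.131.87
```

Running it again with RUN=1 in the environment performs the actual copies.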
12) Download the cri-dockerd binary package
Download page:
https://github.com/Mirantis/cri-dockerd/releases/
Choose the binary package:
cri-dockerd-0.3.1.amd64.tgz
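1) Extract the offline packages:
tar -xvzf docker_v23.0.1_offline_pkg.tar.gz -C /app/soft/
tar -xvzf cri-dockerd-0.3.1.amd64.tgz -C /app/soft/
2) Install Docker:
cd /app/soft/docker
yum install *.rpm
3) Start Docker and enable it at boot:
systemctl start docker && systemctl enable docker
4) Install cri-dockerd. Extract the archive if that has not been done yet; it unpacks to /app/soft/cri-dockerd:
tar -xvzf cri-dockerd-0.3.1.amd64.tgz -C /app/soft
5) Copy the binary to /usr/bin and make it executable:
cd /app/soft/cri-dockerd
cp cri-dockerd /usr/bin/
chmod +x /usr/bin/cri-dockerd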
6) Create the cri-dockerd systemd service file:
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=172.16.131.87:1088/kubernetes-deploy/pause:3.7
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
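Note: the service file must set --pod-infra-container-image. Otherwise, during the later Kubernetes deployment, the pause image is pulled from the default location k8s.gcr.io/pause:3.7, which cannot be reached from the offline network. With the parameter set, the image is pulled from the registry we specify. The value must name the project the images are pushed to, as in the ExecStart line above: --pod-infra-container-image=172.16.131.87:1088/kubernetes-deploy/pause:3.7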
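Because the pause image reference has to agree with the registry address and project name used later when pushing images, it can help to build the ExecStart line from variables rather than typing it by hand. The sketch below does that. The variable and function names are hypothetical; the values are the ones used in this guide.

```shell
#!/usr/bin/env bash
# Values used throughout this guide; adjust them to your own registry.
REGISTRY="172.16.131.87:1088"
PROJECT="kubernetes-deploy"
PAUSE_TAG="3.7"

# Prints the ExecStart line for cri-docker.service with the pause image
# taken from the private registry.
cri_dockerd_execstart() {
  echo "ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=${REGISTRY}/${PROJECT}/pause:${PAUSE_TAG}"
}

cri_dockerd_execstart
```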
7) Create the socket unit file:
cat <<"EOF" > /usr/lib/systemd/system/cri-docker.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
8) Start the cri-dockerd runtime:
systemctl daemon-reload
systemctl start cri-docker
systemctl enable cri-docker
systemctl status cri-docker
1) Download and configure the EPEL repo:
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
2) Download docker-compose
Check the available versions:
yum list docker-compose --showduplicates | sort -r
Create a directory:
mkdir -p /app/soft/docker-compose
cd /app/soft/docker-compose
Download the chosen version:
yumdownloader --resolve docker-compose-1.18.0
3) Package the docker-compose files:
cd /app/soft
tar -cvzf docker-compose_offline_pkg_v1.18.0.tar.gz docker-compose
4) Copy docker-compose_offline_pkg_v1.18.0.tar.gz to the offline machines:
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.83:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.84:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.85:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.86:/app/soft/
scp -rp docker-compose_offline_pkg_v1.18.0.tar.gz 172.16.131.87:/app/soft/
5) On the offline machines, extract the package:
tar -xvzf docker-compose_offline_pkg_v1.18.0.tar.gz -C /app/soft/
6) On the offline machines, install docker-compose:
cd /app/soft/docker-compose
yum install *.rpm
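7) Download the Harbor offline installer (on the internet-connected relay host). GitHub release downloads redirect, so curl needs -L:
curl -LO https://github.com/goharbor/harbor/releases/download/v2.7.1/harbor-offline-installer-v2.7.1.tgz
Alternatively, download it manually from GitHub and upload it.
8) Copy the offline package to the registry-harbor host and extract it there:
scp -rp /app/soft/harbor-offline-installer-v2.7.1.tgz 172.16.131.87:/app/soft/
tar -xvzf /app/soft/harbor-offline-installer-v2.7.1.tgz -C /app/soft/
9) Edit the YAML configuration as needed:
cd /app/soft/harbor
cp harbor.yml.tmpl harbor.yml
vi harbor.yml
The main settings to change are shown below: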
# Configuration file of Harbor
# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: 172.16.131.87
# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 1088
# https related config
#https:
# # https port for harbor, default is 443
# port: 443
# # The path of cert and key files for nginx
# certificate: /your/certificate/path
# private_key: /your/private/key/path
# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
# # set enabled to true means internal tls is enabled
# enabled: true
# # put your cert and key files on dir
# dir: /etc/harbor/tls/internal
# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433
# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: Harbor@1234
# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: Harbor@1234
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 100
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 900
  # The maximum amount of time a connection may be reused. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's age.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_lifetime: 5m
  # The maximum amount of time a connection may be idle. Expired connections may be closed lazily before reuse. If it <= 0, connections are not closed due to a connection's idle time.
  # The value is a duration string. A duration string is a possibly signed sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "-1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
  conn_max_idle_time: 0
# The default data volume
data_volume: /app/data
# Harbor Storage settings by default is using /data dir on local filesystem
# Uncomment storage_service setting If you want to using external storage
# storage_service:
# # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
# # of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.
# ca_bundle:
# # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
# # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
# filesystem:
# maxthreads: 100
# # set disable to true when you want to disable registry redirect
# redirect:
# disabled: false
# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # The offline_scan option prevents Trivy from sending API requests to identify dependencies.
  # Scanning JAR files and pom.xml may require Internet access for better detection, but this option tries to avoid it.
  # For example, the offline mode will not try to resolve transitive dependencies in pom.xml when the dependency doesn't
  # exist in the local repositories. It means a number of detected vulnerabilities might be fewer in offline mode.
  # It would work if all the dependencies are in local.
  # This option doesn’t affect DB download. You need to specify "skip-update" as well as "offline-scan" in an air-gapped environment.
  offline_scan: false
  #
  # Comma-separated list of what security issues to detect. Possible values are `vuln`, `config` and `secret`. Defaults to `vuln`.
  security_check: vuln
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
# github_token The GitHub access token to download Trivy DB
#
# Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
# for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
# requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
# https://developer.github.com/v3/#rate-limiting
#
# You can create a GitHub token by following the instructions in
# https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
#
# github_token: xxx
jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10
notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 10
chart:
  # Change the value of absolute_url to enabled can enable absolute url in chart
  absolute_url: disabled
# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /app/harbor/log
# Uncomment following lines to enable external syslog endpoint.
# external_endpoint:
# # protocol used to transmit log to external endpoint, options is tcp or udp
# protocol: tcp
# # The host of external endpoint
# host: localhost
# # Port of external endpoint
# port: 5140
#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.7.0
# Uncomment external_database if using external database.
# external_database:
# harbor:
# host: harbor_db_host
# port: harbor_db_port
# db_name: harbor_db_name
# username: harbor_db_username
# password: harbor_db_password
# ssl_mode: disable
# max_idle_conns: 2
# max_open_conns: 0
# notary_signer:
# host: notary_signer_db_host
# port: notary_signer_db_port
# db_name: notary_signer_db_name
# username: notary_signer_db_username
# password: notary_signer_db_password
# ssl_mode: disable
# notary_server:
# host: notary_server_db_host
# port: notary_server_db_port
# db_name: notary_server_db_name
# username: notary_server_db_username
# password: notary_server_db_password
# ssl_mode: disable
# Uncomment external_redis if using external Redis server
# external_redis:
# # support redis, redis+sentinel
# # host for redis: :
# # host for redis+sentinel:
# # :,:,:
# host: redis:6379
# password:
# # sentinel_master_set must be set to support redis+sentinel
# #sentinel_master_set:
# # db_index 0 is for core, it's unchangeable
# registry_db_index: 1
# jobservice_db_index: 2
# chartmuseum_db_index: 3
# trivy_db_index: 5
# idle_timeout_seconds: 30
# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
# ca_file: /path/to/ca
# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components doesn't need to connect to each others via http proxy.
# Remove component from `components` array if want disable proxy
# for it. If you want use proxy for replication, MUST enable proxy
# for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add domain to the `no_proxy` field, when you want disable proxy
# for some special registry.
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy
# metric:
# enabled: false
# port: 9090
# path: /metrics
# Trace related config
# only can enable one trace provider(jaeger or otel) at the same time,
# and when using jaeger as provider, can only enable it with agent mode or collector mode.
# if using jaeger collector mode, uncomment endpoint and uncomment username, password if needed
# if using jaeger agetn mode uncomment agent_host and agent_port
# trace:
# enabled: true
# # set sample_rate to 1 if you wanna sampling 100% of trace data; set 0.5 if you wanna sampling 50% of trace data, and so forth
# sample_rate: 1
# # # namespace used to differenciate different harbor services
# # namespace:
# # # attributes is a key value dict contains user defined attributes used to initialize trace provider
# # attributes:
# # application: harbor
# # # jaeger should be 1.26 or newer.
# # jaeger:
# # endpoint: http://hostname:14268/api/traces
# # username:
# # password:
# # agent_host: hostname
# # # export trace data by jaeger.thrift in compact mode
# # agent_port: 6831
# # otel:
# # endpoint: hostname:4318
# # url_path: /v1/traces
# # compression: false
# # insecure: true
# # timeout: 10s
# enable purge _upload directories
upload_purging:
  enabled: true
  # remove files in _upload directories which exist for a period of time, default is one week.
  age: 168h
  # the interval of the purge operations
  interval: 24h
  dryrun: false
# cache layer configurations
# If this feature enabled, harbor will cache the resource
# `project/project_metadata/repository/artifact/manifest` in the redis
# which can especially help to improve the performance of high concurrent
# manifest pulling.
# NOTICE
# If you are deploying Harbor in HA mode, make sure that all the harbor
# instances have the same behaviour, all with caching enabled or disabled,
# otherwise it can lead to potential data inconsistency.
cache:
  # not enabled by default
  enabled: false
  # keep cache for one day by default
  expire_hours: 24
10) Install Harbor:
cd /app/soft/harbor
./install.sh
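Every later step addresses the registry as hostname:port, so it is worth deriving that address from harbor.yml rather than retyping it. The sketch below reads the first active hostname and port entries from the file. The function name is hypothetical, and the parsing assumes the simple key: value layout shown above with HTTPS left commented out.

```shell
#!/usr/bin/env bash
# harbor_registry_addr HARBOR_YML
# Prints "hostname:port" using the first uncommented hostname: and port:
# entries. With https disabled, the first active port is the http port.
harbor_registry_addr() {
  awk '
    /^[[:space:]]*hostname:/ && host == "" { host = $2 }
    /^[[:space:]]*port:/     && port == "" { port = $2 }
    END { if (host != "" && port != "") print host ":" port }
  ' "$1"
}

# Example on the registry host:
#   harbor_registry_addr /app/soft/harbor/harbor.yml
```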
11) On every server, point Docker at the internal Harbor registry and switch Docker's cgroup driver to systemd:
cat > /etc/docker/daemon.json<
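A daemon.json matching the description in step 11 might look like the sketch below. insecure-registries and exec-opts are standard Docker daemon options, but the exact contents here are an assumption derived from the HTTP registry address used in this guide, and the function name is hypothetical.

```shell
#!/usr/bin/env bash
# render_daemon_json REGISTRY
# Prints a minimal daemon.json that allows plain-HTTP access to REGISTRY
# and sets the cgroup driver to systemd.
render_daemon_json() {
  local registry="$1"
  cat <<EOF
{
  "insecure-registries": ["${registry}"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

render_daemon_json "172.16.131.87:1088"
```

On a real node the output would be redirected to /etc/docker/daemon.json, followed by systemctl restart docker so the daemon picks up the change.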
1) Configure the Kubernetes yum repo:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2) Reload the yum cache:
yum clean all && yum makecache
3) List the available versions of kubelet, kubeadm and kubectl:
yum list kubelet --showduplicates | sort -r
yum list kubeadm --showduplicates | sort -r
yum list kubectl --showduplicates | sort -r
4) Download the kubeadm-related packages:
yumdownloader kubelet-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubelet
yumdownloader kubeadm-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubeadm
yumdownloader kubectl-1.25.6 --resolve --destdir=/app/soft/kubernetes/kubectl
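5) Resolving the dependencies of kubeadm also pulls kubectl-1.26.3 and kubelet-1.26.3 into the kubeadm folder. Move them out, then package the remaining files from /app/soft so the archive contains the relative path kubernetes/:
cd /app/soft/kubernetes/kubeadm/
mv *kubectl-1.26.3*.rpm *kubelet-1.26*.rpm ../../
cd /app/soft
tar -cvzf kubeadm_1.25.6_offline_install_pkg.tar.gz kubernetes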
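Before packaging, it is easy to miss a dependency that was resolved to a newer release than the pinned one. The sketch below lists kubelet, kubectl and kubeadm RPMs in a directory whose file name does not carry the expected version. The function name is hypothetical, and the matching assumes the usual name-version-release file naming.

```shell
#!/usr/bin/env bash
# list_version_mismatch DIR VERSION
# Prints the kubelet/kubectl/kubeadm RPM file names in DIR that do not
# contain "-VERSION-", i.e. packages pulled in at a different release.
list_version_mismatch() {
  local dir="$1" version="$2" f base
  for f in "$dir"/*.rpm; do
    [ -e "$f" ] || continue              # directory without RPMs
    base="$(basename "$f")"
    case "$base" in
      *kubelet-[0-9]*|*kubectl-[0-9]*|*kubeadm-[0-9]*)
        case "$base" in
          *"-${version}-"*) ;;           # pinned version, keep
          *) echo "$base" ;;             # different version, report
        esac
        ;;
    esac
  done
}

# Example:
#   list_version_mismatch /app/soft/kubernetes/kubeadm 1.25.6
```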
6) Copy the archive to all offline nodes:
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.83:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.84:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.85:/app/soft/
scp -rp /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz 172.16.131.86:/app/soft/
1) On every machine, extract and install kubelet, kubectl and kubeadm:
tar -xvzf /app/soft/kubeadm_1.25.6_offline_install_pkg.tar.gz -C /app/
cd /app/kubernetes/kubelet/
yum install -y *.rpm
cd /app/kubernetes/kubectl/
yum install -y *.rpm
cd /app/kubernetes/kubeadm/
yum install -y *.rpm
2) Start the kubelet service:
systemctl start kubelet && systemctl enable kubelet && systemctl status kubelet
3) List the image versions required to deploy Kubernetes:
kubeadm config images list --kubernetes-version=v1.25.6
registry.k8s.io/kube-apiserver:v1.25.6
registry.k8s.io/kube-controller-manager:v1.25.6
registry.k8s.io/kube-scheduler:v1.25.6
registry.k8s.io/kube-proxy:v1.25.6
registry.k8s.io/pause:3.7
registry.k8s.io/etcd:3.5.6-0
registry.k8s.io/coredns/coredns:v1.8.6
1) Download the k8s images:
mkdir -p /app/soft/k8s_images
docker pull dyrnq/kube-apiserver:v1.25.6
docker pull dyrnq/kube-controller-manager:v1.25.6
docker pull dyrnq/kube-scheduler:v1.25.6
docker pull dyrnq/kube-proxy:v1.25.6
docker pull dyrnq/pause:3.7
docker pull dyrnq/etcd:3.5.6-0
docker pull dyrnq/coredns:v1.8.6
docker pull registry:latest
docker pull quay.io/coreos/flannel:v0.15.1
docker pull flannel/flannel-cni-plugin:v1.1.2
docker pull nginx:latest
2) Save the images to tar files inside the directory created above:
cd /app/soft/k8s_images
docker save dyrnq/kube-apiserver:v1.25.6 -o kube-apiserver_v1.25.6.tar
docker save dyrnq/kube-controller-manager:v1.25.6 -o kube-controller-manager_v1.25.6.tar
docker save dyrnq/kube-scheduler:v1.25.6 -o kube-scheduler_v1.25.6.tar
docker save dyrnq/kube-proxy:v1.25.6 -o kube-proxy_v1.25.6.tar
docker save dyrnq/pause:3.7 -o pause_v1.25.6.tar
docker save dyrnq/etcd:3.5.6-0 -o etcd_v1.25.6.tar
docker save dyrnq/coredns:v1.8.6 -o coredns_v1.25.6.tar
docker save registry:latest -o registry_latest.tar
docker save quay.io/coreos/flannel:v0.15.1 -o flannel_v0.15.1.tar
docker save flannel/flannel-cni-plugin:v1.1.2 -o flannel-cni-plugin_v1.1.2.tar
docker save nginx:latest -o nginx_latest.tar
3) Compress the saved images and copy them to the k8s master node:
cd /app/soft && tar -cvzf k8s_images.tar.gz k8s_images
scp -rp /app/soft/k8s_images.tar.gz 172.16.131.83:/app/soft
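1) Extract the images:
tar -xvzf /app/soft/k8s_images.tar.gz -C /app/soft
2) Load the images:
cd /app/soft/k8s_images
for i in *; do
  docker load -i "$i"
done
3) Re-tag the images for the private registry. The command below prints candidate tag commands; the source repository prefix (such as dyrnq/) still has to be removed from the target names, which gives the commands that follow:
docker images|awk '{print "docker tag " $1 ":" $2 " 172.16.131.87:1088/kubernetes-deploy/" $1 ":" $2}'|sed 1d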
docker tag dyrnq/kube-apiserver:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-apiserver:v1.25.6
docker tag dyrnq/kube-controller-manager:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-controller-manager:v1.25.6
docker tag dyrnq/kube-scheduler:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-scheduler:v1.25.6
docker tag dyrnq/kube-proxy:v1.25.6 172.16.131.87:1088/kubernetes-deploy/kube-proxy:v1.25.6
docker tag dyrnq/pause:3.7 172.16.131.87:1088/kubernetes-deploy/pause:3.7
docker tag dyrnq/etcd:3.5.6-0 172.16.131.87:1088/kubernetes-deploy/etcd:3.5.6-0
docker tag dyrnq/coredns:v1.8.6 172.16.131.87:1088/kubernetes-deploy/coredns:v1.8.6
docker tag registry:latest 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker tag quay.io/coreos/flannel:v0.15.1 172.16.131.87:1088/kubernetes-deploy/flannel:v0.15.1
docker tag flannel/flannel-cni-plugin:v1.1.2 172.16.131.87:1088/kubernetes-deploy/flannel-cni-plugin:v1.1.2
docker tag nginx:latest 172.16.131.87:1088/kubernetes-deploy/nginx:latest
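The mapping used in the tag commands above is mechanical: keep only the final name:tag component of the source image and prepend the registry address and project. A sketch of that rule as a small command generator follows. The function and variable names are hypothetical, and the script only prints commands for review.

```shell
#!/usr/bin/env bash
REGISTRY="172.16.131.87:1088"
PROJECT="kubernetes-deploy"

# target_ref SOURCE_IMAGE
# Maps e.g. quay.io/coreos/flannel:v0.15.1 to
# 172.16.131.87:1088/kubernetes-deploy/flannel:v0.15.1.
target_ref() {
  # ${1##*/} drops everything up to the last "/", leaving "name:tag".
  echo "${REGISTRY}/${PROJECT}/${1##*/}"
}

# Prints the docker tag and docker push commands for each source image.
print_retag_commands() {
  local src
  for src in "$@"; do
    echo "docker tag ${src} $(target_ref "$src")"
    echo "docker push $(target_ref "$src")"
  done
}

print_retag_commands dyrnq/kube-apiserver:v1.25.6 quay.io/coreos/flannel:v0.15.1 nginx:latest
```

Piping the printed lines to sh runs them once they look right.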
4) Log in to the private registry on every node, then push the re-tagged images from the master node:
docker login 172.16.131.87:1088
Username: admin
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
docker images|grep "172.16.131.87"|awk '{print "docker push " $1 ":" $2}'
docker push 172.16.131.87:1088/kubernetes-deploy/kube-apiserver:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-controller-manager:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-scheduler:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/kube-proxy:v1.25.6
docker push 172.16.131.87:1088/kubernetes-deploy/pause:3.7
docker push 172.16.131.87:1088/kubernetes-deploy/etcd:3.5.6-0
docker push 172.16.131.87:1088/kubernetes-deploy/coredns:v1.8.6
docker push 172.16.131.87:1088/kubernetes-deploy/registry:latest
docker push 172.16.131.87:1088/kubernetes-deploy/flannel:v0.15.1
docker push 172.16.131.87:1088/kubernetes-deploy/flannel-cni-plugin:v1.1.2
docker push 172.16.131.87:1088/kubernetes-deploy/nginx:latest
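5) Initialize the Kubernetes cluster on the master node
Generate the default init configuration file on the master node:
kubeadm config print init-defaults > kubeadm.yaml
Edit the configuration parameters (podSubnet must be the same range as Network in kube-flannel.yml):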
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.131.83
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: k8s-master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: 172.16.131.87:1088/kubernetes-deploy
kind: ClusterConfiguration
kubernetesVersion: 1.25.6
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
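The podSubnet in kubeadm.yaml has to be the same range as the Network value in flannel's net-conf.json (10.244.0.0/16 in the manifest used later); a single mistyped digit leaves pods without working cross-node networking. The sketch below compares the two files before running kubeadm init. The function names are hypothetical, and the parsing is deliberately simple: it assumes one podSubnet line and one "Network" line.

```shell
#!/usr/bin/env bash
# pod_subnet KUBEADM_YAML: prints the value of the first podSubnet: line.
pod_subnet() {
  sed -n 's/^[[:space:]]*podSubnet:[[:space:]]*//p' "$1" | head -n 1
}

# flannel_network FLANNEL_YAML: prints the first "Network" value.
flannel_network() {
  sed -n 's/.*"Network":[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# check_pod_cidr KUBEADM_YAML FLANNEL_YAML
# Succeeds when both files name the same pod network range.
check_pod_cidr() {
  local a b
  a="$(pod_subnet "$1")"
  b="$(flannel_network "$2")"
  if [ -n "$a" ] && [ "$a" = "$b" ]; then
    echo "OK: $a"
  else
    echo "MISMATCH: kubeadm=$a flannel=$b"
    return 1
  fi
}

# Example:
#   check_pod_cidr kubeadm.yaml kube-flannel.yml
```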
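6) Edit the containerd CRI configuration file:
vi /etc/containerd/config.toml
Comment out the line disabled_plugins = ["cri"], then restart containerd (systemctl restart containerd).
7) Initialize Kubernetes: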
kubeadm init --config=kubeadm.yaml
[init] Using Kubernetes version: v1.25.6
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.1.0.1 172.16.131.83]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [172.16.131.83 127.0.0.1 ::1]
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.002750 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: f93xna.7kr79tn4z6fmzf23
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 \
--discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590
8) Set up kubectl access as prompted by the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
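Note:
To re-initialize the cluster, reset it first with the command below. The --cri-socket option is required, otherwise the command fails, and the path must match the socket defined in cri-docker.socket (cri-dockerd.sock):
kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock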
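The same consideration is likely to apply when worker nodes join: on hosts where more than one CRI endpoint is present (containerd is installed alongside Docker), kubeadm may ask for an explicit socket. The sketch below appends the option to a join command. The helper name is hypothetical, and the token and hash placeholders must be replaced with the values printed by kubeadm init.

```shell
#!/usr/bin/env bash
CRI_SOCKET="unix:///var/run/cri-dockerd.sock"

# with_cri_socket COMMAND...
# Prints COMMAND with the --cri-socket option for cri-dockerd appended.
with_cri_socket() {
  echo "$* --cri-socket ${CRI_SOCKET}"
}

with_cri_socket kubeadm join 172.16.131.83:6443 --token '<token>' \
  --discovery-token-ca-cert-hash 'sha256:<hash>'
```

If the original join command is no longer at hand, kubeadm token create --print-join-command on the master prints a fresh one.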
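9) Configure the flannel (or Calico) network, which provides container networking across hosts:
On the internet-connected relay host, download the flannel YAML manifest (use the raw file URL; the github.com/.../blob/... page returns HTML):
wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
10) Edit kube-flannel.yml (the image fields point at the private registry):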
---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-flannel
labels:
tier: node
k8s-app: flannel
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-flannel
labels:
tier: node
app: flannel
k8s-app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni-plugin
image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel-cni-plugin:v1.1.2
#image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.2
command:
- cp
args:
- -f
- /flannel
- /opt/cni/bin/flannel
volumeMounts:
- name: cni-plugin
mountPath: /opt/cni/bin
- name: install-cni
image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel:v0.15.1
#image: docker.io/rancher/mirrored-flannelcni-flannel:v0.21.4
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: 172.16.131.87:1088/kubernetes-v1.24.12-deploy/flannel:v0.15.1
#image: docker.io/rancher/mirrored-flannelcni-flannel:v0.21.4
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
- name: xtables-lock
mountPath: /run/xtables.lock
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni-plugin
hostPath:
path: /opt/cni/bin
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
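The image changes in this manifest can also be made with a small filter instead of by hand. This is only a sketch under stated assumptions: rewrite_flannel_images is a hypothetical name, the upstream prefix docker.io/rancher/mirrored-flannelcni- is taken from the commented-out image lines above, and the tag is left unchanged, so it still has to match whatever tag was actually pushed to Harbor.

```shell
#!/bin/sh
# rewrite_flannel_images is a hypothetical helper. It reads a manifest on
# stdin and rewrites active (uncommented) image references of the form
#   docker.io/rancher/mirrored-flannelcni-<name>:<tag>
# to <registry>/<name>:<tag>. Commented "#image:" lines and all other lines
# pass through unchanged.
rewrite_flannel_images() {
    registry="$1"
    sed "s#^\([[:space:]]*image:[[:space:]]*\)docker\.io/rancher/mirrored-flannelcni-#\1${registry}/#"
}
```

It might be used as `rewrite_flannel_images 172.16.131.87:1088/kubernetes-v1.24.12-deploy < kube-flannel.yml.orig > kube-flannel.yml`, where kube-flannel.yml.orig is an assumed name for the untouched download.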
11) Apply the flannel network configuration on the master:
kubectl apply -f /apps/flannel/kube-flannel.yml
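12) Following the prompt above, run the join command on each of the other nodes to add it to the Kubernetes cluster (append --cri-socket unix:///var/run/cri-dockerd.sock to the command, otherwise the join fails):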
kubeadm join 172.16.131.83:6443 --token f93xna.7kr79tn4z6fmzf23 --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590 --cri-socket unix:///var/run/cri-dockerd.sock
Alternatively, generate a join command on the master node, add the --cri-socket option to it, and copy it to the other nodes to run:
kubeadm token create --print-join-command
kubeadm join 172.16.131.83:6443 --token r7oaex.qgqvdqvlyuubt5aw --discovery-token-ca-cert-hash sha256:40dba9e45ffce1c08d415c44a962974f9081c1ae9e74e922a2410d4e0ebac590 --cri-socket unix:///var/run/cri-dockerd.sock
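Because the generated join command has to be extended with the CRI socket before it is run, a small wrapper can take care of that. The following is a sketch: with_cri_socket is a hypothetical helper, and the default socket path is the one used in this walkthrough.

```shell
#!/bin/sh
# with_cri_socket is a hypothetical helper: it appends a --cri-socket option
# to a kubeadm join command line unless one is already present.
with_cri_socket() {
    join_cmd="$1"
    socket="${2:-unix:///var/run/cri-dockerd.sock}"
    case "$join_cmd" in
        *--cri-socket*) printf '%s\n' "$join_cmd" ;;
        *)              printf '%s --cri-socket %s\n' "$join_cmd" "$socket" ;;
    esac
}
```

On the master, `with_cri_socket "$(kubeadm token create --print-join-command)"` would print a complete command that can be pasted on each worker node.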
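13) After the command has run on a node, output like the following indicates that the node joined the cluster successfully. Any node added to the Kubernetes cluster later follows the same procedure: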
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.1. Latest validated version: 18.09
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
14) Check the cluster status:
kubectl get nodes
NAME         STATUS   ROLES    AGE   VERSION
k8s-master   Ready    master   1h    v1.25.6
k8s-node1    Ready    <none>   2h    v1.25.6
k8s-node2    Ready    <none>   1h    v1.25.6
k8s-node3    Ready    <none>   1h    v1.25.6
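For a scripted version of this check, the node list can be filtered for anything that is not Ready. This is a minimal sketch with a hypothetical function name, not_ready_nodes; it treats any STATUS other than exactly "Ready" (for example "Ready,SchedulingDisabled") as not ready.

```shell
#!/bin/sh
# not_ready_nodes is a hypothetical helper: it reads `kubectl get nodes`
# output on stdin, skips the header row, and prints the name of every node
# whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Running `kubectl get nodes | not_ready_nodes` prints nothing when every node is healthy.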
Note:
After my deployment finished, the worker nodes stayed in the NotReady state for a long time:
NAME         STATUS     ROLES    AGE   VERSION
k8s-master   Ready      master   1h    v1.25.6
k8s-node1    NotReady   <none>   2h    v1.25.6
k8s-node2    NotReady   <none>   1h    v1.25.6
k8s-node3    NotReady   <none>   1h    v1.25.6
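The cluster is not healthy in this state, so it needs troubleshooting. Running the following command on a Kubernetes node follows the kubelet log, which makes the errors easier to find: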
journalctl -u kubelet -f
In the log I saw two errors:
k8s-node1 kubelet[27242]: I1014 11:17:29.409068 27242 cni.go:239] "Unable to update cni config" err="no networks found in /etc/cni/net.d"
Oct 14 11:17:29 k8s-node1 kubelet[27242]: E1014 11:17:29.996079 27242 kubelet.go:2332] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
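The kubelet log is noisy, so it can be convenient to filter it for just these two symptoms. The sketch below assumes the message texts quoted above; cni_errors is a hypothetical function name.

```shell
#!/bin/sh
# cni_errors is a hypothetical helper: it reads kubelet log lines on stdin
# and prints only those that mention the two CNI symptoms quoted above.
cni_errors() {
    grep -E 'no networks found in /etc/cni/net\.d|cni config uninitialized'
}
```

One possible use is `journalctl -u kubelet --no-pager | cni_errors | tail -n 5`, which shows the most recent matching lines without following the log.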
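Fix for problem 1: the other nodes were missing the CNI network configuration file, so copying it from the master node to each of them resolves the error: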
scp -rp /etc/cni k8s-node1:/etc/
scp -rp /etc/cni k8s-node2:/etc/
scp -rp /etc/cni k8s-node3:/etc/
All nodes now report Ready, so the Kubernetes cluster is in a correct state.
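To see whether a node is affected before or after the copy, the CNI configuration directory can be checked directly. This is a sketch with a hypothetical helper, has_cni_config; the file extensions are the ones CNI loads from its configuration directory.

```shell
#!/bin/sh
# has_cni_config is a hypothetical helper: it returns 0 if the directory
# holds at least one CNI network configuration file (*.conf, *.conflist or
# *.json) and 1 otherwise.
has_cni_config() {
    dir="${1:-/etc/cni/net.d}"
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -f "$f" ] && return 0
    done
    return 1
}
```

Run on a worker node, `has_cni_config || echo "no CNI config on $(hostname)"` reports the condition behind the "no networks found" message. According to the manifest in step 10, the install-cni init container copies the configuration to /etc/cni/net.d/10-flannel.conflist on every node, so a missing file may also mean that the flannel pod has not started on that node.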
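Problem 2 arose because, when the registry was set up and the k8s images were uploaded, the kubernetes-deploy project was created as private, so the images could not be pulled. The simplest fix is to make the project public in Harbor (pulling images from a private project will be discussed later).
This completes the offline kubeadm deployment of Kubernetes on RHEL 7. The next step is to deploy nginx to verify that the whole cluster works.
Appendix: sshUserSetup.sh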
#!/bin/sh
# Nitin Jerath - Aug 2005
#Usage sshUserSetup.sh -user [ -hosts \"\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]
#eg. sshUserSetup.sh -hosts "host1 host2" -user njerath -advanced
#This script is used to setup SSH connectivity from the host on which it is
# run to the specified remote hosts. After this script is run, the user can use
# SSH to run commands on the remote hosts or copy files between the local host
# and the remote hosts without being prompted for passwords or confirmations.
# The list of remote hosts and the user name on the remote host is specified as
# a command line parameter to the script. Note that in case the user on the
# remote host has its home directory NFS mounted or shared across the remote
# hosts, this script should be used with -shared option.
#Specifying the -advanced option on the command line would result in SSH
# connectivity being setup among the remote hosts which means that SSH can be
# used to run commands on one remote host from the other remote host or copy
# files between the remote hosts without being prompted for passwords or
# confirmations.
#Please note that the script would remove write permissions on the remote hosts
#for the user home directory and ~/.ssh directory for "group" and "others". This
# is an SSH requirement. The user would be explicitly informed about this by
# the script and prompted to continue. In case the user presses no, the script
# would exit. In case the user does not want to be prompted, he can use the
# -confirm option.
# As a part of the setup, the script would use SSH to create files within ~/.ssh
# directory of the remote node and to setup the requisite permissions. The
#script also uses SCP to copy the local host public key to the remote hosts so
# that the remote hosts trust the local host for SSH. At the time, the script
#performs these steps, SSH connectivity has not been completely setup hence
# the script would prompt the user for the remote host password.
#For each remote host, for remote users with non-shared homes this would be
# done once for SSH and once for SCP. If the number of remote hosts are x, the
# user would be prompted 2x times for passwords. For remote users with shared
# homes, the user would be prompted only twice, once each for SCP and SSH.
#For security reasons, the script does not save passwords and reuse it. Also,
# for security reasons, the script does not accept passwords redirected from a
#file. The user has to key in the confirmations and passwords at the prompts.
#The -verify option means that the user just wants to verify whether SSH has
#been set up. In this case, the script would not setup SSH but would only check
# whether SSH connectivity has been setup from the local host to the remote
# hosts. The script would run the date command on each remote host using SSH. In
# case the user is prompted for a password or sees a warning message for a
#particular host, it means SSH connectivity has not been setup correctly for
# that host.
#In case the -verify option is not specified, the script would setup SSH and
#then do the verification as well.
#In case the user specifies the -exverify option, an exhaustive verification
# would be done. In that case, the following would be checked:
# 1. SSH connectivity from local host to all remote hosts.
# 2. SSH connectivity from each remote host to itself and other remote hosts.
#echo Parsing command line arguments
numargs=$#
ADVANCED=false
HOSTNAME=`hostname`
CONFIRM=no
SHARED=false
i=1
USR=$USER
if test -z "$TEMP"
then
TEMP=/tmp
fi
IDENTITY=id_rsa
LOGFILE=$TEMP/sshUserSetup_`date +%F-%H-%M-%S`.log
VERIFY=false
EXHAUSTIVE_VERIFY=false
HELP=false
PASSPHRASE=no
RERUN_SSHKEYGEN=no
NO_PROMPT_PASSPHRASE=no
while [ $i -le $numargs ]
do
j=$1
if [ $j = "-hosts" ]
then
HOSTS=$2
shift 1
i=`expr $i + 1`
fi
if [ $j = "-user" ]
then
USR=$2
shift 1
i=`expr $i + 1`
fi
if [ $j = "-logfile" ]
then
LOGFILE=$2
shift 1
i=`expr $i + 1`
fi
if [ $j = "-confirm" ]
then
CONFIRM=yes
fi
if [ $j = "-hostfile" ]
then
CLUSTER_CONFIGURATION_FILE=$2
shift 1
i=`expr $i + 1`
fi
if [ $j = "-usePassphrase" ]
then
PASSPHRASE=yes
fi
if [ $j = "-noPromptPassphrase" ]
then
NO_PROMPT_PASSPHRASE=yes
fi
if [ $j = "-shared" ]
then
SHARED=true
fi
if [ $j = "-exverify" ]
then
EXHAUSTIVE_VERIFY=true
fi
if [ $j = "-verify" ]
then
VERIFY=true
fi
if [ $j = "-advanced" ]
then
ADVANCED=true
fi
if [ $j = "-help" ]
then
HELP=true
fi
i=`expr $i + 1`
shift 1
done
if [ $HELP = "true" ]
then
echo "Usage $0 -user [ -hosts \"\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]"
echo "This script is used to setup SSH connectivity from the host on which it is run to the specified remote hosts. After this script is run, the user can use SSH to run commands on the remote hosts or copy files between the local host and the remote hosts without being prompted for passwords or confirmations. The list of remote hosts and the user name on the remote host is specified as a command line parameter to the script. "
echo "-user : User on remote hosts. "
echo "-hosts : Space separated remote hosts list. "
echo "-hostfile : The user can specify the host names either through the -hosts option or by specifying the absolute path of a cluster configuration file. A sample host file contents are below: "
echo
echo " stacg30 stacg30int 10.1.0.0 stacg30v -"
echo " stacg34 stacg34int 10.1.0.1 stacg34v -"
echo
echo " The first column in each row of the host file will be used as the host name."
echo
echo "-usePassphrase : The user wants to set up passphrase to encrypt the private key on the local host. "
echo "-noPromptPassphrase : The user does not want to be prompted for passphrase related questions. This is for users who want the default behavior to be followed."
echo "-shared : In case the user on the remote host has its home directory NFS mounted or shared across the remote hosts, this script should be used with -shared option. "
echo " It is possible for the user to determine whether a user's home directory is shared or non-shared. Let us say we want to determine that user user1's home directory is shared across hosts A, B and C."
echo " Follow the following steps:"
echo " 1. On host A, touch ~user1/checkSharedHome.tmp"
echo " 2. On hosts B and C, ls -al ~user1/checkSharedHome.tmp"
echo " 3. If the file is present on hosts B and C in ~user1 directory and"
echo " is identical on all hosts A, B, C, it means that the user's home "
echo " directory is shared."
echo " 4. On host A, rm -f ~user1/checkSharedHome.tmp"
echo " In case the user accidentally passes -shared option for non-shared homes or vice versa, SSH connectivity would only be set up for a subset of the hosts. The user would have to re-run the setup script with the correct option to rectify this problem."
echo "-advanced : Specifying the -advanced option on the command line would result in SSH connectivity being setup among the remote hosts which means that SSH can be used to run commands on one remote host from the other remote host or copy files between the remote hosts without being prompted for passwords or confirmations."
echo "-confirm: The script would remove write permissions on the remote hosts for the user home directory and ~/.ssh directory for "group" and "others". This is an SSH requirement. The user would be explicitly informed about this by the script and prompted to continue. In case the user presses no, the script would exit. In case the user does not want to be prompted, he can use -confirm option."
echo "As a part of the setup, the script would use SSH to create files within ~/.ssh directory of the remote node and to setup the requisite permissions. The script also uses SCP to copy the local host public key to the remote hosts so that the remote hosts trust the local host for SSH. At the time, the script performs these steps, SSH connectivity has not been completely setup hence the script would prompt the user for the remote host password. "
echo "For each remote host, for remote users with non-shared homes this would be done once for SSH and once for SCP. If the number of remote hosts are x, the user would be prompted 2x times for passwords. For remote users with shared homes, the user would be prompted only twice, once each for SCP and SSH. For security reasons, the script does not save passwords and reuse it. Also, for security reasons, the script does not accept passwords redirected from a file. The user has to key in the confirmations and passwords at the prompts. "
echo "-verify : -verify option means that the user just wants to verify whether SSH has been set up. In this case, the script would not setup SSH but would only check whether SSH connectivity has been setup from the local host to the remote hosts. The script would run the date command on each remote host using SSH. In case the user is prompted for a password or sees a warning message for a particular host, it means SSH connectivity has not been setup correctly for that host. In case the -verify option is not specified, the script would setup SSH and then do the verification as well. "
echo "-exverify : In case the user specifies the -exverify option, an exhaustive verification for all hosts would be done. In that case, the following would be checked: "
echo " 1. SSH connectivity from local host to all remote hosts. "
echo " 2. SSH connectivity from each remote host to itself and other remote hosts. "
echo The -exverify option can be used in conjunction with the -verify option as well to do an exhaustive verification once the setup has been done.
echo "Taking some examples: Let us say local host is Z, remote hosts are A,B and C. Local user is njerath. Remote users are racqa(non-shared), aime(shared)."
echo "$0 -user racqa -hosts "A B C" -advanced -exverify -confirm"
echo "Script would set up connectivity from Z -> A, Z -> B, Z -> C, A -> A, A -> B, A -> C, B -> A, B -> B, B -> C, C -> A, C -> B, C -> C."
echo "Since user has given -exverify option, all these scenario would be verified too."
echo
echo "Now the user runs : $0 -user racqa -hosts "A B C" -verify"
echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. Also, since -exverify or -advanced options are not given, script would only verify connectivity from Z -> A, Z -> B, Z -> C"
echo "Now the user runs : $0 -user racqa -hosts "A B C" -verify -advanced"
echo "Since -verify option is given, no SSH setup would be done, only verification of existing setup. Also, since -advanced options is given, script would verify connectivity from Z -> A, Z -> B, Z -> C, A-> A, A->B, A->C, A->D"
echo "Now the user runs:"
echo "$0 -user aime -hosts "A B C" -confirm -shared"
echo "Script would set up connectivity between Z->A, Z->B, Z->C only since advanced option is not given."
echo "All these scenarios would be verified too."
exit
fi
if test -z "$HOSTS"
then
if test -n "$CLUSTER_CONFIGURATION_FILE" && test -f "$CLUSTER_CONFIGURATION_FILE"
then
HOSTS=`awk '$1 !~ /^#/ { str = str " " $1 } END { print str }' $CLUSTER_CONFIGURATION_FILE`
elif ! test -f "$CLUSTER_CONFIGURATION_FILE"
then
echo "Please specify a valid and existing cluster configuration file."
fi
fi
if test -z "$HOSTS" || test -z $USR
then
echo "Either user name or host information is missing"
echo "Usage $0 -user [ -hosts \"\" | -hostfile ] [ -advanced ] [ -verify] [ -exverify ] [ -logfile ] [-confirm] [-shared] [-help] [-usePassphrase] [-noPromptPassphrase]"
exit 1
fi
if [ -d $LOGFILE ]; then
echo $LOGFILE is a directory, setting logfile to $LOGFILE/ssh.log
LOGFILE=$LOGFILE/ssh.log
fi
echo The output of this script is also logged into $LOGFILE | tee -a $LOGFILE
if [ `echo $?` != 0 ]; then
echo Error writing to the logfile $LOGFILE, Exiting
exit 1
fi
echo Hosts are $HOSTS | tee -a $LOGFILE
echo user is $USR | tee -a $LOGFILE
SSH="/usr/bin/ssh"
SCP="/usr/bin/scp"
SSH_KEYGEN="/usr/bin/ssh-keygen"
calculateOS()
{
platform=`uname -s`
case "$platform"
in
"SunOS") os=solaris;;
"Linux") os=linux;;
"HP-UX") os=hpunix;;
"AIX") os=aix;;
*) echo "Sorry, $platform is not currently supported." | tee -a $LOGFILE
exit 1;;
esac
echo "Platform:- $platform " | tee -a $LOGFILE
}
calculateOS
BITS=1024
ENCR="rsa"
deadhosts=""
alivehosts=""
if [ $platform = "Linux" ]
then
PING="/bin/ping"
else
PING="/usr/sbin/ping"
fi
#bug 9044791
if [ -n "$SSH_PATH" ]; then
SSH=$SSH_PATH
fi
if [ -n "$SCP_PATH" ]; then
SCP=$SCP_PATH
fi
if [ -n "$SSH_KEYGEN_PATH" ]; then
SSH_KEYGEN=$SSH_KEYGEN_PATH
fi
if [ -n "$PING_PATH" ]; then
PING=$PING_PATH
fi
PATH_ERROR=0
if test ! -x $SSH ; then
echo "ssh not found at $SSH. Please set the variable SSH_PATH to the correct location of ssh and retry."
PATH_ERROR=1
fi
if test ! -x $SCP ; then
echo "scp not found at $SCP. Please set the variable SCP_PATH to the correct location of scp and retry."
PATH_ERROR=1
fi
if test ! -x $SSH_KEYGEN ; then
echo "ssh-keygen not found at $SSH_KEYGEN. Please set the variable SSH_KEYGEN_PATH to the correct location of ssh-keygen and retry."
PATH_ERROR=1
fi
if test ! -x $PING ; then
echo "ping not found at $PING. Please set the variable PING_PATH to the correct location of ping and retry."
PATH_ERROR=1
fi
if [ $PATH_ERROR = 1 ]; then
echo "ERROR: one or more of the required binaries not found, exiting"
exit 1
fi
#9044791 end
echo Checking if the remote hosts are reachable | tee -a $LOGFILE
for host in $HOSTS
do
if [ $platform = "SunOS" ]; then
$PING -s $host 5 5
elif [ $platform = "HP-UX" ]; then
$PING $host -n 5 -m 5
else
$PING -c 5 -w 5 $host
fi
exitcode=`echo $?`
if [ $exitcode = 0 ]
then
alivehosts="$alivehosts $host"
else
deadhosts="$deadhosts $host"
fi
done
if test -z "$deadhosts"
then
echo Remote host reachability check succeeded. | tee -a $LOGFILE
echo The following hosts are reachable: $alivehosts. | tee -a $LOGFILE
echo The following hosts are not reachable: $deadhosts. | tee -a $LOGFILE
echo All hosts are reachable. Proceeding further... | tee -a $LOGFILE
else
echo Remote host reachability check failed. | tee -a $LOGFILE
echo The following hosts are reachable: $alivehosts. | tee -a $LOGFILE
echo The following hosts are not reachable: $deadhosts. | tee -a $LOGFILE
echo Please ensure that all the hosts are up and re-run the script. | tee -a $LOGFILE
echo Exiting now... | tee -a $LOGFILE
exit 1
fi
firsthost=`echo $HOSTS | awk '{print $1}; END { }'`
echo firsthost $firsthost
numhosts=`echo $HOSTS | awk '{ }; END {print NF}'`
echo numhosts $numhosts
if [ $VERIFY = "true" ]
then
echo Since user has specified -verify option, SSH setup would not be done. Only, existing SSH setup would be verified. | tee -a $LOGFILE
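# Nothing to set up here (not inside a loop, so use a no-op instead of continue);
# execution falls through to the verification step.
: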
else
echo The script will setup SSH connectivity from the host ''`hostname`'' to all | tee -a $LOGFILE
echo the remote hosts. After the script is executed, the user can use SSH to run | tee -a $LOGFILE
echo commands on the remote hosts or copy files between this host ''`hostname`'' | tee -a $LOGFILE
echo and the remote hosts without being prompted for passwords or confirmations. | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo NOTE 1: | tee -a $LOGFILE
echo As part of the setup procedure, this script will use 'ssh' and 'scp' to copy | tee -a $LOGFILE
echo files between the local host and the remote hosts. Since the script does not | tee -a $LOGFILE
echo store passwords, you may be prompted for the passwords during the execution of | tee -a $LOGFILE
echo the script whenever 'ssh' or 'scp' is invoked. | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo NOTE 2: | tee -a $LOGFILE
echo "AS PER SSH REQUIREMENTS, THIS SCRIPT WILL SECURE THE USER HOME DIRECTORY" | tee -a $LOGFILE
echo AND THE .ssh DIRECTORY BY REVOKING GROUP AND WORLD WRITE PRIVILEGES TO THESE | tee -a $LOGFILE
echo "directories." | tee -a $LOGFILE
echo | tee -a $LOGFILE
echo "Do you want to continue and let the script make the above mentioned changes (yes/no)?" | tee -a $LOGFILE
if [ "$CONFIRM" = "no" ]
then
read CONFIRM
else
echo "Confirmation provided on the command line" | tee -a $LOGFILE
fi
echo | tee -a $LOGFILE
echo The user chose ''$CONFIRM'' | tee -a $LOGFILE
if [ -z "$CONFIRM" -o "$CONFIRM" != "yes" -a "$CONFIRM" != "no" ]
then
echo "You haven't specified proper input. Please enter 'yes' or 'no'. Exiting...."
exit 0
fi
if [ "$CONFIRM" = "no" ]
then
echo "SSH setup is not done." | tee -a $LOGFILE
exit 1
else
if [ $NO_PROMPT_PASSPHRASE = "yes" ]
then
echo "User chose to skip passphrase related questions." | tee -a $LOGFILE
else
if [ $SHARED = "true" ]
then
hostcount=`expr ${numhosts} + 1`
PASSPHRASE_PROMPT=`expr 2 \* $hostcount`
else
PASSPHRASE_PROMPT=`expr 2 \* ${numhosts}`
fi
echo "Please specify if you want to specify a passphrase for the private key this script will create for the local host. Passphrase is used to encrypt the private key and makes SSH much more secure. Type 'yes' or 'no' and then press enter. In case you press 'yes', you would need to enter the passphrase whenever the script executes ssh or scp. $PASSPHRASE " | tee -a $LOGFILE
echo "The estimated number of times the user would be prompted for a passphrase is $PASSPHRASE_PROMPT. In addition, if the private-public files are also newly created, the user would have to specify the passphrase on one additional occasion. " | tee -a $LOGFILE
echo "Enter 'yes' or 'no'." | tee -a $LOGFILE
if [ "$PASSPHRASE" = "no" ]
then
read PASSPHRASE
else
echo "Confirmation provided on the command line" | tee -a $LOGFILE
fi
echo | tee -a $LOGFILE
echo The user chose ''$PASSPHRASE'' | tee -a $LOGFILE
if [ -z "$PASSPHRASE" -o "$PASSPHRASE" != "yes" -a "$PASSPHRASE" != "no" ]
then
echo "You haven't specified whether to use Passphrase or not. Please specify 'yes' or 'no'. Exiting..."
exit 0
fi
if [ "$PASSPHRASE" = "yes" ]
then
RERUN_SSHKEYGEN="yes"
#Checking for existence of ${IDENTITY} file
if test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}
then
echo "The files containing the client public and private keys already exist on the local host. The current private key may or may not have a passphrase associated with it. In case you remember the passphrase and do not want to re-run ssh-keygen, press 'no' and enter. If you press 'no', the script will not attempt to create any new public/private key pairs. If you press 'yes', the script will remove the old private/public key files existing and create new ones prompting the user to enter the passphrase. If you enter 'yes', any previous SSH user setups would be reset. If you press 'change', the script will associate a new passphrase with the old keys." | tee -a $LOGFILE
echo "Press 'yes', 'no' or 'change'" | tee -a $LOGFILE
read RERUN_SSHKEYGEN
echo The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE
if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ]
then
echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..."
exit 0;
fi
fi
else
if test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}
then
echo "The files containing the client public and private keys already exist on the local host. The current private key may have a passphrase associated with it. In case you find using a passphrase inconvenient (although it is more secure), you can change it to an empty one through this script. Press 'change' if you want the script to change the passphrase for you. Press 'no' if you want to use your old passphrase, if you had one."
read RERUN_SSHKEYGEN
echo The user chose ''$RERUN_SSHKEYGEN'' | tee -a $LOGFILE
if [ -z "$RERUN_SSHKEYGEN" -o "$RERUN_SSHKEYGEN" != "yes" -a "$RERUN_SSHKEYGEN" != "no" -a "$RERUN_SSHKEYGEN" != "change" ]
then
echo "You haven't specified whether to re-run 'ssh-keygen' or not. Please enter 'yes' , 'no' or 'change'. Exiting..."
exit 0
fi
fi
fi
fi
echo Creating .ssh directory on local host, if not present already | tee -a $LOGFILE
mkdir -p $HOME/.ssh | tee -a $LOGFILE
echo Creating authorized_keys file on local host | tee -a $LOGFILE
touch $HOME/.ssh/authorized_keys | tee -a $LOGFILE
echo Changing permissions on authorized_keys to 644 on local host | tee -a $LOGFILE
chmod 644 $HOME/.ssh/authorized_keys | tee -a $LOGFILE
mv -f $HOME/.ssh/authorized_keys $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILE
echo Creating known_hosts file on local host | tee -a $LOGFILE
touch $HOME/.ssh/known_hosts | tee -a $LOGFILE
echo Changing permissions on known_hosts to 644 on local host | tee -a $LOGFILE
chmod 644 $HOME/.ssh/known_hosts | tee -a $LOGFILE
mv -f $HOME/.ssh/known_hosts $HOME/.ssh/known_hosts.tmp | tee -a $LOGFILE
echo Creating config file on local host | tee -a $LOGFILE
echo If a config file exists already at $HOME/.ssh/config, it would be backed up to $HOME/.ssh/config.backup.
echo "Host *" > $HOME/.ssh/config.tmp | tee -a $LOGFILE
echo "ForwardX11 no" >> $HOME/.ssh/config.tmp | tee -a $LOGFILE
if test -f $HOME/.ssh/config
then
cp -f $HOME/.ssh/config $HOME/.ssh/config.backup
fi
mv -f $HOME/.ssh/config.tmp $HOME/.ssh/config | tee -a $LOGFILE
chmod 644 $HOME/.ssh/config
if [ "$RERUN_SSHKEYGEN" = "yes" ]
then
echo Removing old private/public keys on local host | tee -a $LOGFILE
rm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE
rm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILE
echo Running SSH keygen on local host | tee -a $LOGFILE
$SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE
elif [ "$RERUN_SSHKEYGEN" = "change" ]
then
echo Running SSH Keygen on local host to change the passphrase associated with the existing private key | tee -a $LOGFILE
$SSH_KEYGEN -p -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE
elif test -f $HOME/.ssh/${IDENTITY}.pub && test -f $HOME/.ssh/${IDENTITY}
then
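# Both key files already exist and no change was requested; keep them.
# (Not inside a loop, so use a no-op instead of continue.)
: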
else
echo Removing old private/public keys on local host | tee -a $LOGFILE
rm -f $HOME/.ssh/${IDENTITY} | tee -a $LOGFILE
rm -f $HOME/.ssh/${IDENTITY}.pub | tee -a $LOGFILE
echo Running SSH keygen on local host with empty passphrase | tee -a $LOGFILE
$SSH_KEYGEN -t $ENCR -b $BITS -f $HOME/.ssh/${IDENTITY} -N '' | tee -a $LOGFILE
fi
if [ $SHARED = "true" ]
then
if [ $USER = $USR ]
then
#No remote operations required
echo Remote user is same as local user | tee -a $LOGFILE
REMOTEHOSTS=""
chmod og-w $HOME $HOME/.ssh | tee -a $LOGFILE
else
REMOTEHOSTS="${firsthost}"
fi
else
REMOTEHOSTS="$HOSTS"
fi
for host in $REMOTEHOSTS
do
echo Creating .ssh directory and setting permissions on remote host $host | tee -a $LOGFILE
echo "THE SCRIPT WOULD ALSO BE REVOKING WRITE PERMISSIONS FOR "group" AND "others" ON THE HOME DIRECTORY FOR $USR. THIS IS AN SSH REQUIREMENT." | tee -a $LOGFILE
echo The script would create ~$USR/.ssh/config file on remote host $host. If a config file exists already at ~$USR/.ssh/config, it would be backed up to ~$USR/.ssh/config.backup. | tee -a $LOGFILE
echo The user may be prompted for a password here since the script would be running SSH on host $host. | tee -a $LOGFILE
$SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c \" mkdir -p .ssh ; chmod og-w . .ssh; touch .ssh/authorized_keys .ssh/known_hosts; chmod 644 .ssh/authorized_keys .ssh/known_hosts; cp .ssh/authorized_keys .ssh/authorized_keys.tmp ; cp .ssh/known_hosts .ssh/known_hosts.tmp; echo \\"Host *\\" > .ssh/config.tmp; echo \\"ForwardX11 no\\" >> .ssh/config.tmp; if test -f .ssh/config ; then cp -f .ssh/config .ssh/config.backup; fi ; mv -f .ssh/config.tmp .ssh/config\"" | tee -a $LOGFILE
echo Done with creating .ssh directory and setting permissions on remote host $host. | tee -a $LOGFILE
done
for host in $REMOTEHOSTS
do
echo Copying local host public key to the remote host $host | tee -a $LOGFILE
echo The user may be prompted for a password or passphrase here since the script would be using SCP for host $host. | tee -a $LOGFILE
$SCP $HOME/.ssh/${IDENTITY}.pub $USR@$host:.ssh/authorized_keys | tee -a $LOGFILE
echo Done copying local host public key to the remote host $host | tee -a $LOGFILE
done
cat $HOME/.ssh/${IDENTITY}.pub >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE
for host in $HOSTS
do
if [ "$ADVANCED" = "true" ]
then
echo Creating keys on remote host $host if they do not exist already. This is required to setup SSH on host $host. | tee -a $LOGFILE
if [ "$SHARED" = "true" ]
then
IDENTITY_FILE_NAME=${IDENTITY}_$host
COALESCE_IDENTITY_FILES_COMMAND="cat .ssh/${IDENTITY_FILE_NAME}.pub >> .ssh/authorized_keys"
else
IDENTITY_FILE_NAME=${IDENTITY}
fi
$SSH -o StrictHostKeyChecking=no -x -l $USR $host " /bin/sh -c \"if test -f .ssh/${IDENTITY_FILE_NAME}.pub && test -f .ssh/${IDENTITY_FILE_NAME}; then echo; else rm -f .ssh/${IDENTITY_FILE_NAME} ; rm -f .ssh/${IDENTITY_FILE_NAME}.pub ; $SSH_KEYGEN -t $ENCR -b $BITS -f .ssh/${IDENTITY_FILE_NAME} -N '' ; fi; ${COALESCE_IDENTITY_FILES_COMMAND} \"" | tee -a $LOGFILE
else
#At least get the host keys from all hosts for shared case - advanced option not set
if test $SHARED = "true" && test $ADVANCED = "false"
then
if [ "$PASSPHRASE" = "yes" ]
then
echo "The script will fetch the host keys from all hosts. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
fi
$SSH -o StrictHostKeyChecking=no -x -l $USR $host "/bin/sh -c true"
fi
fi
done
for host in $REMOTEHOSTS
do
if test $ADVANCED = "true" && test $SHARED = "false"
then
$SCP $USR@$host:.ssh/${IDENTITY}.pub $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILE
cat $HOME/.ssh/${IDENTITY}.pub.$host >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE
rm -f $HOME/.ssh/${IDENTITY}.pub.$host | tee -a $LOGFILE
fi
done
for host in $REMOTEHOSTS
do
if [ "$ADVANCED" = "true" ]
then
if [ "$SHARED" != "true" ]
then
echo Updating authorized_keys file on remote host $host | tee -a $LOGFILE
$SCP $HOME/.ssh/authorized_keys $USR@$host:.ssh/authorized_keys | tee -a $LOGFILE
fi
echo Updating known_hosts file on remote host $host | tee -a $LOGFILE
$SCP $HOME/.ssh/known_hosts $USR@$host:.ssh/known_hosts | tee -a $LOGFILE
fi
if [ "$PASSPHRASE" = "yes" ]
then
echo "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
fi
$SSH -x -l $USR $host "/bin/sh -c \"cat .ssh/authorized_keys.tmp >> .ssh/authorized_keys; cat .ssh/known_hosts.tmp >> .ssh/known_hosts; rm -f .ssh/known_hosts.tmp .ssh/authorized_keys.tmp\"" | tee -a $LOGFILE
done
cat $HOME/.ssh/known_hosts.tmp >> $HOME/.ssh/known_hosts | tee -a $LOGFILE
cat $HOME/.ssh/authorized_keys.tmp >> $HOME/.ssh/authorized_keys | tee -a $LOGFILE
#Added chmod to fix BUG NO 5238814
chmod 644 $HOME/.ssh/authorized_keys
#Fix for BUG NO 5157782
chmod 644 $HOME/.ssh/config
rm -f $HOME/.ssh/known_hosts.tmp $HOME/.ssh/authorized_keys.tmp | tee -a $LOGFILE
echo SSH setup is complete. | tee -a $LOGFILE
fi
fi
echo | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
echo Verifying SSH setup | tee -a $LOGFILE
echo =================== | tee -a $LOGFILE
echo The script will now run the 'date' command on the remote nodes using ssh | tee -a $LOGFILE
echo to verify that ssh is set up correctly. IF SSH IS SET UP CORRECTLY, | tee -a $LOGFILE
echo THERE SHOULD BE NO OUTPUT OTHER THAN THE DATE AND SSH SHOULD NOT ASK FOR | tee -a $LOGFILE
echo PASSWORDS. If you see any output other than date or are prompted for the | tee -a $LOGFILE
echo password, ssh is not setup correctly and you will need to resolve the | tee -a $LOGFILE
echo issue and set up ssh again. | tee -a $LOGFILE
echo The possible causes for failure could be: | tee -a $LOGFILE
echo 1. The server settings in /etc/ssh/sshd_config file do not allow ssh | tee -a $LOGFILE
echo for user $USR. | tee -a $LOGFILE
echo 2. The server may have disabled public key based authentication. | tee -a $LOGFILE
echo 3. The client public key on the server may be outdated. | tee -a $LOGFILE
echo 4. ~$USR or ~$USR/.ssh on the remote host may not be owned by $USR. | tee -a $LOGFILE
echo 5. User may not have passed -shared option for shared remote users or | tee -a $LOGFILE
echo may be passing the -shared option for non-shared remote users. | tee -a $LOGFILE
echo 6. If there is output in addition to the date, but no password is asked, | tee -a $LOGFILE
echo it may be a security alert shown as part of company policy. Append the | tee -a $LOGFILE
echo "additional text to the /sysman/prov/resources/ignoreMessages.txt file." | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
#read -t 30 dummy
for host in $HOSTS
do
echo --$host:-- | tee -a $LOGFILE
echo Running $SSH -x -l $USR $host date to verify SSH connectivity has been setup from local host to $host. | tee -a $LOGFILE
echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL. Please note that being prompted for a passphrase may be OK, but being prompted for a password is an ERROR." | tee -a $LOGFILE
if [ "$PASSPHRASE" = "yes" ]
then
echo "The script will run SSH on the remote machine $host. The user may be prompted for a passphrase here in case the private key has been encrypted with a passphrase." | tee -a $LOGFILE
fi
$SSH -l $USR $host "/bin/sh -c date" | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
done
if [ "$EXHAUSTIVE_VERIFY" = "true" ]
then
for clienthost in $HOSTS
do
if [ "$SHARED" = "true" ]
then
REMOTESSH="$SSH -i .ssh/${IDENTITY}_${clienthost}"
else
REMOTESSH=$SSH
fi
for serverhost in $HOSTS
do
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
echo Verifying SSH connectivity has been setup from $clienthost to $serverhost | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." | tee -a $LOGFILE
$SSH -l $USR $clienthost "$REMOTESSH $serverhost \"/bin/sh -c date\"" | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
done
echo -Verification from $clienthost complete- | tee -a $LOGFILE
done
else
if [ "$ADVANCED" = "true" ]
then
if [ "$SHARED" = "true" ]
then
REMOTESSH="$SSH -i .ssh/${IDENTITY}_${firsthost}"
else
REMOTESSH=$SSH
fi
for host in $HOSTS
do
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
echo Verifying SSH connectivity has been setup from $firsthost to $host | tee -a $LOGFILE
echo "IF YOU SEE ANY OTHER OUTPUT BESIDES THE OUTPUT OF THE DATE COMMAND OR IF YOU ARE PROMPTED FOR A PASSWORD HERE, IT MEANS SSH SETUP HAS NOT BEEN SUCCESSFUL." | tee -a $LOGFILE
$SSH -l $USR $firsthost "$REMOTESSH $host \"/bin/sh -c date\"" | tee -a $LOGFILE
echo ------------------------------------------------------------------------ | tee -a $LOGFILE
done
echo -Verification from $firsthost complete- | tee -a $LOGFILE
fi
fi
echo "SSH verification complete." | tee -a $LOGFILE