Reference: https://www.cnblogs.com/tz90/p/15467122.html
Kubernetes docs (Chinese): https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file
Excerpted from: https://www.cnblogs.com/tz90/
Master components form the cluster's control plane. They make global decisions about the cluster (for example, scheduling) and detect and respond to cluster events (for example, starting a new Pod when a Deployment's replicas field is not satisfied). Master components can run on any machine in the cluster, but for simplicity all master components are usually run on the same machine, and user containers are not run on that machine. See "Installing a highly available Kubernetes cluster".
kube-apiserver: this master component exposes the Kubernetes API and is the front end of the Kubernetes control plane. It can be scaled horizontally (by deploying more instances to meet performance requirements). Management tools such as kubectl, the Kubernetes Dashboard, and Kuboard manage the cluster through the Kubernetes API.
etcd: a consistent and highly available key-value store that holds all of the cluster's configuration data. Make sure you back up etcd's data. For more information, see the official etcd documentation.
kube-scheduler: this master component watches for newly created Pods that have not yet been assigned to a node and selects a suitable node for each of them to run on.
Factors that influence scheduling include resource requirements, hardware/software/policy constraints, affinity and anti-affinity rules, data locality, and interference between workloads.
kube-controller-manager: this master component runs all of the controllers.
Logically, each controller is a separate process, but to reduce complexity they are compiled together and run in a single process.
The controllers included in kube-controller-manager are the node controller, the replication controller, the endpoints controller, and the service account & token controllers.
cloud-controller-manager runs the controllers that interact with the underlying cloud provider. It was introduced in Kubernetes 1.6 and is still in alpha.
cloud-controller-manager runs only the controllers specific to your cloud provider. If you install Kubernetes following the documentation on www.kuboard.cn, cloud-controller-manager is not installed by default.
cloud-controller-manager lets the cloud provider's code and the Kubernetes code evolve independently. In earlier versions, the Kubernetes core code depended on cloud-provider-specific code; in later versions, code specific to a cloud provider is maintained by the provider itself and linked into cloud-controller-manager when running Kubernetes.
The following controllers have dependencies on the cloud provider:
Node controller: when a node stops responding, it calls the cloud provider's API to check whether the node's virtual machine has been deleted by the provider.
Translator's note: in an on-premises deployment of Kubernetes there is no cloud API to tell whether a node's machine has been removed, so after decommissioning a node you must delete the node object yourself with
kubectl delete node
to remove it from Kubernetes.
Route controller: configures network routes in the cloud provider's infrastructure.
Translator's note: in an on-premises deployment you must plan the cluster topology and configure routing yourself, as done in "Installing a highly available Kubernetes cluster offline".
Service controller: creates, updates, and deletes load balancers provided by the cloud provider.
Translator's note: on-premises deployments do not support Services of type LoadBalancer; if you need this feature, create a NodePort Service and configure a load balancer yourself.
Volume controller: creates, attaches, and mounts volumes, and coordinates with the cloud provider to orchestrate them.
Translator's note: in an on-premises deployment you must create and manage storage yourself and connect it to Kubernetes through storage classes, persistent volumes, and volume claims.
Translator's note: through cloud-controller-manager, Kubernetes integrates more tightly with cloud providers. For example, in Alibaba Cloud's Kubernetes service you can create and manage a cluster with a few clicks in the console. In an on-premises deployment you have to handle more of this yourself; fortunately, with a suitable guide these tasks are not difficult.
Node components run on every node (both master and worker nodes). They maintain running Pods and provide the Kubernetes runtime environment.
kubelet: an agent that runs on every node in the cluster. It receives PodSpec definitions through several possible channels and ensures that the containers described in them are running and healthy. The kubelet does not manage containers that were not created through Kubernetes.
kube-proxy is a network proxy that runs on every node in the cluster and is an essential part of implementing the Kubernetes Service concept.
kube-proxy maintains network rules on the node. These rules allow Pods to be reached correctly from inside and outside the cluster. If the operating system provides a packet filtering layer, kube-proxy uses it (iptables proxy mode); otherwise kube-proxy forwards the traffic itself (userspace proxy mode).
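If you want to see these rules in practice, here is a quick read-only sketch (the chain names are the ones kube-proxy itself creates; run it on any node once the cluster is up):
# In iptables mode, kube-proxy programs Service rules into the KUBE-SERVICES chain of the nat table:
iptables -t nat -L KUBE-SERVICES -n | head -n 20
# In IPVS mode (the mode used later in this guide), list the virtual servers instead:
ipvsadm -Ln | head -n 20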
The container runtime is responsible for running containers. Kubernetes supports several runtimes: Docker, containerd, CRI-O, rktlet, and any implementation of the Kubernetes Container Runtime Interface (CRI).
Addons use Kubernetes resources (DaemonSet, Deployment, and so on) to implement cluster features. Because they provide cluster-level features, the Kubernetes resources used by addons are placed in the kube-system namespace.
Some commonly used addons are described below; see the Addons page for a longer list.
Apart from the DNS addon, none of these addons are strictly required, but every Kubernetes cluster should have Cluster DNS.
Cluster DNS is a DNS server that supplements the DNS servers already in your environment and holds the DNS records for Kubernetes Services.
When Kubernetes starts a container, it automatically adds this DNS server to the container's DNS search configuration.
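As a quick check once the cluster is running (a sketch; <pod-name> stands for any running Pod), you can look at the resolver configuration Kubernetes injects into containers:
kubectl exec -it <pod-name> -- cat /etc/resolv.conf
# the nameserver should be the cluster DNS Service IP (10.0.0.254 later in this guide) and the
# search list should include <namespace>.svc.cluster.local, svc.cluster.local and cluster.local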
If you install Kubernetes following the documentation on www.kuboard.cn, CoreDNS is installed by default.
Dashboard is a web-based management UI for Kubernetes clusters; users can manage the cluster through it.
Kuboard is a microservice-oriented management UI for Kubernetes, offered as an alternative to Dashboard with a stronger focus on managing microservice workloads.
Container Resource Monitoring records container metrics in a time-series database and provides a UI for browsing them.
The cluster-level logging mechanism stores container logs in a central store and provides an interface for searching and browsing them.
I have just started learning Kubernetes and am building a cluster from the binaries. Most online tutorials cover v1.20, so I decided to go straight for the current latest release, v1.22.2; whatever errors I hit and fix now will help when I run into them at work later.
Start with a single-master deployment, then scale it out later to multiple master nodes and more worker nodes.
Node          IP             Co-located role
k8s-master1   192.168.0.3    etcd01
k8s-node1     192.168.0.4    etcd02
k8s-node2     192.168.0.5    etcd03
The nodes are reused here: the etcd cluster is installed on these same three machines.
If your lab environment uses different IPs, do not edit them by hand; use Ctrl+H to replace them with your own IPs everywhere, which avoids missing any.
Unless otherwise noted, run the following on all machines. In Xshell, Tools -> Send key input to all sessions makes this convenient.
If your Linux kernel is older than 5.x, upgrade it first (see https://www.cnblogs.com/tz90/p/15466646.html or https://www.cnblogs.com/xzkzzz/p/9627658.html).
yum install ntpdate -y
ntpdate time2.aliyun.com
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' > /etc/timezone
crontab -e
0 12 * * * /usr/sbin/ntpdate time2.aliyun.com
systemctl stop firewalld
systemctl disable firewalld
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
cat > /etc/sysctl.d/k8s_better.conf << EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
modprobe ip_conntrack
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s_better.conf
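The modprobe commands above only load the modules for the current boot. A small sketch (assuming a systemd-based distribution such as CentOS) to have them loaded automatically after every reboot:
cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
ip_conntrack
EOF
systemctl restart systemd-modules-load.service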
cat >> /etc/hosts << "EOF"
192.168.0.3 k8s-master1
192.168.0.4 k8s-node1
192.168.0.5 k8s-node2
EOF
cat /sys/class/dmi/id/product_uuid
hostnamectl set-hostname k8s-master1   # on 192.168.0.3
hostnamectl set-hostname k8s-node1     # on 192.168.0.4
hostnamectl set-hostname k8s-node2     # on 192.168.0.5
ssh-keygen -t rsa
ssh-copy-id [email protected]
ssh-copy-id [email protected]
reboot
yum install -y yum-utils
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum-config-manager --enable docker-ce-nightly
yum install docker-ce docker-ce-cli containerd.io -y
Allow forwarding of packets from any source IP (required by the Kubernetes network):
vi /usr/lib/systemd/system/docker.service
# Find the ExecStart=... line and add the following line directly above it:
ExecStartPost=/usr/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
cat > /etc/docker/daemon.json << EOF
{
"registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"]
}
EOF
systemctl daemon-reload
systemctl start docker
systemctl enable docker
systemctl status docker
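One pitfall worth checking here (my own note, not part of the original write-up): the kubelet configuration later in this guide sets cgroupDriver: systemd, while Docker defaults to the cgroupfs driver, and a mismatch prevents the kubelet from starting. A sketch of aligning Docker with the kubelet:
# replace /etc/docker/daemon.json with a version that also sets the cgroup driver
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://b9pmyelo.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl daemon-reload && systemctl restart docker
docker info | grep -i cgroup    # should report: Cgroup Driver: systemd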
Download etcd from the official releases page: https://github.com/etcd-io/etcd/releases/download/v3.5.1/etcd-v3.5.1-linux-amd64.tar.gz
The version used here is 3.5.1.
Unless otherwise noted, the following steps are performed on the master node.
mkdir -p /opt/cluster/ssl/{rootca,etcd,kubernetes}
mkdir -p /opt/cluster/kubelet/ssl
mkdir -p /opt/cluster/log/{kube-apiserver,kube-controller-manager,kube-scheduler,kube-proxy,kubelet}
mkdir -p /opt/cluster/plugins/{calico,coredns}
mkdir -p /opt/cluster/etcd/{data,wal}
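The next steps assume the cfssl 1.6.1 binaries have already been downloaded into ~/tools. If they have not, a sketch of fetching them (the release asset names match the file names moved below):
mkdir -p ~/tools && cd ~/tools
for f in cfssl cfssljson cfssl-certinfo; do
  curl -L -o ${f}_1.6.1_linux_amd64 https://github.com/cloudflare/cfssl/releases/download/v1.6.1/${f}_1.6.1_linux_amd64
done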
cd ~/tools
mv cfssl_1.6.1_linux_amd64 cfssl
mv cfssl-certinfo_1.6.1_linux_amd64 cfssl-certinfo
mv cfssljson_1.6.1_linux_amd64 cfssljson
chmod +x cfssl*
cp cfssl* /usr/local/bin
cd /opt/cluster/ssl
cat > cfssl-conf.json << "EOF"
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"common": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF
cd /opt/cluster/ssl
cat > rootca/rootca-csr.json << "EOF"
{
"CN": "rootca",
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "ROOTCA",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cat > etcd/etcd-csr.json << EOF
{
"CN": "etcd-cluster",
"hosts": [
"127.0.0.1",
"192.168.0.3",
"192.168.0.4",
"192.168.0.5"
],
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "KUBERNETES-ETCD",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cfssl gencert -initca rootca/rootca-csr.json | cfssljson -bare rootca/rootca
cfssl gencert \
-ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common etcd/etcd-csr.json | cfssljson -bare etcd/etcd
scp -r /opt/cluster/ssl 192.168.0.4:/opt/cluster/
scp -r /opt/cluster/ssl 192.168.0.5:/opt/cluster/
cd ~/tools/
tar zxvf etcd-v3.5.1-linux-amd64.tar.gz
cp etcd-v3.5.1-linux-amd64/{etcd,etcdctl} /usr/local/bin
chmod +x /usr/local/bin/etcd /usr/local/bin/etcdctl
scp -r etcd-v3.5.1-linux-amd64/{etcd,etcdctl} [email protected]:/usr/local/bin
scp -r etcd-v3.5.1-linux-amd64/{etcd,etcdctl} [email protected]:/usr/local/bin
The etcd unit file on k8s-master1:
cat > /usr/lib/systemd/system/etcd.service << "EOF"
[Unit]
Description=Kubernetes:Etcd
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/etcd \
--name=etcd01 \
--data-dir=/opt/cluster/etcd/data \
--wal-dir=/opt/cluster/etcd/wal \
--listen-peer-urls=https://192.168.0.3:2380 \
--listen-client-urls=https://192.168.0.3:2379,http://127.0.0.1:2379 \
--initial-advertise-peer-urls=https://192.168.0.3:2380 \
--initial-cluster=etcd01=https://192.168.0.3:2380,etcd02=https://192.168.0.4:2380,etcd03=https://192.168.0.5:2380 \
--initial-cluster-state=new \
--initial-cluster-token=373b3543a301630c \
--advertise-client-urls=https://192.168.0.3:2379 \
--cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--peer-cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--peer-key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--peer-trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--client-cert-auth=true \
--peer-client-cert-auth=true \
--logger=zap \
--log-outputs=default \
--log-level=info \
--listen-metrics-urls=https://192.168.0.3:2381 \
--enable-pprof=false
[Install]
WantedBy=multi-user.target
EOF
The etcd unit file on k8s-node1:
cat > /usr/lib/systemd/system/etcd.service << "EOF"
[Unit]
Description=Kubernetes:Etcd
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/etcd \
--name=etcd02 \
--data-dir=/opt/cluster/etcd/data \
--wal-dir=/opt/cluster/etcd/wal \
--listen-peer-urls=https://192.168.0.4:2380 \
--listen-client-urls=https://192.168.0.4:2379,http://127.0.0.1:2379 \
--initial-advertise-peer-urls=https://192.168.0.4:2380 \
--initial-cluster=etcd01=https://192.168.0.3:2380,etcd02=https://192.168.0.4:2380,etcd03=https://192.168.0.5:2380 \
--initial-cluster-state=new \
--initial-cluster-token=373b3543a301630c \
--advertise-client-urls=https://192.168.0.4:2379 \
--cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--peer-cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--peer-key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--peer-trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--client-cert-auth=true \
--peer-client-cert-auth=true \
--logger=zap \
--log-outputs=default \
--log-level=info \
--listen-metrics-urls=https://192.168.0.4:2381 \
--enable-pprof=false
[Install]
WantedBy=multi-user.target
EOF
The etcd unit file on k8s-node2:
cat > /usr/lib/systemd/system/etcd.service << "EOF"
[Unit]
Description=Kubernetes:Etcd
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/etcd \
--name=etcd03 \
--data-dir=/opt/cluster/etcd/data \
--wal-dir=/opt/cluster/etcd/wal \
--listen-peer-urls=https://192.168.0.5:2380 \
--listen-client-urls=https://192.168.0.5:2379,http://127.0.0.1:2379 \
--initial-advertise-peer-urls=https://192.168.0.5:2380 \
--initial-cluster=etcd01=https://192.168.0.3:2380,etcd02=https://192.168.0.4:2380,etcd03=https://192.168.0.5:2380 \
--initial-cluster-state=new \
--initial-cluster-token=373b3543a301630c \
--advertise-client-urls=https://192.168.0.5:2379 \
--cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--peer-cert-file=/opt/cluster/ssl/etcd/etcd.pem \
--peer-key-file=/opt/cluster/ssl/etcd/etcd-key.pem \
--trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--peer-trusted-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--client-cert-auth=true \
--peer-client-cert-auth=true \
--logger=zap \
--log-outputs=default \
--log-level=info \
--listen-metrics-urls=https://192.168.0.5:2381 \
--enable-pprof=false
[Install]
WantedBy=multi-user.target
EOF
Run on all three machines:
systemctl daemon-reload && \
systemctl enable etcd.service && \
systemctl start etcd.service && \
systemctl status etcd.service
If it fails, check the logs:
journalctl -u etcd >error.log
vim error.log
If you need to redeploy etcd after a failure, be sure to clean up the leftover data first:
rm -rf /opt/cluster/etcd/wal/
rm -rf /opt/cluster/etcd/data/
rm -rf /opt/cluster/ssl/etcd/
Run on any one of the three nodes:
ETCDCTL_API=3 /usr/local/bin/etcdctl \
--cacert=/opt/cluster/ssl/rootca/rootca.pem \
--cert=/opt/cluster/ssl/etcd/etcd.pem \
--key=/opt/cluster/ssl/etcd/etcd-key.pem \
--endpoints="https://192.168.0.3:2379,https://192.168.0.4:2379,https://192.168.0.5:2379" \
endpoint health --write-out=table
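Besides endpoint health, the member list is also worth checking (same certificates and endpoints as above); swapping the last line for endpoint status --write-out=table additionally shows which member is the current leader:
ETCDCTL_API=3 /usr/local/bin/etcdctl \
--cacert=/opt/cluster/ssl/rootca/rootca.pem \
--cert=/opt/cluster/ssl/etcd/etcd.pem \
--key=/opt/cluster/ssl/etcd/etcd-key.pem \
--endpoints="https://192.168.0.3:2379,https://192.168.0.4:2379,https://192.168.0.5:2379" \
member list --write-out=table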
The version used here is v1.22.2.
Download page: https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.22.md
Note: the page lists many packages; downloading the Server Binaries package alone is enough, since it contains the binaries for both master and worker nodes.
cd ~/tools/
tar zxvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes/server/bin
cp kube-apiserver kube-scheduler kube-controller-manager /usr/local/bin
cp kubectl /usr/local/bin
cd /opt/cluster/ssl
cat > kubernetes/kube-apiserver-csr.json << "EOF"
{
"CN": "kube-apiserver",
"hosts": [
"127.0.0.1",
"192.168.0.3",
"192.168.0.4",
"192.168.0.5",
"192.168.0.6",
"192.168.0.7",
"192.168.0.8",
"192.168.0.100",
"10.0.0.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "tz"
}]
}
EOF
#Worker-node certificates are issued through the API (TLS bootstrapping) rather than signed by hand, so every address except the worker nodes' must be listed here.
cd /opt/cluster/ssl
cfssl gencert \
-ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common kubernetes/kube-apiserver-csr.json | cfssljson -bare kubernetes/kube-apiserver
#10.0.0.1 is the first IP of the service-cluster-ip range
cd /opt/cluster/ssl
echo $(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:kubelet-bootstrap" > kubernetes/kube-apiserver.token.csv
The worker nodes need this when requesting their certificates: it registers a low-privilege user, kubelet-bootstrap, which the worker nodes use to request certificates from the API server.
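For reference, the generated file is a one-line static token file in the form token,user,uid,"group" (the token below is only an illustration; yours will differ):
cat kubernetes/kube-apiserver.token.csv
# 0123456789abcdef0123456789abcdef,kubelet-bootstrap,10001,"system:kubelet-bootstrap"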
cat > /usr/lib/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes:Apiserver
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/kube-apiserver \
--runtime-config=api/all=true \
--anonymous-auth=false \
--bind-address=0.0.0.0 \
--advertise-address=192.168.0.3 \
--secure-port=6443 \
--tls-cert-file=/opt/cluster/ssl/kubernetes/kube-apiserver.pem \
--tls-private-key-file=/opt/cluster/ssl/kubernetes/kube-apiserver-key.pem \
--client-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--etcd-cafile=/opt/cluster/ssl/rootca/rootca.pem \
--etcd-certfile=/opt/cluster/ssl/etcd/etcd.pem \
--etcd-keyfile=/opt/cluster/ssl/etcd/etcd-key.pem \
--etcd-servers=https://192.168.0.3:2379,https://192.168.0.4:2379,https://192.168.0.5:2379 \
--kubelet-client-certificate=/opt/cluster/ssl/kubernetes/kube-apiserver.pem \
--kubelet-client-key=/opt/cluster/ssl/kubernetes/kube-apiserver-key.pem \
--service-account-key-file=/opt/cluster/ssl/rootca/rootca-key.pem \
--service-account-signing-key-file=/opt/cluster/ssl/rootca/rootca-key.pem \
--service-account-issuer=https://kubernetes.default.svc.cluster.local \
--enable-bootstrap-token-auth=true \
--token-auth-file=/opt/cluster/ssl/kubernetes/kube-apiserver.token.csv \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/16 \
--service-node-port-range=30000-50000 \
--authorization-mode=RBAC,Node \
--enable-aggregator-routing=true \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/opt/cluster/log/kube-apiserver/audit.log \
--logtostderr=false \
--v=2 \
--log-dir=/opt/cluster/log/kube-apiserver
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now kube-apiserver.service
systemctl status kube-apiserver.service
#If it fails, check the logs
journalctl -u kube-apiserver > error.log
vim error.log
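A quick way to confirm the API server answers over TLS (a sketch: it reuses the kube-apiserver certificate as a client certificate, which works here because it was issued by the same root CA with client auth usage and O=system:masters):
curl --cacert /opt/cluster/ssl/rootca/rootca.pem \
--cert /opt/cluster/ssl/kubernetes/kube-apiserver.pem \
--key /opt/cluster/ssl/kubernetes/kube-apiserver-key.pem \
https://192.168.0.3:6443/healthz
# expected output: ok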
kubectl is the command-line tool for managing the cluster.
cd /opt/cluster/ssl
cat > kubernetes/kubectl-csr.json << "EOF"
{
"CN": "clusteradmin",
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cfssl gencert -ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common kubernetes/kubectl-csr.json | cfssljson -bare kubernetes/kubectl
I am deploying a single-master cluster first, without a load balancer, so --server points to k8s-master1's address; if you deploy a load balancer, use the VIP address instead.
cd /opt/cluster/ssl
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/cluster/ssl/rootca/rootca.pem \
--embed-certs=true \
--server=https://192.168.0.3:6443 \
--kubeconfig=kubernetes/kubectl.kubeconfig
kubectl config set-credentials clusteradmin \
--client-certificate=/opt/cluster/ssl/kubernetes/kubectl.pem \
--client-key=/opt/cluster/ssl/kubernetes/kubectl-key.pem \
--embed-certs=true \
--kubeconfig=kubernetes/kubectl.kubeconfig
kubectl config set-context default \
--cluster=kubernetes \
--user=clusteradmin \
--kubeconfig=kubernetes/kubectl.kubeconfig
kubectl config use-context default \
--kubeconfig=kubernetes/kubectl.kubeconfig
mkdir /root/.kube
cp /opt/cluster/ssl/kubernetes/kubectl.kubeconfig /root/.kube/config
#kubectl is a client-side tool rather than a systemd service, so there is no journal unit for it; errors from the commands above are printed directly on the terminal.
#If you redeploy, delete the related certificates first
rm -rf /opt/cluster/ssl/kubernetes/kubectl*
rm -rf /opt/cluster/ssl/kubernetes/kube-api*
kubectl cluster-info
kubectl get cs
kubectl get all --all-namespaces
#Command completion (exit the shell and log back in for it to take effect)
kubectl completion bash > /usr/share/bash-completion/completions/kubectl
cd /opt/cluster/ssl
cat > kubernetes/kube-controller-manager-csr.json << "EOF"
{
"CN": "system:kube-controller-manager",
"hosts": [
"127.0.0.1",
"192.168.0.3",
"192.168.0.7",
"192.168.0.8",
"192.168.0.100"
],
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "KUBERNETES",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cfssl gencert -ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common kubernetes/kube-controller-manager-csr.json | cfssljson -bare kubernetes/kube-controller-manager
cd /opt/cluster/ssl
kubectl config set-cluster kubernetes --certificate-authority=/opt/cluster/ssl/rootca/rootca.pem \
--embed-certs=true --server=https://192.168.0.3:6443 \
--kubeconfig=kubernetes/kube-controller-manager.kubeconfig
kubectl config set-credentials kube-controller-manager --client-certificate=kubernetes/kube-controller-manager.pem \
--client-key=kubernetes/kube-controller-manager-key.pem --embed-certs=true \
--kubeconfig=kubernetes/kube-controller-manager.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-controller-manager \
--kubeconfig=kubernetes/kube-controller-manager.kubeconfig
kubectl config use-context default --kubeconfig=kubernetes/kube-controller-manager.kubeconfig
cat > /usr/lib/systemd/system/kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes:Kube-Controller-Manager
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/kube-controller-manager \
--cluster-name=kubernetes \
--secure-port=10257 \
--bind-address=127.0.0.1 \
--service-cluster-ip-range=10.0.0.0/16 \
--allocate-node-cidrs=true \
--cluster-cidr=10.1.0.0/16 \
--leader-elect=true \
--controllers=*,bootstrapsigner,tokencleaner \
--kubeconfig=/opt/cluster/ssl/kubernetes/kube-controller-manager.kubeconfig \
--tls-cert-file=/opt/cluster/ssl/kubernetes/kube-controller-manager.pem \
--tls-private-key-file=/opt/cluster/ssl/kubernetes/kube-controller-manager-key.pem \
--cluster-signing-cert-file=/opt/cluster/ssl/rootca/rootca.pem \
--cluster-signing-key-file=/opt/cluster/ssl/rootca/rootca-key.pem \
--cluster-signing-duration=87600h0m0s \
--use-service-account-credentials=true \
--root-ca-file=/opt/cluster/ssl/rootca/rootca.pem \
--service-account-private-key-file=/opt/cluster/ssl/rootca/rootca-key.pem \
--logtostderr=false \
--v=2 \
--log-dir=/opt/cluster/log/kube-controller-manager
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now kube-controller-manager.service
systemctl status kube-controller-manager.service
Verify:
kubectl get componentstatuses
If it fails, check the logs:
journalctl -u kube-controller-manager > error.log
vim error.log
1. Issue the certificate for the master node
cd /opt/cluster/ssl
cat > kubernetes/kube-scheduler-csr.json << "EOF"
{
"CN": "system:kube-scheduler",
"hosts": [
"127.0.0.1",
"192.168.0.3",
"192.168.0.7",
"192.168.0.8",
"192.168.0.100"
],
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "KUBERNETES",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cfssl gencert \
-ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common kubernetes/kube-scheduler-csr.json | cfssljson -bare kubernetes/kube-scheduler
cd /opt/cluster/ssl
kubectl config set-cluster kubernetes --certificate-authority=/opt/cluster/ssl/rootca/rootca.pem \
--embed-certs=true --server=https://192.168.0.3:6443 \
--kubeconfig=kubernetes/kube-scheduler.kubeconfig
kubectl config set-credentials kube-scheduler --client-certificate=kubernetes/kube-scheduler.pem \
--client-key=kubernetes/kube-scheduler-key.pem --embed-certs=true \
--kubeconfig=kubernetes/kube-scheduler.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-scheduler \
--kubeconfig=kubernetes/kube-scheduler.kubeconfig
kubectl config use-context default --kubeconfig=kubernetes/kube-scheduler.kubeconfig
cat > /usr/lib/systemd/system/kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes:Kube-Scheduler
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/kube-scheduler \
--kubeconfig=/opt/cluster/ssl/kubernetes/kube-scheduler.kubeconfig \
--address=127.0.0.1 \
--leader-elect=true \
--logtostderr=false \
--v=2 \
--log-dir=/opt/cluster/log/kube-scheduler
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now kube-scheduler.service
systemctl status kube-scheduler.service
#Verify
kubectl get cs
#If it fails, check the logs
journalctl -u kube-scheduler > error.log
vim error.log
cd /root/tools/kubernetes/server/bin
cp kubelet kube-proxy /usr/local/bin
scp -r kubelet kube-proxy [email protected]:/usr/local/bin
scp -r kubelet kube-proxy [email protected]:/usr/local/bin
cd /opt/cluster/ssl
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
kubectl config set-cluster kubernetes --certificate-authority=/opt/cluster/ssl/rootca/rootca.pem \
--embed-certs=true --server=https://192.168.0.3:6443 \
--kubeconfig=kubernetes/kubelet-bootstrap.kubeconfig
kubectl config set-credentials kubelet-bootstrap --token=$(awk -F "," '{print $1}' /opt/cluster/ssl/kubernetes/kube-apiserver.token.csv) \
--kubeconfig=kubernetes/kubelet-bootstrap.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap \
--kubeconfig=kubernetes/kubelet-bootstrap.kubeconfig
kubectl config use-context default --kubeconfig=kubernetes/kubelet-bootstrap.kubeconfig
cd /opt/cluster/ssl
cat > kubernetes/kubelet.conf << "EOF"
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /opt/cluster/ssl/rootca/rootca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
clusterDNS:
- 10.0.0.254
clusterDomain: cluster.local
healthzBindAddress: 127.0.0.1
healthzPort: 10248
rotateCertificates: true
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
cat > /usr/lib/systemd/system/kubelet.service << "EOF"
[Unit]
Description=Kubernetes:Kubelet
After=network.target network-online.target docker.service
Requires=docker.service
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/kubelet \
--bootstrap-kubeconfig=/opt/cluster/ssl/kubernetes/kubelet-bootstrap.kubeconfig \
--config=/opt/cluster/ssl/kubernetes/kubelet.conf \
--kubeconfig=/opt/cluster/kubelet/kubelet.kubeconfig \
--cert-dir=/opt/cluster/kubelet/ssl \
--network-plugin=cni \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2 \
--logtostderr=false \
--v=2 \
--log-dir=/opt/cluster/log/kubelet
[Install]
WantedBy=multi-user.target
EOF
scp -r /opt/cluster/ssl [email protected]:/opt/cluster/
scp -r /opt/cluster/ssl [email protected]:/opt/cluster/
scp -r /usr/lib/systemd/system/kubelet.service [email protected]:/usr/lib/systemd/system/kubelet.service
scp -r /usr/lib/systemd/system/kubelet.service [email protected]:/usr/lib/systemd/system/kubelet.service
Run on all nodes:
systemctl daemon-reload
systemctl enable --now kubelet.service
systemctl status kubelet.service
#If it fails, check the logs
journalctl -u kubelet > error.log
vim error.log
#List the pending certificate signing requests
kubectl get csr
#Approve a request (replace <csr-name> with a name from the previous command)
kubectl certificate approve <csr-name>
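If several nodes are bootstrapping at the same time, this one-liner (a convenience sketch, not from the original notes) approves every pending CSR at once:
kubectl get csr -o name | xargs kubectl certificate approve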
kubectl get node
cd /opt/cluster/ssl
cat > kubernetes/kube-proxy-csr.json << "EOF"
{
"CN": "system:kube-proxy",
"key": {
"algo": "ecdsa",
"size": 256
},
"names": [{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "KUBERNETES",
"OU": "tz"
}]
}
EOF
cd /opt/cluster/ssl
cfssl gencert \
-ca=rootca/rootca.pem \
-ca-key=rootca/rootca-key.pem \
--config=cfssl-conf.json \
-profile=common kubernetes/kube-proxy-csr.json | cfssljson -bare kubernetes/kube-proxy
cd /opt/cluster/ssl
kubectl config set-cluster kubernetes --certificate-authority=/opt/cluster/ssl/rootca/rootca.pem \
--embed-certs=true --server=https://192.168.0.3:6443 \
--kubeconfig=kubernetes/kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy --client-certificate=/opt/cluster/ssl/kubernetes/kube-proxy.pem \
--client-key=/opt/cluster/ssl/kubernetes/kube-proxy-key.pem --embed-certs=true \
--kubeconfig=kubernetes/kube-proxy.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-proxy \
--kubeconfig=kubernetes/kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kubernetes/kube-proxy.kubeconfig
cat > kubernetes/kube-proxy.conf << "EOF"
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: /opt/cluster/ssl/kubernetes/kube-proxy.kubeconfig
bindAddress: 0.0.0.0
clusterCIDR: "10.1.0.0/16"
healthzBindAddress: "0.0.0.0:10256"
metricsBindAddress: "0.0.0.0:10249"
mode: ipvs
ipvs:
  scheduler: "rr"
EOF
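Because mode is set to ipvs, the IPVS kernel modules and the ipvsadm/ipset tools should be present on every node; the original notes do not cover this, so here is a hedged sketch:
yum install -y ipvsadm ipset
cat > /etc/modules-load.d/ipvs.conf << EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
systemctl restart systemd-modules-load.service
lsmod | grep -e ip_vs -e nf_conntrack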
Note that I also run k8s-master1 as a worker node, so it is monitored and can run Pods; if you do not want Pods on the master, add a taint to it instead.
The unit below is for k8s-master1, so --hostname-override is set to k8s-master1.
cat > /usr/lib/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes:Kube-Proxy
After=network.target network-online.target
Wants=network-online.target
[Service]
Restart=on-failure
RestartSec=5
ExecStart=/usr/local/bin/kube-proxy \
--config=/opt/cluster/ssl/kubernetes/kube-proxy.conf \
--logtostderr=false \
--v=2 \
--log-dir=/opt/cluster/log/kube-proxy \
--hostname-override=k8s-master1
[Install]
WantedBy=multi-user.target
EOF
scp -r /opt/cluster/ssl 192.168.0.4:/opt/cluster/
scp -r /opt/cluster/ssl 192.168.0.5:/opt/cluster/
scp -r /usr/lib/systemd/system/kube-proxy.service [email protected]:/usr/lib/systemd/system/kube-proxy.service
scp -r /usr/lib/systemd/system/kube-proxy.service [email protected]:/usr/lib/systemd/system/kube-proxy.service
Adjust the unit on k8s-node1 and k8s-node2:
#on node1
vim /usr/lib/systemd/system/kube-proxy.service
...
--hostname-override=k8s-node1
...
#on node2
vim /usr/lib/systemd/system/kube-proxy.service
...
--hostname-override=k8s-node2
...
systemctl daemon-reload
systemctl enable --now kube-proxy.service
systemctl status kube-proxy.service
#If it fails, check the logs
journalctl -u kube-proxy > error.log
vim error.log
Download from the official site: https://docs.projectcalico.org/v3.20/manifests/calico.yaml
curl https://docs.projectcalico.org/manifests/calico.yaml -O
cd /opt/cluster/plugins/calico
#Find the following section and edit it; keep the YAML indentation intact or the apply will fail.
#CALICO_IPV4POOL_CIDR should match the Pod network CIDR configured earlier
#(--cluster-cidr=10.1.0.0/16 in kube-controller-manager and clusterCIDR in kube-proxy.conf).
vim calico.yaml
            - name: CALICO_IPV4POOL_CIDR
              value: "10.1.0.0/16"
kubectl apply -f calico.yaml
The calico network plugin runs as containers, so the following four images need to be pulled:
[root@k8s-master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
calico/kube-controllers v3.22.1 c0c6672a66a5 5 weeks ago 132MB
calico/cni v3.22.1 2a8ef6985a3e 5 weeks ago 236MB
calico/pod2daemon-flexvol v3.22.1 17300d20daf9 5 weeks ago 19.7MB
calico/node                  v3.22.1   7a71aca7b60f   6 weeks ago   198MB
#If the calico pods fail to start, try pulling these images manually with docker pull to rule out network problems as the cause.
kubectl get pods -n kube-system
#-w keeps watching the output in real time
kubectl get pods -n kube-system -w
kubectl get node
Normally the calico pods are in the Running state and all nodes show Ready.
If something goes wrong:
#Describe the pod to inspect its events when troubleshooting
kubectl describe pod -n kube-system calico-node-b7z7v
#Tail the calico CNI log
tail -f /var/log/calico/cni/cni.log
If you need to redeploy calico, clean up its network environment first:
#Remove the calico resources and leftover files
kubectl delete -f calico.yaml
rm -rf /run/calico \
/sys/fs/bpf/calico \
/var/lib/calico \
/var/log/calico \
/opt/cluster/plugins/calico \
/opt/cni/bin/calico
#Check whether any calico pods are left over
kubectl get pods -n kube-system
#Force-delete a leftover Pod if necessary (replace <pod-name> with the name shown above)
kubectl delete pod -n kube-system <pod-name> --force --grace-period=0
Download from the official site: https://github.com/coredns/deployment/blob/master/kubernetes/coredns.yaml.sed
cd /opt/cluster/plugins/coredns
vim coredns.yaml
---
...
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {   # change "CLUSTER_DOMAIN" to "cluster.local" (the cluster domain)
          fallthrough in-addr.arpa ip6.arpa         # change "REVERSE_CIDRS" to "in-addr.arpa ip6.arpa"; this part handles reverse DNS lookups
        }
        prometheus :9153
        forward . UPSTREAMNAMESERVER {              # change "UPSTREAMNAMESERVER" to "/etc/resolv.conf"; this part handles forward lookups for names outside the cluster
          max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }STUBDOMAINS                                    # delete "STUBDOMAINS" here; newer versions of the YAML contain this placeholder
                                                    # (if it is not present, nothing needs to be done)
---
...
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: CLUSTER_DNS_IP   # change "CLUSTER_DNS_IP" to the cluster DNS Service IP; it must match the "clusterDNS" value
                              # defined in kubelet.conf, which is 10.0.0.254 in this guide
cd /opt/cluster/plugins/coredns
kubectl apply -f coredns.yaml
#-w keeps watching the output in real time
kubectl get pods -n kube-system -w
kubectl get node
If something goes wrong:
#Check the pod's events
kubectl describe pod -n kube-system coredns-[id found with kubectl get pods]
#If you need to redeploy, delete the coredns resources first
kubectl delete -f coredns.yaml
[root@k8s-master1 ~]# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    <none>   12h   v1.22.2
k8s-node1     Ready    <none>   12h   v1.22.2
k8s-node2     Ready    <none>   12h   v1.22.2
#Label the master node
kubectl label node k8s-master1 node-role.kubernetes.io/master=
#Label a worker node
kubectl label node k8s-node1 node-role.kubernetes.io/work=
#Remove a label
kubectl label node k8s-node1 node-role.kubernetes.io/work-
#If an older version is installed, remove it first
kubectl delete ns kubernetes-dashboard
#Install the new version
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
#Check that the pod and service were created successfully
kubectl get pod,svc -A
Because the Service is of type ClusterIP, to expose it outside the cluster we change it to type NodePort with kubectl --namespace=kubernetes-dashboard edit service kubernetes-dashboard.
kubectl --namespace=kubernetes-dashboard edit service kubernetes-dashboard
#The relevant part of the Service manifest:
  clusterIP: 10.0.69.102
  clusterIPs:
  - 10.0.69.102
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  # nodePort here controls the externally exposed port; change it if you want a different one
  - nodePort: 40000
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  # change ClusterIP here to NodePort
  type: NodePort
kubectl create serviceaccount dashboard-admin -n kube-system
kubectl get sa -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/admin/{print $1}')
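To print just the login token instead of the whole secret description (a sketch; it relies on the ServiceAccount token secret that Kubernetes 1.22 still creates automatically):
kubectl -n kube-system get secret \
$(kubectl -n kube-system get sa dashboard-admin -o jsonpath='{.secrets[0].name}') \
-o jsonpath='{.data.token}' | base64 -d; echo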
Kuboard is a microservice-oriented management UI for Kubernetes, designed to help users get microservices running on Kubernetes quickly. It has Chinese-language support and fairly complete features. Official site: https://www.kuboard.cn/
kubectl apply -f https://kuboard.cn/install-script/kuboard.yaml
Then open http://192.168.0.3:<NodePort> in a browser (use the NodePort of the kuboard Service, shown by kubectl get svc -n kuboard).
#Alternatively, install Kuboard v3 with Docker:
sudo docker run -d \
--restart=unless-stopped \
--name=kuboard \
-p 80:80/tcp \
-p 10081:10081/tcp \
-e KUBOARD_ENDPOINT="http://<internal-IP>:80" \
-e KUBOARD_AGENT_SERVER_TCP_PORT="10081" \
-v /root/kuboard-data:/data \
eipwork/kuboard:v3
WARNING
The KUBOARD_ENDPOINT parameter tells the kuboard-agent deployed into Kubernetes how to reach the Kuboard Server;
A public IP may also be used in KUBOARD_ENDPOINT;
Kuboard does not need to be on the same network segment as the Kubernetes cluster; the Kuboard Agent can even reach the Kuboard Server through a proxy;
Using a domain name in KUBOARD_ENDPOINT is recommended;
If you use a domain name, it must resolve correctly through DNS; adding it only to the host's /etc/hosts file will not work.
Parameter explanation
It is recommended to save this command as a shell script, e.g. start-kuboard.sh; when upgrading or restoring Kuboard later you will need to know the parameters used for the original installation;
Line 4 maps the Kuboard web port 80 to port 80 on the host (you may choose another host port);
Line 5 maps the Kuboard Agent Server port 10081/tcp to port 10081 on the host (you may choose another host port);
Line 6 sets KUBOARD_ENDPOINT to http://<internal-IP>; if you change this parameter later, you must remove the imported Kubernetes cluster from Kuboard and import it again;
Line 7 sets the KUBOARD_AGENT_SERVER port to 10081; it must match the host port in line 5. Changing this parameter does not change the port 10081 that the container listens on; for example, if line 5 is -p 30081:10081/tcp then line 7 should be -e KUBOARD_AGENT_SERVER_TCP_PORT="30081";
Line 8 maps the persistent data directory /data to /root/kuboard-data on the host; adjust the host path to suit your environment.
#Access Kuboard v3.x
Open http://your-host-ip:80 in a browser to reach the Kuboard v3.x UI, then log in with:
Username: admin
Password: Kuboard123
kubectl describe pod kuboard-etcd-0 --namespace=kuboard
#Use with care
#Remove all images not referenced by any container (-a removes more than just dangling images)
docker image prune -a -f
#Remove containers in the exited state
docker rm $(docker container ls -f 'status=exited' -q)
yum -y install bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
Adapted from: https://blog.csdn.net/lswzw/article/details/123396742
Add the following tolerations to the workload's YAML so that its Pods are evicted quickly when a node becomes NotReady or unreachable:
......
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
spec:
  replicas: 10
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2    # evict 2s after the node becomes unreachable
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 2    # evict 2s after the node becomes NotReady
      containers:
      - image: busybox
        command:
        - sleep
        - "3600"
        imagePullPolicy: IfNotPresent
        name: busybox
      restartPolicy: Always
......
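To see the effect (a sketch; busybox.yaml stands for whatever file you saved the manifest above into), apply it and watch the Pods while taking a node down:
kubectl apply -f busybox.yaml
# power off k8s-node1 (or disconnect its network), then watch the Pods being rescheduled
kubectl get pods -o wide -w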
Add the following flags to the kube-apiserver configuration (on a kubeadm cluster that is /etc/kubernetes/manifests/kube-apiserver.yaml; for the binary install in this guide, add the equivalent --flags to /usr/lib/systemd/system/kube-apiserver.service):
- --default-not-ready-toleration-seconds=2
- --default-unreachable-toleration-seconds=2
Other parameters can be tuned as well, in the kube-controller-manager unit (/usr/lib/systemd/system/kube-controller-manager.service in this setup):
--node-monitor-grace-period=10s
--node-monitor-period=3s
--node-startup-grace-period=20s
--pod-eviction-timeout=10s
0. The master contacts each node periodically to decide whether the node is lost; the period is node-monitor-period, default 5s.
1. After a node has been unreachable for a while, Kubernetes marks it NotReady; this duration is node-monitor-grace-period, default 40s.
2. After a node has been unreachable for a while, Kubernetes marks it unhealthy; this duration is node-startup-grace-period, default 1m0s.
3. After a node has been unreachable for a while, Kubernetes starts evicting the Pods that were running on it; this duration is pod-eviction-timeout, default 5m0s.