Deploying a Kubernetes Cluster from Binaries: From 0 to 1

    • Managing TLS in the cluster
    • Cluster deployment
      • Environment planning
      • Environment preparation
    • Creating TLS certificates and keys
      • Creating the CA (Certificate Authority)
    • Creating kubeconfig files
        • Downloading kubectl
        • Creating the kubectl kubeconfig file
        • Creating the TLS Bootstrapping token
        • Creating the kubelet bootstrapping kubeconfig file
        • Creating the kube-proxy kubeconfig file
        • Distributing the kubeconfig files
    • Creating an etcd HA cluster
        • TLS certificate files
        • Downloading the binaries
        • Creating the etcd data directory
        • Creating the etcd systemd unit file
        • Creating the etcd environment file /etc/etcd/etcd.conf
        • Deploying etcd on the node machines
        • Starting the service
        • Verifying the service
    • Deploying the master node
      • TLS certificate files
      • Downloading the latest binaries
      • Configuring and starting kube-apiserver
      • Configuring and starting kube-controller-manager
      • Configuring and starting kube-scheduler
        • Verifying master node functionality
    • Installing the flannel network plugin
    • Deploying the node machines
      • Configuring Docker
      • Installing and configuring kubelet
      • Configuring kube-proxy
        • Script-based configuration
        • Verification tests
    • Setting up and configuring DNS
      • Installing the CoreDNS add-on
    • Installing the dashboard add-on
      • Accessing the dashboard via kubectl proxy
    • Installing the heapster add-on
      • Preparing the images
        • Applying all definition files
        • Checking the results
        • Accessing the dashboard UIs
        • Accessing the influxdb admin UI
    • Installing the EFK add-on
    • System deployment architecture
      • Configuring the EFK service files
        • Creating a directory for the files
        • Configuring EFK RBAC
        • Configuring the Elasticsearch service
        • Configuring the Fluentd configmap (created via td-agent here)
        • Configuring the Kibana service
        • Labeling the nodes
        • Applying the definition files
        • Verifying the results
        • Accessing Kibana

Introduction: the components of a Kubernetes cluster encrypt their communication with TLS certificates. This document uses CloudFlare's PKI toolkit, cfssl, to generate the Certificate Authority (CA) and all other certificates.

Managing TLS in the Cluster

Preface

Every Kubernetes cluster has a cluster root Certificate Authority (CA). Components in the cluster typically use the CA to verify the API server's certificate, the API server uses it to verify kubelet client certificates, and so on. To support this, the CA certificate bundle is distributed to every node in the cluster and is also distributed as a secret attached to the default service account. Optionally, your workloads can use this CA to establish trust: an application can request a certificate signature through the certificates.k8s.io API, using a protocol similar to the ACME draft.

TLS trust within the cluster

Getting an application running in a Pod to trust the cluster root CA usually requires some extra application configuration: the CA certificate bundle must be added to the list of CAs trusted by your TLS client or server. For example, with a Go TLS configuration you would parse the certificate chain and add the parsed certificates to the CA pool used by the tls.Config struct. The CA bundle is automatically mounted into Pods that use the default service account, at /var/run/secrets/kubernetes.io/serviceaccount/ca.crt. If you are not using the default service account, ask the cluster administrator to build a configmap containing the certificate bundle you are allowed to use.
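
As a quick sanity check (a minimal sketch; the pod name ca-check and the busybox image are arbitrary placeholders, and it only works once the cluster built below is running), you can confirm that the CA bundle is mounted into a Pod through the default service account:

# Launch a throwaway pod and print the mounted CA bundle
kubectl run ca-check --image=busybox --restart=Never --command -- sleep 3600
kubectl exec ca-check -- cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# Clean up afterwards
kubectl delete pod ca-check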

Cluster Deployment

Environment planning

Software versions:

Software            Version
Linux OS            CentOS Linux release 7.6.1810 (Core)
Kubernetes          1.14.2
Docker              18.06.1-ce
Etcd                3.3.1

Node roles:

Role        IP            Components                                                      Recommended configuration
k8s-master  172.16.4.12   kube-apiserver, kube-controller-manager, kube-scheduler, etcd   8 cores, 16 GB RAM
k8s-node1   172.16.4.13   kubelet, kube-proxy, docker, flannel, etcd                      size according to the number of containers to run
k8s-node2   172.16.4.14   kubelet, kube-proxy, docker, flannel, etcd                      size according to the number of containers to run

Certificates used by each component:

Component                Certificates
etcd                     ca.pem, server.pem, server-key.pem
kube-apiserver           ca.pem, server.pem, server-key.pem
kubelet                  ca.pem, ca-key.pem
kube-proxy               ca.pem, kube-proxy.pem, kube-proxy-key.pem
kubectl                  ca.pem, admin.pem, admin-key.pem
kube-controller-manager  ca.pem, ca-key.pem
flannel                  ca.pem, server.pem, server-key.pem

Environment preparation

The following steps must be performed on the master node and on every node.

  • Install some handy utility packages (optional)
# net-tools provides ping, ifconfig and similar commands
yum install -y net-tools

# curl and telnet
yum install -y curl telnet

# vim editor
yum install -y vim

# wget download tool
yum install -y wget

# lrzsz lets you drag files into Xshell to upload them to the server or download them locally.
yum -y install lrzsz
  • Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
  • Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
# Or edit /etc/selinux/config, set the field below, and reboot for it to take effect:
SELINUX=disabled
  • Disable swap
swapoff -a # temporary
vim /etc/fstab # permanent (comment out the swap entry)
  • Make sure net.bridge.bridge-nf-call-iptables is set to 1 in sysctl:
$ cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl --system
  • Add hostname-to-IP mappings (required on the master and on every node)
$ vim /etc/hosts
172.16.4.12  k8s-master
172.16.4.13  k8s-node1
172.16.4.14  k8s-node2
  • Synchronize the clocks
# yum install ntpdate -y
# ntpdate ntp.api.bz

Kubernetes needs a container runtime (accessed through the Container Runtime Interface, CRI). The officially supported runtimes currently include Docker, containerd, CRI-O and frakti. This document uses Docker as the container runtime; the recommended version is Docker CE 18.06 or 18.09.

  • Install Docker
# Configure the Alibaba Cloud yum repo for Docker; run the command below inside /etc/yum.repos.d.
[root@k8s-master yum.repos.d]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Refresh the cache and list the repos; the docker-ce repo should now show up.
[root@k8s-master yum.repos.d]# yum update && yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
docker-ce-stable                                                                                                  | 3.5 kB  00:00:00     
(1/2): docker-ce-stable/x86_64/updateinfo                                                                         |   55 B  00:00:00     
(2/2): docker-ce-stable/x86_64/primary_db                                                                         |  28 kB  00:00:00     
No packages marked for update
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.lzu.edu.cn
 * extras: mirrors.nwsuaf.edu.cn
 * updates: mirror.lzu.edu.cn
repo id                                                         repo name                                                          status
base/7/x86_64                                                   CentOS-7 - Base                                                    10,019
docker-ce-stable/x86_64                                         Docker CE Stable - x86_64                                              43
extras/7/x86_64                                                 CentOS-7 - Extras                                                     409
updates/7/x86_64                                                CentOS-7 - Updates                                                  2,076
repolist: 12,547

# List the available docker-ce versions; the 18.06 or 18.09 stable releases are recommended.
yum list docker-ce.x86_64 --showduplicates | sort -r
# Install docker, using docker-ce-18.06.3.ce-3.el7 as the example here.
yum -y install docker-ce-18.06.3.ce-3.el7
# This may fail with "Delta RPMs disabled because /usr/bin/applydeltarpm not installed."; fix it with:
yum provides '*/applydeltarpm'
yum install deltarpm -y
# Then re-run the install command
yum -y install docker-ce-18.06.3.ce-3.el7
# Finally enable docker at boot.
systemctl enable docker

Note: all of the following steps are executed on the master node, i.e. 172.16.4.12. The certificates only need to be created once; when adding new nodes later, simply copy the certificates under /etc/kubernetes/ to the new node.

Creating TLS Certificates and Keys

  • Install CFSSL from the binary release
# Create a directory to hold the certificates first
$ mkdir ssl && cd ssl
# Download cfssl, which generates the certificates
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
# cfssljson, which turns cfssl's JSON output into certificate files
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
# cfssl-certinfo, which displays certificate information
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
# Make the binaries executable
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
# Move them into /usr/local/bin
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
# Non-root users may need to adjust PATH
export PATH=/usr/local/bin:$PATH

Creating the CA (Certificate Authority)

Note: the following commands are still executed in the /root/ssl directory.

  1. Create the CA config file
# Generate a default config
$ cfssl print-defaults config > config.json
# Generate a default certificate signing request
$ cfssl print-defaults csr > csr.json
# Following the format of config.json, create the ca-config.json file below; the expiry is set to 87600h (10 years)
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF

Field descriptions

  • ca-config.json: multiple profiles can be defined, each with its own expiry, usages and so on; a specific profile is chosen later when signing certificates;
  • signing: the certificate can be used to sign other certificates; the generated ca.pem contains CA=TRUE;
  • server auth: clients may use this CA to verify certificates presented by servers;
  • client auth: servers may use this CA to verify certificates presented by clients;
  2. Create the CA certificate signing request
# Create the ca-csr.json file with the following content
cat > ca-csr.json <<EOF
{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing",
      	    "O": "k8s",
            "OU": "System"
        }
    ],
      "ca": {
    	"expiry": "87600h"
    }
}
EOF

  • "CN": Common Name; kube-apiserver extracts this field from the certificate as the requesting User Name, and browsers use it to check whether a site is legitimate;
  • "O": Organization; kube-apiserver extracts this field from the certificate as the Group the requesting user belongs to;
  3. Generate the CA certificate and private key
[root@k8s-master ~]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
2019/06/12 11:08:53 [INFO] generating a new CA key and certificate from CSR
2019/06/12 11:08:53 [INFO] generate received request
2019/06/12 11:08:53 [INFO] received CSR
2019/06/12 11:08:53 [INFO] generating key: rsa-2048
2019/06/12 11:08:53 [INFO] encoded CSR
2019/06/12 11:08:53 [INFO] signed certificate with serial number 708489059891717538616716772053407287945320812263
# At this point the /root/ssl directory should contain the following files.
[root@k8s-master ssl]# ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

  4. Create the Kubernetes server certificate

Create the Kubernetes certificate signing request file server-csr.json (also known as kubernetes-csr.json) and add the trusted IPs to the hosts field; here the three node IPs are 172.16.4.12, 172.16.4.13 and 172.16.4.14.

$ cat > server-csr.json <<EOF
{
    "CN": "kubernetes",
    "hosts": [
      "127.0.0.1",
      "172.16.4.12",
      "172.16.4.13",
      "172.16.4.14",
      "10.10.10.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF
# Generate the Kubernetes certificate and private key
[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
2019/06/12 12:00:45 [INFO] generate received request
2019/06/12 12:00:45 [INFO] received CSR
2019/06/12 12:00:45 [INFO] generating key: rsa-2048
2019/06/12 12:00:45 [INFO] encoded CSR
2019/06/12 12:00:45 [INFO] signed certificate with serial number 276381852717263457656057670704331293435930586226
2019/06/12 12:00:45 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# Check the generated server.pem and server-key.pem
[root@k8s-master ssl]# ls server*
server.csr  server-csr.json  server-key.pem  server.pem

  • If the hosts field is not empty, it must list every IP or domain name authorized to use the certificate. Since the certificate is later used by both the etcd cluster and the kubernetes master, it contains the etcd/master host IPs and the kubernetes service IP (normally the first IP of the service-cluster-ip-range passed to kube-apiserver, here 10.10.10.1).
  • This is a minimal kubernetes installation: a private image registry plus a three-node cluster. The node IPs above could also be replaced with hostnames.
  5. Create the admin certificate

Create the admin certificate signing request file, admin-csr.json:

cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

  • kube-apiserver later uses RBAC to authorize client requests (e.g. from kubelet, kube-proxy, and Pods);
  • kube-apiserver predefines a number of RoleBindings used by RBAC; for example cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call every kube-apiserver API;
  • O sets the certificate's Group to system:masters. When kubelet uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by the CA, and since the certificate's group is the pre-authorized system:masters, it is granted access to all APIs;

Note: this admin certificate will later be used to generate the administrator's kubeconfig file. RBAC is now the recommended way to control roles and permissions in kubernetes; kubernetes takes the certificate's CN field as the User and the O field as the Group (see the "X509 Client Certs" part of the Kubernetes authentication and authorization documentation).
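
Once kubectl has been configured with this certificate (see "Creating kubeconfig Files" below), a quick way to confirm the resulting permissions is kubectl auth can-i; this is only a verification sketch, not part of the procedure itself:

# Both should answer "yes" for the admin identity
kubectl auth can-i '*' '*'
kubectl auth can-i create clusterrolebindings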

Generate the admin certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
2019/06/12 14:52:32 [INFO] generate received request
2019/06/12 14:52:32 [INFO] received CSR
2019/06/12 14:52:32 [INFO] generating key: rsa-2048
2019/06/12 14:52:33 [INFO] encoded CSR
2019/06/12 14:52:33 [INFO] signed certificate with serial number 491769057064087302830652582150890184354925110925
2019/06/12 14:52:33 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
# Check the generated certificate and private key
[root@k8s-master ssl]# ls admin*
admin.csr  admin-csr.json  admin-key.pem  admin.pem

  6. Create the kube-proxy certificate

Create the kube-proxy certificate signing request file kube-proxy-csr.json; kube-proxy will present this certificate when accessing the cluster:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

  • CN sets the certificate's User to system:kube-proxy;
  • the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
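
The predefined role and binding mentioned above can be inspected once the API server is running; a small verification sketch:

# Show the predefined role and binding for kube-proxy
kubectl get clusterrolebinding system:node-proxier -o yaml
kubectl get clusterrole system:node-proxier -o yaml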

Generate the kube-proxy client certificate and private key

[root@k8s-master ssl]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy && ls kube-proxy*
2019/06/12 14:58:09 [INFO] generate received request
2019/06/12 14:58:09 [INFO] received CSR
2019/06/12 14:58:09 [INFO] generating key: rsa-2048
2019/06/12 14:58:09 [INFO] encoded CSR
2019/06/12 14:58:09 [INFO] signed certificate with serial number 175491367066700423717230199623384101585104107636
2019/06/12 14:58:09 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem

  7. Verify the certificates

Using the server certificate as an example.

With the openssl command

[root@k8s-master ssl]# openssl x509  -noout -text -in  server.pem
......
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=Beijing, L=Beijing, O=k8s, OU=System, CN=kubernetes
        Validity
            Not Before: Jun 12 03:56:00 2019 GMT
            Not After : Jun  9 03:56:00 2029 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
        ......
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                E9:99:37:41:CC:E9:BA:9A:9F:E6:DE:4A:3E:9F:8B:26:F7:4E:8F:4F
            X509v3 Authority Key Identifier: 
                keyid:CB:97:D5:C3:5F:8A:EB:B5:A8:9D:39:DE:5F:4F:E0:10:8E:4C:DE:A2

            X509v3 Subject Alternative Name: 
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:172.16.4.12, IP Address:172.16.4.13, IP Address:172.16.4.14, IP Address:10.10.10.1
    ......

  • Confirm that the Issuer fields match ca-csr.json;
  • Confirm that the Subject fields match server-csr.json;
  • Confirm that X509v3 Subject Alternative Name matches server-csr.json;
  • Confirm that X509v3 Key Usage and Extended Key Usage match the kubernetes profile in ca-config.json;

With the cfssl-certinfo command

[root@k8s-master ssl]# cfssl-certinfo -cert server.pem
{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "Beijing",
    "province": "Beijing",
    "names": [
      "CN",
      "Beijing",
      "Beijing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "serial_number": "276381852717263457656057670704331293435930586226",
  "sans": [
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "127.0.0.1",
    "172.16.4.12",
    "172.16.4.13",
    "172.16.4.14",
    "10.10.10.1"
  ],
  "not_before": "2019-06-12T03:56:00Z",
  "not_after": "2029-06-09T03:56:00Z",
  "sigalg": "SHA256WithRSA",
  ......
}
  8. Distribute the certificates

Copy the generated certificates and keys (the .pem files) to /etc/kubernetes/ssl on every machine for later use;

[root@k8s-master ssl]# mkdir -p /etc/kubernetes/ssl
[root@k8s-master ssl]# cp *.pem /etc/kubernetes/ssl
[root@k8s-master ssl]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem
# Keep the pem files and delete the rest (optional)
ls | grep -v pem |xargs -i rm {}

Creating kubeconfig Files

Run the following commands on the master node. No working directory is specified, so they run from the user's home directory; for root that is /root.

Download kubectl

Note: download the package matching your Kubernetes version.

# If the URL below is unreachable, download the package from a mirror (e.g. a Baidu Cloud share) instead:
wget https://dl.k8s.io/v1.14.3/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
cp kubernetes/client/bin/kube* /usr/bin/
chmod a+x /usr/bin/kube*

Create the kubectl kubeconfig file

# 172.16.4.12 is the master's IP; adjust it for your environment.
# The kubeconfig must point at the HTTPS endpoint of the Kubernetes API
export KUBE_APISERVER="https://172.16.4.12:6443"
# Set the cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}
# Set the client credentials
kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem
# Set the context
kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin
# Set the default context
kubectl config use-context kubernetes
  • The admin.pem certificate's O field is system:masters; the predefined RoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants access to all kube-apiserver APIs;
  • The generated kubeconfig is saved to ~/.kube/config;

Note: ~/.kube/config grants the highest privileges on this cluster; keep it safe.

Processes running on the Node machines, such as kubelet and kube-proxy, must authenticate and be authorized when talking to kube-apiserver on the master;

The following steps only need to be executed on the master; the generated *.kubeconfig files can then be copied directly to /etc/kubernetes on the nodes.
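
A minimal sanity check of the ~/.kube/config generated above (shown here as a sketch; it only succeeds once the master components built later are running):

kubectl config view               # shows the cluster, user and context configured above
kubectl get componentstatuses     # should list scheduler, controller-manager and etcd once they are up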

Create the TLS Bootstrapping token

Token auth file

The token can be any 128-bit string; generate it with a secure random number generator.

export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF

The last three lines above form a single command; just copy and run the snippet as a whole.

Note: before moving on, check token.csv and make sure the ${BOOTSTRAP_TOKEN} variable has been replaced with the real value.

BOOTSTRAP_TOKEN is written into the token.csv file used by kube-apiserver and into the bootstrap.kubeconfig file used by kubelet. If you regenerate BOOTSTRAP_TOKEN later, you must:

  1. Update token.csv and distribute it to /etc/kubernetes/ on all machines (master and nodes); distributing it to the nodes is optional;
  2. Regenerate bootstrap.kubeconfig and distribute it to /etc/kubernetes/ on all node machines;
  3. Restart the kube-apiserver and kubelet processes;
  4. Re-approve the kubelet CSR requests;
cp token.csv /etc/kubernetes/

Create the kubelet bootstrapping kubeconfig file

kubectl must be installed before running the commands below.

# Optionally install kubectl bash completion first.
yum install -y bash-completion
source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)

# Change into the working directory /etc/kubernetes.
cd /etc/kubernetes
export KUBE_APISERVER="https://172.16.4.12:6443"

# Set the cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

# Set the client credentials
kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

# Set the context
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig

# Set the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

  • When --embed-certs is true, the certificate-authority certificate is embedded into the generated bootstrap.kubeconfig;
  • No client key or certificate is specified here; they are generated later by kube-apiserver during TLS bootstrapping;
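
To confirm the CA was embedded and the bootstrap token recorded, you can inspect the generated file (a verification sketch):

kubectl config view --kubeconfig=bootstrap.kubeconfig
# certificate-authority-data should be populated and the kubelet-bootstrap user should carry the token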

Create the kube-proxy kubeconfig file

export KUBE_APISERVER="https://172.16.4.12:6443"
# Set the cluster parameters
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
# Set the client credentials
kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
# Set the context
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
# Set the default context
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

  • --embed-certs is true for both the cluster and the client parameters, so the contents of the files pointed to by certificate-authority, client-certificate and client-key are embedded into the generated kube-proxy.kubeconfig;
  • The CN of kube-proxy.pem is system:kube-proxy; the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
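
The same kind of inspection works for kube-proxy.kubeconfig (a sketch); with --embed-certs the client certificate and key appear inline rather than as file paths:

kubectl config view --kubeconfig=kube-proxy.kubeconfig --raw | head -n 20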

Distribute the kubeconfig files

Copy the two kubeconfig files to /etc/kubernetes/ on every Node machine:

# Set up SSH trust with the other nodes first: generate a key pair (press Enter three times).
ssh-keygen
# Check the generated key pair
ls /root/.ssh/
id_rsa  id_rsa.pub
# Copy the public key to node1
ssh-copy-id [email protected]
# Enter that user's password when prompted; add node2 the same way.
# Copy the kubeconfig files to /etc/kubernetes on the nodes; create that directory on the nodes beforehand.
scp bootstrap.kubeconfig kube-proxy.kubeconfig [email protected]:/etc/kubernetes
scp bootstrap.kubeconfig kube-proxy.kubeconfig [email protected]:/etc/kubernetes

Creating an etcd HA Cluster

etcd is the primary datastore of the Kubernetes cluster and must be installed and started before the Kubernetes services themselves; Kubernetes stores all of its data in etcd. This section walks through deploying a three-node highly available etcd cluster. The three nodes reuse the kubernetes machines and are named k8s-master, k8s-node1 and k8s-node2.

Role        IP
k8s-master  172.16.4.12
k8s-node1   172.16.4.13
k8s-node2   172.16.4.14

TLS certificate files

The etcd cluster needs TLS certificates for encrypted communication; here the kubernetes certificates created earlier are reused:

# Copy ca.pem, server-key.pem and server.pem from /root/ssl to /etc/kubernetes/ssl
cp ca.pem server-key.pem server.pem /etc/kubernetes/ssl

  • The hosts field of the kubernetes certificate contains the IPs of the three machines above; otherwise certificate validation fails later;

Download the binaries

The latest release at the time of writing is etcd-v3.3.13; newer binaries can be downloaded from https://github.com/coreos/etcd/releases.

wget https://github.com/etcd-io/etcd/releases/download/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
tar zxvf etcd-v3.3.13-linux-amd64.tar.gz
mv etcd-v3.3.13-linux-amd64/etcd* /usr/local/bin

Or install directly with yum:

yum install etcd

Note: with the yum install, the etcd binary ends up in /usr/bin; remember to change the start command in the etcd.service file below to /usr/bin/etcd.

Create the etcd data directory

mkdir -p /var/lib/etcd/default.etcd

Create the etcd systemd unit file

Create the file etcd.service under /usr/lib/systemd/system/ with the content below. Replace the IP addresses with those of your own etcd hosts.

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
  --name ${ETCD_NAME} \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/server.pem \
  --peer-key-file=/etc/kubernetes/ssl/server-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
  --listen-peer-urls=${ETCD_LISTEN_PEER_URLS} \
  --listen-client-urls=${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
  --advertise-client-urls=${ETCD_ADVERTISE_CLIENT_URLS} \
  --initial-cluster-token=${ETCD_INITIAL_CLUSTER_TOKEN} \
  --initial-cluster etcd-master=https://172.16.4.12:2380,etcd-node1=https://172.16.4.13:2380,etcd-node2=https://172.16.4.14:2380 \
  --initial-cluster-state=new \
  --data-dir=${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
  • The etcd working directory and data directory are /var/lib/etcd; create this directory before starting the service, otherwise startup fails with "Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory";
  • For secure communication, specify etcd's own key pair (cert-file and key-file), the peer key pair and CA certificate (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the client CA certificate (trusted-ca-file);
  • The hosts field of the server-csr.json used to create server.pem must contain the IPs of all etcd nodes, otherwise certificate validation fails;
  • When --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;

Create the etcd environment file /etc/etcd/etcd.conf

mkdir -p /etc/etcd
touch /etc/etcd/etcd.conf

Fill it with the following content:

# [member]
ETCD_NAME=etcd-master
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.12:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.12:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.12:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.12:2379"


This is the configuration for the 172.16.4.12 node. For the other two etcd nodes, just change the IP addresses to the node's own IP and set ETCD_NAME to etcd-node1 and etcd-node2 respectively.

Deploy etcd on the node machines

# 1. Copy the TLS files from the master to each node. Create /etc/kubernetes/ssl on the nodes beforehand.
scp /etc/kubernetes/ssl/*.pem [email protected]:/etc/kubernetes/ssl/
scp /etc/kubernetes/ssl/*.pem [email protected]:/etc/kubernetes/ssl/

# 2. Copy the etcd and etcdctl binaries from the master to each node.
scp /usr/local/bin/etcd* [email protected]:/usr/local/bin/
scp /usr/local/bin/etcd* [email protected]:/usr/local/bin/

# 3. Copy the etcd config file to each node. Create /etc/etcd on the nodes beforehand.
scp /etc/etcd/etcd.conf [email protected]:/etc/etcd/
scp /etc/etcd/etcd.conf [email protected]:/etc/etcd/
# 4. Adjust the parameters in /etc/etcd/etcd.conf; using k8s-node1 (IP 172.16.4.13) as the example:
# [member]
ETCD_NAME=etcd-node1
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://172.16.4.13:2380"
ETCD_LISTEN_CLIENT_URLS="https://172.16.4.13:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://172.16.4.13:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://172.16.4.13:2379"
# The only changes needed are ETCD_NAME and the IPs (set to the node's own IP). Modify node2's config the same way.

# 5. Copy the etcd unit file /usr/lib/systemd/system/etcd.service to each node.
scp /usr/lib/systemd/system/etcd.service [email protected]:/usr/lib/systemd/system/
scp /usr/lib/systemd/system/etcd.service [email protected]:/usr/lib/systemd/system/

Start the service

systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

Repeat the steps above on every etcd machine until the etcd service is running on all of them.

Note: if the logs show connection errors, check that ports 2379 and 2380 are open in the firewall on every node. On CentOS 7:

firewall-cmd --zone=public --add-port=2380/tcp --permanent
firewall-cmd --zone=public --add-port=2379/tcp --permanent
firewall-cmd --reload

Verify the service

Run the following on any of the etcd machines:

[root@k8s-master ~]# etcdctl \
> --ca-file=/etc/kubernetes/ssl/ca.pem \
> --cert-file=/etc/kubernetes/ssl/server.pem \
> --key-file=/etc/kubernetes/ssl/server-key.pem \
> cluster-health
member 287080ba42f94faf is healthy: got healthy result from https://172.16.4.13:2379
member 47e558f4adb3f7b4 is healthy: got healthy result from https://172.16.4.12:2379
member e531bd3c75e44025 is healthy: got healthy result from https://172.16.4.14:2379
cluster is healthy

When the last line reads cluster is healthy, the cluster is working correctly.
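
Besides cluster-health, the member list can be checked with the same etcdctl v2 flags (a sketch):

etcdctl --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  member list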

Deploying the Master Node

The kubernetes master runs the following components:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager

For now all three components are deployed on the same machine.

  • kube-scheduler, kube-controller-manager and kube-apiserver are tightly coupled;
  • Only one kube-scheduler and one kube-controller-manager process can be active at a time; if multiple replicas run, a leader must be elected;
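
When several replicas do run, the active leader is recorded in a leader-election annotation on an endpoints object in kube-system; a verification sketch (run after the master components below are started):

kubectl -n kube-system get endpoints kube-scheduler -o yaml | grep control-plane.alpha.kubernetes.io/leader
kubectl -n kube-system get endpoints kube-controller-manager -o yaml | grep control-plane.alpha.kubernetes.io/leader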

TLS certificate files

The .pem files below were already created in "Creating TLS Certificates and Keys", and token.csv was created in "Creating kubeconfig Files". Double-check them:

[root@k8s-master ~]# ls /etc/kubernetes/ssl/
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  server-key.pem  server.pem

Download the latest binaries

Download the client and server tarballs from https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md. The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client (kubectl) binary, so there is no need to download kubernetes-client-linux-amd64.tar.gz separately;

wget https://dl.k8s.io/v1.14.3/kubernetes-server-linux-amd64.tar.gz
# If the official site is unreachable, download from the Baidu Cloud mirror instead: https://pan.baidu.com/s/1G6e981Q48mMVWD9Ho_j-7Q (code: uvc1).
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes
tar -xzvf  kubernetes-src.tar.gz

Copy the binaries to their target path

[root@k8s-master kubernetes]# cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /usr/local/bin/

Configure and start kube-apiserver

(1) Create the kube-apiserver service unit file

Content of the unit file /usr/lib/systemd/system/kube-apiserver.service:

[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/local/bin/kube-apiserver \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_ETCD_SERVERS \
        $KUBE_API_ADDRESS \
        $KUBE_API_PORT \
        $KUBELET_PORT \
        $KUBE_ALLOW_PRIV \
        $KUBE_SERVICE_ADDRESSES \
        $KUBE_ADMISSION_CONTROL \
        $KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Create /etc/kubernetes/config with the following content:

###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://172.16.4.12:8080"

This config file is shared by kube-apiserver, kube-controller-manager, kube-scheduler, kubelet and kube-proxy.

The apiserver config file /etc/kubernetes/apiserver contains:

###
### kubernetes system config
###
### The following values are used to configure the kube-apiserver
###
##
### The address on the local server to listen to.
KUBE_API_ADDRESS="--advertise-address=172.16.4.12 --bind-address=172.16.4.12 --insecure-bind-address=172.16.4.12"
##
### The port on the local server to listen on.
##KUBE_API_PORT="--port=8080"
##
### Port minions listen on
##KUBELET_PORT="--kubelet-port=10250"
##
### Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"
##
### Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.10.10.0/24"
##
### default admission control policies
KUBE_ADMISSION_CONTROL="--admission-control=ServiceAccount,NamespaceLifecycle,NamespaceExists,LimitRanger,ResourceQuota"
##
### Add your own!
KUBE_API_ARGS="--authorization-mode=RBAC \
--runtime-config=rbac.authorization.k8s.io/v1beta1 \
--kubelet-https=true \
--enable-bootstrap-token-auth \
--token-auth-file=/etc/kubernetes/token.csv \
--service-node-port-range=30000-50000 \
--tls-cert-file=/etc/kubernetes/ssl/server.pem \
--tls-private-key-file=/etc/kubernetes/ssl/server-key.pem \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/etc/kubernetes/ssl/ca.pem \
--etcd-certfile=/etc/kubernetes/ssl/server.pem \
--etcd-keyfile=/etc/kubernetes/ssl/server-key.pem \
--enable-swagger-ui=true \
--apiserver-count=3 \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/var/lib/audit.log \
--event-ttl=1h"

  • If you change --service-cluster-ip-range afterwards, you must delete the kubernetes service in the default namespace with kubectl delete service kubernetes; the system then recreates it with an IP from the new range. Otherwise the apiserver log keeps reporting "the cluster IP x.x.x.x for service kubernetes/default is not within the service CIDR x.x.x.x/24; please recreate";
  • --authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
  • kube-scheduler and kube-controller-manager normally run on the same machine as kube-apiserver and talk to it over the insecure port;
  • kubelet, kube-proxy and kubectl run on the other nodes; when they access kube-apiserver over the secure port they must first pass TLS authentication and then RBAC authorization;
  • kube-proxy and kubectl obtain RBAC authorization through the User and Group encoded in their certificates;
  • When kubelet TLS bootstrapping is used, do not set --kubelet-certificate-authority, --kubelet-client-certificate or --kubelet-client-key, otherwise kube-apiserver later fails to validate the kubelet certificate with "x509: certificate signed by unknown authority";
  • --admission-control must include ServiceAccount;
  • --bind-address must not be 127.0.0.1;
  • runtime-config is set to rbac.authorization.k8s.io/v1beta1, the apiVersion enabled at runtime;
  • --service-cluster-ip-range specifies the Service cluster IP range; it must not be a routable network;
  • By default kubernetes objects are stored under the /registry prefix in etcd; this can be changed with --etcd-prefix;
  • To expose an unauthenticated HTTP endpoint, add --insecure-port=8080 --insecure-bind-address=127.0.0.1. In production, never bind it to anything other than 127.0.0.1.

Note: see the complete kube-apiserver.service unit; adjust the parameters to suit your own cluster.

(3) Start kube-apiserver

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl start kube-apiserver
systemctl status kube-apiserver
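
A quick health probe after startup (a sketch using the certificates created earlier; both requests should return ok):

# Secure port
curl --cacert /etc/kubernetes/ssl/ca.pem \
  --cert /etc/kubernetes/ssl/admin.pem \
  --key /etc/kubernetes/ssl/admin-key.pem \
  https://172.16.4.12:6443/healthz
# Insecure local port
curl http://172.16.4.12:8080/healthz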

Configure and start kube-controller-manager

(1) Create the kube-controller-manager service unit file

File path: /usr/lib/systemd/system/kube-controller-manager.service

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/local/bin/kube-controller-manager \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Config file /etc/kubernetes/controller-manager

###
# The following values are used to configure the kubernetes controller-manager

# defaults from config and apiserver should be adequate

# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 \
--service-cluster-ip-range=10.10.10.0/24 \
--cluster-name=kubernetes \
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  \
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
--root-ca-file=/etc/kubernetes/ssl/ca.pem \
--leader-elect=true"


  • --service-cluster-ip-range specifies the CIDR of cluster Services; it must not be routable between nodes and must match the value passed to kube-apiserver;
  • The certificate and key given by --cluster-signing-* are used to sign the certificates created for TLS bootstrapping;
  • --root-ca-file is used to verify the kube-apiserver certificate; only when it is set is this CA certificate placed into the ServiceAccount of Pod containers;
  • --address must be 127.0.0.1, since kube-apiserver expects scheduler and controller-manager to run on the same machine;

(3) Start kube-controller-manager

systemctl daemon-reload
systemctl enable kube-controller-manager
systemctl start kube-controller-manager
systemctl status kube-controller-manager

After starting each component, check its status with kubectl get cs;

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
controller-manager   Healthy     ok                                                                                          
etcd-0               Healthy     {"health":"true"}                                                                           
etcd-2               Healthy     {"health":"true"}                                                                           
etcd-1               Healthy     {"health":"true"}  

  • If a component reports unhealthy, see https://github.com/kubernetes-incubator/bootkube/issues/64

Configure and start kube-scheduler

(1) Create the kube-scheduler service unit file

File path: /usr/lib/systemd/system/kube-scheduler.service

[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/local/bin/kube-scheduler \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBE_MASTER \
            $KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

(2) Config file /etc/kubernetes/scheduler

###
# kubernetes scheduler config

# default config should be adequate

# Add your own!
KUBE_SCHEDULER_ARGS="--leader-elect=true --address=127.0.0.1"
  • --address must be 127.0.0.1, because kube-apiserver expects scheduler and controller-manager to run on the same machine;

Note: see the complete kube-scheduler.service unit; add parameters as your cluster requires.

(3) Start kube-scheduler

systemctl daemon-reload
systemctl enable kube-scheduler
systemctl start kube-scheduler
systemctl status kube-scheduler

Verify master node functionality

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"} 
# The ERROR column no longer shows any errors.

Installing the flannel Network Plugin

Every node needs a network plugin so that all Pods join the same flat network; this section is a reference for installing flannel.

Installing flanneld with yum is the simplest option unless you need a specific version; yum installs flannel 0.7.1 by default.

(1) Install flannel

# Check the flannel version offered by yum (0.7.1 below); a newer release is preferable, so the binary release is used instead.
[root@k8s-master ~]# yum list flannel --showduplicates | sort -r
 * updates: mirror.lzu.edu.cn
Loading mirror speeds from cached hostfile
Loaded plugins: fastestmirror
flannel.x86_64                        0.7.1-4.el7                         extras
 * extras: mirror.lzu.edu.cn
 * base: mirror.lzu.edu.cn
Available Packages

[root@k8s-master ~]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz

# Unpacking produces two executables, flanneld and mk-docker-opts.sh.
[root@k8s-master ~]# tar zxvf flannel-v0.11.0-linux-amd64.tar.gz
flanneld
mk-docker-opts.sh
README.md
# Copy the two executables to node1 and node2
[root@k8s-master ~]# scp flanneld [email protected]:/usr/bin/ 
flanneld                                                                                                           100%   34MB  62.9MB/s   00:00    
[root@k8s-master ~]# scp flanneld [email protected]:/usr/bin/ 
flanneld                                                                                                           100%   34MB 121.0MB/s   00:00    
[root@k8s-master ~]# scp mk-docker-opts.sh [email protected]:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.2MB/s   00:00  
[root@k8s-master ~]# scp mk-docker-opts.sh [email protected]:/usr/libexec/flannel
mk-docker-opts.sh                                                                                                      100% 2139     1.1MB/s   00:00  

  • Note: the directories that hold flanneld and mk-docker-opts.sh must be created on the nodes in advance.

(2) The /etc/sysconfig/flanneld config file:

# Flanneld configuration options  

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/kube-centos/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/server.pem -etcd-keyfile=/etc/kubernetes/ssl/server-key.pem"

(3) Create the service unit file /usr/lib/systemd/system/flanneld.service

[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld --ip-masq \
  -etcd-endpoints=${FLANNEL_ETCD_ENDPOINTS} \
  -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
  $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service

  • Note: on hosts with multiple NICs (for example a vagrant environment), add the NIC that carries the external traffic to FLANNEL_OPTIONS, e.g. -iface=eth1; see the sketch below.
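
For example, the FLANNEL_OPTIONS line would then look like the following (eth1 is an assumed interface name; adjust it to your environment):

FLANNEL_OPTIONS="-iface=eth1 -etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/server.pem -etcd-keyfile=/etc/kubernetes/ssl/server-key.pem"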

(4) Create the network configuration in etcd

Run the commands below to allocate the IP range that docker will use.

etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379 \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/server.pem \
  --key-file=/etc/kubernetes/ssl/server-key.pem \
  mkdir /kube-centos/network

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   mk /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'

[root@k8s-master network]# etcdctl --endpoints=https://172.16.4.12:2379,https://172.16.4.13:2379,https://172.16.4.14:2379   --ca-file=/etc/kubernetes/ssl/ca.pem   --cert-file=/etc/kubernetes/ssl/server.pem   --key-file=/etc/kubernetes/ssl/server-key.pem   set /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

(5) Start flannel

systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

etcd now contains the flannel network configuration:

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   ls /kube-centos/network/subnets
/kube-centos/network/subnets/172.30.20.0-24
/kube-centos/network/subnets/172.30.69.0-24
/kube-centos/network/subnets/172.30.53.0-24

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/config
{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
   --ca-file=/etc/kubernetes/ssl/ca.pem \
   --cert-file=/etc/kubernetes/ssl/server.pem \
   --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.20.0-24
{"PublicIP":"172.16.4.13","BackendType":"vxlan","BackendData":{"VtepMAC":"5e:ef:ff:37:0a:d2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/ssl/ca.pem \
    --cert-file=/etc/kubernetes/ssl/server.pem \
    --key-file=/etc/kubernetes/ssl/server-key.pem \
   get /kube-centos/network/subnets/172.30.53.0-24
{"PublicIP":"172.16.4.12","BackendType":"vxlan","BackendData":{"VtepMAC":"e2:e6:b9:23:79:a2"}}

[root@k8s-master ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} \
>     --ca-file=/etc/kubernetes/ssl/ca.pem \
>     --cert-file=/etc/kubernetes/ssl/server.pem \
>     --key-file=/etc/kubernetes/ssl/server-key.pem \
>    get /kube-centos/network/subnets/172.30.69.0-24
{"PublicIP":"172.16.4.14","BackendType":"vxlan","BackendData":{"VtepMAC":"06:0e:58:69:a0:41"}}

Some other information is visible as well:

# 1. The flannel interface shows up in ifconfig
[root@k8s-master ~]# ifconfig
.......

flannel.1: flags=4163  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.......
# 2. The file containing the subnet options handed to docker.
[root@k8s-master ~]# cat /run/flannel/docker
DOCKER_OPT_BIP="--bip=172.30.53.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=false"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_NETWORK_OPTIONS=" --bip=172.30.53.1/24 --ip-masq=false --mtu=1450"

(6) Point docker at the flannel network

# Modify the ExecStart line in /usr/lib/systemd/system/docker.service to pass $DOCKER_NETWORK_OPTIONS; the full docker.service is shown below.

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd  $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

# Restart docker for the configuration to take effect.
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker.service
# Check the docker and flannel networks again; they are now on the same subnet
[root@k8s-master ~]# ifconfig 
docker0: flags=4099  mtu 1500
        inet 172.30.53.1  netmask 255.255.255.0  broadcast 172.30.53.255
        ether 02:42:1e:aa:8b:0f  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163  mtu 1450
        inet 172.30.53.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::e0e6:b9ff:fe23:79a2  prefixlen 64  scopeid 0x20
        ether e2:e6:b9:23:79:a2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

.....
# Apply the same change to the other nodes; node1 is shown here.
[root@k8s-node1 ~]# vim /usr/lib/systemd/system/docker.service 
[root@k8s-node1 ~]# systemctl daemon-reload && systemctl restart docker
[root@k8s-node1 ~]# ifconfig
docker0: flags=4163  mtu 1450
        inet 172.30.20.1  netmask 255.255.255.0  broadcast 172.30.20.255
        inet6 fe80::42:23ff:fe7f:6a70  prefixlen 64  scopeid 0x20
        ether 02:42:23:7f:6a:70  txqueuelen 0  (Ethernet)
        RX packets 18  bytes 2244 (2.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 48  bytes 3469 (3.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

......

flannel.1: flags=4163  mtu 1450
        inet 172.30.20.0  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::5cef:ffff:fe37:ad2  prefixlen 64  scopeid 0x20
        ether 5e:ef:ff:37:0a:d2  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 8 overruns 0  carrier 0  collisions 0

......

veth82301fa: flags=4163  mtu 1450
        inet6 fe80::6855:cfff:fe99:5143  prefixlen 64  scopeid 0x20
        ether 6a:55:cf:99:51:43  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7  bytes 586 (586.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


Deploying the Node Machines

# Copy the flanneld.service file from the master to each node.
scp /usr/lib/systemd/system/flanneld.service [email protected]:/usr/lib/systemd/system
scp /usr/lib/systemd/system/flanneld.service [email protected]:/usr/lib/systemd/system
# Then start flanneld on the nodes
systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

Configure Docker

However you installed flannel, adding the following lines to /usr/lib/systemd/system/docker.service keeps you on the safe side.

# Lines to add
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# The resulting complete docker.service is shown below
[root@k8s-master ~]# cat /usr/lib/systemd/system/docker.service 
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify

# add by gzr
EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network

# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd  $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
# Uncomment TasksMax if your systemd version supports it.
# Only systemd 226 and above support this version.
#TasksMax=infinity
TimeoutStartSec=0
# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes
# kill only the docker process, not all processes in the cgroup
KillMode=process
# restart the docker process if it exits prematurely
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

(2) Start docker
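
Reload systemd and restart docker so the options provided by flannel take effect (the same sequence used on the master earlier):

systemctl daemon-reload
systemctl enable docker
systemctl restart docker
systemctl status docker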

Install and configure kubelet

(1) Check that swap is disabled

[root@k8s-master ~]# free
              total        used        free      shared  buff/cache   available
Mem:       32753848      730892    27176072      377880     4846884    31116660
Swap:             0           0           0

  • Or edit /etc/fstab and comment out the swap entry.

When kubelet starts it sends a TLS bootstrapping request to kube-apiserver, so the kubelet-bootstrap user from the bootstrap token file must first be bound to the system:node-bootstrapper cluster role; only then does kubelet have permission to create certificate signing requests:

(2) Copy the kubelet binary from /usr/local/bin on the master to each node

[root@k8s-master ~]# scp /usr/local/bin/kubelet [email protected]:/usr/local/bin/ 
[root@k8s-master ~]# scp /usr/local/bin/kubelet [email protected]:/usr/local/bin/ 


(3) Create the role binding on the master.

# The cluster role binding is created on the master; afterwards restart the kubelet service on the nodes
[root@k8s-master kubernetes]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created

(4) Create the kubelet service

Option 1: create and run a script on each node

# Create the kubelet config file and kubelet.service in one step with the kubelet.sh script below.
#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}
DNS_SERVER_IP=${2:-"10.10.10.2"}

cat <<EOF >/etc/kubernetes/kubelet

KUBELET_ARGS="--logtostderr=true \\
--v=4 \\
--address=${NODE_ADDRESS} \\
--hostname-override=${NODE_ADDRESS} \\
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
--api-servers=172.16.4.12 \\
--cert-dir=/etc/kubernetes/ssl \\
--allow-privileged=true \\
--cluster-dns=${DNS_SERVER_IP} \\
--cluster-domain=cluster.local \\
--fail-swap-on=false \\
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google-containers/pause-amd64:3.0"

EOF

cat <<EOF >/usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \$KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable kubelet
systemctl restart kubelet && systemctl status kubelet

2) Run the script

chmod +x kubelet.sh
./kubelet.sh 172.16.4.14 10.10.10.2
# You can also inspect the generated kubelet.service file
[root@k8s-node2 ~]# cat /usr/lib/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
After=docker.service
Requires=docker.service

[Service]
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=on-failure
KillMode=process

[Install]
WantedBy=multi-user.target

  • Note: on node1, run kubelet.sh with 172.16.4.13 (node1's IP) and 10.10.10.2 (the DNS server IP). Substitute the right values when running the script on the other nodes.

Option 2

1) Create the kubelet config file /etc/kubernetes/kubelet with the following content:

###
## kubernetes kubelet (minion) config
#
## The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=172.16.4.12"
#
## The port for the info server to serve on
#KUBELET_PORT="--port=10250"
#
## You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=172.16.4.12"
#
## location of the api-server
## COMMENT THIS ON KUBERNETES 1.8+
KUBELET_API_SERVER="--api-servers=http://172.16.4.12:8080"
#
## pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=jimmysong/pause-amd64:3.0"
#
## Add your own!
KUBELET_ARGS="--cgroup-driver=systemd \
--cluster-dns=10.10.10.2 \
--bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \
--kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
--require-kubeconfig \
--cert-dir=/etc/kubernetes/ssl \
--cluster-domain=cluster.local \
--hairpin-mode promiscuous-bridge \
--serialize-image-pulls=false"

  • When systemd is used as the cgroup driver, two extra parameters are needed: --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice;
  • --address must not be 127.0.0.1, otherwise Pods fail when calling the kubelet API, because 127.0.0.1 inside a Pod points to the Pod itself rather than to the kubelet;
  • Set --cgroup-driver to systemd rather than cgroupfs here, otherwise kubelet fails to start on CentOS (what actually matters is that docker and kubelet use the same cgroup driver; it does not have to be systemd);
  • --bootstrap-kubeconfig points at the bootstrap kubeconfig file; kubelet uses the user name and token in it to send the TLS Bootstrapping request to kube-apiserver;
  • After the administrator approves the CSR, kubelet automatically creates the certificate and key (kubelet-client.crt and kubelet-client.key) in the --cert-dir directory and then writes the --kubeconfig file;
  • It is recommended to set the kube-apiserver address in the --kubeconfig file. If --api-servers is not given, --require-kubeconfig must be set so the apiserver address is read from the kubeconfig; otherwise kubelet starts but cannot find the API server (the log reports that no API Server was found) and kubectl get nodes shows no corresponding Node. --require-kubeconfig was removed in 1.10, see the PR;
  • --cluster-dns specifies the kubedns Service IP (it can be reserved now and assigned when the kubedns service is created later), and --cluster-domain specifies the domain suffix; both must be set for either to take effect;
  • --cluster-domain sets the search domain written into /etc/resolv.conf of Pods at startup. It was initially set to cluster.local., which resolved Service DNS names fine but failed on FQDN pod names of headless services; changing it to cluster.local (dropping the trailing dot) fixes the problem. For name/service resolution in kubernetes see my other article.
  • The kubelet.kubeconfig file referenced by --kubeconfig=/etc/kubernetes/kubelet.kubeconfig does not exist before kubelet's first start; it is generated automatically once the CSR is approved (see below). If ~/.kube/config already exists on the node, you can copy it to this path and rename it kubelet.kubeconfig; all node machines can share the same kubelet.kubeconfig, so newly added nodes join the cluster without creating a CSR. Likewise, on any host that can reach the cluster, kubectl --kubeconfig passes authentication as long as it uses ~/.kube/config, since that file carries the admin credentials and therefore full cluster permissions.
  • KUBELET_POD_INFRA_CONTAINER is the pause (infrastructure container) image; a private registry address is used here, so change it to your own. You can also use Google's pause image gcr.io/google_containers/pause-amd64:3.0, which is only about 300 KB.

2) Create the kubelet service unit file

File location: /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/local/bin/kubelet \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBELET_API_SERVER \
            $KUBELET_ADDRESS \
            $KUBELET_PORT \
            $KUBELET_HOSTNAME \
            $KUBE_ALLOW_PRIV \
            $KUBELET_POD_INFRA_CONTAINER \
            $KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target

Note: either of the two approaches above creates a working kubelet service; personally I recommend the one-shot script. With the second approach you must create the working directory /var/lib/kubelet by hand. Not demonstrated here.

(5) Approve the kubelet TLS certificate requests

When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; only after the request is approved does kubernetes add the Node to the cluster.

1) On the master, list the pending CSRs

[root@k8s-master ~]# kubectl get csr
NAME                                                   AGE    REQUESTOR           CONDITION
node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs   3h6m   kubelet-bootstrap   Pending
node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs   3h     kubelet-bootstrap   Pending

2) Approve the CSRs

[root@k8s-master ~]# kubectl certificate approve node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs
certificatesigningrequest.certificates.k8s.io/node-csr-4799pnHJjREEcWDGgSFvNaoyfcn4HiOML9cpEI1IbMs approved
[root@k8s-master ~]# kubectl certificate approve node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs
certificatesigningrequest.certificates.k8s.io/node-csr-e3mql7Dm878tLhPUxu2pzg8e8eM17Togc6lHQX-mXZs approved
# After approval, the CSRs of both nodes show as approved.

3) The kubelet kubeconfig file and key pair are generated automatically

[root@k8s-node1 ~]# ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2294 Jun 14 15:19 /etc/kubernetes/kubelet.kubeconfig
[root@k8s-node1 ~]# ls -l /etc/kubernetes/ssl/kubelet*
-rw------- 1 root root 1273 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
lrwxrwxrwx 1 root root   58 Jun 14 15:19 /etc/kubernetes/ssl/kubelet-client-current.pem -> /etc/kubernetes/ssl/kubelet-client-2019-06-14-15-19-10.pem
-rw-r--r-- 1 root root 2177 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1679 Jun 14 11:50 /etc/kubernetes/ssl/kubelet.key


假如你更新kubernetes的证书,只要没有更新token.csv,当重启kubelet后,该node就会自动加入到kubernetes集群中,而不会重新发送certificaterequest,也不需要在master节点上执行kubectl certificate approve操作。前提是不要删除node节点上的/etc/kubernetes/ssl/kubelet*和/etc/kubernetes/kubelet.kubeconfig文件,否则kubelet启动时会提示找不到证书而失败。

[root@k8s-master ~]# scp /etc/kubernetes/token.csv [email protected]:/etc/kubernetes/   
[root@k8s-master ~]# scp /etc/kubernetes/token.csv [email protected]:/etc/kubernetes/

**注意:**如果启动kubelet的时候见到证书相关的报错,有个trick可以解决这个问题:可以将master节点上的~/.kube/config文件(该文件在[安装kubectl命令行工具]这一步中会自动生成)拷贝到node节点的/etc/kubernetes/kubelet.kubeconfig位置,这样就不需要通过CSR,kubelet启动后就会自动加入集群。注意同时记得把.kube/config中的内容复制粘贴到/etc/kubernetes/kubelet.kubeconfig中,替换原先内容。

[root@k8s-master ~]# cat .kube/config 
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUR2akNDQXFhZ0F3SUJBZ0lVZkJtL2lzNG1EcHdqa0M0aVFFTWF5SVJaVHVjd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQXpNRFF3TUZvWERUSTVNRFl3T1RBek1EUXdNRm93WlRFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbGFXcHBibWN4RERBSwpCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByZFdKbGNtNWxkR1Z6Ck1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0NBUUVBOEZQK2p0ZUZseUNPVDc0ZzRmd1UKeDl0bDY3dGVabDVwTDg4ZStESzJMclBJZDRXMDRvVDdiWTdKQVlLT3dPTkM4RjA5MzNqSjVBdmxaZmppTkJCaQp2OTlhYU5tSkdxeWozMkZaaDdhTkYrb3Fab3BYdUdvdmNpcHhYTWlXbzNlVHpWVUh3d2FBeUdmTS9BQnE0WUY0ClprSVV5UkJaK29OVXduY0tNaStOR2p6WVJyc2owZEJRR0ROZUJ6OEgzbCtjd1U1WmpZdEdFUFArMmFhZ1k5bG0KbjhyOUFna2owcW9uOEdQTFlRb2RDYzliSWZqQmVNaGIzaHJGMjJqMDhzWTczNzh3MzN5VWRHdjg1YWpuUlp6UgpIYkN6UytYRGJMTTh2aGh6dVZoQmt5NXNrWXB6M0hCNGkrTnJPR1Fmdm4yWkY0ZFh4UVUyek1Dc2NMSVppdGg0Ckt3SURBUUFCbzJZd1pEQU9CZ05WSFE4QkFmOEVCQU1DQVFZd0VnWURWUjBUQVFIL0JBZ3dCZ0VCL3dJQkFqQWQKQmdOVkhRNEVGZ1FVeTVmVncxK0s2N1dvblRuZVgwL2dFSTVNM3FJd0h3WURWUjBqQkJnd0ZvQVV5NWZWdzErSwo2N1dvblRuZVgwL2dFSTVNM3FJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFOb3ZXa1ovd3pEWTZSNDlNNnpDCkhoZlZtVGk2dUZwS24wSmtvMVUzcHA5WTlTTDFMaXVvK3VwUjdJOCsvUXd2Wm95VkFWMTl4Y2hRQ25RSWhRMEgKVWtybXljS0crdWtsSUFUS3ZHenpzNW1aY0NQOGswNnBSSHdvWFhRd0ZhSFBpNnFZWDBtaW10YUc4REdzTk01RwpQeHdZZUZncXBLQU9Tb0psNmw5bXErQnhtWEoyZS8raXJMc3N1amlPKzJsdnpGOU5vU29Yd1RqUGZndXhRU3VFCnZlSS9pTXBGV1o0WnlCYWJKYkw5dXBldm53RTA2RXQrM2g2N3JKOU5mZ2N5MVhNSU0xeGo1QXpzRXgwVE5ETGkKWGlOQ0Zram9zWlA3U3dZdE5ncHNuZmhEandHRUJLbXV1S3BXR280ZWNac2lMQXgwOTNaeTdKM2dqVDF6dGlFUwpzQlE9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://172.16.4.12:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: admin
  name: kubernetes
current-context: kubernetes
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQzVENDQXNXZ0F3SUJBZ0lVVmlPdjZ6aFlHMzIzdWRZS2RFWEcvRVJENW8wd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1pURUxNQWtHQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFXcHBibWN4RURBT0JnTlZCQWNUQjBKbAphV3BwYm1jeEREQUtCZ05WQkFvVEEyczRjekVQTUEwR0ExVUVDeE1HVTNsemRHVnRNUk13RVFZRFZRUURFd3ByCmRXSmxjbTVsZEdWek1CNFhEVEU1TURZeE1qQTJORGd3TUZvWERUSTVNRFl3T1RBMk5EZ3dNRm93YXpFTE1Ba0cKQTFVRUJoTUNRMDR4RURBT0JnTlZCQWdUQjBKbGFVcHBibWN4RURBT0JnTlZCQWNUQjBKbGFVcHBibWN4RnpBVgpCZ05WQkFvVERuTjVjM1JsYlRwdFlYTjBaWEp6TVE4d0RRWURWUVFMRXdaVGVYTjBaVzB4RGpBTUJnTlZCQU1UCkJXRmtiV2x1TUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUFuL29MQVpCcENUdWUKci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCSjVHVlVSUlFUc2F3eWdFdnFBSXI3TUJrb21GOQpBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3hEMG5RaEF1azBFbVVONWY5ZENZRmNMMTVBVnZSCituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndU9SVHBMSkwxUGw3SlVLZnFBWktEbFVXZnpwZXcKOE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaZjZKeGdaS1FmUUNyYlJEMkd2L29OVVRlYnpWMwpWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1Mrc0t4Z2hUUWdJMG5CZXJBM0x0dGp6WVpySWJBClo0RXBRNmc0ZFFJREFRQUJvMzh3ZlRBT0JnTlZIUThCQWY4RUJBTUNCYUF3SFFZRFZSMGxCQll3RkFZSUt3WUIKQlFVSEF3RUdDQ3NHQVFVRkJ3TUNNQXdHQTFVZEV3RUIvd1FDTUFBd0hRWURWUjBPQkJZRUZCQThrdnFaVDhRRApaSnIvTUk2L2ZWalpLdVFkTUI4R0ExVWRJd1FZTUJhQUZNdVgxY05maXV1MXFKMDUzbDlQNEJDT1RONmlNQTBHCkNTcUdTSWIzRFFFQkN3VUFBNElCQVFDMnZzVDUwZVFjRGo3RVUwMmZQZU9DYmJ6cFZWazEzM3NteGI1OW83YUgKRDhONFgvc3dHVlYzU0V1bVNMelJYWDJSYUsyUU04OUg5ZDlpRkV2ZzIvbjY3VThZeVlYczN0TG9Ua29NbzlUZgpaM0FNN0NyM0V5cWx6OGZsM3p4cmtINnd1UFp6VWNXV29vMUJvR1VCbEM1Mi9EbFpQMkZCbHRTcWtVL21EQ3IxCnJJWkFYYjZDbXNNZG1SQzMrYWwxamVUak9MZEcwMUd6dlBZdEdsQ0p2dHRJNzBuVkR3Nkh3QUpkRVN0UUh0cWsKakpCK3NZU2NSWDg1YTlsUXVIU21DY0kyQWxZQXFkK0t2NnNKNUVFZnpwWHNUVXdya0tKbjJ0UTN2UVNLaEgyawpabUx2N0MvcWV6YnJvc3pGeHNZWEtRelZiODVIVkxBbXo2UVhYV1I2Q0ZzMAotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
    client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBbi9vTEFaQnBDVHVlci95eU15a1NYelBpWk9mVFdZQmEwNjR6c2Y1Y1Z0UEt2cnlCCko1R1ZVUlJRVHNhd3lnRXZxQUlyN01Ca29tRjlBeFVNaFNxdlFjNkFYemQzcjRMNW1CWGQxZ3FoWVNNR2lJL3gKRDBuUWhBdWswRW1VTjVmOWRDWUZjTDE1QVZ2UituN2wwaVcvVzlBRjRqbXRtYUtLVUdsUU9vNzQ3anNCYWRndQpPUlRwTEpMMVBsN0pVS2ZxQVpLRGxVV2Z6cGV3OE1ETVMzN1FodmVQc24va2RwUVZ0bzlJZWcwSFhBcXlmZHNaCmY2SnhnWktRZlFDcmJSRDJHdi9vTlVUZWJ6VjNWVm9ueEpUYmFrZFNuOHR0cCtLWFlzTUYvQy8wR29sL1JkS1MKK3NLeGdoVFFnSTBuQmVyQTNMdHRqellackliQVo0RXBRNmc0ZFFJREFRQUJBb0lCQUE1cXFDZEI3bFZJckNwTAo2WHMyemxNS0IvTHorVlh0ZlVIcVJ2cFpZOVRuVFRRWEpNUitHQ2l3WGZSYmIzOGswRGloeVhlU2R2OHpMZUxqCk9MZWZleC9CRGt5R1lTRE4rdFE3MUR2L3hUOU51cjcveWNlSTdXT1k4UWRjT2lFd2IwVFNVRmN5bS84RldVenIKdHFaVGhJVXZuL2dkSG9uajNmY1ZKb2ZBYnFwNVBrLzVQd2hFSU5Pdm1FTFZFQWl6VnBWVmwxNzRCSGJBRHU1Sgp2Nm9xc0h3SUhwNC9ZbGo2NHhFVUZ1ZFA2Tkp0M1B5Uk14dW5RcWd3SWZ1bktuTklRQmZEVUswSklLK1luZmlJClgrM1lQam5sWFU3UnhYRHRFa3pVWTFSTTdVOHJndHhiNWRQWnhocGgyOFlFVnJBVW5RS2RSTWdCVVNad3hWRUYKeFZqWmVwa0NnWUVBeEtHdXExeElHNTZxL2RHeGxDODZTMlp3SkxGajdydTkrMkxEVlZsL2h1NzBIekJ6dFFyNwpMUGhUZnl2SkVqNTcwQTlDbk4ybndjVEQ2U1dqbkNDbW9ESk10Ti9iZlJaMThkZTU4b0JCRDZ5S0JGbmV1eWkwCk1oVWFmSzN5M091bGkxMjBKS3lQb2hvN1lyWUxNazc1UzVEeVRGMlEyV3JYY0VQaTlVRzNkNzhDZ1lFQTBFY3YKTUhDbE9XZ1hJUVNXNCtreFVEVXRiOFZPVnpwYjd3UWZCQ3RmSTlvTDBnVWdBd1M2U0lub2tET3ozdEl4aXdkQQpWZTVzMklHbVAzNS9qdm5FbThnaE1XbEZ3eHB5ZUxKK0hraTl1dFNPblJGWHYvMk9JdjBYbE01RlY5blBmZ01NCkMxQ09zZklKaVREaXJFOGQrR2cxV010dWxkVGo4Z0JKazRQRXZNc0NnWUJoNHA4aWZVa0VQdU9lZ1hJbWM3QlEKY3NsbTZzdjF2NDVmQTVaNytaYkxwRTd3Njl6ZUJuNXRyNTFaVklHL1RFMjBrTFEzaFB5TE1KbmFpYnM5OE44aQpKb2diRHNta0pyZEdVbjhsNG9VQStZS25rZG1ZVURZTUxJZElCQXcvd0N0a0NweXdHUnRUdGoxVDhZMzNXR3N3CkhCTVN3dzFsdnBOTE52Qlg2WVFjM3dLQmdHOHAvenJJZExjK0lsSWlJL01EREtuMXFBbW04cGhGOHJtUXBvbFEKS05oMjBhWkh5LzB3Y2NpenFxZ0VvSFZHRk9GU2Zua2U1NE5yTjNOZUxmRCt5SHdwQmVaY2ZMcVVqQkoxbWpESgp2RkpTanNld2NQaHMrWWNkTkkvY3hGQU9WZHU0L3Aydlltb0JlQ3Q4SncrMnJwVmQ4Vk15U1JTNWF1eElVUHpsCjhJU2ZBb0dBVituYjJ3UGtwOVJ0NFVpdmR0MEdtRjErQ052YzNzY3JYb3RaZkt0TkhoT0o2UTZtUkluc2tpRWgKVnFQRjZ6U1BnVmdrT1hmU0xVQ3Y2cGdWR2J5d0plRWo1SElQRHFuU25vNFErZFl2TXozcWN5d1hLbFEyUjZpcAo3VE0wWHNJaGFMRDFmWUNjaDhGVHNiZHNrQUNZUHpzeEdBa1l2TnRDcDI5WExCRmZWbkE9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==
# 分发.kube/config到各节点。
[root@k8s-master ~]# scp .kube/config [email protected]:/etc/kubernetes/ 
[root@k8s-master ~]# scp .kube/config [email protected]:/etc/kubernetes/
# 比如在node2的/etc/kubernetes/目录下则出现了config文件。
[root@k8s-node2 ~]# ls /etc/kubernetes/
bin  bootstrap.kubeconfig  config  kubelet  kubelet.kubeconfig  kube-proxy.kubeconfig  ssl  token.csv

配置kube-proxy

脚本方式配置

(1)编写kube-proxy.sh脚本内容如下(在各node上编写该脚本):

#!/bin/bash

NODE_ADDRESS=${1:-"172.16.4.13"}

cat <<EOF >/etc/kubernetes/kube-proxy

KUBE_PROXY_ARGS="--logtostderr=true \
--v=4 \
--hostname-override=${NODE_ADDRESS} \
--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig"

EOF

cat <<EOF >/usr/lib/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \$KUBE_PROXY_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload && systemctl enable kube-proxy
systemctl restart kube-proxy && systemctl status kube-proxy
  • --hostname-override 参数值必须与 kubelet 的值一致,否则 kube-proxy 启动后会找不到该 Node,从而不会创建任何 iptables 规则;
  • kube-proxy 根据 --cluster-cidr 判断集群内部和外部流量,指定 --cluster-cidr 或 --masquerade-all 选项后 kube-proxy 才会对访问 Service IP 的请求做 SNAT(追加该参数的示例见本列表之后);
  • --kubeconfig 指定的配置文件嵌入了 kube-apiserver 的地址、用户名、证书、秘钥等请求和认证信息;
  • 预定义的 RoleBinding cluster-admin 将User system:kube-proxy 与 Role system:node-proxier 绑定,该 Role 授予了调用 kube-apiserver Proxy 相关 API 的权限;

完整 unit 见 kube-proxy.service
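如果希望 kube-proxy 对访问 Service IP 的请求做 SNAT,可以在脚本生成的 KUBE_PROXY_ARGS 中补充 --cluster-cidr。下面是一个参考片段(其中 172.30.0.0/16 为按本文 Pod 地址段推测的假设值,请按自己的 Pod CIDR 填写):

# /etc/kubernetes/kube-proxy 中的 KUBE_PROXY_ARGS 示例(追加了 --cluster-cidr)
KUBE_PROXY_ARGS="--logtostderr=true \
--v=4 \
--hostname-override=172.16.4.13 \
--cluster-cidr=172.30.0.0/16 \
--kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig"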

(2)执行脚本

# 首先将前端master的kube-proxy命令拷贝至各个节点。
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy [email protected]:/usr/local/bin/   
[root@k8s-master ~]# scp /usr/local/bin/kube-proxy [email protected]:/usr/local/bin/
# 并在各个节点上更改执行权限。
chmod +x kube-proxy.sh
[root@k8s-node2 ~]# ./kube-proxy.sh 172.16.4.14
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-proxy.service to /usr/lib/systemd/system/kube-proxy.service.
● kube-proxy.service - Kubernetes Proxy
   Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2019-06-14 16:01:47 CST; 39ms ago
 Main PID: 117068 (kube-proxy)
    Tasks: 10
   Memory: 8.8M
   CGroup: /system.slice/kube-proxy.service
           └─117068 /usr/local/bin/kube-proxy --logtostderr=true --v=4 --hostname-override=172.16.4.14 --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig

Jun 14 16:01:47 k8s-node2 systemd[1]: Started Kubernetes Proxy.

(3)如前文所述,--kubeconfig=/etc/kubernetes/kubelet.kubeconfig 指定的 kubelet.kubeconfig 文件在 kubelet 首次启动前并不存在,通过 CSR 请求后才会自动生成;如果不想走 CSR 流程,可以直接把 master 上已包含 admin 认证信息的 ~/.kube/config 拷贝到各 node 并重命名为 kubelet.kubeconfig,所有 node 节点可以共用同一个文件:

[root@k8s-master ~]# scp .kube/config [email protected]:/etc/kubernetes/
[root@k8s-node1 ~]# mv config kubelet.kubeconfig
[root@k8s-master ~]# scp .kube/config [email protected]:/etc/kubernetes/
[root@k8s-node2 ~]# mv config kubelet.kubeconfig

验证测试

# 以下操作在master节点上运行。
[root@k8s-master ~]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.13   Ready    <none>   66s     v1.14.3
172.16.4.14   Ready    <none>   7m14s   v1.14.3

[root@k8s-master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-0               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"} 

# 以nginx服务测试集群可用性
[root@k8s-master ~]# kubectl run nginx --replicas=3 --labels="run=load-balancer-example" --image=nginx  --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/nginx created
[root@k8s-master ~]# kubectl expose deployment nginx --type=NodePort --name=example-service
service/example-service exposed

[root@k8s-master ~]# kubectl describe svc example-service
Name:                     example-service
Namespace:                default
Labels:                   run=load-balancer-example
Annotations:              <none>
Selector:                 run=load-balancer-example
Type:                     NodePort
IP:                       10.10.10.222
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  40905/TCP
Endpoints:                172.17.0.2:80,172.17.0.2:80,172.17.0.3:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

# 在node节点上访问
[root@k8s-node1 ~]# curl "10.10.10.222:80"
# 返回 nginx 欢迎页(HTML 标签此处省略),说明 Service 转发正常:
Welcome to nginx!
If you see this page, the nginx web server is successfully installed and working. Further configuration is required.
For online documentation and support please refer to nginx.org.
Commercial support is available at nginx.com.
Thank you for using nginx.

# 外网测试访问
[root@k8s-master ~]# kubectl get svc
NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
example-service   NodePort    10.10.10.222   <none>        80:40905/TCP   6m26s
kubernetes        ClusterIP   10.10.10.1     <none>        443/TCP        21h
# 由上可知,服务暴露的外网端口为40905,输入172.16.4.12:40905即可访问。
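也可以在能访问到节点 IP 的外部主机上用 curl 做一次快速验证(示例命令,40905 为上面随机分配的 NodePort,IP 以实际节点为准):

# 在集群外主机上执行,返回 200 即表示 NodePort 暴露成功
curl -I http://172.16.4.12:40905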

DNS服务搭建与配置

从k8s v1.11版本开始,Kubernetes集群的DNS服务由CoreDNS提供。它是CNCF基金会的一个项目,使用Go语言实现的高性能、插件式、易扩展的DNS服务端。它解决了KubeDNS的一些问题,如dnsmasq的安全漏洞,externalName不能使用stubDomains设置等。

安装CoreDNS插件

官方的yaml文件目录:https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns/coredns

在部署CoreDNS应用前,至少需要创建一个ConfigMap,一个Deployment和一个Service共3个资源对象。在启用了RBAC的集群中,还可以设置ServiceAccount、ClusterRole、ClusterRoleBinding对CoreDNS容器进行权限限制。

(1)为了起到镜像加速的作用,首先将docker的配置源更改为国内阿里云

cat << EOF > /etc/docker/daemon.json
{
      "registry-mirrors":["https://registry.docker-cn.com","https://h23rao59.mirror.aliyuncs.com"]
}
EOF

# 重新载入配置并重启docker
[root@k8s-master ~]# systemctl daemon-reload && systemctl restart docker

(2)此处将svc,configmap,ServiceAccount等写在一个yaml文件里,coredns.yaml内容见下。

[root@k8s-master ~]# cat coredns.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kube-dns
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      containers:
      - args:
        - -conf
        - /etc/coredns/Corefile
        image:  docker.io/fengyunpan/coredns:1.2.6
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: coredns
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          procMount: Default
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/coredns
          name: config-volume
          readOnly: true
      dnsPolicy: Default
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: coredns
      serviceAccountName: coredns
      terminationGracePeriodSeconds: 30
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: Corefile
            path: Corefile
          name: coredns
        name: config-volume

---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: KubeDNS
  name: kube-dns
  namespace: kube-system
spec:
  clusterIP: 10.10.10.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns

  • clusterIP: 10.10.10.2 是集群 DNS 服务的 ClusterIP,注意按自己的 Service 网段修改。并且需要在各 node 节点的 kubelet 启动参数中加入以下两个参数:
    • --cluster-dns=10.10.10.2:DNS 服务的 ClusterIP 地址。
    • --cluster-domain=cluster.local:在 DNS 服务中设置的域名。

然后重启kubelet服务。
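下面是一个参考操作(假设 kubelet 的启动参数写在 /etc/kubernetes/kubelet 的 KUBELET_ARGS 中,路径与变量名请按自己前面的配置调整):

# 在各 node 上确认已加入这两个参数,然后重启 kubelet
grep -E 'cluster-dns|cluster-domain' /etc/kubernetes/kubelet
systemctl daemon-reload && systemctl restart kubelet && systemctl status kubelet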

(3)通过kubectl create创建CoreDNS服务。

[root@k8s-master ~]# kubectl create -f coredns.yaml 
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.extensions/coredns created
service/kube-dns created
[root@k8s-master ~]# kubectl get all -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
pod/coredns-5fc7b65789-rqk6f   1/1     Running   0          20s

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns   ClusterIP   10.10.10.2   <none>        53/UDP,53/TCP   20s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   1/1     1            1           20s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-5fc7b65789   1         1         1       20s

(4)验证DNS服务

接下来使用一个带有nslookup工具的Pod来验证DNS服务是否能正常工作:

  • 创建busybox.yaml内容如下:

    [root@k8s-master ~]# cat busybox.yaml 
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
      namespace: default
    spec:
      containers:
      - name: busybox
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/busybox
        command:
          - sleep
          - "3600"
        imagePullPolicy: IfNotPresent
      restartPolicy: Always
    
    
  • 采用kubectl apply命令创建pod

    [root@k8s-master ~]# kubectl apply -f busybox.yaml
    pod/busybox created
    # 采用kubectl describe命令发现busybox创建成功
    [root@k8s-master ~]# kubectl describe po/busybox
    .......
    Events:
      Type    Reason     Age   From                  Message
      ----    ------     ----  ----                  -------
      Normal  Scheduled  4s    default-scheduler     Successfully assigned default/busybox to 172.16.4.13
      Normal  Pulling    4s    kubelet, 172.16.4.13  Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Pulled     1s    kubelet, 172.16.4.13  Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/busybox"
      Normal  Created    1s    kubelet, 172.16.4.13  Created container busybox
      Normal  Started    1s    kubelet, 172.16.4.13  Started container busybox
    
    
  • 在容器成功启动后,通过kubectl exec nslookup进行测试。

[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes
Address 1: 10.10.10.1 kubernetes.default.svc.cluster.local

注意:如果某个Service属于不同的命名空间,那么在进行Service查找时,需要补充Namespace的名称,组合完整的域名。下面以查找kube-dns服务为例,将其所在的Namespace“kube-system”补充在服务名之后,用“.”连接为“kube-dns.kube-system”,即可查询成功:

# 错误案例,没有指定namespace
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns
nslookup: can't resolve 'kube-dns'
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

command terminated with exit code 1

# 成功案例。
[root@k8s-master ~]# kubectl exec busybox -- nslookup kube-dns.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kube-dns.kube-system
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local
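也可以直接使用完整域名(服务名.命名空间.svc.集群域名)进行验证,效果与上面一致:

# 使用完整 FQDN 查询(集群域名为前面配置的 cluster.local)
kubectl exec busybox -- nslookup kube-dns.kube-system.svc.cluster.local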

安装dashboard插件

Kubernetes的Web UI网页管理工具kubernetes-dashboard可提供部署应用、资源对象管理、容器日志查询、系统监控等常用的集群管理功能。为了在页面上显示系统资源的使用情况,要求部署Metrics Server。参考:

dashboard官方文件目录:

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dashboard

由于 kube-apiserver 启用了 RBAC 授权,而官方源码目录的 dashboard-controller.yaml 没有定义授权的 ServiceAccount,所以后续访问 API server 的 API 时会被拒绝,不过从 k8s v1.8.3 起官方文档提供了 dashboard.rbac.yaml 文件。

(1)创建部署文件kubernetes-dashboard.yaml,其内容如下:

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# ------------------- Dashboard Secret ------------------- #

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque

---
# ------------------- Dashboard Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Role & Role Binding ------------------- #

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system

---
# ------------------- Dashboard Deployment ------------------- #

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: lizhenliang/kubernetes-dashboard-amd64:v1.10.1
        ports:
        - containerPort: 8443
          protocol: TCP
        args:
          - --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          # - --apiserver-host=http://my-address:port
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTPS
            path: /
            port: 8443
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule

---
# ------------------- Dashboard Service ------------------- #

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard

(2)查看创建状态

[root@k8s-master ~]# kubectl get all -n kube-system | grep dashboard

pod/kubernetes-dashboard-7df98d85bd-jbwh2   1/1     Running   0          18m

service/kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   18m

deployment.apps/kubernetes-dashboard   1/1     1            1           18m
replicaset.apps/kubernetes-dashboard-7df98d85bd   1         1         1       18m

(3)此时可以通过node节点的41498端口进行访问。输入:https://172.16.4.13:41498 或 https://172.16.4.14:41498。
二进制部署K8S集群从0到1_第1张图片
并且通过之前的CoreDNS能够解析到其服务的IP地址:

[root@k8s-master ~]# kubectl get svc -n kube-system
NAME                   TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
kube-dns               ClusterIP   10.10.10.2    <none>        53/UDP,53/TCP   5h38m
kubernetes-dashboard   NodePort    10.10.10.91   <none>        443:41498/TCP   26m
[root@k8s-master ~]# kubectl exec busybox -- nslookup kubernetes-dashboard.kube-system
Server:    10.10.10.2
Address 1: 10.10.10.2 kube-dns.kube-system.svc.cluster.local

Name:      kubernetes-dashboard.kube-system
Address 1: 10.10.10.91 kubernetes-dashboard.kube-system.svc.cluster.local

(4)创建SA并绑定cluster-admin管理员集群角色

[root@k8s-master ~]# kubectl create serviceaccount dashboard-admin -n kube-system
serviceaccount/dashboard-admin created
[root@k8s-master ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
# 查看已创建的serviceaccount
[root@k8s-master ~]# kubectl get secret -n kube-system | grep admin
dashboard-admin-token-69zsx        kubernetes.io/service-account-token   3      65s
# 查看生成的token的具体信息并将token值复制到浏览器中,采用令牌登录。
[root@k8s-master ~]# kubectl describe secret dashboard-admin-token-69zsx -n kube-system
Name:         dashboard-admin-token-69zsx
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: dashboard-admin
              kubernetes.io/service-account.uid: dfe59297-8f46-11e9-b92b-e67418705759

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1359 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tNjl6c3giLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZGZlNTkyOTctOGY0Ni0xMWU5LWI5MmItZTY3NDE4NzA1NzU5Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.Wl6WiT6MZ-37ArWhPuhudac5S1Y8v2GxiUdNcy4hIwHQ1EdtzaAlvpx1mLZsQoDYJCeM6swVtNgJwhO5ESZAYQVi9xCrXsQcEDIeBkjyzpu6U4XHmab7SuS0_KEsGXhe57XKq86ogK9bAyNvNWE497V2giJJy5eR6CHKH3GR6mIwTQDSKEf-GfDfs9SHvQxRjchsrYLJLS3B_XfZyNHFXcieMZHy7V7Ehx2jMzwh6WNk6Mqk5N-IlZQRxmTBHTe3i9efN8r7CjvRhZdKc5iF6V4eG0QWkxR95WOzgV2QCCyLh4xEJw895FlHFJ1oTR2sUIRugnzyfqZaPQxdXcrc7Q
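如果不想手动查找 secret 名称,也可以用下面这条命令直接取出 token(示例写法,利用的是前面创建的 dashboard-admin 这个 ServiceAccount):

# 一条命令提取 dashboard-admin 的登录 token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}') | awk '/^token:/{print $2}'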

(5)在浏览器中选择token方式登录,即可查看到集群的状态:
二进制部署K8S集群从0到1_第2张图片

  • 注意:访问dashboard实际上有三种方式,上述过程只演示了第一种方式:
    • kubernetes-dashboard 服务暴露了 NodePort,可以使用 http://NodeIP:nodePort 地址访问 dashboard。
    • 通过 API server 访问 dashboard(https 6443端口和http 8080端口方式)。
    • 通过 kubectl proxy 访问 dashboard。

采用kubectl proxy访问dashboard

(1)启动代理

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8086 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8086

(2)访问dashboard

访问URL:http://172.16.4.12:8086/ui 自动跳转到:http://172.16.4.12:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default
二进制部署K8S集群从0到1_第3张图片

安装heapster插件

准备镜像

到 heapster release 页面 下载最新版本的 heapster。

wget https://github.com/kubernetes-retired/heapster/archive/v1.5.4.tar.gz
tar zxvf v1.5.4.tar.gz
[root@k8s-master ~]# cd heapster-1.5.4/deploy/kube-config/influxdb/ && ls
grafana.yaml  heapster.yaml  influxdb.yaml

(1)我们修改heapster.yaml后内容如下:

# ------------------- Heapster Service Account ------------------- #

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system

---
# ------------------- Heapster Role & Role Binding ------------------- #

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
subjects:
  - kind: ServiceAccount
    name: heapster
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
# ------------------- Heapster Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-amd64:v1.5.3
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
---
# ------------------- Heapster Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster

(2)我们修改influxdb.yaml后内容如下:

[root@k8s-master influxdb]# cat influxdb.yaml 
# ------------------- Influxdb Deployment ------------------- #
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-influxdb-amd64:v1.3.3
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
# ------------------- Influxdb Service ------------------- #

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    name: http
  - port: 8083
    targetPort: 8083
    name: admin
  selector:
    k8s-app: influxdb
---
#-------------------Influxdb Cm-----------------#
apiVersion: v1
kind: ConfigMap
metadata:
  name: influxdb-config
  namespace: kube-system
data:
  config.toml: |
    reporting-disabled = true
    bind-address = ":8088"
    [meta]
      dir = "/data/meta"
      retention-autocreate = true
      logging-enabled = true
    [data]
      dir = "/data/data"
      wal-dir = "/data/wal"
      query-log-enabled = true
      cache-max-memory-size = 1073741824
      cache-snapshot-memory-size = 26214400
      cache-snapshot-write-cold-duration = "10m0s"
      compact-full-write-cold-duration = "4h0m0s"
      max-series-per-database = 1000000
      max-values-per-tag = 100000
      trace-logging-enabled = false
    [coordinator]
      write-timeout = "10s"
      max-concurrent-queries = 0
      query-timeout = "0s"
      log-queries-after = "0s"
      max-select-point = 0
      max-select-series = 0
      max-select-buckets = 0
    [retention]
      enabled = true
      check-interval = "30m0s"
    [admin]
      enabled = true
      bind-address = ":8083"
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
    [shard-precreation]
      enabled = true
      check-interval = "10m0s"
      advance-period = "30m0s"
    [monitor]
      store-enabled = true
      store-database = "_internal"
      store-interval = "10s"
    [subscriber]
      enabled = true
      http-timeout = "30s"
      insecure-skip-verify = false
      ca-certs = ""
      write-concurrency = 40
      write-buffer-size = 1000
    [http]
      enabled = true
      bind-address = ":8086"
      auth-enabled = false
      log-enabled = true
      write-tracing = false
      pprof-enabled = false
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
      https-private-key = ""
      max-row-limit = 10000
      max-connection-limit = 0
      shared-secret = ""
      realm = "InfluxDB"
      unix-socket-enabled = false
      bind-socket = "/var/run/influxdb.sock"
    [[graphite]]
      enabled = false
      bind-address = ":2003"
      database = "graphite"
      retention-policy = ""
      protocol = "tcp"
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "1s"
      consistency-level = "one"
      separator = "."
      udp-read-buffer = 0
    [[collectd]]
      enabled = false
      bind-address = ":25826"
      database = "collectd"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "10s"
      read-buffer = 0
      typesdb = "/usr/share/collectd/types.db"
    [[opentsdb]]
      enabled = false
      bind-address = ":4242"
      database = "opentsdb"
      retention-policy = ""
      consistency-level = "one"
      tls-enabled = false
      certificate = "/etc/ssl/influxdb.pem"
      batch-size = 1000
      batch-pending = 5
      batch-timeout = "1s"
      log-point-errors = true
    [[udp]]
      enabled = false
      bind-address = ":8089"
      database = "udp"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      read-buffer = 0
      batch-timeout = "1s"
      precision = ""
    [continuous_queries]
      log-enabled = true
      enabled = true
      run-interval = "1s"

(3)我们修改grafana.yaml后文件内容如下:

[root@k8s-master influxdb]# cat grafana.yaml 
#------------Grafana Deployment----------------#

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: registry.cn-hangzhou.aliyuncs.com/google_containers/heapster-grafana-amd64:v4.4.3
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        #- mountPath: /etc/ssl/certs
        #  name: ca-certificates
        #  readOnly: true
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        #- name: GF_SERVER_HTTP_PORT
        - name: GRAFANA_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
          # value: /
      volumes:
      # - name: ca-certificates
      #  hostPath:
      #    path: /etc/ssl/certs
      - name: grafana-storage
        emptyDir: {}
---
#------------Grafana Service----------------#

apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  # type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana

执行所有定义文件

[root@k8s-master influxdb]# pwd
/root/heapster-1.5.4/deploy/kube-config/influxdb

[root@k8s-master influxdb]# ls
grafana.yaml  heapster.yaml  influxdb.yaml

[root@k8s-master influxdb]# kubectl create -f .
deployment.extensions/monitoring-grafana created
service/monitoring-grafana created
serviceaccount/heapster created
clusterrolebinding.rbac.authorization.k8s.io/heapster created
service/heapster created
deployment.extensions/heapster created
deployment.extensions/monitoring-influxdb created
service/monitoring-influxdb created
configmap/influxdb-config created
Error from server (AlreadyExists): error when creating "heapster.yaml": serviceaccounts "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": clusterrolebindings.rbac.authorization.k8s.io "heapster" already exists
Error from server (AlreadyExists): error when creating "heapster.yaml": services "heapster" already exists

检查执行结果

# 检查Deployment
[root@k8s-master influxdb]# kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster               1/1     1            1           10m
monitoring-grafana     1/1     1            1           10m
monitoring-influxdb    1/1     1            1           10m

# 检查Pods
[root@k8s-master influxdb]# kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
heapster-75d646bf58-9x9tz               1/1     Running   0          10m
monitoring-grafana-77997bd67d-5khvp     1/1     Running   0          10m
monitoring-influxdb-7d6c5fb944-jmrv6    1/1     Running   0          10m
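如需进一步确认 heapster 正在正常采集指标,可以查看其 Pod 日志(Pod 名称取自上面的输出,请以自己集群的实际名称为准):

# 查看 heapster 日志,确认没有持续报错
kubectl logs -n kube-system heapster-75d646bf58-9x9tz --tail=20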

访问各dashboard界面

错误一:system:anonymous问题

访问dashboard网页时,可能出现以下问题:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "services \"heapster\" is forbidden: User \"system:anonymous\" cannot get resource \"services/proxy\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "name": "heapster",
    "kind": "services"
  },
  "code": 403
}

分析问题:Kubernetes API Server 新增了 --anonymous-auth 选项,允许匿名请求访问 secure port。没有被其他 authentication 方法拒绝的请求即 Anonymous requests,这样的匿名请求的 username 为 system:anonymous,归属的组为 system:unauthenticated,并且该选项默认开启。这样一来,当采用 chrome 浏览器访问 dashboard UI 时很可能无法弹出用户名、密码输入对话框,导致后续 authorization 失败。为了保证用户名、密码输入对话框的弹出,需要将 --anonymous-auth 设置为 false。
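参考做法如下(假设 kube-apiserver 的启动参数集中写在某个环境变量文件或 systemd unit 中,文件路径与变量名为假设,请按自己 master 节点的实际配置调整):

# 在 kube-apiserver 的启动参数末尾追加 --anonymous-auth=false,然后重启服务
# 例如:KUBE_API_ARGS="... --anonymous-auth=false"
systemctl daemon-reload && systemctl restart kube-apiserver && systemctl status kube-apiserver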

  • 再次访问dashboard,发现多了CPU使用率和内存使用率的表格:
    二进制部署K8S集群从0到1_第4张图片

(2)访问grafana页面

  • 通过kube-apiserver访问:

获取 monitoring-grafana 服务 URL

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

访问浏览器URL:https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
二进制部署K8S集群从0到1_第5张图片

  • 通过kubectl proxy访问:

创建代理

[root@k8s-master ~]# kubectl proxy --address='172.16.4.12' --port=8084 --accept-hosts='^*$'
Starting to serve on 172.16.4.12:8084
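代理启动后,按与上面通过 kube-apiserver 访问时相同的路径规则即可打开 grafana(以下 URL 为按该规则推断的示例,8084 为上面代理监听的端口):

# 通过 kubectl proxy 访问 grafana,返回页面内容即说明代理可用;浏览器打开同一 URL 即可看到界面
curl -s http://172.16.4.12:8084/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy/ | head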

访问influxdb admin UI

获取 influxdb http 8086 映射的 NodePort

[root@k8s-master influxdb]# kubectl get svc -n kube-system|grep influxdb
monitoring-influxdb    NodePort    10.10.10.154   <none>        8086:43444/TCP,8083:49123/TCP   53m

通过 kube-apiserver 的非安全端口访问 influxdb 的 admin UI 界面: http://172.16.4.12:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/

在页面的 “Connection Settings” 的 Host 中输入 node IP,Port 中输入 8086 映射的 nodePort(如上面的 43444),点击 “Save” 即可(我的集群中的地址是 172.16.4.12:43444)。

  • 错误一:通过kube-apiserver访问不到influxdb dashboard,出现yaml文件内容。
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "monitoring-influxdb",
    "namespace": "kube-system",
    "selfLink": "/api/v1/namespaces/kube-system/services/monitoring-influxdb",
    "uid": "22c9ab6c-8f72-11e9-b92b-e67418705759",
    "resourceVersion": "215237",
    "creationTimestamp": "2019-06-15T13:33:18Z",
    "labels": {
      "kubernetes.io/cluster-service": "true",
      "kubernetes.io/name": "monitoring-influxdb",
      "task": "monitoring"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "http",
        "protocol": "TCP",
        "port": 8086,
        "targetPort": 8086,
        "nodePort": 43444
      },
      {
        "name": "admin",
        "protocol": "TCP",
        "port": 8083,
        "targetPort": 8083,
        "nodePort": 49123
      }
    ],
    "selector": {
      "k8s-app": "influxdb"
    },
    "clusterIP": "10.10.10.154",
    "type": "NodePort",
    "sessionAffinity": "None",
    "externalTrafficPolicy": "Cluster"
  },
  "status": {
    "loadBalancer": {
      
    }
  }
}

安装EFK插件

在Kubernetes集群中,一个完整的应用或服务都会涉及为数众多的组件运行,各组件所在的Node及实例数量都是可变的。日志子系统如果不做集中化管理,则会给系统的运维支撑造成很大的困难,因此有必要在集群层面对日志进行统一收集和检索等工作。

在容器中输出到控制台的日志,都会以“*-json.log”的命名方式保存到/var/lib/docker/containers/目录下,这就为日志采集和后续处理奠定了基础。

Kubernetes推荐用Fluentd+Elasticsearch+Kibana完成对系统和容器日志的采集、查询和展现工作。

部署统一日志管理系统,需要以下两个前提条件:

  • API Server正确配置了CA证书。
  • DNS服务启动、运行。

系统部署架构

二进制部署K8S集群从0到1_第6张图片
我们通过在每台node上部署一个以DaemonSet方式运行的fluentd来收集每台node上的日志。Fluentd将docker日志目录/var/lib/docker/containers和/var/log目录挂载到Pod中,然后Pod会在node节点的/var/log/pods目录中创建新的目录,用于区分不同容器的日志输出,该目录下有一个日志文件链接到/var/lib/docker/containers目录下的容器日志输出。注意:两个目录下的日志都会汇集到ElasticSearch集群,最终通过Kibana完成和用户的交互工作。

这里有一个特殊需求:Fluentd必须在每个Node上运行,为了满足这一需求,我们通过以下几种方式部署Fluentd。

  • 直接在Node主机上部署Fluentd.
  • 利用kubelet的--config参数,为每个node都加载Fluentd Pod。
  • 利用DaemonSet让Fluentd Pod在每个Node上运行。

官方文件目录:https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch

配置EFK服务配置文件

创建目录盛放文件

[root@k8s-master ~]# mkdir EFK && cd EFK

配置EFK-RABC服务

[root@k8s-master EFK]# cat efk-rbac.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  name: efk
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: efk
subjects:
  - kind: ServiceAccount
    name: efk
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
# 注意配置的ServiceAccount为efk。

配置ElasticSearch服务

# 此处将官方的三个文档合并成了一个elasticsearch.yaml,内容如下:

[root@k8s-master EFK]# cat elasticsearch.yaml 
#------------ElasticSearch RBAC---------#

apiVersion: v1
kind: ServiceAccount
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "services"
  - "namespaces"
  - "endpoints"
  verbs:
  - "get"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: kube-system
  name: elasticsearch-logging
  labels:
    k8s-app: elasticsearch-logging
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: elasticsearch-logging
  namespace: kube-system
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: elasticsearch-logging
  apiGroup: ""
---

# -----------ElasticSearch Service--------------#
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
---

#-------------------ElasticSearch StatefulSet-------#
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v6.6.1
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  serviceName: elasticsearch-logging
  replicas: 2
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v6.7.2
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v6.7.2
    spec:
      serviceAccountName: elasticsearch-logging
      containers:
      - image: docker.elastic.co/elasticsearch/elasticsearch:6.6.1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: ES_JAVA_OPTS
          value: -Xms1024m -Xmx1024m
      volumes:
      - name: elasticsearch-logging
        emptyDir: {}
       # Elasticsearch requires vm.max_map_count to be at least 262144.
       # If your OS already sets up this number to a higher value, feel free
       # to remove this init container.
      initContainers:
      - image: alpine:3.6
        command: ["/sbin/sysctl", "-w", "vm.max_map_count=262144"]
        name: elasticsearch-logging-init
        securityContext:
          privileged: true

配置Fluentd服务的configmap,此处通过td-agent创建

# td-agent 提供了官方安装文档,步骤较为繁琐,可以直接使用其一键安装脚本:
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

# 正式配置configmap,其配置文件如下,可以自己手动创建。
[root@k8s-master fluentd-es-image]# cat td-agent.conf 
kind: ConfigMap
apiVersion: v1
metadata:
  name: td-agent-config
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  td-agent.conf: |
    <filter kubernetes.**>
      @type kubernetes_metadata
      tls-cert-file /etc/kubernetes/ssl/server.pem
      tls-private-key-file /etc/kubernetes/ssl/server-key.pem
      client-ca-file /etc/kubernetes/ssl/ca.pem
      service-account-key-file /etc/kubernetes/ssl/ca-key.pem
    </filter>

    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      type_name _doc
      include_tag_key true
      host 172.16.4.12
      port 9200
      logstash_format true
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

    <match fluent.**>
      type null
    </match>

    <source>
      type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    
# 注意将configmap创建在kube-system的名称空间下。
kubectl create configmap td-agent-config --from-file=./td-agent.conf -n kube-system

# 创建fluentd的DaemonSet
[root@k8s-master EFK]# cat fluentd.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: efk
      containers:
      - name: fluentd-es
        image: travix/fluentd-elasticsearch:1.22
        command:
          - '/bin/sh'
          - '-c'
          - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key : "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
# 此处采用了一个 dockerhub 上的公共镜像;官方镜像托管在 gcr.io 上,直接拉取可能受网络限制。

配置Kibana服务

[root@k8s-master EFK]# cat kibana.yaml 
#---------------Kibana Deployment-------------------#

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      serviceAccountName: efk
      containers:
      - name: kibana-logging
        image: docker.elastic.co/kibana/kibana-oss:6.6.1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://172.16.4.12:9200"
          # modified by gzr
          #  value: "http://elasticsearch-logging:9200"
          - name: "SERVER_BASEPATH"
            value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging/proxy"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP
---

#------------------Kibana Service---------------------#

apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging


  • 注意镜像位置为:docker.elastic.co/kibana/kibana-oss:6.6.1,如果读者需要更高版本的kibana,请自行更换。
  • 由于没有配置DNS,建议将 http://elasticsearch-logging:9200 直接替换成 elasticsearch-logging 服务的集群IP。

给Node设置标签

定义 DaemonSet fluentd-es-v1.22 时设置了 nodeSelector beta.kubernetes.io/fluentd-ds-ready=true ,所以需要在期望运行 fluentd 的 Node 上设置该标签;

[root@k8s-master EFK]# kubectl get nodes
NAME          STATUS   ROLES    AGE     VERSION
172.16.4.12   Ready    <none>   18h     v1.14.3
172.16.4.13   Ready    <none>   2d15h   v1.14.3
172.16.4.14   Ready    <none>   2d15h   v1.14.3

[root@k8s-master EFK]# kubectl label nodes 172.16.4.14 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.14" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.13 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.13" labeled

[root@k8s-master EFK]# kubectl label nodes 172.16.4.12 beta.kubernetes.io/fluentd-ds-ready=true
node "172.16.4.12" labeled

执行定义的文件

[root@k8s-master EFK]# kubectl create -f .
serviceaccount/efk created
clusterrolebinding.rbac.authorization.k8s.io/efk created
service/elasticsearch-logging created
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created
daemonset.extensions/fluentd-es-v1.22 created
deployment.apps/kibana-logging created
service/kibana-logging created

验证执行结果

[root@k8s-master EFK]# kubectl get po -n kube-system -o wide| grep -E 'elastic|fluentd|kibana'
elasticsearch-logging-0                 1/1     Running            0          115m    172.30.69.5   172.16.4.14   <none>           <none>
elasticsearch-logging-1                 1/1     Running            0          115m    172.30.20.8   172.16.4.13   <none>           <none>
fluentd-es-v1.22-4bmtm                  0/1     CrashLoopBackOff   16         58m     172.30.53.2   172.16.4.12   <none>           <none>
fluentd-es-v1.22-f9hml                  1/1     Running            0          58m     172.30.69.6   172.16.4.14   <none>           <none>
fluentd-es-v1.22-x9rf4                  1/1     Running            0          58m     172.30.20.9   172.16.4.13   <none>           <none>
kibana-logging-7db9f954ff-mkbhr         1/1     Running            0          25s     172.30.69.7   172.16.4.14   <none>           <none>

kibana Pod 第一次启动时会用较长时间(10-20分钟)来优化和 Cache 状态页面,可以 tailf 该 Pod 的日志观察进度。

[root@k8s-master EFK]# kubectl logs kibana-logging-7db9f954ff-mkbhr -n kube-system
{"type":"log","@timestamp":"2019-06-18T09:23:33Z","tags":["plugin","warning"],"pid":1,"path":"/usr/share/kibana/src/legacy/core_plugins/ems_util","message":"Skipping non-plugin directory at /usr/share/kibana/src/legacy/core_plugins/ems_util"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["warning","elasticsearch","config","deprecation"],"pid":1,"message":"Config key \"url\" is deprecated. It has been replaced with \"hosts\""}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:34Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["status","plugin:[email protected]","info"],"pid":1,"state":"green","message":"Status changed from yellow to green - Ready","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2019-06-18T09:23:35Z","tags":["listening","info"],"pid":1,"message":"Server running at http://0:5601"}
......
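
If Kibana stays at "Waiting for Elasticsearch" much longer than the expected 10-20 minutes, it is worth checking the Elasticsearch cluster health through the apiserver service proxy. A minimal sketch, assuming kubectl proxy is started on its default port 8001:

$ kubectl proxy --port=8001 &
$ curl "http://127.0.0.1:8001/api/v1/namespaces/kube-system/services/elasticsearch-logging/proxy/_cluster/health?pretty"

A "green" or "yellow" status means Elasticsearch is reachable; "red" or a connection error points at the elasticsearch-logging Pods rather than at Kibana.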

Access kibana

  1. Access via kube-apiserver:

Get the kibana service URL

[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://172.16.4.12:6443
Elasticsearch is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/elasticsearch-logging/proxy
Heapster is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/heapster/proxy
Kibana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy
KubeDNS is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
monitoring-grafana is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb:http/proxy

Open the following URL in a browser: https://172.16.4.12:6443/api/v1/namespaces/kube-system/services/kibana-logging/proxy
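
Access through the apiserver on port 6443 requires the browser to present a valid client certificate (or other credentials). An alternative sketch is to tunnel through kubectl proxy on the master, the same approach used for the dashboard earlier; the port and accept-hosts values below are example settings for a lab environment and should be tightened elsewhere:

$ kubectl proxy --address=0.0.0.0 --port=8001 --accept-hosts='^.*$' &
# then, from a workstation that can reach the master:
# http://172.16.4.12:8001/api/v1/namespaces/kube-system/services/kibana-logging/proxy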

  • Error 1: Kibana did not load properly. Check the server output for more information.

Solution:

  • Error 2: Accessing kibana returns a 503 error with the following body:
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
    
  },
  "status": "Failure",
  "message": "no endpoints available for service \"kibana-logging\"",
  "reason": "ServiceUnavailable",
  "code": 503
}
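
"no endpoints available" means the kibana-logging Service has no ready Pod behind it at that moment (for example while the Pod is still starting or crash-looping). A minimal sketch for confirming this; the k8s-app=kibana-logging label is the selector conventionally used by the EFK addon manifests and is an assumption here:

$ kubectl -n kube-system get endpoints kibana-logging                 # empty ENDPOINTS column = no ready Pod
$ kubectl -n kube-system get pods -l k8s-app=kibana-logging -o wide   # label selector assumed from the addon manifests
$ kubectl -n kube-system describe svc kibana-logging                  # shows the Service's actual selector and endpoints

Once the kibana-logging Pod is Running and Ready, the proxy URL above should return the Kibana UI instead of the 503.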

