Exploring Cross-Cluster Access in Karmada with CoreDNS and the multicluster Plugin

This article is based on Karmada v1.7.0 and explores how to access services across clusters using a consistent domain name.

I. Practicing the Official Example

Configure multi-cluster service discovery by following the official example. The detailed steps are as follows:

1. Deploy the workload

Take deploying a Deployment and a Service as an example. Create the Deployment and the Service on the Karmada control plane and distribute them to cluster member1 through a PropagationPolicy. The combined YAML for this step is as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: serve
spec:
  replicas: 2  # changed from 1 in the example to 2 for generality
  selector:
    matchLabels:
      app: serve
  template:
    metadata:
      labels:
        app: serve
    spec:
      containers:
      - name: serve
        image: jeremyot/serve:0a40de8
        args:
        - "--message='hello from cluster member1 (Node: {{env \"NODE_NAME\"}} Pod: {{env \"POD_NAME\"}} Address: {{addr}})'"
        env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
      dnsPolicy: ClusterFirst   # prefer the cluster's CoreDNS for resolution
---
apiVersion: v1
kind: Service
metadata:
  name: serve
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: serve
---
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: mcs-workload
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: serve
    - apiVersion: v1
      kind: Service
      name: serve
  placement:
    clusterAffinity:
      clusterNames:
        - member1
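After the PropagationPolicy is applied on the Karmada control plane, you can confirm that the workload actually landed in member1 (a sketch; the kubeconfig path and context names follow the kind setup used in this article):

kubectl --kubeconfig ~/.kube/members.config --context member1 get deploy,svc,pod -n default -o wide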

2. Create the ServiceExport and ServiceImport CRDs

The ServiceExport and ServiceImport CRDs need to be created on the control plane and then installed into member1 and member2 through ClusterPropagationPolicies. The YAML for this step is as follows:

# propagate ServiceExport CRD
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: serviceexport-policy
spec:
  resourceSelectors:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: serviceexports.multicluster.x-k8s.io
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
---
# propagate ServiceImport CRD
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: serviceimport-policy
spec:
  resourceSelectors:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: serviceimports.multicluster.x-k8s.io
  placement:
    clusterAffinity:
      clusterNames:
        - member1
        - member2
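Once the ClusterPropagationPolicies take effect, the two CRDs should be visible in both member clusters (a sketch):

kubectl --kubeconfig ~/.kube/members.config --context member1 get crd | grep multicluster.x-k8s.io
kubectl --kubeconfig ~/.kube/members.config --context member2 get crd | grep multicluster.x-k8s.io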

3. Export the Service from the member cluster

Export the Service from member1: on the Karmada control plane, create a ServiceExport for the Service together with a serve-export PropagationPolicy, so that Karmada can manage the ServiceExport in member1.

root@zishen:/home/btg/yaml/mcs# cat ServiceExport.yaml 
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: serve
---
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: serve-export-policy
spec:
  resourceSelectors:
    - apiVersion: multicluster.x-k8s.io/v1alpha1
      kind: ServiceExport
      name: serve
  placement:
    clusterAffinity:
      clusterNames:
        - member1
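After this policy takes effect, the ServiceExport should be present in member1 (a sketch):

kubectl --kubeconfig ~/.kube/members.config --context member1 get serviceexport serve -n default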

4. Import the Service into the member cluster

Import the Service into member2: likewise, create a ServiceImport and a PropagationPolicy on the Karmada control plane, so that Karmada can manage the ServiceImport in member2.

apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: serve
spec:
  type: ClusterSetIP
  ports:
  - port: 80
    protocol: TCP
---
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: serve-import-policy
spec:
  resourceSelectors:
    - apiVersion: multicluster.x-k8s.io/v1alpha1
      kind: ServiceImport
      name: serve
  placement:
    clusterAffinity:
      clusterNames:
        - member2

5. Test results

In member2, create a test pod that sends requests to the pods in member1. The traffic is relayed through the derived-serve Service.

Switch to member2:

root@zishen:/home/btg/yaml/mcs# export KUBECONFIG="$HOME/.kube/members.config"
root@zishen:/home/btg/yaml/mcs# kubectl config use-context member2
Switched to context "member2".
  • Test using the ClusterIP of the derived Service in member2
root@zishen:/home/btg/yaml/mcs# kubectl get svc -A
NAMESPACE     NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
default       derived-serve   ClusterIP   10.13.166.120   <none>        80/TCP                   4m37s
default       kubernetes      ClusterIP   10.13.0.1       <none>        443/TCP                  6d18h
kube-system   kube-dns        ClusterIP   10.13.0.10      <none>        53/UDP,53/TCP,9153/TCP   6d18h
# test the service by IP
root@zishen:/home/btg/yaml/mcs# kubectl --kubeconfig ~/.kube/members.config --context member2 run -i --rm --restart=Never --image=jeremyot/request:0a40de8 request -- --duration=3s --address=10.13.166.120
If you don't see a command prompt, try pressing enter.
2023/06/12 02:58:03 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
2023/06/12 02:58:04 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
pod "request" deleted
root@zishen:/home/btg/yaml/mcs# 

The test succeeds.

  • In member1, use the default Kubernetes domain name serve.default.svc.cluster.local
root@zishen:/home/btg/yaml/mcs# kubectl --kubeconfig ~/.kube/members.config --context member1 run -i --rm --restart=Never --image=jeremyot/request:0a40de8 request -- --duration=3s --address=serve.default.svc.cluster.local
If you don't see a command prompt, try pressing enter.
2023/06/12 03:28:58 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
2023/06/12 03:28:59 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
pod "request" deleted

The test succeeds.

  • In member2, use the shadow domain name derived-serve.default.svc.cluster.local
root@zishen:/home/btg/yaml/mcs# kubectl --kubeconfig ~/.kube/members.config --context member2 run -i --rm --restart=Never --image=jeremyot/request:0a40de8 request -- --duration=3s --address=derived-serve.default.svc.cluster.local
If you don't see a command prompt, try pressing enter.
2023/06/12 03:30:41 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
2023/06/12 03:30:42 'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-6l27j Address: 10.10.0.17)'
pod "request" deleted

The test succeeds.

At this point, the official example passes all tests.
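For reference, the relay described above can also be inspected directly in member2: the derived Service is created from the ServiceImport, and the EndpointSlice that Karmada syncs over points at the pod IPs in member1 (a sketch; the EndpointSlice name is generated by Karmada):

kubectl --kubeconfig ~/.kube/members.config --context member2 get svc derived-serve -n default
kubectl --kubeconfig ~/.kube/members.config --context member2 get endpointslices -n default -o wide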

II. Exploring a Consistent Domain Name Across Clusters

1. Overview of the approaches

The current proposal for MCS is KEP-1645: Multi-Cluster Services API.

There are three main kinds of implementation:

1) Shadow services.

A service with the same name, or with a prefix (typically derived-), is created in the client cluster, but the ENDPOINTS of its EndpointSlice point to the target pods. This approach has to handle conflicts with a local service of the same name; it is how Karmada implements MCS in v1.7.0 and earlier.

2) iptables and IP tunnels.

The representative of this category is Submariner. It sets up gateway nodes in the different clusters and implements multi-cluster communication through IP tunnels, so the domain name is resolved directly via the gateways, achieving the goal.

3) ServiceImport.

The principle is described in 1645-multi-cluster-services-api and Multi-Cluster DNS. The CoreDNS multicluster plugin covered in this article is an implementation of this category.

2. Exploring the CoreDNS multicluster plugin

The principle of the multicluster plugin is fairly simple: the ServiceImport in the client cluster must carry the original service name and the clusterIP it should resolve to (this IP can belong to the source cluster, provided the networks are connected, or to the local cluster). The multicluster plugin turns this information into CoreDNS resource records, so whenever a query falls into the zone that multicluster serves, it can be answered.
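In other words, with a ServiceImport named serve in namespace default whose ips field contains 10.10.0.5, and the multicluster zone set to clusterset.local (both configured later in this article), a query like the following should return that address (a sketch of the expected behavior; 10.13.0.10 is member2's kube-dns ClusterIP):

dig +short @10.13.0.10 serve.default.svc.clusterset.local A
# expected answer: 10.10.0.5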

1). Build CoreDNS with the multicluster plugin

Follow the official documentation of the multicluster plugin. Download CoreDNS; the version matching Kubernetes 1.26 is v1.9.3 (in my tests the latest v1.10.1 also works):

git clone https://github.com/coredns/coredns
cd coredns
git checkout v1.9.3
  • Add the multicluster plugin

Open the plugin.cfg file and add multicluster:

...
kubernetes:kubernetes
multicluster:github.com/coredns/multicluster
...
  • Build
cd ../ 
make
  • Build the image:

Run the build directly in the directory (this works for CoreDNS versions below v1.10; otherwise you need to upgrade Docker first, see the problem records below):

root@zishen:/usr/workspace/go/src/github.com/coredns# docker build -f Dockerfile -t registry.k8s.io/coredns/coredns:v1.9.3 ./

Check the image:

root@zishen:/usr/workspace/go/src/github.com/coredns# docker images|grep core
registry.k8s.io/coredns/coredns                         v1.9.3             9a15fc60cfea   27 seconds ago   49.8MB
  • Load it into kind:
root@zishen:/usr/workspace/go/src/github.com/coredns# kind load docker-image registry.k8s.io/coredns/coredns:v1.9.3 --name member2
Image: "registry.k8s.io/coredns/coredns:v1.9.3" with ID "sha256:9a15fc60cfea3f7e1b9847994d385a15af6d731f86b7f056ee868ac919255dca" not yet present on node "member2-control-plane", loading...
root@zishen:/usr/workspace/go/src/github.com/coredns# 
  • Restart CoreDNS:
root@zishen:/home/btg/yaml/mcs# kubectl delete pod -n kube-system          coredns-787d4945fb-mvsv4 
pod "coredns-787d4945fb-mvsv4" deleted
root@zishen:/home/btg/yaml/mcs# kubectl delete pod -n kube-system          coredns-787d4945fb-62nxv
pod "coredns-787d4945fb-62nxv" deleted
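To confirm that the rebuilt image is actually in use after the restart, check the image of the new CoreDNS pods (a sketch; k8s-app=kube-dns is the default label in kind/kubeadm clusters):

kubectl -n kube-system get pods -l k8s-app=kube-dns -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}'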

2). Configure CoreDNS permissions in member2

Add the following rule to the system:coredns ClusterRole so that CoreDNS can list and watch ServiceImports:

kubectl edit clusterrole system:coredns

...
- apiGroups:
  - multicluster.x-k8s.io
  resources:
  - serviceimports
  verbs:
  - list
  - watch
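If you prefer not to edit interactively, the same rule can be appended with a JSON patch (a sketch, equivalent to the edit above):

kubectl patch clusterrole system:coredns --type=json \
  -p='[{"op":"add","path":"/rules/-","value":{"apiGroups":["multicluster.x-k8s.io"],"resources":["serviceimports"],"verbs":["list","watch"]}}]'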

3). Configure the multicluster zone rule

Add a multicluster handling rule to the CoreDNS Corefile:

 kubectl edit configmap coredns -n kube-system
 ....
   Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        multicluster clusterset.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
...

[Note]

clusterset.local must not be the system default cluster.local, otherwise queries would be intercepted by the kubernetes plugin, which runs before multicluster. If you really need that, you have to move multicluster ahead of the kubernetes plugin in plugin.cfg before building; the side effects of doing so have not been fully tested and need careful analysis.
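Because the Corefile above includes the reload plugin, the ConfigMap change is picked up automatically after a short delay; you can confirm it from the CoreDNS logs (a sketch):

kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50 | grep -i reload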

4). Add a clusterIP to the ServiceImport in member2

Since Karmada does not yet populate the ips field of ServiceImport, we need to fill it in ourselves.
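The value to fill in can be looked up in member1; in this experiment it is the backend pod's IP (a sketch, using the app=serve label from the Deployment above):

kubectl --kubeconfig ~/.kube/members.config --context member1 get pod -l app=serve -n default -o wide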

  • Delete the ServiceImport for member2

Delete the ServiceImport on the Karmada control plane. For the concrete YAML, refer to the earlier step "Import the Service into the member cluster".

  • Create a new ServiceImport

Since member2 no longer has the ServiceImport, the shadow service (and its clusterIP) is gone. For debugging, the ips field temporarily uses the pod IP on the source side.

The YAML of the new ServiceImport is as follows:

apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceImport
metadata:
  name: serve
  namespace: default
spec:
  type: ClusterSetIP
  ports:
  - name: "http"
    port: 80
    protocol: TCP
  ips:
  - "10.10.0.5"

Switch to member2 and create it (see the command sketch below).
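A sketch, assuming the YAML above is saved as serviceimport.yaml:

kubectl --kubeconfig ~/.kube/members.config --context member2 apply -f serviceimport.yaml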

Once created, it looks like this:

root@zishen:/home/btg/yaml/mcs/test# kubectl get serviceImport -A
NAMESPACE   NAME    TYPE           IP              AGE
default     serve   ClusterSetIP   ["10.10.0.5"]   4h42m

5). Verification

You could use the method shown earlier, but for easier debugging this article creates a client pod directly.

  • Create a pod on member2
kubectl --kubeconfig ~/.kube/members.config --context member2 run -i --image=ubuntu:18.04 btg

Once it is created, press Ctrl+C to exit, then exec into the pod:

kubectl exec -it -n default btg bash
  • Install tools
apt-get update
apt-get install dnsutils
apt install iputils-ping
apt-get install net-tools 
apt install curl
apt-get install vim
  • Test the configurable domain name

Add clusterset.local to the search list in /etc/resolv.conf (the default list generated with dnsPolicy: ClusterFirst only covers the cluster.local suffixes), so that names under the multicluster zone can also be found via the search path:

root@btg:/# cat /etc/resolv.conf 
search default.svc.cluster.local svc.cluster.local cluster.local clusterset.local
nameserver 10.13.0.10
options ndots:5

Access the domain name serve.default.svc.clusterset.local:

root@btg:/# curl serve.default.svc.clusterset.local:8080
'hello from cluster member1 (Node: member1-control-plane Pod: serve-5899cfd5cd-dvxz8 Address: 10.10.0.7)'root@btg:/#

Note: because the IP we put into the ServiceImport is the backend pod IP rather than a Service clusterIP, the domain name resolves straight to the container, so port 8080 has to be specified.

The test succeeds.

So the remaining work is simply to have Karmada populate the ips field of the ServiceImport in member clusters.

III. Problem Records

1. Failed to watch *v1alpha1.ServiceImport

Symptom:

W0612 12:18:13.939070       1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1alpha1.ServiceImport: serviceimports.multicluster.x-k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "serviceimports" in API group "multicluster.x-k8s.io" at the cluster scope
E0612 12:18:13.939096       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.ServiceImport: failed to list *v1alpha1.ServiceImport: serviceimports.multicluster.x-k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "serviceimports" in API group "multicluster.x-k8s.io" at the cluster scope

Solution: add the RBAC permission:

root@zishen:/home/btg/yaml/mcs# kubectl edit clusterrole system:coredns

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2023-06-12T07:50:29Z"
  name: system:coredns
  resourceVersion: "225"
  uid: 51e7d961-29a6-43dc-ac0f-dbca68271e46
rules:
- apiGroups:
  - ""
  resources:
  - endpoints
  - services
  - pods
  - namespaces
  - ServiceImport
  verbs:
  - list
  - watch
...
- apiGroups:
  - multicluster.x-k8s.io
  resources:
  - serviceimports
  verbs:
  - list
  - watch

2. Building the image from the CoreDNS master branch fails with: invalid argument

After the binary builds successfully, building the image fails with the following error:

failed to parse platform : "" is an invalid component of "": platform specifier component must match "^[A-Za-z0-9_-]+$": invalid argument

Upgrading Docker solves it; in my case upgrading to 24.0.2 fixed the problem.

Note: do not simply run apt-get install docker-ce from the stock repositories, otherwise you get version 20 and the problem persists.

1) Switch apt to the Aliyun mirror.

Edit /etc/apt/sources.list and change its content to the following:

deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse

2) Edit the /etc/apt/sources.list.d/docker.list file (create it if it does not exist) and add the following line:

deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable 

3) Run the update

apt-get update
sudo apt-get upgrade

4) Install Docker

apt-get install docker-ce docker-ce-cli containerd.io  
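Verify the installed version afterwards (a sketch):

docker version --format '{{.Server.Version}}'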

3. dig reports WARNING: recursion requested but not available

The symptom is as follows:

root@btg:/# dig @10.13.0.10 serve.default.svc.cluster.local  A

; <<>> DiG 9.11.3-1ubuntu1.18-Ubuntu <<>> @10.13.0.10 serve.default.svc.cluster.local A
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 57327
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 24c073ed74240563 (echoed)
;; QUESTION SECTION:
;serve.default.svc.cluster.local. IN	A
...

Refer to the header plugin and add the following to the Corefile:

...
header {
  response set ra # set RecursionAvailable flag
}
...

IV. Requirements

1. The CoreDNS version must be v1.9.3 or later, otherwise the multicluster feature is not supported. The matching Kubernetes version must be at least v1.21.0.

2. multicluster currently does not support headless Services.

3. It is best to set dnsPolicy: ClusterFirst in the workload.

V. References

CoreDNS Part 2: Building and Installing External Plugins

MCP: cross-cluster pod access across multiple clouds

Pitfall guide: troubleshooting Kubernetes DNS (CoreDNS) resolution issues

Submariner principle explanation

Kubernetes CNI plugin selection and application scenarios

CoreDNS principles: Kubernetes service registration and discovery (Part 3), CoreDNS

Service principles: a brief look at Kubernetes Service

EndpointSlice principles: [Revisiting Cloud Native] Chapter 6, Container Basics, Section 6.4.9.5, Service Endpoint Slices

你可能感兴趣的:(karmada,k8s,kubernetes,云原生)