In "The Evolution of Multicluster Support in Istio 1.8" we introduced the four Istio multi-cluster deployment models and briefly walked through the deployment steps for the single-network Primary-Remote model. In this post we analyze the source code to see how Istio implements multi-cluster support.
The discussion is based on Istio 1.8 and covers two parts: the istioctl commands and the Pilot-discovery source.
Istioctl Commands
Istioctl provides several commands for multi-cluster support. The code lives under the istioctl/pkg/multicluster
path and contains the following subcommands:
- apply: update the clusters in a multi-cluster mesh based on the mesh topology
- describe: describe the status of the multi-cluster mesh's control plane
- generate: generate a cluster-specific control plane configuration based on the mesh description and runtime state
You can run any of these three commands with -h to get help:
$ istioctl x multicluster -h
Commands to assist in managing a multi-cluster mesh [Deprecated, it will be removed in Istio 1.9]
Usage:
istioctl experimental multicluster [command]
Aliases:
multicluster, mc
Available Commands:
apply Update clusters in a multi-cluster mesh based on mesh topology
describe Describe status of the multi-cluster mesh's control plane
generate generate a cluster-specific control plane configuration based on the mesh description and runtime state
Flags:
-h, --help help for multicluster
Global Flags:
--context string The name of the kubeconfig context to use
-i, --istioNamespace string Istio system namespace (default "istio-system")
-c, --kubeconfig string Kubernetes configuration file
-n, --namespace string Config namespace
Use "istioctl experimental multicluster [command] --help" for more information about a command.
- create-remote-secret: create a secret with credentials that allow Istio to access a remote Kubernetes API server.
For example, when deploying a multi-cluster model you will certainly run a command like the following (in this demonstration the remote cluster is named sgt-base-sg1-prod):
istioctl x create-remote-secret \
  --context="${CTX_REMOTE}" \
  --name=sgt-base-sg1-prod |
  kubectl apply -f - --context="${CTX_CONTROL}"
The command does two things:
- Against the remote cluster: it creates the istio-system namespace, and inside it the istio-reader-service-account and istiod-service-account service accounts together with the RBAC grants those two accounts need. On success it returns the Secret the control cluster requires.
- The Secret returned by the previous step is then applied to the control cluster.
Here is the Secret returned by an actual run:
# This file is autogenerated, do not edit.
apiVersion: v1
kind: Secret
metadata:
  annotations:
    networking.istio.io/cluster: sgt-base-sg1-prod
  creationTimestamp: null
  labels:
    istio/multiCluster: "true"
  name: istio-remote-secret-sgt-base-sg1-prod
  namespace: istio-system
stringData:
  sgt-base-sg1-prod: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJ0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01Ea3lNekF5TlRJME0xb1hEVE13TURreU1UQXlOVEkwTTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTWFJCk5DcW1McGNjTENGNDNqTDZET1phNnhUMU5kbm9yNkpWR0w5a0FNNGMzVDZDZ1ZYOUpDbGxxdmVDQkRMclgremEKcGQwZ1orNFZqZUtHWk9jdklnc3p2dDV4TTJoWDBBZ1BQMFFDNnl2bnc5VXBrOHBNcDFLVkV1L3pUSXFPTlplcAp0NmlGcjIya1dUaWgwYmhIeDQwc3JoQXZjWXM2NStlb240QmhBYTBGR1dreWM4dUZqRmRnT2hYS3hzd01EdkRiCmUzenlMc3ZOb2NvT3V1U2JrR3hUNmtKeGhmdHI4dEZnWGllM2dYSFJnSitQUUN6UElCM1JZdEsxMGdROHB6T1UKOTAwb3p0TlllZGg4MUhZcjZSV0ZDb1FBMXlpN2xEL3BUWlo4UnRkZTZQWmt0bStFNnJkaEI2a0ZkZmFtY3U4MgptamlQZGxmYWVrSXFCTGxoa1NFQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFHUHEzVllkWmFJZFdOMDk5OW5TV1RIV0E0VkYKMzROZ1pEVEdHY3QvWUpNWmZGRnVnSjlqRVBMdTZiSklrZFVHcHNCbkhvNUFsTHJZTjU2dnFkL0MrVTlOc2R2NwpnQ0FBTlNDMVArYktUZmVmWGpQd1dhY0R2RCtTZWIrTHhGUmF3NWZyNDZJNEtTRE12RUZ0T3JaRmhWL3AvQkF5ClZJT01GMDF3aCtOa045OVlWMUZ0S1pLRnd6WGVaM3N3TXBCek50a2daYzlDMjhvdlR5TGNFT05ucGk0dDRmc28KSGpYdkJubUVvak5UcmZtL3F3M1l6Y3dBNXUzekRoRlFkTU5PWlFWVk1EVmhzOFZBOXhyRk1iUFhCSWRiZmZRSApva3QvWkJ0WHRwQm9qaGZmYlJSR0pRQTBFbTk0WTRGNEhhSlFMM2QwMGRoSy9mL1Fiak5BUVhFVFhqRT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
        server: https://88876557684F299B0ED2.xxx.ap-southeast-1.eks.amazonaws.com
      name: sgt-base-sg1-prod
    contexts:
    - context:
        cluster: sgt-base-sg1-prod
        user: sgt-base-sg1-prod
      name: sgt-base-sg1-prod
    current-context: sgt-base-sg1-prod
    kind: Config
    preferences: {}
    users:
    - name: sgt-base-sg1-prod
      user:
        token: eyJhbGciOiJSUzI1NirbWFQRjFVeVI3WlZ2Qk9YQ0Qzb2FINl9xMkE5X0MzbXEwb2hVWFVnZjgifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJpc3Rpby1zeXN0ZW0iLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoiaXN0aW8tcmVhZGVyLXNlcnZpY2UtYWNjb3VudC10b2tlbi1wdHFmOSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJpc3Rpby1yZWFkZXItc2VydmljZS1hY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMzJlZDQwYzktNGNmNC00Y2EwLWI1YzYtZThhZTczNjFlMDI2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmlzdGlvLXN5c3RlbTppc3Rpby1yZWFkZXItc2VydmljZS1hY2NvdW50In0.WbtOZc0390Yq147gvOFdWsaxhEwAC7vaNzhKtlKIf9JXRIGZhkt91zPU_fJLGAlMlj9RSc5QMzQokLSvA_69fGlXnZpdiPvVBrmWJtOQ_tUNJCAL-MfBerZ1y7Kp6Itaw3j1t2M2Ksj5h1SuqfWdiBbNAwb5ehyVJoGpAxppSGdrLGbMWHH1iZCCz6T3WnPPmMfFktcgFDJYlHuuwRaIsuNgD-nUOrUM7-PQiv2sOGVy8EYbObl9AvcvlklZz5KSHfk6GkJ_RYYObFpy-M8ZOYEA2lTpeg5Wer65nlOXo_FYUQ1It4jsZdsuj9cctIQautT6ExhrG30oAhpamzKs8A
Note that the secret carries the label istio/multiCluster: "true".
The pilot-discovery code we look at later only processes secrets that carry this label.
Why is creating this secret important?
- It lets the control plane authenticate connection requests from workloads running in the remote cluster. Without API server access, the control plane would reject those requests.
- It enables discovery of service endpoints running in the remote cluster.
Pilot-discovery
Pilot-discovery holds a multicluster
object in its Server struct, defined as follows:
type Multicluster struct {
	WatchedNamespaces     string
	DomainSuffix          string
	ResyncPeriod          time.Duration
	serviceController     *aggregate.Controller
	XDSUpdater            model.XDSUpdater
	metrics               model.Metrics
	endpointMode          EndpointMode
	m                     sync.Mutex // protects remoteKubeControllers
	remoteKubeControllers map[string]*kubeController
	networksWatcher       mesh.NetworksWatcher
	// fetchCaRoot maps the certificate name to the certificate
	fetchCaRoot      func() map[string]string
	caBundlePath     string
	systemNamespace  string
	secretNamespace  string
	secretController *secretcontroller.Controller
	syncInterval     time.Duration
}
It holds the remote kube controllers along with multi-cluster specific settings.
The object is instantiated during the pilot-discovery component's bootstrap:
if err := s.initClusterRegistries(args); err != nil {
	return nil, fmt.Errorf("error initializing cluster registries: %v", err)
}
Based on the RegistryOptions passed in, this starts the secret controller to watch for remote clusters and initializes the Multicluster struct.
func (s *Server) initClusterRegistries(args *PilotArgs) (err error) {
	if hasKubeRegistry(args.RegistryOptions.Registries) {
		log.Info("initializing Kubernetes cluster registry")
		mc, err := controller.NewMulticluster(s.kubeClient,
			args.RegistryOptions.ClusterRegistriesNamespace,
			args.RegistryOptions.KubeOptions,
			s.ServiceController(),
			s.XDSServer,
			s.environment)
		if err != nil {
			log.Info("Unable to create new Multicluster object")
			return err
		}
		s.multicluster = mc
	}
	return nil
}
The core of this method is NewMulticluster:
func NewMulticluster(kc kubernetes.Interface, secretNamespace string, opts Options,
	serviceController *aggregate.Controller, xds model.XDSUpdater, networksWatcher mesh.NetworksWatcher) (*Multicluster, error) {
	remoteKubeController := make(map[string]*kubeController)
	if opts.ResyncPeriod == 0 {
		// make sure a resync time of 0 wasn't passed in.
		opts.ResyncPeriod = 30 * time.Second
		log.Info("Resync time was configured to 0, resetting to 30")
	}
	mc := &Multicluster{
		WatchedNamespaces:     opts.WatchedNamespaces,
		DomainSuffix:          opts.DomainSuffix,
		ResyncPeriod:          opts.ResyncPeriod,
		serviceController:     serviceController,
		XDSUpdater:            xds,
		remoteKubeControllers: remoteKubeController,
		networksWatcher:       networksWatcher,
		metrics:               opts.Metrics,
		fetchCaRoot:           opts.FetchCaRoot,
		caBundlePath:          opts.CABundlePath,
		systemNamespace:       opts.SystemNamespace,
		secretNamespace:       secretNamespace,
		endpointMode:          opts.EndpointMode,
		syncInterval:          opts.GetSyncInterval(),
	}
	mc.initSecretController(kc)
	return mc, nil
}
The Multicluster struct implements three main methods:
- AddMemberCluster: the callback invoked when a remote cluster is added. It sets up all the handlers needed to watch resources that are added, deleted, or changed on the remote cluster.
- DeleteMemberCluster: the callback invoked when a remote cluster is removed, i.e. when the cluster is no longer part of the mesh. It also cleans up caches so the remote cluster's resources are dropped.
- UpdateMemberCluster: simply runs DeleteMemberCluster followed by AddMemberCluster.
These three methods are passed to the secret controller held by the Multicluster object.
func (m *Multicluster) initSecretController(kc kubernetes.Interface) {
	m.secretController = secretcontroller.StartSecretController(kc,
		m.AddMemberCluster,
		m.UpdateMemberCluster,
		m.DeleteMemberCluster,
		m.secretNamespace,
		m.syncInterval)
}
The secret controller watches for secret changes, but it does not react to every secret. Only when a secret carries the istio/multiCluster: "true"
label, marking it as representing a remote cluster, does the controller act on it, and the action is to invoke the three methods described above.
secretsInformer := cache.NewSharedIndexInformer(
	&cache.ListWatch{
		ListFunc: func(opts meta_v1.ListOptions) (runtime.Object, error) {
			opts.LabelSelector = MultiClusterSecretLabel + "=true"
			return kubeclientset.CoreV1().Secrets(namespace).List(context.TODO(), opts)
		},
		WatchFunc: func(opts meta_v1.ListOptions) (watch.Interface, error) {
			opts.LabelSelector = MultiClusterSecretLabel + "=true"
			return kubeclientset.CoreV1().Secrets(namespace).Watch(context.TODO(), opts)
		},
	},
	&corev1.Secret{}, 0, cache.Indexers{},
)
This is how Istio achieves automatic discovery of clusters.
So once a remote cluster is discovered, what does Istio do with it?
The Multicluster object contains two core members: remoteKubeControllers and serviceController.
remoteKubeControllers is a map[string]*kubeController, keyed by remote cluster ID, with kubeController pointers as values.
type kubeController struct {
	*Controller
	stopCh chan struct{}
}
A kubeController fetches the remote cluster's Services, Pods, Nodes, and other resources and converts them into Istio's internal model objects.
serviceController is an aggregate.Controller, a controller that aggregates data from different registries and watches for changes. It plays the role of the service registry we usually rely on.
type Controller struct {
	registries []serviceregistry.Instance
	storeLock  sync.RWMutex
	meshHolder mesh.Holder
}
When a new cluster is added, AddMemberCluster instantiates a kubeController for it and stores it in remoteKubeControllers, starts a goroutine to run that controller, and registers the remote cluster with serviceController. At that point the control cluster begins discovering the remote cluster's resource objects.
When a remote cluster is removed, DeleteMemberCluster deletes the target cluster's kubeController from remoteKubeControllers, signals the goroutine running that controller to exit, and deregisters the remote cluster from the serviceController registry.
Summary
This article gave a brief, source-level tour of how Istio supports multiple clusters.