基于 kubebuilder 的 operators 的 webhook 设计&二次开发

前情提要

之前博客已经完成了一个Operator的设计、开发、部署、验证过程,实战中刻意跳过了一个重要的知识点:webhook,如今是时候学习它了,这是个很重要的功能;
什么是AdmissionWebhook,就要先了解K8S中的admission controller, 按照官方的解释是: admission controller是拦截(经过身份验证)API Server请求的网关,并且可以修改请求对象或拒绝请求。
启动引导令牌是一种简单的 bearer token ,这种令牌是在新建集群或者在现有集群中添加新加新节点时使用的。 它被设计成能支持 kubeadm,但是也可以被用在其他上下文中以便用户在 不使用 kubeadm 的情况下启动cluster。它也被设计成可以通过 RBAC 策略,结合Kubelet TLS Bootstrapping 系统进行工作。

简而言之,它可以认为是拦截器,类似web框架中的middleware。

webhook 简介

熟悉java开发的读者大多知道过滤器(Servlet Filter),如下图,外部请求会先到达过滤器,做一些统一的操作,例如转码、校验,然后才由真正的业务逻辑处理请求:


image

Operator中的webhook,其作用与上述过滤器类似,外部对CRD资源的变更,在Controller处理之前都会交给webhook提前处理,流程如下图,该图来自《Getting Started with Kubernetes | Operator and Operator Framework》:

image

再来看看webhook具体做了哪些事情,如下图,kubernetes官方博客明确指出webhook可以做两件事:修改(mutating)和验证(validating)


image

kubebuilder为我们提供了生成webhook的基础文件和代码的工具,与制作API的工具类似,极大地简化了工作量,咱们只需聚焦业务实现即可;
基于kubebuilder制作的webhook和controller,如果是同一个资源,那么它们在同一个进程中;

场景设计

为之前的elasticweb项目上增加需求,让webhook发挥实际作用;

1 如果用户忘记输入总QPS,系统webhook负责设置默认值1300,操作如下图:


image

2 为了保护系统,给单个pod的QPS设置上限1000,如果外部输入的singlePodQPS值超过1000,就创建资源对象失败,如下图所示:


image

准备工作

和controller类似,webhook既能在kubernetes环境中运行,也能在kubernetes环境之外运行;
如果webhook在kubernetes环境之外运行,是有些麻烦的,需要将证书放在所在环境,默认地址是:

# 上一篇博文,已经说明,需要提前的将配置文件和证书文件拷贝到指定目录下,方便进行开发测试。
/tmp/k8s-webhook-server/serving-certs/tls.{crt,key}

为了让webhook在kubernetes环境中运行,咱们要做一点准备工作安装cert manager,执行以下操作:

# 文件如果失效的话,记得寻找类似的文件进行替代。
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.2.0/cert-manager.yaml

上述操作完成后会新建很多资源,如namespace、rbac、pod等,以pod为例如下:


image

生成webhook

进入elasticweb工程下,执行以下命令创建webhook:

kubebuilder create webhook \
--group elasticweb \
--version v1 \
--kind ElasticWeb \
--defaulting \
--programmatic-validation

上述命令执行完毕后,先去看看main.go文件,如下图红框1所示,自动增加了一段代码,作用是让webhook生效:


image

上图红框2中的elasticweb_webhook.go就是新增文件,内容如下:

/*


Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/schema"
    "k8s.io/apimachinery/pkg/util/validation/field"
    ctrl "sigs.k8s.io/controller-runtime"
    logf "sigs.k8s.io/controller-runtime/pkg/log"
    "sigs.k8s.io/controller-runtime/pkg/webhook"
)

// log is for logging in this package.
var elasticweblog = logf.Log.WithName("elasticweb-resource")

func (r *ElasticWeb) SetupWebhookWithManager(mgr ctrl.Manager) error {
    return ctrl.NewWebhookManagedBy(mgr).
        For(r).
        Complete()
}

// EDIT THIS FILE!  THIS IS SCAFFOLDING FOR YOU TO OWN!

// +kubebuilder:webhook:path=/mutate-elasticweb-com-bolingcavalry-v1-elasticweb,mutating=true,failurePolicy=fail,groups=elasticweb.com.bolingcavalry,resources=elasticwebs,verbs=create;update,versions=v1,name=melasticweb.kb.io

var _ webhook.Defaulter = &ElasticWeb{}

// Default implements webhook.Defaulter so a webhook will be registered for the type
func (r *ElasticWeb) Default() {
    elasticweblog.Info("default", "name", r.Name)

    // TODO(user): fill in your defaulting logic.
    // 如果创建的时候没有输入总QPS,就设置个默认值
    if r.Spec.TotalQPS == nil {
        r.Spec.TotalQPS = new(int32)
        *r.Spec.TotalQPS = 1300
        elasticweblog.Info("a. TotalQPS is nil, set default value now", "TotalQPS", *r.Spec.TotalQPS)
    } else {
        elasticweblog.Info("b. TotalQPS exists", "TotalQPS", *r.Spec.TotalQPS)
    }
}

// TODO(user): change verbs to "verbs=create;update;delete" if you want to enable deletion validation.
// +kubebuilder:webhook:verbs=create;update,path=/validate-elasticweb-com-bolingcavalry-v1-elasticweb,mutating=false,failurePolicy=fail,groups=elasticweb.com.bolingcavalry,resources=elasticwebs,versions=v1,name=velasticweb.kb.io

var _ webhook.Validator = &ElasticWeb{}

// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *ElasticWeb) ValidateCreate() error {
    elasticweblog.Info("validate create", "name", r.Name)

    // TODO(user): fill in your validation logic upon object creation.

    return r.validateElasticWeb()
}

// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *ElasticWeb) ValidateUpdate(old runtime.Object) error {
    elasticweblog.Info("validate update", "name", r.Name)

    // TODO(user): fill in your validation logic upon object update.
    return r.validateElasticWeb()
}

// ValidateDelete implements webhook.Validator so a webhook will be registered for the type
func (r *ElasticWeb) ValidateDelete() error {
    elasticweblog.Info("validate delete", "name", r.Name)

    // TODO(user): fill in your validation logic upon object deletion.
    return nil
}

func (r *ElasticWeb) validateElasticWeb() error {
    var allErrs field.ErrorList

    if *r.Spec.SinglePodQPS > 1000 {
        elasticweblog.Info("c. Invalid SinglePodQPS")

        err := field.Invalid(field.NewPath("spec").Child("singlePodQPS"),
            *r.Spec.SinglePodQPS,
            "d. must be less than 1000")

        allErrs = append(allErrs, err)

        return apierrors.NewInvalid(
            schema.GroupKind{Group: "elasticweb.com.bolingcavalry", Kind: "ElasticWeb"},
            r.Name,
            allErrs)
    } else {
        elasticweblog.Info("e. SinglePodQPS is valid")
        return nil
    }
}

上述代码有两处需要注意,第一处和填写默认值有关,如下图:


image

第二处和校验有关,如下图:


image

要实现的业务需求就是通过修改上述elasticweb_webhook.go的内容来实现,不过代码稍后再写,先把配置都改好;

开发(配置)

打开文件config/default/kustomization.yaml,下图四个红框中的内容原本都被注释了,现在请将注释符号都删掉,使其生效:

image

还是文件config/default/kustomization.yaml,节点vars下面的内容,原本全部被注释了,现在请全部放开,放开后的效果如下图:

image

配置已经完成,可以编码了;

开发(编码)

打开文件elasticweb_webhook.go
新增依赖:

apierrors "k8s.io/apimachinery/pkg/api/errors"

找到Default方法,改成如下内容,可见代码很简单,判断TotalQPS是否存在,若不存在就写入默认值,另外还加了两行日志:

func (r *ElasticWeb) Default() {
    elasticweblog.Info("default", "name", r.Name)

    // TODO(user): fill in your defaulting logic.
    // 如果创建的时候没有输入总QPS,就设置个默认值
    if r.Spec.TotalQPS == nil {
        r.Spec.TotalQPS = new(int32)
        *r.Spec.TotalQPS = 1300
        elasticweblog.Info("a. TotalQPS is nil, set default value now", "TotalQPS", *r.Spec.TotalQPS)
    } else {
        elasticweblog.Info("b. TotalQPS exists", "TotalQPS", *r.Spec.TotalQPS)
    }
}

接下来开发校验功能,咱们把校验功能封装成一个validateElasticWeb方法,然后在新增和修改的时候各调用一次,如下,可见最终是调用apierrors.NewInvalid生成错误实例的,而此方法接受的是多个错误,因此要为其准备切片做入参,当然了,如果是多个参数校验失败,可以都放入切片中:

func (r *ElasticWeb) validateElasticWeb() error {
    var allErrs field.ErrorList

    if *r.Spec.SinglePodQPS > 1000 {
        elasticweblog.Info("c. Invalid SinglePodQPS")

        err := field.Invalid(field.NewPath("spec").Child("singlePodQPS"),
            *r.Spec.SinglePodQPS,
            "d. must be less than 1000")

        allErrs = append(allErrs, err)

        return apierrors.NewInvalid(
            schema.GroupKind{Group: "elasticweb.com.bolingcavalry", Kind: "ElasticWeb"},
            r.Name,
            allErrs)
    } else {
        elasticweblog.Info("e. SinglePodQPS is valid")
        return nil
    }
}

再找到新增和修改资源对象时被调用的方法,在里面调用validateElasticWeb:

// ValidateCreate implements webhook.Validator so a webhook will be registered for the type
func (r *ElasticWeb) ValidateCreate() error {
    elasticweblog.Info("validate create", "name", r.Name)

    // TODO(user): fill in your validation logic upon object creation.

    return r.validateElasticWeb()
}

// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type
func (r *ElasticWeb) ValidateUpdate(old runtime.Object) error {
    elasticweblog.Info("validate update", "name", r.Name)

    // TODO(user): fill in your validation logic upon object update.
    return r.validateElasticWeb()
}

编码完成,可见非常简单,接下来,咱们把以前实战遗留的东西清理一下,再开始新的部署和验证;

清理工作

# 1 删除elasticweb资源对象:
kubectl delete -f config/samples/elasticweb_v1_elasticweb.yaml

# 2 删除controller
kustomize build config/default | kubectl delete -f -

# 3 删除CRD
make uninstall

接下来就可以部署webhook了;

部署

# 1 部署CRD
make install
# 2 构建镜像并推送到仓库
make docker-build docker-push IMG=12589/elasticweb:001
# 3 部署集成了webhook功能的controller:
make deploy IMG=12589/elasticweb:001
# 4 查看pod,确认启动成功:
kubectl get pods --all-namespaces

验证Defaulter(添加默认值)

修改文件config/samples/elasticweb_v1_elasticweb.yaml,修改后的内容如下,可见totalQPS字段已经被注释掉了:

apiVersion: v1
kind: Namespace
metadata:
  name: dev
  labels:
    name: dev
---
apiVersion: elasticweb.com.bolingcavalry/v1
kind: ElasticWeb
metadata:
  namespace: dev
  name: elasticweb-sample
spec:
  # Add fields here
  image: tomcat:8.0.18-jre8
  port: 30003
  singlePodQPS: 500
  # totalQPS: 600

创建一个elasticweb资源对象:

kubectl apply -f config/samples/elasticweb_v1_elasticweb.yaml

此时单个pod的QPS是500,如果webhook的代码生效的话,总QPS就是1300,而对应的pod数应该是3个,接下来咱们看看是否符合预期;
先看elasticweb、deployment、pod等资源对象是否正常,如下所示,全部符合预期:

zhaoqin@zhaoqindeMBP-2 ~ % kubectl get elasticweb -n dev                                                                 
NAME                AGE
elasticweb-sample   89s
zhaoqin@zhaoqindeMBP-2 ~ % kubectl get deployments -n dev
NAME                READY   UP-TO-DATE   AVAILABLE   AGE
elasticweb-sample   3/3     3            3           98s
zhaoqin@zhaoqindeMBP-2 ~ % kubectl get service -n dev    
NAME                TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
elasticweb-sample   NodePort   10.105.125.125           8080:30003/TCP   106s
zhaoqin@zhaoqindeMBP-2 ~ % kubectl get pod -n dev    
NAME                                 READY   STATUS    RESTARTS   AGE
elasticweb-sample-56fc5848b7-5tkxw   1/1     Running   0          113s
elasticweb-sample-56fc5848b7-blkzg   1/1     Running   0          113s
elasticweb-sample-56fc5848b7-pd7jg   1/1     Running   0          113s

用kubectl describe命令查看elasticweb资源对象的详情,如下所示,TotalQPS字段被webhook设置为1300,RealQPS也计算正确:

zhaoqin@zhaoqindeMBP-2 ~ % kubectl describe elasticweb elasticweb-sample -n dev
Name:         elasticweb-sample
Namespace:    dev
Labels:       
Annotations:  
API Version:  elasticweb.com.bolingcavalry/v1
Kind:         ElasticWeb
Metadata:
  Creation Timestamp:  2021-02-27T16:07:34Z
  Generation:          2
  Managed Fields:
    API Version:  elasticweb.com.bolingcavalry/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:image:
        f:port:
        f:singlePodQPS:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2021-02-27T16:07:34Z
    API Version:  elasticweb.com.bolingcavalry/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:realQPS:
    Manager:         manager
    Operation:       Update
    Time:            2021-02-27T16:07:34Z
  Resource Version:  687628
  UID:               703de111-d859-4cd2-b3c4-1d201fb7bd7d
Spec:
  Image:           tomcat:8.0.18-jre8
  Port:            30003
  Single Pod QPS:  500
  Total QPS:       1300
Status:
  Real QPS:  1500
Events:      

再来看看controller的日志,其中的webhook部分是否符合预期,如下图红框所示,发现TotalQPS字段为空,就将设置为默认值,并且在检测的时候SinglePodQPS的值也没有超过1000:


image

最后别忘了用浏览器验证web服务是否正常,我这里的完整地址是:http://192.168.xxx.xx:30003/
至此,完成了webhook的Defaulter验证,接下来验证Validator

验证Validator

接下来该验证webhook的参数校验功能了,先验证修改时的逻辑;
编辑文件config/samples/update_single_pod_qps.yaml,值如下:

spec:
  singlePodQPS: 1100

用patch命令使之生效:

kubectl patch elasticweb elasticweb-sample \
-n dev \
--type merge \
--patch "$(cat config/samples/update_single_pod_qps.yaml)"

此时,控制台会输出错误信息:

Error from server (ElasticWeb.elasticweb.com.bolingcavalry "elasticweb-sample" is invalid: spec.singlePodQPS: Invalid value: 1100: d. must be less than 1000): admission webhook "velasticweb.kb.io" denied the request: ElasticWeb.elasticweb.com.bolingcavalry "elasticweb-sample" is invalid: spec.singlePodQPS: Invalid value: 1100: d. must be less than 1000

再用kubectl describe命令查看elasticweb资源对象的详情,如下图红框,依然是500,可见webhook已经生效,阻止了错误的发生:


image

再去看controller日志,如下图红框所示,和代码对应上了


image

接下来再试试webhook在新增时候的校验功能;

清理前面创建的elastic资源对象,执行命令:

kubectl delete -f config/samples/elasticweb_v1_elasticweb.yaml

修改文件,如下图红框所示,咱们将singlePodQPS的值改为超过1000,看看webhook是否能检查到这个错误,并阻止资源对象的创建:


image

执行以下命令开始创建elasticweb资源对象:

kubectl apply -f config/samples/elasticweb_v1_elasticweb.yaml

控制台提示以下信息,包含了咱们代码中写入的错误描述,证明elasticweb资源对象创建失败,证明webhook的Validator功能已经生效:

namespace/dev created
Error from server (ElasticWeb.elasticweb.com.bolingcavalry "elasticweb-sample" is invalid: spec.singlePodQPS: Invalid value: 1500: d. must be less than 1000): error when creating "config/samples/elasticweb_v1_elasticweb.yaml": admission webhook "velasticweb.kb.io" denied the request: ElasticWeb.elasticweb.com.bolingcavalry "elasticweb-sample" is invalid: spec.singlePodQPS: Invalid value: 1500: d. must be less than 1000

不放心的话执行kubectl get命令检查一下,发现空空如也:

zhaoqin@zhaoqindeMBP-2 elasticweb % kubectl get elasticweb -n dev       
No resources found in dev namespace.
zhaoqin@zhaoqindeMBP-2 elasticweb % kubectl get deployments -n dev
No resources found in dev namespace.
zhaoqin@zhaoqindeMBP-2 elasticweb % kubectl get service -n dev
No resources found in dev namespace.
zhaoqin@zhaoqindeMBP-2 elasticweb % kubectl get pod -n dev
No resources found in dev namespace.

还要看下controller日志,如下图红框所示,符合预期:


image

operator的webhook的开发、部署、验证咱们就完成了,整个elasticweb也算是基本功能齐全,希望能为您的operator开发提供参考;

阅读拓展

参考文章: https://xinchen.blog.csdn.net/article/details/113922328

Watches 如何监控:
elasticweb_controller.go 文件原内容:

func (r *ElasticWebReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&elasticwebv1.ElasticWeb{}).
        Complete(r)
}

elasticweb_controller.go 文件更新后内容:

func (r *ElasticWebReconciler) SetupWithManager(mgr ctrl.Manager) error {
    //return ctrl.NewControllerManagedBy(mgr).
    //  For(&elasticwebv1.ElasticWeb{}).
    //  Complete(r)

    // 追加 pod 个数的 Watches 监控
    return ctrl.NewControllerManagedBy(mgr).
        For(&appsv1.Deployment{}).Watches(&source.Kind{Type: &corev1.Pod{}}, &handler.EnqueueRequestForObject{}).
        Complete(r)
}

作用说明,这样我们通过 deployment 部署的资源,我们就可以很好的监控起来了 pod 个数的变化,并且把他矫正为正常的,资源值。

你可能感兴趣的:(基于 kubebuilder 的 operators 的 webhook 设计&二次开发)