Kubernetes -- the namespace deletion process and fixing a bug hit along the way


  • Fix a bug.
  • Read the namespace deletion code in the k8s controller and understand the deletion process.


After running kubectl delete ns {ns-name} to delete a namespace, its status stays at Terminating indefinitely:

[root@k8smaster k8slearn]# kubectl get ns
NAME              STATUS        AGE
default           Active        99m
hello             Terminating   36m
kube-node-lease   Active        99m
kube-public       Active        99m
kube-system       Active        99m


[root@k8smaster k8slearn]# kubectl get all -n hello
No resources found in hello namespace.



[root@k8smaster k8slearn]# kubectl get ns hello -o yaml
apiVersion: v1
kind: Namespace
metadata:
  creationTimestamp: "2023-02-01T06:42:00Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kubectl-create
    operation: Update
    time: "2023-02-01T06:42:00Z"
  name: hello
  resourceVersion: "5676"
  uid: bc48ddf5-7456-44f0-8f7f-597c6a141a0f
spec:
  finalizers:
  - kubernetes
status:
  phase: Active


spec:
  finalizers:
  - kubernetes

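So under what conditions does the system finally remove the kubernetes entry from spec.finalizers shown above, and with it the namespace? To answer that, we need to look at how the namespace controller is implemented.

From the Kubernetes architecture we can infer that cleaning up the resources associated with a namespace during deletion is handled by a controller. The natural next step is to read the namespace controller source.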



// Delete deletes all resources in the given namespace.
// Before deleting resources:
//   - It ensures that deletion timestamp is set on the
//     namespace (does nothing if deletion timestamp is missing).
//   - Verifies that the namespace is in the "terminating" phase
//     (updates the namespace phase if it is not yet marked terminating)
// After deleting the resources:
// * It removes finalizer token from the given namespace.
// Returns an error if any of those steps fail.
// Returns ResourcesRemainingError if it deleted some resources but needs
// to wait for them to go away.
// Caller is expected to keep calling this until it succeeds.
func (d *namespacedResourcesDeleter) Delete(nsName string) error {
	// Multiple controllers may edit a namespace during termination
	// first get the latest state of the namespace before proceeding
	// if the namespace was deleted already, don't do anything
	namespace, err := d.nsClient.Get(context.TODO(), nsName, metav1.GetOptions{})
	if err != nil {
		if errors.IsNotFound(err) {
			return nil
		}
		return err
	}
	if namespace.DeletionTimestamp == nil {
		return nil
	}

	klog.V(5).Infof("namespace controller - syncNamespace - namespace: %s, finalizerToken: %s", namespace.Name, d.finalizerToken)

	// ensure that the status is up to date on the namespace
	// if we get a not found error, we assume the namespace is truly gone
	namespace, err = d.retryOnConflictError(namespace, d.updateNamespaceStatusFunc)
	if err != nil {
		if errors.IsNotFound(err) {
			return nil
		}
		return err
	}

	// the latest view of the namespace asserts that namespace is no longer deleting..
	if namespace.DeletionTimestamp.IsZero() {
		return nil
	}

	// return if it is already finalized.
	if finalized(namespace) {
		return nil
	}

	// there may still be content for us to remove
	estimate, err := d.deleteAllContent(namespace)
	if err != nil {
		return err
	}
	if estimate > 0 {
		return &ResourcesRemainingError{estimate}
	}

	// we have removed content, so mark it finalized by us
	_, err = d.retryOnConflictError(namespace, d.finalizeNamespace)
	if err != nil {
		// in normal practice, this should not be possible, but if a deployment is running
		// two controllers to do namespace deletion that share a common finalizer token it's
		// possible that a not found could occur since the other controller would have finished the delete.
		if errors.IsNotFound(err) {
			return nil
		}
		return err
	}
	return nil
}
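The comment on Delete says the caller is expected to keep calling it until it succeeds. The following is a minimal sketch of that calling pattern, not the controller's actual work queue: fakeDeleter, resourcesRemainingError and syncUntilDone are hypothetical stand-ins written for this illustration, and a real controller would requeue the namespace with a delay instead of looping.

```go
package main

import (
	"errors"
	"fmt"
)

// resourcesRemainingError is a hypothetical stand-in for the
// ResourcesRemainingError returned by Delete in the excerpt above.
type resourcesRemainingError struct{ estimate int64 }

func (e *resourcesRemainingError) Error() string {
	return fmt.Sprintf("content remains, estimated %d seconds until it is gone", e.estimate)
}

// fakeDeleter imitates a deleter that still sees leftover content
// for the first `pending` calls and succeeds afterwards.
type fakeDeleter struct{ pending int }

func (d *fakeDeleter) Delete(nsName string) error {
	if d.pending > 0 {
		d.pending--
		return &resourcesRemainingError{estimate: 1}
	}
	return nil
}

// syncUntilDone calls Delete until it succeeds, a different kind of error
// occurs, or maxAttempts is reached. It returns the number of calls made.
func syncUntilDone(d *fakeDeleter, nsName string, maxAttempts int) (int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := d.Delete(nsName)
		if err == nil {
			return attempt, nil
		}
		var remaining *resourcesRemainingError
		if !errors.As(err, &remaining) {
			return attempt, err // not a "try again later" error
		}
		// A real controller would requeue here, using remaining.estimate as a hint.
	}
	return maxAttempts, fmt.Errorf("namespace %q still terminating after %d attempts", nsName, maxAttempts)
}

func main() {
	attempts, err := syncUntilDone(&fakeDeleter{pending: 2}, "hello", 5)
	fmt.Println("attempts:", attempts, "error:", err)
}
```

If something keeps content from ever going away, this loop never reaches the success branch, which is what a namespace stuck in Terminating looks like from the controller's side.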


	// return if it is already finalized.
	if finalized(namespace) {
		return nil
	}


// finalized returns true if the namespace.Spec.Finalizers is an empty list
func finalized(namespace *v1.Namespace) bool {
	return len(namespace.Spec.Finalizers) == 0
}
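The same check can be applied by hand to the output of kubectl get ns hello -o json. Below is a small sketch under stated assumptions: the namespace struct and the describe helper are hypothetical and model only the two fields the controller looks at (metadata.deletionTimestamp and spec.finalizers), and the sample JSON in main is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// namespace models only the fields needed for the check (hypothetical type,
// not the real v1.Namespace).
type namespace struct {
	Metadata struct {
		Name              string  `json:"name"`
		DeletionTimestamp *string `json:"deletionTimestamp"`
	} `json:"metadata"`
	Spec struct {
		Finalizers []string `json:"finalizers"`
	} `json:"spec"`
}

// finalized mirrors the controller helper: true when spec.finalizers is empty.
func finalized(ns namespace) bool {
	return len(ns.Spec.Finalizers) == 0
}

// describe reports which stage of deletion a namespace JSON document is in.
func describe(raw []byte) (string, error) {
	var ns namespace
	if err := json.Unmarshal(raw, &ns); err != nil {
		return "", err
	}
	switch {
	case ns.Metadata.DeletionTimestamp == nil:
		return "not being deleted", nil
	case finalized(ns):
		return "finalized", nil
	default:
		return fmt.Sprintf("waiting on finalizers %v", ns.Spec.Finalizers), nil
	}
}

func main() {
	// Sample input for illustration only, not real cluster output.
	sample := []byte(`{
	  "metadata": {"name": "hello", "deletionTimestamp": "2023-02-01T07:18:00Z"},
	  "spec": {"finalizers": ["kubernetes"]}
	}`)
	state, err := describe(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(state)
}
```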





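Finalizers are namespaced keys that tell Kubernetes to wait until specific conditions are met before it fully deletes resources marked for deletion. Finalizers alert controllers to clean up resources the deleted object owned.

When you create a resource from a manifest, you can specify finalizers in the metadata.finalizers field. When you try to delete the resource, the API server handling the delete request notices the values in the finalizers field and does the following:

  • Modifies the object to set metadata.deletionTimestamp to the time you started the deletion.
  • Prevents the object from being removed until its metadata.finalizers field is empty.
  • Returns a 202 status code (HTTP "Accepted").

The controller managing the finalizer notices the update to the object: metadata.deletionTimestamp has been set, which means deletion of the object has been requested. The controller then tries to satisfy the conditions of the finalizers on that resource. Each time a finalizer's condition is satisfied, the controller removes that key from the resource's finalizers field. When the finalizers field is empty, an object with deletionTimestamp set is deleted automatically. You can also use finalizers to prevent deletion of unmanaged resources.

A common example of a finalizer is kubernetes.io/pv-protection, which prevents accidental deletion of PersistentVolume objects. When a PersistentVolume is in use by a Pod, Kubernetes adds the pv-protection finalizer. If you try to delete the PersistentVolume, it enters the Terminating state, but the controller cannot delete the resource because the finalizer is present. When the Pod stops using the PersistentVolume, Kubernetes clears the pv-protection finalizer and the controller deletes the volume.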


	estimate, err := d.deleteAllContent(namespace)


	// we have removed content, so mark it finalized by us
	_, err = d.retryOnConflictError(namespace, d.finalizeNamespace)





// deleteAllContent will use the dynamic client to delete each resource identified in groupVersionResources.
// It returns an estimate of the time remaining before the remaining resources are deleted.
// If estimate > 0, not all resources are guaranteed to be gone.
func (d *namespacedResourcesDeleter) deleteAllContent(ns *v1.Namespace) (int64, error) {
	namespace := ns.Name
	namespaceDeletedAt := *ns.DeletionTimestamp
	var errs []error
	conditionUpdater := namespaceConditionUpdater{}
	estimate := int64(0)
	klog.V(4).Infof("namespace controller - deleteAllContent - namespace: %s", namespace)

	resources, err := d.discoverResourcesFn()
	if err != nil {
		// discovery errors are not fatal.  We often have some set of resources we can operate against even if we don't have a complete list
		errs = append(errs, err)
	}
	// TODO(sttts): get rid of opCache and pass the verbs (especially "deletecollection") down into the deleter
	deletableResources := discovery.FilteredBy(discovery.SupportsAllVerbs{Verbs: []string{"delete"}}, resources)
	groupVersionResources, err := discovery.GroupVersionResources(deletableResources)
	if err != nil {
		// discovery errors are not fatal.  We often have some set of resources we can operate against even if we don't have a complete list
		errs = append(errs, err)
	}

	numRemainingTotals := allGVRDeletionMetadata{
		gvrToNumRemaining:        map[schema.GroupVersionResource]int{},
		finalizersToNumRemaining: map[string]int{},
	}
	for gvr := range groupVersionResources {
		gvrDeletionMetadata, err := d.deleteAllContentForGroupVersionResource(gvr, namespace, namespaceDeletedAt)
		if err != nil {
			// If there is an error, hold on to it but proceed with all the remaining
			// groupVersionResources.
			errs = append(errs, err)
		}
		if gvrDeletionMetadata.finalizerEstimateSeconds > estimate {
			estimate = gvrDeletionMetadata.finalizerEstimateSeconds
		}
		if gvrDeletionMetadata.numRemaining > 0 {
			numRemainingTotals.gvrToNumRemaining[gvr] = gvrDeletionMetadata.numRemaining
			for finalizer, numRemaining := range gvrDeletionMetadata.finalizersToNumRemaining {
				if numRemaining == 0 {
					continue
				}
				numRemainingTotals.finalizersToNumRemaining[finalizer] = numRemainingTotals.finalizersToNumRemaining[finalizer] + numRemaining
			}
		}
	}

	// we always want to update the conditions because if we have set a condition to "it worked" after it was previously, "it didn't work",
	// we need to reflect that information.  Recall that additional finalizers can be set on namespaces, so this finalizer may clear itself and
	// NOT remove the resource instance.
	if hasChanged := conditionUpdater.Update(ns); hasChanged {
		if _, err = d.nsClient.UpdateStatus(context.TODO(), ns, metav1.UpdateOptions{}); err != nil {
			utilruntime.HandleError(fmt.Errorf("couldn't update status condition for namespace %q: %v", namespace, err))
		}
	}

	// if len(errs)==0, NewAggregate returns nil.
	klog.V(4).Infof("namespace controller - deleteAllContent - namespace: %s, estimate: %v, errors: %v", namespace, estimate, utilerrors.NewAggregate(errs))
	return estimate, utilerrors.NewAggregate(errs)
}

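Looking at where the function checks if err != nil tells us where errors can occur:

  • Error 1: discovering all registered namespace-scoped resources fails
  • Error 2: parsing the GVR (group/version/resource) information of those resources fails
  • Error 3: deleting the objects of some GVR in the namespace fails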




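  • Dump the description of the hello namespace
kubectl get ns hello -o json > hello.json
  • Edit the JSON file and delete the contents of the spec field
 vim hello.json
"spec": {
	"finalizers": [
		"kubernetes"
	]
}
becomes
"spec": {
}
  • Because the k8s cluster requires authentication, open a new terminal and run kubectl proxy to serve an API proxy locally on port 8081
kubectl proxy --port=8081
  • Run curl to call the kube API directly and perform the deletion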
curl -k -H "Content-Type:application/json" -X PUT --data-binary @hello.json
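The edit-and-PUT steps can also be scripted. The sketch below is an illustration under assumptions, not the exact command used above: it assumes the request targets the namespace's finalize subresource, /api/v1/namespaces/hello/finalize, through the local proxy on port 8081. That path does not appear in the curl line and is an assumption here. The helper names clearFinalizers, finalizeURL and newFinalizeRequest are hypothetical. The program only builds the request; sending it is left as a comment so that nothing is changed by accident.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// clearFinalizers returns the namespace JSON with the spec field emptied,
// which is the manual edit made to hello.json.
func clearFinalizers(raw []byte) ([]byte, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return nil, err
	}
	obj["spec"] = map[string]interface{}{}
	return json.Marshal(obj)
}

// finalizeURL builds the URL of the namespace finalize subresource behind
// kubectl proxy. The path is an assumption, see the text above.
func finalizeURL(proxyBase, ns string) string {
	return proxyBase + "/api/v1/namespaces/" + ns + "/finalize"
}

// newFinalizeRequest prepares, but does not send, the PUT request.
func newFinalizeRequest(proxyBase, ns string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPut, finalizeURL(proxyBase, ns), bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Trimmed sample standing in for the contents of hello.json.
	sample := []byte(`{"apiVersion":"v1","kind":"Namespace","metadata":{"name":"hello"},"spec":{"finalizers":["kubernetes"]}}`)
	body, err := clearFinalizers(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	req, err := newFinalizeRequest("http://127.0.0.1:8081", "hello", body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
	fmt.Println(string(body))
	// To actually send it while kubectl proxy is running:
	//   resp, err := http.DefaultClient.Do(req)
}
```

Clearing the finalizer this way skips the controller's cleanup, so it is worth first finding out which resource type is failing to delete.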


Finally, let's walk through the namespace deletion process.
[Figure: namespace deletion process]
