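k8s error: Too many requests: Too many requests, please try again later.

Symptoms

The service uses the Kubernetes client-go library. The error below appears frequently in its logs, and some nodes have gone into the NotReady state.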

failed to list *v1.Endpoints: Too many requests: Too many requests, please try again later.

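About rate limiting in the k8s API server

The API server can cap the total number of requests in flight. This approach is simple but crude.

Concurrency limit for read requests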

--max-requests-inflight int     Default: 400

This and --max-mutating-requests-inflight are summed to determine the server's total concurrency limit (which must be positive) if --enable-priority-and-fairness is true. Otherwise, this flag limits the maximum number of non-mutating requests in flight, or a zero value disables the limit completely.

Concurrency limit for write requests

--max-mutating-requests-inflight int     Default: 200

This and --max-requests-inflight are summed to determine the server's total concurrency limit (which must be positive) if --enable-priority-and-fairness is true. Otherwise, this flag limits the maximum number of mutating requests in flight, or a zero value disables the limit completely.

About --enable-priority-and-fairness

--enable-priority-and-fairness     Default: true

If true and the APIPriorityAndFairness feature gate is enabled, replace the max-in-flight handler with an enhanced one that queues and dispatches with priority and fairness.

So under the default configuration, the API server's total concurrency limit is 400 + 200 = 600.
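To make the max-in-flight idea concrete, here is a minimal sketch of a non-blocking in-flight limiter with separate read and write budgets, matching the behavior the flag text describes when priority and fairness is off. It is not the API server's actual handler: the names inflightLimiter, newInflightLimiter and admit are hypothetical, and the sketch only assumes that a request over the limit is rejected with HTTP 429 rather than queued.

```go
package main

import (
	"fmt"
	"net/http"
)

// inflightLimiter caps how many requests may be in flight at once.
// A buffered channel serves as a counting semaphore.
// A limit of zero disables the limit, as the flag descriptions state.
type inflightLimiter struct {
	slots chan struct{} // nil when the limit is disabled
}

func newInflightLimiter(limit int) *inflightLimiter {
	if limit == 0 {
		return &inflightLimiter{}
	}
	return &inflightLimiter{slots: make(chan struct{}, limit)}
}

// tryAcquire takes a slot without blocking; false means the limit is reached.
func (l *inflightLimiter) tryAcquire() bool {
	if l.slots == nil {
		return true
	}
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by a successful tryAcquire.
func (l *inflightLimiter) release() {
	if l.slots != nil {
		<-l.slots
	}
}

// admit returns the status a new request would see:
// 200 if a slot was taken, 429 if the matching limiter is full.
func admit(readOnly bool, reads, writes *inflightLimiter) int {
	l := writes
	if readOnly {
		l = reads
	}
	if !l.tryAcquire() {
		return http.StatusTooManyRequests
	}
	return http.StatusOK
}

func main() {
	reads := newInflightLimiter(400)  // --max-requests-inflight default
	writes := newInflightLimiter(200) // --max-mutating-requests-inflight default

	rejected := 0
	for i := 0; i < 450; i++ { // 450 reads arrive and none has finished yet
		if admit(true, reads, writes) == http.StatusTooManyRequests {
			rejected++
		}
	}
	fmt.Println("rejected reads:", rejected)
	fmt.Println("write status:", admit(false, reads, writes))
}
```

In this sketch the 50 reads beyond the limit of 400 are rejected, while a write is still admitted because it draws on its own budget.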

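About node heartbeats in a k8s cluster

Node heartbeats are also reported through the API server, so once rate limiting kicks in, heartbeat updates can fail. That is how nodes end up NotReady.

Heartbeats, sent by Kubernetes nodes, help your cluster determine the availability of each node and take action when failures are detected.

Node heartbeats are sent by the kubelet.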

Heartbeat type                                     Default interval
Updates to the node's .status                      5 min
Updates to the Lease object in kube-node-lease     10 s

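The kubelet is responsible for creating and updating the .status of nodes, and for updating their corresponding Leases.

The kubelet updates .status either when the node's status changes or when there has been no update for a configured interval. The default interval for .status updates is 5 minutes, which is much longer than the 40-second default timeout for unreachable nodes.

The kubelet creates its Lease object and then updates it every 10 seconds (the default update interval). Lease updates happen independently of updates to the node's .status. If a Lease update fails, the kubelet retries with exponential backoff, starting at 200 milliseconds and capped at 7 seconds.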

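About concurrency limits in client-go

client-go also rate-limits its own requests to the API server on the client side, using a token bucket algorithm.

By default in client-go the Burst is 10 and QPS is 5.

The sustained rate stays at QPS = 5, while a short burst of up to 'burst' requests is allowed. The bucket initially holds 'burst' tokens.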

	// QPS indicates the maximum QPS to the master from this client.
	// If it's zero, the created RESTClient will use DefaultQPS: 5
	QPS float32

	// Maximum burst for throttle.
	// If it's zero, the created RESTClient will use DefaultBurst: 10.
	Burst int

func flowcontrol.NewTokenBucketRateLimiter(qps float32, burst int) flowcontrol.RateLimiter

NewTokenBucketRateLimiter creates a rate limiter which implements a token bucket approach.
The rate limiter allows bursts of up to 'burst' to exceed the QPS,
while still maintaining a smoothed qps rate of 'qps'.
The bucket is initially filled with 'burst' tokens,
and refills at a rate of 'qps'.
The maximum number of tokens in the bucket is capped at 'burst'.
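The token bucket behavior described above can be illustrated with a small stand-alone simulation. This is only a sketch and does not use client-go: the tokenBucket type and its tryAccept method are hypothetical, the clock is passed in explicitly as seconds so the result is deterministic, and a request without a token is simply refused here instead of waiting.

```go
package main

import "fmt"

// tokenBucket is a simplified token bucket driven by an explicit clock.
// It starts full with 'burst' tokens, refills at 'qps' tokens per second,
// and never holds more than 'burst' tokens.
type tokenBucket struct {
	qps    float64
	burst  float64
	tokens float64
	last   float64 // time of the previous call, in seconds
}

func newTokenBucket(qps float64, burst int) *tokenBucket {
	return &tokenBucket{qps: qps, burst: float64(burst), tokens: float64(burst)}
}

// tryAccept refills the bucket for the time elapsed since the previous call,
// then consumes one token if one is available.
func (b *tokenBucket) tryAccept(now float64) bool {
	b.tokens += (now - b.last) * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// countAccepted fires n requests at the same instant and counts the accepted ones.
func countAccepted(b *tokenBucket, now float64, n int) int {
	accepted := 0
	for i := 0; i < n; i++ {
		if b.tryAccept(now) {
			accepted++
		}
	}
	return accepted
}

func main() {
	b := newTokenBucket(5, 10) // the client-go defaults: QPS 5, Burst 10
	fmt.Println("accepted at t=0s:", countAccepted(b, 0, 15))
	fmt.Println("accepted at t=1s:", countAccepted(b, 1, 15))
}
```

With QPS 5 and Burst 10, the first 15 simultaneous requests get 10 tokens (the initial burst), and one second later only 5 more are available, which is the sustained rate.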
