During my earlier rate-limit tests, I often had the impression that configuration changes (e.g. overrides.dimensions or maxAmount in redisquota) were not taking effect, so the activation logic felt confusing. Still, Istio's rate-limit feature is attractive: it works out of the box without modifying service code (previously the rate-limit logic had to be implemented inside each service, with an interceptor backed by Redis INCR + EXPIRE).
How do you confirm that a ratelimit configuration change has taken effect?
(1) Set the istio-policy container's log level to debug:
Edit the istio-policy Deployment in the istio-system namespace and set --log_output_level=default:debug in the container args.
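For example (deployment and container names as in a default Istio 1.1 install; verify them in your cluster):

kubectl -n istio-system edit deployment istio-policy
# in the mixer container's args, set:
#   --log_output_level=default:debug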
(2) After modifying the ratelimit configuration (usually only the redisquota OR memquota resource), look for activation entries like the following in the istio-policy log;
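To follow the log while applying changes (the Mixer container is assumed to be named mixer, as in a default install):

kubectl -n istio-system logs -f deployment/istio-policy -c mixer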
Activation log 1:
2019-08-08T08:53:56.673367Z debug New routes in effect:
[Routing ExpectedTable]
ID: 4
[#0] TEMPLATE_VARIETY_REPORT {V}
[#0] istio-system {NS}
[#0] prometheus.istio-system {H}
[#0]
Condition: (context.protocol == "http" || context.protocol == "grpc") && (match((request.useragent | "-"), "kube-probe*") == false)
[#0] requestcount.metric.istio-system {I}
[#1] requestduration.metric.istio-system {I}
[#2] requestsize.metric.istio-system {I}
[#3] responsesize.metric.istio-system {I}
[#1]
Condition: context.protocol == "tcp"
[#0] tcpbytereceived.metric.istio-system {I}
[#1] tcpbytesent.metric.istio-system {I}
[#2]
Condition: context.protocol == "tcp" && ((connection.event | "na") == "close")
[#0] tcpconnectionsclosed.metric.istio-system {I}
[#3]
Condition: context.protocol == "tcp" && ((connection.event | "na") == "open")
[#0] tcpconnectionsopened.metric.istio-system {I}
[#1] TEMPLATE_VARIETY_QUOTA {V}
[#0] tsp {NS}
[#0] request-count-redisquota.tsp {H}
[#0]
Condition:
[#0] request-count-quota.instance.tsp {I}
[#2] TEMPLATE_VARIETY_ATTRIBUTE_GENERATOR {V}
[#0] istio-system {NS}
[#0] kubernetesenv.istio-system {H}
[#0]
Condition:
[#0] attributes.kubernetes.istio-system {I}
[#1]
Condition: context.protocol == "tcp"
[#0] attributes.kubernetes.istio-system {I}
2019-08-08T08:53:56.673382Z info Cleaning up handler table, with config ID:3
2019-08-08T08:53:56.673423Z debug Closing adapter request-count-redisquota.tsp/&{0xc4218f60f0 map[request-count-quota.instance.tsp:0xc4222ac9b0] map[0xc4221dcc70:1665947451 0xc4221dcce0:3047078253] map[0:0xc4214b40c0 1:0xc4214b40e0] 0x47d3a0 {{adapter 15 0 request-count-redisquota.tsp}}}
2019-08-08T08:53:56.673610Z debug Closing adapter kubernetesenv.istio-system/&{{{0 0} 0 0 0 0} map[:0xc4211680a0] {{{adapter 15 0 kubernetesenv.istio-system}} 0xc4205a4880 0xc421b8c120 0xc42132a0c0 0xc42132a0c8} 0x3075200 0xc421b8c1e0}
2019-08-08T08:53:56.673639Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-08-08T08:53:56.673674Z debug Closing adapter prometheus.istio-system/&{0xc4205aa500 map[requestduration.metric.istio-system:0xc420ca9700 requestsize.metric.istio-system:0xc420ca97c0 responsesize.metric.istio-system:0xc420ca9800 tcpbytesent.metric.istio-system:0xc420ca9880 tcpbytereceived.metric.istio-system:0xc420ca98c0 tcpconnectionsopened.metric.istio-system:0xc420ca9900 tcpconnectionsclosed.metric.istio-system:0xc420ca9940 requestcount.metric.istio-system:0xc420ca96c0] 0xc421861e70}
2019-08-08T08:53:56.673693Z info adapters adapter closed all scheduled daemons and workers {"adapter": "request-count-redisquota.tsp"}
2019-08-08T08:53:56.673711Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-08-08T08:53:56.673770Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T08:53:56.673842Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T08:53:56.673954Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T08:53:56.674075Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T08:53:56.764901Z debug caches populated
2019-08-08T08:53:57.673948Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
Activation log 2:
Note: this log ends with an error, but the configuration still takes effect. The error was caused by my old-style memquota configuration; after switching to the new-style ratelimit configuration the error disappeared (for the corresponding new-style log, see activation log 1);
2019-08-08T02:02:33.597655Z debug New routes in effect:
[Routing ExpectedTable]
ID: 5
[#0] TEMPLATE_VARIETY_REPORT {V}
[#0] istio-system {NS}
[#0] prometheus.istio-system {H}
[#0]
Condition: (context.protocol == "http" || context.protocol == "grpc") && (match((request.useragent | "-"), "kube-probe*") == false)
[#0] requestcount.metric.istio-system {I}
[#1] requestduration.metric.istio-system {I}
[#2] requestsize.metric.istio-system {I}
[#3] responsesize.metric.istio-system {I}
[#1]
Condition: context.protocol == "tcp"
[#0] tcpbytereceived.metric.istio-system {I}
[#1] tcpbytesent.metric.istio-system {I}
[#2]
Condition: context.protocol == "tcp" && ((connection.event | "na") == "close")
[#0] tcpconnectionsclosed.metric.istio-system {I}
[#3]
Condition: context.protocol == "tcp" && ((connection.event | "na") == "open")
[#0] tcpconnectionsopened.metric.istio-system {I}
[#1] TEMPLATE_VARIETY_QUOTA {V}
[#0] tsp {NS}
[#0] handler.memquota.tsp {H}
[#0]
Condition:
[#0] requestcount.quota.tsp {I}
[#2] TEMPLATE_VARIETY_ATTRIBUTE_GENERATOR {V}
[#0] istio-system {NS}
[#0] kubernetesenv.istio-system {H}
[#0]
Condition:
[#0] attributes.kubernetes.istio-system {I}
[#1]
Condition: context.protocol == "tcp"
[#0] attributes.kubernetes.istio-system {I}
2019-08-08T02:02:33.597664Z info Cleaning up handler table, with config ID:4
2019-08-08T02:02:33.597687Z debug Closing adapter kubernetesenv.istio-system/&{{{0 0} 0 0 0 0} map[:0xc420522e60] {{{adapter 15 0 kubernetesenv.istio-system}} 0xc4205611a0 0xc420c81aa0 0xc420bc1e40 0xc420bc1e48} 0x3075200 0xc420c81bf0}
2019-08-08T02:02:33.597706Z info adapters deleted remote controller {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:33.597730Z debug Closing adapter handler.memquota.tsp/&{{{0 0} map[] map[] 0xc4214afe50 0x47d3a0 {{adapter 15 0 handler.memquota.tsp}}} map[] map[requestcount.quota.tsp;destination=mx-vehicle-parts-management;destinationVersion=v256;sourceVin=luohq:0xc42161e6c0 requestcount.quota.tsp;destination=mx-vehicle-parts-management;destinationVersion=v256;sourceVin=luohqx:0xc421a33950] map[requestcount.quota.tsp:0xc420b2f580] {{adapter 15 0 handler.memquota.tsp }}}
2019-08-08T02:02:33.597749Z debug Closing adapter prometheus.istio-system/&{0xc4200e2500 map[requestduration.metric.istio-system:0xc420b49840 requestsize.metric.istio-system:0xc420b49880 responsesize.metric.istio-system:0xc420b498c0 tcpbytesent.metric.istio-system:0xc420b49900 tcpbytereceived.metric.istio-system:0xc420b49940 tcpconnectionsopened.metric.istio-system:0xc420b49980 tcpconnectionsclosed.metric.istio-system:0xc420b499c0 requestcount.metric.istio-system:0xc420b49800] 0xc42050ff00}
2019-08-08T02:02:33.597770Z info adapters adapter closed all scheduled daemons and workers {"adapter": "prometheus.istio-system"}
2019-08-08T02:02:33.597790Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:33.597823Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:33.597921Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:33.598017Z info adapters shutting down daemon... {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:33.691536Z debug caches populated
2019-08-08T02:02:34.597868Z info adapters adapter closed all scheduled daemons and workers {"adapter": "kubernetesenv.istio-system"}
2019-08-08T02:02:43.599882Z error adapters adapter did not close all the scheduled daemons {"adapter": "handler.memquota.tsp"}
(3) If the configuration does not take effect, restart istio-policy manually;
If the istio-policy log shows no activation entries like those in (2), restart istio-policy by hand (delete the istio-policy pod; Kubernetes then recreates it), as sketched below. In my tests istio-policy restarts quickly (within about 5 seconds), with almost no impact on service traffic;
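A minimal restart sketch (the pod name is a placeholder):

kubectl -n istio-system get pods | grep istio-policy
kubectl -n istio-system delete pod <istio-policy-pod-name>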
TODO: when upgrading Istio or standing up a new Istio environment, re-check whether configuration changes take effect in real time (on the current 1.1.11, ratelimit changes take effect at first, but after several modifications they stop being picked up);
Example 1: memquota rate limiting (new-style configuration)
This configuration limits on four dimensions (app + version + request path + request header req-vin); each request is counted against the combined value of all four dimensions. For example, given the following 5 requests:
request1(app=svc1,version=v1,path=/hello,req-vin=luo)
request2(app=svc1,version=v1,path=/hello,req-vin=luo)
request3(app=svc1,version=v1,path=/bye,req-vin=luo)
request4(app=svc1,version=v1,path=/bye,req-vin=luo)
request5(app=svc1,version=v1,path=/hello,req-vin=luohq)
the corresponding request counts are:
reqAmount(app=svc1,version=v1,path=/hello,req-vin=luo)=2
reqAmount(app=svc1,version=v1,path=/bye,req-vin=luo)=2
reqAmount(app=svc1,version=v1,path=/hello,req-vin=luohq)=1
When Envoy forwards a request, it reports these four attribute values to Mixer for a check; Mixer verifies whether the counter for that dimension combination has already exceeded maxAmount within the validDuration window. If so, Envoy immediately returns 429 (Too Many Requests); otherwise Envoy continues the call to the upstream service;
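To observe this directly, a short smoke test against a throttled dimension combination is enough; a hedged sketch reusing the example host/path from this post (not the exact invocation I used):

# 10 requests carrying the throttled header value; once the counter for this
# dimension combination exceeds maxAmount, most responses should be 429:
for i in $(seq 1 10); do
  curl -s -o /dev/null -w "%{http_code}\n" -H "req-vin: luohq" http://s267.tsp/256/battery/list
done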
#===========================================================================
#========= RateLimits: memquota rate limiting (new style) - in actual use =========
#Verified working. After several rule updates the activation log stops appearing
#in istio-policy (it does appear for the first few updates);
#restarting the policy container then makes the configuration take effect.
#===========================================================================
---
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: request-count-quota
  namespace: tsp
spec:
  compiledTemplate: quota
  params:
    dimensions:
      destination: destination.labels["app"] | destination.service.name | "unknown"
      destinationVersion: destination.labels["version"] | "unknown"
      requestPath: request.url_path | "unknown"
      sourceVin: request.headers["req-vin"] | "unknown"
---
kind: handler
apiVersion: config.istio.io/v1alpha2
metadata:
  name: request-count-memquota
  namespace: tsp
spec:
  compiledAdapter: memquota
  params:
    quotas:
    - name: request-count-quota.instance.tsp
      maxAmount: 500
      validDuration: 1s
      overrides:
      - dimensions:
          sourceVin: luohq
        maxAmount: 1
        validDuration: 1s
      - dimensions:
          #requestPath (defined in the instance as: request.url_path | "unknown")
          #request.url_path carries no query-string parameters
          #e.g. s267.tsp/256/battery/list -> request.url_path=/mx_vehicle_parts_management/battery/list
          #request.url_path is the url_path after the final VirtualService rewrite
          #(request.path, by contrast, includes the query string)
          requestPath: /mx_vehicle_parts_management/batteryVendor/list
        maxAmount: 1
        validDuration: 1s
---
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: request-count-rule
  namespace: tsp
spec:
  # quota only applies if you are not logged in.
  # match: match(request.headers["cookie"], "user=*") == false
  actions:
  - handler: request-count-memquota
    instances:
    - request-count-quota
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count-qs
  namespace: tsp
spec:
  rules:
  - quotas:
    - charge: 1
      quota: request-count-quota
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count-qsb
  namespace: tsp
spec:
  quotaSpecs:
  - name: request-count-qs
    namespace: tsp
  services:
  - name: mx-vehicle-parts-management
    namespace: tsp
  - name: s267
    namespace: tsp
  # - service: '*' # Uncomment this to bind *all* services to request-count-qs
---
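Everything above can be saved to one file and applied in a single step (the file name is hypothetical):

kubectl apply -f ratelimit-memquota.yaml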
Example 2: redisquota rate limiting (new-style configuration) - used in production
The redisquota configuration is essentially the same as the memquota one; the differences are as follows (see the full configuration below):
(1) redisquota (the request-count-redisquota handler) requires Redis connection settings (redisServerUrl, connectionPoolSize);
(2) in redisquota, overrides entries may not specify validDuration (only maxAmount);
(3) redisquota lets you choose the rate-limit algorithm (rateLimitAlgorithm, bucketDuration);
#==========================================================================
#======== [Production] RateLimits: redisquota rate limiting (new style) ========
#Verified working. After several rule updates the activation log stops appearing
#in istio-policy (it does appear for the first few updates);
#restarting the policy container then makes the configuration take effect.
#==========================================================================
---
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: request-count-quota
  namespace: tsp
spec:
  compiledTemplate: quota
  params:
    dimensions:
      sourceVin: request.headers["req-vin"] | "unknown"
      destination: destination.labels["app"] | destination.service.name | "unknown"
      #Destination version: to limit across all instances of the app, omit destinationVersion;
      #if destinationVersion is set, counting is per app+version (the limit applies to each version separately).
      destinationVersion: destination.labels["version"] | "unknown"
      #Request path
      requestPath: request.url_path | "unknown"
---
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: request-count-redisquota
  namespace: tsp
spec:
  compiledAdapter: redisquota
  params:
    redisServerUrl: 192.168.xxx.xxx:6379
    connectionPoolSize: 10
    quotas:
    - name: request-count-quota.instance.tsp
      maxAmount: 500
      validDuration: 1s
      bucketDuration: 500ms
      rateLimitAlgorithm: ROLLING_WINDOW
      overrides:
      - dimensions:
          sourceVin: luohq
        #In redisquota overrides only maxAmount may be set; validDuration is not
        #allowed here (memquota overrides do allow validDuration).
        maxAmount: 1
      - dimensions:
          #requestPath (defined in the instance as: request.url_path | "unknown")
          #request.url_path carries no query-string parameters
          #e.g. s267.tsp/256/battery/list -> request.url_path=/mx_vehicle_parts_management/battery/list
          #request.url_path is the url_path after the final VirtualService rewrite
          #(request.path, by contrast, includes the query string)
          requestPath: /mx_vehicle_parts_management/batteryVendor/list
        maxAmount: 1
---
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: request-count-rule
  namespace: tsp
spec:
  # quota only applies if you are not logged in.
  # match: match(request.headers["cookie"], "session=*") == false
  actions:
  - handler: request-count-redisquota
    instances:
    - request-count-quota
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpec
metadata:
  name: request-count-qs
  namespace: tsp
spec:
  rules:
  - quotas:
    - charge: 1
      quota: request-count-quota
---
apiVersion: config.istio.io/v1alpha2
kind: QuotaSpecBinding
metadata:
  name: request-count-qsb
  namespace: tsp
spec:
  quotaSpecs:
  - name: request-count-qs
    namespace: tsp
  services:
  - name: mx-vehicle-parts-management
    namespace: tsp
  - name: s267
    namespace: tsp
  # - service: '*' # Uncomment this to bind *all* services to request-count-qs
---
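The results below were produced with a short load test (3 seconds, 3 connections, 3 worker threads). A hedged sketch of an equivalent run with fortio (the tool and flags are an assumption, not necessarily what I used):

fortio load -t 3s -c 3 -H "req-vin: luohqx" http://mx-vehicle-parts-management.tsp/mx_vehicle_parts_management/battery/list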
Result 1:
Sent requests for 3s over 3 connections with 3 worker threads:
request(
  destination: mx-vehicle-parts-management,
  destinationVersion: v256,
  path: /mx_vehicle_parts_management/battery/list,
  sourceVin: luohqx
)
Matching rate limit: maxAmount(500)/validDuration(1s)
Request success rate: 100%
Request failure rate: 0%
Result 2:
Sent requests for 3s over 3 connections with 3 worker threads:
request(
  destination: mx-vehicle-parts-management,
  destinationVersion: v256,
  path: /mx_vehicle_parts_management/battery/list,
  sourceVin: luohq
)
Matching rate limit: maxAmount(1)/validDuration(1s)
Request success rate: 0.24%
Request failure rate: 99.76%
Result 3:
Sent requests for 3s over 3 connections with 3 worker threads:
request(
  destination: mx-vehicle-parts-management,
  destinationVersion: v256,
  path: /mx_vehicle_parts_management/batteryVendor/list,
  sourceVin: luohqx
)
Matching rate limit: maxAmount(1)/validDuration(1s)
Request success rate: 0.34%
Request failure rate: 99.66%
If a ratelimit configuration change does not take effect, manually restarting istio-policy fixes it.
I will keep investigating why ratelimit changes sometimes fail to take effect, and will also keep an eye on upcoming Istio releases (currently on 1.1.11)...