Kubernetes auto scaling

    • Auto scaling
      • Create deployment
      • Create hpa
      • Increase load
      • Stop load
      • Reference
      • Update 2017/10/19
        • Reference links

Auto scaling

Create deployment

When creating a deployment (or creating a pod directly), you must specify resource requests. Otherwise auto scaling will not work properly. The key configuration is as follows:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: weather
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: weather
  template:
    metadata:
      labels:
        app: weather
      name: weather
    spec:
      containers:
      - name: weather  # a container name is required; "weather" is assumed here
        image: 172.16.18.5:30088/admin/centos7.1-v1-weather:789
        resources:
          requests:
            cpu: "4"
            memory: 8Gi
          limits:
            cpu: "4"
            memory: 8Gi

Create hpa

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: weather-autoscaling
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: weather
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 10

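With this configuration, scaling out starts once average CPU utilization exceeds 10% of the requested CPU, up to a maximum of 5 pods.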
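To make the 10% target concrete, the sketch below converts it into an absolute per-pod CPU figure, using the request of "4" CPUs from the deployment above. The helper names are hypothetical, and only the plain ("4") and millicore ("500m") quantity forms are handled.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "4" or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def scale_out_threshold_millicores(cpu_request: str, target_percent: int) -> float:
    """Average per-pod CPU usage (in millicores) that corresponds to the HPA target."""
    return cpu_to_millicores(cpu_request) * target_percent / 100


# Values taken from the manifests above: request "4", target 10%.
print(scale_out_threshold_millicores("4", 10))  # 400.0
```

In other words, with these settings an average usage above roughly 0.4 CPU per pod is enough to exceed the target.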

Increase load

Watch the number of pods from a terminal:

[root@k8s-master ~]# kubectl get pod | grep weather
weather-277902962-m0wwk     1/1     Running     0       2d

Generate load from the webbench pod:

[root@k8s-master ~]# webbench -c 50 -t 600 -2 -d "Service-Id: epg_server" http://192.168.58.30:6600/weather/v1/weather?city=2151849

During the load test, the current pod load can be checked from the command line or from the monitoring UI:

[root@k8s-master ~]# kubectl get hpa
NAME                REFERENCE           TARGET    CURRENT     MINPODS     MAXPODS     AGE
weather-autoscaling Deployment/weather  10%       48%         1           5           1d    
[root@k8s-master ~]# kubectl get pod | grep weather | wc -l
4

Stop load

After webbench stops, the pod load starts to drop.
This can be observed with the following command (one to a few minutes after the load test stops):

[root@k8s-master ~]# kubectl get hpa
NAME                REFERENCE           TARGET    CURRENT     MINPODS     MAXPODS     AGE
weather-autoscaling Deployment/weather  10%       0%          1           5           1d

CPU utilization drops to 0%, so the number of pods is reduced back to 1 accordingly.

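Note:
1. Autoscaling takes a few minutes to take effect.
2. Autoscaling only works when resource requests are set.
3. Autoscaling applies only to scalable controllers such as Deployments and ReplicationControllers (RC).
4. If a pod contains multiple containers, resource requests must be set for every container.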
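Notes 2 and 4 can be checked mechanically before an HPA is created. The sketch below assumes the pod spec has already been loaded into a Python dict (for example from the deployment's `spec.template.spec`). The function name and the two-container sample are hypothetical.

```python
from typing import Any, Dict, List


def containers_missing_cpu_request(pod_spec: Dict[str, Any]) -> List[str]:
    """Return the names of containers that declare no CPU request."""
    missing = []
    for index, container in enumerate(pod_spec.get("containers", [])):
        requests = container.get("resources", {}).get("requests", {})
        if "cpu" not in requests:
            # Fall back to the list position when the container has no name.
            missing.append(container.get("name", f"containers[{index}]"))
    return missing


# Hypothetical two-container pod spec: only "weather" sets a CPU request.
pod_spec = {
    "containers": [
        {"name": "weather", "resources": {"requests": {"cpu": "4", "memory": "8Gi"}}},
        {"name": "sidecar", "resources": {}},
    ]
}
print(containers_missing_cpu_request(pod_spec))  # ['sidecar']
```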

Reference

The formula and rules used for pod autoscaling are as follows:

The period of the autoscaler is controlled by the --horizontal-pod-autoscaler-sync-period flag of controller manager. The default value is 30 seconds.
CPU utilization is the recent CPU usage of a pod (average across the last 1 minute) divided by the CPU requested by the pod.
The target number of pods is calculated from the following formula:

TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
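As a rough illustration, the formula can be evaluated as below. The per-pod readings are hypothetical. Clamping to minReplicas and maxReplicas is an assumption added here, based on the HPA spec fields used earlier; the quoted formula itself does not mention it.

```python
import math
from typing import Sequence


def target_num_of_pods(utilizations: Sequence[float], target: float,
                       min_replicas: int, max_replicas: int) -> int:
    """ceil(sum(utilization) / target), clamped to [min_replicas, max_replicas]."""
    raw = math.ceil(sum(utilizations) / target)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical readings: four pods at 48% each, target 10%, limits 1..5.
print(target_num_of_pods([48, 48, 48, 48], 10, 1, 5))  # 5
# All pods idle: the result is raised to minReplicas.
print(target_num_of_pods([0, 0, 0, 0], 10, 1, 5))      # 1
```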

Starting and stopping pods may introduce noise to the metric (for instance, starting may temporarily increase CPU). So, after each action, the autoscaler should wait some time for reliable data. Scale-up can only happen if there was no rescaling within the last 3 minutes. Scale-down will wait for 5 minutes from the last rescaling. Moreover any scaling will only be made if:

avg(CurrentPodsConsumption) / Target

drops below 0.9 or increases above 1.1 (10% tolerance). Such approach has two benefits:
- Autoscaler works in a conservative way. If new user load appears, it is important for us to rapidly increase the number of pods, so that user requests will not be rejected. Lowering the number of pods is not that urgent.
- Autoscaler avoids thrashing, i.e.: prevents rapid execution of conflicting decision if the load is not stable.
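The tolerance band and the two waiting periods described above can be sketched as a single decision function. This is a simplified model of the quoted rules, not the controller's actual code, and the function and constant names are hypothetical.

```python
SCALE_UP_WAIT_S = 3 * 60    # no scale-up within 3 minutes of the last rescaling
SCALE_DOWN_WAIT_S = 5 * 60  # scale-down waits 5 minutes after the last rescaling


def scaling_decision(avg_consumption: float, target: float,
                     seconds_since_last_rescale: float,
                     tolerance: float = 0.1) -> str:
    """Return "up", "down", or "none" for one autoscaler evaluation."""
    ratio = avg_consumption / target
    if ratio > 1 + tolerance and seconds_since_last_rescale >= SCALE_UP_WAIT_S:
        return "up"
    if ratio < 1 - tolerance and seconds_since_last_rescale >= SCALE_DOWN_WAIT_S:
        return "down"
    return "none"


print(scaling_decision(48, 10, 200))    # up: ratio 4.8, more than 3 minutes elapsed
print(scaling_decision(10.5, 10, 600))  # none: ratio 1.05 is inside the 10% band
print(scaling_decision(0, 10, 200))     # none: scale-down must wait 5 minutes
print(scaling_decision(0, 10, 301))     # down
```

The asymmetric waits reflect the first benefit listed above: adding pods is treated as more urgent than removing them.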

Update 2017/10/19

Starting with Kubernetes 1.8, autoscaling on memory and custom metrics is supported.

The Horizontal Pod Autoscaler is an API resource in the Kubernetes autoscaling API group. The current stable version, which only includes support for CPU autoscaling, can be found in the autoscaling/v1 API version.

The beta version, which includes support for scaling on memory and custom metrics, can be found in autoscaling/v2beta1. The new fields introduced in autoscaling/v2beta1 are preserved as annotations when working with autoscaling/v1.

Taking memory as an example:

  1. Define the configuration:
[root@walker-1 test]# cat test-auto.yaml 
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 50
  2. Write a script that consumes memory and run it inside the container.
  3. Observe the result:
[root@walker-1 test]# kubectl describe hpa test
Name:                                                     test
Namespace:                                                default
...
Reference:                                                Deployment/test
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  57% (52979712) / 50%
Min replicas:                                             1
Max replicas:                                             3
...
[root@walker-1 ~]# kubectl get pod -o wide | grep test
test-5567b449f-mgghr                1/1       Running   19         6h        192.168.135.99    walker-2.novalocal
test-5567b449f-tlqtd                1/1       Running   0          6m        192.168.218.220   walker-4.novalocal
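The second pod in the output above is consistent with the scaling formula from the reference section, rewritten in terms of the average utilization: desired = ceil(currentReplicas * current / target). The sketch below assumes the 57% reading was taken while a single replica was running; the output does not state this explicitly.

```python
import math


def desired_replicas(current_replicas: int, current_percent: float,
                     target_percent: float, max_replicas: int) -> int:
    """ceil(current_replicas * current / target), capped at max_replicas."""
    raw = math.ceil(current_replicas * current_percent / target_percent)
    return min(raw, max_replicas)


# 57% observed against a 50% target, assuming one replica at the time.
print(desired_replicas(1, 57, 50, 3))  # 2
```

The ratio 57 / 50 = 1.14 also lies outside the 10% tolerance band, so a scale-up from one pod to two is the expected outcome.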

Reference links

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/horizontal-pod-autoscaler.md
