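Virtualization is the foundation of cloud computing platforms. Its goal is to consolidate or partition computing resources, which makes it a key technology in cloud management platforms. By routing resource management through a virtualization layer, the platform gains the flexibility to pool or divide computing resources as needed.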
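Compared with virtual machines, the newer container technology relies on a set of system-level mechanisms: Linux Namespaces provide isolation, filesystem mount points determine which files a container can access, and Cgroups determine how many resources each container may use. In addition, containers share a single system kernel, so when one kernel serves many containers, memory is used more efficiently.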
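Containers and virtual machines are implemented in completely different ways, yet their resource requirements and resource models are similar. Like a virtual machine, a container needs memory, CPU, disk space and network bandwidth. The host system can treat a virtual machine or a container as a single unit, allocate the resources that unit needs, and manage it. That said, virtual machines offer the security of a dedicated operating system and firmer logical boundaries, whereas containers have looser resource boundaries, which brings both flexibility and uncertainty.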
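Kubernetes is a container cluster management platform. It has to track resource usage across the whole platform, allocate resources to containers sensibly, and guarantee that each container has enough resources to run throughout its lifecycle. Furthermore, if allocation is exclusive, meaning that a resource granted to one container is never granted to another, then an idle container holding resources it does not use (CPU, for example) is very wasteful. Kubernetes therefore has to consider how to raise resource utilization while still respecting priority and fairness.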
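When creating a Pod, you can specify compute resources (CPU and memory are the resource types currently supported) by setting a resource request (Request) and a resource limit (Limit) for each container. The request is the minimum amount of a resource the container needs; the limit is the upper bound the container may not exceed. They are related as follows: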
0<=request<=limit<=infinity
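A Pod's resource request is the sum of the resource requests of its containers. When scheduling a Pod, Kubernetes looks at the Node's total resources (obtained through the cAdvisor interface) and at the compute resources already committed on that Node (the requests of the Pods already placed there) to decide whether the Node can satisfy the request.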
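The summing rule and the fit check described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Resources` type and the `pod_request` and `node_fits` helpers are hypothetical names, not part of any Kubernetes API, and a real scheduler takes many more factors into account.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resources:
    cpu_milli: int      # CPU in millicores, e.g. 500 stands for "500m"
    memory_bytes: int   # memory in bytes


def pod_request(container_requests: List[Resources]) -> Resources:
    """A Pod's request is the sum of its containers' requests."""
    return Resources(
        cpu_milli=sum(c.cpu_milli for c in container_requests),
        memory_bytes=sum(c.memory_bytes for c in container_requests),
    )


def node_fits(capacity: Resources, committed: Resources, pod: Resources) -> bool:
    """Return True if the Pod's request fits into what is left on the node."""
    return (
        committed.cpu_milli + pod.cpu_milli <= capacity.cpu_milli
        and committed.memory_bytes + pod.memory_bytes <= capacity.memory_bytes
    )


MI = 1024 * 1024

# Hypothetical Pod with two containers: 500m/256Mi and 250m/128Mi.
pod = pod_request([Resources(500, 256 * MI), Resources(250, 128 * MI)])

# Hypothetical node: 4 cores and 8Gi in total.
node = Resources(4000, 8192 * MI)
busy = Resources(3500, 4096 * MI)   # only 500m CPU left
idle = Resources(3000, 4096 * MI)   # 1000m CPU left

print(pod.cpu_milli, pod.memory_bytes // MI)   # 750 384
print(node_fits(node, busy, pod))              # False
print(node_fits(node, idle, pod))              # True
```

A Node is rejected as soon as any single resource does not fit, which is why the first check fails on CPU even though plenty of memory is free.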
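Resource requests guarantee that a Pod has enough resources to run, while resource limits stop any one Pod from consuming resources without bound and crashing other Pods. This matters especially in public clouds, where malicious software often attacks the platform by grabbing memory.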
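How it works: Docker uses Linux Cgroups to control container resources. In terms of startup parameters, these are --memory and --cpu-shares. Kubernetes controls container resources by setting these two parameters.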
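To make the mapping from Pod quantities to Cgroup values concrete, here is a minimal Python sketch. It assumes the conversion usually attributed to the kubelet: 1024 CPU shares per full CPU for the request, a CFS quota proportional to the limit over a 100000 microsecond period, and the memory limit passed through in bytes. The function names are hypothetical, the CFS quota part goes beyond the two flags named above, and the values seen on a real node depend on the Kubernetes version and on the spec actually in effect.

```python
MILLI_PER_CPU = 1000
SHARES_PER_CPU = 1024            # assumed cpu.shares value for one full CPU
MIN_SHARES = 2                   # assumed lower bound for cpu.shares
DEFAULT_CFS_PERIOD_US = 100000   # assumed CFS period in microseconds

_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def parse_cpu_milli(quantity: str) -> int:
    """Parse a CPU quantity such as "500m" or "2" into millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * MILLI_PER_CPU))


def parse_memory_bytes(quantity: str) -> int:
    """Parse a memory quantity such as "512Mi" into bytes (binary suffixes only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def cpu_shares(request_milli: int) -> int:
    """CPU request -> value for --cpu-shares."""
    return max(request_milli * SHARES_PER_CPU // MILLI_PER_CPU, MIN_SHARES)


def cpu_quota_us(limit_milli: int, period_us: int = DEFAULT_CFS_PERIOD_US) -> int:
    """CPU limit -> CFS quota in microseconds per period."""
    return limit_milli * period_us // MILLI_PER_CPU


print(cpu_shares(parse_cpu_milli("500m")))     # 512
print(cpu_quota_us(parse_cpu_milli("1000m")))  # 100000
print(parse_memory_bytes("512Mi"))             # 536870912, the value for --memory
```

Shares are a relative weight that only matters when CPUs are contended, whereas the quota and the memory value are hard caps; that is why the request maps to the former and the limits map to the latter.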
The following example shows a Pod requesting memory and CPU:
[root@k8s-master demon2]# cat test-limit.yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    name: test-limit
    role: master
  name: test-limit
spec:
  containers:
    - name: test-limit
      image: registry:5000/back_demon:1.0
      resources:
        requests:
          memory: "256Mi"
          cpu: "500m"
        limits:
          memory: "512Mi"
          cpu: "1000m"
      command:
        - /run.sh
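Once the Pod has been scheduled onto a particular machine, inspect the corresponding container on that machine. The resource settings appear under HostConfig in the fields CpuShares, CpuPeriod, CpuQuota and Memory; Memory is reported in bytes, so the 512Mi limit shows up as 536870912. The output is as follows: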
[root@k8s-node-4 home]# docker inspect 1fdbd6f1b39b
[
  {
    "Id": "1fdbd6f1b39b561d09084adafb382b721959e5edd0ee9538472313ed0a39162a",
    "Created": "2017-03-20T05:40:30.756006226Z",
    "Path": "/run.sh",
    "Args": [],
    "State": {
      "Status": "running",
      "Running": true,
      "Paused": false,
      "Restarting": false,
      "OOMKilled": false,
      "Dead": false,
      "Pid": 18603,
      "ExitCode": 0,
      "Error": "",
      "StartedAt": "2017-03-20T05:40:31.113657323Z",
      "FinishedAt": "0001-01-01T00:00:00Z"
    },
    "Image": "sha256:9369911131d30b12759074e5b72356345446996bf6044950c2def787471e9b4c",
    "ResolvConfPath": "/var/lib/docker/containers/8fdb38a4d0074b07ff2f07e21fd8602fdbf4267eafc76179e931d4f5d9265940/resolv.conf",
    "HostnamePath": "/var/lib/docker/containers/8fdb38a4d0074b07ff2f07e21fd8602fdbf4267eafc76179e931d4f5d9265940/hostname",
    "HostsPath": "/var/lib/kubelet/pods/ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b/etc-hosts",
    "LogPath": "/var/lib/docker/containers/1fdbd6f1b39b561d09084adafb382b721959e5edd0ee9538472313ed0a39162a/1fdbd6f1b39b561d09084adafb382b721959e5edd0ee9538472313ed0a39162a-json.log",
    "Name": "/k8s_test-limit.79cbd53f_test-limit_default_ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b_efc94078",
    "RestartCount": 0,
    "Driver": "devicemapper",
    "MountLabel": "",
    "ProcessLabel": "",
    "AppArmorProfile": "",
    "ExecIDs": null,
    "HostConfig": {
      "Binds": [
        "/var/lib/kubelet/pods/ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b/etc-hosts:/etc/hosts:Z",
        "/var/lib/kubelet/pods/ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b/containers/test-limit/efc94078:/dev/termination-log:Z"
      ],
      "ContainerIDFile": "",
      "LogConfig": {
        "Type": "json-file",
        "Config": {}
      },
      "NetworkMode": "container:8fdb38a4d0074b07ff2f07e21fd8602fdbf4267eafc76179e931d4f5d9265940",
      "PortBindings": null,
      "RestartPolicy": {
        "Name": "",
        "MaximumRetryCount": 0
      },
      "AutoRemove": false,
      "VolumeDriver": "",
      "VolumesFrom": null,
      "CapAdd": null,
      "CapDrop": null,
      "Dns": null,
      "DnsOptions": null,
      "DnsSearch": null,
      "ExtraHosts": null,
      "GroupAdd": null,
      "IpcMode": "container:8fdb38a4d0074b07ff2f07e21fd8602fdbf4267eafc76179e931d4f5d9265940",
      "Cgroup": "",
      "Links": null,
      "OomScoreAdj": 984,
      "PidMode": "",
      "Privileged": false,
      "PublishAllPorts": false,
      "ReadonlyRootfs": false,
      "SecurityOpt": [
        "seccomp=unconfined"
      ],
      "UTSMode": "",
      "UsernsMode": "",
      "ShmSize": 67108864,
      "Runtime": "docker-runc",
      "ConsoleSize": [
        0,
        0
      ],
      "Isolation": "",
      "CpuShares": 1024,
      "Memory": 536870912,
      "CgroupParent": "",
      "BlkioWeight": 0,
      "BlkioWeightDevice": null,
      "BlkioDeviceReadBps": null,
      "BlkioDeviceWriteBps": null,
      "BlkioDeviceReadIOps": null,
      "BlkioDeviceWriteIOps": null,
      "CpuPeriod": 100000,
      "CpuQuota": 200000,
      "CpusetCpus": "",
      "CpusetMems": "",
      "Devices": [],
      "DiskQuota": 0,
      "KernelMemory": 0,
      "MemoryReservation": 0,
      "MemorySwap": -1,
      "MemorySwappiness": -1,
      "OomKillDisable": false,
      "PidsLimit": 0,
      "Ulimits": null,
      "CpuCount": 0,
      "CpuPercent": 0,
      "IOMaximumIOps": 0,
      "IOMaximumBandwidth": 0
    },
    "GraphDriver": {
      "Name": "devicemapper",
      "Data": {
        "DeviceId": "210",
        "DeviceName": "docker-253:0-100693626-cb08877222f483f043fc45c5c4b024de8da9c393c3c06c6252d3c59d330dd4d4",
        "DeviceSize": "10737418240"
      }
    },
    "Mounts": [
      {
        "Source": "/var/lib/kubelet/pods/ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b/etc-hosts",
        "Destination": "/etc/hosts",
        "Mode": "Z",
        "RW": true,
        "Propagation": "rprivate"
      },
      {
        "Source": "/var/lib/kubelet/pods/ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b/containers/test-limit/efc94078",
        "Destination": "/dev/termination-log",
        "Mode": "Z",
        "RW": true,
        "Propagation": "rprivate"
      }
    ],
    "Config": {
      "Hostname": "test-limit",
      "Domainname": "",
      "User": "",
      "AttachStdin": false,
      "AttachStdout": false,
      "AttachStderr": false,
      "ExposedPorts": {
        "222/tcp": {}
      },
      "Tty": false,
      "OpenStdin": false,
      "StdinOnce": false,
      "Env": [
        "KUBERNETES_PORT_443_TCP_PROTO=tcp",
        "KUBERNETES_PORT_443_TCP_PORT=443",
        "FRONTEND_SERVICE_PORT=tcp://10.254.232.119:8080",
        "REDIS_SERVICE_SERVICE_PORT=6379",
        "REDIS_SERVICE_PORT_6379_TCP_ADDR=10.254.71.136",
        "KUBERNETES_SERVICE_PORT_HTTPS=443",
        "KUBERNETES_SERVICE_HOST=10.254.0.1",
        "KUBERNETES_PORT_443_TCP=tcp://10.254.0.1:443",
        "BACK_SERVICE_PORT_8080_TCP_PORT=8080",
        "FRONTEND_SERVICE_PORT_8080_TCP=tcp://10.254.232.119:8080",
        "FRONTEND_SERVICE_PORT_8080_TCP_ADDR=10.254.232.119",
        "REDIS_SERVICE_PORT_6379_TCP=tcp://10.254.71.136:6379",
        "REDIS_MASTER_PORT_6379_TCP_PORT=6379",
        "FRONTEND_PORT_80_TCP_PORT=80",
        "REDIS_MASTER_PORT_6379_TCP_ADDR=10.254.132.210",
        "REDIS_SLAVE_PORT_6379_TCP=tcp://10.254.104.23:6379",
        "REDIS_SLAVE_PORT_6379_TCP_PORT=6379",
        "BACK_SERVICE_SERVICE_HOST=10.254.246.51",
        "BACK_SERVICE_PORT=tcp://10.254.246.51:8080",
        "BACK_SERVICE_PORT_8080_TCP_PROTO=tcp",
        "FRONTEND_PORT=tcp://10.254.93.91:80",
        "REDIS_MASTER_SERVICE_HOST=10.254.132.210",
        "REDIS_MASTER_PORT_6379_TCP_PROTO=tcp",
        "KUBERNETES_SERVICE_PORT=443",
        "FRONTEND_SERVICE_PORT_8080_TCP_PROTO=tcp",
        "REDIS_MASTER_SERVICE_PORT=6379",
        "REDIS_SLAVE_SERVICE_HOST=10.254.104.23",
        "REDIS_SLAVE_PORT=tcp://10.254.104.23:6379",
        "REDIS_SERVICE_PORT_6379_TCP_PROTO=tcp",
        "REDIS_MASTER_PORT=tcp://10.254.132.210:6379",
        "KUBERNETES_PORT_443_TCP_ADDR=10.254.0.1",
        "BACK_SERVICE_SERVICE_PORT=8080",
        "FRONTEND_SERVICE_HOST=10.254.93.91",
        "FRONTEND_SERVICE_SERVICE_PORT=8080",
        "REDIS_SERVICE_SERVICE_HOST=10.254.71.136",
        "REDIS_SERVICE_PORT=tcp://10.254.71.136:6379",
        "BACK_SERVICE_PORT_8080_TCP=tcp://10.254.246.51:8080",
        "REDIS_SLAVE_SERVICE_PORT=6379",
        "REDIS_SLAVE_PORT_6379_TCP_PROTO=tcp",
        "REDIS_SLAVE_PORT_6379_TCP_ADDR=10.254.104.23",
        "KUBERNETES_PORT=tcp://10.254.0.1:443",
        "BACK_SERVICE_PORT_8080_TCP_ADDR=10.254.246.51",
        "FRONTEND_PORT_80_TCP=tcp://10.254.93.91:80",
        "FRONTEND_PORT_80_TCP_ADDR=10.254.93.91",
        "FRONTEND_SERVICE_PORT_8080_TCP_PORT=8080",
        "REDIS_MASTER_PORT_6379_TCP=tcp://10.254.132.210:6379",
        "FRONTEND_PORT_80_TCP_PROTO=tcp",
        "FRONTEND_SERVICE_SERVICE_HOST=10.254.232.119",
        "REDIS_SERVICE_PORT_6379_TCP_PORT=6379"
      ],
      "Cmd": null,
      "Image": "registry:5000/back_demon:1.0",
      "Volumes": null,
      "WorkingDir": "",
      "Entrypoint": [
        "/run.sh"
      ],
      "OnBuild": null,
      "Labels": {
        "io.kubernetes.container.hash": "79cbd53f",
        "io.kubernetes.container.name": "test-limit",
        "io.kubernetes.container.restartCount": "0",
        "io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
        "io.kubernetes.pod.name": "test-limit",
        "io.kubernetes.pod.namespace": "default",
        "io.kubernetes.pod.terminationGracePeriod": "30",
        "io.kubernetes.pod.uid": "ba75e7a9-0d2f-11e7-b3d5-fa163ebba51b"
      }
    },
    "NetworkSettings": {
      "Bridge": "",
      "SandboxID": "",
      "HairpinMode": false,
      "LinkLocalIPv6Address": "",
      "LinkLocalIPv6PrefixLen": 0,
      "Ports": null,
      "SandboxKey": "",
      "SecondaryIPAddresses": null,
      "SecondaryIPv6Addresses": null,
      "EndpointID": "",
      "Gateway": "",
      "GlobalIPv6Address": "",
      "GlobalIPv6PrefixLen": 0,
      "IPAddress": "",
      "IPPrefixLen": 0,
      "IPv6Gateway": "",
      "MacAddress": "",
      "Networks": null
    }
  }
]
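Because the full inspect output is long, it can help to extract only the Cgroup-related fields. The following Python sketch does that for JSON text in the shape printed by docker inspect; the helper name `resource_settings` is hypothetical, and the sample string is a trimmed copy of the HostConfig values shown above rather than fresh output.

```python
import json

# The HostConfig fields that carry the CPU and memory settings.
RESOURCE_FIELDS = ("CpuShares", "CpuPeriod", "CpuQuota", "Memory")


def resource_settings(inspect_output: str) -> dict:
    """Pick the resource-related fields out of `docker inspect` JSON output."""
    containers = json.loads(inspect_output)   # docker inspect prints a JSON array
    host_config = containers[0]["HostConfig"]
    return {name: host_config.get(name) for name in RESOURCE_FIELDS}


# Trimmed sample containing only the relevant values from the output above.
sample = (
    '[{"HostConfig": {"CpuShares": 1024, "Memory": 536870912, '
    '"CpuPeriod": 100000, "CpuQuota": 200000}}]'
)

settings = resource_settings(sample)
print(settings)
print(settings["Memory"] // (1024 * 1024), "MiB")   # 512 MiB
```

On a live node the same function could be fed the text captured from the docker inspect command, for instance through the standard subprocess module.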
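LimitRange was designed to cover the following scenarios:
Constrain the resource demands of a tenant.
Constrain the range of resource requests for a container.
Constrain the range of resource requests for a Pod.
Specify default resource limits for a container.
Specify default resource limits for a Pod.
Constrain the ratio between resource requests and limits.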
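The kinds of checks listed above can be illustrated with a small Python sketch for a single container and a single resource (CPU in millicores). It is a simplified model, not the LimitRange implementation: the `ContainerRule` type, its field names and the `apply_rule` function are hypothetical, and the real defaulting behaviour in Kubernetes has more cases than shown here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ContainerRule:
    """Hypothetical, simplified stand-in for one container-level rule."""
    min_cpu: int                     # smallest request allowed
    max_cpu: int                     # largest limit allowed
    default_request: int             # used when the container sets no request
    default_limit: int               # used when the container sets no limit
    max_limit_request_ratio: float   # upper bound for limit / request


def apply_rule(rule: ContainerRule,
               request: Optional[int],
               limit: Optional[int]) -> Tuple[int, int]:
    """Fill in defaults, then validate range and ratio; raise ValueError on violation."""
    if request is None:
        request = rule.default_request
    if limit is None:
        limit = rule.default_limit
    if request > limit:
        raise ValueError("request exceeds limit")
    if request < rule.min_cpu:
        raise ValueError("request is below the allowed minimum")
    if limit > rule.max_cpu:
        raise ValueError("limit is above the allowed maximum")
    if request > 0 and limit / request > rule.max_limit_request_ratio:
        raise ValueError("limit/request ratio is too large")
    return request, limit


rule = ContainerRule(min_cpu=100, max_cpu=2000,
                     default_request=250, default_limit=500,
                     max_limit_request_ratio=4)

print(apply_rule(rule, None, None))   # (250, 500), defaults applied
print(apply_rule(rule, 500, 1000))    # (500, 1000), within range
try:
    apply_rule(rule, 100, 1000)       # ratio 10 exceeds the allowed 4
except ValueError as err:
    print("rejected:", err)
```

Bounding the ratio keeps a container from requesting very little while being allowed to burst very high, which would make scheduling decisions based on requests unreliable.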
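Kubernetes is a multi-tenant architecture. When multiple users or teams share one Kubernetes system, the administrator must keep tenants from grabbing each other's resources and must define a resource allocation policy. For example, suppose the system has 20 CPU cores and 32 GB of memory in total: tenant A is given 5 cores and 16 GB, tenant B is given 5 cores and 8 GB, and 10 cores and 8 GB are held in reserve. The total CPU and memory used within a tenant then cannot exceed that tenant's quota, which pushes tenants to use resources more sensibly.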
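Kubernetes provides the ResourceQuota API object to implement resource quotas. A ResourceQuota applies not only to CPU and memory; it can also limit, for example, the total number of Pods, Services and RCs (ReplicationControllers) that may be created.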
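By default, a Namespace has no ResourceQuota; one has to be created separately. Once a Namespace has a ResourceQuota covering compute resources, every Pod created in it must specify resource requests, otherwise Pod creation fails.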
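The admission behaviour described above can be modelled with a short Python sketch, using tenant A from the earlier example (5 cores, 16 GB). The `NamespaceQuota` class, its fields and the pod-count cap of 10 are hypothetical choices for illustration; the real ResourceQuota object is declared as a Kubernetes manifest and enforced by the API server.

```python
from dataclasses import dataclass
from typing import Optional


class QuotaError(Exception):
    """Raised when a Pod cannot be admitted into the namespace."""


@dataclass
class NamespaceQuota:
    cpu_milli: int
    memory_bytes: int
    max_pods: int
    used_cpu_milli: int = 0
    used_memory_bytes: int = 0
    used_pods: int = 0

    def admit(self, cpu_request: Optional[int], memory_request: Optional[int]) -> None:
        """Admit a Pod and record its usage, or raise QuotaError."""
        # With a quota in place, a Pod without requests is rejected outright.
        if cpu_request is None or memory_request is None:
            raise QuotaError("Pod must specify CPU and memory requests")
        if self.used_pods + 1 > self.max_pods:
            raise QuotaError("Pod count quota exceeded")
        if self.used_cpu_milli + cpu_request > self.cpu_milli:
            raise QuotaError("CPU quota exceeded")
        if self.used_memory_bytes + memory_request > self.memory_bytes:
            raise QuotaError("memory quota exceeded")
        self.used_cpu_milli += cpu_request
        self.used_memory_bytes += memory_request
        self.used_pods += 1


GI = 1024 ** 3
tenant_a = NamespaceQuota(cpu_milli=5000, memory_bytes=16 * GI, max_pods=10)

tenant_a.admit(2000, 8 * GI)   # admitted
tenant_a.admit(2000, 4 * GI)   # admitted, 4000m and 12Gi now in use
for cpu, mem in [(2000, 2 * GI), (None, None)]:
    try:
        tenant_a.admit(cpu, mem)
    except QuotaError as err:
        print("rejected:", err)
print(tenant_a.used_cpu_milli, tenant_a.used_pods)   # 4000 2
```

Rejected Pods leave the recorded usage untouched, so a later, smaller Pod can still be admitted as long as it fits into the remaining quota.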