The StarlingX Patching Feature

Patching Overview

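StarlingX can upgrade a deployed system in place. This feature, called "patching", moves a system from one version to another and is mainly used for bug fixes, security patches, and feature enhancements.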

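Patching supports two kinds of patches: In-Service patches and Reboot-required patches. An In-Service patch does not require a host reboot; restarting the affected service processes is enough. A Reboot-required patch takes effect only after the host is rebooted. To install a Reboot-required patch, first lock the host, wait until the patch is applied, and then unlock the host so that the patch takes effect.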

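This introduction is aimed at developers who use the patching feature; it is not a product user guide. It concentrates on the core patching workflow rather than covering every aspect of patching.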

In short, patching has two phases: creating a patch and applying it. Both are described in detail below.

 

Creating a Patch

 

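A StarlingX patch contains one or more rpm packages needed to upgrade the system. Before creating a patch, check the rpm packages against what is already installed on the deployed StarlingX system. The following steps help with that.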

  1. Determine the software version of the deployed system. There are two ways to do this:
    • In the Horizon interface: Admin -> Platform -> System Configuration -> Systems
    • On the command line: system show
    controller-0:~$ . /etc/platform/openrc
    [sysadmin@controller-0 ~(keystone_admin)]$ system show
    +----------------------+--------------------------------------+
    | Property             | Value                                |
    +----------------------+--------------------------------------+
    | contact              | None                                 |
    | created_at           | 2019-10-14T03:10:50.862114+00:00     |
    | description          | None                                 |
    | https_enabled        | False                                |
    | location             | None                                 |
    | name                 | 608dfe48-9a05-4b21-afc1-ea122574caa7 |
    | region_name          | RegionOne                            |
    | sdn_enabled          | False                                |
    | security_feature     | spectre_meltdown_v1                  |
    | service_project_name | services                             |
    | software_version     | 19.09                                |
    | system_mode          | duplex                               |
    | system_type          | All-in-one                           |
    | timezone             | UTC                                  |
    | updated_at           | 2019-10-14T03:12:41.983029+00:00     |
    | uuid                 | 2639ad15-08a7-4f1b-a372-f927a5e4ab31 |
    | vswitch_type         | none                                 |
    +----------------------+--------------------------------------+
  2. Check the latest build, find the rpm packages that need upgrading for this version, and select the rpms to include in the patch
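The version lookup in step 1 is easy to script. The following is a minimal sketch that extracts software_version from text in the format printed by system show above. The function name parse_property_table and the shortened SAMPLE text are illustrative assumptions, not part of StarlingX.

```python
def parse_property_table(text):
    """Parse a two-column '| Property | Value |' table into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders (+----+), prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # skip the header row and anything malformed
        props[cells[0]] = cells[1]
    return props


# Shortened copy of the `system show` output shown above.
SAMPLE = """\
+----------------------+--------------------------------------+
| Property             | Value                                |
+----------------------+--------------------------------------+
| software_version     | 19.09                                |
| system_mode          | duplex                               |
| system_type          | All-in-one                           |
+----------------------+--------------------------------------+
"""

if __name__ == "__main__":
    print(parse_property_table(SAMPLE)["software_version"])
```

On a live controller the same function could be fed the captured output of system show after sourcing /etc/platform/openrc.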

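Once the rpm packages to upgrade or install have been identified, the next step is to prepare a patch build environment. For a StarlingX developer, the easiest approach is to use the StarlingX build container, which needs only minor changes. The build container can be created by following the build guide.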

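The following assumes that the StarlingX source code has been downloaded and that the rpm packages to upgrade or install are ready, so we can set up the patch build environment. Again, this tutorial targets developers, not production use.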

  1. Install the two dependencies of cgcs-patch, crypto and pycrypto

sudo pip install crypto pycrypto

  2. Create the patch with the script $MY_REPO/stx/stx-update/extras/scripts/patch_build.sh.

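The script reads the PLATFORM_RELEASE parameter from the release-info.inc file and points PYTHONPATH at the cgcs-patch package in the repo, so there is no need to install cgcs-patch or set PLATFORM_RELEASE by hand. The following command prints the usage of the build script.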

    $ $MY_REPO/stx/stx-update/cgcs-patch/bin/patch_build --help
    Usage: patch_build [  ] ... 
    Options:
        --id                    Patch ID
        --release          Platform release version
        --status            Patch Status Code (ie. O, R, V)
        --unremovable               Marks patch as unremovable
        --reboot-required      Marks patch as reboot-required (default=Y)
        --summary          Patch Summary
        --desc         Patch Description
        --warn            Patch Warnings
        --inst        Patch Install Instructions
        --req             Required Patch
        --controller           New package for controller
        --worker               New package for worker node
        --worker-lowlatency    New package for worker-lowlatency node
        --storage              New package for storage node
        --controller-worker    New package for combined node
        --controller-worker-lowlatency    New package for lowlatency
                                    combined node
        --all-nodes            New package for all node types

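The script lets you specify the patch ID, whether a reboot is required, the patches this one depends on, the rpm list, and so on. A package that is not yet on the system and has to be newly installed must be assigned to a node type; for example, --controller installs a new package on controller nodes. When the script finishes, it produces a file with the .patch extension (001.patch in the examples below).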
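To make the option list more concrete, here is a hedged sketch that assembles a patch_build argument list. The option names come from the help text above, but the value formats are assumptions because the placeholders are missing from that help text: this sketch assumes --reboot-required takes Y or N, that --req and --controller can be repeated, and that the rpm list goes last. The rpm file name is hypothetical.

```python
import shlex


def build_patch_command(program, patch_id, rpms, reboot_required=True,
                        summary=None, requires=(), new_controller_rpms=()):
    """Assemble an argument list for patch_build (value formats are assumed)."""
    argv = [program, "--id", patch_id,
            "--reboot-required", "Y" if reboot_required else "N"]
    if summary:
        argv += ["--summary", summary]
    for required_patch in requires:
        argv += ["--req", required_patch]      # assumed: one --req per patch
    for rpm in new_controller_rpms:
        argv += ["--controller", rpm]          # new package for controllers
    argv += list(rpms)                         # assumed: rpm list comes last
    return argv


if __name__ == "__main__":
    command = build_patch_command(
        "patch_build", "001", ["example-1.0-1.x86_64.rpm"],  # hypothetical rpm
        reboot_required=False, summary="Demo in-service patch")
    print(shlex.join(command))
```

Building the command as a list rather than a string keeps summaries with spaces intact when the list is passed to subprocess.run. Check the real usage output before relying on any of the assumed formats.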

Let us take a closer look at this patch file.

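  1. First, the patch file is a gzip-compressed archive, which the file command confirms.
    $ file 001.patch
    001.patch: gzip compressed data, was "001.patch", last modified:
    Fri Aug 16 05:56:59 2019, max compression
  2. Extracting it yields the following files.
    $ tar -xf 001.patch
    $ tree
    ├── 001.patch
    ├── metadata.tar
    ├── signature
    ├── signature.v2
    └── software.tar
  3. Extracting software.tar shows that it contains all the rpm packages to be installed. Note: every rpm package is signed during the patch build with the following key.
    $MY_REPO/build-tools/signing/ima_signing_key.priv
  4. metadata.tar contains a single file, metadata.xml, which holds all the information about the patch build. The StarlingX cluster reads this file.
  5. The signature file contains a combination of the MD5 checksums of software.tar and metadata.tar.
  6. signature.v2 is the signature of software.tar and metadata.tar. In the current environment it is generated with the key file $MY_REPO/build-tools/signing/dev-private-key.pem.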
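The layout described in the list above can also be inspected programmatically. The sketch below opens a gzip-compressed tar archive with Python's standard tarfile module and reports the MD5 digest of each member. It does not try to reproduce how the signature file combines the digests, since the text does not specify that. The make_demo_patch helper builds a fake in-memory archive purely so the example runs without a real patch.

```python
import hashlib
import io
import tarfile


def inspect_patch(fileobj):
    """Return {member name: MD5 hex digest} for a gzip-compressed tar archive."""
    digests = {}
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # ignore directories and links
            data = archive.extractfile(member).read()
            digests[member.name] = hashlib.md5(data).hexdigest()
    return digests


def make_demo_patch():
    """Build a tiny in-memory stand-in for 001.patch (contents are fake)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in ("metadata.tar", "software.tar", "signature", "signature.v2"):
            payload = ("demo " + name).encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, digest in sorted(inspect_patch(make_demo_patch()).items()):
        print(name, digest)
```

For a real patch, pass an open binary file instead, for example inspect_patch(open("001.patch", "rb")).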

 

Installing a Patch

 

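Once built, a patch can be installed manually on a target StarlingX system, from either the Horizon interface or the command line. A patch moves through four states during its lifecycle: Available, Partial-Apply, Applied, and Partial-Remove.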

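  • Available: the patch has been uploaded to the patch storage area, but it has not been added to the software update repository and has not been installed on any host.
  • Partial-Apply: the patch installation has been triggered (sw-patch apply) and the patch is installed on some hosts, but not yet on every host that needs it.
  • Applied: the patch is installed on every host that needs it.
  • Partial-Remove: removal of the patch has been triggered (sw-patch remove) and is in progress, but the patch has not yet been removed from every host.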
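The four states above can be read as a small state machine. The sketch below models only the main path. The event names are hypothetical labels, and the final transition from Partial-Remove back to Available is an assumption about what happens once removal has finished on every host; the text above does not state it.

```python
# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    ("Available", "apply"): "Partial-Apply",                 # sw-patch apply
    ("Partial-Apply", "all-hosts-installed"): "Applied",     # host-install done everywhere
    ("Applied", "remove"): "Partial-Remove",                 # sw-patch remove
    ("Partial-Remove", "all-hosts-installed"): "Available",  # assumed end of removal
}


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state!r}") from None


if __name__ == "__main__":
    state = "Available"
    for event in ("apply", "all-hosts-installed", "remove", "all-hosts-installed"):
        state = next_state(state, event)
        print(event, "->", state)
```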

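To install a patch from the command line, first copy it to the active controller. StarlingX provides the client command sw-patch, which handles all patch operations through subcommands such as upload, apply, query, host-install, delete, and remove.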

    controller-0:~$ sw-patch --help
    usage: sw-patch [--debug]
                       ...

    Subcomands:
        upload:         Upload one or more patches to the patching system.

        upload-dir:     Upload patches from one or more directories to the
                        patching system.

        apply:          Apply one or more patches. This adds the specified
                        patches to the repository, making the update(s)
                        available to the hosts in the system. Use --all to
                        apply all available patches.
                        Patches are specified as a space-separated list of
                        patch IDs.

        remove:         Remove one or more patches. This removes the specified
                        patches from the repository.
                        Patches are specified as a space-separated list of
                        patch IDs.

        delete:         Delete one or more patches from the patching system.
                        Patches are specified as a space-separated list of
                        patch IDs.

        query:          Query system patches. Optionally, specify 'query
                        applied' to query only those patches that are applied,
                        or 'query available' to query those that are not.

        show:           Show details for specified patches.

        what-requires:  List patches that require the specified patches.

        query-hosts:    Query patch states for hosts in the system.

        host-install:   Trigger patch install/remove on specified host. To
                        force install on unlocked node, use the --force option.

        host-install-async: Trigger patch install/remove on specified host. To
                        force install on unlocked node, use the --force option.
                        Note: This command returns immediately upon dispatching
                        installation request.

        install-local:  Trigger patch install/remove on the local host. This
                        command can only be used for patch installation prior
                        to initial configuration.

        drop-host:      Drop specified host from table.

        query-dependencies: List dependencies for specified patch. Use
                        --recursive for recursive query.

        is-applied:     Query Applied state for list of patches. Returns True
                        if all are Applied, False otherwise.

        report-app-dependencies: Report application patch dependencies,
                        specifying application name with --app option, plus a
                        list of patches. Reported dependencies can be dropped
                        by specifying app with no patch list.

        query-app-dependencies: Display set of reported application patch
                        dependencies.

        commit:         Commit patches to free disk space. WARNING: This
                        action is irreversible!

        --os-region-name: Send the request to a specified region

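The following demonstrates how to install a patch with this command. The demo patch is an In-Service patch that must be installed on every host, and the target is a standard 2+2+2 StarlingX environment.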

  1. Upload the patch file
    controller-0:~$ sudo sw-patch upload 001.patch
    001 is now available
  # Check the patch state
  controller-0:~$ sudo sw-patch query
    Patch ID  RR  Release  Patch State
    ========  ==  =======  ===========
    001       N    19.09    Available
  # Check the patch status of all hosts
  controller-0:/$ sudo sw-patch query-hosts
    Hostname      IP Address      Patch Current  Reboot Required  Release State
    ============  ==============  =============  ===============  ======  =====
    compute-0     192.178.204.7        Yes             No          19.09   idle
    compute-1     192.178.204.9        Yes             No          19.09   idle
    controller-0  192.178.204.3        Yes             No          19.09   idle
    controller-1  192.178.204.4        Yes             No          19.09   idle
    storage-0     192.178.204.12       Yes             No          19.09   idle
    storage-1     192.178.204.11       Yes             No          19.09   idle
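  # Patch Current shows whether a host is up to date: Yes means no patch is waiting to be installed on it, No means at least one patch still has to be installed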
  2. Once the patch is in the Available state, trigger its installation
    controller-0:/$ sudo sw-patch apply 001
    001 is now in the repo
  # Check the patch state
  controller-0:~$ sudo sw-patch query
    Patch ID  RR  Release   Patch State
    ========  ==  =======  =============
    001       N    19.09   Partial-Apply
  # Check the node status
  controller-0:~$ sudo sw-patch query-hosts
    Hostname      IP Address      Patch Current  Reboot Required  Release State
    ============  ==============  =============  ===============  ======  =====
    compute-0     192.178.204.7        No              No          19.09   idle
    compute-1     192.178.204.9        No              No          19.09   idle
    controller-0  192.178.204.3        No              No          19.09   idle
    controller-1  192.178.204.4        No              No          19.09   idle
    storage-0     192.178.204.12       No              No          19.09   idle
    storage-1     192.178.204.11       No              No          19.09   idle
  3. Install the patch on each node. Because this is an in-service patch, no lock operation is needed.
    controller-0:~$ sudo sw-patch host-install controller-0
    ...
    Installation was successful.
  # Check the patch status of the hosts
  controller-0:~$ sudo sw-patch query-hosts
    Hostname      IP Address    Patch Current    Reboot Required  Release State
    ============  ==============  =============  ===============  ======  =====
    compute-0     192.178.204.7        No              No          19.09   idle
    compute-1     192.178.204.9        No              No          19.09   idle
    controller-0  192.178.204.3        Yes             No          19.09   idle
    controller-1  192.178.204.4        No              No          19.09   idle
    storage-0     192.178.204.12       No              No          19.09   idle
    storage-1     192.178.204.11       No              No          19.09   idle
  
  # To install the patch on all nodes, run the command once for each node
  controller-0:~$ sudo sw-patch host-install controller-1
    ....
    Installation was successful.
    controller-0:~$ sudo sw-patch host-install compute-0
    ....
    Installation was successful.
    controller-0:~$ sudo sw-patch host-install compute-1
    ....
    Installation was successful.
    controller-0:~$ sudo sw-patch host-install storage-0
    ...
    Installation was successful.
    controller-0:~$ sudo sw-patch host-install storage-1
    ...
    Installation was successful.
  4. After the patch is installed on all nodes, the status looks like this
    controller-0:~$ sudo sw-patch query
    Patch ID  RR  Release  Patch State
    ========  ==  =======  ===========
    001       N    19.09     Applied

    controller-0:~$ sudo sw-patch query-hosts
    Hostname      IP Address      Patch Current Reboot Required  Release  State
    ============  ==============  ============  ===============  =======  =====
    compute-0     192.178.204.7        Yes             No          19.09   idle
    compute-1     192.178.204.9        Yes             No          19.09   idle
    controller-0  192.178.204.3        Yes             No          19.09   idle
    controller-1  192.178.204.4        Yes             No          19.09   idle
    storage-0     192.178.204.12       Yes             No          19.09   idle
    storage-1     192.178.204.11       Yes             No          19.09   idle
  # The patch upgrade is now complete
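During the walkthrough above, sw-patch query-hosts is the way to see which hosts still need host-install. The sketch below parses text in that format and returns the hosts whose Patch Current column is No. The function name pending_hosts is an illustrative assumption, and the parser relies on each data row having exactly six whitespace-separated fields, as in the transcripts above.

```python
def pending_hosts(query_hosts_output):
    """Return hostnames whose 'Patch Current' column is 'No'."""
    pending = []
    for line in query_hosts_output.splitlines():
        fields = line.split()
        # Data rows have six fields with a Yes/No flag in the third one;
        # the header, the ==== separator, and shell prompts do not match.
        if len(fields) == 6 and fields[2] in ("Yes", "No"):
            if fields[2] == "No":
                pending.append(fields[0])
    return pending


# Copy of the output shown after installing on controller-0 only.
SAMPLE = """\
Hostname      IP Address    Patch Current    Reboot Required  Release State
============  ==============  =============  ===============  ======  =====
compute-0     192.178.204.7        No              No          19.09   idle
compute-1     192.178.204.9        No              No          19.09   idle
controller-0  192.178.204.3        Yes             No          19.09   idle
controller-1  192.178.204.4        No              No          19.09   idle
storage-0     192.178.204.12       No              No          19.09   idle
storage-1     192.178.204.11       No              No          19.09   idle
"""

if __name__ == "__main__":
    for host in pending_hosts(SAMPLE):
        print("still needs: sw-patch host-install", host)
```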

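Besides applying patches, StarlingX also supports rolling back and deleting them, using the two commands sw-patch remove and sw-patch host-install. The procedure is similar to installation.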
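As a rough illustration of the rollback path, the sketch below only generates the command sequence; it does not run anything. It assumes that removal mirrors installation: one sw-patch remove followed by sw-patch host-install on each host. For reboot-required patches it wraps each host in a lock and unlock step. The system host-lock and system host-unlock commands are an assumption about how that lock step is performed, and host ordering and controller switchover are deliberately left out.

```python
def removal_plan(patch_id, hosts, reboot_required=False):
    """Return the list of commands (as argument lists) for removing a patch."""
    commands = [["sw-patch", "remove", patch_id]]
    for host in hosts:
        if reboot_required:
            commands.append(["system", "host-lock", host])    # assumed lock command
        commands.append(["sw-patch", "host-install", host])
        if reboot_required:
            commands.append(["system", "host-unlock", host])  # assumed unlock command
    return commands


if __name__ == "__main__":
    for command in removal_plan("001", ["controller-1", "compute-0"]):
        print(" ".join(command))
```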

 

Patch Orchestration

 

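The example above shows how a patch is rolled out across a cluster. In a large cluster, however, the whole process takes a long time. For reboot-required patches in particular, this approach scales poorly: it is inefficient and creates a lot of work for the administrator. StarlingX therefore provides a more advanced feature, "patch orchestration", which upgrades the cluster through a few simple operations and greatly reduces both the administrator's workload and the chance of mistakes. It can be used in three ways: the client CLI, the Horizon interface, and the VIM RESTful API.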

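  1. Client CLI. StarlingX provides the sw-manager client tool for patch orchestration. As shown below, the whole cluster is upgraded by creating and applying a patch strategy.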
    controller-0:~$ sw-manager patch-strategy -h
    usage: sw-manager patch-strategy [-h]  ...

    optional arguments:
      -h, --help  show this help message and exit

    Software Patch Commands:

        create    Create a strategy
        delete    Delete a strategy
        apply     Apply a strategy
        abort     Abort a strategy
        show      Show a strategy

    controller-0:~$ sw-manager patch-strategy create -h
    usage: sw-manager patch-strategy create [-h]
                [--controller-apply-type {serial,ignore}]
                [--storage-apply-type {serial,parallel,ignore}]
                [--worker-apply-type {serial,parallel,ignore}]
                [--max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10,
                11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,
                28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,
                45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,
                62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,
                79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,
                96,97,98,99,100}]
                [--instance-action {migrate,stop-start}]
                [--alarm-restrictions {strict,relaxed}]

    optional arguments:
      -h, --help            show this help message and exit
      --controller-apply-type {serial,ignore}
                            defaults to serial
      --storage-apply-type {serial,parallel,ignore}
                            defaults to serial
      --worker-apply-type {serial,parallel,ignore}
                            defaults to serial
      --max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,
            17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,
            37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,
            57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,
            77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,
            97,98,99,100}
                            maximum worker hosts to patch in parallel
      --instance-action {migrate,stop-start}
                            defaults to stop-start
      --alarm-restrictions {strict,relaxed}
                            defaults to strict
  2. Horizon interface. Open the Admin -> Platform -> Software Management -> Patch Orchestration tab.
  3. VIM API (port 4545).
+--------+---------------------------------------+----------------------------+
| Method | URI                                   | Description                |
+========+=======================================+============================+
| Post   | /api/orchestration/sw-update/strategy | Create a patch strategy    |
+--------+---------------------------------------+----------------------------+
| Delete | /api/orchestration/sw-update/strategy | Delete current patch       |
|        |                                       | strategy                   |
+--------+---------------------------------------+----------------------------+
| Get    | /api/orchestration/sw-update/strategy | Get detailed information of|
|        |                                       | current patch strategy     |
+--------+---------------------------------------+----------------------------+
| Post   | /api/orchestration/sw-update/strategy/| Apply or abort a patch     |
|        | actions                               | strategy                   |
+--------+---------------------------------------+----------------------------+
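For the CLI route in the list above, the option values accepted by sw-manager patch-strategy create can be validated before the command is run. The sketch below encodes the choices printed in the help output (including the 2 to 100 range for --max-parallel-worker-hosts) and returns an argument list. The function name and the keyword-argument style are illustrative assumptions.

```python
# Allowed values, copied from the `sw-manager patch-strategy create -h` output.
ALLOWED = {
    "controller-apply-type": {"serial", "ignore"},
    "storage-apply-type": {"serial", "parallel", "ignore"},
    "worker-apply-type": {"serial", "parallel", "ignore"},
    "instance-action": {"migrate", "stop-start"},
    "alarm-restrictions": {"strict", "relaxed"},
}


def create_strategy_command(max_parallel_worker_hosts=None, **options):
    """Build a validated argument list for `sw-manager patch-strategy create`."""
    argv = ["sw-manager", "patch-strategy", "create"]
    for key, value in options.items():
        flag = key.replace("_", "-")  # worker_apply_type -> worker-apply-type
        if flag not in ALLOWED:
            raise ValueError(f"unknown option --{flag}")
        if value not in ALLOWED[flag]:
            raise ValueError(f"--{flag} must be one of {sorted(ALLOWED[flag])}")
        argv += [f"--{flag}", value]
    if max_parallel_worker_hosts is not None:
        if not 2 <= max_parallel_worker_hosts <= 100:
            raise ValueError("--max-parallel-worker-hosts must be between 2 and 100")
        argv += ["--max-parallel-worker-hosts", str(max_parallel_worker_hosts)]
    return argv


if __name__ == "__main__":
    print(" ".join(create_strategy_command(
        worker_apply_type="parallel", max_parallel_worker_hosts=4)))
```

After the strategy is created, sw-manager patch-strategy apply starts it and sw-manager patch-strategy show reports its progress, as listed in the help output.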

Patch orchestration requires the cluster to be in a healthy state when patches are installed:

  • All hosts must be in the unlocked-enabled-available state
  • The system has no alarms
  • There is enough spare capacity for VM migration
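The three preconditions above can be expressed as a simple pre-flight check. In the sketch below, the host dictionaries with hostname, administrative, operational, and availability keys are a hypothetical data shape, and the alarm count and the capacity flag are supplied by the caller; how they are gathered from a live system is outside this sketch.

```python
def unmet_preconditions(hosts, active_alarms, migration_capacity_ok):
    """Return a list of human-readable reasons why orchestration should not start."""
    problems = []
    for host in hosts:
        state = (host["administrative"], host["operational"], host["availability"])
        if state != ("unlocked", "enabled", "available"):
            problems.append(f"{host['hostname']} is {'-'.join(state)}")
    if active_alarms:
        problems.append(f"{active_alarms} active alarm(s)")
    if not migration_capacity_ok:
        problems.append("not enough spare capacity for VM migration")
    return problems


if __name__ == "__main__":
    demo_hosts = [  # hypothetical inventory snapshot
        {"hostname": "controller-0", "administrative": "unlocked",
         "operational": "enabled", "availability": "available"},
        {"hostname": "compute-1", "administrative": "locked",
         "operational": "disabled", "availability": "online"},
    ]
    for problem in unmet_preconditions(demo_hosts, active_alarms=0,
                                       migration_capacity_ok=True):
        print("blocked:", problem)
```

An empty result means all three conditions hold for the data that was passed in.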

 

Current Development Status

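  • All source code is open in the StarlingX repositories, including "update" and "nfv"
  • Building and installing both in-service and reboot-required patches has been verified
  • Patch orchestration has not been verified yet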
