Specifying the Startup Order of Boot-Time Services on Linux

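        Today I ran into a problem in production. The product has two services, A.service and B.service, both started by systemd, and one depends on the other: A.service can only use the functionality that B.service provides once B.service is up. The startup relationship between the two therefore has to be configured so that A.service starts only after B.service has fully started. Otherwise A.service cannot use B.service and fails to start.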

        Seeing that the [Unit] section can declare startup dependencies between services, I assumed this would be simple: just add the dependency, right?

[Unit]
Description=A service
After=network.target
After=syslog.target
After=B.service

[Install]
WantedBy=multi-user.target

[Service]
User=usera
Group=usera

Type=forking

PIDFile=/var/a/a.pid

TimeoutSec=0

# Execute pre and post scripts as root
PermissionsStartOnly=true

# Start main service
ExecStart=/var/a/a.sh
# Sets open_files_limit
LimitNOFILE = 5000

Restart=on-failure

RestartPreventExitStatus=1

PrivateTmp=false

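Testing showed that adding only After=B.service was not enough. According to the systemd.unit manual page, ordering and requirement dependencies are independent of each other, so After= is normally combined with Requires=: After= controls the order, while Requires= is what actually pulls B.service in. I therefore modified the unit file as follows: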

[Unit]
Description=A service
After=network.target
After=syslog.target
After=B.service
Requires=B.service

[Install]
WantedBy=multi-user.target

[Service]
User=usera
Group=usera

Type=forking

PIDFile=/var/a/a.pid

TimeoutSec=0

# Execute pre and post scripts as root
PermissionsStartOnly=true

# Start main service
ExecStart=/var/a/a.sh
# Sets open_files_limit
LimitNOFILE = 5000

Restart=on-failure

RestartPreventExitStatus=1

PrivateTmp=false

I rebooted and tested again. Disappointingly, it still did not work. Here is how Wants= and Requires= are explained:

  • If unit1 has Wants=unit2 as a dependency, when unit1 is run, unit2 will be run as well. But whether unit2 starts successfully does not affect unit1 running successfully.
  • When unit1 has Requires=unit2, however, again both units will run, but if unit2 does not succeed, unit1 is also deactivated. This happens regardless of whether the processes of unit1 would otherwise have worked fine.

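        This describes only the dependency relationship and says nothing about startup order. In other words, dependencies and ordering are two separate things in systemd. Requires= or Wants= on its own leaves both services free to start at the same time. Even with After= added, systemd waits only until it considers B.service started, which is not necessarily the moment B.service is ready to serve requests. Declaring a dependency therefore does not by itself guarantee a usable startup order.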
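To make the distinction concrete, here is a minimal annotated sketch of the [Unit] section of A.service. It is illustrative only and is not taken from the product's real unit files; the comments describe what each directive contributes on its own.

```ini
# Fragment of A.service (illustrative sketch)
[Unit]
# Requirement dependency: starting A also queues a start job for B.
# On its own this implies no ordering, so A and B start in parallel.
Requires=B.service

# Ordering dependency: A's start job waits until B's start job has finished.
# On its own this pulls nothing in; it only matters if B is being started anyway.
After=B.service
```

"Finished" here means whatever systemd counts as started for B.service's Type= setting. With Type=simple that is as soon as the main process has been spawned, and with Type=forking it is when the initial process exits. Neither proves that the service is already able to answer requests.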

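So I started guessing at the cause. My guess was that A.service starts quickly while B.service starts slowly. Since nothing guarantees that B.service has completely finished starting before A.service is launched, A.service may fail to start. Based on that guess, I wondered whether A.service could simply sleep for a while first, so I added ExecStartPre=-/bin/sleep 5s to try it out. The modified unit file: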

[Unit]
Description=A service
After=network.target
After=syslog.target
After=B.service
Requires=B.service

[Install]
WantedBy=multi-user.target

[Service]
User=usera
Group=usera

Type=forking

PIDFile=/var/a/a.pid

TimeoutSec=0

# Execute pre and post scripts as root
PermissionsStartOnly=true

# Start main service
ExecStartPre=-/bin/sleep 5s
ExecStart=/var/a/a.sh
# Sets open_files_limit
LimitNOFILE = 5000

Restart=on-failure

RestartPreventExitStatus=1

PrivateTmp=false

I rebooted and tested once more, and this time it worked.

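The experiment shows that enforcing a strict startup relationship between two services takes more than specifying the start order. You also need some way to hold one service back until the other has completely finished starting. In this experiment that was done with sleep, which is of course not very reliable.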
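One mechanism systemd itself offers for "completely finished starting" is Type=notify on the service being waited for. The sketch below rests on assumptions the original does not confirm: that B's daemon can be changed to send READY=1 through sd_notify(3) once it is able to serve, and that it lives at the hypothetical path /usr/bin/b-daemon.

```ini
# B.service (sketch; /usr/bin/b-daemon is a hypothetical path)
[Unit]
Description=B service

[Service]
# With Type=notify, systemd treats B as started only after the daemon
# sends READY=1 via sd_notify(3). Units ordered After=B.service wait for that.
Type=notify
ExecStart=/usr/bin/b-daemon

[Install]
WantedBy=multi-user.target
```

If B.service cannot be modified to send the notification, this approach does not apply, and the waiting has to happen on the A.service side instead.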

Unwilling to believe there was no proper solution, I read the systemd.unit manual page once more and found this passage:

Requires=

Similar to Wants=, but declares a stronger requirement dependency. Dependencies of this type may also be configured by adding a symlink to a .requires/ directory accompanying the unit file.

If this unit gets activated, the units listed will be activated as well. If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. Besides, with or without specifying After=, this unit will be stopped if one of the other units is explicitly stopped.

Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.

Note that this dependency type does not imply that the other unit always has to be in active state when this unit is running. Specifically: failing condition checks (such as ConditionPathExists=, ConditionPathIsSymbolicLink=, …, see below) do not cause the start job of a unit with a Requires= dependency on it to fail. Also, some unit types may deactivate on their own (for example, a service process may decide to exit cleanly, or a device may be unplugged by the user), which is not propagated to units having a Requires= dependency. Use the BindsTo= dependency type together with After= to ensure that a unit may never be in active state without a specific other unit also in active state (see below).

So I tried BindsTo= combined with After=. The results improved somewhat, but startup still failed occasionally, so this was not dependable either.
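The original does not show the unit file used for this attempt. Assuming only the Requires= line was swapped out, the [Unit] section would look roughly like this reconstruction:

```ini
# Fragment of A.service (reconstruction, not the author's actual file)
[Unit]
Description=A service
After=network.target
After=syslog.target
After=B.service
# BindsTo= is stronger than Requires=: combined with After=, A may only be
# active while B is active, and A is stopped if B leaves the active state.
BindsTo=B.service
```

BindsTo= ties A's lifetime to B's active state, but "active" is still defined by B.service's Type=. It does not make systemd wait until B is actually ready, which would be consistent with the occasional failures.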

In the end I had to fall back on the sleep approach, only with a somewhat longer sleep, to keep the two services in the right startup relationship.
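A fixed sleep can be replaced by polling for a sign that B is up. The fragment below is only a sketch under assumptions not found in the original: that B.service creates a Unix socket at the hypothetical path /run/b/b.sock when it is ready, and that timeout(1) from coreutils is installed at /usr/bin/timeout.

```ini
# Fragment of A.service (sketch; /run/b/b.sock is a hypothetical path)
[Service]
# Poll once per second until B's socket exists, giving up after 60 seconds.
# A non-zero exit from timeout(1) makes this start attempt fail.
ExecStartPre=/usr/bin/timeout 60 /bin/sh -c 'until [ -S /run/b/b.sock ]; do sleep 1; done'
ExecStart=/var/a/a.sh
```

Unlike a longer sleep, this waits only as long as B actually needs, and it fails visibly instead of starting A.service against a B.service that never came up. Whatever readiness signal B really exposes (a socket, a port, a status file) would take the place of the socket test.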

References:

https://fedoramagazine.org/systemd-unit-dependencies-and-order/

systemd.service
