StackStorm 30. Source code analysis: the key StackStorm scenario "run action"


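Goal:
Understand how run action works.
Contents:
0 Preparing to debug st2api
1 Analysis of the st2api service
2 Analysis of the key method publish_request
3 Preparing to debug st2actionrunner
4 Analysis of the st2actionrunner service
5 Summary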

0 Preparing to debug st2api
Modify the startup command of the st2api container. The original command is:

      containers:
      - command:
        - bash
        - -c
        - exec /opt/stackstorm/st2/bin/gunicorn st2api.wsgi:application -k eventlet
          -b 0.0.0.0:9101 --workers 1 --threads 1 --graceful-timeout 10 --timeout
          30
Change it to:
      containers:
      - command:
        - sleep
        - 3d
After the st2api pod has started, edit /etc/st2/st2.conf. In the sections
[auth]
host = 127.0.0.1
[api]
host = 127.0.0.1
change the value of host to 0.0.0.0.

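Reason:
A service listening on 127.0.0.1 accepts connections only from the local machine, whereas
0.0.0.0 binds every network interface, so the service can also receive requests sent from other nodes.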
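
The difference can be seen with a plain socket. This is a minimal sketch that does not involve StackStorm. The helper name bound_address is made up for illustration, and port 0 simply asks the OS for any free port:

```python
import socket


def bound_address(host):
    """Bind a TCP socket to `host` on an OS-chosen port and return the bound host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0: let the OS pick a free port
        return sock.getsockname()[0]


# Loopback only: reachable from the same machine (or pod) and nowhere else.
print(bound_address('127.0.0.1'))
# Wildcard address: the socket accepts connections on every local interface.
print(bound_address('0.0.0.0'))
```

A server bound to the first address is invisible to other pods, which is why host is switched to 0.0.0.0 before debugging.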


Then start st2api manually:
sudo /opt/stackstorm/st2/bin/st2api --config-file /etc/st2/st2.conf

Enter any st2 pod, run the st2 login command, and then execute:
st2 run core.local cmd=ls

1 Analysis of the st2api service
Code entry point:
the post() method in /opt/stackstorm/st2/lib/python2.7/site-packages/st2api/controllers/v1/actionexecutions.py
The code is as follows:
class ActionExecutionsController(BaseResourceIsolationControllerMixin,
                                 ActionExecutionsControllerMixin, ResourceController):

    def post(self, liveaction_api, requester_user, context_string=None, show_secrets=False):
        return self._handle_schedule_execution(liveaction_api=liveaction_api,
                                               requester_user=requester_user,
                                               context_string=context_string,
                                               show_secrets=show_secrets)

Analysis:
1.1) Executing the run action command enters
the st2api code above.
Sample input parameters:
(Pdb) p liveaction_api

(Pdb) p liveaction_api.__dict__
{u'action': u'core.local', u'user': None, u'parameters': {u'cmd': u'ls'}}
(Pdb) p requester_user

(Pdb) p requester_user.__dict__
{'_cls': 'UserDB'}
(Pdb) p context_string
None
(Pdb) p show_secrets
None

1.2) Continue into
class ActionExecutionsControllerMixin(BaseRestControllerMixin):

    def _handle_schedule_execution(self, liveaction_api, requester_user, context_string=None,
                                   show_secrets=False):
        """
        :param liveaction: LiveActionAPI object.
        :type liveaction: :class:`LiveActionAPI`
        """

        if not requester_user:
            requester_user = UserDB(cfg.CONF.system_user.user)

        # Assert action ref is valid
        action_ref = liveaction_api.action
        action_db = action_utils.get_action_by_ref(action_ref)

        if not action_db:
            message = 'Action "%s" cannot be found.' % action_ref
            LOG.warning(message)
            abort(http_client.BAD_REQUEST, message)

        # Assert the permissions
        assert_user_has_resource_db_permission(user_db=requester_user, resource_db=action_db,
                                               permission_type=PermissionType.ACTION_EXECUTE)

        # Validate that the authenticated user is admin if user query param is provided
        user = liveaction_api.user or requester_user.name
        assert_user_is_admin_if_user_query_param_is_provided(user_db=requester_user,
                                                             user=user)

        try:
            return self._schedule_execution(liveaction=liveaction_api,
                                            requester_user=requester_user,
                                            user=user,
                                            context_string=context_string,
                                            show_secrets=show_secrets,
                                            pack=action_db.pack)
        except ValueError as e:
            LOG.exception('Unable to execute action.')
            ......
Analysis:
The main logic above is:
look up action_db by action_ref, then call _schedule_execution to obtain the execution result.


1.3) Analysis of the _schedule_execution method
class ActionExecutionsControllerMixin(BaseRestControllerMixin):

    def _schedule_execution(self,
                            liveaction,
                            requester_user,
                            user=None,
                            context_string=None,
                            show_secrets=False,
                            pack=None):
        # Initialize execution context if it does not exist.
        if not hasattr(liveaction, 'context'):
            liveaction.context = dict()

        liveaction.context['user'] = user
        liveaction.context['pack'] = pack
        LOG.debug('User is: %s' % liveaction.context['user'])

        # Retrieve other st2 context from request header.
        if context_string:
            context = try_loads(context_string)
            if not isinstance(context, dict):
                raise ValueError('Unable to convert st2-context from the headers into JSON.')
            liveaction.context.update(context)

        # Include RBAC context (if RBAC is available and enabled)
        if cfg.CONF.rbac.enable:
            user_db = UserDB(name=user)
            role_dbs = rbac_service.get_roles_for_user(user_db=user_db, include_remote=True)
            roles = [role_db.name for role_db in role_dbs]
            liveaction.context['rbac'] = {
                'user': user,
                'roles': roles
            }

        # Schedule the action execution.
        liveaction_db = LiveActionAPI.to_model(liveaction)
        action_db = action_utils.get_action_by_ref(liveaction_db.action)
        runnertype_db = action_utils.get_runnertype_by_name(action_db.runner_type['name'])

        try:
            liveaction_db.parameters = param_utils.render_live_params(
                runnertype_db.runner_parameters, action_db.parameters, liveaction_db.parameters,
                liveaction_db.context)
        except param_exc.ParamException:

            # We still need to create a request, so liveaction_db is assigned an ID
            liveaction_db, actionexecution_db = action_service.create_request(liveaction_db)

            # By this point the execution is already in the DB therefore need to mark it failed.
            _, e, tb = sys.exc_info()
            action_service.update_status(
                liveaction=liveaction_db,
                new_status=action_constants.LIVEACTION_STATUS_FAILED,
                result={'error': str(e), 'traceback': ''.join(traceback.format_tb(tb, 20))})
            # Might be a good idea to return the actual ActionExecution rather than bubble up
            # the exception.
            raise validation_exc.ValueValidationException(str(e))

        # The request should be created after the above call to render_live_params
        # so any templates in live parameters have a chance to render.
        liveaction_db, actionexecution_db = action_service.create_request(liveaction_db)
        liveaction_db = LiveAction.add_or_update(liveaction_db, publish=False)

        _, actionexecution_db = action_service.publish_request(liveaction_db, actionexecution_db)
        mask_secrets = self._get_mask_secrets(requester_user, show_secrets=show_secrets)
        execution_api = ActionExecutionAPI.from_model(actionexecution_db, mask_secrets=mask_secrets)

        return Response(json=execution_api, status=http_client.CREATED)


Analysis:
1) The _schedule_execution method
    1 Takes input parameters such as
        (Pdb) p liveaction_api
       
        (Pdb) p liveaction_api.__dict__
        {u'action': u'core.local', u'user': None, u'parameters': {u'cmd': u'ls'}}
        (Pdb) p requester_user
       
        (Pdb) p requester_user.__dict__
        {'_cls': 'UserDB'}
        (Pdb) p context_string
        None
        (Pdb) p show_secrets
        None
        pack : core
    2 Looks up action_db by action_ref, and looks up runnertype_db by the runner_type name (for example: 'local-shell-cmd')
    3 Renders the parameters of the action instance (for example: liveaction_db.parameters is {u'cmd': u'ls'})
    4 Calls create_request(liveaction), which creates an execution of the action and returns (liveaction, execution). Specifically, it:
        adds or updates the liveaction in the live_action_d_b collection
        creates the execution
    5 Calls publish_request(liveaction, execution). Specifically, it:
        publishes the liveaction to the 'st2.liveaction' exchange with routing_key 'create'
        publishes the liveaction to the 'st2.liveaction.status' exchange with routing_key 'requested'
        publishes the actionexecution to the 'st2.execution' exchange with routing_key 'create'
        returns: liveaction, execution
    6 Returns the execution

The most important step is 5, the call to publish_request(liveaction, execution),
which publishes the liveaction to the 'st2.liveaction.status' exchange with routing_key 'requested'.
See section 2 for details.
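
The three publishes in step 5 can be modelled with a toy in-memory publisher. This is only a sketch of the message flow described above, not the st2common implementation. FakePublisher and publish_request_sketch are hypothetical names, and the real code publishes through RabbitMQ rather than appending to a list:

```python
class FakePublisher(object):
    """Records published messages instead of sending them to a message bus."""

    def __init__(self):
        self.sent = []

    def publish(self, payload, exchange, routing_key):
        self.sent.append((exchange, routing_key, payload))


def publish_request_sketch(publisher, liveaction, execution):
    """Mimic the three publishes listed in step 5 and return both objects."""
    publisher.publish(liveaction, 'st2.liveaction', 'create')
    publisher.publish(liveaction, 'st2.liveaction.status', 'requested')
    publisher.publish(execution, 'st2.execution', 'create')
    return liveaction, execution


publisher = FakePublisher()
liveaction = {'action': 'core.local', 'status': 'requested'}
execution = {'action': {'ref': 'core.local'}}
publish_request_sketch(publisher, liveaction, execution)

for exchange, routing_key, _ in publisher.sent:
    print('%s -> %s' % (exchange, routing_key))
```

Only the second message matters for running the action, because it is the one that st2actionrunner consumes.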

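2 Analysis of the key method publish_request
_, actionexecution_db = action_service.publish_request(liveaction_db, actionexecution_db)
This call announces the liveaction's status, which causes the actionrunner to receive the message and process it.
The actionrunner log shows: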
2020-06-05T08:36:02.974507088Z 2020-06-05 16:36:02,925 AUDIT [-] The status of action execution is changed from requested to scheduled. (liveaction_db={'status': 'scheduled', 'runner_info': {},

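According to the analysis in my earlier article:
https://blog.csdn.net/qingyuanluofeng/java/article/details/105398730

The queue consumed by the ActionExecutionScheduler class is bound as follows
'st2.liveaction.status'--->'requested'--->'st2.actionrunner.req'

The message that the scheduler then publishes uses
exchange 'st2.liveaction.status' and routing_key 'scheduled'

The queues consumed by the ActionExecutionDispatcher class are bound as follows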
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work',
'st2.liveaction.status'--->'canceling'--->'st2.actionrunner.cancel', 
'st2.liveaction.status'--->'pausing'--->'st2.actionrunner.pause',
'st2.liveaction.status'--->'resuming'--->'st2.actionrunner.resume'

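So what ActionExecutionScheduler actually does is change the liveaction's status from requested to scheduled
and publish the liveaction as the payload with exchange 'st2.liveaction.status' and routing_key 'scheduled'.
That message is delivered to the queue consumed by the ActionExecutionDispatcher class:
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work',
and once it is received, it ultimately triggers execution of the action.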
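
The two hops can be traced with a small lookup table. This is a sketch under the assumption that routing is an exact match on (exchange, routing_key). The table is copied from the bindings listed above, while route and schedule are illustrative names and not StackStorm functions:

```python
# (exchange, routing_key) -> queue, copied from the bindings listed above.
BINDINGS = {
    ('st2.liveaction.status', 'requested'): 'st2.actionrunner.req',
    ('st2.liveaction.status', 'scheduled'): 'st2.actionrunner.work',
    ('st2.liveaction.status', 'canceling'): 'st2.actionrunner.cancel',
    ('st2.liveaction.status', 'pausing'): 'st2.actionrunner.pause',
    ('st2.liveaction.status', 'resuming'): 'st2.actionrunner.resume',
}


def route(exchange, routing_key):
    """Return the queue that would receive the message, or None if nothing is bound."""
    return BINDINGS.get((exchange, routing_key))


def schedule(liveaction):
    """Toy scheduler: requested -> scheduled, then route under the new status."""
    assert liveaction['status'] == 'requested'
    scheduled = dict(liveaction, status='scheduled')
    return scheduled, route('st2.liveaction.status', scheduled['status'])


# Hop 1: st2api publishes with routing_key 'requested'.
first_queue = route('st2.liveaction.status', 'requested')
# Hop 2: the scheduler republishes with routing_key 'scheduled'.
scheduled, second_queue = schedule({'id': '1', 'status': 'requested'})
print(first_queue)
print(second_queue)
```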


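3 Preparing to debug st2actionrunner
The ActionExecutionScheduler class can be skipped for now. Go straight to
the ActionExecutionDispatcher class, because that is where the action is actually executed.
Next, the st2actionrunner service needs to be debugged.
First modify the startup command of the st2actionrunner container. The original command is: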
      containers:
      - command:
        - bash
        - -c
        - exec /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
Change it to:
      containers:
      - command:
        - sleep
        - 3d
Then, after the pod has started, edit /etc/st2/st2.conf. In the sections
[auth]
host = 127.0.0.1
[api]
host = 127.0.0.1
change the value of host to 0.0.0.0.

Enter the st2actionrunner pod and add a breakpoint in the following file:
cd /opt/stackstorm/st2/lib/python2.7/site-packages/st2actions
vi worker.py

Debugging:
Start the st2actionrunner service manually with:
sudo /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
Enter any other st2 pod and run:
st2 run core.local cmd=ls

4 Analysis of the st2actionrunner service
Code entry point:
class ActionExecutionDispatcher(MessageHandler):

    def process(self, liveaction):
        """Dispatches the LiveAction to appropriate action runner.

        LiveAction in statuses other than "scheduled" and "canceling" are ignored. If
        LiveAction is already canceled and result is empty, the LiveAction
        is updated with a generic exception message.

        :param liveaction: Action execution request.
        :type liveaction: ``st2common.models.db.liveaction.LiveActionDB``

        :rtype: ``dict``
        """

        if liveaction.status == action_constants.LIVEACTION_STATUS_CANCELED:
            LOG.info('%s is not executing %s (id=%s) with "%s" status.',
                     self.__class__.__name__, type(liveaction), liveaction.id, liveaction.status)
            if not liveaction.result:
                updated_liveaction = action_utils.update_liveaction_status(
                    status=liveaction.status,
                    result={'message': 'Action execution canceled by user.'},
                    liveaction_id=liveaction.id)
                executions.update_execution(updated_liveaction)
            return

        if liveaction.status not in ACTIONRUNNER_DISPATCHABLE_STATES:
            LOG.info('%s is not dispatching %s (id=%s) with "%s" status.',
                     self.__class__.__name__, type(liveaction), liveaction.id, liveaction.status)
            return

        try:
            liveaction_db = action_utils.get_liveaction_by_id(liveaction.id)
        except StackStormDBObjectNotFoundError:
            LOG.exception('Failed to find liveaction %s in the database.', liveaction.id)
            raise

        if liveaction.status != liveaction_db.status:
            LOG.warning(
                'The status of liveaction %s has changed from %s to %s '
                'while in the queue waiting for processing.',
                liveaction.id,
                liveaction.status,
                liveaction_db.status
            )

        dispatchers = {
            action_constants.LIVEACTION_STATUS_SCHEDULED: self._run_action,
            action_constants.LIVEACTION_STATUS_CANCELING: self._cancel_action,
            action_constants.LIVEACTION_STATUS_PAUSING: self._pause_action,
            action_constants.LIVEACTION_STATUS_RESUMING: self._resume_action
        }

        return dispatchers[liveaction.status](liveaction)

Analysis:
4.1) Step into
class ActionExecutionDispatcher(MessageHandler):

    def _run_action(self, liveaction_db):
        # stamp liveaction with process_info
        runner_info = system_info.get_process_info()

        # Update liveaction status to "running"
        liveaction_db = action_utils.update_liveaction_status(
            status=action_constants.LIVEACTION_STATUS_RUNNING,
            runner_info=runner_info,
            liveaction_id=liveaction_db.id)

        self._running_liveactions.add(liveaction_db.id)

        action_execution_db = executions.update_execution(liveaction_db)

        # Launch action
        extra = {'action_execution_db': action_execution_db, 'liveaction_db': liveaction_db}
        LOG.audit('Launching action execution.', extra=extra)

        # the extra field will not be shown in non-audit logs so temporarily log at info.
        LOG.info('Dispatched {~}action_execution: %s / {~}live_action: %s with "%s" status.',
                 action_execution_db.id, liveaction_db.id, liveaction_db.status)

        extra = {'liveaction_db': liveaction_db}
        try:
            result = self.container.dispatch(liveaction_db)
            LOG.debug('Runner dispatch produced result: %s', result)
            if not result:
                raise ActionRunnerException('Failed to execute action.')
        except:
            _, ex, tb = sys.exc_info()
            extra['error'] = str(ex)
            LOG.info('Action "%s" failed: %s' % (liveaction_db.action, str(ex)), extra=extra)

            liveaction_db = action_utils.update_liveaction_status(
                status=action_constants.LIVEACTION_STATUS_FAILED,
                liveaction_id=liveaction_db.id,
                result={'error': str(ex), 'traceback': ''.join(traceback.format_tb(tb, 20))})
            executions.update_execution(liveaction_db)
            raise
        finally:
            # In the case of worker shutdown, the items are removed from _running_liveactions.
            # As the subprocesses for action executions are terminated, this finally block
            # will be executed. Set remove will result in KeyError if item no longer exists.
            # Use set discard to not raise the KeyError.
            self._running_liveactions.discard(liveaction_db.id)

        return result

Analysis:
4.1.1) Variable analysis
(Pdb) p liveaction_db

(Pdb) p liveaction_db.__dict__
{'_fields_ordered': ('id', 'status', 'start_timestamp', 'end_timestamp', 'action', 'action_is_workflow', 'parameters', 'result', 'context', 'callback', 'runner_info', 'notify')}

(Pdb) p runner_info
{'hostname': 'dozer-st2actionrunner-0', 'pid': 39}
(Pdb) p liveaction_db.id
ObjectId('5edde21d3e2ff7000d50e09d')
db.live_action_d_b.find({'_id': ObjectId('5edde21d3e2ff7000d50e09d')}).pretty();
{
    "_id" : ObjectId("5edde21d3e2ff7000d50e09d"),
    "status" : "scheduled",
    "start_timestamp" : NumberLong("1591599645250893"),
    "action" : "core.local",
    "action_is_workflow" : false,
    "parameters" : {
        "cmd" : "ls"
    },
    "result" : {
        
    },
    "context" : {
        "rbac" : {
            "user" : "admin",
            "roles" : [
                "admin"
            ]
        },
        "user" : "admin",
        "pack" : "core"
    },
    "callback" : {
        
    },
    "runner_info" : {
        
    }
}

4.1.2) Processing flow
Update the liveaction's status to running.
Call RunnerContainer.dispatch(self, liveaction_db) to execute the action.
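
The bookkeeping around that call can be reduced to a few lines. This is a simplified sketch, not the worker code. It keeps statuses in a dict instead of the database, run_action_sketch and its arguments are illustrative names, and the success status is left untouched because in StackStorm it is set by the container:

```python
running = set()   # ids of liveactions currently being executed
statuses = {}     # stand-in for the status stored in the database


def run_action_sketch(liveaction_id, dispatch):
    """Mark the liveaction running, dispatch it, and record a failure if dispatch raises."""
    statuses[liveaction_id] = 'running'
    running.add(liveaction_id)
    try:
        result = dispatch(liveaction_id)
        if not result:
            raise RuntimeError('Failed to execute action.')
    except Exception:
        statuses[liveaction_id] = 'failed'
        raise
    finally:
        # discard() never raises, even if a shutdown handler already removed the id.
        running.discard(liveaction_id)
    return result


print(run_action_sketch('a1', lambda _id: {'stdout': 'ok'}))

try:
    run_action_sketch('a2', lambda _id: None)  # empty result counts as a failure
except RuntimeError as exc:
    print('a2 failed: %s' % exc)
```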
Step into:
st2/st2actions/st2actions/container/base.py

class RunnerContainer(object):

    def dispatch(self, liveaction_db):
        action_db = get_action_by_ref(liveaction_db.action)
        if not action_db:
            raise Exception('Action %s not found in DB.' % (liveaction_db.action))

        liveaction_db.context['pack'] = action_db.pack

        runnertype_db = get_runnertype_by_name(action_db.runner_type['name'])

        extra = {'liveaction_db': liveaction_db, 'runnertype_db': runnertype_db}
        LOG.info('Dispatching Action to a runner', extra=extra)

        # Get runner instance.
        runner = self._get_runner(runnertype_db, action_db, liveaction_db)

        LOG.debug('Runner instance for RunnerType "%s" is: %s', runnertype_db.name, runner)

        # Process the request.
        funcs = {
            action_constants.LIVEACTION_STATUS_REQUESTED: self._do_run,
            action_constants.LIVEACTION_STATUS_SCHEDULED: self._do_run,
            action_constants.LIVEACTION_STATUS_RUNNING: self._do_run,
            action_constants.LIVEACTION_STATUS_CANCELING: self._do_cancel,
            action_constants.LIVEACTION_STATUS_PAUSING: self._do_pause,
            action_constants.LIVEACTION_STATUS_RESUMING: self._do_resume
        }

        if liveaction_db.status not in funcs:
            raise actionrunner.ActionRunnerDispatchError(
                'Action runner is unable to dispatch the liveaction because it is '
                'in an unsupported status of "%s".' % liveaction_db.status
            )

        liveaction_db = funcs[liveaction_db.status](
            runner=runner,
            runnertype_db=runnertype_db,
            action_db=action_db,
            liveaction_db=liveaction_db
        )

        return liveaction_db.result

Analysis:
1) Variable analysis
(Pdb) p liveaction_db

(Pdb) p action_db

(Pdb) p action_db.pack
u'core'
(Pdb) p runnertype_db

(Pdb) p runner

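2) Logic analysis
The _do_run method is called, because _run_action has already set the liveaction's status to running, and running maps to _do_run in funcs.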
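
The status-to-handler table is an ordinary dictionary dispatch. A minimal sketch with plain strings in place of the action_constants values and stub handlers in place of the container methods (do_run, do_cancel and dispatch_sketch are illustrative names):

```python
def do_run(liveaction):
    """Stub handler: pretend the action ran successfully."""
    return dict(liveaction, status='succeeded')


def do_cancel(liveaction):
    """Stub handler: pretend the action was canceled."""
    return dict(liveaction, status='canceled')


# Several statuses may share one handler, as requested/scheduled/running do above.
FUNCS = {
    'requested': do_run,
    'scheduled': do_run,
    'running': do_run,
    'canceling': do_cancel,
}


def dispatch_sketch(liveaction):
    """Look up the handler for the current status and call it."""
    status = liveaction['status']
    if status not in FUNCS:
        raise ValueError('unsupported status "%s"' % status)
    return FUNCS[status](liveaction)


print(dispatch_sketch({'id': '1', 'status': 'running'}))
```

Checking membership before the lookup lets the caller raise a descriptive error instead of a bare KeyError.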

4.1.3) The _do_run method
    def _do_run(self, runner, runnertype_db, action_db, liveaction_db):
        # Create a temporary auth token which will be available
        # for the duration of the action execution.
        runner.auth_token = self._create_auth_token(context=runner.context, action_db=action_db,
                                                    liveaction_db=liveaction_db)

        try:
            # Finalized parameters are resolved and then rendered. This process could
            # fail. Handle the exception and report the error correctly.
            try:
                runner_params, action_params = param_utils.render_final_params(
                    runnertype_db.runner_parameters, action_db.parameters, liveaction_db.parameters,
                    liveaction_db.context)
                runner.runner_parameters = runner_params
            except ParamException as e:
                raise actionrunner.ActionRunnerException(str(e))

            LOG.debug('Performing pre-run for runner: %s', runner.runner_id)
            runner.pre_run()

            # Mask secret parameters in the log context
            resolved_action_params = ResolvedActionParameters(action_db=action_db,
                                                              runner_type_db=runnertype_db,
                                                              runner_parameters=runner_params,
                                                              action_parameters=action_params)
            extra = {'runner': runner, 'parameters': resolved_action_params}
            LOG.debug('Performing run for runner: %s' % (runner.runner_id), extra=extra)
            (status, result, context) = runner.run(action_params)

            try:
                result = json.loads(result)
            except:
                pass

            action_completed = status in action_constants.LIVEACTION_COMPLETED_STATES
            if isinstance(runner, AsyncActionRunner) and not action_completed:
                self._setup_async_query(liveaction_db.id, runnertype_db, context)
        except:
            LOG.exception('Failed to run action.')
            _, ex, tb = sys.exc_info()
            # mark execution as failed.
            status = action_constants.LIVEACTION_STATUS_FAILED
            # include the error message and traceback to try and provide some hints.
            result = {'error': str(ex), 'traceback': ''.join(traceback.format_tb(tb, 20))}
            context = None
        finally:
            # Log action completion
            extra = {'result': result, 'status': status}
            LOG.debug('Action "%s" completed.' % (action_db.name), extra=extra)

            # Update the final status of liveaction and corresponding action execution.
            liveaction_db = self._update_status(liveaction_db.id, status, result, context)

            # Always clean-up the auth_token
            # This method should be called in the finally block to ensure post_run is not impacted.
            self._clean_up_auth_token(runner=runner, status=status)

        LOG.debug('Performing post_run for runner: %s', runner.runner_id)
        runner.post_run(status=status, result=result)

        LOG.debug('Runner do_run result', extra={'result': liveaction_db.result})
        LOG.audit('Liveaction completed', extra={'liveaction_db': liveaction_db})

        return liveaction_db

Analysis:
1) Variable analysis
(Pdb) p liveaction_db

2) Logic analysis
The key step is the call
(status, result, context) = runner.run(action_params)
which actually executes the action.
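
The error handling wrapped around that call follows a common shape: any exception becomes a failed status with the error text and traceback as the result, and the outcome is reported no matter what happened. A stripped-down sketch, where do_run_sketch, run and update_status are illustrative names and the auth token and async handling are left out:

```python
import sys
import traceback


def do_run_sketch(run, update_status):
    """Call run(), convert an exception into a failed result, always report the outcome."""
    status, result = 'failed', {}
    try:
        status, result = run()
    except Exception:
        _, ex, tb = sys.exc_info()
        status = 'failed'
        result = {'error': str(ex), 'traceback': ''.join(traceback.format_tb(tb, 20))}
    finally:
        # Runs on success and on failure, like _update_status in the finally block.
        update_status(status, result)
    return status, result


updates = []


def record(status, result):
    updates.append((status, result))


def ok():
    return 'succeeded', {'stdout': 'ls output'}


def boom():
    raise ValueError('bad parameter')


print(do_run_sketch(ok, record)[0])
print(do_run_sketch(boom, record)[0])
```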
Step into:
the run() method in /opt/stackstorm/runners/local_runner/local_runner/local_runner.py


4.1.4) The code is as follows
class LocalShellRunner(ActionRunner, ShellRunnerMixin):

    def run(self, action_parameters):
        env_vars = self._env

        if not self.entry_point:
            script_action = False
            command = self.runner_parameters.get(RUNNER_COMMAND, None)
            action = ShellCommandAction(name=self.action_name,
                                        action_exec_id=str(self.liveaction_id),
                                        command=command,
                                        user=self._user,
                                        env_vars=env_vars,
                                        sudo=self._sudo,
                                        timeout=self._timeout,
                                        sudo_password=self._sudo_password)
        else:
            script_action = True
            script_local_path_abs = self.entry_point
            positional_args, named_args = self._get_script_args(action_parameters)
            named_args = self._transform_named_args(named_args)

            action = ShellScriptAction(name=self.action_name,
                                       action_exec_id=str(self.liveaction_id),
                                       script_local_path_abs=script_local_path_abs,
                                       named_args=named_args,
                                       positional_args=positional_args,
                                       user=self._user,
                                       env_vars=env_vars,
                                       sudo=self._sudo,
                                       timeout=self._timeout,
                                       cwd=self._cwd,
                                       sudo_password=self._sudo_password)

        args = action.get_full_command_string()
        sanitized_args = action.get_sanitized_full_command_string()

        # For consistency with the old Fabric based runner, make sure the file is executable
        if script_action:
            args = 'chmod +x %s ; %s' % (script_local_path_abs, args)
            sanitized_args = 'chmod +x %s ; %s' % (script_local_path_abs, sanitized_args)

        env = os.environ.copy()

        # Include user provided env vars (if any)
        env.update(env_vars)

        # Include common st2 env vars
        st2_env_vars = self._get_common_action_env_variables()
        env.update(st2_env_vars)

        LOG.info('Executing action via LocalRunner: %s', self.runner_id)
        LOG.info('[Action info] name: %s, Id: %s, command: %s, user: %s, sudo: %s' %
                 (action.name, action.action_exec_id, sanitized_args, action.user, action.sudo))

        stdout = StringIO()
        stderr = StringIO()

        store_execution_stdout_line = functools.partial(store_execution_output_data,
                                                        output_type='stdout')
        store_execution_stderr_line = functools.partial(store_execution_output_data,
                                                        output_type='stderr')

        read_and_store_stdout = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stdout_line)
        read_and_store_stderr = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stderr_line)

        # If sudo password is provided, pass it to the subprocess via stdin>
        # Note: We don't need to explicitly escape the argument because we pass command as a list
        # to subprocess.Popen and all the arguments are escaped by the function.
        if self._sudo_password:
            LOG.debug('Supplying sudo password via stdin')
            echo_process = subprocess.Popen(['echo', self._sudo_password + '\n'],
                                            stdout=subprocess.PIPE)
            stdin = echo_process.stdout
        else:
            stdin = None

        # Make sure os.setsid is called on each spawned process so that all processes
        # are in the same group.

        # Process is started as sudo -u {{system_user}} -- bash -c {{command}}. Introduction of the
        # bash means that multiple independent processes are spawned without them being
        # children of the process we have access to and this requires use of pkill.
        # Ideally os.killpg should have done the trick but for some reason that failed.
        # Note: pkill will set the returncode to 143 so we don't need to explicitly set
        # it to some non-zero value.
        exit_code, stdout, stderr, timed_out = shell.run_command(cmd=args,
                                                                 stdin=stdin,
                                                                 stdout=subprocess.PIPE,
                                                                 stderr=subprocess.PIPE,
                                                                 shell=True,
                                                                 cwd=self._cwd,
                                                                 env=env,
                                                                 timeout=self._timeout,
                                                                 preexec_func=os.setsid,
                                                                 kill_func=kill_process,
                                                           read_stdout_func=read_and_store_stdout,
                                                           read_stderr_func=read_and_store_stderr,
                                                           read_stdout_buffer=stdout,
                                                           read_stderr_buffer=stderr)

        error = None

        if timed_out:
            error = 'Action failed to complete in %s seconds' % (self._timeout)
            exit_code = -1 * exit_code_constants.SIGKILL_EXIT_CODE

        # Detect if user provided an invalid sudo password or sudo is not configured for that user
        if self._sudo_password:
            if re.search('sudo: \d+ incorrect password attempts', stderr):
                match = re.search('\[sudo\] password for (.+?)\:', stderr)

                if match:
                    username = match.groups()[0]
                else:
                    username = 'unknown'

                error = ('Invalid sudo password provided or sudo is not configured for this user '
                        '(%s)' % (username))
                exit_code = -1

        succeeded = (exit_code == exit_code_constants.SUCCESS_EXIT_CODE)

        result = {
            'failed': not succeeded,
            'succeeded': succeeded,
            'return_code': exit_code,
            'stdout': strip_shell_chars(stdout),
            'stderr': strip_shell_chars(stderr)
        }

        if error:
            result['error'] = error

        status = PROC_EXIT_CODE_TO_LIVEACTION_STATUS_MAP.get(
            str(exit_code),
            action_constants.LIVEACTION_STATUS_FAILED
        )

        return (status, jsonify.json_loads(result, LocalShellRunner.KEYS_TO_TRANSFORM), None)

Analysis:
1) Variable analysis
(Pdb) p action_parameters
{}
(Pdb) p self.entry_point
None
(Pdb) p command
u'ls'
(Pdb) p self.action_name
u'local'
(Pdb) p self.liveaction_id
'5edde21d3e2ff7000d50e09d'
(Pdb) p self._timeout
60
(Pdb) p self._sudo_password
None
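2) Logic analysis
Whether entry_point is set determines whether a command is executed directly or a shell script is executed.
A ShellCommandAction object or a ShellScriptAction object is initialized accordingly.
The full command to execute is then obtained, for example:
u'sudo -E -H -u [email protected] -- bash -c ls'
The environment variables are updated and the following function is constructed:
read_and_store_stdout = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stdout_line)
Finally, shell.run_command is called with read_and_store_stdout passed in to process the output.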
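
The combination of functools.partial and a function factory can be illustrated without StackStorm. In this sketch store_output is a hypothetical stand-in for store_execution_output_data that appends to a list instead of writing to the database, and make_read_and_store is a simplified guess at the shape of make_read_and_store_stream_func, not its actual implementation:

```python
import functools
import io


def store_output(line, output_type, sink):
    """Hypothetical stand-in for store_execution_output_data: keep lines in a list."""
    sink.append((output_type, line))


def make_read_and_store(store_data_func):
    """Return a reader that copies a stream into a buffer and stores every line."""
    def read_and_store(stream, buff):
        # readline() returns '' at end of stream, which stops the iteration.
        for line in iter(stream.readline, ''):
            buff.write(line)
            store_data_func(line)
    return read_and_store


stored = []
# partial() fixes output_type up front, so the reader only has to pass the line.
store_stdout_line = functools.partial(store_output, output_type='stdout', sink=stored)
read_and_store_stdout = make_read_and_store(store_stdout_line)

stdout_buffer = io.StringIO()
read_and_store_stdout(io.StringIO(u'bin\netc\n'), stdout_buffer)
print(stdout_buffer.getvalue())
print(stored)
```

The reader has the signature (stream, buffer), which matches how run_command later calls read_stdout_func with process.stdout and read_stdout_buffer.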

The focus now is the shell.run_command method.
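
Before reading the real function, it may help to see the same idea with only the standard library. This sketch assumes Python 3 (for the timeout argument of communicate) and leaves out eventlet, the reader functions and kill_func. run_command_sketch is an illustrative name, and the result dictionary only imitates the keys built by LocalShellRunner.run:

```python
import subprocess
import sys


def run_command_sketch(cmd, timeout=60):
    """Run cmd, wait up to timeout seconds, return (exit_code, stdout, stderr, timed_out)."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               universal_newlines=True)
    timed_out = False
    try:
        stdout, stderr = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child, then collect whatever output it produced before dying.
        process.kill()
        stdout, stderr = process.communicate()
        timed_out = True
    return process.returncode, stdout, stderr, timed_out


exit_code, stdout, stderr, timed_out = run_command_sketch(
    [sys.executable, '-c', 'print("hello")'], timeout=10)

result = {
    'succeeded': exit_code == 0,
    'failed': exit_code != 0,
    'return_code': exit_code,
    'stdout': stdout.strip(),
    'stderr': stderr.strip(),
}
print(result)
```

The StackStorm version differs mainly in that it waits inside a separate green thread and streams output line by line while the process is still running.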

4.1.5) Code
The run_command() method in st2/st2common/util/green/shell.py is as follows

def run_command(cmd, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False,
                cwd=None, env=None, timeout=60, preexec_func=None, kill_func=None,
                read_stdout_func=None, read_stderr_func=None,
                read_stdout_buffer=None, read_stderr_buffer=None):
    """
    Run the provided command in a subprocess and wait until it completes.

    :param cmd: Command to run.
    :type cmd: ``str`` or ``list``

    :param stdin: Process stdin.
    :type stdin: ``object``

    :param stdout: Process stdout.
    :type stdout: ``object``

    :param stderr: Process stderr.
    :type stderr: ``object``

    :param shell: True to use a shell.
    :type shell ``boolean``

    :param cwd: Optional working directory.
    :type cwd: ``str``

    :param env: Optional environment to use with the command. If not provided,
                environment from the current process is inherited.
    :type env: ``dict``

    :param timeout: How long to wait before timing out.
    :type timeout: ``float``

    :param preexec_func: Optional pre-exec function.
    :type preexec_func: ``callable``

    :param kill_func: Optional function which will be called on timeout to kill the process.
                      If not provided, it defaults to `process.kill`
    :type kill_func: ``callable``

    :param read_stdout_func: Function which is responsible for reading process stdout when
                                 using live read mode.
    :type read_stdout_func: ``func``

    :param read_stdout_func: Function which is responsible for reading process stderr when
                                 using live read mode.
    :type read_stdout_func: ``func``


    :rtype: ``tuple`` (exit_code, stdout, stderr, timed_out)
    """
    LOG.debug('Entering st2common.util.green.run_command.')

    assert isinstance(cmd, (list, tuple) + six.string_types)

    if (read_stdout_func and not read_stderr_func) or (read_stderr_func and not read_stdout_func):
        raise ValueError('Both read_stdout_func and read_stderr_func arguments need '
                         'to be provided.')

    if read_stdout_func and not (read_stdout_buffer or read_stderr_buffer):
        raise ValueError('read_stdout_buffer and read_stderr_buffer arguments need to be provided '
                         'when read_stdout_func is provided')

    if not env:
        LOG.debug('env argument not provided. using process env (os.environ).')
        env = os.environ.copy()

    # Note: We are using eventlet friendly implementation of subprocess
    # which uses GreenPipe so it doesn't block
    LOG.debug('Creating subprocess.')
    process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)

    if read_stdout_func:
        LOG.debug('Spawning read_stdout_func function')
        read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)

    if read_stderr_func:
        LOG.debug('Spawning read_stderr_func function')
        read_stderr_thread = eventlet.spawn(read_stderr_func, process.stderr, read_stderr_buffer)

    def on_timeout_expired(timeout):
        global timed_out

        try:
            LOG.debug('Starting process wait inside timeout handler.')
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Command has timed out, kill the process and propagate the error.
            # Note: We explicitly set the returncode to indicate the timeout.
            LOG.debug('Command execution timeout reached.')
            process.returncode = TIMEOUT_EXIT_CODE

            if kill_func:
                LOG.debug('Calling kill_func.')
                kill_func(process=process)
            else:
                LOG.debug('Killing process.')
                process.kill()

            if read_stdout_func and read_stderr_func:
                LOG.debug('Killing read_stdout_thread and read_stderr_thread')
                read_stdout_thread.kill()
                read_stderr_thread.kill()

    LOG.debug('Spawning timeout handler thread.')
    timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
    LOG.debug('Attaching to process.')

    if read_stdout_func and read_stderr_func:
        LOG.debug('Using real-time stdout and stderr read mode, calling process.wait()')
        process.wait()
    else:
        LOG.debug('Using delayed stdout and stderr read mode, calling process.communicate()')
        stdout, stderr = process.communicate()

    timeout_thread.cancel()
    exit_code = process.returncode

    if read_stdout_func and read_stderr_func:
        # Wait on those green threads to finish reading from stdout and stderr before continuing
        read_stdout_thread.wait()
        read_stderr_thread.wait()

        stdout = read_stdout_buffer.getvalue()
        stderr = read_stderr_buffer.getvalue()

    if exit_code == TIMEOUT_EXIT_CODE:
        LOG.debug('Timeout.')
        timed_out = True
    else:
        LOG.debug('No timeout.')
        timed_out = False

    LOG.debug('Returning.')
    return (exit_code, stdout, stderr, timed_out)

Analysis:
1) Variable analysis
(Pdb) p cmd
u'sudo -E -H -u [email protected] -- bash -c ls'
(Pdb) p stdin
None
(Pdb) p stdout
-1
(Pdb) p stderr
-1
(Pdb) p shell
True
(Pdb) p cwd
None
(Pdb) p env
{'USERNAME': 'root', 'SUDO_COMMAND': '/opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf', 'TERM': 'xterm', 'SHELL': '/bin/bash', 'ST2_ACTION_PACK_NAME': u'core', 'ST2_ACTION_EXECUTION_ID': '5edde21d3e2ff7000d50e09e', 'ST2_ACTION_AUTH_TOKEN': u'd8beda985f27486ea54d9d3471415731', 'HOSTNAME': 'dozer-st2actionrunner-0', 'SUDO_UID': '0', 'SUDO_GID': '0', 'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:', 'LOGNAME': 'root', 'USER': 'root', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'ST2_ACTION_API_URL': 'http://st2api:9101/v1', 'MAIL': '/var/mail/root', 
'SUDO_USER': 'root', 'PS1': '\\[\x1b[1m\\]()\\[\x1b(B\x1b[m\\][\\u@\\h \\W]\\$ ', 'HOME': '/root', 'LC_ALL': 'en_US.utf8'}
(Pdb) p timeout
60
(Pdb) p preexec_func

(Pdb) p kill_func

(Pdb) p read_stdout_func

(Pdb) p read_stderr_func

(Pdb) p read_stdout_buffer

(Pdb) p read_stderr_buffer

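2) Logic analysis
When a shell-type action is executed, the command is first started with subprocess.Popen:
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
If read_stdout_func is set, a green thread is spawned to consume the stdout of that process (read_stderr_func is handled the same way for stderr):
read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
In addition, to handle timeouts, another green thread is spawned that kills the process once the timeout expires:
timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
Finally, the function waits for the action to finish. In live read mode (both read functions provided) it calls the following; otherwise it calls process.communicate():
process.wait()
It then returns the 4-tuple (exit_code, stdout, stderr, timed_out).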
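
The live read mode described above can be sketched with the standard library alone. This is only an illustration, not the StackStorm implementation: it swaps eventlet green threads for threading.Thread, and the names read_stream and run_live are hypothetical helpers introduced here. It shows the same pattern: one reader per stream copies output into a buffer while the main flow calls process.wait(), then the readers are joined before the buffers are read.

```python
import io
import subprocess
import sys
import threading


def read_stream(stream, buffer):
    # Hypothetical reader function: copy the stream into the buffer
    # line by line until EOF (readline() returns '' at EOF in text mode).
    for line in iter(stream.readline, ''):
        buffer.write(line)


def run_live(cmd):
    # Assumption: plain OS threads stand in for eventlet green threads.
    stdout_buffer, stderr_buffer = io.StringIO(), io.StringIO()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True) as process:
        readers = [
            threading.Thread(target=read_stream, args=(process.stdout, stdout_buffer)),
            threading.Thread(target=read_stream, args=(process.stderr, stderr_buffer)),
        ]
        for reader in readers:
            reader.start()

        # Readers drain the pipes concurrently, so wait() cannot block on a full pipe.
        process.wait()

        # Let both readers finish before the buffers are inspected.
        for reader in readers:
            reader.join()

    return process.returncode, stdout_buffer.getvalue(), stderr_buffer.getvalue()


if __name__ == '__main__':
    print(run_live([sys.executable, '-c', 'print("hello")']))
```

Joining the readers before calling getvalue() mirrors the read_stdout_thread.wait() / read_stderr_thread.wait() calls in the function quoted above.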

3) Sample result:
(Pdb) p exit_code
0
(Pdb) p stdout
'bootstrap\ncmd\nconfig.py\nconfig.pyc\ncontainer\n__init__.py\n__init__.pyc\nnotifier\npolicies\nresultstracker\nrunners\nscheduler.py\nscheduler.pyc\nworker.py\nworker.py_bak\nworker.pyc\n'
(Pdb) p stderr
''
(Pdb) p timeout
60
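
A 4-tuple of the same shape as the sample above can be produced by a much smaller sketch. This is an approximation under stated assumptions, not the StackStorm code: it uses the standard subprocess module instead of the eventlet-friendly one, relies on communicate(timeout=...) rather than a separate timeout green thread, and the value assigned to TIMEOUT_EXIT_CODE as well as the name run_with_timeout are chosen here for illustration.

```python
import subprocess
import sys

# Assumed sentinel value, used here only to mark a timed-out run.
TIMEOUT_EXIT_CODE = -9


def run_with_timeout(cmd, timeout):
    """Run cmd and return (exit_code, stdout, stderr, timed_out)."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               universal_newlines=True)
    try:
        stdout, stderr = process.communicate(timeout=timeout)
        exit_code, timed_out = process.returncode, False
    except subprocess.TimeoutExpired:
        # Timeout reached: kill the process, then collect whatever it wrote.
        process.kill()
        stdout, stderr = process.communicate()
        exit_code, timed_out = TIMEOUT_EXIT_CODE, True
    return exit_code, stdout, stderr, timed_out


if __name__ == '__main__':
    print(run_with_timeout([sys.executable, '-c', 'print("ok")'], timeout=30))
    print(run_with_timeout([sys.executable, '-c', 'import time; time.sleep(30)'],
                           timeout=0.5))
```

As in the quoted function, the timed-out case is reported through a dedicated exit code together with a boolean flag, so callers do not need to guess why the process ended.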

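5 Summary
1) st2 run action is a common StackStorm scenario. In essence:
The command line sends a request to the st2api service
--> after several layers of processing, st2api publishes a liveaction message to the 'st2.liveaction.status' exchange with routing_key 'requested'
--> the message is handled by the ActionExecutionScheduler class in the st2actionrunner service, because the queue it listens on is bound as follows:
'st2.liveaction.status'--->'requested'--->'st2.actionrunner.req'. After handling the message, ActionExecutionScheduler publishes a new message
to the 'st2.liveaction.status' exchange with routing_key 'scheduled'
--> that message is handled by the ActionExecutionDispatcher class in the st2actionrunner service, because the queue it listens on is bound as follows:
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work'
--> ActionExecutionDispatcher calls RunnerContainer.dispatch(self, liveaction_db) to execute the action;
depending on the runner type, the corresponding runner's run method is called to process the action.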
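2) Taking LocalShellRunner as an example, its run method proceeds as follows:
When a shell-type action is executed, the command is first started with subprocess.Popen:
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
If read_stdout_func is set, a green thread is spawned to consume the stdout of that process:
read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
In addition, to handle timeouts, another green thread is spawned that kills the process once the timeout expires:
timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
Finally, in live read mode the function calls the following (otherwise process.communicate()):
process.wait()
to wait for the action to finish, and returns the 4-tuple (exit_code, stdout, stderr, timed_out).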
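
The message flow in item 1) can be traced with a small in-memory model. It is a toy sketch, not how StackStorm talks to its message bus: the Broker class and the simulate function are hypothetical, and only the exchange name, routing keys, queue names and the two handler class names are taken from the summary above.

```python
from collections import defaultdict, deque

EXCHANGE = 'st2.liveaction.status'


class Broker:
    """Toy broker: routes a message to every queue bound to (exchange, routing_key)."""

    def __init__(self):
        self.bindings = defaultdict(list)
        self.queues = defaultdict(deque)

    def bind(self, exchange, routing_key, queue):
        self.bindings[(exchange, routing_key)].append(queue)

    def publish(self, exchange, routing_key, message):
        for queue in self.bindings[(exchange, routing_key)]:
            self.queues[queue].append(message)


def simulate(action_ref):
    broker = Broker()
    # Bindings as listed in the summary.
    broker.bind(EXCHANGE, 'requested', 'st2.actionrunner.req')
    broker.bind(EXCHANGE, 'scheduled', 'st2.actionrunner.work')
    trace = []

    # st2api publishes the liveaction with routing key 'requested'.
    broker.publish(EXCHANGE, 'requested', {'action': action_ref, 'status': 'requested'})

    # ActionExecutionScheduler consumes it and republishes it as 'scheduled'.
    message = broker.queues['st2.actionrunner.req'].popleft()
    trace.append(('ActionExecutionScheduler', message['status']))
    broker.publish(EXCHANGE, 'scheduled', dict(message, status='scheduled'))

    # ActionExecutionDispatcher consumes the scheduled message and would run the action.
    message = broker.queues['st2.actionrunner.work'].popleft()
    trace.append(('ActionExecutionDispatcher', message['status']))
    return trace


if __name__ == '__main__':
    for handler, status in simulate('core.local'):
        print(handler, status)
```

The point of the model is that both handlers share one exchange and are separated only by routing key, which is why the scheduler can hand work to the dispatcher simply by republishing with a different key.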

References:
StackStorm 2.6 source code
 
