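When developing Ansible modules we may run into the following problem. Suppose we want to develop a module (or a set of modules) to control Oracle. For SQL-related functionality, cx_Oracle gives us a driver, but features such as rman, listener, crsctl and srvctl still have to be implemented through the shell. Functionally, though, they all belong to the Oracle database tool set. So we set ourselves this requirement (taking listener control as the example): write a new, standalone module (for example ora_lsnr) to implement this functionality.
There are several possible approaches:
This article mainly discusses the second one.
The source of the shell plugin is as follows: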
# Copyright: (c) 2017, Ansible Project
# GNU General Public License v3.0+ (see COPYING or https://www.gnu.org/licenses/gpl-3.0.txt)
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.plugins.action import ActionBase
from ansible.utils.vars import merge_hash


class ActionModule(ActionBase):

    def run(self, tmp=None, task_vars=None):
        del tmp  # tmp no longer has any effect

        # Shell module is implemented via command
        self._task.action = 'command'
        self._task.args['_uses_shell'] = True

        command_action = self._shared_loader_obj.action_loader.get('command',
                                                                   task=self._task,
                                                                   connection=self._connection,
                                                                   play_context=self._play_context,
                                                                   loader=self._loader,
                                                                   templar=self._templar,
                                                                   shared_loader_obj=self._shared_loader_obj)
        result = command_action.run(task_vars=task_vars)

        return result
Paired with an essentially empty shell module, this plugin can run arbitrary commands.
Let us start by writing an even emptier lib/ansible/modules/database/oracle/ora_lsnr.py:
from __future__ import absolute_import, division, print_function
__metaclass__ = type
ANSIBLE_METADATA = {'metadata_version': '1.1',
                    'status': ['stableinterface'],
                    'supported_by': '三苦庵'}
DOCUMENTATION = r'''
'''
EXAMPLES = r'''
'''
RETURN = r'''
'''
Then pair it with a lib/ansible/plugins/action/ora_lsnr.py that is identical to the shell plugin:
# Copyright: (c) 2017, Ansible Project
# GNU General Public License v3.0+ (see COPYING or https://www.gnu.org/licenses/gpl-3.0.txt)
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

from ansible.plugins.action import ActionBase
from ansible.utils.vars import merge_hash


class ActionModule(ActionBase):

    def run(self, tmp=None, task_vars=None):
        del tmp  # tmp no longer has any effect

        # Shell module is implemented via command
        self._task.action = 'command'
        self._task.args['_uses_shell'] = True

        command_action = self._shared_loader_obj.action_loader.get('command',
                                                                   task=self._task,
                                                                   connection=self._connection,
                                                                   play_context=self._play_context,
                                                                   loader=self._loader,
                                                                   templar=self._templar,
                                                                   shared_loader_obj=self._shared_loader_obj)
        result = command_action.run(task_vars=task_vars)

        return result
Then put
dict(action=dict(module='ora_lsnr', args='lsnrctl start'))
into a task and run it. The result is a failure:
{'msg': 'no command given', 'rc': 256, 'invocation': {'module_args': {'creates': None, 'executable': None, '_uses_shell': True, '_raw_params': None, 'removes': None, 'argv': None, 'warn': True, 'chdir': None, 'stdin': None}}, '_ansible_parsed': True, '_ansible_no_log': False, 'changed': False}
Why does it say 'no command given'? The value of args is clearly 'lsnrctl start'!
To check, add the following line to the ora_lsnr plugin:
print(self._task.args)
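Result:
{'_uses_shell': True}
Apart from the _uses_shell flag that the plugin itself has just set, args is empty: the command has disappeared.
The code that parses the arguments is in lib/ansible/playbook/task.py (lines 183-185):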
args_parser = ModuleArgsParser(task_ds=ds)
try:
    (action, args, delegate_to) = args_parser.parse()
Next, look at the code of ModuleArgsParser.parse() (lib/ansible/parsing/mod_args.py):
def parse(self):
    '''
    Given a task in one of the supported forms, parses and returns
    returns the action, arguments, and delegate_to values for the
    task, dealing with all sorts of levels of fuzziness.
    '''

    thing = None

    action = None
    delegate_to = self._task_ds.get('delegate_to', None)
    args = dict()

    # This is the standard YAML form for command-type modules. We grab
    # the args and pass them in as additional arguments, which can/will
    # be overwritten via dict updates from the other arg sources below
    additional_args = self._task_ds.get('args', dict())

    # We can have one of action, local_action, or module specified
    # action
    if 'action' in self._task_ds:
        # an old school 'action' statement
        thing = self._task_ds['action']
        action, args = self._normalize_parameters(thing, action=action, additional_args=additional_args)

    # local_action
    if 'local_action' in self._task_ds:
        # local_action is similar but also implies a delegate_to
        if action is not None:
            raise AnsibleParserError("action and local_action are mutually exclusive", obj=self._task_ds)
        thing = self._task_ds.get('local_action', '')
        delegate_to = 'localhost'
        action, args = self._normalize_parameters(thing, action=action, additional_args=additional_args)

    # module: is the more new-style invocation

    # walk the input dictionary to see we recognize a module name
    for (item, value) in iteritems(self._task_ds):
        if item in BUILTIN_TASKS or item in action_loader or item in module_loader:
            # finding more than one module name is a problem
            if action is not None:
                raise AnsibleParserError("conflicting action statements: %s, %s" % (action, item), obj=self._task_ds)
            action = item
            thing = value
            action, args = self._normalize_parameters(thing, action=action, additional_args=additional_args)

    # if we didn't see any module in the task at all, it's not a task really
    if action is None:
        if 'ping' not in module_loader:
            raise AnsibleParserError("The requested action was not found in configured module paths. "
                                     "Additionally, core modules are missing. If this is a checkout, "
                                     "run 'git pull --rebase' to correct this problem.",
                                     obj=self._task_ds)

        else:
            raise AnsibleParserError("no action detected in task. This often indicates a misspelled module name, or incorrect module path.",
                                     obj=self._task_ds)
    elif args.get('_raw_params', '') != '' and action not in RAW_PARAM_MODULES:
        templar = Templar(loader=None)
        raw_params = args.pop('_raw_params')
        if templar._contains_vars(raw_params):
            args['_variable_params'] = raw_params
        else:
            raise AnsibleParserError("this task '%s' has extra params, which is only allowed in the following modules: %s" % (action, ", ".join(RAW_PARAM_MODULES)),
                                     obj=self._task_ds)

    return (action, args, delegate_to)
Everything up to that point is fine. The key is the final elif branch, which refers to something called RAW_PARAM_MODULES (also defined in mod_args.py):
FREEFORM_ACTIONS = frozenset((
    'command',
    'win_command',
    'shell',
    'win_shell',
    'script',
    'raw'
))

RAW_PARAM_MODULES = FREEFORM_ACTIONS.union((
    'include',
    'include_vars',
    'include_tasks',
    'include_role',
    'import_tasks',
    'import_role',
    'add_host',
    'group_by',
    'set_fact',
    'meta',
))
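If action (the module name) is not in the RAW_PARAM_MODULES set, _raw_params is popped out of args: when it contains template variables it is stashed in _variable_params, otherwise the parser raises an error. Either way, the plugin never sees _raw_params.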
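The effect of that last branch can be reproduced in isolation. The following is a minimal sketch, not Ansible's code: `filter_raw_params` and `ParserError` are hypothetical names, the module set is shortened, and the check for template variables is approximated by looking for `{{` instead of calling Templar.

```python
# Simplified stand-in for the final elif branch of ModuleArgsParser.parse().
# Assumptions: shortened module set, and "contains variables" is
# approximated by a plain search for "{{".

RAW_PARAM_MODULES = frozenset(("command", "shell", "script", "raw"))


class ParserError(Exception):
    """Hypothetical stand-in for AnsibleParserError."""


def filter_raw_params(action, args, allowed=RAW_PARAM_MODULES):
    """Return a copy of args after applying the raw-params rule."""
    args = dict(args)
    if args.get("_raw_params", "") != "" and action not in allowed:
        raw_params = args.pop("_raw_params")
        if "{{" in raw_params:
            # Free-form text with variables is moved aside, not passed on.
            args["_variable_params"] = raw_params
        else:
            raise ParserError("this task '%s' has extra params" % action)
    return args
```

With this sketch, `filter_raw_params("shell", {"_raw_params": "lsnrctl start"})` leaves the arguments alone, while the same call with `"ora_lsnr"` as the action raises, unless `"ora_lsnr"` is added to the `allowed` set.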
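This parse step happens before the play runs, during Play().load().
So the fix is to modify the RAW_PARAM_MODULES set before load is called.
Following the pattern of the existing definition: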
import ansible.parsing.mod_args as A
A.RAW_PARAM_MODULES = A.RAW_PARAM_MODULES.union(('ora_lsnr',))
Run it again, and the command value now comes through.
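Rebinding the attribute on the module object works because a function looks up module-level names when it is called, not when it is defined. The sketch below illustrates this with a stand-in module built at runtime; `fake_mod_args` and `allows_raw_params` are hypothetical names used only for this demonstration and are not part of Ansible.

```python
import types

# Stand-in for ansible.parsing.mod_args (hypothetical, for illustration only).
fake_mod_args = types.ModuleType("fake_mod_args")
exec(
    "RAW_PARAM_MODULES = frozenset(('command', 'shell'))\n"
    "def allows_raw_params(action):\n"
    "    # The global name is resolved at call time.\n"
    "    return action in RAW_PARAM_MODULES\n",
    fake_mod_args.__dict__,
)

before = fake_mod_args.allows_raw_params("ora_lsnr")

# Same pattern as the patch above: build a new frozenset and rebind the name.
fake_mod_args.RAW_PARAM_MODULES = fake_mod_args.RAW_PARAM_MODULES.union(("ora_lsnr",))

after = fake_mod_args.allows_raw_params("ora_lsnr")
print(before, after)
```

One consequence of this design: code that has already copied the old set into its own namespace (for example through `from module import RAW_PARAM_MODULES`) would keep the old frozenset, so the patch has to be applied to the module attribute itself and before anything parses a task.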
So far, however, we have only copied the shell module. How do we restrict it to listener control? Keep modifying the plugin and reassemble the task's args inside run:
def run(self, tmp=None, task_vars=None):
    del tmp  # tmp no longer has any effect

    # Shell module is implemented via command
    self._task.action = 'command'
    self._task.args['_uses_shell'] = True
    # We know the original _raw_params is start/stop
    self._task.args['_raw_params'] = 'lsnrctl ' + self._task.args['_raw_params']

    command_action = self._shared_loader_obj.action_loader.get('command',
                                                               task=self._task,
                                                               connection=self._connection,
                                                               play_context=self._play_context,
                                                               loader=self._loader,
                                                               templar=self._templar,
                                                               shared_loader_obj=self._shared_loader_obj)
    result = command_action.run(task_vars=task_vars)

    return result
Change the task to this:
dict(action=dict(module='ora_lsnr', args='start'))
If you need to specify the name of the target listener, you can also write it like this:
dict(action=dict(module='ora_lsnr', args='start {{lsnr_name}}'))
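and pass lsnr_name in as a variable.
At this point the goal seems to be within reach. But life is full of surprises: when the result came back, we found that lsnrctl could not be found, because the environment variables were not set.
The article "On environment variables in Ansible remote execution (login shell & non-login shell)" (author: huangwjwork) explains this in some detail and reaches a clear conclusion. Of the solutions it offers, moving the environment variables from .bash_profile to .bashrc is clearly not feasible across thousands of servers of different types and operating systems, and passing them in through the environment field is too cumbersome, so the only path left is to load the target server's environment manually.
Fortunately, $HOME is still available.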
self._task.args['_raw_params'] = '. $HOME/.bash_profile && lsnrctl ' + self._task.args['_raw_params']
This runs successfully.
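If the prefix is needed in more than one plugin, it can be factored into a small helper. This is only a sketch: `with_profile` is a hypothetical name, and unlike the `&&` form above it still runs the command when the profile file does not exist, which may or may not be what you want.

```python
def with_profile(command, profile="$HOME/.bash_profile"):
    """Return a shell line that sources the profile (if present), then runs command.

    $HOME is left unexpanded on purpose so that the remote shell resolves it.
    """
    return 'if [ -f "{0}" ]; then . "{0}"; fi; {1}'.format(profile, command)


print(with_profile("lsnrctl start"))
```

In the plugin, the assignment to `_raw_params` would then become `with_profile('lsnrctl ' + ...)`.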
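What our ora_lsnr module does now is prepend lsnrctl to the args of the shell module. It cannot strictly guarantee that only listener control operations are run through ora_lsnr: the flexibility of raw params inevitably conflicts with the goal of tight control. In addition, without modifying the Ansible source, this module can only be invoked through the API (by modifying the RAW_PARAM_MODULES set before the play source is loaded). If you invoke it directly with the ansible command from a shell, the parser check described above produces this error:
ERROR! this task 'ora_lsnr' has extra params, which is only allowed in the following modules: raw, include_role, include_tasks, include_vars, shell, import_role, group_by, import_tasks, win_shell, script, include, meta, add_host, win_command, command, set_fact
So further improvement is needed.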
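Since the invocation happens in the plugin, the module's arguments can be defined as a dictionary. The plugin only has to assemble a string and pass it to the command module as _raw_params, which neatly sidesteps Ansible's restriction on module args. If the user omits a required argument, pop raises an error, and any argument the plugin does not pop plays no part in the assembled command line: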
dict(action=dict(module='ora_lsnr', args=dict(cmd='start', lsnr_name='my_listener')))
The plugin:
def run(self, tmp=None, task_vars=None):
    del tmp  # tmp no longer has any effect

    # Shell module is implemented via command
    self._task.action = 'command'
    self._task.args['_uses_shell'] = True
    self._task.args['_raw_params'] = ' '.join(['lsnrctl', self._task.args.pop('cmd'), self._task.args.pop('lsnr_name')])

    command_action = self._shared_loader_obj.action_loader.get('command',
                                                               task=self._task,
                                                               connection=self._connection,
                                                               play_context=self._play_context,
                                                               loader=self._loader,
                                                               templar=self._templar,
                                                               shared_loader_obj=self._shared_loader_obj)
    result = command_action.run(task_vars=task_vars)

    return result
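With dictionary arguments, the string assembly can also enforce the "listener operations only" goal. The sketch below is one possible way to do it and is not part of the plugin above: `build_lsnrctl_command` and the `ALLOWED_CMDS` whitelist are assumptions made for illustration, and `shlex.quote` requires Python 3.

```python
import shlex

# Assumed whitelist of lsnrctl subcommands; adjust to your own policy.
ALLOWED_CMDS = frozenset(("start", "stop", "status", "reload"))


def build_lsnrctl_command(task_args):
    """Build the shell line for lsnrctl from a dict of task arguments."""
    args = dict(task_args)  # work on a copy, leave the caller's dict alone
    cmd = args.pop("cmd", None)
    if cmd not in ALLOWED_CMDS:
        raise ValueError("cmd must be one of: %s" % ", ".join(sorted(ALLOWED_CMDS)))
    parts = ["lsnrctl", cmd]
    lsnr_name = args.pop("lsnr_name", "")
    if lsnr_name:
        # Quote the name so shell metacharacters cannot start a second command.
        parts.append(shlex.quote(str(lsnr_name)))
    return " ".join(parts)


print(build_lsnrctl_command({"cmd": "start", "lsnr_name": "my_listener"}))
```

Because the command is run through the shell (`_uses_shell` is true), quoting the listener name matters: without it, a value such as `x; rm -rf /` would be interpreted by the shell as two commands.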
The ansible command can now use the custom module directly as well:
ansible 192.168.1.1 -m ora_lsnr -a 'cmd=start lsnr_name='
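The reason this form passes the parser is that every token in the `-a` string is a key=value pair, so no free-form `_raw_params` is produced. The following is a simplified sketch of that splitting, not Ansible's actual `parse_kv` (which uses its own tokenizer); `parse_kv_sketch` is a hypothetical name.

```python
import shlex


def parse_kv_sketch(arg_string):
    """Split an argument string into key=value options plus leftover raw text."""
    options = {}
    raw = []
    for token in shlex.split(arg_string):
        if "=" in token:
            key, value = token.split("=", 1)
            options[key] = value
        else:
            raw.append(token)
    if raw:
        # Tokens without "=" are what ends up as free-form raw params.
        options["_raw_params"] = " ".join(raw)
    return options


print(parse_kv_sketch("cmd=start lsnr_name=my_listener"))
print(parse_kv_sketch("lsnrctl start"))
```

The first call yields only named options, while the second yields a `_raw_params` entry, which is exactly what triggers the "extra params" check for modules outside RAW_PARAM_MODULES.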
OK, the road was winding, but the result is acceptable.