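I had a rare free Sunday. A while ago I used debug-mode automatic reloading in a web assignment set by my teacher, and I never understood how it worked. So I googled Tornado's official English site and read the source code behind the feature. Luckily the source is well commented, and I got a rough idea of the principle. Here is my understanding of it; if I have anything wrong or missing, please point it out.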
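First, let's look at the constructor of the Application object:

    def __init__(self, handlers=None, default_host="", transforms=None, **settings):

When debug=True is passed in, for example with

    __app__ = tornado.web.Application(handlers=[(r'/', MainHandler)], debug=True)

the Application constructor runs the following code: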
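    if self.settings.get('debug'):
        self.settings.setdefault('autoreload', True)
        self.settings.setdefault('compiled_template_cache', False)
        self.settings.setdefault('static_hash_cache', False)
        self.settings.setdefault('serve_traceback', True)

    # Automatically reload modified modules
    if self.settings.get('autoreload'):
        from tornado import autoreload
        autoreload.start()

As you can see, once debug is set to True, autoreload, compiled_template_cache, static_hash_cache and serve_traceback each receive a default value. Because setdefault is used, a value the user passed explicitly is left alone. compiled_template_cache defaults to True; when it is False, templates are recompiled on every request. static_hash_cache also defaults to True; when it is False, the cached content hashes of static files are not kept, so static files are read again on every request. serve_traceback puts the traceback on the error page when an uncaught exception occurs.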
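To see why setdefault matters here, the following is a minimal sketch of the same logic in plain Python. The function name apply_debug_defaults is my own and is not part of Tornado; it only mirrors the lines quoted above.

```python
def apply_debug_defaults(settings):
    """Hypothetical helper (not a Tornado API) mirroring the quoted logic."""
    settings = dict(settings)  # work on a copy so the caller's dict is untouched
    if settings.get("debug"):
        # setdefault only writes a key that is not already present
        settings.setdefault("autoreload", True)
        settings.setdefault("compiled_template_cache", False)
        settings.setdefault("static_hash_cache", False)
        settings.setdefault("serve_traceback", True)
    return settings


plain = apply_debug_defaults({"debug": True})
overridden = apply_debug_defaults({"debug": True, "autoreload": False})

print(plain["autoreload"])       # True
print(overridden["autoreload"])  # False
```

So, under this reading, debug=True together with an explicit autoreload=False keeps the other debug conveniences while leaving automatic reloading off.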
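With autoreload set to True, the next if statement runs, and the function start() from autoreload is called. Note that no autoreload object is created: autoreload is not a class but a module, that is, a collection of functions.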
So let's follow this lead and see what on earth start() does that makes automatic reloading possible.
I pulled up the source code of tornado.autoreload. First, the start() function:
    _watched_files = set()
    _reload_hooks = []
    _reload_attempted = False
    _io_loops = weakref.WeakKeyDictionary()

    def start(io_loop=None, check_time=500):
        """Begins watching source files for changes using the given `.IOLoop`. """
        io_loop = io_loop or ioloop.IOLoop.current()
        if io_loop in _io_loops:
            return
        _io_loops[io_loop] = True
        if len(_io_loops) > 1:
            gen_log.warning("tornado.autoreload started more than once in the same process")
        add_reload_hook(functools.partial(io_loop.close, all_fds=True))
        modify_times = {}
        callback = functools.partial(_reload_on_update, modify_times)
        scheduler = ioloop.PeriodicCallback(callback, check_time, io_loop=io_loop)
        scheduler.start()

start has two parameters with default values, which is why it can be called as plain autoreload.start() above. io_loop falls back to the current IOLoop (the IOLoop is Tornado's event loop, which handles socket read and write events and is what gives Tornado its performance and concurrency). The next few lines guard against starting autoreload twice: if this io_loop is already registered in _io_loops, the function returns at once, and if more than one IOLoop ends up registered, a warning is logged. After that, add_reload_hook is called with a partial function as its argument.
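You are bound to ask what a partial function is. Put simply, you hand some of a function's arguments over in advance and get back a new callable that only needs the rest. For example: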
    from operator import add
    import functools

    print(add(1, 2))   # 3
    add1 = functools.partial(add, 1)
    print(add1(10))    # 11

Next, the scene cuts to the add_reload_hook function:
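    def add_reload_hook(fn):
        """Add a function to be called before reloading the process. """
        _reload_hooks.append(fn)

Wow, it is nothing but a single append. The docstring is the key, though: the function is stored in the _reload_hooks list so that it can be taken out and called later, just before the process reloads.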
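Right, back to start(). Next it initializes a dictionary called modify_times. From the name you would guess it records file modification times, and that turns out to be the case. Then another partial function, callback, is defined: calling callback() is the same as calling _reload_on_update(modify_times). So the scene cuts again, this time to _reload_on_update: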
    def _reload_on_update(modify_times):
        if _reload_attempted:
            # We already tried to reload and it didn't work, so don't try again.
            return
        if process.task_id() is not None:
            # We're in a child process created by fork_processes. If child
            # processes restarted themselves, they'd all restart and then
            # all call fork_processes again.
            return
        for module in sys.modules.values():
            # Some modules play games with sys.modules (e.g. email/__init__.py
            # in the standard library), and occasionally this can cause strange
            # failures in getattr. Just ignore anything that's not an ordinary
            # module.
            if not isinstance(module, types.ModuleType):
                continue
            path = getattr(module, "__file__", None)
            if not path:
                continue
            if path.endswith(".pyc") or path.endswith(".pyo"):
                path = path[:-1]
            _check_file(modify_times, path)
        for path in _watched_files:
            _check_file(modify_times, path)
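    def _check_file(modify_times, path):
        try:
            modified = os.stat(path).st_mtime
        except Exception:
            return
        if path not in modify_times:
            modify_times[path] = modified
            return
        if modify_times[path] != modified:
            gen_log.info("%s modified; restarting server", path)
            _reload()

Aha! modified holds the last-modification time of the file at path. The two checks that follow are crucial. The first asks whether path is already in the modify_times dictionary. If it is not, this is the first time the file has been seen, so it is added to the dictionary together with its modification time, and nothing else happens. The second asks whether a file that is already recorded has been modified since. For example, suppose the first check records a modification time of 3:00:00 for my .py file. If I then save the file at 3:00:10, the next poll sees a different modification time and the second check fires: a "<path> modified; restarting server" message is logged and _reload() is executed. Cut to: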
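The two checks can be tried out in isolation. The sketch below is my own simplification, not Tornado's code: file_changed is a hypothetical name, it returns a boolean instead of calling _reload(), and the "edit" is simulated by bumping the modification time of a temporary file with os.utime.

```python
import os
import tempfile


def file_changed(modify_times, path):
    """Return True if path's mtime differs from the one recorded earlier."""
    try:
        modified = os.stat(path).st_mtime
    except OSError:
        return False  # a missing or unreadable file is simply skipped
    if path not in modify_times:
        modify_times[path] = modified  # first sighting: record, do not reload
        return False
    return modify_times[path] != modified


fd, demo_path = tempfile.mkstemp(suffix=".py")
os.close(fd)

seen = {}
first = file_changed(seen, demo_path)   # first poll only records the mtime
second = file_changed(seen, demo_path)  # untouched file, still unchanged

# Simulate saving the file 10 seconds later by moving its mtime forward.
st = os.stat(demo_path)
os.utime(demo_path, (st.st_atime, st.st_mtime + 10))
third = file_changed(seen, demo_path)   # mtime differs from the recorded one

os.remove(demo_path)
fourth = file_changed(seen, demo_path)  # file is gone, so it is skipped

print(first, second, third, fourth)  # False False True False
```

Note that only the modification time is compared, never the content, so touching a file without changing it would count as a change too.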
    def _reload():
        global _reload_attempted
        _reload_attempted = True
        for fn in _reload_hooks:
            fn()
        if hasattr(signal, "setitimer"):
            # Clear the alarm signal set by
            # ioloop.set_blocking_log_threshold so it doesn't fire
            # after the exec.
            signal.setitimer(signal.ITIMER_REAL, 0, 0)
        # sys.path fixes: see comments at top of file. If sys.path[0] is an empty
        # string, we were (probably) invoked with -m and the effective path
        # is about to change on re-exec. Add the current directory to $PYTHONPATH
        # to ensure that the new process sees the same path we did.
        path_prefix = '.' + os.pathsep
        if (sys.path[0] == '' and
                not os.environ.get("PYTHONPATH", "").startswith(path_prefix)):
            os.environ["PYTHONPATH"] = (path_prefix +
                                        os.environ.get("PYTHONPATH", ""))
        if sys.platform == 'win32':
            # os.execv is broken on Windows and can't properly parse command line
            # arguments and executable name if they contain whitespaces. subprocess
            # fixes that behavior.
            subprocess.Popen([sys.executable] + sys.argv)
            sys.exit(0)
        else:
            try:
                os.execv(sys.executable, [sys.executable] + sys.argv)
            except OSError:
                # Mac OS X versions prior to 10.6 do not support execv in
                # a process that contains multiple threads. Instead of
                # re-executing in the current process, start a new one
                # and cause the current process to exit. This isn't
                # ideal since the new process is detached from the parent
                # terminal and thus cannot easily be killed with ctrl-C,
                # but it's better than not being able to autoreload at
                # all.
                # Unfortunately the errno returned in this case does not
                # appear to be consistent, so we can't easily check for
                # this error specifically.
                os.spawnv(os.P_NOWAIT, sys.executable,
                          [sys.executable] + sys.argv)
                sys.exit(0)

Don't be put off by the length; most of it is comments. It first sets _reload_attempted to True, to record that a reload has already been attempted. Then the partial function stored earlier in _reload_hooks,

    functools.partial(io_loop.close, all_fds=True)

is taken out and called, which closes the old io_loop. I could not explain the next stretch of code on my own, but going by its comments it does two things: it clears any pending interval-timer alarm so that it cannot fire after the exec, and, when sys.path[0] is an empty string (probably because the program was started with -m), it prepends the current directory to PYTHONPATH so that the new process sees the same import path as the old one. Finally it checks the platform and re-executes the Python interpreter with the original command-line arguments, through subprocess.Popen on Windows and os.execv elsewhere. Strictly speaking, nothing is "recompiled" here: the whole process is restarted, and the modified code is loaded from scratch.
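The two pieces of data that the restart depends on can be looked at without actually restarting anything. This is only a sketch under my own naming: fixed_pythonpath and reexec_command are hypothetical helpers that reproduce the PYTHONPATH rule and the command line from _reload() as pure functions, so no process is replaced.

```python
import os
import sys


def fixed_pythonpath(first_sys_path_entry, current_pythonpath):
    """Return the PYTHONPATH value the re-executed process should see."""
    path_prefix = "." + os.pathsep
    if (first_sys_path_entry == ""
            and not current_pythonpath.startswith(path_prefix)):
        # Probably started with -m: keep the current directory importable.
        return path_prefix + current_pythonpath
    return current_pythonpath


def reexec_command():
    """Command line that starts the same program again."""
    return [sys.executable] + sys.argv


print(fixed_pythonpath("", "/opt/lib"))      # gains a leading "." entry
print(fixed_pythonpath("/app", "/opt/lib"))  # returned unchanged
print(reexec_command())
```

Passing reexec_command() to os.execv(sys.executable, ...) would replace the current process, which is why the sketch stops at printing it.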
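Ahem, did you think that was the end? All of that was only the chain of functions hanging off the partial functions in start(), so we have to go back to start() once more. It ends with these two statements: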
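    scheduler = ioloop.PeriodicCallback(callback, check_time, io_loop=io_loop)
    scheduler.start()

This part is easy to understand. A schedule is a timetable, and PeriodicCallback, as the name suggests, calls something periodically. After scheduler.start(), the program calls callback() once every check_time milliseconds (500 by default), which runs the whole chain of functions described above. As soon as a relevant file has been modified, everything stops and the process restarts with the new code. And that is how automatic reloading is achieved!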
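The idea of periodic polling can be imitated with the standard library. The sketch below uses asyncio rather than Tornado's PeriodicCallback, and run_periodically is a hypothetical helper of mine. Unlike the real scheduler, which keeps going until the loop stops, it runs a fixed number of rounds so that the example terminates.

```python
import asyncio


async def run_periodically(callback, interval_ms, rounds):
    """Call callback() every interval_ms milliseconds, rounds times in total."""
    for _ in range(rounds):
        await asyncio.sleep(interval_ms / 1000.0)
        callback()


calls = []
# 10 ms instead of Tornado's default 500 ms, only to keep the demo quick.
asyncio.run(run_periodically(lambda: calls.append(1), 10, 3))
print(len(calls))  # 3
```

In Tornado's case the callback is the partial built from _reload_on_update, so each tick is one scan over all loaded modules and watched files.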
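Of course, anyone who wants the full details can follow the links below to the official Tornado site and read the completely original source code: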
http://www.tornadoweb.org/en/stable/_modules/tornado/web.html#Application
http://www.tornadoweb.org/en/stable/_modules/tornado/autoreload.html#add_reload_hook
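Once again, this article is purely my own understanding. If anything in it is inappropriate or wrong, please do not hesitate to point it out.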
All right, it is almost midnight, so I am off to bed.