Werkzeug source code analysis: routing

Exploring Werkzeug's routing module

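How does a framework find the handler function for an incoming request? That is the job of routing. Werkzeug's routing module provides it, and it is built around three main classes: Rule, Map, and MapAdapter. The sections below look at each in turn.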

Rule

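The Rule class stores a single routing rule. It defines the mapping from a URL pattern to an endpoint and can optionally restrict the allowed HTTP methods. The __init__ method mainly performs this initialization.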

@implements_to_string
class Rule(RuleFactory):

    def __init__(
        self,
        string,
        defaults=None,
        subdomain=None,
        methods=None,
        build_only=False,
        endpoint=None,
        strict_slashes=None,
        redirect_to=None,
        alias=False,
        host=None,
    ):
        if not string.startswith("/"):
            raise ValueError("urls must start with a leading slash")
        self.rule = string
        self.is_leaf = not string.endswith("/")

        self.map = None
        self.strict_slashes = strict_slashes
        self.subdomain = subdomain
        self.host = host
        self.defaults = defaults
        self.build_only = build_only
        self.alias = alias
        if methods is None:
            self.methods = None
        else:
            if isinstance(methods, str):
                raise TypeError("param `methods` should be `Iterable[str]`, not `str`")
            self.methods = set([x.upper() for x in methods])
            if "HEAD" not in self.methods and "GET" in self.methods:
                self.methods.add("HEAD")
        self.endpoint = endpoint
        self.redirect_to = redirect_to

        if defaults:
            self.arguments = set(map(str, defaults))
        else:
            self.arguments = set()
        self._trace = self._converters = self._regex = self._argument_weights = None
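
The method handling in __init__ is easy to miss, so here is a minimal sketch of just that part. The helper names normalize_methods and is_leaf are hypothetical and are not part of Werkzeug; they only mirror the logic visible in the excerpt above.

```python
def normalize_methods(methods):
    """Mirror how Rule.__init__ treats the `methods` argument (sketch only)."""
    if methods is None:
        # No restriction was given for this rule.
        return None
    if isinstance(methods, str):
        # A bare string would otherwise be iterated character by character.
        raise TypeError("param `methods` should be `Iterable[str]`, not `str`")
    normalized = {m.upper() for m in methods}
    # HEAD is added automatically whenever GET is allowed.
    if "HEAD" not in normalized and "GET" in normalized:
        normalized.add("HEAD")
    return normalized


def is_leaf(rule_string):
    """A rule is a leaf when it does not end with a trailing slash."""
    return not rule_string.endswith("/")


print(sorted(normalize_methods(["get", "post"])))
print(is_leaf("/about"), is_leaf("/blog/"))
```

Passing methods="GET" raises TypeError, which is why the methods should always be given as a list, tuple, or set.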

A Rule is attached to a Map through the bind method. Once bound, the Map can look up this rule. Put simply, this is route registration.

def bind(self, map, rebind=False):
    """Bind the url to a map and create a regular expression based on
    the information from the rule itself and the defaults from the map.

    :internal:
    """
    if self.map is not None and not rebind:
        raise RuntimeError("url rule %r already bound to map %r" % (self, self.map))
    self.map = map
    if self.strict_slashes is None:
        self.strict_slashes = map.strict_slashes
    if self.subdomain is None:
        self.subdomain = map.default_subdomain
    self.compile()
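
The binding behaviour can be illustrated with stand-in classes. SketchMap and SketchRule below are hypothetical simplifications, not Werkzeug classes; the sketch shows only the bind-once check and the inheritance of defaults from the map, and it leaves out the compile() step.

```python
class SketchMap:
    """Stand-in for Map that carries only the defaults used by bind()."""
    strict_slashes = True
    default_subdomain = ""


class SketchRule:
    """Stand-in for Rule with just the attributes bind() touches."""

    def __init__(self, string, strict_slashes=None, subdomain=None):
        self.rule = string
        self.map = None
        self.strict_slashes = strict_slashes
        self.subdomain = subdomain

    def bind(self, map, rebind=False):
        # A rule may belong to one map only, unless rebind is requested.
        if self.map is not None and not rebind:
            raise RuntimeError("url rule %r already bound to map %r" % (self, self.map))
        self.map = map
        # Unset per-rule options fall back to the map-wide defaults.
        if self.strict_slashes is None:
            self.strict_slashes = map.strict_slashes
        if self.subdomain is None:
            self.subdomain = map.default_subdomain
        # The real Rule.bind() additionally calls self.compile() here.


rule = SketchRule("/about")
rule.bind(SketchMap())
print(rule.strict_slashes, repr(rule.subdomain))
```

Calling rule.bind(SketchMap()) a second time raises RuntimeError, while passing rebind=True allows it.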

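So how does the Map decide whether a request path matches a registered route? Through the match method. Rule.match runs the rule's compiled regular expression against the path and converts the captured values. In the code shown here, the method argument is only consulted when deciding whether to raise RequestSlash.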

def match(self, path, method=None):
    """Check if the rule matches a given path. Path is a string in the
    form ``"subdomain|/path"`` and is assembled by the map.  If
    the map is doing host matching the subdomain part will be the host
    instead.

    If the rule matches a dict with the converted values is returned,
    otherwise the return value is `None`.

    :internal:
    """
    if not self.build_only:
        m = self._regex.search(path)
        if m is not None:
            groups = m.groupdict()
            # we have a folder like part of the url without a trailing
            # slash and strict slashes enabled. raise an exception that
            # tells the map to redirect to the same url but with a
            # trailing slash
            if (
                self.strict_slashes
                and not self.is_leaf
                and not groups.pop("__suffix__")
                and (
                    method is None or self.methods is None or method in self.methods
                )
            ):
                raise RequestSlash()
            # if we are not in strict slashes mode we have to remove
            # a __suffix__
            elif not self.strict_slashes:
                del groups["__suffix__"]

            result = {}
            for name, value in iteritems(groups):
                try:
                    value = self._converters[name].to_python(value)
                except ValidationError:
                    return
                result[str(name)] = value
            if self.defaults:
                result.update(self.defaults)

            if self.alias and self.map.redirect_defaults:
                raise RequestAliasRedirect(result)

            return result
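
The core idea of match, a regular expression with named groups plus per-argument converters, can be sketched without Werkzeug. Everything below is a simplified assumption for illustration: compile_rule, match, and the CONVERTERS table are hypothetical, only the int and default converters are covered, and strict slashes, subdomains, and redirects are ignored.

```python
import re

# Stand-ins for Werkzeug's converter classes: name -> (regex, to_python).
CONVERTERS = {
    "default": (r"[^/]+", str),
    "int": (r"\d+", int),
}

# Matches placeholders such as <name> or <int:name>.
_PLACEHOLDER = re.compile(r"<(?:(?P<conv>[a-zA-Z_]+):)?(?P<name>[a-zA-Z_][a-zA-Z0-9_]*)>")


def compile_rule(rule):
    """Turn a rule string into a compiled regex and a converter table."""
    parts, to_python, pos = [], {}, 0
    for m in _PLACEHOLDER.finditer(rule):
        parts.append(re.escape(rule[pos:m.start()]))  # static text
        pattern, func = CONVERTERS[m.group("conv") or "default"]
        parts.append("(?P<%s>%s)" % (m.group("name"), pattern))
        to_python[m.group("name")] = func
        pos = m.end()
    parts.append(re.escape(rule[pos:]))
    return re.compile("^" + "".join(parts) + "$"), to_python


def match(rule, path, defaults=None):
    """Return a dict of converted values, or None if the path does not match."""
    regex, to_python = compile_rule(rule)
    m = regex.search(path)
    if m is None:
        return None
    result = {name: to_python[name](value) for name, value in m.groupdict().items()}
    if defaults:
        result.update(defaults)
    return result


print(match("/user/<int:id>", "/user/42"))
print(match("/user/<int:id>", "/user/abc"))
```

As in the real method, a non-matching path yields None and the rule's defaults are merged into the result last.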

Map

The Map class stores all routing rules along with global configuration. The __init__ method mainly performs this initialization:

class Map(object):
    """The map class stores all the URL rules and some configuration
    parameters.  Some of the configuration values are only stored on the
    `Map` instance since those affect all rules, others are just defaults
    and can be overridden for each rule.  Note that you have to specify all
    arguments besides the `rules` as keyword arguments!

    :param rules: sequence of url rules for this map.
    :param default_subdomain: The default subdomain for rules without a
                              subdomain defined.
    :param charset: charset of the url. defaults to ``"utf-8"``
    :param strict_slashes: Take care of trailing slashes.
    :param redirect_defaults: This will redirect to the default rule if it
                              wasn't visited that way. This helps creating
                              unique URLs.
    :param converters: A dict of converters that adds additional converters
                       to the list of converters. If you redefine one
                       converter this will override the original one.
    :param sort_parameters: If set to `True` the url parameters are sorted.
                            See `url_encode` for more details.
    :param sort_key: The sort key function for `url_encode`.
    :param encoding_errors: the error method to use for decoding
    :param host_matching: if set to `True` it enables the host matching
                          feature and disables the subdomain one.  If
                          enabled the `host` parameter to rules is used
                          instead of the `subdomain` one.

    .. versionadded:: 0.5
        `sort_parameters` and `sort_key` was added.

    .. versionadded:: 0.7
        `encoding_errors` and `host_matching` was added.
    """

    #: A dict of default converters to be used.
    default_converters = ImmutableDict(DEFAULT_CONVERTERS)

    def __init__(
        self,
        rules=None,
        default_subdomain="",
        charset="utf-8",
        strict_slashes=True,
        redirect_defaults=True,
        converters=None,
        sort_parameters=False,
        sort_key=None,
        encoding_errors="replace",
        host_matching=False,
    ):
        self._rules = []
        self._rules_by_endpoint = {}
        self._remap = True
        self._remap_lock = Lock()

        self.default_subdomain = default_subdomain
        self.charset = charset
        self.encoding_errors = encoding_errors
        self.strict_slashes = strict_slashes
        self.redirect_defaults = redirect_defaults
        self.host_matching = host_matching

        self.converters = self.default_converters.copy()
        if converters:
            self.converters.update(converters)

        self.sort_parameters = sort_parameters
        self.sort_key = sort_key

        for rulefactory in rules or ():
            self.add(rulefactory)

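Rules can be added at initialization time, and also dynamically afterwards through the add method. The core of this method is rule.bind(self), which was covered in the analysis of Rule.bind above: it registers the route.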

def add(self, rulefactory):
    """Add a new rule or factory to the map and bind it.  Requires that the
    rule is not bound to another map.

    :param rulefactory: a :class:`Rule` or :class:`RuleFactory`
    """
    for rule in rulefactory.get_rules(self):
        rule.bind(self)
        self._rules.append(rule)
        self._rules_by_endpoint.setdefault(rule.endpoint, []).append(rule)
    self._remap = True
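
The bookkeeping done by add can be sketched as follows. SketchRule and SketchMap are hypothetical stand-ins that keep only the list of rules and the endpoint index; the real classes do considerably more.

```python
class SketchRule:
    """Stand-in rule: knows its endpoint and which map it is bound to."""

    def __init__(self, string, endpoint):
        self.rule = string
        self.endpoint = endpoint
        self.map = None

    def get_rules(self, map):
        # A plain rule is a factory that yields only itself.
        yield self

    def bind(self, map):
        self.map = map


class SketchMap:
    """Stand-in map: stores rules in order and indexes them by endpoint."""

    def __init__(self, rules=None):
        self._rules = []
        self._rules_by_endpoint = {}
        self._remap = True
        for rulefactory in rules or ():
            self.add(rulefactory)

    def add(self, rulefactory):
        for rule in rulefactory.get_rules(self):
            rule.bind(self)
            self._rules.append(rule)
            # Several rules may share one endpoint, hence a list per key.
            self._rules_by_endpoint.setdefault(rule.endpoint, []).append(rule)
        # Flag that the rule collection changed since the last update.
        self._remap = True


url_map = SketchMap([SketchRule("/", "index")])
url_map.add(SketchRule("/index", "index"))
url_map.add(SketchRule("/about", "about"))
print({k: [r.rule for r in v] for k, v in url_map._rules_by_endpoint.items()})
```

Rules passed to the constructor and rules added later go through the same add path, so both end up bound and indexed identically.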

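The Map itself also has to be bound to the request environment. Once bound, it can take the request path from that environment and match it against the rules. The bind method returns a MapAdapter object, which is analyzed later.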

def bind(
    self,
    server_name,
    script_name=None,
    subdomain=None,
    url_scheme="http",
    default_method="GET",
    path_info=None,
    query_args=None,
):
    """Return a new :class:`MapAdapter` with the details specified to the
    call.  Note that `script_name` will default to ``'/'`` if not further
    specified or `None`.  The `server_name` at least is a requirement
    because the HTTP RFC requires absolute URLs for redirects and so all
    redirect exceptions raised by Werkzeug will contain the full canonical
    URL.

    If no path_info is passed to :meth:`match` it will use the default path
    info passed to bind.  While this doesn't really make sense for
    manual bind calls, it's useful if you bind a map to a WSGI
    environment which already contains the path info.

    `subdomain` will default to the `default_subdomain` for this map if
    no defined. If there is no `default_subdomain` you cannot use the
    subdomain feature.

    .. versionadded:: 0.7
       `query_args` added

    .. versionadded:: 0.8
       `query_args` can now also be a string.

    .. versionchanged:: 0.15
        ``path_info`` defaults to ``'/'`` if ``None``.
    """
    server_name = server_name.lower()
    if self.host_matching:
        if subdomain is not None:
            raise RuntimeError("host matching enabled and a subdomain was provided")
    elif subdomain is None:
        subdomain = self.default_subdomain
    if script_name is None:
        script_name = "/"
    if path_info is None:
        path_info = "/"
    try:
        server_name = _encode_idna(server_name)
    except UnicodeError:
        raise BadHost()
    return MapAdapter(
        self,
        server_name,
        script_name,
        subdomain,
        url_scheme,
        path_info,
        default_method,
        query_args,
    )

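A more convenient binding interface is also provided: bind_to_environ only needs environ, server_name, and subdomain. As the final return statement shows, it still calls the bind method analyzed above, after first extracting the necessary values from environ.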

def bind_to_environ(self, environ, server_name=None, subdomain=None):
    """Like :meth:`bind` but you can pass it an WSGI environment and it
    will fetch the information from that dictionary.  Note that because of
    limitations in the protocol there is no way to get the current
    subdomain and real `server_name` from the environment.  If you don't
    provide it, Werkzeug will use `SERVER_NAME` and `SERVER_PORT` (or
    `HTTP_HOST` if provided) as used `server_name` with disabled subdomain
    feature.

    If `subdomain` is `None` but an environment and a server name is
    provided it will calculate the current subdomain automatically.
    Example: `server_name` is ``'example.com'`` and the `SERVER_NAME`
    in the wsgi `environ` is ``'staging.dev.example.com'`` the calculated
    subdomain will be ``'staging.dev'``.

    If the object passed as environ has an environ attribute, the value of
    this attribute is used instead.  This allows you to pass request
    objects.  Additionally `PATH_INFO` added as a default of the
    :class:`MapAdapter` so that you don't have to pass the path info to
    the match method.

    .. versionchanged:: 0.5
        previously this method accepted a bogus `calculate_subdomain`
        parameter that did not have any effect.  It was removed because
        of that.

    .. versionchanged:: 0.8
       This will no longer raise a ValueError when an unexpected server
       name was passed.

    :param environ: a WSGI environment.
    :param server_name: an optional server name hint (see above).
    :param subdomain: optionally the current subdomain (see above).
    """
    environ = _get_environ(environ)

    wsgi_server_name = get_host(environ).lower()

    if server_name is None:
        server_name = wsgi_server_name
    else:
        server_name = server_name.lower()

    if subdomain is None and not self.host_matching:
        cur_server_name = wsgi_server_name.split(".")
        real_server_name = server_name.split(".")
        offset = -len(real_server_name)
        if cur_server_name[offset:] != real_server_name:
            # This can happen even with valid configs if the server was
            # accesssed directly by IP address under some situations.
            # Instead of raising an exception like in Werkzeug 0.7 or
            # earlier we go by an invalid subdomain which will result
            # in a 404 error on matching.
            subdomain = "<invalid>"
        else:
            subdomain = ".".join(filter(None, cur_server_name[:offset]))

    def _get_wsgi_string(name):
        val = environ.get(name)
        if val is not None:
            return wsgi_decoding_dance(val, self.charset)

    script_name = _get_wsgi_string("SCRIPT_NAME")
    path_info = _get_wsgi_string("PATH_INFO")
    query_args = _get_wsgi_string("QUERY_STRING")
    return Map.bind(
        self,
        server_name,
        script_name,
        subdomain,
        environ["wsgi.url_scheme"],
        environ["REQUEST_METHOD"],
        path_info,
        query_args=query_args,
    )
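
The subdomain calculation in the middle of bind_to_environ is the least obvious step, so here it is isolated as a standalone function. The name calculate_subdomain is hypothetical, and the "<invalid>" sentinel returned on a mismatch is this sketch's stand-in for the "invalid subdomain" that the code comment describes.

```python
def calculate_subdomain(wsgi_server_name, server_name):
    """Derive the subdomain by stripping server_name from the end of the host."""
    cur_server_name = wsgi_server_name.lower().split(".")
    real_server_name = server_name.lower().split(".")
    # Negative offset: how many trailing labels belong to the server name.
    offset = -len(real_server_name)
    if cur_server_name[offset:] != real_server_name:
        # Host does not end with the configured server name,
        # for example when the server is reached by IP address.
        return "<invalid>"
    # Whatever precedes the server name is the subdomain.
    return ".".join(filter(None, cur_server_name[:offset]))


print(calculate_subdomain("staging.dev.example.com", "example.com"))
print(repr(calculate_subdomain("example.com", "example.com")))
print(calculate_subdomain("127.0.0.1", "example.com"))
```

The first call reproduces the example from the docstring: with server_name 'example.com' and host 'staging.dev.example.com', the subdomain is 'staging.dev'.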

MapAdapter

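When bind is called on a Map, it returns a MapAdapter object. This class performs route matching and URL building based on runtime information (internally it still relies on the Rule and Map methods analyzed above). Its __init__ method again just does initialization.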

class MapAdapter(object):

    """Returned by :meth:`Map.bind` or :meth:`Map.bind_to_environ` and does
    the URL matching and building based on runtime information.
    """

    def __init__(
        self,
        map,
        server_name,
        script_name,
        subdomain,
        url_scheme,
        path_info,
        default_method,
        query_args=None,
    ):
        self.map = map
        self.server_name = to_unicode(server_name)
        script_name = to_unicode(script_name)
        if not script_name.endswith(u"/"):
            script_name += u"/"
        self.script_name = script_name
        self.subdomain = to_unicode(subdomain)
        self.url_scheme = to_unicode(url_scheme)
        self.path_info = to_unicode(path_info)
        self.default_method = to_unicode(default_method)
        self.query_args = query_args

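As seen in the Rule class, a routing rule maps a URL to an endpoint, but the request is actually handled by a view function. So how is a request dispatched to the right view function? Through the dispatch method. It first calls match to obtain the endpoint and args, then calls view_func(endpoint, args). The mapping from endpoints to view functions is left to the caller to implement; dispatch only does the dispatching. This also explains why Flask maintains its own mapping between endpoints and view functions.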

def dispatch(
    self, view_func, path_info=None, method=None, catch_http_exceptions=False
):
    """Does the complete dispatching process.  `view_func` is called with
    the endpoint and a dict with the values for the view.  It should
    look up the view function, call it, and return a response object
    or WSGI application.  http exceptions are not caught by default
    so that applications can display nicer error messages by just
    catching them by hand.  If you want to stick with the default
    error messages you can pass it ``catch_http_exceptions=True`` and
    it will catch the http exceptions.

    Here a small example for the dispatch usage::

        from werkzeug.wrappers import Request, Response
        from werkzeug.wsgi import responder
        from werkzeug.routing import Map, Rule

        def on_index(request):
            return Response('Hello from the index')

        url_map = Map([Rule('/', endpoint='index')])
        views = {'index': on_index}

        @responder
        def application(environ, start_response):
            request = Request(environ)
            urls = url_map.bind_to_environ(environ)
            return urls.dispatch(lambda e, v: views[e](request, **v),
                                 catch_http_exceptions=True)

    Keep in mind that this method might return exception objects, too, so
    use :class:`Response.force_type` to get a response object.

    :param view_func: a function that is called with the endpoint as
                      first argument and the value dict as second.  Has
                      to dispatch to the actual view function with this
                      information.  (see above)
    :param path_info: the path info to use for matching.  Overrides the
                      path info specified on binding.
    :param method: the HTTP method used for matching.  Overrides the
                   method specified on binding.
    :param catch_http_exceptions: set to `True` to catch any of the
                                  werkzeug :class:`HTTPException`\\s.
    """
    try:
        try:
            endpoint, args = self.match(path_info, method)
        except RequestRedirect as e:
            return e
        return view_func(endpoint, args)
    except HTTPException as e:
        if catch_http_exceptions:
            return e
        raise
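
The division of labour described above, where the router yields an endpoint and the caller owns the endpoint-to-view table, can be sketched without Werkzeug. SketchAdapter and NotFound are hypothetical stand-ins; matching is reduced to a dictionary lookup and redirects are left out.

```python
class NotFound(Exception):
    """Stand-in for Werkzeug's HTTPException family."""


class SketchAdapter:
    """Stand-in for MapAdapter with a trivial exact-path matcher."""

    def __init__(self, routes):
        self.routes = routes  # path -> endpoint

    def match(self, path_info):
        try:
            return self.routes[path_info], {}
        except KeyError:
            raise NotFound(path_info) from None

    def dispatch(self, view_func, path_info, catch_http_exceptions=False):
        try:
            endpoint, args = self.match(path_info)
            # The adapter never looks up views itself; the caller does.
            return view_func(endpoint, args)
        except NotFound as e:
            if catch_http_exceptions:
                return e  # hand the error back as a return value
            raise


def on_index():
    return "Hello from the index"


# The endpoint-to-view mapping lives in application code.
views = {"index": on_index}
adapter = SketchAdapter({"/": "index"})

result = adapter.dispatch(lambda endpoint, values: views[endpoint](**values), "/")
print(result)
```

With catch_http_exceptions=True, a request for an unknown path returns the NotFound instance instead of raising it, which mirrors the note in the docstring that dispatch may return exception objects.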
