Django source code analysis: the CsrfViewMiddleware middleware

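Environment for this article: Python 3.5.2, Django 1.10.x.
This article covers Django's protection against cross-site request forgery (CSRF), which is implemented mainly by a single middleware. It first gives a brief overview of how CSRF works and of the role middleware plays in request handling, then focuses on a walkthrough of the CsrfViewMiddleware source code.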

How CSRF works

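Concept: CSRF stands for Cross-Site Request Forgery. Like XSS, it can do serious damage. You can think of it this way:
An attacker misuses your identity and sends malicious requests in your name. To the server the request looks completely legitimate, yet it carries out an operation the attacker wants, such as sending email or messages in your name, stealing your account, adding a system administrator, or even purchasing goods or transferring virtual currency. In the usual illustration, Web A is a site with a CSRF vulnerability, Web B is a malicious site built by the attacker, and User C is a legitimate user of Web A.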

The role of Django middleware

The Django documentation describes middleware as follows:

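Middleware is a framework of hooks into Django's request/response processing. It is a light, low-level "plugin" system for globally altering Django's input or output.
Each middleware component is responsible for one specific function. For example, Django includes a middleware component, AuthenticationMiddleware, that associates users with requests using sessions.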

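During the request phase, before the view is called, Django applies middleware in the order it is listed in MIDDLEWARE_CLASSES (or in the MIDDLEWARE setting introduced in Django 1.10), top to bottom. Two hooks are available:
process_request()
process_view()
During the response phase, after the view is called, middleware is applied in reverse order, bottom to top. Three hooks are available:
process_exception() (only if the view raises an exception)
process_template_response() (only if the response has a render() method, as a TemplateResponse does)
process_response()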
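
The calling order described above can be illustrated with a small simulation in plain Python. This is only a sketch, not Django's actual handler: `RecordingMiddleware` and `handle` are hypothetical names invented for the example, and the short-circuiting that happens when a hook returns a response is left out.

```python
class RecordingMiddleware:
    """Hypothetical middleware that only records which hook was called."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def process_request(self, request):
        self.log.append(self.name + ".process_request")

    def process_view(self, request, view, args, kwargs):
        self.log.append(self.name + ".process_view")

    def process_response(self, request, response):
        self.log.append(self.name + ".process_response")
        return response


def handle(request, middlewares, view):
    """Simplified dispatcher (assumption: no hook short-circuits)."""
    # Request phase: top to bottom.
    for mw in middlewares:
        mw.process_request(request)
    for mw in middlewares:
        mw.process_view(request, view, (), {})
    response = view(request)
    # Response phase: bottom to top.
    for mw in reversed(middlewares):
        response = mw.process_response(request, response)
    return response


log = []
stack = [RecordingMiddleware("A", log), RecordingMiddleware("B", log)]
result = handle({}, stack, lambda request: "ok")
print(log)
```

With two middlewares A and B listed in that order, the request hooks run A then B, while process_response runs B then A.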


Source code walkthrough

The following goes through the source of the CsrfViewMiddleware middleware in detail, with explanatory comments next to the code.

class CsrfViewMiddleware(MiddlewareMixin):
    """
    Middleware that requires a present and correct csrfmiddlewaretoken
    for POST requests that have a CSRF cookie, and sets an outgoing
    CSRF cookie.

    This middleware should be used in conjunction with the csrf_token template
    tag.
    """
    # The _accept and _reject methods currently only exist for the sake of the
    # requires_csrf_token decorator.
    def _accept(self, request):
        # Avoid checking the request twice by adding a custom attribute to
        # request.  This will be relevant when both decorator and middleware
        # are used.
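        request.csrf_processing_done = True  # mark the request as checked so the middleware and the decorator do not both check it
        return None

    def _reject(self, request, reason):  # log the rejection reason and return the 403 failure view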
        logger.warning(
            'Forbidden (%s): %s', reason, request.path,
            extra={
                'status_code': 403,
                'request': request,
            }
        )
        return _get_failure_view()(request, reason=reason)

    def process_view(self, request, callback, callback_args, callback_kwargs):
        if getattr(request, 'csrf_processing_done', False):  # the request has already been checked: return None
            return None

        try:
            cookie_token = request.COOKIES[settings.CSRF_COOKIE_NAME]  # CSRF_COOKIE_NAME defaults to 'csrftoken'
        except KeyError:
            csrf_token = None
        else:
            csrf_token = _sanitize_token(cookie_token)  # validate the cookie value; an invalid one is replaced with a new token
            if csrf_token != cookie_token:
                # Cookie token needed to be replaced;
                # the cookie needs to be reset.
                request.csrf_cookie_needs_reset = True  # the sanitized token differs from the cookie, so the cookie must be reset
            # Use same token next time.
            request.META['CSRF_COOKIE'] = csrf_token  # store the token in request.META

        # Wait until request.META["CSRF_COOKIE"] has been manipulated before
        # bailing out, so that get_token still works
        if getattr(callback, 'csrf_exempt', False):  # the view is marked by the csrf_exempt decorator: skip the check
            return None

        # Assume that anything not defined as 'safe' by RFC7231 needs protection
        if request.method not in ('GET', 'HEAD', 'OPTIONS', 'TRACE'):  # unsafe methods such as POST, PUT, PATCH, DELETE
            if getattr(request, '_dont_enforce_csrf_checks', False):  # switch used by the test suite to turn off CSRF checks
                # Mechanism to turn off CSRF checks for test suite.
                # It comes after the creation of CSRF cookies, so that
                # everything else continues to work exactly the same
                # (e.g. cookies are sent, etc.), but before any
                # branches that call reject().
                return self._accept(request)

            if request.is_secure():  # is_secure() returns True for HTTPS requests
                # Suppose user visits http://example.com/
                # An active network attacker (man-in-the-middle, MITM) sends a
                # POST form that targets https://example.com/detonate-bomb/ and
                # submits it via JavaScript.
                #
                # The attacker will need to provide a CSRF cookie and token, but
                # that's no problem for a MITM and the session-independent
                # secret we're using. So the MITM can circumvent the CSRF
                # protection. This is true for any HTTP connection, but anyone
                # using HTTPS expects better! For this reason, for
                # https://example.com/ we need additional protection that treats
                # http://example.com/ as completely untrusted. Under HTTPS,
                # Barth et al. found that the Referer header is missing for
                # same-domain requests in only about 0.2% of cases or less, so
                # we can use strict Referer checking.
                referer = force_text(  # read the Referer header of the request
                    request.META.get('HTTP_REFERER'),
                    strings_only=True,
                    errors='replace'
                )
                if referer is None:  # no Referer header: reject with 403
                    return self._reject(request, REASON_NO_REFERER)

                referer = urlparse(referer)

                # Make sure we have a valid URL for Referer.
                if '' in (referer.scheme, referer.netloc):  # scheme is the protocol (http or https), netloc is the host[:port]
                    return self._reject(request, REASON_MALFORMED_REFERER)  # malformed Referer: reject

                # Ensure that our Referer is also secure.
                if referer.scheme != 'https':  # the Referer is not HTTPS: reject
                    return self._reject(request, REASON_INSECURE_REFERER)

                # If there isn't a CSRF_COOKIE_DOMAIN, assume we need an exact
                # match on host:port. If not, obey the cookie rules.
                if settings.CSRF_COOKIE_DOMAIN is None:
                    # request.get_host() includes the port.
                    good_referer = request.get_host()
                else:
                    good_referer = settings.CSRF_COOKIE_DOMAIN                          
                    server_port = request.get_port()
                    if server_port not in ('443', '80'):
                        good_referer = '%s:%s' % (good_referer, server_port)

                # Here we generate a list of all acceptable HTTP referers,
                # including the current host since that has been validated
                # upstream.
                good_hosts = list(settings.CSRF_TRUSTED_ORIGINS)  # build the list of acceptable Referer hosts:
                good_hosts.append(good_referer)  # the trusted origins plus the current host (or CSRF_COOKIE_DOMAIN)

                if not any(is_same_domain(referer.netloc, host) for host in good_hosts):  # is the Referer host in the allowed list?
                    reason = REASON_BAD_REFERER % referer.geturl()
                    return self._reject(request, reason)  # no match: reject the cross-site request
                # The block above uses the Referer host to decide whether the request is cross-site.

            if csrf_token is None:  # no CSRF cookie: reject with 403
                # No CSRF cookie. For POST requests, we insist on a CSRF cookie,
                # and in this way we can avoid all CSRF attacks, including login
                # CSRF.
                return self._reject(request, REASON_NO_CSRF_COOKIE)

            # Check non-cookie token for match.
            request_csrf_token = ""
            if request.method == "POST":
                try:
                    request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')  # form field rendered in templates by {% csrf_token %}
                except IOError:
                    # Handle a broken connection before we've completed reading
                    # the POST data. process_view shouldn't raise any
                    # exceptions, so we'll ignore and serve the user a 403
                    # (assuming they're still listening, which they probably
                    # aren't because of the error).
                    pass

            if request_csrf_token == "":
                # Fall back to X-CSRFToken, to make things easier for AJAX,
                # and possible for PUT/DELETE.
                request_csrf_token = request.META.get(settings.CSRF_HEADER_NAME, '')  # AJAX sends "X-CSRFToken"; CSRF_HEADER_NAME defaults to 'HTTP_X_CSRFTOKEN'

            request_csrf_token = _sanitize_token(request_csrf_token)
            if not _compare_salted_tokens(request_csrf_token, csrf_token):  # compare the request token (form field or X-CSRFToken header) with the cookie token
                return self._reject(request, REASON_BAD_TOKEN)

        return self._accept(request)

    def process_response(self, request, response):
        if not getattr(request, 'csrf_cookie_needs_reset', False):  # the cookie does not need to be reset
            if getattr(response, 'csrf_cookie_set', False):  # and it has already been set on this response
                return response

        if not request.META.get("CSRF_COOKIE_USED", False):
            return response

        # Set the CSRF cookie even if it's already set, so we renew
        # the expiry timer.
        response.set_cookie(settings.CSRF_COOKIE_NAME,  # (re)set the CSRF cookie on the response
                            request.META["CSRF_COOKIE"],
                            max_age=settings.CSRF_COOKIE_AGE,
                            domain=settings.CSRF_COOKIE_DOMAIN,
                            path=settings.CSRF_COOKIE_PATH,
                            secure=settings.CSRF_COOKIE_SECURE,
                            httponly=settings.CSRF_COOKIE_HTTPONLY
                            )
        # Content varies with the CSRF cookie, so set the Vary header.
        patch_vary_headers(response, ('Cookie',))
        response.csrf_cookie_set = True  # mark the response as carrying the CSRF cookie
        return response

CsrfViewMiddleware mainly implements two hooks: process_view() and process_response().

process_view() validates the CSRF token: it compares the token sent with the request against the token in the cookie and lets the request through if they match.
process_response() sets the CSRF token as a cookie on the response.
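
Before the tokens are compared, process_view() also performs a strict Referer check on HTTPS requests. The sketch below reproduces that decision in plain Python so it can be run without Django. It is an approximation under stated assumptions: `is_same_domain` is modeled on the helper of the same name that the middleware calls, `referer_allowed` is a hypothetical wrapper invented for this example, and the CSRF_COOKIE_DOMAIN and port handling are omitted.

```python
from urllib.parse import urlparse


def is_same_domain(host, pattern):
    """Exact match, or subdomain match when pattern starts with a dot.

    Modeled on the helper the middleware uses; treat the details as an
    assumption rather than a copy of Django's code.
    """
    if not pattern:
        return False
    pattern = pattern.lower()
    if pattern.startswith("."):
        return host.endswith(pattern) or host == pattern[1:]
    return pattern == host


def referer_allowed(referer, request_host, trusted_origins=()):
    """Hypothetical wrapper: True if the Referer would pass the HTTPS check."""
    if referer is None:
        return False  # corresponds to REASON_NO_REFERER
    parsed = urlparse(referer)
    if "" in (parsed.scheme, parsed.netloc):
        return False  # corresponds to REASON_MALFORMED_REFERER
    if parsed.scheme != "https":
        return False  # corresponds to REASON_INSECURE_REFERER
    good_hosts = list(trusted_origins) + [request_host]
    return any(is_same_domain(parsed.netloc, host) for host in good_hosts)


print(referer_allowed("https://example.com/form/", "example.com"))
print(referer_allowed("https://evil.test/", "example.com"))
```

A leading dot in a trusted origin, for example ".example.com", accepts the domain itself and all of its subdomains.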

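The middleware relies on the following helper functions, most of which deal with generating the CSRF token. In short, a CSRF token starts as a random string generated by the server (the secret), which is then combined with a random salt in a reversible way to produce a 64-character string.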

CSRF_SECRET_LENGTH = 32
CSRF_TOKEN_LENGTH = 2 * CSRF_SECRET_LENGTH
CSRF_ALLOWED_CHARS = string.ascii_letters + string.digits



def _sanitize_token(token):  # validate the token; return a new one if it is invalid
    # Allow only ASCII alphanumerics
    if re.search('[^a-zA-Z0-9]', force_text(token)):
        return _get_new_csrf_token()
    elif len(token) == CSRF_TOKEN_LENGTH:  # 64 characters: already a salted token
        return token
    elif len(token) == CSRF_SECRET_LENGTH:
        # Older Django versions set cookies to values of CSRF_SECRET_LENGTH
        # alphanumeric characters. For backwards compatibility, accept
        # such values as unsalted secrets.
        # It's easier to salt here and be consistent later, rather than add
        # different code paths in the checks, although that might be a tad more
        # efficient.
        return _salt_cipher_secret(token)  # older versions used 32-character unsalted secrets; add a 32-character salt
    return _get_new_csrf_token()




def _get_new_csrf_string():  # random 32-character string, used as a secret or as a salt
    return get_random_string(CSRF_SECRET_LENGTH, allowed_chars=CSRF_ALLOWED_CHARS)


def _salt_cipher_secret(secret):  # salt the secret to produce a token
    """
    Given a secret (assumed to be a string of CSRF_ALLOWED_CHARS), generate a
    token by adding a salt and using it to encrypt the secret.
    """
    salt = _get_new_csrf_string()
    chars = CSRF_ALLOWED_CHARS
    pairs = zip((chars.index(x) for x in secret), (chars.index(x) for x in salt))
    cipher = ''.join(chars[(x + y) % len(chars)] for x, y in pairs)
    return salt + cipher


def _compare_salted_tokens(request_csrf_token, csrf_token):  # compare the request token with the cookie token
    # Assume both arguments are sanitized -- that is, strings of
    # length CSRF_TOKEN_LENGTH, all CSRF_ALLOWED_CHARS.
    return constant_time_compare(  # compare the secrets recovered after removing the salts
        _unsalt_cipher_token(request_csrf_token),
        _unsalt_cipher_token(csrf_token),
    )
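
The listing above calls _unsalt_cipher_token() without showing it. The following standalone sketch shows how salting and unsalting could round-trip, assuming the unsalt step simply reverses the modular addition in _salt_cipher_secret(). The function names without a leading underscore are illustrative stand-ins, not Django's code, and `secrets` and `hmac` from the standard library replace Django's get_random_string() and constant_time_compare().

```python
import hmac
import secrets
import string

CSRF_SECRET_LENGTH = 32
CSRF_ALLOWED_CHARS = string.ascii_letters + string.digits


def new_secret():
    """Random 32-character string (stand-in for _get_new_csrf_string)."""
    return "".join(
        secrets.choice(CSRF_ALLOWED_CHARS) for _ in range(CSRF_SECRET_LENGTH)
    )


def salt_cipher_secret(secret):
    """Same arithmetic as _salt_cipher_secret: salt + (secret + salt) mod 62."""
    salt = new_secret()
    chars = CSRF_ALLOWED_CHARS
    pairs = zip((chars.index(x) for x in secret), (chars.index(x) for x in salt))
    cipher = "".join(chars[(x + y) % len(chars)] for x, y in pairs)
    return salt + cipher


def unsalt_cipher_token(token):
    """Assumed inverse: split off the salt and subtract it mod 62."""
    salt, cipher = token[:CSRF_SECRET_LENGTH], token[CSRF_SECRET_LENGTH:]
    chars = CSRF_ALLOWED_CHARS
    pairs = zip((chars.index(x) for x in cipher), (chars.index(x) for x in salt))
    return "".join(chars[(x - y) % len(chars)] for x, y in pairs)


def tokens_match(request_token, cookie_token):
    """Compare the recovered secrets in constant time."""
    return hmac.compare_digest(
        unsalt_cipher_token(request_token), unsalt_cipher_token(cookie_token)
    )


secret = new_secret()
cookie_token = salt_cipher_secret(secret)  # value stored in the cookie
form_token = salt_cipher_secret(secret)    # value rendered into the form
print(len(cookie_token), tokens_match(form_token, cookie_token))
```

Because each call draws a fresh salt, the cookie token and the form token normally look different even though they wrap the same secret, which is why the comparison is done on the unsalted values.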

rotate_token() is used mainly after a user logs in, when the CSRF token needs to be replaced.

def rotate_token(request):  # called by django.contrib.auth.login() to replace the CSRF token on login
    """
    Changes the CSRF token in use for a request - should be done on login
    for security purposes.
    """
    request.META.update({
        "CSRF_COOKIE_USED": True,
        "CSRF_COOKIE": _get_new_csrf_token(),
    })
    request.csrf_cookie_needs_reset = True

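Summary
On the first visit to a page the request carries no CSRF cookie, so the server generates a token, and process_response() sets it as a cookie on the response (provided the token was actually used, i.e. CSRF_COOKIE_USED is set).
On later visits the request carries this cookie.
From then on, {% csrf_token %} in a template renders a hidden csrfmiddlewaretoken field holding a salted copy of the same secret. process_view() removes the salt from both the submitted token and the cookie token and compares the results; if they differ, it returns 403 directly.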
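
Besides the form field, process_view() also accepts the token from a request header, which is what AJAX clients and PUT/DELETE requests rely on. The sketch below mirrors that lookup order using plain dictionaries in place of request.POST and request.META. Both function names are hypothetical; the header-to-META conversion follows the usual WSGI convention (prefix HTTP_, upper-case, hyphens to underscores).

```python
def header_to_meta_key(header_name):
    """Convert an HTTP header name to its WSGI environ / request.META key."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def extract_request_token(method, post_data, meta, header_key="HTTP_X_CSRFTOKEN"):
    """Hypothetical helper mirroring the lookup order in process_view()."""
    token = ""
    if method == "POST":
        # The form field is only consulted for POST requests.
        token = post_data.get("csrfmiddlewaretoken", "")
    if token == "":
        # Fall back to the header for AJAX and for PUT/DELETE.
        token = meta.get(header_key, "")
    return token


meta = {header_to_meta_key("X-CSRFToken"): "token-from-header"}
print(extract_request_token("POST", {"csrfmiddlewaretoken": "token-from-form"}, meta))
print(extract_request_token("PUT", {}, meta))
```

The form field takes precedence on POST; the header is consulted only when the form field is empty or the method is not POST.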
