Django source code analysis: the SessionMiddleware middleware

Environment: Python 3.5.2, Django 1.10.x.
This article describes how Django handles sessions through the SessionMiddleware middleware, focusing on a walkthrough of the SessionMiddleware source code.

The role of middleware was covered in the previous article on CsrfViewMiddleware, so it is not repeated here. Let's go straight to the source of SessionMiddleware.

The SessionMiddleware class has three main methods: __init__(), process_request() and process_response(). __init__() has not been covered before, so it is explained here.

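A middleware's __init__() is called once, when the server starts and each middleware class in the list is instantiated; it is not called again while requests are being handled.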

The source code follows, with brief annotations:

class SessionMiddleware(MiddlewareMixin):
    def __init__(self, get_response=None):
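        self.get_response = get_response                 # next handler in the chain (None under old-style MIDDLEWARE_CLASSES)
        engine = import_module(settings.SESSION_ENGINE)  # import the backend module, django.contrib.sessions.backends.db by default (explained below)
        self.SessionStore = engine.SessionStore          # keep a reference to the backend's SessionStore class (covered in detail below)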

    def process_request(self, request):
        session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME)  # SESSION_COOKIE_NAME defaults to 'sessionid'; read the session key from the cookie
        request.session = self.SessionStore(session_key)                 # build a SessionStore from that key and attach it to request.session

    def process_response(self, request, response):
        """
        If request.session was modified, or if the configuration is to save the
        session every time, save the changes and set a session cookie or delete
        the session cookie if the session has been emptied.
        """
        try:
            accessed = request.session.accessed          # True if the session was read during this request
            modified = request.session.modified          # True if the session was written during this request
            empty = request.session.is_empty()           # True if there is no session key and no session data
        except AttributeError:
            pass
        else:
            # First check if we need to delete this cookie.
            # The session should be deleted only if the session is entirely empty
            if settings.SESSION_COOKIE_NAME in request.COOKIES and empty:  # the client sent a session cookie but the session is now empty
                response.delete_cookie(                                    # so remove the session cookie from the client
                    settings.SESSION_COOKIE_NAME,
                    path=settings.SESSION_COOKIE_PATH,
                    domain=settings.SESSION_COOKIE_DOMAIN,
                )
            else:
                if accessed:
                    patch_vary_headers(response, ('Cookie',))              # add 'Cookie' to the response's Vary header
                if (modified or settings.SESSION_SAVE_EVERY_REQUEST) and not empty:  # save only if modified (or always-save is on) and not empty
                    if request.session.get_expire_at_browser_close():      # browser-length session: send the cookie without max_age/expires
                        max_age = None
                        expires = None
                    else:
                        max_age = request.session.get_expiry_age()         # session lifetime in seconds
                        expires_time = time.time() + max_age
                        expires = cookie_date(expires_time)
                    # Save the session data and refresh the client cookie.
                    # Skip session save for 500 responses, refs #3881.
                    if response.status_code != 500:
                        try:
                            request.session.save()                         # persist the session (the django_session table for the db backend) unless the status is 500
                        except UpdateError:                                # save() is important and is examined in detail below
                            # The user is now logged out; redirecting to same
                            # page will result in a redirect to the login page
                            # if required.
                            return redirect(request.path)
                        response.set_cookie(                               # write the session key into the response cookie
                            settings.SESSION_COOKIE_NAME,
                            request.session.session_key, max_age=max_age,
                            expires=expires, domain=settings.SESSION_COOKIE_DOMAIN,
                            path=settings.SESSION_COOKIE_PATH,
                            secure=settings.SESSION_COOKIE_SECURE or None,
                            httponly=settings.SESSION_COOKIE_HTTPONLY or None,
                        )
        return response

The points highlighted above are now analyzed in detail.

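    def __init__(self, get_response=None):
        self.get_response = get_response                 # next handler in the chain (None under old-style MIDDLEWARE_CLASSES)
        engine = import_module(settings.SESSION_ENGINE)  # import the backend module, django.contrib.sessions.backends.db by default (explained below)
        self.SessionStore = engine.SessionStore          # keep a reference to the backend's SessionStore class (covered in detail below)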

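For __init__(), the key question is what engine is, that is, the line engine = import_module(settings.SESSION_ENGINE).
import_module() is a function from Python's built-in importlib that imports a module given its dotted path as a string. Its code is shown below; we will not dig into it here, but interested readers can explore it.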

def import_module(name, package=None):
    """Import a module.

    The 'package' argument is required when performing a relative import. It
    specifies the package to use as the anchor point from which to resolve the
    relative import to an absolute import.

    """
    level = 0
    if name.startswith('.'):
        if not package:
            msg = ("the 'package' argument is required to perform a relative "
                   "import for {!r}")
            raise TypeError(msg.format(name))
        for character in name:
            if character != '.':
                break
            level += 1
    return _bootstrap._gcd_import(name[level:], package, level)
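
As a small illustration of the pattern used in __init__(), the sketch below resolves a dotted-path string to a module and then pulls a class off it. It uses the standard-library collections module instead of a Django backend so that it runs without a configured project; the ENGINE_PATH name is an assumption made for this example only.

```python
from importlib import import_module

# Hypothetical stand-in for settings.SESSION_ENGINE: any importable dotted path works.
ENGINE_PATH = "collections"

engine = import_module(ENGINE_PATH)      # the same kind of call SessionMiddleware.__init__ makes
store_class = engine.OrderedDict         # analogous to engine.SessionStore

store = store_class(user_id=42)
print(type(store).__name__, dict(store))
```

Because the backend is named by a string, swapping the storage mechanism only requires changing a setting, not the middleware code.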


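From this function we can tell that settings.SESSION_ENGINE must be a string. So what is that string? On seeing settings, most people immediately check the settings file of their project, but unless it was configured on purpose, the parameter is not there. So where is it? Read on.
When we look up where settings is defined, we find this line: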

settings = LazySettings()

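So settings is an instance of LazySettings(). This class loads every setting from the configured settings module on top of the defaults in global_settings. In case that file is unfamiliar: django/conf/global_settings.py holds Django's default value for every setting, and the settings file of our project only overrides some of them.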

So in global_settings we can find:

SESSION_ENGINE = 'django.contrib.sessions.backends.db'

self.SessionStore = engine.SessionStore simply stores the SessionStore class of that module on the middleware instance.

A note on the SessionStore class: it is the class that manages and stores sessions.
Django ships several session backends; the three covered here are:

SESSION_ENGINE = 'django.contrib.sessions.backends.db'
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
SESSION_ENGINE = 'django.contrib.sessions.backends.cached_db'

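As the names suggest, they are: database storage; cache storage (often backed by Redis); and cache plus database storage at the same time. Choose the one that fits the project.
This article focuses on the db backend.
Here is the initialization part of the SessionStore class: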

class SessionStore(SessionBase):
    """
    Implements database session store.
    """
    def __init__(self, session_key=None):
        super(SessionStore, self).__init__(session_key)      # delegate to SessionBase.__init__()


Now look at the initializer of the parent class:

class SessionBase(object):
    """
    Base class for all Session classes.
    """
    TEST_COOKIE_NAME = 'testcookie'
    TEST_COOKIE_VALUE = 'worked'

    __not_given = object()

    def __init__(self, session_key=None):
        self._session_key = session_key                                # the session key passed in (may be None)
        self.accessed = False                                          # becomes True once the session data is read
        self.modified = False                                          # becomes True once the session data is changed
        self.serializer = import_string(settings.SESSION_SERIALIZER)   # serializer class used to encode and decode session data

These are several important attributes of SessionStore that come up repeatedly below.
Next, the process_request() method:

class SessionMiddleware(MiddlewareMixin):
    def process_request(self, request):
        session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME)  # SESSION_COOKIE_NAME defaults to 'sessionid'; read the session key from the cookie
        request.session = self.SessionStore(session_key)                 # build a SessionStore from that key and attach it to request.session

This method is simple.
The first line reads session_key from request.COOKIES.
The second line passes session_key to the SessionStore class, instantiates it, and assigns the result to request.session, so request.session is now a SessionStore instance.

Setting a session value looks like request.session[SESSION_KEY] = user._meta.pk.value_to_string(user). Dict-style assignment on an object calls __setitem__(), and dict-style lookup calls __getitem__().
Here are these methods:

class SessionBase(object):
    """
    Base class for all Session classes.
    """
    TEST_COOKIE_NAME = 'testcookie'
    TEST_COOKIE_VALUE = 'worked'

    __not_given = object()

    def __init__(self, session_key=None):
        self._session_key = session_key
        self.accessed = False
        self.modified = False
        self.serializer = import_string(settings.SESSION_SERIALIZER)

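    def __contains__(self, key):                 # "key in session": membership test
        return key in self._session

    def __getitem__(self, key):                  # session[key]: read a value
        return self._session[key]

    def __setitem__(self, key, value):           # session[key] = value: write a value and mark the session modified
        self._session[key] = value
        self.modified = True

    def __delitem__(self, key):                  # del session[key]: remove a value and mark the session modified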
        del self._session[key]
        self.modified = True
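
To make the role of the modified flag concrete, here is a minimal sketch of the same dict-style protocol. ToySession is a hypothetical class written for this example, backed by a plain dict rather than Django's lazily loaded _session; it only shows which operations mark the session as changed.

```python
class ToySession:
    """Toy stand-in for SessionBase: wraps a dict and tracks modification."""

    def __init__(self):
        self._data = {}
        self.modified = False

    def __contains__(self, key):
        return key in self._data

    def __getitem__(self, key):
        return self._data[key]          # reading does not touch the modified flag

    def __setitem__(self, key, value):
        self._data[key] = value
        self.modified = True            # writing marks the session as changed

    def __delitem__(self, key):
        del self._data[key]
        self.modified = True            # deleting marks the session as changed


session = ToySession()
before = session.modified
session["user_id"] = "7"
print(before, session["user_id"], session.modified)
```

The middleware later uses this flag to decide whether the session needs to be saved at all.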

Take a closer look at __getitem__():

    def __getitem__(self, key):                  # session[key]: read a value
        return self._session[key]

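Looking up _session turns up the line _session = property(_get_session): _session is a property whose getter is _get_session(), so every read of self._session calls that method.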

    def _get_session(self, no_load=False):
        """
        Lazily loads session from storage (unless "no_load" is True, when only
        an empty dict is stored) and stores it in the current instance.
        """
        self.accessed = True                               # reading the data marks the session as accessed
        try:
            return self._session_cache                     # reuse the data already cached on this instance
        except AttributeError:
            if self.session_key is None or no_load:        # no session key, or loading is disabled: start from an empty dict
                self._session_cache = {}
            else:
                self._session_cache = self.load()          # otherwise load the data from the backend
        return self._session_cache

    _session = property(_get_session)

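To walk through the code above: if the instance already has a _session_cache attribute, it is returned immediately. On the first access in a request the attribute does not exist yet, so the except branch runs. When the request carried a session cookie, self.session_key is the key that was passed in, so self._session_cache = self.load() is executed; when there is no key, an empty dict is used instead.
Now look at self.load():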

    def load(self):
        try:
            s = self.model.objects.get(                          # look up an unexpired row with this session key
                session_key=self.session_key,
                expire_date__gt=timezone.now()
            )
            return self.decode(s.session_data)                   # found: decode the stored data and return it
        except (self.model.DoesNotExist, SuspiciousOperation) as e:
            if isinstance(e, SuspiciousOperation):
                logger = logging.getLogger('django.security.%s' % e.__class__.__name__)
                logger.warning(force_text(e))
            self._session_key = None                             # not found (or tampered with): drop the key and return an empty dict
            return {}

As the analysis shows, load() reads the session data from the database. Here self.model is the Session model defined in django.contrib.sessions.models:

    @classmethod
    def get_model_class(cls):
        # Avoids a circular import and allows importing SessionStore when
        # django.contrib.sessions is not in INSTALLED_APPS.
        from django.contrib.sessions.models import Session       # import the model class
        return Session

    @cached_property
    def model(self):
        return self.get_model_class()
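
The lazy-loading behavior of _get_session() and load() can be sketched without a database. In the example below, FAKE_DB, LazySession and the load_calls counter are all hypothetical names introduced for illustration; a dict plays the role of the django_session table.

```python
FAKE_DB = {"abc123": {"user_id": "7"}}   # hypothetical stand-in for django_session rows


class LazySession:
    def __init__(self, session_key=None):
        self.session_key = session_key
        self.accessed = False
        self.load_calls = 0               # counts simulated database reads

    def load(self):
        self.load_calls += 1
        return dict(FAKE_DB.get(self.session_key, {}))

    def _get_session(self):
        self.accessed = True
        try:
            return self._session_cache    # already loaded during this request
        except AttributeError:
            if self.session_key is None:
                self._session_cache = {}  # no key: nothing to load
            else:
                self._session_cache = self.load()
        return self._session_cache

    _session = property(_get_session)


s = LazySession("abc123")
first = s._session["user_id"]
second = s._session["user_id"]            # served from _session_cache, no second load
anonymous = LazySession()
print(first, second, s.load_calls, anonymous._session)
```

However many times the session is read within one request, storage is hit at most once, and a request without a session key never hits it.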

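As process_request() and the methods above show, setting a value on the session does not store it right away; the value only lives in the in-memory _session_cache of the SessionStore instance. The save() call in process_response() is therefore essential: it is the step that writes the session data to the database.
Let's go through process_response() piece by piece.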

    def process_response(self, request, response):
        """
        If request.session was modified, or if the configuration is to save the
        session every time, save the changes and set a session cookie or delete
        the session cookie if the session has been emptied.
        """
        try:
            accessed = request.session.accessed          # True if the session was read during this request
            modified = request.session.modified          # True if the session was written during this request
            empty = request.session.is_empty()           # True if there is no session key and no session data
        except AttributeError:
            pass

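Start with the three values in the try block. The first two have already been introduced; as for is_empty(), here is the code:

    def is_empty(self):
        "Returns True when there is no session_key and the session is empty"   # True only when there is no session key and the cached data is empty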
        try:
            return not bool(self._session_key) and not self._session_cache
        except AttributeError:
            return True
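
Note that the condition is an "and", not an "or". The sketch below isolates the boolean expression into a standalone function (a simplification for illustration; the real method reads the values from the instance) and evaluates it for three cases.

```python
def is_empty(session_key, session_cache):
    # Mirrors the boolean expression in SessionBase.is_empty():
    # empty only when there is no key AND no cached data.
    return not bool(session_key) and not session_cache


cases = {
    "no key, no data": is_empty(None, {}),
    "no key, has data": is_empty(None, {"cart": [1]}),
    "has key, no data": is_empty("abc123", {}),
}
for label, result in cases.items():
    print(label, "->", result)
```

Only the first case counts as empty: a brand-new session that already holds data still has to be saved, and a session with a key still exists in storage.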

Next comes the code after else in process_response(), which runs when the try block raised no exception:

        else:
            # First check if we need to delete this cookie.
            # The session should be deleted only if the session is entirely empty
            if settings.SESSION_COOKIE_NAME in request.COOKIES and empty:  # the client sent a session cookie but the session is now empty
                response.delete_cookie(                                    # so remove the session cookie from the client
                    settings.SESSION_COOKIE_NAME,
                    path=settings.SESSION_COOKIE_PATH,
                    domain=settings.SESSION_COOKIE_DOMAIN,
                )
            else:
                if accessed:
                    patch_vary_headers(response, ('Cookie',))              # add 'Cookie' to the response's Vary header
                if (modified or settings.SESSION_SAVE_EVERY_REQUEST) and not empty:  # save only if modified (or always-save is on) and not empty
                    if request.session.get_expire_at_browser_close():      # browser-length session: send the cookie without max_age/expires
                        max_age = None
                        expires = None
                    else:
                        max_age = request.session.get_expiry_age()         # session lifetime in seconds
                        expires_time = time.time() + max_age
                        expires = cookie_date(expires_time)
                    # Save the session data and refresh the client cookie.
                    # Skip session save for 500 responses, refs #3881.
                    if response.status_code != 500:
                        try:
                            request.session.save()                         # persist the session (the django_session table for the db backend) unless the status is 500
                        except UpdateError:                                # save() is important and is examined in detail below
                            # The user is now logged out; redirecting to same
                            # page will result in a redirect to the login page
                            # if required.
                            return redirect(request.path)
                        response.set_cookie(                               # write the session key into the response cookie
                            settings.SESSION_COOKIE_NAME,
                            request.session.session_key, max_age=max_age,
                            expires=expires, domain=settings.SESSION_COOKIE_DOMAIN,
                            path=settings.SESSION_COOKIE_PATH,
                            secure=settings.SESSION_COOKIE_SECURE or None,
                            httponly=settings.SESSION_COOKIE_HTTPONLY or None,
                        )
        return response

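Two parts of this code deserve a closer look. The first is:

    if accessed:
        patch_vary_headers(response, ('Cookie',))    # add 'Cookie' to the response's Vary header

If accessed is True, meaning the session was read during the request, patch_vary_headers() is called. Here is its code, to see what it does: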

def patch_vary_headers(response, newheaders):
    """
    Adds (or updates) the "Vary" header in the given HttpResponse object.
    newheaders is a list of header names that should be in "Vary". Existing
    headers in "Vary" aren't removed.
    (Author's note: the header names listed in newheaders are added to "Vary"; entries already present are kept.)
    """
    # Note that we need to keep the original order intact, because cache
    # implementations may rely on the order of the Vary contents in, say,
    # computing an MD5 hash.
    if response.has_header('Vary'):
        vary_headers = cc_delim_re.split(response['Vary'])       # split the existing Vary value into a list
    else:
        vary_headers = []
    # Use .lower() here so we treat headers as case-insensitive.
    existing_headers = set(header.lower() for header in vary_headers)              
    additional_headers = [newheader for newheader in newheaders         # new header names that are not already present
                          if newheader.lower() not in existing_headers]
    response['Vary'] = ', '.join(vary_headers + additional_headers)     # write the merged list back to response['Vary']

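Vary is an HTTP response header. It tells caches which request headers, besides the URL, must match before a cached response may be reused for a later request instead of fetching a fresh one from the origin server. The server uses it to indicate which headers took part in content negotiation when a representation of the resource was selected. Because a response that read the session depends on the session cookie, the middleware adds Cookie to Vary.
An example:
When to use Vary with User-Agent: if you serve different content to mobile clients, it prevents a cache from handing a desktop response to a mobile client. It also helps Google and other search engines discover the mobile version of a page and tells them that no cloaking is intended.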

Vary: User-Agent
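
The merging logic of patch_vary_headers() can be tried out in isolation. The sketch below is a simplified re-implementation for illustration, not Django's function: a plain dict stands in for the HttpResponse object, and patch_vary is a name made up for this example.

```python
import re

cc_delim_re = re.compile(r"\s*,\s*")     # splits a comma-separated header value


def patch_vary(headers, new_headers):
    """Merge new_headers into headers['Vary'], keeping order, case-insensitively."""
    existing = cc_delim_re.split(headers["Vary"]) if "Vary" in headers else []
    seen = {h.lower() for h in existing}
    extra = [h for h in new_headers if h.lower() not in seen]
    headers["Vary"] = ", ".join(existing + extra)
    return headers


print(patch_vary({}, ("Cookie",)))
print(patch_vary({"Vary": "Accept-Encoding"}, ("Cookie",)))
print(patch_vary({"Vary": "accept-encoding, cookie"}, ("Cookie",)))
```

The third call leaves the header unchanged because the comparison ignores case, so Cookie is never listed twice.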

Next, the other parts of process_response():

    if (modified or settings.SESSION_SAVE_EVERY_REQUEST) and not empty:    # save only if modified (or always-save is on) and not empty
        if request.session.get_expire_at_browser_close():                  # browser-length session: send the cookie without max_age/expires
            max_age = None
            expires = None
        else:
            max_age = request.session.get_expiry_age()                     # session lifetime in seconds
            expires_time = time.time() + max_age
            expires = cookie_date(expires_time)

The code of get_expire_at_browser_close() and get_expiry_age() is as follows:

    def get_expire_at_browser_close(self):
        """
        Returns ``True`` if the session is set to expire when the browser
        closes, and ``False`` if there's an expiry date. Use
        ``get_expiry_date()`` or ``get_expiry_age()`` to find the actual expiry
        date/age, if there is one.
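        (Author's note: returns True when the session is set to expire as soon as the browser closes.)
        """
        if self.get('_session_expiry') is None:                 # no per-session expiry has been set
            return settings.SESSION_EXPIRE_AT_BROWSER_CLOSE     # fall back to the global setting, which defaults to False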
        return self.get('_session_expiry') == 0

    def get_expiry_age(self, **kwargs):
        """Get the number of seconds until the session expires.
        (Author's note: that is, the remaining lifetime of the session.)
        Optionally, this function accepts `modification` and `expiry` keyword
        arguments specifying the modification and expiry of the session.
        """
        try:
            modification = kwargs['modification']
        except KeyError:
            modification = timezone.now()
        # Make the difference between "expiry=None passed in kwargs" and
        # "expiry not passed in kwargs", in order to guarantee not to trigger
        # self.load() when expiry is provided.
        try:
            expiry = kwargs['expiry']
        except KeyError:
            expiry = self.get('_session_expiry')

        if not expiry:   # Checks both None and 0 cases
            return settings.SESSION_COOKIE_AGE
        if not isinstance(expiry, datetime):
            return expiry
        delta = expiry - modification
        return delta.days * 86400 + delta.seconds
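
The three branches of get_expiry_age() are easier to follow with concrete values. The function below is a standalone sketch for illustration: it takes the expiry value and the current time as arguments instead of reading them from the session, and DEFAULT_COOKIE_AGE is assumed to equal Django's default SESSION_COOKIE_AGE of two weeks.

```python
from datetime import datetime, timedelta

DEFAULT_COOKIE_AGE = 1209600  # assumed: Django's default SESSION_COOKIE_AGE (two weeks, in seconds)


def expiry_age(expiry, now, default_age=DEFAULT_COOKIE_AGE):
    """Seconds until expiry, following the branches of get_expiry_age()."""
    if not expiry:                        # None or 0: fall back to the global default
        return default_age
    if not isinstance(expiry, datetime):  # an int is already a number of seconds
        return expiry
    delta = expiry - now                  # a datetime: time remaining from now
    return delta.days * 86400 + delta.seconds


now = datetime(2017, 1, 1, 12, 0, 0)
print(expiry_age(None, now))
print(expiry_age(300, now))
print(expiry_age(now + timedelta(hours=1), now))
```

The middleware then adds this number of seconds to time.time() to build the cookie's expires value.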

The following snippet is the main focus now:

    if response.status_code != 500:
          try:
              request.session.save()
          except UpdateError:
              # The user is now logged out; redirecting to same
              # page will result in a redirect to the login page
              # if required.
              return redirect(request.path)

When the response status code is not 500, request.session.save() is called to persist the session. Here is a brief look at how save() on the SessionStore object works:

    def save(self, must_create=False):
        """
        Saves the current session data to the database. If 'must_create' is
        True, a database error will be raised if the saving operation doesn't
        create a *new* entry (as opposed to possibly updating an existing
        entry).
        """
        if self.session_key is None:
            return self.create()                                 # first visit: there is no session key yet, so create one
        data = self._get_session(no_load=must_create)
        obj = self.create_model_instance(data)
        using = router.db_for_write(self.model, instance=obj)
        try:
            with transaction.atomic(using=using):
                obj.save(force_insert=must_create, force_update=not must_create, using=using)
        except IntegrityError:
            if must_create:
                raise CreateError
            raise
        except DatabaseError:
            if not must_create:
                raise UpdateError
            raise

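Let's go through the code above step by step. On the first visit there is no session key yet, although the values set on the session are already in the in-memory cache, so a session key has to be generated:

    def create(self):
        while True:
            self._session_key = self._get_new_session_key()    # obtain a new, unused session key
            try:
                # Save immediately to ensure we have a unique entry in the
                # database.
                self.save(must_create=True)                    # with the key in place, call save() again, this time forcing an INSERT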
            except CreateError:
                # Key wasn't unique. Try again.
                continue
            self.modified = True
            return

create() mainly calls self._get_new_session_key() to generate the session key:

    def _get_new_session_key(self):
        "Returns session key that isn't being used."
        while True:
            session_key = get_random_string(32, VALID_KEY_CHARS)   # generate a random 32-character string
            if not self.exists(session_key):                       # if no session uses this key yet, return it
                break
        return session_key



def get_random_string(length=12,
                      allowed_chars='abcdefghijklmnopqrstuvwxyz'
                                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'):
    """
    Returns a securely generated random string.

    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
    """
    if not using_sysrandom:
        # This is ugly, and a hack, but it makes things better than
        # the alternative of predictability. This re-seeds the PRNG
        # using a value that is hard for an attacker to predict, every
        # time a random string is required. This may change the
        # properties of the chosen random sequence slightly, but this
        # is better than absolute predictability.
        random.seed(                      # re-seed the PRNG with a SHA-256 digest of its state, the current time and SECRET_KEY
            hashlib.sha256(
                ("%s%s%s" % (
                    random.getstate(),
                    time.time(),
                    settings.SECRET_KEY)).encode('utf-8')
            ).digest())
    return ''.join(random.choice(allowed_chars) for i in range(length))
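
Key generation boils down to "draw random characters until the result is unused". The sketch below illustrates that idea only: new_session_key and the existing_keys set are hypothetical, the set replaces the database lookup done by exists(), and VALID_KEY_CHARS is assumed to be lowercase letters plus digits as in Django.

```python
import random
import string

# Assumed to match Django's VALID_KEY_CHARS (lowercase letters plus digits).
VALID_KEY_CHARS = string.ascii_lowercase + string.digits
_sysrandom = random.SystemRandom()       # OS-backed randomness, no manual seeding needed


def new_session_key(existing_keys, length=32):
    """Draw random keys until one is not already in use."""
    while True:
        key = "".join(_sysrandom.choice(VALID_KEY_CHARS) for _ in range(length))
        if key not in existing_keys:
            return key


key = new_session_key(existing_keys=set())
print(len(key))
```

With 36 possible characters in each of 32 positions, a collision is extremely unlikely, so the loop almost always finishes on the first pass.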

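Finally we reach get_random_string(), which for a session key produces a random 32-character string.
Once the key has been generated, control returns to create(), which calls self.save(must_create=True). We are back in save(), now with must_create set to True. Let's see how save() behaves in that case:

    def save(self, must_create=False):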
        """
        Saves the current session data to the database. If 'must_create' is
        True, a database error will be raised if the saving operation doesn't
        create a *new* entry (as opposed to possibly updating an existing
        entry).
        """
        if self.session_key is None:
            return self.create()                                 # the session key now exists, so this branch is skipped
        data = self._get_session(no_load=must_create)            # fetch the cached session data
        obj = self.create_model_instance(data)                   # build a model instance from the data
        using = router.db_for_write(self.model, instance=obj)
        try:
            with transaction.atomic(using=using):
                obj.save(force_insert=must_create, force_update=not must_create, using=using)
        except IntegrityError:
            if must_create:
                raise CreateError
            raise
        except DatabaseError:
            if not must_create:
                raise UpdateError
            raise

Look at the next line:

    data = self._get_session(no_load=must_create)

This takes us back to _get_session(), whose job here is to return the data held in _session_cache:

    def _get_session(self, no_load=False):
        """
        Lazily loads session from storage (unless "no_load" is True, when only
        an empty dict is stored) and stores it in the current instance.
        """
        self.accessed = True
        try:
            return self._session_cache
        except AttributeError:
            if self.session_key is None or no_load:
                self._session_cache = {}
            else:
                self._session_cache = self.load()
        return self._session_cache

    _session = property(_get_session)

Then the next line:

    obj = self.create_model_instance(data)              # build a model instance from the data

Here is the create_model_instance() method:

    def create_model_instance(self, data):
        """
        Return a new instance of the session model object, which represents the
        current session state. Intended to be used for saving the session data
        to the database.
        """
        return self.model(
            session_key=self._get_or_create_session_key(),
            session_data=self.encode(data),
            expire_date=self.get_expiry_date(),
        )

This code is straightforward: it creates a model instance. The model, as mentioned earlier, is Session, whose database table is named django_session.
Now the next line:

using = router.db_for_write(self.model, instance=obj)

Here is the db_for_write() method:

    def _router_func(action):
        def _route_db(self, model, **hints):
            chosen_db = None
            for router in self.routers:
                try:
                    method = getattr(router, action)
                except AttributeError:
                    # If the router doesn't have a method, skip to the next one.
                    pass
                else:
                    chosen_db = method(model, **hints)
                    if chosen_db:
                        return chosen_db
            instance = hints.get('instance')
            if instance is not None and instance._state.db:
                return instance._state.db
            return DEFAULT_DB_ALIAS
        return _route_db

    db_for_write = _router_func('db_for_write')

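That is where the call ends up. To be honest, I do not fully understand the details of this method, so feel free to share your thoughts in the comments.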
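
One way to read the listing: each configured router is asked in turn, the first one that returns a database alias wins, and if none has an opinion the database of the instance (if any) or the default alias is used. The sketch below is a simplified model of that lookup, not Django's code: the router classes and route_db_for_write are hypothetical, models are represented by plain strings, and instance.db replaces instance._state.db.

```python
DEFAULT_DB_ALIAS = "default"


class SessionRouter:
    """Hypothetical router: sends the session model to a dedicated database."""

    def db_for_write(self, model, **hints):
        return "sessions_db" if model == "Session" else None


class ReadOnlyRouter:
    """Hypothetical router with no db_for_write method; it is simply skipped."""


def route_db_for_write(routers, model, **hints):
    for router in routers:
        method = getattr(router, "db_for_write", None)
        if method is None:
            continue                      # this router has no opinion on writes
        chosen = method(model, **hints)
        if chosen:
            return chosen                 # the first router with an answer wins
    instance = hints.get("instance")
    if instance is not None and getattr(instance, "db", None):
        return instance.db                # reuse the database the object came from
    return DEFAULT_DB_ALIAS


routers = [ReadOnlyRouter(), SessionRouter()]
print(route_db_for_write(routers, "Session"))
print(route_db_for_write(routers, "Article"))
print(route_db_for_write([], "Session"))
```

In a project without any database routers, the result is therefore always the default database.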
Now for the contents of the try block.

    try:
        with transaction.atomic(using=using):
            obj.save(force_insert=must_create, force_update=not must_create, using=using)
    except IntegrityError:

This writes the row inside a database transaction: an INSERT when must_create is True, otherwise an UPDATE.
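
The interplay between save() and create() traced above can be reproduced with an in-memory dict in place of the database. Everything in this sketch is hypothetical and simplified: ToyStore is not Django's class, key_source feeds predictable keys instead of random ones, and the exists() pre-check is skipped so that a key collision becomes visible.

```python
class CreateError(Exception):
    pass


class UpdateError(Exception):
    pass


class ToyStore:
    table = {}                            # hypothetical stand-in for the django_session table

    def __init__(self, key_source, session_key=None):
        self._keys = iter(key_source)     # predictable "random" keys for the demo
        self.session_key = session_key
        self.data = {}

    def save(self, must_create=False):
        if self.session_key is None:
            return self.create()          # no key yet: create() will call save() again
        exists = self.session_key in self.table
        if must_create and exists:
            raise CreateError             # the INSERT hit a duplicate key
        if not must_create and not exists:
            raise UpdateError             # the UPDATE found no row
        self.table[self.session_key] = dict(self.data)

    def create(self):
        while True:
            self.session_key = next(self._keys)
            try:
                self.save(must_create=True)
            except CreateError:
                continue                  # key collision: draw another key
            return


ToyStore.table["taken"] = {}
store = ToyStore(key_source=["taken", "fresh"])
store.data["user_id"] = "7"
store.save()
print(store.session_key, ToyStore.table["fresh"])
```

The first key collides with an existing row, so create() retries and the data ends up under the second key. The UpdateError branch corresponds to the case handled in process_response(), where the row was deleted while the request was in progress.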

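Summary

1. The above is the complete workflow of SessionMiddleware.
2. __init__() mainly loads the SessionStore class.
3. process_request() creates a SessionStore instance and assigns it to request.session. Using the session key sent with the request, the server can read the session data or assign to it; the data is held in the in-memory _session_cache.
4. process_response() generates a session key on the first request and persists the contents of _session_cache.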
