Reading the Django Source (1): Project Generation and Startup

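Honestly, up to now I have not been a fan of Django. As I see it, the design is not especially elegant; it is a "mature solution" built by piling up features. But everything that rises to prominence was chosen by its era: however much you dislike it, it is needed. I hope that one day Python will have more and richer mature solutions, and will no longer be criticized for performance and maintainability. (End of rant.)
Taking the good and leaving the bad: Django's strength is convenience, and the goal of this source reading is to find out where that convenience comes from. The plan is not to examine every detail but to walk through the code roughly one feature at a time.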

About 80,000 lines of Python

Running django-admin startproject HelloWorld generates a Django project. (On Windows, the django-admin command is installed as an .exe launcher.)

The generated project

The template it is generated from lives here (django/conf/project_template)

Once the project is created, manage.py takes over. The development server is started with python manage.py runserver

What does manage.py do?

execute_from_command_line(sys.argv)

manage.py hands its arguments over to the command-line parser.

def execute_from_command_line(argv=None):
    """Run a ManagementUtility."""
    utility = ManagementUtility(argv)
    utility.execute()

execute_from_command_line() builds a management utility object from the command-line arguments and then calls its execute() method.

class ManagementUtility:
    """
    Encapsulate the logic of the django-admin and manage.py utilities.
    """
    def __init__(self, argv=None):
        self.argv = argv or sys.argv[:]
        self.prog_name = os.path.basename(self.argv[0])
        if self.prog_name == '__main__.py':
            self.prog_name = 'python -m django'
        self.settings_exception = None

    ......

    def execute(self):
        """
        Given the command-line arguments, figure out which subcommand is being
        run, create a parser appropriate to that command, and run it.
        """

        ......

        if settings.configured:
            if subcommand == 'runserver' and '--noreload' not in self.argv:
                try:
                    autoreload.check_errors(django.setup)()
                except Exception:

                    ......

            else:
                django.setup()

        self.autocomplete()

        .......

If the auto-reloader is enabled (that is, runserver without --noreload), django.setup is run through check_errors before startup.

def check_errors(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global _exception
        try:
            fn(*args, **kwargs)
        except Exception:
            _exception = sys.exc_info()

            et, ev, tb = _exception

            if getattr(ev, 'filename', None) is None:
                # get the filename from the last item in the stack
                filename = traceback.extract_tb(tb)[-1][0]
            else:
                filename = ev.filename

            if filename not in _error_files:
                _error_files.append(filename)

            raise

    return wrapper

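check_errors() is a decorator: it returns a wrapper function (a closure over fn), which is why the call above is written check_errors(django.setup)(). The first call builds the wrapper, the second call runs it.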

setup()

def setup(set_prefix=True):
    """
    Configure the settings (this happens as a side effect of accessing the
    first setting), configure logging and populate the app registry.
    Set the thread-local urlresolvers script prefix if `set_prefix` is True.
    """
    from django.apps import apps
    from django.conf import settings
    from django.urls import set_script_prefix
    from django.utils.log import configure_logging

    configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
    if set_prefix:
        set_script_prefix(
            '/' if settings.FORCE_SCRIPT_NAME is None else settings.FORCE_SCRIPT_NAME
        )
    apps.populate(settings.INSTALLED_APPS)

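Go straight to the last statement, which uses settings.INSTALLED_APPS to pull the apps out of settings.
Note that this settings is not yet the settings.py in our project. It is an object defined in django\conf\__init__.py: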

settings = LazySettings()

ENVIRONMENT_VARIABLE = "DJANGO_SETTINGS_MODULE"
......
class LazySettings(LazyObject):
    def _setup(self, name=None):
        settings_module = os.environ.get(ENVIRONMENT_VARIABLE)
        if not settings_module:
            ......

        self._wrapped = Settings(settings_module)

    def __getattr__(self, name):
        """Return the value of a setting and cache it in self.__dict__."""
        if self._wrapped is empty:
            self._setup(name)
        val = getattr(self._wrapped, name)
        self.__dict__[name] = val
        return val

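This is a lazy-loading wrapper around the Settings class: nothing is initialized until __getattr__ is asked for a value. The value is then read from the Settings instance and also stored in the wrapper's own __dict__, so the next lookup finds it directly on the object (__getattr__ is only called when normal attribute lookup fails).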
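The caching behaviour is easy to check in isolation. The sketch below is a simplified imitation, not Django's LazyObject: DemoSettings, LazyDemoSettings and the setup_calls counter are names invented here to make the single initialization visible.

```python
class _Empty:
    """Sentinel type meaning 'not initialized yet'."""


empty = _Empty()


class DemoSettings:
    """Stand-in for the real Settings class; the values are made up."""

    def __init__(self):
        self.DEBUG = True
        self.INSTALLED_APPS = ["demo_app"]


class LazyDemoSettings:
    def __init__(self):
        # Store bookkeeping in __dict__ so normal lookup finds it
        # and __getattr__ is never triggered for these names.
        self.__dict__["_wrapped"] = empty
        self.__dict__["setup_calls"] = 0

    def _setup(self):
        self.__dict__["setup_calls"] += 1
        self.__dict__["_wrapped"] = DemoSettings()

    def __getattr__(self, name):
        # Reached only when normal attribute lookup has already failed.
        if self._wrapped is empty:
            self._setup()
        value = getattr(self._wrapped, name)
        self.__dict__[name] = value  # cache: later lookups skip __getattr__
        return value


settings = LazyDemoSettings()
calls_before = settings.setup_calls   # nothing has been loaded yet
debug = settings.DEBUG                # first access triggers _setup()
apps = settings.INSTALLED_APPS        # reuses the already wrapped object
calls_after = settings.setup_calls
```

After the first access, DEBUG sits in the wrapper's __dict__, which is why the second read of the same name never goes through __getattr__ again.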

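To make debugging easier, we write a run.py instead of going through the command line.
Create run.py in the project directory to simulate the runserver command: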

import os
from django.core.management import execute_from_command_line

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'HelloWorld.settings')
execute_from_command_line(['run.py', 'runserver'])

Use the debugger to inspect settings_module

The project's settings.py

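It is the settings file we specified in manage.py (and in run.py).
Next, the defaults from global_settings are applied, and then every uppercase item in settings.py is read out and set on the Settings instance, overriding the defaults.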

class Settings:
    def __init__(self, settings_module):
        for setting in dir(global_settings):
            if setting.isupper():
                setattr(self, setting, getattr(global_settings, setting))

        ......

        mod = importlib.import_module(self.settings_module)

        ......

        for setting in dir(mod):
            if setting.isupper():
                setting_value = getattr(mod, setting)
                ......
                setattr(self, setting, setting_value)
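The same "defaults first, project values second, uppercase names only" rule can be tried without Django. This is a rough sketch under that assumption; build_settings and both throwaway modules are hypothetical and exist only for the demonstration.

```python
import types


def build_settings(defaults, overrides):
    """Copy uppercase names from defaults, then from overrides."""
    merged = types.SimpleNamespace()
    for module in (defaults, overrides):
        for name in dir(module):
            if name.isupper():
                setattr(merged, name, getattr(module, name))
    return merged


# Two in-memory modules standing in for global_settings and settings.py.
global_defaults = types.ModuleType("global_defaults")
global_defaults.DEBUG = False
global_defaults.TIME_ZONE = "UTC"

project_settings = types.ModuleType("project_settings")
project_settings.DEBUG = True
project_settings.helper = "lowercase names are ignored"

merged = build_settings(global_defaults, project_settings)
```

Here merged.DEBUG comes from the project module, merged.TIME_ZONE from the defaults, and helper is skipped because its name is not uppercase.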

Back to the last statement of setup(), apps.populate(settings.INSTALLED_APPS).
Let's look at apps.populate():

    def populate(self, installed_apps=None):
        """
        Load application configurations and models.

        Import each application module and then each model module.

        It is thread-safe and idempotent, but not reentrant.
        """

        ......

            # Phase 1: initialize app configs and import app modules.
            for entry in installed_apps:
                if isinstance(entry, AppConfig):
                    app_config = entry
                else:
                    app_config = AppConfig.create(entry)
                if app_config.label in self.app_configs:
                    raise ImproperlyConfigured(
                        "Application labels aren't unique, "
                        "duplicates: %s" % app_config.label)

                self.app_configs[app_config.label] = app_config
                app_config.apps = self

            # Check for duplicate app names.
            counts = Counter(
                app_config.name for app_config in self.app_configs.values())
            duplicates = [
                name for name, count in counts.most_common() if count > 1]
            if duplicates:
                raise ImproperlyConfigured(
                    "Application names aren't unique, "
                    "duplicates: %s" % ", ".join(duplicates))

            self.apps_ready = True

            # Phase 2: import models modules.
            for app_config in self.app_configs.values():
                app_config.import_models()

            self.clear_cache()

            self.models_ready = True

            # Phase 3: run ready() methods of app configs.
            for app_config in self.get_app_configs():
                app_config.ready()

            self.ready = True
            self.ready_event.set()

First, look at this part:

if isinstance(entry, AppConfig):
        app_config = entry
else:
        app_config = AppConfig.create(entry)

Every app ends up wrapped in an AppConfig and stored in the self.app_configs dictionary.

After that, import_models() is called on each AppConfig, and then ready() is called on each one.
That is roughly how apps are loaded.
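The three phases can be imitated with a toy registry. This is only a sketch of the control flow: DemoAppConfig and DemoRegistry are invented names, nothing is actually imported, and the events list exists purely to show the order of calls.

```python
class DemoAppConfig:
    """Toy stand-in for AppConfig; it records calls instead of importing."""

    def __init__(self, name):
        self.name = name
        self.label = name.rpartition(".")[2]  # last component of the dotted name
        self.events = []

    def import_models(self):
        self.events.append("models")

    def ready(self):
        self.events.append("ready")


class DemoRegistry:
    def __init__(self):
        self.app_configs = {}
        self.ready = False

    def populate(self, installed_apps):
        if self.ready:  # idempotent: a second call does nothing
            return
        # Phase 1: wrap every entry in a config object, rejecting duplicate labels.
        for entry in installed_apps:
            config = entry if isinstance(entry, DemoAppConfig) else DemoAppConfig(entry)
            if config.label in self.app_configs:
                raise ValueError("duplicate app label: %s" % config.label)
            self.app_configs[config.label] = config
        # Phase 2: load models for every app before any ready() hook runs.
        for config in self.app_configs.values():
            config.import_models()
        # Phase 3: run the ready() hooks.
        for config in self.app_configs.values():
            config.ready()
        self.ready = True


registry = DemoRegistry()
registry.populate(["pkg.blog", DemoAppConfig("shop")])
labels = list(registry.app_configs)
blog_events = registry.app_configs["blog"].events
```

Splitting the work into separate passes means every app's models are loaded before the first ready() hook runs.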

Back to execute()

def execute(self):

        ......

        self.autocomplete()

        ......

        self.fetch_command(subcommand).run_from_argv(self.argv)

To make debugging easier, we rewrite the last statement:

        res = self.fetch_command(subcommand)
        res.run_from_argv(self.argv)

res is a Command instance.
The second statement is the important one. Jumping into run_from_argv(), we see that it does some processing of the arguments.

    def run_from_argv(self, argv):

        ......

        cmd_options = vars(options)

        ......

        try:
            self.execute(*args, **cmd_options)
        except Exception as e:

            ......

            sys.exit(1)
        finally:

            ......

    def execute(self, *args, **options):

        ......

        output = self.handle(*args, **options)

        if output:
            ......

        return output

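In PyCharm, clicking handle here jumps to the base-class method, which is not where execution actually goes. The subclass Command (the runserver command) overrides this method: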

    def handle(self, *args, **options):
        ......
        self.run(**options)
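The chain run_from_argv() → execute() → handle() is a template-method pattern: the base class fixes the order of the steps and the subclass supplies handle(). A stripped-down sketch, with DemoBaseCommand and DemoRunserver as made-up names and argparse doing the parsing:

```python
import argparse


class DemoBaseCommand:
    def add_arguments(self, parser):
        """Hook for subclasses to declare their options."""

    def run_from_argv(self, argv):
        # argv looks like ['manage.py', 'runserver', ...options]
        parser = argparse.ArgumentParser(prog="%s %s" % (argv[0], argv[1]))
        self.add_arguments(parser)
        options = parser.parse_args(argv[2:])
        return self.execute(**vars(options))

    def execute(self, **options):
        # The base class calls handle(); which handle() runs is decided
        # by the type of self at runtime, not by where this line is written.
        return self.handle(**options)

    def handle(self, **options):
        raise NotImplementedError("subclasses must provide handle()")


class DemoRunserver(DemoBaseCommand):
    def add_arguments(self, parser):
        parser.add_argument("--noreload", action="store_false", dest="use_reloader")

    def handle(self, **options):
        return "reloader on" if options["use_reloader"] else "reloader off"


default_run = DemoRunserver().run_from_argv(["manage.py", "runserver"])
no_reload = DemoRunserver().run_from_argv(["manage.py", "runserver", "--noreload"])
```

This is also why an IDE's "go to definition" lands on the base method: statically, self.handle is declared there, even though the override is what runs.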

Now comes the key part.

    def run(self, **options):
        use_reloader = options['use_reloader']

        if use_reloader:
            autoreload.run_with_reloader(self.inner_run, **options)
        else:
            self.inner_run(None, **options)

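There are two cases. With the reloader enabled (the default), inner_run() is handed to autoreload.run_with_reloader(), which runs some extra logic before starting it. With the reloader disabled, inner_run() is executed directly.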

def run_with_reloader(main_func, *args, **kwargs):
    signal.signal(signal.SIGTERM, lambda *args: sys.exit(0))
    try:
        if os.environ.get(DJANGO_AUTORELOAD_ENV) == 'true':
            reloader = get_reloader()
            logger.info('Watching for file changes with %s', reloader.__class__.__name__)

            start_django(reloader, main_func, *args, **kwargs)
        else:
            exit_code = restart_with_reloader()
            sys.exit(exit_code)
    except KeyboardInterrupt:
        pass

With the reloader enabled, a Django project actually starts twice: if we put a print in the project entry point (manage.py), it is printed twice.

First launch

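On the first launch, the environment variable named by DJANGO_AUTORELOAD_ENV is not set, so os.environ.get() returns None and the startup branch is skipped. Execution goes to restart_with_reloader() instead.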

def restart_with_reloader():
    new_environ = {**os.environ, DJANGO_AUTORELOAD_ENV: 'true'}
    args = get_child_arguments()
    while True:
        exit_code = subprocess.call(args, env=new_environ, close_fds=False)
        if exit_code != 3:
            return exit_code

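Here the environment variable is set to the string 'true' and the same command is launched again as a child process. As long as the child exits with code 3 it is started again; any other exit code ends the loop and is returned.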
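The parent/child arrangement can be reproduced in a few lines. This is a sketch of the idea only: the variable name DEMO_RUN_MAIN, the inline child script and the marker file are all assumptions made for the demo, and the child "requests a restart" once by exiting with code 3.

```python
import os
import subprocess
import sys
import tempfile

DEMO_RELOAD_ENV = "DEMO_RUN_MAIN"  # hypothetical variable name for this sketch
RELOAD_EXIT_CODE = 3

# The child exits with 3 the first time (as if a file had changed)
# and with 0 the second time (a normal shutdown).
CHILD_SOURCE = """
import os, sys
if os.environ.get("DEMO_RUN_MAIN") != "true":
    sys.exit(1)
marker = sys.argv[1]
if not os.path.exists(marker):
    open(marker, "w").close()
    sys.exit(3)
sys.exit(0)
"""


def supervise(marker):
    """Relaunch the child until it exits with something other than 3."""
    env = {**os.environ, DEMO_RELOAD_ENV: "true"}
    args = [sys.executable, "-c", CHILD_SOURCE, marker]
    launches = 0
    while True:
        launches += 1
        exit_code = subprocess.call(args, env=env)
        if exit_code != RELOAD_EXIT_CODE:
            return exit_code, launches


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        return supervise(os.path.join(tmp, "restarted"))
```

Calling demo() launches the child twice: the first run asks for a restart, the second ends the loop. The parent process never serves requests itself; it only supervises.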

Second launch

On the second launch, execution can enter the startup logic.

def start_django(reloader, main_func, *args, **kwargs):
    ensure_echo_on()

    main_func = check_errors(main_func)
    django_main_thread = threading.Thread(target=main_func, args=args, kwargs=kwargs)
    django_main_thread.setDaemon(True)
    django_main_thread.start()

    while not reloader.should_stop:
        try:
            reloader.run(django_main_thread)
        except WatchmanUnavailable as ex:
            # It's possible that the watchman service shuts down or otherwise
            # becomes unavailable. In that case, use the StatReloader.
            reloader = StatReloader()
            logger.error('Error connecting to Watchman: %s', ex)
            logger.info('Watching for file changes with %s', reloader.__class__.__name__)

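Here a Django main thread is created (as a daemon thread) with inner_run() as its target.
The current thread then calls reloader.run(django_main_thread) and stays in a polling loop that watches for file changes.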

    def run(self, django_main_thread):
        logger.debug('Waiting for apps ready_event.')
        self.wait_for_apps_ready(apps, django_main_thread)
        from django.urls import get_resolver
        # Prevent a race condition where URL modules aren't loaded when the
        # reloader starts by accessing the urlconf_module property.
        get_resolver().urlconf_module
        logger.debug('Apps ready_event triggered. Sending autoreload_started signal.')
        autoreload_started.send(sender=self)
        self.run_loop()

    def run_loop(self):
        ticker = self.tick()
        while not self.should_stop:
            try:
                next(ticker)
            except StopIteration:
                break
        self.stop()
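The division of labour (a daemon thread does the serving while the calling thread polls for changes through a generator) can be sketched with the standard library. This is an illustration in the spirit of a stat-based reloader, not Django's StatReloader: snapshot, tick, run_loop and the fake server are names invented here, and the "edit" is simulated by changing a file's mtime.

```python
import os
import tempfile
import threading
import time


def snapshot(paths):
    """Map each watched path to its last-modified time."""
    return {path: os.stat(path).st_mtime for path in paths}


def tick(paths):
    """Yield once per poll; finish as soon as any mtime differs."""
    previous = snapshot(paths)
    while True:
        if snapshot(paths) != previous:
            return
        yield


def run_loop(paths, interval=0.01, max_polls=500):
    """Drive the ticker from the calling thread; True means a change was seen."""
    ticker = tick(paths)
    for _ in range(max_polls):
        try:
            next(ticker)
        except StopIteration:
            return True
        time.sleep(interval)
    return False


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        watched = os.path.join(tmp, "views.py")
        open(watched, "w").close()
        os.utime(watched, (1, 1))

        stop = threading.Event()
        served = threading.Event()

        def fake_server():
            served.set()  # pretend the server is up
            stop.wait()   # "serve" until told to stop

        worker = threading.Thread(target=fake_server, daemon=True)
        worker.start()

        # Simulate an edit shortly after the loop starts polling.
        edit = threading.Timer(0.05, os.utime, args=(watched, (2, 2)))
        edit.start()

        changed = run_loop([watched])
        stop.set()
        worker.join()
        edit.join()
        return served.is_set(), changed
```

In this sketch the loop simply returns when a change is seen; in the real flow, that is the point at which the child process exits with code 3 so the parent can relaunch it.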

Next, let's look at inner_run(), which runs in the Django main thread.

    def inner_run(self, *args, **options):
        # If an exception was silenced in ManagementUtility.execute in order
        # to be raised in the child process, raise it now.
        autoreload.raise_last_exception()

        threading = options['use_threading']
        # 'shutdown_message' is a stealth option.
        shutdown_message = options.get('shutdown_message', '')
        quit_command = 'CTRL-BREAK' if sys.platform == 'win32' else 'CONTROL-C'

        self.stdout.write("Performing system checks...\n\n")
        self.check(display_num_errors=True)
        # Need to check migrations here, so can't use the
        # requires_migrations_check attribute.
        self.check_migrations()
        now = datetime.now().strftime('%B %d, %Y - %X')
        self.stdout.write(now)
        self.stdout.write((
            "Django version %(version)s, using settings %(settings)r\n"
            "Starting development server at %(protocol)s://%(addr)s:%(port)s/\n"
            "Quit the server with %(quit_command)s.\n"
        ) % {
            "version": self.get_version(),
            "settings": settings.SETTINGS_MODULE,
            "protocol": self.protocol,
            "addr": '[%s]' % self.addr if self._raw_ipv6 else self.addr,
            "port": self.port,
            "quit_command": quit_command,
        })

        # ---- The block above prints the banner you see when Django starts ----

        try:
            handler = self.get_handler(*args, **options)
            # run() is the final stage of startup.
            run(self.addr, int(self.port), handler,
                ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
        except socket.error as e:
            # Use helpful error messages instead of ugly tracebacks.
            ERRORS = {
                errno.EACCES: "You don't have permission to access that port.",
                errno.EADDRINUSE: "That port is already in use.",
                errno.EADDRNOTAVAIL: "That IP address can't be assigned to.",
            }
            try:
                error_text = ERRORS[e.errno]
            except KeyError:
                error_text = e
            self.stderr.write("Error: %s" % error_text)
            # Need to use an OS exit because sys.exit doesn't work in a thread
            os._exit(1)
        except KeyboardInterrupt:
            if shutdown_message:
                self.stdout.write(shutdown_message)
            sys.exit(0)

def run(addr, port, wsgi_handler, ipv6=False, threading=False, server_cls=WSGIServer):
    server_address = (addr, port)
    if threading:
        httpd_cls = type('WSGIServer', (socketserver.ThreadingMixIn, server_cls), {})
    else:
        httpd_cls = server_cls
    httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
    if threading:
        # ThreadingMixIn.daemon_threads indicates how threads will behave on an
        # abrupt shutdown; like quitting the server by the user or restarting
        # by the auto-reloader. True means the server will not wait for thread
        # termination before it quits. This will make auto-reloader faster
        # and will prevent the need to kill the server manually if a thread
        # isn't terminating correctly.
        httpd.daemon_threads = True
    httpd.set_app(wsgi_handler)
    httpd.serve_forever()
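The trick in run() of building the server class at runtime with type() works just as well with the standard library's wsgiref classes. The sketch below uses those instead of Django's own WSGIServer and WSGIRequestHandler; build_server_class, hello_app and serve are hypothetical helpers, and the port is an arbitrary choice.

```python
import socketserver
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer


def build_server_class(threading=False, server_cls=WSGIServer):
    """Return server_cls, optionally mixed with ThreadingMixIn at runtime."""
    if threading:
        # type(name, bases, namespace) creates the class on the fly;
        # daemon_threads=True keeps request threads from blocking shutdown.
        return type(
            "ThreadedWSGIServer",
            (socketserver.ThreadingMixIn, server_cls),
            {"daemon_threads": True},
        )
    return server_cls


def hello_app(environ, start_response):
    """Smallest possible WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def serve(addr="127.0.0.1", port=8000, threading=True):
    """Blocking helper: bind, attach the app, and serve until interrupted."""
    httpd = build_server_class(threading)((addr, port), WSGIRequestHandler)
    httpd.set_app(hello_app)
    httpd.serve_forever()


threaded_cls = build_server_class(threading=True)
plain_cls = build_server_class(threading=False)
```

Calling serve() starts a blocking development server; it is left uncalled here so the snippet can be imported without binding a port.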

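Once we reach WSGI, the startup logic that Django itself is responsible for ends. From here on, the work is handed to the WSGI server.
This corresponds to what we saw earlier with FastAPI, where the FastAPI app is handed to an ASGI server. (ASGI also originated in the Django community, as a successor to WSGI, so the two share the same roots.)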

wsgi app

So where does this WSGI handler come from? Let's trace back a little.

handler = self.get_handler(*args, **options)
run(self.addr, int(self.port), handler, ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)

    def get_handler(self, *args, **options):
        """
        Return the static files serving handler wrapping the default handler,
        if static files should be served. Otherwise return the default handler.
        """
        # Call the parent class, then wrap the result.
        handler = super().get_handler(*args, **options)
        use_static_handler = options['use_static_handler']
        insecure_serving = options['insecure_serving']
        if use_static_handler and (settings.DEBUG or insecure_serving):
            return StaticFilesHandler(handler)
        return handler

    def get_handler(self, *args, **options):
        """Return the default WSGI handler for the runner."""
        return get_internal_wsgi_application()

def get_internal_wsgi_application():
    """
    Load and return the WSGI application as configured by the user in
    ``settings.WSGI_APPLICATION``. With the default ``startproject`` layout,
    this will be the ``application`` object in ``projectname/wsgi.py``.

    This function, and the ``WSGI_APPLICATION`` setting itself, are only useful
    for Django's internal server (runserver); external WSGI servers should just
    be configured to point to the correct application object directly.

    If settings.WSGI_APPLICATION is not set (is ``None``), return
    whatever ``django.core.wsgi.get_wsgi_application`` returns.
    """
    from django.conf import settings
    app_path = getattr(settings, 'WSGI_APPLICATION')
    if app_path is None:
        return get_wsgi_application()

    try:
        return import_string(app_path)
    except ImportError as err:
        raise ImproperlyConfigured(
            "WSGI application '%s' could not be loaded; "
            "Error importing module." % app_path
        ) from err
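Loading an object from a dotted path such as settings.WSGI_APPLICATION comes down to "import the module part, then getattr the last component". A minimal sketch of that idea follows; import_by_path is a hypothetical helper written for this post, not Django's import_string.

```python
import os.path
from importlib import import_module


def import_by_path(dotted_path):
    """Import 'package.module.attr' and return the attr object."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError("%r doesn't look like a module path" % dotted_path)
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as err:
        raise ImportError(
            "Module %r does not define %r" % (module_path, attr_name)
        ) from err


# The same lookup a setting like 'HelloWorld.wsgi.application' would need,
# demonstrated on a standard-library function instead.
join = import_by_path("os.path.join")
```

A path without a dot, or a missing attribute, surfaces as ImportError, which the caller can turn into a configuration error.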

This settings is an object that, in the earlier steps, already took its attributes from the settings.py file. So we only need to look in settings.py.

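The settings file

The project directory

This app is the application object in the wsgi.py file that was created when the project was generated.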

import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'djangoDemo.settings')

application = get_wsgi_application()

Now let's find get_wsgi_application().

import django
from django.core.handlers.wsgi import WSGIHandler

def get_wsgi_application():
    """
    The public interface to Django's WSGI support. Return a WSGI callable.

    Avoids making django.core.handlers.WSGIHandler a public API, in case the
    internal WSGI implementation changes or moves in the future.
    """
    django.setup(set_prefix=False)
    return WSGIHandler()

It calls setup() again and, more importantly, returns an instance of the WSGIHandler class.

class WSGIHandler(base.BaseHandler):
    request_class = WSGIRequest

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.load_middleware()

    def __call__(self, environ, start_response):
        set_script_prefix(get_script_name(environ))
        signals.request_started.send(sender=self.__class__, environ=environ)
        request = self.request_class(environ)
        response = self.get_response(request)

        response._handler_class = self.__class__

        status = '%d %s' % (response.status_code, response.reason_phrase)
        response_headers = [
            *response.items(),
            *(('Set-Cookie', c.output(header='')) for c in response.cookies.values()),
        ]
        start_response(status, response_headers)
        if getattr(response, 'file_to_stream', None) is not None and environ.get('wsgi.file_wrapper'):
            response = environ['wsgi.file_wrapper'](response.file_to_stream)
        return response

This is the WSGI app itself.
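The __call__(environ, start_response) signature is all a WSGI server needs. To see the contract without any server, the sketch below calls a tiny app by hand; demo_app and call_app are illustrative names, and wsgiref.util.setup_testing_defaults fills in the rest of the environ.

```python
from wsgiref.util import setup_testing_defaults


def demo_app(environ, start_response):
    """A WSGI callable: read the environ, announce status/headers, return body."""
    body = ("You asked for %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Play the server's role: build an environ and collect the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # adds the other required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_app(demo_app, "/hello/")
```

WSGIHandler follows the same shape: it turns environ into a request object, gets a response, passes status and headers to start_response, and returns the response as the iterable body.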

Note

from django.conf import settings

......
class WSGIHandler(base.BaseHandler):

  ......

  def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.load_middleware()

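load_middleware() builds the middleware stack, and this is where the WSGI app picks up its configuration: it reads the middleware list (settings.MIDDLEWARE) from the settings object, which in turn was loaded from settings.py.
If you read my earlier post on the FastAPI source, the middleware stack will look familiar.
app entry point → middleware stack → router → route node → endpoint
So the WSGI app is now fully built: the server passes each request to the app entry point, and the request goes through the middleware to the router for dispatch.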
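A middleware stack of this kind is just nested callables: each layer receives the next handler and returns a new one. The sketch below assumes the "wrap in reverse order so the first listed middleware ends up outermost" scheme; make_middleware, build_chain and the two layer names are invented for the illustration.

```python
def make_middleware(name, log):
    """Return a factory in the get_response style: factory(next) -> handler."""
    def factory(get_response):
        def middleware(request):
            log.append("%s: request" % name)
            response = get_response(request)
            log.append("%s: response" % name)
            return response
        return middleware
    return factory


def build_chain(view, factories):
    """Wrap the view from the inside out, so factories[0] is outermost."""
    handler = view
    for factory in reversed(factories):
        handler = factory(handler)
    return handler


log = []


def view(request):
    # Stands in for routing plus the endpoint.
    log.append("view")
    return "response for %s" % request


chain = build_chain(view, [
    make_middleware("security", log),
    make_middleware("session", log),
])
result = chain("/hello/")
```

The log shows the onion shape: the request passes security, then session, then the view, and the response travels back through the layers in the opposite order.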