Optimizing QuerySet queries in the Python Django framework with select_related and prefetch_related

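Concepts:
  • select_related() follows foreign-key relationships when its query executes and selects the related object data along with the main objects. This produces a more complex query that costs more to run, but later access to those foreign-key relationships needs no further database queries.
  • prefetch_related() also returns a QuerySet. It automatically retrieves, in a single batch, the related objects for each specified lookup. Its purpose is similar to that of select_related: both are designed to stop the flood of database queries caused by accessing related objects. The strategy, however, is quite different.
  • select_related works by creating an SQL join and including the fields of the related object in the SELECT statement, so the related objects are fetched in the same database query. To avoid the much larger result set that joining across a "many" relationship would produce, select_related is limited to single-valued relationships: foreign key and one-to-one.
  • prefetch_related, on the other hand, runs a separate lookup for each relationship and does the "joining" in Python. This lets it prefetch many-to-many and many-to-one objects, which select_related cannot do, in addition to the foreign-key and one-to-one relationships that select_related supports.
A worked example follows:
  • First, configure logging in settings.py so that every SQL statement Django executes is printed to the console (Django only logs SQL statements when DEBUG = True):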
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console':{
            'level':'DEBUG',
            'class':'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'handlers': ['console'],
            'propagate': True,
            'level':'DEBUG',
        },
    }
}
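Django passes the LOGGING setting to Python's standard logging.config.dictConfig by default, so the shape of this dictionary can be tried outside Django. The following is only a sketch under that assumption: it checks that a logger named django.db.backends ends up at DEBUG level with one console handler. The message it logs is a hand-written sample, not real Django output.

```python
import logging
import logging.config

# Same structure as the LOGGING setting above.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"level": "DEBUG", "class": "logging.StreamHandler"},
    },
    "loggers": {
        "django.db.backends": {
            "handlers": ["console"],
            "propagate": True,
            "level": "DEBUG",
        },
    },
}

logging.config.dictConfig(LOGGING)

# Django's database layer logs executed SQL under this logger name.
sql_logger = logging.getLogger("django.db.backends")
sql_logger.debug("(0.000) SELECT 1; args=()")  # hand-written sample message
```

Because the handler is a StreamHandler with no explicit stream, the messages go to standard error, which is why they appear interleaved with the shell output in the sessions shown in this post.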
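Create a models.py (the code below targets Python 2 and Django 1.x; on Django 2.0 and later every ForeignKey also needs an on_delete argument, and on Python 3 __unicode__ is replaced by __str__):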
from django.db import models


class Province(models.Model):
    """Province"""
    name = models.CharField(max_length=10)

    def __unicode__(self):
        return self.name


class City(models.Model):
    """City"""
    name = models.CharField(max_length=5)
    province = models.ForeignKey(Province)

    def __unicode__(self):
        return self.name


class Person(models.Model):
    """Person"""
    firstname = models.CharField(max_length=10)
    lastname = models.CharField(max_length=10)
    visitation = models.ManyToManyField(City, related_name="visitor")
    hometown = models.ForeignKey(City, related_name="birth")
    living = models.ForeignKey(City, related_name="citizen")

    def __unicode__(self):
        return self.firstname + self.lastname
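  • A small amount of data has been inserted into all three tables (you need to add this data yourself).
1: select_related()
  • For one-to-one fields (OneToOneField) and foreign-key fields (ForeignKey), select_related can be used to optimize a QuerySet.
  • Let's look at an example: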
>>> City.objects.all()
# Prints a list containing four City objects.

# Now access the foreign-key field of each city:
>>> citys = City.objects.all()
>>> for city in citys:
...     print(city.province)
"""
BeiJing
BeiJing
BeiJing
BeiJing
"""
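  • Now look at the logged SQL statements (the database was queried 5 times: once for the cities, then once per city for its province):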
(0.000) SELECT `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id` FROM `apps_city`; args=()
(0.000) SELECT `apps_province`.`id`, `apps_province`.`name` FROM `apps_province` WHERE `apps_province`.`id` = 2; args=(2,)
BeiJing
(0.000) SELECT `apps_province`.`id`, `apps_province`.`name` FROM `apps_province` WHERE `apps_province`.`id` = 2; args=(2,)
BeiJing
(0.000) SELECT `apps_province`.`id`, `apps_province`.`name` FROM `apps_province` WHERE `apps_province`.`id` = 2; args=(2,)
BeiJing
(0.000) SELECT `apps_province`.`id`, `apps_province`.`name` FROM `apps_province` WHERE `apps_province`.`id` = 2; args=(2,)
BeiJing
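The log above follows a fixed pattern: one query for the list, then one more query per row. Here is a minimal sketch of that pattern using Python's built-in sqlite3 module instead of Django. The table layout, the city names c1 to c4 and the run() helper are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, province_id INTEGER);
    INSERT INTO province VALUES (2, 'BeiJing');
    INSERT INTO city VALUES (1, 'c1', 2);
    INSERT INTO city VALUES (2, 'c2', 2);
    INSERT INTO city VALUES (3, 'c3', 2);
    INSERT INTO city VALUES (4, 'c4', 2);
""")

queries = []  # every SQL string sent through run()


def run(sql, params=()):
    """Hypothetical helper: record the statement, then execute it."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


# One query for the list of cities ...
cities = run("SELECT id, name, province_id FROM city ORDER BY id")

# ... then one extra query per city when its province is accessed.
names = []
for _city_id, _city_name, province_id in cities:
    row = run("SELECT id, name FROM province WHERE id = ?", (province_id,))
    names.append(row[0][1])

print(len(queries), names)
```

With four cities the helper records five statements, matching the count in the Django log. The number grows linearly with the number of rows, which is the cost select_related is meant to remove.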
  • Now use select_related() and check the number of queries:
>>> citys = City.objects.select_related().all()
>>> for city in citys:
...     print(city.province)
# Output: the same 4 rows are returned
"""
BeiJing
BeiJing
BeiJing
BeiJing
"""
  • Now look at the SQL that was executed. There is only one statement; note the INNER JOIN, which fetches both tables in a single joined query.
    This clearly reduces the number of SQL queries, from 5 to 1.
(0.000) SELECT `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id`, `apps_province`.`id`, `apps_province`.`name` FROM `apps_city` INNER JOIN `apps_province` ON (`apps_city`.`province_id` = `apps_province`.`id`); args=()
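The joined statement returns one wide row per city, carrying the province columns alongside the city columns. The sketch below reproduces that idea with Python's built-in sqlite3 module rather than Django. The simplified table names, the sample rows and the dictionary layout are assumptions, and Django's real object construction is more involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, province_id INTEGER);
    INSERT INTO province VALUES (2, 'BeiJing');
    INSERT INTO city VALUES (1, 'c1', 2);
    INSERT INTO city VALUES (2, 'c2', 2);
""")

# A single statement selects the city columns and the province columns.
rows = conn.execute(
    "SELECT city.id, city.name, city.province_id, province.id, province.name "
    "FROM city INNER JOIN province ON city.province_id = province.id "
    "ORDER BY city.id"
).fetchall()

# Split each wide row into a city with its province already attached.
cities = [
    {"id": r[0], "name": r[1], "province": {"id": r[3], "name": r[4]}}
    for r in rows
]

# Reading the province now touches only data already in memory.
province_names = [c["province"]["name"] for c in cities]
print(province_names)
```

The trade-off is that the province columns are repeated on every row, which is acceptable for single-valued relationships but is the reason select_related does not follow "many" relationships.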
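Next, pass the *fields argument to select_related():
  • select_related() accepts a variable number of arguments. Each argument is the name of a foreign-key field whose related (parent-table) object should be fetched; it can continue on to a foreign key of that foreign key, and so on. To follow a foreign key of a foreign key, join the field names with a double underscore "__".
  • For example: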
>>> persons = Person.objects.select_related("living__province").get(pk=1)
>>> persons.living.province
# Output: the related Province object
  • Now look at the SQL that was triggered:
(0.000) SELECT `apps_person`.`id`, `apps_person`.`firstname`, `apps_person`.`lastname`, `apps_person`.`hometown_id`, `apps_person`.`living_id`, `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id`, `apps_province`.`id`, `apps_province`.`name` FROM `apps_person` INNER JOIN `apps_city` ON (`apps_person`.`living_id` = `apps_city`.`id`) INNER JOIN `apps_province` ON (`apps_city`.`province_id` = `apps_province`.`id`) WHERE `apps_person`.`id` = 1; args=(1,)
  • As the statement shows, Django used two INNER JOINs to complete the request. It fetched the contents of the city and province tables and added them to the corresponding columns of the result, so accessing persons.living or persons.living.province afterwards requires no further SQL query.
  • A foreign key that was not specified is not added to the result, so accessing it still hits the database:
>>> persons.hometown.province
(0.001) SELECT `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id` FROM `apps_city` WHERE `apps_city`.`id` = 1; args=(1,)
(0.000) SELECT `apps_province`.`id`, `apps_province`.`name` FROM `apps_province` WHERE `apps_province`.`id` = 2; args=(2,)

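  • Because hometown was not specified, accessing persons.hometown.province ran two queries, one for the city and one for the province. The deeper the chain of relationships, the more queries are run.
  • Starting with Django 1.7, the way select_related() chains changed. Before 1.7, a later select_related() call replaced the fields given to an earlier one, so all lookups had to be passed in a single call: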
>>> persons = Person.objects.select_related("hometown__province","living__province").get(pk=1)
  • Look at the SQL that was executed:

(0.003) SELECT `apps_person`.`id`, `apps_person`.`firstname`, `apps_person`.`lastname`, `apps_person`.`hometown_id`, `apps_person`.`living_id`, `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id`, `apps_province`.`id`, `apps_province`.`name`, T4.`id`, T4.`name`, T4.`province_id`, T5.`id`, T5.`name` FROM `apps_person` INNER JOIN `apps_city` ON (`apps_person`.`hometown_id` = `apps_city`.`id`) INNER JOIN `apps_province` ON (`apps_city`.`province_id` = `apps_province`.`id`) INNER JOIN `apps_city` T4 ON (`apps_person`.`living_id` = T4.`id`) INNER JOIN `apps_province` T5 ON (T4.`province_id` = T5.`id`) WHERE `apps_person`.`id` = 1; args=(1,)
# As shown below, accessing data through these foreign keys triggers no further database queries:
>>> persons.living.province

>>> persons.hometown.province
  • In Django 1.7 and later, select_related() calls can be chained like other QuerySet methods, and the lookups accumulate:
>>> persons = Person.objects.select_related("living__province").select_related("hometown__province").get(pk=1)
  • Look at the SQL that was executed:
(0.001) SELECT `apps_person`.`id`, `apps_person`.`firstname`, `apps_person`.`lastname`, `apps_person`.`hometown_id`, `apps_person`.`living_id`, `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id`, `apps_province`.`id`, `apps_province`.`name`, T4.`id`, T4.`name`, T4.`province_id`, T5.`id`, T5.`name` FROM `apps_person` INNER JOIN `apps_city` ON (`apps_person`.`hometown_id` = `apps_city`.`id`) INNER JOIN `apps_province` ON (`apps_city`.`province_id` = `apps_province`.`id`) INNER JOIN `apps_city` T4 ON (`apps_person`.`living_id` = T4.`id`) INNER JOIN `apps_province` T5 ON (T4.`province_id` = T5.`id`) WHERE `apps_person`.`id` = 1; args=(1,)
>>> persons.living.province

>>> persons.hometown.province

>>>
  • Django 1.7 or later is recommended.
2: prefetch_related()
  • Let's try it:
>>> persons = Person.objects.prefetch_related("visitation__province").get(pk=1)
  • The SQL that was executed:
(0.007) SELECT (`apps_person_visitation`.`person_id`) AS `_prefetch_related_val_person_id`, `apps_city`.`id`, `apps_city`.`name`, `apps_city`.`province_id` FROM `apps_city` INNER JOIN `apps_person_visitation` ON (`apps_city`.`id` = `apps_person_visitation`.`city_id`) WHERE `apps_person_visitation`.`person_id` IN (1); args=(1,)
>>> persons.visitation
# Output: a related manager object (its repr ends in 0x0000000003FCAAC8>)
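  • prefetch_related() is used in much the same way as select_related(). The difference is that prefetch_related() handles many-to-many (and many-to-one) relationships: it runs a separate query for each relationship, as the IN (...) clause above shows, and joins the results in Python.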
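The "separate query plus join in Python" strategy can be sketched with the built-in sqlite3 module. This is not Django's implementation. The simplified tables, the sample people p1 and p2, the cities c1 to c3 and the run() helper are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person_visitation (person_id INTEGER, city_id INTEGER);
    INSERT INTO person VALUES (1, 'p1');
    INSERT INTO person VALUES (2, 'p2');
    INSERT INTO city VALUES (1, 'c1');
    INSERT INTO city VALUES (2, 'c2');
    INSERT INTO city VALUES (3, 'c3');
    INSERT INTO person_visitation VALUES (1, 1);
    INSERT INTO person_visitation VALUES (1, 2);
    INSERT INTO person_visitation VALUES (2, 3);
""")

queries = []  # every SQL string sent through run()


def run(sql, params=()):
    """Hypothetical helper: record the statement, then execute it."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


# Query 1: the main objects.
people = run("SELECT id, firstname FROM person ORDER BY id")

# Query 2: the visited cities of ALL fetched people, via one IN clause.
ids = [person_id for person_id, _ in people]
placeholders = ", ".join("?" for _ in ids)
rows = run(
    "SELECT pv.person_id, c.id, c.name FROM city c "
    "INNER JOIN person_visitation pv ON c.id = pv.city_id "
    "WHERE pv.person_id IN (%s) ORDER BY c.id" % placeholders,
    ids,
)

# The "join" happens in Python: group the cities by the owning person id.
visited = defaultdict(list)
for person_id, _city_id, city_name in rows:
    visited[person_id].append(city_name)

result = {name: visited[person_id] for person_id, name in people}
print(len(queries), result)
```

However many people are fetched, this sketch issues two statements, one per table involved, instead of one extra statement per person. That is why this approach suits "many" relationships, where a single SQL join would repeat the person columns for every visited city.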
