django 查询优化之 select_related 和 prefetch_related

基础:这两个方法都是避免因访问外键对象而导致的数据库查询泛滥,但策略却大不相同。

select_related(*fields)

作用和方法:JOIN关联一次性查询,减少查询次数。

作用对象:该方法只作用于一对多(普通外键 ForeignKey)或者一对一(OneToOneField)关系。

# 查询Book表同时联合查询author表和hometown表,并缓存结果
b = Book.objects.select_related('author__hometown').get(id=4)
p = b.author         # Doesn't hit the database.  直接使用缓存结果
c = p.hometown       # Doesn't hit the database.  直接使用缓存结果

# 只查询Book表
b = Book.objects.get(id=4)  # Hits the database.
p = b.author         # Hits the database.  产生一次查询IO
c = p.hometown       # Hits the database.  产生一次查询IO

prefetch_related( *lookups)

作用和方法:分别查询每个表,然后用Python处理他们之间的关系。

作用对象:对于多对多字段(ManyToManyField)和一对多字段( ForeignKey)

# 查询张三去过的所有城市

zhangs = Person.objects.prefetch_related('visitation').get(firstname=u"张",lastname=u"三")
for city in zhangs.visitation.all() :
    print city


"""
实际产生的SQL
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`,
`QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_person`
WHERE (`QSOptimize_person`.`lastname` = '三' AND `QSOptimize_person`.`firstname` = '张');
 
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`)
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
"""


下面用一个例子说明这两个方法的区别

需求:查询家乡是广东省的人

样例models:

from django.db import models
  
class Province(models.Model):
    name = models.CharField(max_length=10)
    
    def __unicode__(self):
        return self.name
  

class City(models.Model):
    name = models.CharField(max_length=5)
    province = models.ForeignKey(Province)
    
    def __unicode__(self):
         return self.name
  

class Person(models.Model):
    firstname = models.CharField(max_length=10)
    lastname = models.CharField(max_length=10)
    visitation = models.ManyToManyField(City, related_name = "visitor")
    hometown = models.ForeignKey(City, related_name = "birth")
    living  = models.ForeignKey(City, related_name = "citizen")

    def __unicode__(self):
        return self.firstname + self.lastname

class Order(models.Model):
    customer = models.ForeignKey(Person)
    orderinfo = models.CharField(max_length=50)
    time  = models.DateTimeField(auto_now_add = True)

    def __unicode__(self):
        return self.orderinfo

普通写法:总共需要产生查询 广东省(1)+ 广东省下的城市(n),合计n+1次查询IO,显然这很麻瓜

    def test_1(self):
        from order.models import Province

        hb = Province.objects.get(name__iexact=u"广东省")
        people = []
        for city in hb.city_set.all():
            people.extend(city.birth.all())

使用 prefetch_related 写法:因为是一个深度为2的prefetch,所以会导致3次查询IO:第一次查询Province表,第二次查询city表,第三次查询person表

hb = Province.objects.prefetch_related("city_set__birth").get(name__iexact=u"广东省")
people = []
for city in hb.city_set.all():
    people.extend(city.birth.all())

        
""" 
实际SQL
        SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
        FROM `QSOptimize_province`
        WHERE `QSOptimize_province`.`name` LIKE '湖北省' ;
         
        SELECT `QSOptimize_city`.`id`, `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
        FROM `QSOptimize_city`
        WHERE `QSOptimize_city`.`province_id` IN (1);
         
        SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`,
        `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
        FROM `QSOptimize_person`
        WHERE `QSOptimize_person`.`hometown_id` IN (1, 3);
"""

使用 select_related 写法:只产生一次联表查询IO

people = list(Person.objects.select_related("hometown__province").filter(hometown__province__name__iexact=u"广东省"))

"""
        SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`,
        `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`, `QSOptimize_city`.`id`,
        `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`, `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
        FROM `QSOptimize_person`
        INNER JOIN `QSOptimize_city` ON (`QSOptimize_person`.`hometown_id` = `QSOptimize_city`.`id`)
        INNER JOIN `QSOptimize_province` ON (`QSOptimize_city`.`province_id` = `QSOptimize_province`.`id`)
        WHERE `QSOptimize_province`.`name` LIKE '湖北省';
"""

综上,在做索引优化前,其实可以考虑下是否可以直接优化SQL,在内存允许的情况下尽量减少趟数。能利用一对一,多对一关系,就不要利用多对多。


另外,这两个方法可以联合使用

查询该订单顾客去过的省份

只使用prefetch_related,产生了4次查询

order = Order.objects.prefetch_related("customer__visitation__province").get(id=1)
for city in order.customer.visitation.all():
    print(city.province.name)

"""
SELECT `QSOptimize_order`.`id`, `QSOptimize_order`.`customer_id`, `QSOptimize_order`.`orderinfo`, `QSOptimize_order`.`time`
FROM `QSOptimize_order`
WHERE `QSOptimize_order`.`id` = 1 ;
 
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_person`
WHERE `QSOptimize_person`.`id` IN (1);
 
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`)
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
 
SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
FROM `QSOptimize_province`
WHERE `QSOptimize_province`.`id` IN (1, 2);

"""

联合查询:产生3次查询IO,prefetch_related使用了select_related的缓存结果。

order = Order.objects.select_related("customer").prefetch_related("customer__visitation__province").get(id=1)
for city in order.customer.visitation.all():
    print(city.province.name)

"""
SELECT `QSOptimize_order`.`id`, `QSOptimize_order`.`customer_id`, `QSOptimize_order`.`orderinfo`, 
`QSOptimize_order`.`time`, `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, 
`QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id` 
FROM `QSOptimize_order` 
INNER JOIN `QSOptimize_person` ON (`QSOptimize_order`.`customer_id` = `QSOptimize_person`.`id`) 
WHERE `QSOptimize_order`.`id` = 1 ;
  
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`, 
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id` 
FROM `QSOptimize_city` 
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`) 
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
  
SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name` 
FROM `QSOptimize_province` 
WHERE `QSOptimize_province`.`id` IN (1, 2);
"""

总结,尽量使用联表查询 select_related,性能最优。其次再使用prefetch_related。两者都是为了减少查询次数,但是作用原理不一样。

你可能感兴趣的:(Python,Django,python,django)