基础:这两个方法都是避免因访问外键对象而导致的数据库查询泛滥,但策略却大不相同。
select_related
(*fields)¶
作用和方法:JOIN关联一次性查询,减少查询次数。
作用对象:该方法只作用于一对多(普通外键 ForeignKey)或者一对一(OneToOneField)关系。
# 查询Book表同时联合查询author表和hometown表,并缓存结果
b = Book.objects.select_related('author__hometown').get(id=4)
p = b.author # Doesn't hit the database. 直接使用缓存结果
c = p.hometown # Doesn't hit the database. 直接使用缓存结果
# 只查询Book表
b = Book.objects.get(id=4) # Hits the database.
p = b.author # Hits the database. 产生一次查询IO
c = p.hometown # Hits the database. 产生一次查询IO
prefetch_related
( *lookups)
作用和方法:分别查询每个表,然后用Python处理他们之间的关系。
作用对象:对于多对多字段(ManyToManyField)和一对多字段( ForeignKey)
# 查询张三去过的所有城市
zhangs = Person.objects.prefetch_related('visitation').get(firstname=u"张",lastname=u"三")
for city in zhangs.visitation.all() :
print city
"""
实际产生的SQL
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`,
`QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_person`
WHERE (`QSOptimize_person`.`lastname` = '三' AND `QSOptimize_person`.`firstname` = '张');
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`)
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
"""
需求:查询家乡是广东省的人
样例models:
from django.db import models
class Province(models.Model):
name = models.CharField(max_length=10)
def __unicode__(self):
return self.name
class City(models.Model):
name = models.CharField(max_length=5)
province = models.ForeignKey(Province)
def __unicode__(self):
return self.name
class Person(models.Model):
firstname = models.CharField(max_length=10)
lastname = models.CharField(max_length=10)
visitation = models.ManyToManyField(City, related_name = "visitor")
hometown = models.ForeignKey(City, related_name = "birth")
living = models.ForeignKey(City, related_name = "citizen")
def __unicode__(self):
return self.firstname + self.lastname
class Order(models.Model):
customer = models.ForeignKey(Person)
orderinfo = models.CharField(max_length=50)
time = models.DateTimeField(auto_now_add = True)
def __unicode__(self):
return self.orderinfo
普通写法:总共需要产生查询 广东省(1)+ 广东省下的城市(n),合计n+1次查询IO,显然这很麻瓜
def test_1(self):
from order.models import Province
hb = Province.objects.get(name__iexact=u"广东省")
people = []
for city in hb.city_set.all():
people.extend(city.birth.all())
使用 prefetch_related 写法:
因为是一个深度为2的prefetch,所以会导致3次查询IO:第一次查询Province表,第二次查询city表,第三次查询person表
hb = Province.objects.prefetch_related("city_set__birth").get(name__iexact=u"广东省")
people = []
for city in hb.city_set.all():
people.extend(city.birth.all())
"""
实际SQL
SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
FROM `QSOptimize_province`
WHERE `QSOptimize_province`.`name` LIKE '湖北省' ;
SELECT `QSOptimize_city`.`id`, `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
WHERE `QSOptimize_city`.`province_id` IN (1);
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`,
`QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_person`
WHERE `QSOptimize_person`.`hometown_id` IN (1, 3);
"""
使用 select_related 写法:
只产生一次联表查询IO
people = list(Person.objects.select_related("hometown__province").filter(hometown__province__name__iexact=u"广东省"))
"""
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`,
`QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`, `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
FROM `QSOptimize_person`
INNER JOIN `QSOptimize_city` ON (`QSOptimize_person`.`hometown_id` = `QSOptimize_city`.`id`)
INNER JOIN `QSOptimize_province` ON (`QSOptimize_city`.`province_id` = `QSOptimize_province`.`id`)
WHERE `QSOptimize_province`.`name` LIKE '湖北省';
"""
综上,在做索引优化前,其实可以考虑下是否可以直接优化SQL,在内存允许的情况下尽量减少趟数。能利用一对一,多对一关系,就不要利用多对多。
另外,这两个方法可以联合使用
查询该订单顾客去过的省份
只使用prefetch_related,产生了4次查询
order = Order.objects.prefetch_related("customer__visitation__province").get(id=1)
for city in order.customer.visitation.all():
print(city.province.name)
"""
SELECT `QSOptimize_order`.`id`, `QSOptimize_order`.`customer_id`, `QSOptimize_order`.`orderinfo`, `QSOptimize_order`.`time`
FROM `QSOptimize_order`
WHERE `QSOptimize_order`.`id` = 1 ;
SELECT `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`, `QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_person`
WHERE `QSOptimize_person`.`id` IN (1);
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`)
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
FROM `QSOptimize_province`
WHERE `QSOptimize_province`.`id` IN (1, 2);
"""
联合查询:产生3次查询IO,prefetch_related使用了select_related的缓存结果。
order = Order.objects.select_related("customer").prefetch_related("customer__visitation__province").get(id=1)
for city in order.customer.visitation.all():
print(city.province.name)
"""
SELECT `QSOptimize_order`.`id`, `QSOptimize_order`.`customer_id`, `QSOptimize_order`.`orderinfo`,
`QSOptimize_order`.`time`, `QSOptimize_person`.`id`, `QSOptimize_person`.`firstname`,
`QSOptimize_person`.`lastname`, `QSOptimize_person`.`hometown_id`, `QSOptimize_person`.`living_id`
FROM `QSOptimize_order`
INNER JOIN `QSOptimize_person` ON (`QSOptimize_order`.`customer_id` = `QSOptimize_person`.`id`)
WHERE `QSOptimize_order`.`id` = 1 ;
SELECT (`QSOptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`, `QSOptimize_city`.`id`,
`QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`
FROM `QSOptimize_city`
INNER JOIN `QSOptimize_person_visitation` ON (`QSOptimize_city`.`id` = `QSOptimize_person_visitation`.`city_id`)
WHERE `QSOptimize_person_visitation`.`person_id` IN (1);
SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name`
FROM `QSOptimize_province`
WHERE `QSOptimize_province`.`id` IN (1, 2);
"""
总结,尽量使用联表查询 select_related,性能最优。其次再使用prefetch_related。两者都是为了减少查询次数,但是作用原理不一样。