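A badly written SQL query that returns large amounts of duplicate data can cause an unexpected, sharp drop in performance.
For example, suppose the following relationships exist: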
*) Deal has 1 City if (deal.is_multi_city_deal == false);
*) Deal has many DealCities if (deal.is_multi_city_deal == true);
If we want to fetch every Deal for a given city, both multi-city and non-multi-city, a statement like the following causes a severe performance problem:
$query = "SELECT Deal.id
FROM t_deals AS Deal, t_deal_cities AS DealCity, t_cities AS City
WHERE (
(Deal.is_multi_city_deal = 1 AND Deal.id = DealCity.deal_id AND DealCity.city_id = City.id)
OR (Deal.is_multi_city_deal = 0 AND Deal.city_id = City.id)
)
AND City.slug = '".$cityslug."'";
return $this->query($query);
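The branch after the OR has a subtle problem: it omits the crucial join condition between the tables, Deal.id=DealCity.deal_id. Because DealCity is still listed in the FROM clause, every matching single-city Deal is paired with every row of t_deal_cities, which produces a large number of duplicate rows.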
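The duplication can be reproduced on a toy dataset. What follows is a minimal sketch in Python with the standard sqlite3 module, not the PHP and database setup used on the site; the rows, the city slugs, and the helper names make_demo_db and ids_from_or_query are made up for illustration, and the slug is bound as a parameter rather than concatenated.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_cities (id INTEGER PRIMARY KEY, slug TEXT);
        CREATE TABLE t_deals (id INTEGER PRIMARY KEY,
                              is_multi_city_deal INTEGER, city_id INTEGER);
        CREATE TABLE t_deal_cities (deal_id INTEGER, city_id INTEGER);
        INSERT INTO t_cities VALUES (1, 'beijing'), (2, 'shanghai');
        INSERT INTO t_deals VALUES (1, 0, 1), (2, 0, 2), (3, 1, NULL), (4, 1, NULL);
        INSERT INTO t_deal_cities VALUES (3, 1), (3, 2), (4, 2);
    """)
    return conn


# Same shape as the problematic query: DealCity is in FROM for both branches,
# but only the first branch constrains it.
OR_QUERY = """
    SELECT Deal.id
    FROM t_deals AS Deal, t_deal_cities AS DealCity, t_cities AS City
    WHERE ((Deal.is_multi_city_deal = 1 AND Deal.id = DealCity.deal_id
            AND DealCity.city_id = City.id)
        OR (Deal.is_multi_city_deal = 0 AND Deal.city_id = City.id))
      AND City.slug = ?
"""


def ids_from_or_query(conn, slug):
    """Return every Deal.id the OR query yields, duplicates included."""
    return sorted(row[0] for row in conn.execute(OR_QUERY, (slug,)))


print(ids_from_or_query(make_demo_db(), "beijing"))  # prints [1, 1, 1, 3]
```

With three rows in t_deal_cities, the single-city deal 1 comes back three times, once per t_deal_cities row. On real tables that multiplier is the full row count of t_deal_cities.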
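In fact, the two branches of the OR cover different sets of tables: the branch after the OR should not bring in the unrelated DealCity table at all. A UNION query achieves the same result and removes the confusion between business logic and query scope: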
$query = "SELECT Deal.id
FROM t_deals AS Deal, t_deal_cities AS DealCity, t_cities AS City
WHERE
Deal.is_multi_city_deal = 1
AND Deal.id = DealCity.deal_id
AND DealCity.city_id = City.id
AND City.slug = '".$cityslug."'
UNION
SELECT Deal.id
FROM t_deals AS Deal, t_cities AS City
WHERE Deal.is_multi_city_deal = 0
AND Deal.city_id = City.id
AND City.slug = '".$cityslug."'";
return $this->query($query);
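As a rough check of the UNION form, here is a small sketch in Python with the standard sqlite3 module, not the site's PHP code. The sample rows and the names make_union_demo_db and deal_ids_for_city are hypothetical. The slug is passed as a named bound parameter, which is an addition of this sketch and not something the original snippet does.

```python
import sqlite3


def make_union_demo_db():
    """Build a tiny in-memory database with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t_cities (id INTEGER PRIMARY KEY, slug TEXT);
        CREATE TABLE t_deals (id INTEGER PRIMARY KEY,
                              is_multi_city_deal INTEGER, city_id INTEGER);
        CREATE TABLE t_deal_cities (deal_id INTEGER, city_id INTEGER);
        INSERT INTO t_cities VALUES (1, 'beijing'), (2, 'shanghai');
        INSERT INTO t_deals VALUES (1, 0, 1), (2, 0, 2), (3, 1, NULL), (4, 1, NULL);
        INSERT INTO t_deal_cities VALUES (3, 1), (3, 2), (4, 2);
    """)
    return conn


# Each SELECT joins only the tables it needs; UNION removes duplicate ids.
UNION_QUERY = """
    SELECT Deal.id
    FROM t_deals AS Deal, t_deal_cities AS DealCity, t_cities AS City
    WHERE Deal.is_multi_city_deal = 1
      AND Deal.id = DealCity.deal_id
      AND DealCity.city_id = City.id
      AND City.slug = :slug
    UNION
    SELECT Deal.id
    FROM t_deals AS Deal, t_cities AS City
    WHERE Deal.is_multi_city_deal = 0
      AND Deal.city_id = City.id
      AND City.slug = :slug
"""


def deal_ids_for_city(conn, slug):
    """Return the distinct Deal ids for one city slug, sorted."""
    rows = conn.execute(UNION_QUERY, {"slug": slug})
    return sorted(row[0] for row in rows)


conn = make_union_demo_db()
print(deal_ids_for_city(conn, "beijing"))   # prints [1, 3]
print(deal_ids_for_city(conn, "shanghai"))  # prints [2, 3, 4]
```

Each deal appears once, and because the slug is bound rather than spliced into the SQL text, a quote character inside it cannot change the query.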
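This fix cut the load time of the site's home page from over 10 s to under 2 s, a more than fivefold improvement.
It shows how much damage a poor data model and poorly written queries can do to a site's performance.
In many cases, fixing them pays off far more than adding application servers or increasing bandwidth.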