很多开发都喜欢用Subquery而不喜欢用Join,对于他们来讲Subquery更容易实现。但是很多情况下用Join性能要比用Subquery好。
首先我们看一下Subquery: 子查询也称为内部查询或内部选择,而包含子查询的语句也称为外部查询或外部选择。
许多包含子查询的Transact-SQL 语句都可以改用联接表示。其他问题只能通过子查询提出。在 Transact-SQL 中,包含子查询的语句和语义上等效的不包含子查询的语句在性能上通常没有差别。但是,在一些必须检查存在性的情况中,使用联接会产生更好的性能。否则,为确保消除重复值,必须为外部查询的每个结果都处理嵌套查询。所以在这些情况下,联接方式会产生更好的效果。
下面是改写Subquery的一个例子:
SELECT c.AccountNumber,
(SELECT count(*)
FROM Sales.SalesOrderHeader o
WHERE c.CustomerID = o.CustomerID AND Year(OrderDate) = 2001) as Orders_2001,
(SELECT count(*)
FROM Sales.SalesOrderHeader o
WHERE c.CustomerID = o.CustomerID AND Year(OrderDate) = 2002) as Orders_2002,
(SELECT count(*)
FROM Sales.SalesOrderHeader o
WHERE c.CustomerID = o.CustomerID AND Year(OrderDate) = 2003) as Orders_2003,
(SELECT count(*)
FROM Sales.SalesOrderHeader o
WHERE c.CustomerID = o.CustomerID AND Year(OrderDate) = 2004) as Orders_2004
FROM Sales.Customer c
order by 1
查询1返回的结果:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compiletime:
CPU time = 65 ms, elapsed time = 65 ms.
(19185 row(s) affected)
Table 'Worktable'. Scan count0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0,lob physical reads 0, lob read-ahead reads 0.
Table 'Customer'. Scan count1, logical reads 105,physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scancount 4, logical reads2824, physical reads 0, read-ahead reads 0, lob logical reads 0, lobphysical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 375 ms, elapsed time= 772 ms.
改写后:
SELECT c.AccountNumber,
SUM(CASE WHEN YEAR(o.OrderDate) = 2001 THEN 1 ELSE 0 END) as Orders_2001,
SUM(CASE WHEN YEAR(o.OrderDate) = 2002 THEN 1 ELSE 0 END) as Orders_2002,
SUM(CASE WHEN YEAR(o.OrderDate) = 2003 THEN 1 ELSE 0 END) as Orders_2003,
SUM(CASE WHEN YEAR(o.OrderDate) = 2004 THEN 1 ELSE 0 END) as Orders_2004
FROM Sales.Customer c
LEFT JOIN Sales.SalesOrderHeadero ON o.CustomerID= c.CustomerID
GROUP BY c.AccountNumber
order by 1
查询2返回的结果:
SQL Server parse and compiletime:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compiletime:
CPU time = 16 ms, elapsed time = 21 ms.
(19185 row(s) affected)
Table 'Worktable'. Scan count0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0,lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'.Scan count 1, logicalreads 706, physical reads 0, read-ahead reads 0, lob logical reads 0,lob physical reads 0, lob read-ahead reads 0.
Table 'Customer'. Scan count1, logical reads 36,physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 249 ms, elapsed time= 547 ms.
SQL Server parse and compiletime:
CPU time = 0 ms, elapsed time = 0 ms.
可以看到改写后的查询不管是在CPU花费时间和IO上面都有很大的提高。 当然Subquery在某些情况下还是有优势的,比如不相关的Subquery使用Exist/not exist或者Subquery中做加总等,另外还要考虑外围查询结果的数据量。
总之开发人员在写程序的时候不光要考虑实现还需要兼顾性能。功能实现以后自己做测试看看是否有该井空间。