SqlServer 2005 T-SQL Query 学习笔记(6)

JOIN关键字所起的效果最先并不是使用JOIN来做到的,ANSI SQL:1989所采用以下的格式:

FROM T1, T2

WHERE where_filter

 

同时LEFT JOIN和RIGHT JOIN是使用*=和=*来起到相同作用。

 

到了ANSI SQL:1992,就出现了JOIN:

FROM T1 <join_type> JOIN T2 ON <on_filter>

WHERE where_filter
 
 

CROSS JOIN

返回的是2个表的笛卡尔积,左N-ROWS,右M-ROWS,即为M*N。

 

OUTER

OUTER一般和LEFT,RIGHT,JOIN在一起使用。

 

注意旧写法如果不注意,将会导致一种错误,下面是个例子(求没有订单的顾客)。

新写法:

SELECT C.CustomerID, CompanyName, OrderID

FROM dbo.Customers AS C

  LEFT OUTER JOIN dbo.Orders AS O

    ON C.CustomerID = O.CustomerID

WHERE O.CustomerID IS NULL;

 

返回了2个CUSTOMER,结果正确。

 

旧写法(*错误的写法):

SELECT C.CustomerID, CompanyName, OrderID

FROM dbo.Customers AS C, dbo.Orders AS O

WHERE C.CustomerID *= O.CustomerID

  AND O.CustomerID IS NULL;

 

在旧的写法中,返回了所有的91个顾客。为什么会这样呢,取决于WHERE和ON条件判断发生的时间的不同。在这里,WHERE里AND前,AND后是没有任何区别的,你不能控制在加入OUTER行之前到底执行那个过滤条件。

理论上,他是在加入OUTER行之前进行了过滤。实际上很明显,O.CustomerID IS NULL是肯定不满足WHERE条件的(在ORDER表里根本就没有Customer为空的数据),所以这个查询直接就进行左半的连接,返回了CUSTOMER的数据(91条)。

 

那如何修正旧的写法呢,可以按下面的方法:

SELECT C.CustomerID, CompanyName, OrderID

FROM dbo.Customers AS C, dbo.Orders AS O

WHERE C.CustomerID *= O.CustomerID

GROUP BY C.CustomerID, CompanyName, OrderID

HAVING OrderID IS NULL;
 
 

这样,就在加入OUTER ROW之前进行了条件的过滤。

 

CONTROLLING THE PHYSICAL JOIN EVALUATION ORDER

在JOIN过程中,SQLSERVER做了大量的内部优化,所以物理的执行顺序并不是真正的执行顺序。如果要强加于物理的顺序,可以

OPTION (FORCE ORDER);
 
 

但是,严重不推荐这样做,除非你对JOIN查询的情况完全的了解,同时去查询MSDN。

 

CONTROLLING THE LOGICAL JOIN EVALUATION ORDER

ON的使用:ON的位置的摆放很有讲究,请见下面的例子。

SELECT DISTINCT C.CompanyName AS customer, S.CompanyName AS supplier

FROM dbo.Customers AS C

  LEFT OUTER JOIN dbo.Orders AS O

    ON O.CustomerID = C.CustomerID

  JOIN dbo.[Order Details] AS OD

    ON OD.OrderID = O.OrderID

  JOIN dbo.Products AS P

    ON P.ProductID = OD.ProductID

  JOIN dbo.Suppliers AS S

    ON S.SupplierID = P.SupplierID;

 

这个例子返回了1326行,可是我们想返回的是1328行,为什么会掉了2行,原因就是LEFT OUTER JOIN没有适用到后3个JOIN连接中。

可以修改这个例子,给后3个JOIN连接加上LEFT,那么如下所示:

SELECT DISTINCT C.CompanyName AS customer, S.CompanyName AS supplier

FROM dbo.Customers AS C

  LEFT OUTER JOIN dbo.Orders AS O

    ON O.CustomerID = C.CustomerID

  LEFT OUTER JOIN dbo.[Order Details] AS OD

    ON OD.OrderID = O.OrderID

  LEFT OUTER JOIN dbo.Products AS P

    ON P.ProductID = OD.ProductID

  LEFT OUTER JOIN dbo.Suppliers AS S

    ON S.SupplierID = P.SupplierID;

 

这样,成功的返回了1328行。但是,有一种更好的写法,在最后对CUSTOMER进行RIGHT JOIN的组合,这样就能包含所有的客户:

SELECT DISTINCT C.CompanyName AS customer, S.CompanyName AS supplier

FROM dbo.Orders AS O

  JOIN dbo.[Order Details] AS OD

    ON OD.OrderID = O.OrderID

  JOIN dbo.Products AS P

    ON P.ProductID = OD.ProductID

  JOIN dbo.Suppliers AS S

    ON S.SupplierID = P.SupplierID

  RIGHT OUTER JOIN dbo.Customers AS C

    ON O.CustomerID = C.CustomerID;

 

同样,也是返回了1328行。更好的更准确的写法如下,注意ON的位置变化:

SELECT DISTINCT C.CompanyName AS customer, S.CompanyName AS supplier

FROM dbo.Customers AS C

  LEFT OUTER JOIN

    (dbo.Orders AS O

       JOIN dbo.[Order Details] AS OD

         ON OD.OrderID = O.OrderID

       JOIN dbo.Products AS P

         ON P.ProductID = OD.ProductID

       JOIN dbo.Suppliers AS S

         ON S.SupplierID = P.SupplierID)

      ON O.CustomerID = C.CustomerID;

 

这里使用()是为了更方便理解,其实不用()也可以。

 

SEMI JOINS

有些时候,使用EXISTS可以代替使用JOIN ON,而且使用EXISTS效率要高很多。

比如下面的查询:

SELECT DISTINCT C.CustomerID, C.CompanyName

FROM dbo.Customers AS C

  JOIN dbo.Orders AS O

    ON O.CustomerID = C.CustomerID

WHERE Country = N'Spain';

 

完全可以用下面的替换:

SELECT CustomerID, CompanyName

FROM dbo.Customers AS C

WHERE Country = N'Spain'

  AND EXISTS

    (SELECT * FROM dbo.Orders AS O

     WHERE O.CustomerID = C.CustomerID);

 

JOIN ALGORITHMS

Index的位置放在那里,才是最有效率的(最优化的方案)?

差的性能<— 无索引(表审视每行的所有数据)—> 非聚集索引无覆盖(寻找+偏序审查+发现)—> 聚集索引(寻找+部分审查)—> 非聚集索引集覆盖(寻找+部分审查)—> 非聚集索引包含非KEY的列(寻找+部分审查)—> 最好的性能

 

Technorati 标签: , ,

你可能感兴趣的:(sqlserver)