对检索结果使用 UNION
, INTERSECT,
EXCEPT
来将检索结果进行并,交和差运算
所有的表–以及查询结果–都可以视为集合
SELECT product_id, product_name
FROM product
UNION
SELECT product_id, product_name
FROM product2;
UNION 等集合运算符通常都会除去重复的记录。
select product_id,product_name,product_type,sale_price,purchase_price
from product
where sale_price>purchase_price*1.5
union
select product_id,product_name,product_type,sale_price,purchase_price
from product
where sale_price<800
select product_id,product_name,product_type,sale_price,purchase_price
from product
where sale_price>purchase_price*1.5 or sale_price<800
要将两个不同的表中的结果合并在一起, 就得使用 UNION
select * from product where sale_price < 1.3*purchase_price or sale_price is null or purchase_price is null;
-- 使用 OR 谓词
select * from product where sale_price/purchase_price < 1.3
or sale_price/purchase_price is null;
select * from product where sale_price/purchase_price is null
union
select * from product where sale_price/purchase_price < 1.3
select product_id,product_name
from product
union all
select product_id,product_name
from product
练习题
select * from product where sale_price/purchase_price <1.5
union all
select * from product where sale_price<1000
hive中进行join关联时,关联列要避免使用隐式数据类型转换,否则容易导致数据倾斜
select product_id,product_name,'1'
from product
union
select product_id,product_name,sale_price
from product2
练习题
需要用 inner join 来求得交集
SELECT p1.product_id, p1.product_name
FROM product p1
INNER JOIN product2 p2
ON p1.product_id=p2.product_id
练习题
select * from product
where sale_price>1.5*purchase_price
and sale_price < 1500
select * from product
where product_id not in (select product_id from product2)
练习题
select * from product
where sale_price>2000
and product_id
not in (select product_id
from product where sale_price<1.3*purchase_price)
首先使用UNION求两个表的并集, 然后使用INTERSECT求两个表的交集, 然后用并集减去交集, 就得到了对称差。
select * from product
where product_id not in (select product_id from product2)
union
select * from product
where product_id not in (select product_id from product)
使用 UNION 会增加记录行数,而使用 INTERSECT 或者 EXCEPT 会减少记录行数。
使用关联子查询也可以从其他表获取信息, 但连结更适合从多张表获取信息。
关联子查询像vlookup函数,以表A为主,根表A为主表,根据表A的关联列的每一行的取值,逐个到表 B 中的关联列中去查找取值相等的行。
from table inner join ** on *condition(s)**
在 FROM 子句中使用 INNER JOIN 将两张表连接起来, 并为 ON 子句指定连结条件为 ShopProduct.product_id=Product.product_id
找出东京商店里的衣服类商品的商品名称,商品价格,商品种类,商品数量信息。
Product 表保存了商品编号,商品名称,商品种类
ShopProduct 表里有商店编号名称,商店的商品编号及数量
公共列商品编号将Product和ShopProduct这两张表连接起来
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity
from shopproduct as sp
inner join product as p
on sp.product_id=p.product_id;
from shopproduct as sp inner join product as p
必须使用on子句指定连结条件
ON 子句是专门用来指定连结条件的, 我们在上述查询的 ON 之后指定两张表连结所使用的列以及比较条件, 基本上, 它能起到与 WHERE 相同的筛选作用
select子句中的列最好按照表名.列名的格式来使用
WHERE 子句写在 ON 子句的后边。
select * from (
select sp.shop_id,
sp.shop_name,
sp.product_id,
p.product_name,
p.product_type,
p.sale_price,
sp.quantity
from shopproduct as sp
inner join product as p
on sp.product_id = p.product_id)as step1
where shop_name='东京' and product_type = '衣服';
在做完 INNER JOIN … ON 得到一个新表后, 才会执行 WHERE 子句
FROM 子句->WHERE 子句->SELECT 子句
可以将 WHERE 子句中的条件直接添加在 ON 子句中, 这时候 ON 子句后最好用括号将连结条件和筛选条件括起来
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity
from shopproduct as sp
inner join product as p
on (sp.product_id=p.product_id And sp.shop_name='东京' And p.product_type='衣服')
先连结再筛选的标准写法的执行顺序是, 两张完整的表做了连结inner join之后再做筛选where
或者先用where拆解 后inner join
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity
from (select * from shopproduct where shop_name='东京')as sp
inner join (select * from product where product_type='衣服')as p
on sp.product_id=p.product_id;
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.purchase_price
from shopproduct as sp
inner join product as p
on sp.product_id = p.product_id
where p.product_type='衣服';
select sp.*,p.* from shopproduct as sp inner join product as p on sp.product_id = p.product_id where shop_id='000A' and sale_price < 200
select sp.shop_id,sp.shop_name,max(p.sale_price)as max_price
from shopproduct as sp
inner join product as p
on sp.product_id = p.product_id
group by sp.shop_id,sp.shop_name;
SELECT P1.*
FROM Product AS P1
INNER JOIN Product2 AS P2
ON (P1.product_id = P2.product_id
AND P1.product_name = P2.product_name
AND P1.product_type = P2.product_type
AND P1.sale_price = P2.sale_price
AND P1.regist_date = P2.regist_date)
regist_date有空值不能比较
SELECT P1.*
FROM Product AS P1
INNER JOIN Product2 AS P2
ON P1.product_id = P2.product_id
自连结并不是区分于内连结和外连结的第三种连结, 自连结可以是外连结也可以是内连结, 它是不同于内连结外连结的另一个连结的分类方法。
SELECT * FROM shopproduct NATURAL JOIN Product
---
SELECT SP.product_id,SP.shop_id,SP.shop_name,SP.quantity
,P.product_name,P.product_type,P.sale_price
,P.purchase_price,P.regist_date
FROM shopproduct AS SP
INNER JOIN Product AS P
ON SP.product_id = P.product_id
自然连结并不是区别于内连结和外连结的第三种连结, 它其实是内连结的一种特例–当两个表进行自然连结时, 会按照两个表中都包含的列名来进行等值内连结, 此时无需使用 ON 来指定连接条件。
求表 Product 和表 Product2 中的公共部分
SELECT * FROM Product NATURAL JOIN Product2
在进行自然连结时, 来自于 Product 和 Product2 的运动 T 恤这一行数据在进行比较时, 实际上是在逐字段进行等值连结
两个缺失值进行=比较, 结果都不为真
正确的写法有省略空列
SELECT *
FROM (SELECT product_id, product_name
FROM Product ) AS A
NATURAL JOIN
(SELECT product_id, product_name
FROM Product2) AS B;
内连结会丢弃两张表中不满足 ON 条件的行,和内连结相对的就是外连结. 外连结会根据外连结的种类有选择地保留无法匹配到的行。
外连结有三种形式: 左连结, 右连结和全外连结。
左连结会保存左表中无法按照 ON 子句匹配到的行, 此时对应右表的行均为缺失值; 右连结则会保存右表中无法按照 ON 子句匹配到的行, 此时对应左表的行均为缺失值; 而全外连结则会同时保存两个表中无法按照 ON子句匹配到的行, 相应的另一张表中的行用缺失值填充。
所有的内容在调换两个表的前后位置, 并将左连结改为右连结之后, 都能得到相同的结果. 稍后再介绍全外连结的概念。
-- 左连结
FROM <tb_1> LEFT OUTER JOIN <tb_2> ON <condition(s)>
-- 右连结
FROM <tb_1> RIGHT OUTER JOIN <tb_2> ON <condition(s)>
-- 全外连结
FROM <tb_1> FULL OUTER JOIN <tb_2> ON <condition(s)>
左连结从两个表获取信息
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM Product AS P
LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id;
内连结只能选取出同时存在于两张表中的数据,
如果使用内连结的话,根据 SELECT 语句执行时商店库存状况的不同,结果的行数也会发生改变,生成的单据的版式也会受到影响
外连结能够得到固定行数的结果.
外连结会根据外连结的种类有选择地保留无法匹配到的行。
使用 LEFT 时 FROM 子句中写在左侧的表是主表,使用 RIGHT 时右侧的表是主表
SELECT
P.product_id,
P.product_name,
P.sale_price,
SP.shop_id,
SP.shop_name,
SP.quantity
FROM
Product AS P
LEFT OUTER JOIN - ( SELECT * FROM ShopProduct WHERE quantity < 50 ) AS SP ON SP.product_id = P.product_id
多表连结
inner join
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
,IP.inventory_quantity
FROM ShopProduct AS SP4
---
INNER JOIN Product AS P
ON SP.product_id = P.product_id
INNER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
---
WHERE IP.inventory_id = 'P001';
outer join
SELECT P.product_id
,P.product_name
,P.sale_price
,SP.shop_id
,SP.shop_name
,IP.inventory_quantity
FROM Product AS P
LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
使用相等判断的等值连结, 也可以使用比较运算符来进行连接. 实际上, 包括比较运算符(<,<=,>,>=, BETWEEN)和谓词运算(LIKE, IN, NOT 等等)在内的所有的逻辑运算都可以放在 ON 子句内作为连结条件。
SELECT product_id
,product_name
,sale_price
,COUNT(p2_id) AS my_rank
FROM (--使用自左连结对每种商品找出价格不低于它的商品
SELECT P1.product_id
,P1.product_name
,P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON P1.sale_price <= P2.sale_price
) AS X
GROUP BY product_id, product_name, sale_price
ORDER BY my_rank;
SELECT P1.product_id
,P1.product_name
,P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON P1.sale_price >= P2.sale_price
ORDER BY P1.sale_price,P1.product_id
对每种商品使用自左连结, 找出比该商品售价价格更低或相等的商品
SELECT product_id, product_name, sale_price
,SUM(P2_price) AS cum_price
FROM
(SELECT P1.product_id, P1.product_name, P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON ((P1.sale_price > P2.sale_price)
OR (P1.sale_price = P2.sale_price
AND P1.product_id<=P2.product_id))
ORDER BY P1.sale_price,P1.product_id) AS X
GROUP BY product_id, product_name, sale_price
ORDER BY sale_price,cum_price;
-- 1.使用关键字 CROSS JOIN 显式地进行交叉连结
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM ShopProduct AS SP
CROSS JOIN Product AS P;
--2.使用逗号分隔两个表,并省略 ON 子句
SELECT SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM ShopProduct AS SP , Product AS P;
select * from product where sale_price > 500
union
select * from product2 where sale_price > 500;
select * from product where product_id not in (select product_id from product2)
union
select * from product2 where product_id not in (select product_id from product);
select sp.shop_id, sp.shop_name, sp.quantity,
p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
inner join shopproduct as sp
on sp.product_id = p.product_id;
select product_type, max(sale_price) as maxp from product
group by product_type
select sp.shop_id, sp.shop_name, sp.quantity,
p.product_id, p.product_name, p.product_type, p.sale_price,
mp.maxp as '该类商品的最大售价'
from product as p
inner join shopproduct as sp
on sp.product_id = p.product_id
inner join (
select product_type, max(sale_price) as maxp from product
group by product_type
) as mp
on mp.product_type = p.product_type and p.sale_price = mp.maxp;
select p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
inner join (
select product_type, max(sale_price) as maxp from product
group by product_type
) as mp
on mp.product_type = p.product_type and p.sale_price = mp.maxp;
select p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
where sale_price = (
select max(sale_price) from product as p1
where p.product_type = p1.product_type
group by product_type
);