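A complete SQL statement is split into multiple clauses. As the clauses are processed they produce virtual tables (vt), but only the last virtual table is returned as the result. Starting from this idea, let's try to understand how a JOIN query is executed and answer some common questions.
If you do not yet have a feel for what the different JOIN types return, you can read on with this article as a companion.
The general structure of a JOIN query is as follows: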
SELECT select_list
FROM left_table
  [INNER | LEFT | RIGHT] JOIN right_table
  ON join_condition
WHERE where_condition
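Logically, its clauses are processed in the following order (the first clause of a SQL statement to be processed is always FROM): FROM, then ON, then JOIN (adding outer rows), then WHERE, and finally SELECT. This is the logical order; the optimizer may choose a different physical plan as long as the result is the same.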
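The example below walks through the join process described above (it is not good practice; it only serves to illustrate the JOIN syntax).
Create a user information table: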
CREATE TABLE `user_info` (
  `userid` int(11) NOT NULL,
  `name` varchar(255) NOT NULL,
  UNIQUE `userid` (`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Then create a user balance table:
CREATE TABLE `user_account` (
  `userid` int(11) NOT NULL,
  `money` bigint(20) NOT NULL,
  UNIQUE `userid` (`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Load some arbitrary data:
select * from user_info;
+--------+------+
| userid | name |
+--------+------+
|   1001 | x    |
|   1002 | y    |
|   1003 | z    |
|   1004 | a    |
|   1005 | b    |
|   1006 | c    |
|   1007 | d    |
|   1008 | e    |
+--------+------+
8 rows in set (0.00 sec)

select * from user_account;
+--------+-------+
| userid | money |
+--------+-------+
|   1001 |    22 |
|   1002 |    30 |
|   1003 |     8 |
|   1009 |    11 |
+--------+-------+
4 rows in set (0.00 sec)
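In total, 8 users have a name and 4 users have an account balance. Note that userid 1009 has a balance but no row in user_info.
To fetch the name and balance of the user whose userid is 1003, the SQL is: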
SELECT i.name, a.money
FROM user_info as i
  LEFT JOIN user_account as a
  ON i.userid = a.userid
WHERE a.userid = 1003;
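Step 1: execute the FROM clause, which takes the Cartesian product of the two tables
The Cartesian product returns every combination of rows from the two tables. The left table user_info has 8 rows and the right table user_account has 4 rows, so the resulting virtual table vt1 has 8 * 4 = 32 rows: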
SELECT * FROM user_info as i LEFT JOIN user_account as a ON 1;
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1001 |    22 |
|   1003 | z    |   1001 |    22 |
|   1004 | a    |   1001 |    22 |
|   1005 | b    |   1001 |    22 |
|   1006 | c    |   1001 |    22 |
|   1007 | d    |   1001 |    22 |
|   1008 | e    |   1001 |    22 |
|   1001 | x    |   1002 |    30 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1002 |    30 |
|   1004 | a    |   1002 |    30 |
|   1005 | b    |   1002 |    30 |
|   1006 | c    |   1002 |    30 |
|   1007 | d    |   1002 |    30 |
|   1008 | e    |   1002 |    30 |
|   1001 | x    |   1003 |     8 |
|   1002 | y    |   1003 |     8 |
|   1003 | z    |   1003 |     8 |
|   1004 | a    |   1003 |     8 |
|   1005 | b    |   1003 |     8 |
|   1006 | c    |   1003 |     8 |
|   1007 | d    |   1003 |     8 |
|   1008 | e    |   1003 |     8 |
|   1001 | x    |   1009 |    11 |
|   1002 | y    |   1009 |    11 |
|   1003 | z    |   1009 |    11 |
|   1004 | a    |   1009 |    11 |
|   1005 | b    |   1009 |    11 |
|   1006 | c    |   1009 |    11 |
|   1007 | d    |   1009 |    11 |
|   1008 | e    |   1009 |    11 |
+--------+------+--------+-------+
32 rows in set (0.00 sec)
Step 2: execute the ON clause to filter out rows that do not satisfy the condition
After filtering with ON i.userid = a.userid, vt2 is:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+
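Step 3: JOIN adds the outer rows
LEFT JOIN inserts into vt2 the rows of the left table that do not appear in vt2, filling the remaining columns of each such row with NULL. RIGHT JOIN does the same for the right table.
This example uses LEFT JOIN, so the remaining rows of the left table user_info are added, producing table vt3: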
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
|   1004 | a    |   NULL |  NULL |
|   1005 | b    |   NULL |  NULL |
|   1006 | c    |   NULL |  NULL |
|   1007 | d    |   NULL |  NULL |
|   1008 | e    |   NULL |  NULL |
+--------+------+--------+-------+
Step 4: filter with the WHERE condition
WHERE a.userid = 1003 produces table vt4:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+
Step 5: SELECT
SELECT i.name, a.money produces vt5:
+------+-------+
| name | money |
+------+-------+
| z    |     8 |
+------+-------+
The virtual table vt5 is returned to the client as the final result.
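The five steps can also be traced outside the database. The following is a minimal Python sketch, not how MySQL actually executes the query: it models each virtual table as a list of tuples, uses the sample data from the two tables above, and represents NULL as None. The variable names vt1 to vt5 mirror the text; everything else is an assumption made for illustration.

```python
# Sample data copied from the user_info and user_account tables above.
user_info = [(1001, "x"), (1002, "y"), (1003, "z"), (1004, "a"),
             (1005, "b"), (1006, "c"), (1007, "d"), (1008, "e")]
user_account = [(1001, 22), (1002, 30), (1003, 8), (1009, 11)]

# Step 1 (FROM): Cartesian product.
# Each row is (i.userid, i.name, a.userid, a.money).
vt1 = [i + a for i in user_info for a in user_account]

# Step 2 (ON): keep only rows where i.userid = a.userid.
vt2 = [row for row in vt1 if row[0] == row[2]]

# Step 3 (LEFT JOIN): add left-table rows that found no match,
# padding the right-table columns with None (standing in for NULL).
matched = {row[0] for row in vt2}
vt3 = vt2 + [i + (None, None) for i in user_info if i[0] not in matched]

# Step 4 (WHERE): a.userid = 1003. The padded rows have None here,
# so they are dropped, just as NULL = 1003 is not true in SQL.
vt4 = [row for row in vt3 if row[2] == 1003]

# Step 5 (SELECT): project i.name and a.money.
vt5 = [(row[1], row[3]) for row in vt4]

print(len(vt1), len(vt2), len(vt3), len(vt4))  # 32 3 8 1
print(vt5)                                     # [('z', 8)]
```

Looking up matches by userid in Step 3 only works here because userid is unique in both tables; a general implementation would track matched rows rather than matched keys.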
Now that we have covered the join process, let's look at how the common JOIN types differ.
INNER JOIN
Take Step 3 above (adding outer rows) as the example: if LEFT JOIN is replaced with INNER JOIN, this step is skipped, and the resulting vt3 is identical to vt2:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+
RIGHT JOIN
If LEFT JOIN is replaced with RIGHT JOIN, the resulting vt3 is:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
|   NULL | NULL |   1009 |    11 |
+--------+------+--------+-------+
Because user_account (the right table) contains a row with userid = 1009 for which user_info (the left table) has no matching row, the following row is inserted in Step 3:
| NULL | NULL | 1009 | 11 |
FULL JOIN
The article referenced above mentions the FULL JOIN defined by standard SQL. MySQL does not support it, but we can emulate FULL JOIN with LEFT JOIN + UNION + RIGHT JOIN:
SELECT * FROM user_info as i
  RIGHT JOIN user_account as a
  ON a.userid = i.userid
union
SELECT * FROM user_info as i
  LEFT JOIN user_account as a
  ON a.userid = i.userid;
It returns the following result:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
|   NULL | NULL |   1009 |    11 |
|   1004 | a    |   NULL |  NULL |
|   1005 | b    |   NULL |  NULL |
|   1006 | c    |   NULL |  NULL |
|   1007 | d    |   NULL |  NULL |
|   1008 | e    |   NULL |  NULL |
+--------+------+--------+-------+
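To see how the JOIN types differ only in which outer rows Step 3 adds, here is a small Python sketch under the same assumptions as the article's sample data. The helper join and its keep_left / keep_right flags are hypothetical names invented for this illustration; they are not part of MySQL or any library, and the row order it produces is not meant to match MySQL's output order.

```python
user_info = [(1001, "x"), (1002, "y"), (1003, "z"), (1004, "a"),
             (1005, "b"), (1006, "c"), (1007, "d"), (1008, "e")]
user_account = [(1001, 22), (1002, 30), (1003, 8), (1009, 11)]


def join(left, right, keep_left=False, keep_right=False):
    """Hypothetical helper: equi-join two-column tables on their first
    column, then optionally add the unmatched (outer) rows of either side,
    padded with None in place of NULL."""
    # Steps 1 and 2: Cartesian product filtered by left.userid = right.userid.
    rows = [l + r for l in left for r in right if l[0] == r[0]]
    left_matched = {row[0] for row in rows}
    right_matched = {row[2] for row in rows}
    # Step 3: add outer rows, depending on the JOIN type.
    if keep_left:
        rows += [l + (None, None) for l in left if l[0] not in left_matched]
    if keep_right:
        rows += [(None, None) + r for r in right if r[0] not in right_matched]
    return rows


inner_rows = join(user_info, user_account)                    # INNER JOIN
left_rows = join(user_info, user_account, keep_left=True)     # LEFT JOIN
right_rows = join(user_info, user_account, keep_right=True)   # RIGHT JOIN
full_rows = join(user_info, user_account, True, True)         # FULL JOIN

print(len(inner_rows), len(left_rows), len(right_rows), len(full_rows))
# 3 8 4 9
```

The full join has 9 rows: the 3 matches, the 5 left-only users, and the single right-only account 1009, which is the same set of rows the UNION query returns.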
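P.S. Semantically, LEFT JOIN and RIGHT JOIN are not really different; the difference in their results comes down to which table is placed on the left and which on the right. The following is quoted from the official MySQL documentation: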
RIGHT JOIN works analogously to LEFT JOIN. To keep code portable across databases, it is recommended that you use LEFT JOIN instead of RIGHT JOIN.
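So when you are torn between LEFT JOIN and RIGHT JOIN, stick to LEFT JOIN wherever possible.
With the execution order of JOIN clear, the difference between ON and WHERE becomes easy to understand.
For example: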
SELECT * FROM user_info as i LEFT JOIN user_account as a ON i.userid = a.userid and i.userid = 1003;
SELECT * FROM user_info as i LEFT JOIN user_account as a ON i.userid = a.userid where i.userid = 1003;
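In the first case, Step 2 (the ON clause) keeps the rows satisfying i.userid = a.userid and i.userid = 1003, producing vt2. Step 3 (the JOIN clause) then adds the outer rows to the virtual table, producing vt3, which is the final result: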
vt2:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+

vt3:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   NULL |  NULL |
|   1002 | y    |   NULL |  NULL |
|   1003 | z    |   1003 |     8 |
|   1004 | a    |   NULL |  NULL |
|   1005 | b    |   NULL |  NULL |
|   1006 | c    |   NULL |  NULL |
|   1007 | d    |   NULL |  NULL |
|   1008 | e    |   NULL |  NULL |
+--------+------+--------+-------+
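In the second case, Step 2 (the ON clause) keeps the rows satisfying i.userid = a.userid, producing vt2; Step 3 (the JOIN clause) adds the outer rows, producing vt3; and Step 4 (the WHERE clause) then filters vt3 to produce vt4, the final result: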
vt2:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+

vt3:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1001 | x    |   1001 |    22 |
|   1002 | y    |   1002 |    30 |
|   1003 | z    |   1003 |     8 |
|   1004 | a    |   NULL |  NULL |
|   1005 | b    |   NULL |  NULL |
|   1006 | c    |   NULL |  NULL |
|   1007 | d    |   NULL |  NULL |
|   1008 | e    |   NULL |  NULL |
+--------+------+--------+-------+

vt4:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+
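The two LEFT JOIN queries can be checked quickly without a MySQL server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module with an in-memory database instead of MySQL, simplified column types, and an added ORDER BY so the row order is deterministic. The join semantics being illustrated (ON filters before outer rows are added, WHERE filters after) are the same.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_info (userid INTEGER NOT NULL UNIQUE, name TEXT NOT NULL);
    CREATE TABLE user_account (userid INTEGER NOT NULL UNIQUE, money INTEGER NOT NULL);
    INSERT INTO user_info VALUES (1001,'x'),(1002,'y'),(1003,'z'),(1004,'a'),
                                 (1005,'b'),(1006,'c'),(1007,'d'),(1008,'e');
    INSERT INTO user_account VALUES (1001,22),(1002,30),(1003,8),(1009,11);
""")

# Case 1: the extra condition lives in ON, so it only limits which rows match;
# every left-table row still comes back as an outer row.
on_rows = conn.execute("""
    SELECT * FROM user_info AS i
    LEFT JOIN user_account AS a
      ON i.userid = a.userid AND i.userid = 1003
    ORDER BY i.userid
""").fetchall()

# Case 2: the extra condition lives in WHERE, so it filters the joined result
# after the outer rows have been added.
where_rows = conn.execute("""
    SELECT * FROM user_info AS i
    LEFT JOIN user_account AS a
      ON i.userid = a.userid
    WHERE i.userid = 1003
""").fetchall()

print(len(on_rows))   # 8
print(where_rows)     # [(1003, 'z', 1003, 8)]
conn.close()
```

In the first result only the row for 1003 carries account values; the other seven rows have NULL (None in Python) in the user_account columns.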
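If the LEFT JOIN in the example above is replaced with INNER JOIN, the result is the same whether the filter condition is placed in ON or in WHERE, because INNER JOIN does not perform Step 3 (adding outer rows):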
SELECT * FROM user_info as i INNER JOIN user_account as a ON i.userid = a.userid and i.userid = 1003;
SELECT * FROM user_info as i INNER JOIN user_account as a ON i.userid = a.userid where i.userid = 1003;
Both return:
+--------+------+--------+-------+
| userid | name | userid | money |
+--------+------+--------+-------+
|   1003 | z    |   1003 |     8 |
+--------+------+--------+-------+
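The reason the placement does not matter for INNER JOIN can be seen in a few lines of Python. This is only a sketch of the logical steps on the article's sample data, with the virtual tables modeled as lists of tuples; the names on_result and where_result are made up for the illustration.

```python
user_info = [(1001, "x"), (1002, "y"), (1003, "z"), (1004, "a"),
             (1005, "b"), (1006, "c"), (1007, "d"), (1008, "e")]
user_account = [(1001, 22), (1002, 30), (1003, 8), (1009, 11)]

# Step 1 (FROM): Cartesian product, rows are (i.userid, i.name, a.userid, a.money).
vt1 = [i + a for i in user_info for a in user_account]

# Condition in ON: both predicates are applied in Step 2.
on_result = [r for r in vt1 if r[0] == r[2] and r[0] == 1003]

# Condition in WHERE: Step 2 applies the join predicate, Step 4 the filter.
# With INNER JOIN there is no Step 3 in between to add rows back.
vt2 = [r for r in vt1 if r[0] == r[2]]
where_result = [r for r in vt2 if r[0] == 1003]

print(on_result == where_result)  # True
print(on_result)                  # [(1003, 'z', 1003, 8)]
```

Two filters applied one after the other keep exactly the rows that satisfy both predicates, so without the outer-row step between them the two forms are equivalent.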