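0. Source
- A plug up front: a few recommended websites for learning SQL online;
- These notes follow SQLBolt - Learn SQL with simple, interactive exercises.
- The site is in English and needs nothing more than a browser.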
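0. Introduction to SQL
- Note: SQL (Structured Query Language) has a common core standard, but the major relational databases (SQLite, MySQL, SQL Server, etc.) differ in their storage types and in some additional features.
The site teaches standard SQL throughout.
- A relational database can be pictured as a collection of related two-dimensional tables. Each table resembles an Excel sheet: it has named columns (attributes, fields), and each row holds one piece of data (a record).
- The site points out that learning SQL largely means answering questions such as "Which types of vehicles on the road have fewer than four wheels?" or "How many models of car does Tesla produce?". Such questions are answered by writing SQL statements that query the tables of a relational database.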
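1. SQL Lesson 1: SELECT queries
- The column perspective.
- This lesson simply introduces the syntax of the select statement: select is followed by the names of the columns to retrieve, and an asterisk (*) selects every column;
- The exercises only require changing the column names after select: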
Find the title of each film ✓
Find the director of each film ✓
Find the title and director of each film ✓
Find the title and year of each film ✓
Find all the information about each film ✓
SELECT title FROM movies;
SELECT director FROM movies;
SELECT title,director FROM movies;
SELECT title,year FROM movies;
SELECT * FROM movies;
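2. SQL Lesson 2: Queries with constraints (Pt. 1)
- The row perspective.
- Rows are filtered with the where clause;
- Common operators in a where clause: =, !=, <, <=, >=, >, <>;
between…and…; not between…and…;
in(…); not in(…);
- Operator examples (assuming a column named num is the one being constrained):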
…where num!=4; …where num between 1 and 10;
…where not num between 100 and 10000;
…where num in (1,3,9);
…where num not in (1,2,3);
- Exercises:
Find the movie with a row id of 6 ✓
Find the movies released in the years between 2000 and 2010 ✓
Find the movies not released in the years between 2000 and 2010 ✓
Find the first 5 Pixar movies and their release year ✓
SELECT * FROM movies where id=6;
SELECT * FROM movies where year between 2000 and 2010;
SELECT * FROM movies where year not between 2000 and 2010;
SELECT title, year FROM movies WHERE year <= 2003;
SELECT title,year FROM movies where id in (1,2,3,4,5);
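3. SQL Lesson 3: Queries with constraints (Pt. 2)
- Text columns have some operators of their own;
- Besides =, <>, !=, in and not in, there are like and not like, with the wildcards % and _;
- like tests whether text matches a pattern: % matches a string of any length (including an empty one) and _ matches exactly one character. For example, with a column named text_content as the one being constrained:
…where text_content like '张%' matches any string that starts with "张" (such as 张一 or 张二三五七八; a lone "张" matches too);
…where text_content like '陈_' matches "陈" followed by exactly one more character, so 陈真 matches, but 陈心心 does not, and neither does a lone 陈.
- Exercises: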
Find all the Toy Story movies ✓
Find all the movies directed by John Lasseter ✓
Find all the movies (and director) not directed by John Lasseter ✓
Find all the WALL-* movies ✓
SELECT * FROM movies where title like "Toy Story%";
SELECT * FROM movies where director="John Lasseter";
SELECT title,director FROM movies where director!="John Lasseter";
SELECT title,director FROM movies where director<>"John Lasseter";
SELECT title,director FROM movies where title like "WALL-%";
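To see the two wildcards side by side, here is a minimal sketch. It assumes SQLite, driven through Python's standard sqlite3 module purely so that it can be run as-is; the titles are sample rows, and "Toy Story 10" is made up for the illustration.

```python
import sqlite3

# Sample titles for illustration only; "Toy Story 10" is a made-up row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?)",
    [("Toy Story",), ("Toy Story 2",), ("Toy Story 3",), ("Toy Story 10",), ("WALL-E",)],
)


def titles_like(pattern):
    """Return the titles matching a LIKE pattern, in alphabetical order."""
    query = "SELECT title FROM movies WHERE title LIKE ? ORDER BY title"
    return [row[0] for row in conn.execute(query, (pattern,))]


# % matches any run of characters, including an empty one.
percent_matches = titles_like("Toy Story%")
# _ matches exactly one character, so "Toy Story" and "Toy Story 10" drop out.
underscore_matches = titles_like("Toy Story _")

print(percent_matches)
print(underscore_matches)
```

The second pattern keeps only the titles with a single character after "Toy Story ", which is the difference between the two wildcards.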
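4. SQL Lesson 4: Filtering and sorting Query results
- The distinct keyword removes duplicate rows from the query result.
- order by sorts the query result;
- limit num_limit offset num_offset (limit…offset…) returns a subset of the result: at most num_limit rows, after skipping the first num_offset rows.
- Exercises: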
List all directors of Pixar movies (alphabetically), without duplicates ✓
List the last four Pixar movies released (ordered from most recent to least) ✓
List the first five Pixar movies sorted alphabetically ✓
List the next five Pixar movies sorted alphabetically ✓
SELECT distinct director FROM movies order by director asc;
SELECT * FROM movies order by year desc limit 4;
SELECT * FROM movies order by title asc limit 5;
SELECT * FROM movies order by title asc limit 5 offset 5;
5. SQL Review: Simple SELECT Queries
- Exercises:
List all the Canadian cities and their populations ✓
Order all the cities in the United States by their latitude from north to south ✓
List all the cities west of Chicago, ordered from west to east ✓
List the two largest cities in Mexico (by population) ✓
List the third and fourth largest cities (by population) in the United States and their population ✓
SELECT city,population FROM north_american_cities where country='Canada';
SELECT * FROM north_american_cities
where country='United States' order by latitude desc;
select * from north_american_cities where longitude<
(SELECT longitude FROM north_american_cities where city='Chicago')
order by longitude;
select * from north_american_cities where country='Mexico'
order by population desc limit 2;
select city,population from north_american_cities
where country='United States'
order by population desc
limit 2 offset 2;
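6. SQL Lesson 6: Multi-table queries with JOINs
- Database normalization has advantages such as reducing redundant data, but it spreads the data across several tables, so some queries cannot be answered from a single table;
- Each entity in a table is uniquely identified by a primary key;
- join connects tables through primary and foreign keys, which makes multi-table queries possible; this lesson covers inner join;
- Note: inner join is often shortened to join, but writing inner join in full is recommended for readability;
- Exercises: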
SELECT m.title,b.domestic_sales,international_sales
FROM movies as m
inner join boxoffice as b on b.movie_id=m.id;
SELECT m.title,b.domestic_sales,international_sales
FROM movies as m
inner join boxoffice as b on b.movie_id=m.id
where b.domestic_sales<b.international_sales;
SELECT m.title,b.domestic_sales,international_sales
FROM movies as m
inner join boxoffice as b on b.movie_id=m.id
order by rating desc;
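7. SQL Lesson 7: OUTER JOINs
- inner join returns only the rows that have a match in both tables. When the data is asymmetric (some rows have no counterpart in the other table), left join, right join or full join is needed to keep the unmatched rows.
- Note: left outer join, right outer join and full outer join are kept for SQL-92 compatibility; dropping outer makes no difference.
- Exercises: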
Find the list of all buildings that have employees ✓
Find the list of all buildings and their capacity ✓
List all buildings and the distinct employee roles in each building (including empty buildings) ✓
SELECT distinct b.building_name FROM employees as e
inner join buildings as b
on e.building=b.building_name;
SELECT distinct building_name,capacity FROM buildings;
SELECT distinct e.role,b.building_name
FROM buildings as b
left join employees as e
on e.building=b.building_name;
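The difference between inner join and left join shows up as soon as one table has a row without a counterpart. The sketch below assumes SQLite via Python's sqlite3 module; the table layout mirrors the exercise, but the rows are made up, with building '2w' left empty on purpose. It sticks to left join because older SQLite versions do not support right join or full join.

```python
import sqlite3

# Made-up rows: building '2w' has no employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE buildings (building_name TEXT PRIMARY KEY, capacity INTEGER);
CREATE TABLE employees (name TEXT, role TEXT, building TEXT);
INSERT INTO buildings VALUES ('1e', 24), ('2w', 20);
INSERT INTO employees VALUES ('Ann', 'Engineer', '1e'), ('Bob', 'Artist', '1e');
""")

join_sql = """
SELECT DISTINCT b.building_name, e.role
FROM buildings AS b
{kind} JOIN employees AS e ON e.building = b.building_name
ORDER BY b.building_name, e.role
"""

# INNER JOIN keeps only buildings that have a matching employee.
inner_rows = conn.execute(join_sql.format(kind="INNER")).fetchall()
# LEFT JOIN keeps every building; missing employee columns come back as NULL.
left_rows = conn.execute(join_sql.format(kind="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

The empty building appears only in the left join result, with None (NULL) in the role column.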
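8. SQL Lesson 8: A short note on NULLs
- NULL values are best avoided in a database where possible, because they need special attention when building queries and constraints and when processing results (a NULL is in fact the absence of a value).
- If the database has to store incomplete data, though, NULL can be used.
- NULL differs from a default value (such as an empty string or 0). A default value can sometimes stand in for NULL, but in certain analyses it distorts the result of a calculation.
- When NULLs cannot be avoided (for instance after an outer join on asymmetric data), test for them with is null or is not null.
- Exercises: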
Find the name and role of all employees who have not been assigned to a building ✓
Find the names of the buildings that hold no employees ✓
SELECT name,role FROM employees
where building is null;
select b.building_name from buildings as b
left join employees as e
on e.building=b.building_name
group by b.building_name
having count(e.building)=0;
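How a default value can skew an analysis is easiest to see with an average. This is a small sketch under assumed data (three made-up employees, one with an unknown years_employed), run on SQLite through Python's sqlite3 module. COALESCE is used here only to simulate storing 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, years_employed INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 4), ("Bob", 6), ("Cy", None)],  # None is stored as NULL
)

# AVG skips NULL: (4 + 6) / 2
avg_with_null = conn.execute(
    "SELECT AVG(years_employed) FROM employees").fetchone()[0]
# Treating the missing value as 0 changes the answer: (4 + 6 + 0) / 3
avg_with_zero = conn.execute(
    "SELECT AVG(COALESCE(years_employed, 0)) FROM employees").fetchone()[0]

# "= NULL" never matches a row; IS NULL is the test that works.
equals_null = conn.execute(
    "SELECT name FROM employees WHERE years_employed = NULL").fetchall()
is_null = conn.execute(
    "SELECT name FROM employees WHERE years_employed IS NULL").fetchall()

print(avg_with_null, avg_with_zero)
print(equals_null, is_null)
```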
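9. SQL Lesson 9: Queries with expressions
- Naming an expression in the result with the as keyword improves readability;
- Each database comes with its own set of numeric, string and date functions.
- Exercises: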
List all movies and their combined sales in millions of dollars ✓
List all movies and their ratings in percent ✓
List all movies that were released on even number years ✓
SELECT m.title,
(b.domestic_sales+b.international_sales)/1000000 as combined_sales
FROM movies as m
inner join boxoffice as b
on m.id=b.movie_id;
SELECT m.title,b.rating*10 as rating_percentage
FROM movies as m
inner join boxoffice as b
on m.id=b.movie_id;
SELECT * FROM movies
where year%2=0;
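One detail worth checking when writing expressions is how the database divides integers. The sketch below assumes SQLite (other databases may behave differently), run through Python's sqlite3 module; the sales figures are sample values chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE boxoffice (movie_id INTEGER, domestic_sales INTEGER, "
    "international_sales INTEGER)"
)
# Sample figures, in dollars.
conn.execute("INSERT INTO boxoffice VALUES (1, 191796233, 170162503)")

query = """
SELECT (domestic_sales + international_sales) / 1000000   AS millions_truncated,
       (domestic_sales + international_sales) / 1000000.0 AS millions_exact
FROM boxoffice
"""
# Integer / integer truncates in SQLite; a decimal divisor keeps the fraction.
truncated, exact = conn.execute(query).fetchone()

print(truncated, exact)
```

If the fractional part matters, dividing by 1000000.0 (or casting one operand to a real number) is one way to keep it.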
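10. SQL Lesson 10: Queries with aggregates (Pt. 1)
- Common aggregate functions: count(), min(), max(), avg(), sum();
- count(*) counts every row, whereas count(column) skips the rows where that column is NULL;
- Aggregate functions are usually combined with group by (they also work on their own).
- Exercises: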
Find the longest time that an employee has been at the studio ✓
For each role, find the average number of years employed by employees in that role ✓
Find the total number of employee years worked in each building ✓
SELECT max(years_employed) FROM employees;
SELECT role,avg(years_employed) FROM employees
group by role;
SELECT building,sum(years_employed) FROM employees
group by building;
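The difference between count(*) and count(column) only shows when the column contains NULL. A minimal sketch, assuming SQLite via Python's sqlite3 module and three made-up employees, one of whom has no building assigned:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, building TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "1e"), ("Bob", "1e"), ("Cy", None)],  # Cy has no building (NULL)
)

# COUNT(*) counts rows; COUNT(building) counts non-NULL building values.
total_rows, rows_with_building = conn.execute(
    "SELECT COUNT(*), COUNT(building) FROM employees").fetchone()

# The same pair per group; the NULL group has a row but no building value.
per_building = conn.execute(
    "SELECT building, COUNT(*), COUNT(building) "
    "FROM employees GROUP BY building ORDER BY building").fetchall()

print(total_rows, rows_with_building)
print(per_building)
```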
11. SQL Lesson 11: Queries with aggregates (Pt. 2)
- where comes before group by, so where filters rows before they are grouped;
- having follows group by and filters the grouped rows.
- Exercises:
Find the number of Artists in the studio (without a HAVING clause)
Find the number of Employees of each role in the studio
Find the total number of years employed by all Engineers
SELECT count(role) FROM employees
where role='Artist';
SELECT role,count(*) FROM employees
group by role;
SELECT role,sum(years_employed) FROM employees
group by role
having role='Engineer';
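A single query can use both filters, which makes the division of labour between where and having visible. The sketch assumes SQLite via Python's sqlite3 module; the six employees are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, role TEXT, years_employed INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Engineer", 1), ("Bob", "Engineer", 3), ("Cy", "Engineer", 5),
     ("Di", "Artist", 2), ("Ed", "Artist", 4), ("Flo", "Manager", 6)],
)

query = """
SELECT role, COUNT(*) AS headcount
FROM employees
WHERE years_employed >= 2      -- row filter, applied before grouping
GROUP BY role
HAVING COUNT(*) >= 2           -- group filter, applied after grouping
ORDER BY role
"""
rows = conn.execute(query).fetchall()

print(rows)
```

Ann is removed by where before any counting happens, so Engineer ends up with a headcount of 2; Manager survives where but is dropped by having because its group has only one row.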
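12. SQL Lesson 12: Order of execution of a Query
- First, from and join (including any subqueries they contain) are executed to determine the working set of data;
- Second, where filters out part of the rows;
- Third, group by groups the remaining rows (the result has one row per unique value of the grouping column);
- Next, having, if present, filters the groups;
- Then the expressions in select are evaluated;
- After that, distinct discards duplicate rows;
- Next, order by sorts the result;
- Finally, limit…offset picks the final subset of rows.
- Exercises: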
Find the number of movies each director has directed
Find the total domestic and international sales that can be attributed to each director
SELECT director,count(*) FROM movies
group by director;
SELECT m.director,sum(b.domestic_sales+b.international_sales) as total_sales
FROM movies as m
inner join boxoffice as b
on b.movie_id=m.id
group by m.director;
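The execution order listed above can be traced on one query that uses most of the clauses. This is a sketch on SQLite via Python's sqlite3 module with a handful of sample rows; the numbered comments describe the logical order from the lesson, not how a particular engine optimizes the query internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, director TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Toy Story", "John Lasseter", 1995), ("Cars", "John Lasseter", 2006),
     ("Finding Nemo", "Andrew Stanton", 2003), ("WALL-E", "Andrew Stanton", 2008),
     ("Monsters, Inc.", "Pete Docter", 2001), ("Up", "Pete Docter", 2009),
     ("Brave", "Brenda Chapman", 2012)],
)

query = """
SELECT director, COUNT(*) AS n_movies   -- 5. evaluate the select expressions
FROM movies                             -- 1. pick the working set
WHERE year >= 2000                      -- 2. drop rows (Toy Story goes here)
GROUP BY director                       -- 3. one group per director
HAVING COUNT(*) >= 2                    -- 4. drop groups with a single movie
ORDER BY n_movies DESC, director        -- 7. sort (6, distinct, is not used)
LIMIT 1 OFFSET 1                        -- 8. skip one row, keep one
"""
rows = conn.execute(query).fetchall()

print(rows)
```

John Lasseter has two movies in the table, but only one is left after where, so his group is removed by having; of the two remaining directors, offset 1 skips the first.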
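13. SQL Lesson 13: Inserting rows
- A schema describes the structure of each table and the data type each column can hold;
- This fixed structure is what lets a database stay efficient and consistent.
- When inserting without column names, a value must be given for every column, in order; with column names, some columns can be left out, and the values only need to line up with the columns listed;
- An inserted value can also be an expression, such as 1214/520, as long as the data type matches.
- Exercises: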
Add the studio’s new production, Toy Story 4 to the list of movies (you can use any director)
Toy Story 4 has been released to critical acclaim! It had a rating of 8.7, and made 340 million domestically and 270 million internationally. Add the record to the BoxOffice table.
insert into movies(title,director,year)
values('Toy Story 4','John Lasseter',2020);
insert into boxoffice(movie_id,rating,domestic_sales,international_sales)
values(15,8.7,340000000,270000000);
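The two forms of insert described in this lesson can be compared directly. A minimal sketch, assuming SQLite via Python's sqlite3 module and a cut-down movies table defined just for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT, year INTEGER)"
)

# No column list: one value per column, in the order the table defines them.
conn.execute("INSERT INTO movies VALUES (1, 'Toy Story', 'John Lasseter', 1995)")

# Column list: only the named columns are filled; the value may be an expression.
# id is assigned by SQLite here, and director is left as NULL.
conn.execute("INSERT INTO movies (title, year) VALUES ('A Bug''s Life', 1990 + 8)")

rows = conn.execute("SELECT id, title, director, year FROM movies ORDER BY id").fetchall()

print(rows)
```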
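14. SQL Lesson 14: Updating rows
- Take great care when updating data, because a change is hard to undo once made;
- It therefore helps to run a select…where with the same condition first and check which rows it matches.
- Exercises: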
The director for A Bug’s Life is incorrect, it was actually directed by John Lasseter
The year that Toy Story 2 was released is incorrect, it was actually released in 1999
Both the title and director for Toy Story 8 is incorrect! The title should be “Toy Story 3” and it was directed by Lee Unkrich
UPDATE movies
SET director = "John Lasseter"
WHERE id = 2;
UPDATE movies
SET year= 1999
WHERE id = 3;
UPDATE movies
SET title = "Toy Story 3", director = "Lee Unkrich"
WHERE id = 11;
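The "query first, then update" habit can be sketched as follows. It assumes SQLite via Python's sqlite3 module; the rows are sample data with a placeholder director, and reusing one condition string for both statements is just one possible way to keep the preview and the update in sync.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT)")
conn.executemany(
    "INSERT INTO movies (title, director) VALUES (?, ?)",
    [("Toy Story", "John Lasseter"),
     ("A Bug's Life", "El Directore"),  # placeholder value to be corrected
     ("Toy Story 2", "John Lasseter")],
)

condition = "title = ?"
params = ("A Bug's Life",)

# Step 1: look at the rows the condition matches.
preview = conn.execute(
    f"SELECT id, title, director FROM movies WHERE {condition}", params).fetchall()

# Step 2: run the update with the very same condition.
cursor = conn.execute(
    f"UPDATE movies SET director = ? WHERE {condition}", ("John Lasseter",) + params)
updated = cursor.rowcount  # number of rows the UPDATE changed

after = conn.execute(
    f"SELECT id, title, director FROM movies WHERE {condition}", params).fetchall()

print(preview, updated, after)
```

If updated differs from the number of rows in preview, something other than the intended rows was touched.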
15. SQL Lesson 15: Deleting rows
- As with update, run a query before deleting to make sure the condition is right.
- Exercises:
This database is getting too big, lets remove all movies that were released before 2005. ✓
Andrew Stanton has also left the studio, so please remove all movies directed by him. ✓
delete FROM movies where year<2005;
delete FROM movies where director="Andrew Stanton";
16. SQL Lesson 16: Creating tables
- Data types include: integer, boolean, float, double, real, character(num_chars), varchar(num_chars), text, date, datetime, blob;
- Constraints include: primary key, foreign key, autoincrement (for integer values: the value is filled in automatically and incremented with each inserted row; not every database supports it), unique, not null, check(expression);
- Exercises:
Create a new table named Database with the following columns:
– Name A string (text) describing the name of the database
– Version A number (floating point) of the latest version of this database
– Download_count An integer count of the number of times this database was downloaded
This table has no constraints. ✓
create table database(name text,version float,download_count integer);
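The exercise table has no constraints, so here is a sketch of what a few of the listed constraints do in practice. It assumes SQLite via Python's sqlite3 module; the table name downloads and its constraints are invented for this example and are not part of the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE IF NOT EXISTS downloads (
    id INTEGER PRIMARY KEY AUTOINCREMENT,   -- filled in automatically
    name TEXT NOT NULL UNIQUE,              -- required, no duplicates
    version FLOAT CHECK (version > 0),      -- must be positive
    download_count INTEGER DEFAULT 0        -- used when no value is given
)""")

conn.execute("INSERT INTO downloads (name, version) VALUES ('SQLite', 3.4)")

# Both of these violate a constraint and are rejected by the database.
rejected = []
for name, version in [("SQLite", 3.5), ("Other", -1.0)]:
    try:
        conn.execute(
            "INSERT INTO downloads (name, version) VALUES (?, ?)", (name, version))
    except sqlite3.IntegrityError as exc:
        rejected.append((name, str(exc)))

rows = conn.execute("SELECT id, name, version, download_count FROM downloads").fetchall()

print(rows)
print(rejected)
```

The duplicate name fails the unique constraint and the negative version fails the check, so only the first row is stored.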
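17. SQL Lesson 17: Altering tables
- Columns can be added, removed or changed, and constraints added: for example add column adds a column, drop column_name removes one, and rename to new_table_name renames the table;
- Exercises: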
alter table movies
add column Aspect_ratio float;
alter table movies
add column Language TEXT
default "English";
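A short sketch of two of these alterations, assuming SQLite via Python's sqlite3 module. The new table name films is made up for the example; dropping a column is left out because older SQLite versions do not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO movies (title) VALUES ('Toy Story')")

# Existing rows pick up the default value of the new column.
conn.execute("ALTER TABLE movies ADD COLUMN language TEXT DEFAULT 'English'")
# Rename the table; the data and columns are kept.
conn.execute("ALTER TABLE movies RENAME TO films")

rows = conn.execute("SELECT id, title, language FROM films").fetchall()
# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(films)")]

print(rows)
print(columns)
```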
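18. SQL Lesson 18: Dropping tables
- Just as create table accepts if not exists, drop table accepts if exists, which avoids an error when the table is missing;
- If another table depends on the table being dropped (for example through a foreign key), the dependent rows must be removed or updated first.
- Exercises: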
We’ve sadly reached the end of our lessons, lets clean up by removing the Movies table ✓
And drop the BoxOffice table as well ✓
drop table if exists movies;
drop table if exists boxoffice;
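What if exists buys can be shown by dropping the same table more than once. A minimal sketch, assuming SQLite via Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS movies (id INTEGER PRIMARY KEY, title TEXT)")
# Running it again is a no-op thanks to IF NOT EXISTS.
conn.execute("CREATE TABLE IF NOT EXISTS movies (id INTEGER PRIMARY KEY, title TEXT)")

conn.execute("DROP TABLE IF EXISTS movies")
conn.execute("DROP TABLE IF EXISTS movies")  # table is already gone: no error

# Without IF EXISTS, dropping a missing table raises an error.
try:
    conn.execute("DROP TABLE movies")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True

remaining = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(plain_drop_failed, remaining)
```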
19. Farewell!