SQL

1 Introduction to SQL

  • Tables:records & fields
  • Comments: -- 表示单行备注;/* comments */用于多行备注
  • SQL language is case insensitive

2 Fundamentals

2.1 SELECT

2.1.1 Select columns

  • SELECT / SELECT DISTINCT
SELECT 'SQL is cool!' AS result;  #function as print: SQL is cool!
SELECT name from people;
SELECT name, country from people;
SELECT * from people;
SELECT TOP 5 * FROM people;   # inspect the beginning of a table 
SELECT DISTINCT country from films;    #select all the unique values from a column

2.2 Filter rows

  • WHERE
    StrComp(string1, string2, config),当string1=string2时返回0。设置config=0可使where语句中的string变为大小写敏感
SELECT * from films where release_year = 2016;     #filtering of numerical results
SELECT COUNT(*) from films where release_year < 2000;
SELECT name, birthdate from people where birthdate = '1974-11-11';  #filter text results
SELECT * from films where language = 'Spanish' AND release_year > 2000 AND release_year < 2010    #多条件查询
SELECT * from films where (language = 'Spanish' OR language = 'French')    #或有条件查询
SELECT * FROM people WHERE StrComp(name, 'Kate', 0)=0  # 只返回Kate的结果,而不包括kate

<> 和 != 表示不等于

  • BETWEEN (简化AND,但是是inclusive)
SELECT title, release_year from films where release_year BETWEEN 1990 AND 2000
  • WHERE IN (简化OR)
SELECT title, release_year from films where release_year IN (1990, 2000)
  • IS NULL / IS NOT NULL
    NULL represents a missing or unknown value.
SELECT name 
FROM people 
WHERE deathdate IS NULL  #Get the names of people who are still alive
  • LIKE / NOT LIKE(模糊查询)
    The % wildcard, as a placeholder, will match zero, one, or many characters in text.
    The _ wildcard, as a placeholder, will match a single character.
SELECT name from people where name LIKE 'B%';  #Get the names of all people whose names begin with 'B'
SELECT name from people where name LIKE '_r%';   #Get the names of people whose names have 'r' as the second letter

2.3 条件语句

  • CASE WHEN
    CASE WHEN的功能:categorize data, filter data, aggregate data
    CASE WHEN... AND... THEN 语句中可以包含多个逻辑条件
    可以用WHERE 语句中筛选条件语句,如需要返回结果不为NULL的结果→WHERE 整个条件语句 IS NOT NULL
# 可以有多个CASE WHEN-ELSE语句
SELECT date,
    CASE WHEN hometeam_id = 8634 THEN 'FC Barcelona' 
        ELSE 'Real Madrid CF' END AS home,
    CASE WHEN awayteam_id = 8634 THEN 'FC Barcelona' 
        ELSE 'Real Madrid CF' END AS away,   #categorize data
    CASE WHEN home_goal > away_goal AND hometeam_id = 8634 THEN 'Barcelona win!'
         WHEN home_goal > away_goal AND hometeam_id = 8633 THEN 'Real Madrid win!'
         WHEN home_goal < away_goal AND awayteam_id = 8634 THEN 'Barcelona win!'
         WHEN home_goal < away_goal AND awayteam_id = 8633 THEN'Real Madrid win!'
        ELSE 'Tie' END AS outcome
INTO B_R_comp    #INTO command to create this B_R_comp table
FROM matches_spain
WHERE (awayteam_id = 8634 OR hometeam_id = 8634)
      AND (awayteam_id = 8633 OR hometeam_id = 8633);

2.4 Aggregate functions

  • COUNT
SELECT COUNT(*) from people;  #count the number of records
SELECT COUNT(birthdate) from people  #count the number of non-missing birth dates
SELECT COUNT(DISTINCT birthdate) from people  #count the number of unique birth dates
  • SUM / AVG / MAX / MIN
SELECT AVG(age) from films;
  • Basic arithmetic (+,-,*,/)
SELECT (4 * 3) AS result;
SELECT MAX(budget) AS max_budget,
       (gross - budget) AS net_profit,
       AVG(duration) / 60.0 AS avg_duration_hours,
       COUNT(deathdate) * 100.0 / COUNT(*) AS death_rate
FROM films;
  • 整数相除得整数,小数相除得小数
    例子: 4.0/3.0 --- 1.333; 4/3 --- 1
  • AS用来给结果命名

2.5 Sort and Group

  • ORDER BY
    默认升序,升序为ASC,DESC为降序
SELECT title from films ORDER BY release_year DESC;  #降序排列
SELECT birthdate, name from people order by birthdate, name;   #多条件排序
  • GROUP BY
    功能如数据透视表
SELECT sex, count(*)
FROM employees
GROUP BY sex
ORDER BY count DESC;
  • HAVING
    HAVING相当于WHERE。aggregate functions不能用于WHERE语句,HAVING可以。
SELECT release_year
FROM films
GROUP BY release_year
HAVING COUNT(title) > 10
LIMIT 5;   #只显示前5条数据

LIMIT N, 只显示前N条数据

2.6 Join data

  • INNER JOIN
    选出两个表格中都有的数据
SELECT c.code, name, region, e.year, fertility_rate, unemployment_rate
  FROM countries AS c    #给表命名,方便选择
  INNER JOIN populations AS p
    ON c.code = p.country_code
  INNER JOIN economies AS e    #多表联立
    ON c.code = e.code AND e.year = p.year;   
#多表联立时,若存在b,c表中field name重复,需要明确key之间的关系以防止重复pair

当两表中要关联的关键字一致时,可以用USING()

SELECT *
FROM countries
  INNER JOIN economies
    USING(code)
  • Self-join
    用于找出同一filed内所有元素的组合(pair),比如说可以看同一个国家人口从2010年到2015年的涨幅
SELECT p1.country_code,
       p1.size AS size2010,
       p2.size AS size2015
       ((p2.size - p1.size)/p1.size * 100.0) AS growth_perc  #计算涨幅
FROM populations as p1
INNER JOIN populations as p2
ON p1.country_code = p2.country_code
AND p1.year = p2.year - 5  #消除自我匹配,5=2015-2010
  • Outer join (LEFT JOIN, RIGHT JOIN, FULL JOIN)
    LEFT JOIN 会返回左表所有的行,即使在右表中没有匹配的记录。右表中有时会有多条纪录可匹配左表,即右表被连接的键名重复出现,这时查询结果多于左表原始数据条数。
    RIGHT JOIN 会返回右表所有的行,即使在左表中没有匹配的记录。
    FULL JOIN 是左连接和右连接的并集

  • Cross join


    CROSS JOIN
  • Semi-join & anti-join
    当满足子查询时,semi-jion选择并返回第一张表中的记录。本质上是把子查询上拉到父查询中,与父查询的表做join/semi-join的操作。

SELECT president, country, continent
FROM presidents
WHERE country IN
    (SELECT name
      FROM states
      WHERE indep_year < 1800)

Semi-join与INNER JOIN等价

当不满足子查询时,anti-join选择并返回第一张表中的记录。It is particularly useful in identifying which records are causing an incorrect number of records to appear in join queries. 即A包含B,B不包含A的records。(用WHERE NOT IN)

SELECT code, name
FROM countries
WHERE continent = 'Oceania'
    AND code NOT IN
    (SELECT code
    FROM currencies);

2.7 Set theory clauses

Venn Diagrams


Venn Diagrams
  • UNION
    UNION 代表左表和右表的并集,交集不重复出现
    UNION ALL 代表左表和右表的并集,交集重复出现
SELECT *        
FROM economies2010
UNION
SELECT *
FROM economies2015
ORDER BY code, year

UNION也可以用来查看多个表中某个field的不重复结果

SELECT country_code
from cities
union
select code as country_code   #Determine all (non-duplicated) country codes in either the cities or the currencies table
from currencies
order by country_code

UNION和JOIN的区别:UNION相当于直接合并多表,JOIN相当于Vlookup

  • INTERSECT
    INTERSECT表示左表和右表的交集,在多数情况下和INNER JOIN类似。

INTERSECT和INNER JOIN的区别

  • Intersect is an operator and Inner join is a type of join.
  • Intersect can return matching null values but inner join can't.
  • Intersect doesn't return any duplicate values but inner join returns duplicate values if it's present in the tables.
  • INTERSECT does all columns, INNER JOIN only the specified columns.
  • EXCEPT(非)

2.8 Subqueries

  • inside WHERE clauses (用于semi-join/anti-join)
SELECT *
from populations
where life_expectancy > 
  1.15 * (select avg(life_expectancy)
  from populations
  where year = 2015)
  AND year = 2015
  • inside SELECT clauses(返回与inner join相同的结果)
#上下两个代码块返回相同结果
SELECT countries.name AS country, COUNT(*) AS cities_num
  FROM cities
    INNER JOIN countries
    ON countries.code = cities.country_code
GROUP BY country
ORDER BY cities_num DESC, country
LIMIT 9;
----------------------------------------------------------------------------------
SELECT name AS country,
  (SELECT COUNT(*)
   FROM cities
   WHERE countries.code = cities.country_code) AS cities_num
FROM countries
ORDER BY cities_num DESC, country
LIMIT 9;
  • inside FROM clauses
SELECT local_name, lang_num
FROM countries,
    (SELECT code, COUNT(*) AS lang_num
    FROM languages
    GROUP BY code) AS subquery
WHERE countries.code = subquery.code
ORDER BY lang_num DESC;

Built-in-fuction in SQL

  • DATE(), 返回当天日期
  • TIME(), 返回当时时间
  • NOW(), combine日期和时间
  • IsNULL(), 括号里面写field name, returns True if the field is empty
  • UCASE, LCASE, 改变string的大小写
  • CCUR, 改成货币格式
  • ROUND(., n)
  • 数学公式: ABS, SIN, COS, EXP, SQR
  • RND(), 返回[0, 1]之间的一个随机数字
  • FORMAT(., "Standard"), 改变数字为有逗号分隔和两位小数的标准格式

2.2 INSERT

INSERT INTO tablename VALUES (...)

INSERT INTO people
VALUES (101, 'Kevin', 'Male', 36, 3000000, #01/01/1980#, False, 999)
# with named arguments, can omit ID as automatically generated
INSERT INTO people (name, gender, age, motherID)
VALUES ('Kevin', 'Male', 36, 999)

2.3 UPDATE

UPDATE tablename SET ... WHERE ...

UPDATE people
SET age = 37, salary = 1000
WHERE ID = 101

2.4 DELETE

DELETE FROM tablename WHERE ...

DELETE FROM people
WHERE name = 'Kevin'

3 CREATE TABLE / DROP TABLE

CREATE TABLE people(           # 创建一个名为people的表
ID COUNTER PRIMARY KEY,               # COUNTER表示自动生成integer, PRIMARY KEY设置主键
name VARCHAR(20) NOT NULL,         # VARCHAR(n)设置string,NOT NULL设置field不允许为空
gender VARCHAR(20),
age INT,           # integer整数格式
salary DOUBLE,       #double precision decimal number
birthdate DATE,
dead BIT)            # boolean value(True/False, Yes/No. On/Off,...)
motherID INT NOT NULL REFERENCES mother(ID);  
# 保证参照完整性,即people表中motherID的输入必须参照mother表中的ID,不能乱写 

DROP TABLE people;     # 删除表

你可能感兴趣的:(SQL)