PostgreSQL was mentioned in the slides but you’ll find that these joins and the material here applies to different forms of SQL as well.
Recall from the video the basic syntax for an INNER JOIN, here including all columns in both tables:
SELECT *
FROM left_table
INNER JOIN right_table
ON left_table.id = right_table.id;
You’ll start off with a SELECT
statement and then build up to an inner join with the cities
and countries
tables.
Instruction 1:
Begin by selecting all columns from the cities
table.
SELECT *
FROM cities;
Instruction 2:
cities
table on the left to the countries
table on the right, keeping all of the fields in both tables.country_code
field in cities
and the code
field in countries
.cities
and countries
is fine for now.SELECT *
FROM cities
-- 1. Inner join to countries
INNER JOIN countries
-- 2. Match on the country codes
ON cities.country_code = countries.code;
Instruction 3:
Modify the SELECT
statement to keep only the name of the city, the name of the country, and the name of the region the country resides in.
Recall from our Intro to SQL course that you can alias fields using AS
. Alias the name of the city AS city
and the name of the country AS country
.
-- 1. Select name fields (with alias) and region
SELECT cities.name AS city, countries.name as country, region
FROM cities
INNER JOIN countries
ON cities.country_code = countries.code;
Instead of writing the full table name, you can use table aliasing as a shortcut. For tables you also use AS
to add the alias immediately after the table name with a space. Check out the aliasing of cities
and countries
below.
SELECT c1.name AS city, c2.name AS country
FROM cities AS c1
INNER JOIN countries AS c2
ON c1.country_code = c2.code;
Notice that to select a field in your query that appears in multiple tables, you’ll need to identify which table/table alias you’re referring to by using a .
in your SELECT
statement.
You’ll now explore a way to get data from both the countries
and economies
tables to examine the inflation rate for both 2010 and 2015.
Sometimes it’s easier to write SQL code out of order: you write the SELECT
statement after you’ve done the JOIN
.
Instruction
countries
(left) and economies
(right) aliasing countries AS c
and economies AS e
.ON
.SELECT
:
c.code
, aliased as country_code
.name
, year
, and inflation_rate
, not aliased.-- 3. Select fields with aliases
SELECT c.code AS country_code, c.name, e.year, e.inflation_rate
FROM countries AS c
-- 1. Join to economies (alias e)
INNER JOIN economies AS e
-- 2. Match on code
ON c.code= e.code;
The ability to combine multiple joins in a single query is a powerful feature of SQL, e.g:
SELECT *
FROM left_table
INNER JOIN right_table
ON left_table.id = right_table.id
INNER JOIN another_table
ON left_table.id = another_table.id;
As you can see here it becomes tedious to continually write long table names in joins. This is when it becomes useful to alias each table using the first letter of its name (e.g. countries AS c
)! It is standard practice to alias in this way and, if you choose to alias tables or are asked to specifically for an exercise in this course, you should follow this protocol.
Now, for each country, you want to get the country name, its region, and the fertility rate and unemployment rate for both 2010 and 2015.
Note that results should work throughout this course with or without table aliasing unless specified differently.
Instruction 1:
countries
(left) and populations
(right) on the code
and country_code
fields respectively.countries AS c
and populations AS p
.code
, name
, and region
from countries
and also select year
and fertility_rate
from populations
(5 fields in total).-- 4. Select fields
SELECT c.code, name, region, year, fertility_rate
-- 1. From countries (alias as c)
FROM countries AS c
-- 2. Join with populations (as p)
INNER JOIN populations AS p
-- 3. Match on country code
ON c.code = p.country_code;
Instruction 2:
economies
to your previous query by joining on code
.unemployment_rate
column that became available through joining with economies
.year
appears in both populations
and economies
, so you have to explicitly use e.year
instead of year
as you did before.-- 6. Select fields
SELECT c.code, name, region, e.year, fertility_rate, unemployment_rate
-- 1. From countries (alias as c)
FROM countries AS c
-- 2. Join to populations (as p)
INNER JOIN populations AS p
-- 3. Match on country code
ON c.code = p.country_code
-- 4. Join to economies (as e)
INNER JOIN economies AS e
-- 5. Match on country code
ON c.code = e.code;
Instruction 3:
c.code = e.code
and not also including year
is that e.g. the 2010 value for fertility_rate
is also paired with the 2015 value for unemployment_rate
.ON
clause, use AND
to add an additional joining condition. In addition to joining on code
in c
and e
, also join on year
in e
and p
.-- 6. Select fields
SELECT c.code, name, region, e.year, fertility_rate, unemployment_rate
-- 1. From countries (alias as c)
FROM countries AS c
-- 2. Join to populations (as p)
INNER JOIN populations AS p
-- 3. Match on country code
ON c.code = p.country_code
-- 4. Join to economies (as e)
INNER JOIN economies AS e
-- 5. Match on country code and year
ON c.code = e.code AND p.year = e.year;
Why does the following code result in an error?
SELECT c.name AS country, l.name AS language
FROM countries AS c
INNER JOIN languages AS l;
INNER JOIN
requires a specification of the key field (or fields) in each table.
When joining tables with a common field name, e.g.
SELECT *
FROM countries
INNER JOIN economies
ON countries.code = economies.code
You can use USING
as a shortcut:
SELECT *
FROM countries
INNER JOIN economies
USING(code)
You’ll now explore how this can be done with the countries
and languages
tables.
Instruction:
countries
on the left and languages
on the right with USING(code).
Remember to alias your tables using the first letter of their names.
-- 4. Select fields
SELECT c.name AS country, continent, l.name AS language, official
-- 1. From countries (alias as c)
FROM countries as c
-- 2. Join to languages (as l)
INNER JOIN languages as l
-- 3. Match using code
USING(code)
In this exercise, you’ll use the populations
table to perform a self-join to calculate the percentage increase in population from 2010 to 2015 for each country code!
Since you’ll be joining the populations
table to itself, you can alias populations
as p1
and also populations
as p2
. This is good practice whenever you are aliasing and your tables have the same first letter. Note that you are required to alias the tables with self-joins.
Instruction 1:
populations
with itself ON
country_code
.country_code
from p1
and the size
field from both p1
and p2
. SQL won’t allow same-named fields, so alias p1.size
as size2010
and p2.size
as size2015
.-- 4. Select fields with aliases
SELECT p1.country_code,
p1.size AS size2010,
p2.size AS size2015
-- 1. From populations (alias as p1)
FROM populations AS p1
-- 2. Join to itself (alias as p2)
INNER JOIN populations AS p2
-- 3. Match on country code
ON p1.country_code = p2.country_code;
Instruction 2:
Notice from the result that for each country_code
you have four entries laying out all combinations of 2010 and 2015.
Extend the ON
in your query to include only those records where the p1.year
(2010) matches with p2.year - 5
(2015 - 5 = 2010). This will omit the three entries per country_code
that you aren’t interested in.
-- 5. Select fields with aliases
SELECT p1.country_code,
p1.size AS size2010,
p2.size AS size2015
-- 1. From populations (alias as p1)
FROM populations AS p1
-- 2. Join to itself (alias as p2)
INNER JOIN populations AS p2
-- 3. Match on country code
ON p1.country_code = p2.country_code
-- 4. and year (with calculation)
AND p1.year = p2.year - 5;
Instruction 3:
As you just saw, you can also use SQL to calculate values like p2.year - 5
for you. With two fields like size2010
and size2015
, you may want to determine the percentage increase from one field to the next:
With two numeric fields A and B, the percentage growth from A to B can be calculated as (B−A)/A∗100.0.
Add a new field to SELECT
, aliased as growth_perc
, that calculates the percentage population growth from 2010 to 2015 for each country, using p2.size
and p1.size
.
SELECT p1.country_code,
p1.size AS size2010,
p2.size AS size2015,
-- 1. calculate growth_perc