Create the table and load the data
```sql
CREATE TABLE sales (
    id INT,
    salesperson STRING,
    region STRING,
    sales_amount INT,
    sale_date DATE
);

INSERT INTO sales (id, salesperson, region, sales_amount, sale_date)
VALUES
    (1, 'Alice', 'North', 1000, '2023-01-01'),
    (2, 'Bob', 'South', 1500, '2023-01-02'),
    (3, 'Alice', 'North', 2000, '2023-01-03'),
    (4, 'Charlie', 'East', 1200, '2023-01-04'),
    (5, 'Bob', 'South', 1800, '2023-01-05'),
    (6, 'Alice', 'North', 2500, '2023-01-06'),
    (7, 'Charlie', 'East', 1300, '2023-01-07'),
    (8, 'Bob', 'South', 2200, '2023-01-08'),
    (9, 'Alice', 'North', 3000, '2023-01-09'),
    (10, 'Charlie', 'East', 1400, '2023-01-10');
```
Sample data table: sales

| id | salesperson | region | sales_amount | sale_date |
| --- | --- | --- | --- | --- |
| 1 | Alice | North | 1000 | 2023-01-01 |
| 2 | Bob | South | 1500 | 2023-01-02 |
| 3 | Alice | North | 2000 | 2023-01-03 |
| 4 | Charlie | East | 1200 | 2023-01-04 |
| 5 | Bob | South | 1800 | 2023-01-05 |
| 6 | Alice | North | 2500 | 2023-01-06 |
| 7 | Charlie | East | 1300 | 2023-01-07 |
| 8 | Bob | South | 2200 | 2023-01-08 |
| 9 | Alice | North | 3000 | 2023-01-09 |
| 10 | Charlie | East | 1400 | 2023-01-10 |

Classify each sale as Low, Medium, or High based on its amount:
```sql
SELECT
    salesperson,
    sales_amount,
    CASE
        WHEN sales_amount < 1500 THEN 'Low'
        WHEN sales_amount BETWEEN 1500 AND 2500 THEN 'Medium'
        ELSE 'High'
    END AS sales_category
FROM
    sales;
```
Result:

| salesperson | sales_amount | sales_category |
| --- | --- | --- |
| Alice | 1000 | Low |
| Bob | 1500 | Medium |
| Alice | 2000 | Medium |
| Charlie | 1200 | Low |
| Bob | 1800 | Medium |
| Alice | 2500 | Medium |
| Charlie | 1300 | Low |
| Bob | 2200 | Medium |
| Alice | 3000 | High |
| Charlie | 1400 | Low |

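
A CASE expression tests its WHEN branches from top to bottom and returns the first one that matches. The sketch below relies on that: the second branch is written as `<= 2500` without repeating the lower bound, and the rows in each category are then counted. It is a minimal illustration, assuming Python's built-in `sqlite3` module as a stand-in engine; the `TEXT` column type (in place of `STRING`), the subquery name `categorized`, and the `num_sales` column are choices made for this example only.

```python
import sqlite3

# Same rows as the sales table above; TEXT replaces STRING for SQLite.
SALES_ROWS = [
    (1, "Alice", "North", 1000, "2023-01-01"),
    (2, "Bob", "South", 1500, "2023-01-02"),
    (3, "Alice", "North", 2000, "2023-01-03"),
    (4, "Charlie", "East", 1200, "2023-01-04"),
    (5, "Bob", "South", 1800, "2023-01-05"),
    (6, "Alice", "North", 2500, "2023-01-06"),
    (7, "Charlie", "East", 1300, "2023-01-07"),
    (8, "Bob", "South", 2200, "2023-01-08"),
    (9, "Alice", "North", 3000, "2023-01-09"),
    (10, "Charlie", "East", 1400, "2023-01-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INT, salesperson TEXT, region TEXT, "
    "sales_amount INT, sale_date TEXT)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", SALES_ROWS)

# The first matching WHEN wins, so '<= 2500' only sees amounts >= 1500.
category_counts = dict(conn.execute("""
    SELECT sales_category, COUNT(*) AS num_sales
    FROM (
        SELECT
            CASE
                WHEN sales_amount < 1500 THEN 'Low'
                WHEN sales_amount <= 2500 THEN 'Medium'
                ELSE 'High'
            END AS sales_category
        FROM sales
    ) AS categorized
    GROUP BY sales_category
""").fetchall())

print(category_counts)
```

On the sample data this gives 4 Low, 5 Medium, and 1 High sale, which matches the row-by-row result above.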
2. SUM(CASE WHEN)
Example
Calculate each salesperson's total sales per region, with one column per salesperson:
```sql
SELECT
    region,
    SUM(CASE WHEN salesperson = 'Alice' THEN sales_amount ELSE 0 END) AS alice_sales,
    SUM(CASE WHEN salesperson = 'Bob' THEN sales_amount ELSE 0 END) AS bob_sales,
    SUM(CASE WHEN salesperson = 'Charlie' THEN sales_amount ELSE 0 END) AS charlie_sales
FROM
    sales
GROUP BY
    region;
```
Result:

| region | alice_sales | bob_sales | charlie_sales |
| --- | --- | --- | --- |
| North | 8500 | 0 | 0 |
| South | 0 | 5500 | 0 |
| East | 0 | 0 | 3900 |

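
The same pattern can count rows instead of summing amounts: emitting 1 for matching rows and 0 otherwise turns SUM into a conditional counter. The sketch below is one way to try this, assuming Python's built-in `sqlite3` module as the engine. The 2000 threshold and the `large_sales` column name are hypothetical, picked only for illustration.

```python
import sqlite3

# Same rows as the sales table above; TEXT replaces STRING for SQLite.
SALES_ROWS = [
    (1, "Alice", "North", 1000, "2023-01-01"),
    (2, "Bob", "South", 1500, "2023-01-02"),
    (3, "Alice", "North", 2000, "2023-01-03"),
    (4, "Charlie", "East", 1200, "2023-01-04"),
    (5, "Bob", "South", 1800, "2023-01-05"),
    (6, "Alice", "North", 2500, "2023-01-06"),
    (7, "Charlie", "East", 1300, "2023-01-07"),
    (8, "Bob", "South", 2200, "2023-01-08"),
    (9, "Alice", "North", 3000, "2023-01-09"),
    (10, "Charlie", "East", 1400, "2023-01-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INT, salesperson TEXT, region TEXT, "
    "sales_amount INT, sale_date TEXT)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", SALES_ROWS)

# SUM over a 1/0 flag counts the rows that satisfy the condition.
# The 2000 threshold is a hypothetical cutoff for this example.
rows = conn.execute("""
    SELECT
        salesperson,
        SUM(sales_amount) AS total_sales,
        SUM(CASE WHEN sales_amount >= 2000 THEN 1 ELSE 0 END) AS large_sales
    FROM sales
    GROUP BY salesperson
    ORDER BY salesperson
""").fetchall()

for salesperson, total_sales, large_sales in rows:
    print(salesperson, total_sales, large_sales)
```

Keeping `ELSE 0` matters here: without it, a group with no matching rows would sum only NULLs and return NULL rather than 0.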
3. RANK()
Example
Rank every sale by amount, highest first:
```sql
SELECT
    salesperson,
    sales_amount,
    RANK() OVER (ORDER BY sales_amount DESC) AS sales_rank
FROM
    sales;
```
Result:

| salesperson | sales_amount | sales_rank |
| --- | --- | --- |
| Alice | 3000 | 1 |
| Alice | 2500 | 2 |
| Bob | 2200 | 3 |
| Alice | 2000 | 4 |
| Bob | 1800 | 5 |
| Bob | 1500 | 6 |
| Charlie | 1400 | 7 |
| Charlie | 1300 | 8 |
| Charlie | 1200 | 9 |
| Alice | 1000 | 10 |

4. ROW_NUMBER()
Example
Number each salesperson's sales from 1, ordered by amount (highest first):
```sql
SELECT
    salesperson,
    sales_amount,
    ROW_NUMBER() OVER (PARTITION BY salesperson ORDER BY sales_amount DESC) AS row_num
FROM
    sales;
```
Result:

| salesperson | sales_amount | row_num |
| --- | --- | --- |
| Alice | 3000 | 1 |
| Alice | 2500 | 2 |
| Alice | 2000 | 3 |
| Alice | 1000 | 4 |
| Bob | 2200 | 1 |
| Bob | 1800 | 2 |
| Bob | 1500 | 3 |
| Charlie | 1400 | 1 |
| Charlie | 1300 | 2 |
| Charlie | 1200 | 3 |

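
Because the numbering restarts at 1 inside each partition, filtering on `row_num = 1` is a common way to keep only the top row per group. The sketch below uses that to pick each salesperson's largest sale. It assumes Python's built-in `sqlite3` module linked against SQLite 3.25 or later (the first release with window functions); the subquery alias `ranked` is a name introduced for this example.

```python
import sqlite3

# Same rows as the sales table above; TEXT replaces STRING for SQLite.
SALES_ROWS = [
    (1, "Alice", "North", 1000, "2023-01-01"),
    (2, "Bob", "South", 1500, "2023-01-02"),
    (3, "Alice", "North", 2000, "2023-01-03"),
    (4, "Charlie", "East", 1200, "2023-01-04"),
    (5, "Bob", "South", 1800, "2023-01-05"),
    (6, "Alice", "North", 2500, "2023-01-06"),
    (7, "Charlie", "East", 1300, "2023-01-07"),
    (8, "Bob", "South", 2200, "2023-01-08"),
    (9, "Alice", "North", 3000, "2023-01-09"),
    (10, "Charlie", "East", 1400, "2023-01-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INT, salesperson TEXT, region TEXT, "
    "sales_amount INT, sale_date TEXT)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", SALES_ROWS)

# A window function cannot be used directly in WHERE, so the numbering
# is computed in a subquery and filtered in the outer query.
top_sales = conn.execute("""
    SELECT salesperson, sales_amount, sale_date
    FROM (
        SELECT
            salesperson,
            sales_amount,
            sale_date,
            ROW_NUMBER() OVER (
                PARTITION BY salesperson
                ORDER BY sales_amount DESC
            ) AS row_num
        FROM sales
    ) AS ranked
    WHERE row_num = 1
    ORDER BY salesperson
""").fetchall()

for row in top_sales:
    print(row)
```

If two sales by the same person had equal amounts, ROW_NUMBER() would still return exactly one of them, chosen arbitrarily unless the ORDER BY includes a tie-breaker column.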
5. DENSE_RANK()
Example
Dense-rank every sale by amount, highest first (no rank numbers are skipped after ties):
```sql
SELECT
    salesperson,
    sales_amount,
    DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS dense_rank
FROM
    sales;
```
Result:

| salesperson | sales_amount | dense_rank |
| --- | --- | --- |
| Alice | 3000 | 1 |
| Alice | 2500 | 2 |
| Bob | 2200 | 3 |
| Alice | 2000 | 4 |
| Bob | 1800 | 5 |
| Bob | 1500 | 6 |
| Charlie | 1400 | 7 |
| Charlie | 1300 | 8 |
| Charlie | 1200 | 9 |
| Alice | 1000 | 10 |

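
The sample data contains no tied amounts, so DENSE_RANK() and RANK() produce the same numbers on it. The difference only shows up with ties. The sketch below runs all three numbering functions side by side on a small hypothetical table, `tied_sales`, whose rows (including the extra salesperson Dana) are invented for this illustration and are not part of the sales data. It assumes Python's built-in `sqlite3` module with SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: Bob and Charlie are tied at 2500.
conn.execute("CREATE TABLE tied_sales (salesperson TEXT, sales_amount INT)")
conn.executemany(
    "INSERT INTO tied_sales VALUES (?, ?)",
    [("Alice", 3000), ("Bob", 2500), ("Charlie", 2500), ("Dana", 2000)],
)

# ROW_NUMBER gets salesperson as a tie-breaker so its output is deterministic.
rows = conn.execute("""
    SELECT
        salesperson,
        sales_amount,
        RANK()       OVER (ORDER BY sales_amount DESC) AS rnk,
        DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS dense_rnk,
        ROW_NUMBER() OVER (ORDER BY sales_amount DESC, salesperson) AS row_num
    FROM tied_sales
    ORDER BY sales_amount DESC, salesperson
""").fetchall()

for row in rows:
    print(row)
# ('Alice', 3000, 1, 1, 1)
# ('Bob', 2500, 2, 2, 2)
# ('Charlie', 2500, 2, 2, 3)
# ('Dana', 2000, 4, 3, 4)
```

After the tie at 2500, RANK() jumps to 4, DENSE_RANK() continues with 3, and ROW_NUMBER() never repeats a value.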
Summary
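
- CASE WHEN: evaluates conditions row by row and produces a new column.
- SUM(CASE WHEN): aggregates only the values that match a condition.
- RANK(): ranks rows; tied rows share a rank, and the ranks that follow are skipped.
- ROW_NUMBER(): assigns a unique row number to every row, even when values are identical.
- DENSE_RANK(): ranks rows; tied rows share a rank, and no ranks are skipped afterward.
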
Together, these examples and their results show what each function does and how to use it.