Nowcoder Written-Test Questions (1)

(1)

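Several products can belong to the same brand. The question asks for the sales of each brand, so the sale amounts have to be aggregated by brand.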
WITH c AS (
    SELECT a.logday, b.brand_name, SUM(a.sale_amt) AS sum_sales
    FROM a
    INNER JOIN b ON a.SKU_ID = b.SKU_ID
    WHERE a.logday BETWEEN '2017-01-01' AND '2017-12-31'
      AND user_name = '小明'
    GROUP BY a.logday, b.brand_name
)
SELECT *
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY brand_name ORDER BY sum_sales DESC) AS rnk
      FROM c) aa
WHERE rnk < 4

[Err] ERROR:  column "rnk" does not exist
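This error appears when rnk is filtered in the same SELECT that defines it. SQL evaluates WHERE before the SELECT list, so the alias rnk does not exist yet at that point, and window functions are not allowed in WHERE in any case. The query above avoids the problem by computing rnk in a subquery and filtering on it in the outer query.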
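The behaviour can be reproduced on toy data. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module (SQLite 3.25 or newer, for window functions) as a stand-in for the PostgreSQL database used in this post, and the `sales` table and its rows are made up. SQLite words the error differently, but it also rejects a filter on the window alias at the same query level, while the subquery form works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, already-aggregated daily sales per brand.
conn.executescript("""
    CREATE TABLE sales (brand_name TEXT, logday TEXT, sum_sales REAL);
    INSERT INTO sales VALUES
        ('X', '2017-01-01', 10), ('X', '2017-01-02', 40),
        ('X', '2017-01-03', 30), ('X', '2017-01-04', 20),
        ('Y', '2017-01-01', 5);
""")

# Filtering on the window alias in the SELECT that defines it is rejected.
same_level = """
    SELECT brand_name, logday,
           ROW_NUMBER() OVER (PARTITION BY brand_name
                              ORDER BY sum_sales DESC) AS rnk
    FROM sales
    WHERE rnk < 4
"""
try:
    conn.execute(same_level).fetchall()
    same_level_failed = False
except sqlite3.Error as exc:
    same_level_failed = True
    print("rejected:", exc)

# Computing rnk in a subquery turns it into an ordinary column outside.
nested = """
    SELECT brand_name, logday, rnk
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY brand_name
                                    ORDER BY sum_sales DESC) AS rnk
          FROM sales) ranked
    WHERE rnk < 4
    ORDER BY brand_name, rnk
"""
top3 = conn.execute(nested).fetchall()
print(top3)
```

Brand X has four days of data, so only its three best days come back; brand Y has a single day and keeps it.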

 

(2)

Method 1

WITH cte_1 AS (
    -- daily sales per brand for 小明 in 2017
    SELECT logday, brand_name, SUM(sale_amt) AS sale_amt
    FROM a
    INNER JOIN b ON a.SKU_ID = b.SKU_ID
    WHERE EXTRACT(YEAR FROM logday) = 2017 AND user_name = '小明'
    GROUP BY logday, brand_name
),
cte_2 AS (
    -- keep only the days that grew more than 50% over the brand's previous row;
    -- idd is the day index, and qty = idd - ROW_NUMBER() stays constant
    -- within a run of consecutive days
    SELECT *,
           (logday - (SELECT MIN(logday) FROM cte_1)) + 1 AS idd,
           ((logday - (SELECT MIN(logday) FROM cte_1)) + 1
             - ROW_NUMBER() OVER (PARTITION BY brand_name ORDER BY logday)) AS qty
    FROM (SELECT *,
                 (sale_amt::NUMERIC
                   / LAG(sale_amt, 1) OVER (PARTITION BY brand_name ORDER BY logday)::NUMERIC) - 1 AS ratio
          FROM cte_1) ojbk
    WHERE ratio > 0.5
)
--SELECT * FROM cte_2
SELECT logday, brand_name, sale_amt
FROM cte_2
WHERE brand_name IN (
    SELECT brand_name
    FROM cte_2
    GROUP BY brand_name
    HAVING COUNT(*) >= 3
       AND COUNT(DISTINCT qty) = 1)
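
The growth-rate step of this query can be tried in isolation. The following is a minimal sketch, not the original setup: it runs on SQLite through Python's sqlite3 module (3.25 or newer assumed) instead of PostgreSQL, and the `daily` table and its values are invented for illustration. As in the query above, the ratio is computed in a subquery so that the outer WHERE can filter on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical daily totals for a single brand.
conn.executescript("""
    CREATE TABLE daily (brand_name TEXT, logday TEXT, sale_amt REAL);
    INSERT INTO daily VALUES
        ('X', '2017-01-01', 100), ('X', '2017-01-02', 160),
        ('X', '2017-01-03', 200), ('X', '2017-01-04', 400);
""")

rows = conn.execute("""
    SELECT logday, ratio
    FROM (SELECT logday,
                 -- growth relative to the brand's previous row
                 sale_amt / LAG(sale_amt, 1) OVER (PARTITION BY brand_name
                                                   ORDER BY logday) - 1 AS ratio
          FROM daily) t
    WHERE ratio > 0.5
    ORDER BY logday
""").fetchall()

growth_days = [logday for logday, _ in rows]
print(growth_days)
```

The first day has no previous row, so its ratio is NULL and the filter drops it. Note that LAG looks at the previous row, not the previous calendar day, so a missing day would be silently skipped over.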

 

Method 2

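1. The self-join in cte_2 matches rows of the same brand and, for each start date, pulls in the days that follow it (the start day plus the next three days).
2. A window function then computes qty for each brand and start date: the number of rows left in that window.
3. Compute the amount that sales must exceed to count as 50% growth, that is, 1.5 times the previous day's sales.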

-- Finally got it working
WITH cte_1 AS (
    SELECT a.logday, b.brand_name, SUM(a.sale_amt) AS sum_sales
    FROM a
    INNER JOIN b ON a.SKU_ID = b.SKU_ID
    WHERE a.logday BETWEEN '2017-01-01' AND '2017-12-31'
      AND user_name = '小明'
    GROUP BY a.logday, b.brand_name
),
cte_2 AS (
    -- WHERE cannot use window functions, so rows failing sum_sales > last_amt
    -- cannot be filtered out here; C.logday and C.brand_name are carried into cte_3
    SELECT C.logday AS c_logday, C.brand_name AS c_brand_name, D.*,
           COALESCE(1.5 * LAG(D.sum_sales, 1) OVER (PARTITION BY C.logday, C.brand_name ORDER BY D.logday ASC), 0) AS last_amt
    FROM cte_1 C
    JOIN cte_1 D
      ON C.brand_name = D.brand_name
     AND D.logday BETWEEN C.logday AND C.logday + 3 * INTERVAL '1 day'
),
cte_3 AS (
    -- the key is the INTERVAL condition above; partitioning by C's logday and
    -- brand_name then gives qty, the number of rows left in each 4-day window
    SELECT *,
           COUNT(*) OVER (PARTITION BY c_logday, c_brand_name) AS qty
    FROM cte_2
    WHERE sum_sales > last_amt -- drops the failing 2017-01-02 row, so that window ends up with qty != 4
)
SELECT logday, brand_name, sum_sales
FROM cte_3
WHERE qty = 4
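
To see why qty = 4 identifies a qualifying window, the same idea can be run on a handful of rows. This is a sketch under assumptions, not the original test setup: it uses SQLite via Python's sqlite3 module (3.25 or newer) rather than PostgreSQL, so the upper bound of the window is written as `date(C.logday, '+3 days')` instead of an INTERVAL expression, and the `daily` table, the CTE names and the data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical daily totals: growth exceeds 50% on Jan 2, 3 and 4, but not on Jan 5.
conn.executescript("""
    CREATE TABLE daily (brand_name TEXT, logday TEXT, sum_sales REAL);
    INSERT INTO daily VALUES
        ('X', '2017-01-01', 100), ('X', '2017-01-02', 160),
        ('X', '2017-01-03', 250), ('X', '2017-01-04', 400),
        ('X', '2017-01-05', 410);
""")

rows = conn.execute("""
    WITH win AS (
        -- one 4-day window (start day + 3 days) per brand and start day
        SELECT C.brand_name, C.logday AS start_day, D.logday, D.sum_sales,
               COALESCE(1.5 * LAG(D.sum_sales, 1) OVER (
                            PARTITION BY C.brand_name, C.logday
                            ORDER BY D.logday), 0) AS threshold
        FROM daily C
        JOIN daily D
          ON C.brand_name = D.brand_name
         AND D.logday BETWEEN C.logday AND date(C.logday, '+3 days')
    ),
    kept AS (
        -- COUNT(*) OVER runs after WHERE, so it counts only surviving rows
        SELECT *, COUNT(*) OVER (PARTITION BY brand_name, start_day) AS qty
        FROM win
        WHERE sum_sales > threshold
    )
    SELECT start_day, logday, sum_sales
    FROM kept
    WHERE qty = 4
    ORDER BY logday
""").fetchall()

start_days = {r[0] for r in rows}
window_days = [r[1] for r in rows]
print(start_days, window_days)
```

Only the window starting on 2017-01-01 keeps all four rows. The window starting on 2017-01-02 loses its last row (410 is not above 1.5 × 400), and later windows contain fewer than four days to begin with.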

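The two methods differ in the order of the two checks: method 1 filters on the growth rate first and then looks for consecutive days, while method 2 builds the window of consecutive days first and then checks the growth. The second method is adapted from a solution by a far more experienced user, and I only finished it at 2:30 in the morning. The hard part is extracting the "islands" from the date data: picking out runs of consecutive dates requires a group identifier. Because the outcome depends on that order, I deliberately added some distractor rows when building the test data. I am still a beginner, but I am quite happy to have solved it in two different ways.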
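
The group identifier mentioned above can be sketched on its own. The example below is an illustration only: it assumes SQLite through Python's sqlite3 module (3.25 or newer) instead of PostgreSQL, and the `hits` table (days that already passed the growth filter) and the fixed reference date are made up. Within a run of consecutive dates, the day number and ROW_NUMBER() both increase by one per row, so their difference stays constant and can serve as the group key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical days that already passed the "> 50% growth" filter.
conn.executescript("""
    CREATE TABLE hits (brand_name TEXT, logday TEXT);
    INSERT INTO hits VALUES
        ('X', '2017-01-02'), ('X', '2017-01-03'), ('X', '2017-01-04'),
        ('X', '2017-01-08'),
        ('Y', '2017-01-02'), ('Y', '2017-01-04'), ('Y', '2017-01-06');
""")

islands = conn.execute("""
    WITH tagged AS (
        SELECT brand_name, logday,
               -- day number minus row number: constant inside one run
               CAST(julianday(logday) - julianday('2017-01-01') AS INTEGER)
                 - ROW_NUMBER() OVER (PARTITION BY brand_name
                                      ORDER BY logday) AS grp
        FROM hits
    )
    SELECT brand_name, MIN(logday), MAX(logday), COUNT(*)
    FROM tagged
    GROUP BY brand_name, grp
    HAVING COUNT(*) >= 3
""").fetchall()
print(islands)
```

This sketch groups by the identifier itself rather than requiring a single distinct value per brand as method 1 does, so brand X is still found even though it has an extra isolated day on 2017-01-08. Brand Y has three qualifying days, but none of them are adjacent, so each lands in its own group.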

 

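Summary: (1) Using the ROW_NUMBER() window function in question 1.

           (2) Where window functions may appear: a window-function result such as qty cannot be used in WHERE at the same query level.

           (3) Using three chained CTEs, which only works if the logic is kept in order, and using a self-join on the table.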
