原表:select * from pagereferrer;
如下:
如何在这张表中,查询出按 URL分类的信息 ,例如:211.95.60.43:8080算一类信息,并按百分比显示。
预期结果如下: precent字段显示百分比,urlCount字段统计该url的数量。
会遇到两个主要问题:
1:如何只截取到红色标注的部分。
2:截取前8位字符之后,URL为空如何以显示“ / ”
3:如何统计不同类型URL所占百分比。
===================================================================================
下面步骤分解 :
第一步:
先按URL分组查询
select * from pagereferrer GROUP BY url;
第二步:
截取掉前8位字符“http://”
语句:
select substr(url,8) from pagereferrer GROUP BY url;
第三步:
使用instr函数找到 截取掉前8位字符“http://”之后的第一个"/"斜杠位置在第几个字符。
语句:
select substr(url,8),INSTR(substr(url,8),'/') from pagereferrer GROUP BY url;
第四步:
配合LEFT函数,查询到第一个" / "之前的所有字符串 ,包含" / "。并且给该列命名一个别名 url
语句:select left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/'))as url from pagereferrer GROUP BY url;
第五步:
把最后一个字符取出在left函数中加一个参数-1,并且按该列分组。
语句:select left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1)as url
from pagereferrer
GROUP BY left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1);
好了到这里基本上大功完成!只剩下考虑如何把为空的URL显示 " / "的问题,之后要在SQL中加一个判断。
第六步:
加一个判断,使得URL显示为" / "
语句:
select case when LENGTH(left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0
then '/'
else left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1)
end as url
from pagereferrer
GROUP BY left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1);
第7步:
完成统计不同类型URL数量和百分比显示,加一个count(1) as urlCount,统计第一列的数量。
语句:
select case when LENGTH(left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0
then '/'
else left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1)
end as url ,count(1) as urlCount
from pagereferrer
GROUP BY left(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1);
第8步 :
统计百分比, 这里比较繁琐需要准备两个临时表,一个用来统计总数,一个用来统计每个分类的url数量。
先准备显示统计Url总数表:
语句:
select sum(x.urlCount) as sumUrlCount from(
select case when LENGTH(LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0 then '/' else LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1) end as url,
count(1) as urlCount
from pagereferrer
group by LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))x
接着准备统计每个分类的URL数量
语句:
select case when LENGTH(LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0 then '/' else LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1) end as url,
count(1) as urlCount
from pagereferrer
group by LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1)
ORDER BY urlCount desc
最后 select (t1.urlCount/t2.sumUrlCount)*100 as precent 把该列显示百分比,t2表作为分母的临时表,t2表作为分子 语句:
select (t1.urlCount/t2.sumUrlCount)*100 as precent,t1.url,t1.urlCount from
(select case when LENGTH(LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0 then '/' else LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1) end as url, count(1) as urlCount from pagereferrer group by LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1) ORDER BY urlCount desc)t1,
(select sum(x.urlCount) as sumUrlCount from( select case when LENGTH(LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))=0 then '/' else LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1) end as url, count(1) as urlCount from pagereferrer group by LEFT(SUBSTR(url,8),INSTR(SUBSTR(url,8),'/')-1))x)t2
好了到这里终于把要求的结果查询出来了。。。。