1 准备测试数据
问题1:求各个部门的总工资
问题2:求各个部门的人数和平均工资
问题3:求每个部门最早进入公司的员工姓名
问题4:求各个城市的员工的总工资
问题5:列出工资比上司高的员工姓名及其工资
问题6:列出工资比公司平均工资要高的员工姓名及其工资
问题7:列出名字以J开头的员工姓名及其所属部门名称
问题8:列出工资最高的头三名员工姓名及其工资
问题9:将全体员工按照总收入(工资+提成)从高到低排列
问题10:求任何两名员工信息传递所需要经过的中间节点数
create table dept(id int,name string,city string)
row format delimited fields terminated by ',';
create table emp(
mpno int,
name string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptid int)
row format delimited fields terminated by ',';
向表中添加数据
10,ACCOUNTING,NEW YORK
20,RESEARCH,DALLAS
30,SALES,CHICAGO
40,OPERATIONS,BOSTON
7369,SMITH,CLERK,7902,1981-6-9,800,,20
7499,ALLEN,SALESMAN,7698,1981-11-17,1600,300,30
7521,WARD,SALESMAN,7698,1982-1-23,1250,500,30
7566,JONES,MANAGER,7839,1980-12-17,2975,,20
7654,MARTIN,SALESMAN,7698,1981-4-2,1250,1400,30
7698,BLAKE,MANAGER,7839,1987-4-19,2850,,30
7782,CLARK,MANAGER,7839,1987-5-23,2450,,10
7839,KING,PRESIDENT,,1980-12-3,5000,,10
7844,TURNER,SALESMAN,7698,1983-11-22,1500,0,30
7900,JAMES,CLERK,7698,1984-3-11,950,,30
7902,FORD,ANALYST,7566,1987-10-09,3000,,20
7934,MILLER,CLERK,7782,1981-2-13,1300,,10
load data local inpath '/home/hadoop/dept.txt' into table dept;
load data local inpath '/home/hadoop/emp.txt' into table emp;
select d.name , e.sumsal from dept d join
(select deptid ,sum(sal) sumsal from emp group by deptid) e on e.deptid=d.id;
select d.name , e.avg ,e.count from dept d join
(select deptid ,avg(sal) avg,count(*) count from emp group by deptid) e on e.deptid=d.id ;
select e.name , r.minhiredate from emp e join
(select deptid, min(to_date(hiredate)) as minhiredate from emp group by deptid) r on r.deptid=e.deptid and to_date(e.hiredate)=r.minhiredate;
select d.city , e.sumsal from dept d join
(select deptid ,sum(sal) sumsal from emp group by deptid) e on e.deptid=d.id ;
select e.name ,e.sal from emp e ,
(select sal ,mpno from emp) r where r.mpno=e.mgr and e.sal >r.sal;
select e.name ,e.sal from emp e ,
(select avg(sal) avg from emp) r where e.sal>r.avg;
select d.name , r.name from dept d join
(select name , deptid from emp where name like 'J%') r where r.deptid=d.id
select name,sal from emp order by sal desc limit 3
这里存在一个 问题 当 comm 属性为null 值的时候 进行 累加 会出导致 值为null
使用nvl()函数 nvl(属性名, 默认值) 当属性为null 会使用默认值
coalesce函数也可以实现判空的功能 coalesce(属性名,参数1,参数2)
参数1 为不为null输出的内容 参数2 为null 输出的内容
也可以使用自定义函数
select name,(sal+nvl(comm,0.0)) as salary from emp order by salary desc;
未解决的问题我会尽量去完善 在补全