Day 02: Hive Hands-On Exercises

I. A university library management system has the following three data models:

-- Create the book table: book
create table book(
book_id string,
`sort` string,
book_name string,
writer string,
output string,
price decimal(10,2));


INSERT INTO TABLE book VALUES ('001','TP391','information_processing','author1','machinery_industry_press','20');
INSERT INTO TABLE book VALUES ('002','TP392','database','author12','science_press','15');
INSERT INTO TABLE book VALUES ('003','TP393','computer_network','author3','machinery_industry_press','29');
INSERT INTO TABLE book VALUES ('004','TP399','microcomputer_principle','author4','science_press','39');
INSERT INTO TABLE book VALUES ('005','C931','management_information_systems','author5','machinery_industry_press','40');
INSERT INTO TABLE book VALUES ('006','C932','Operations research','author6','science_press','55');

-- Create the reader table: reader
create table reader (
reader_id string,
company string,
name string,
sex string,
grade string,
addr string);


INSERT INTO TABLE reader VALUES ('0001','alibaba','jack','M','vp','addr1');
INSERT INTO TABLE reader VALUES ('0002','baidu','robin','M','vp','addr2');
INSERT INTO TABLE reader VALUES ('0003','tencent','tony','M','vp','addr3');
INSERT INTO TABLE reader VALUES ('0004','jingdong','jasper','M','cfo','addr4');
INSERT INTO TABLE reader VALUES ('0005','netease','zhangsan','F','ceo','addr5');
INSERT INTO TABLE reader VALUES ('0006','sohu','lisi','F','ceo','addr6');


-- Create the borrowing record table: borrow_log
create table borrow_log(
reader_id string,
book_id string,
borrow_date string);
 
INSERT INTO TABLE borrow_log VALUES ('0001','002','2019-10-14');
INSERT INTO TABLE borrow_log VALUES ('0002','001','2019-10-13');
INSERT INTO TABLE borrow_log VALUES ('0003','005','2019-09-14');
INSERT INTO TABLE borrow_log VALUES ('0004','006','2019-08-15');
INSERT INTO TABLE borrow_log VALUES ('0005','003','2019-10-10');

(1) In Hive, create a database named test, then create the tables above in it and insert the data.

create database if not exists test;
-- Switch to the new database so the tables above are created inside it
use test;

(2) Find the name (name) and company (company) of every reader whose name starts with the letter l.

Result:
name company
lisi sohu

hive> select name, company from reader where name like 'l%';
OK
name    company
lisi    sohu
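
Hive's LIKE comparison is case-sensitive, so the pattern 'l%' would not match a name stored as 'Lisi'. The sample data is all lowercase, so the query above is enough here. As a minimal sketch, assuming mixed-case names could appear in reader, the column can be normalized with lower() before matching:

```sql
-- Case-insensitive variant of the prefix match (assumes names may be mixed case).
select name, company
from reader
where lower(name) like 'l%';
```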

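(3) Find the book name (book_name) and unit price (price) of all books published by Machinery Industry Press (machinery_industry_press), sorted by price in descending order.

Result: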
book_name price
management_information_systems 40
computer_network 29
information_processing 20

select book_name, price from book where output='machinery_industry_press' order by price desc;

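(4) Find the category (sort), publisher (output), and unit price (price) of books priced between 10 and 20 yuan inclusive, sorted by publisher (output) and then by price (price) in ascending order.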
Result:
sort output price
TP391 machinery_industry_press 20
TP392 science_press 15

select `sort`, output, price from book where price between 10 and 20 order by output asc, price asc;

(5) Find the name (name) and company (company) of every reader who has borrowed a book.
Result:
name company
jack alibaba
robin baidu
tony tencent
jasper jingdong
zhangsan netease

select r.name, r.company from reader r join borrow_log b on r.reader_id = b.reader_id;
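
The inner join returns one row per borrow_log record, so a reader who borrowed several books would appear several times. In the sample data each reader borrows exactly one book, so the result is unaffected. As a sketch of one way to list each borrowing reader only once, Hive's LEFT SEMI JOIN keeps a reader row if at least one matching borrow_log row exists:

```sql
-- Each reader appears at most once, regardless of how many books they borrowed.
-- In a LEFT SEMI JOIN the right-hand table may only be referenced in the ON clause.
select r.name, r.company
from reader r
left semi join borrow_log b on r.reader_id = b.reader_id;
```

Adding distinct to the original join is another option, although it would also merge two different readers who happen to share the same name and company.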

(6) Compute the highest, lowest, and average unit price of the books published by Machinery Industry Press (machinery_industry_press).
Result:
max_price min_price avg_price
40 20 29.666667

select max(price) max_price, min(price) min_price, avg(price) avg_price from book where output='machinery_industry_press' group by output;

(7) Find the name and company of every reader who has borrowed at least one book (one or more).
Result:
name company
jack alibaba
robin baidu
tony tencent
jasper jingdong
zhangsan netease

select name, company from reader r  join borrow_log bl ON r.reader_id = bl.reader_id  group by r.name, r.company  having count(bl.book_id) >= 1;
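
To see how the HAVING threshold behaves, it can help to print the per-reader count next to the name. The following is a minimal sketch; the alias borrow_count is a name chosen here for illustration, and reader_id is added to the GROUP BY on the assumption that two readers could share a name and company:

```sql
-- Show how many books each reader borrowed; raise the threshold in HAVING
-- (for example >= 2) to keep only readers with more borrow records.
select r.name, r.company, count(bl.book_id) as borrow_count
from reader r
join borrow_log bl on r.reader_id = bl.reader_id
group by r.reader_id, r.name, r.company
having count(bl.book_id) >= 1;
```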

(8) Create a table named borrow_log_bak and back up both the structure and the data of the borrow_log table into it.

create table borrow_log_bak as select * from borrow_log;
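
CREATE TABLE ... AS SELECT copies the column definitions and the rows in a single statement, but the new table may not carry over other properties of the source, such as its storage format or partitioning. As a sketch of an alternative that copies the full table definition first and then the data, the two-step form below can be used. The name borrow_log_bak2 is hypothetical, chosen only so it does not clash with the table created above:

```sql
-- Step 1: copy the table definition only (no rows).
create table borrow_log_bak2 like borrow_log;

-- Step 2: copy the rows from the source table.
insert into table borrow_log_bak2 select * from borrow_log;

-- Optional check: the row count should match the source table.
select count(*) from borrow_log_bak2;
```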
