In Hive, most queries are compiled into MapReduce jobs for execution. Two kinds of simple query, however, run without MapReduce:
1. Selecting all rows of a table
select * from employees;
The result is shown below:
hive> select * from employees;
OK
lavimer 15000.0 ["li","lu","wang"] {"k1":1.0,"k2":2.0,"k3":3.0} {"street":"dingnan","city":"ganzhou","num":101}
liao 18000.0 ["liu","li","huang"] {"k4":2.0,"k5":3.0,"k6":6.0} {"street":"dingnan","city":"ganzhou","num":102}
zhang 19000.0 ["xiao","wen","tian"] {"k7":7.0,"k8":8.0,"k8":8.0} {"street":"dingnan","city":"ganzhou","num":103}
Time taken: 0.176 seconds
hive>
The output contains no MapReduce job launch or progress messages, which shows that this query did not run a MapReduce job.
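One way to confirm this without relying on the absence of job messages is to inspect the query plan. The sketch below assumes the same employees table; the hive.fetch.task.conversion property controls which simple queries Hive serves with a plain fetch instead of a MapReduce job, and its default differs between Hive versions, so the value on your installation may not match what this article observed.

```sql
-- Show the execution plan. For a query served without MapReduce,
-- the plan should list only a fetch stage (look for "Fetch Operator")
-- and no map-reduce stage.
EXPLAIN SELECT * FROM employees;

-- Print the current fetch-conversion setting (none, minimal, or more).
SET hive.fetch.task.conversion;

-- Assumption: you want to experiment in the current session only.
-- "none" disables the optimization, so even SELECT * launches a job.
SET hive.fetch.task.conversion=none;
```

Running the same select before and after changing the setting makes the difference visible in the console output.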
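2. Selecting a limited number of rows
select * from employees limit 2;
Note: as in MySQL, LIMIT caps the number of rows returned. Without an ORDER BY, Hive makes no guarantee about which rows come back, so treat the result as an arbitrary subset of the table rather than a true random sample.
The result is shown below: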
hive> select * from employees limit 2;
OK
lavimer 15000.0 ["li","lu","wang"] {"k1":1.0,"k2":2.0,"k3":3.0} {"street":"dingnan","city":"ganzhou","num":101}
liao 18000.0 ["liu","li","huang"] {"k4":2.0,"k5":3.0,"k6":6.0} {"street":"dingnan","city":"ganzhou","num":102}
Time taken: 0.079 seconds
hive>
Again, the output contains no MapReduce job messages, which shows that this query did not run a MapReduce job either.
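If you need a predictable set of rows, or an actual random sample, LIMIT alone is not enough. The following is a minimal sketch of two alternatives; the column name name is an assumption, since the table definition is not shown in this article, and unlike the plain LIMIT query above these statements will typically launch a MapReduce job.

```sql
-- Deterministic "first N": sort explicitly, then cap the row count.
-- Assumption: the first column of employees is called "name".
SELECT * FROM employees ORDER BY name LIMIT 2;

-- Random sampling with TABLESAMPLE: bucket the rows on rand()
-- and keep one bucket out of two (roughly half of the rows).
SELECT * FROM employees TABLESAMPLE(BUCKET 1 OUT OF 2 ON rand()) e;
```

The first form trades the fast fetch-only path for a stable result; the second returns a different subset on each run.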