How to run simple Hive queries locally as a Fetch task instead of launching MapReduce

Parts of this article are adapted from: https://blog.csdn.net/levy_cui/article/details/52637493



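Launching a MapReduce job consumes system resources. Starting with Hive 0.10.0, a simple query with no aggregation, such as SELECT <col> FROM <table> LIMIT num, does not need a MapReduce job; Hive can read the data directly with a local Fetch task. There are several ways to enable this: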

Method 1:

1. Enable fetch task conversion inside the Hive session:

hive> set hive.fetch.task.conversion=more;

2. Run the query:

hive> select uid, number from test limit 9;

The statement set hive.fetch.task.conversion=more; enables Fetch task conversion for the current session, so the query no longer launches a MapReduce job.
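One way to check that the conversion actually took effect is to look at the query plan. The following is a minimal sketch, assuming the same test table used above; the exact plan text differs between Hive versions, so the comments describe what to look for rather than literal output.

```sql
-- Enable fetch task conversion for this session only.
set hive.fetch.task.conversion=more;

-- Print the current value to confirm the setting is in effect.
set hive.fetch.task.conversion;

-- Inspect the plan. When the conversion applies, the plan should contain
-- only a Fetch Operator stage and no Map Reduce stage.
explain select uid, number from test limit 9;
```

If the plan still shows a Map Reduce stage, the query probably falls outside what the chosen conversion mode covers.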


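Method 2:

1. Start the Hive CLI with fetch task conversion enabled:

hive --hiveconf hive.fetch.task.conversion=more

2. Run the query in that session. Because --hiveconf applies only to the invocation it is passed to, a separate hive -e call needs the option as well:

hive --hiveconf hive.fetch.task.conversion=more -e "select uid, number from test limit 9;"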

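Method 3:

Both methods above enable the Fetch task, but only temporarily, for the current session or invocation. To keep it enabled permanently, add the following property to hive-site.xml: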


<property>
  <name>hive.fetch.task.conversion</name>
  <value>more</value>
  <description>
    Some select queries can be converted to single FETCH task
    minimizing latency. Currently the query should be single
    sourced not having any subquery and should not have
    any aggregations or distincts (which incurs RS),
    lateral views and joins.
    1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
    2. more    : SELECT, FILTER, LIMIT only (+TABLESAMPLE, virtual columns)
  </description>
</property>
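To make the limits in the property description more concrete, here is a rough sketch against the same test table. It assumes number is a numeric, non-partition column, which the original does not state. Whether a given query is actually converted also depends on the Hive version, so the comments are expectations rather than guarantees.

```sql
-- Plain projection with LIMIT on a single table: expected to run as a
-- Fetch task once hive.fetch.task.conversion=more is in effect.
select uid, number from test limit 9;

-- Filter on an ordinary (non-partition) column: covered by "more",
-- but not by "minimal" (assumes number is numeric).
select uid, number from test where number > 0 limit 9;

-- Aggregation needs a reduce stage, so this query is expected to still
-- launch a job even with the setting enabled.
select uid, count(*) from test group by uid;
```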
