First, a note: the Metastores of Hive 0.7.1 and 0.8.1 are not compatible.
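1. Indexes on external tables.
To save effort, I kept using the data from the earlier table02. In the new metastore it is still called table02, but it is now an external table. The eventual conclusion is that Hive indexes do support external tables as well.
I created the index and ran the query. It still used 6 mappers, so the index was not being used.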
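2. Creating an internal table
I created an internal table, table03, with a CTAS (CREATE TABLE ... AS SELECT) from table02;
The key point is that, after switching to 0.8.1, I set this property: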
<property>
<name>hive.optimize.index.filter</name>
<value>true</value>
<description>Whether to enable automatic use of indexes</description>
</property>
Then I created the index:
- hive> create index table03_index on table table03(id)
- > as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
- > with deferred rebuild;
- OK
- Time taken: 2.221 seconds
- hive> show tables;
- OK
- default__table02_compact_index__
- default__table03_table03_index__
- table02
- table03
- Time taken: 1.073 seconds
- hive> dfs -ls /user/hive/warehouse/default__table03_table03_index__;
- hive> dfs -ls /user/hive/warehouse/default__table02_compact_index__;
- Found 1 items
- -rw-r
- hive> alter index table03_index on table03 rebuild;
Now run the query again:
- hive> select * from table03 where id=500000;
- Total MapReduce jobs = 1
- Launching Job 1 out of 1
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_201203122135_0003, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201203122135_0003
- Kill Command = /home/allen/Hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201203122135_0003
- Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
- 2012-03-12 21:58:02,283 Stage-1 map = 0%, reduce = 0%
- 2012-03-12 21:58:35,932 Stage-1 map = 50%, reduce = 0%
- 2012-03-12 21:58:39,057 Stage-1 map = 63%, reduce = 0%
- 2012-03-12 21:58:45,216 Stage-1 map = 75%, reduce = 0%
- 2012-03-12 21:58:51,407 Stage-1 map = 88%, reduce = 0%
- 2012-03-12 21:58:57,667 Stage-1 map = 100%, reduce = 0%
- 2012-03-12 21:59:03,798 Stage-1 map = 100%, reduce = 100%
- Ended Job = job_201203122135_0003
- MapReduce Jobs Launched:
- Job 0: Map: 2 HDFS Read: 356889161 HDFS Write: 357 SUCESS
- Total MapReduce CPU Time Spent: 0 msec
- OK
- 500000 A decade ago, many were predicting that Cooke, a New York City prodigy, would become a basketball shoe pitchman and would flaunt his wares and skills at All-Star weekends like the recent aerial show in Orlando, Fla. There was a time, however fleeting, when he was more heralded, or perhaps merely hyped, than any other high school player in America.
- Time taken: 77.299 seconds
It still does not work: the query runs as a single full-scan job. So I set one more parameter:
<property>
<name>hive.optimize.index.filter.compact.minsize</name>
<value>5368</value>
<description>Minimum size (in bytes) of the inputs on which a compact index is automatically used.</description><!-- the original value was 5368709120 -->
</property>
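Why the lowered value matters can be checked with a little arithmetic. The following is only a sketch: it assumes the property is applied as a plain comparison between the input size and the threshold, which is how its description reads. The function name `index_filter_applies` is made up for illustration and is not part of Hive. The input size is the `HDFS Read` figure from the full scan of table03 above.

```python
# Hypothetical model of the size check described by
# hive.optimize.index.filter.compact.minsize. This is NOT Hive's code;
# the exact comparison (>= versus >) is an assumption.

ORIGINAL_MINSIZE = 5368709120  # original value of the property, in bytes
LOWERED_MINSIZE = 5368         # value used in this experiment, in bytes
TABLE_BYTES = 356889161        # "HDFS Read" reported by the full scan of table03


def index_filter_applies(input_bytes: int, minsize: int) -> bool:
    """Assume the compact index is only considered for inputs of at least minsize bytes."""
    return input_bytes >= minsize


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    print("original threshold in GiB:", to_gib(ORIGINAL_MINSIZE))  # 5.0
    print("table size in GiB:", round(to_gib(TABLE_BYTES), 3))
    print("index considered with original value:",
          index_filter_applies(TABLE_BYTES, ORIGINAL_MINSIZE))
    print("index considered with lowered value:",
          index_filter_applies(TABLE_BYTES, LOWERED_MINSIZE))
```

Under this reading, a table of roughly a third of a GiB sits far below the original 5 GiB threshold, so the index is skipped until the threshold is lowered.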
Trying again: this time it works.
- hive> select * from table02 where id=500000;
- Total MapReduce jobs = 3
- Launching Job 1 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_201203122135_0010, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201203122135_0010
- Kill Command = /home/allen/Hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201203122135_0010
- Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
- 2012-03-12 22:12:18,347 Stage-3 map = 0%, reduce = 0%
- 2012-03-12 22:12:33,834 Stage-3 map = 100%, reduce = 0%
- 2012-03-12 22:12:40,078 Stage-3 map = 100%, reduce = 100%
- Ended Job = job_201203122135_0010
- Ended Job = -1685536326, job is filtered out (removed at runtime).
- Moving data to: hdfs://localhost:9000/tmp/hive-allen/hive_2012-03-12_22-12-07_069_3484347533083360337/-ext-10000
- Moving data to: hdfs://localhost:9000/tmp/hive-allen/hive_2012-03-12_22-12-06_413_2863963861844056912/-mr-10002
- Launching Job 3 out of 3
- Number of reduce tasks is set to 0 since there's no reduce operator
- Starting Job = job_201203122135_0011, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201203122135_0011
- Kill Command = /home/allen/Hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201203122135_0011
- Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
- 2012-03-12 22:12:53,124 Stage-1 map = 0%, reduce = 0%
- 2012-03-12 22:13:04,590 Stage-1 map = 100%, reduce = 0%
- 2012-03-12 22:13:10,897 Stage-1 map = 100%, reduce = 100%
- Ended Job = job_201203122135_0011
- MapReduce Jobs Launched:
- Job 0: Map: 1 HDFS Read: 74706082 HDFS Write: 68 SUCESS
- Job 1: Map: 1 HDFS Read: 33554431 HDFS Write: 357 SUCESS
- Total MapReduce CPU Time Spent: 0 msec
- OK
- 500000 A decade ago, many were predicting that Cooke, a New York City prodigy, would become a basketball shoe pitchman and would flaunt his wares and skills at All-Star weekends like the recent aerial show in Orlando, Fla. There was a time, however fleeting, when he was more heralded, or perhaps merely hyped, than any other high school player in America.
- Time taken: 65.212 seconds
- hive>
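So the problem was that the threshold that triggers index use had not been set: hive.optimize.index.filter.compact.minsize. With its original value of 5368709120 bytes, the input here was too small for the index to be used automatically.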
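To put rough numbers on the difference, the figures from the two "MapReduce Jobs Launched" summaries can be compared directly. This is a back-of-the-envelope sketch, not a benchmark. It assumes the two runs are comparable even though the full scan ran on table03 and the indexed query on table02 (table03 was created from table02 by CTAS). The helper names are made up for illustration.

```python
# Figures copied from the "MapReduce Jobs Launched" summaries in the logs.
# Assumption: the table03 full scan and the table02 indexed query are
# comparable, since table03 was built from table02 via CTAS.

FULL_SCAN_READ = 356889161            # bytes, single job with 2 mappers
INDEXED_READS = [74706082, 33554431]  # bytes, Job 0 and Job 1 of the indexed run
FULL_SCAN_SECONDS = 77.299            # "Time taken" of the full scan
INDEXED_SECONDS = 65.212              # "Time taken" of the indexed run


def total_read(reads):
    """Total bytes read from HDFS across all jobs of one query."""
    return sum(reads)


def read_ratio(indexed_reads, full_scan_read):
    """Fraction of the full-scan bytes that the indexed query had to read."""
    return total_read(indexed_reads) / full_scan_read


if __name__ == "__main__":
    print("bytes read with index:", total_read(INDEXED_READS))  # 108260513
    print(f"share of full scan: {read_ratio(INDEXED_READS, FULL_SCAN_READ):.1%}")
    print(f"seconds saved: {FULL_SCAN_SECONDS - INDEXED_SECONDS:.3f}")
```

On this small local setup the indexed run reads well under half the bytes, while the wall-clock gain is modest because two jobs have to be launched instead of one.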
http://blog.csdn.net/liwei_1988/article/details/7346064