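===== 20131219 hive.limit.optimize.enable optimization
Even with limit 100 and no complex conditions in the query, hive still scans all the data, which is strange and wasteful. It turns out that by default hive's limit works by running over all the data first.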
>>> test1
select * from s_test;
No difference between the two settings
>>> test2
select dp_id from s_order_hbase limit 100;
number of mappers: 486; number of reducers: 0
---- after set hive.limit.optimize.enable=true;
number of mappers: 1; number of reducers: 0
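A minimal sketch of applying the setting per query rather than globally, assuming the query text is assembled in Python before being handed to hive. The helper name with_limit_optimization is hypothetical; only the property name and the query come from the note above.

```python
def with_limit_optimization(query):
    """Prefix a HiveQL query with the session setting from the note above.

    Hypothetical helper: it only builds the text, it does not run hive.
    """
    # Normalise the query so it ends with exactly one semicolon.
    body = query.strip().rstrip(";")
    return "set hive.limit.optimize.enable=true; " + body + ";"


statement = with_limit_optimization("select dp_id from s_order_hbase limit 100")
print(statement)
```

The resulting string can be passed to hive -e in the same way as the commands in the 2012.08.07 entry.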
2012.08.07
- Running hive from python
- >>> command = "hive -e " + "\"" + load data inpath '/fenxi_system/cs/20120612/sms_20120612' overwrite into table s_sms partition(stat_time='20120612') + "\""
File "<stdin>", line 1
command = "hive -e " + "\"" + load data inpath '/fenxi_system/cs/20120612/sms_20120612' overwrite into table s_sms partition(stat_time='20120612') + "\""
^
SyntaxError: invalid syntax
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 68, in apport_excepthook
binary = os.path.realpath(os.path.join(os.getcwdu(), sys.argv[0]))
OSError: [Errno 2] No such file or directory
- >>> command = "hive -e " + "'load data inpath '/fenxi_system/cs/20120612/sms_20120612' overwrite into table s_sms partition(stat_time='20120612')'"
ok!
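The second attempt is valid Python, but if that string is handed to a shell (for example via os.system), the nested single quotes are consumed by the shell and the quotes around the path and the partition value do not reach hive. A sketch of two ways around that, assuming Python 3 (the log above is Python 2.7, where pipes.quote plays the role of shlex.quote). The helper names are hypothetical, and run_hive is defined but not called here.

```python
import shlex
import subprocess

QUERY = (
    "load data inpath '/fenxi_system/cs/20120612/sms_20120612' "
    "overwrite into table s_sms partition(stat_time='20120612')"
)


def hive_argv(query):
    """Argument list for subprocess: no shell is involved, so no quoting is needed."""
    return ["hive", "-e", query]


def hive_shell_command(query):
    """Single shell string, with the whole query quoted safely for a POSIX shell."""
    return "hive -e " + shlex.quote(query)


def run_hive(query):
    """Run the query through the hive CLI; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(hive_argv(query), check=True)


print(hive_shell_command(QUERY))
```

Passing the argument list avoids the quoting problem entirely; the shell-string form is only needed when the command really has to go through a shell.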
2012.08.08
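- After creating the table, I wanted to copy in the contents of a table with the same structure from another hive user
hadoop fs -ls shows the files, but select * from s_sms returns nothing. So the metadata is evidently missing.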
- load data inpath '/user/hive/warehouse/s_edm/stat_time=20120612/edm_20120612' overwrite into table s_sms partition(stat_time='20120808');
select * from s_sms now works
- About row format delimited fields
hive> create table test2(uid string,name string)row format delimited fields terminated by 'aaa';
hive> load data local inpath '/home/mjiang/tes' overwrite into table test2;
hive> select * from test2;
123
//contents of tes: 123aaa456
hive> create table test3(uid string,name string)row format delimited fields terminated by ',';
hive> load data local inpath '/home/mjiang/tes' overwrite into table test3;
hive> select * from test3;
123 456
//contents of tes: 123,456
hive> create table test1(uid string,name string)row format delimited fields terminated by '/t';
hive> select * from test1;
123 t 456
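//contents of tes: 123/t456
The delimiter can probably only be a single character. Note that '/t' above is a literal slash followed by t, not a tab (that would be '\t'). Using ',' definitely works.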
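One way to read the results above is that only the first character of the delimiter string is used. The sketch below is a simplified Python model of that assumption, not hive's actual parsing code, and the function name is hypothetical. It reproduces the 'aaa' and ',' cases; for '/t' it yields 123 and t456, which is close to the logged line but not identical.

```python
def split_on_first_char(line, delimiter, num_columns):
    """Model a text row split on only the first character of the delimiter.

    Assumption for illustration: extra fields are dropped and missing
    fields become None, so the result always has num_columns entries.
    """
    fields = line.split(delimiter[0])[:num_columns]
    fields += [None] * (num_columns - len(fields))
    return fields


print(split_on_first_char("123aaa456", "aaa", 2))
print(split_on_first_char("123,456", ",", 2))
print(split_on_first_char("123/t456", "/t", 2))
```

Under this model the 'aaa' table sees 'a' as the separator, so the second column is the empty string between the first two 'a' characters, which matches the single value shown for test2.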