Using Hive

Edit the relevant settings in conf/hadoop-env.sh, for example:
export HADOOP_HEAPSIZE=64
export HADOOP_CLIENT_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/tianzhao/oom.hprof"
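Hive picks up these settings at startup. When an OutOfMemoryError occurs, the JVM dumps the heap to the oom.hprof file, which can be opened in Java VisualVM to inspect memory usage.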


Partitions:
When Hive runs with a relatively small heap (such as 64 or 128 MB), the number of partitions it can handle is limited. I wrote a small script to test this.

for ((i=1; i<=10000; i++)); do
    echo "alter table tablenamexxx add partition(pt='${i}');" >> parit_test.sql
done

Table creation statement:
create table tablenamexxx(s string) partitioned by (pt string);

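The generated add-partition statements look like this (here the table is named partition2 and IF NOT EXISTS has been added):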
alter table partition2 add if not exists partition(pt='1');
alter table partition2 add if not exists partition(pt='2');
alter table partition2 add if not exists partition(pt='3');
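
The statements above use the table name partition2 and include IF NOT EXISTS, neither of which the loop shown earlier emits. As a minimal sketch of a generator that would produce this form, assuming a hypothetical helper named gen_partition_sql (not part of Hive) and plain POSIX sh so that it runs without bash:

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): gen_partition_sql TABLE COUNT OUTFILE
# Writes COUNT "add partition" statements for TABLE into OUTFILE.
gen_partition_sql() {
    table="$1"
    count="$2"
    out="$3"
    # Truncate the output file first so reruns do not append duplicates.
    : > "$out"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "alter table ${table} add if not exists partition(pt='${i}');" >> "$out"
        i=$((i + 1))
    done
}

# Small run for illustration; the original test used 10000 partitions.
gen_partition_sql partition2 3 parit_test.sql
cat parit_test.sql
```

Calling it with 10000 instead of 3 reproduces the size of the original test. The file is truncated before writing, so rerunning the script does not pile up duplicate statements the way the >> redirection alone would.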

Run parit_test.sql:
cd into the Hive directory and run  bin/hive -f parit_test.sql

Rename a table:
ALTER TABLE table_name RENAME TO new_table_name


hive> select distinct value from src;
hive> select max(key) from src;


Logging:
The configuration file is conf/hive-log4j.properties
#hive.root.logger=WARN,DRFA
hive.root.logger=DEBUG,DRFA
This changes the log level to DEBUG. The log is written to /tmp/tianzhao/hive.log, as determined by the following two settings:
hive.log.dir=/tmp/${user.name}  
hive.log.file=hive.log

Here user.name is tianzhao.
While Hive is running, you can follow the log with  tail -f hive.log;  new entries are printed to the terminal as they are written.
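
As a small illustration of how the two properties combine into the log path, the sketch below builds the path in the shell. hive_log_path is a hypothetical helper, not a Hive command, and it assumes the defaults shown above (hive.log.dir=/tmp/${user.name} and hive.log.file=hive.log); if either property is changed, the path changes accordingly.

```shell
#!/bin/sh
# Hypothetical helper: build the log path the same way the two properties do,
# hive.log.dir=/tmp/${user.name} and hive.log.file=hive.log.
# The user name defaults to the current OS user when no argument is given.
hive_log_path() {
    user_name="${1:-$(id -un)}"
    printf '/tmp/%s/hive.log\n' "$user_name"
}

log_path="$(hive_log_path tianzhao)"
echo "$log_path"

# Show the most recent entries if the file exists.
# Replace "tail -n 20" with "tail -f" to keep following the log.
if [ -f "$log_path" ]; then
    tail -n 20 "$log_path"
fi
```

Passing the user name explicitly keeps the example reproducible; calling hive_log_path with no argument uses whoever is running the shell.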



Hive command history:
Every command executed in the Hive CLI is recorded in the .hivehistory file in the current user's home directory.
tianzhao@tianzhao-VirtualBox:~$ less .hivehistory

The relevant code is in the main method of CliDriver:
    final String HISTORYFILE = ".hivehistory";
    String historyFile = System.getProperty("user.home") + File.separator + HISTORYFILE;
    reader.setHistory(new History(new File(historyFile)));
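
Because the history is an ordinary text file, it can be inspected with standard tools. The following is a minimal sketch, assuming one recorded command per line; recent_hive_commands is a hypothetical helper name, not something Hive provides.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines of a Hive history file.
# Usage: recent_hive_commands [FILE] [N]
# FILE defaults to ~/.hivehistory and N defaults to 20.
recent_hive_commands() {
    history_file="${1:-$HOME/.hivehistory}"
    count="${2:-20}"
    if [ -f "$history_file" ]; then
        tail -n "$count" "$history_file"
    else
        echo "no history file at $history_file" >&2
        return 1
    fi
}

# Show the five most recent commands; do not fail if Hive has never been run.
recent_hive_commands "$HOME/.hivehistory" 5 || true
```

Piping the output through grep (for example, grep -i partition) narrows it to the statements of interest.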

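Counting files in HDFS, usage: [-count[-q] <path>]
$ hadoop fs -count /history/
This prints the directory count, file count, and total content size under /history/.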


(1) View table information
hive> describe extended partition2;
OK
s string
pt string

Detailed Table Information Table(tableName:partition2, dbName:default, owner:tianzhao, createTime:1304566227, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:s, type:string, comment:null)], location:hdfs://localhost:54310/user/hive/warehouse/partition2, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:pt, type:string, comment:null)], parameters:{transient_lastDdlTime=1304566227}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.054 seconds

hive> describe partition2;        
OK
s string
pt string
Time taken: 0.104 seconds


hive> show functions hash;
OK
hash
Time taken: 0.062 seconds
hive> describe function hash;
OK
hash(a1, a2, ...) - Returns a hash value of the arguments
Time taken: 0.049 seconds
hive> describe  function extended  hash;
OK
hash(a1, a2, ...) - Returns a hash value of the arguments
Time taken: 0.05 seconds



Input data format:
1&&&&2&&&&4

CREATE TABLE IF NOT EXISTS rtable1 (
   str1 string,
   str2 string,
   str3 string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex"="(\\d+)&&&&(\\d+)&&&&(\\d+)"
);

load data local inpath '/home/tianzhao/sql/data/RegexSerDe' into table rtable1;

select * from rtable1;
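
To see how the three capture groups in input.regex map onto the columns str1, str2, and str3, the split can be approximated outside Hive. This is only a sketch: split_row is a hypothetical helper, sed stands in for the Java regex engine that RegexSerDe uses, and [0-9] replaces \d because POSIX extended regular expressions have no \d shorthand. Note that the DDL writes \\d because the backslash must itself be escaped inside the HiveQL string literal.

```shell
#!/bin/sh
# Hypothetical helper approximating the capture-group split with sed.
# Each parenthesized group corresponds to one table column; the fields
# are printed separated by "|" purely for readability.
split_row() {
    printf '%s\n' "$1" |
        sed -E -n 's/^([0-9]+)&&&&([0-9]+)&&&&([0-9]+)$/\1|\2|\3/p'
}

# The sample input line from the notes above.
split_row '1&&&&2&&&&4'
```

For the sample line, the three fields printed correspond to what select * from rtable1 should return in str1, str2, and str3. A line that does not match the whole pattern prints nothing here; how RegexSerDe reports such rows is not covered by this sketch.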


http://search-hadoop.com/m/WBuaH1Z4TKu1/partition+%252B++filter+%252B+udf&subj=+ANNOUNCE+Apache+Hive+0+7+0+Released

https://issues.apache.org/jira/browse/HIVE-1750
[HIVE-1609] - Support partition filtering in metastore
https://issues.apache.org/jira/browse/HIVE-1862
https://issues.apache.org/jira/browse/HIVE-1849
https://issues.apache.org/jira/browse/HIVE-1738
https://issues.apache.org/jira/browse/HIVE-1758
https://issues.apache.org/jira/browse/HIVE-1642



https://issues.apache.org/jira/browse/HIVE-1913

https://issues.apache.org/jira/browse/HIVE-1430

https://issues.apache.org/jira/browse/HIVE-1305

https://issues.apache.org/jira/browse/HIVE-1462

https://issues.apache.org/jira/browse/HIVE-1790

https://issues.apache.org/jira/browse/HIVE-1514

https://issues.apache.org/jira/browse/HIVE-1971

https://issues.apache.org/jira/browse/HIVE-1361

https://issues.apache.org/jira/browse/HIVE-138

https://issues.apache.org/jira/browse/HIVE-1835


https://issues.apache.org/jira/browse/HIVE-1815

https://issues.apache.org/jira/browse/HIVE-1943

https://issues.apache.org/jira/browse/HIVE-2056

https://issues.apache.org/jira/browse/HIVE-2028


https://issues.apache.org/jira/browse/HIVE-1918


https://issues.apache.org/jira/browse/HIVE-1803



https://issues.apache.org/jira/browse/HIVE-558
https://issues.apache.org/jira/browse/HIVE-1658

https://issues.apache.org/jira/browse/HIVE-1731

https://issues.apache.org/jira/browse/HIVE-1408

Debugging Hive in Eclipse

To be continued