HIVE Schema Design 模式设计

Table-by-Day


hive> CREATE TABLE supply_2011_01_02 (id int, part string, quantity int);
hive> CREATE TABLE supply_2011_01_03 (id int, part string, quantity int);
hive> CREATE TABLE supply_2011_01_04 (id int, part string, quantity int);
hive> .... load data ...
hive> SELECT part,quantity supply_2011_01_02
    > UNION ALL
    > SELECT part,quantity from supply_2011_01_03
    > WHERE quantity < 4;


hive> CREATE TABLE supply (id int, part string, quantity int)
    > PARTITIONED BY (day int);
hive> ALTER TABLE supply add PARTITION (day=20110102);

hive> ALTER TABLE supply add PARTITION (day=20110103);
hive> ALTER TABLE supply add PARTITION (day=20110102);
hive> .... load data ...
hive> SELECT part,quantity FROM supply
    > WHERE day>=20110102 AND day<20110103 AND quantity < 4;


Adding Columns to a Table


hive> CREATE TABLE weblogs (version LONG, url STRING)
    > PARTITIONED BY (hit_date int)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
hive> ! cat log1.txt
1  /mystuff
1  /toys
hive> LOAD DATA LOCAL INPATH 'log1.txt' int weblogs partition(20110101);
hive> SELECT * FROM weblogs;
1  /mystuff  20110101
1  /toys     20110101

hive> ! cat log2.txt
2  /cars   bob
2  /stuff  terry
hive> ALTER TABLE weblogs ADD COLUMNS (user_id string);
hive> LOAD DATA LOCAL INPATH 'log2.txt' int weblogs partition(20110102);
hive> SELECT * from weblogs
1  /mystuff  20110101  NULL
1  /toys     20110101  NULL
2  /cars     20110102  bob
2  /stuff    20110102  terry


你可能感兴趣的:(schema)