Hive Pitfalls: Changing Column Order

1. Create the table

--drop table dev.dev_test20190225; 
create EXTERNAL table dev.dev_test20190225
    (
        c1 int comment '1',
        c2 int comment '2',
        c3 int comment '3'
    )
COMMENT 'test table'
PARTITIONED BY ( dt string ) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
NULL DEFINED AS "" 
STORED AS ORC 
tblproperties ('orc.compress'='SNAPPY');

2. Insert test data

insert overwrite table dev.dev_test20190225 partition ( dt = '2019-01-03')
select 1 as c1,2 as c2,3 as c3
union all
select 1 as c1,2 as c2,3 as c3
union all
select 1 as c1,2 as c2,3 as c3
;

3. Check the data

select * from dev.dev_test20190225;

Result: three identical rows with c1=1, c2=2, c3=3 in partition dt='2019-01-03'.

4. Add column c11 and move it after c1

alter table dev.dev_test20190225 add columns (c11 int comment 'c11');
desc dev.dev_test20190225;
alter table dev.dev_test20190225 change c11 c11 int after c1; 
desc dev.dev_test20190225;
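
For reference, the second desc should now list the columns in their new order, roughly as below (a sketch of the expected output, not a captured result; exact formatting depends on the Hive version):

c1                      int                     1
c11                     int                     c11
c2                      int                     2
c3                      int                     3
dt                      string

# Partition Information
dt                      string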


5. Query again:

Finding 1: the metadata structure has been adjusted, but the data itself has not been reordered; it is still laid out in the original column order.
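
For completeness, the check is presumably the same query as in step 3. The ORC files written before the ALTER still hold the values in their original physical layout, which is why the rows do not follow the new column order:

-- rows written before the ALTER keep their old physical layout on disk
select * from dev.dev_test20190225;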

6. Next, re-insert the data

insert overwrite table dev.dev_test20190225 partition ( dt = '2019-01-03')
select 1 as c1,11 as c11,2 as c2,3 as c3
union all
select 1 as c1,11 as c11,2 as c2,3 as c3
union all
select 1 as c1,11 as c11,2 as c2,3 as c3
;

7. Check the data

Finding 2: the first three columns were written, but the last column comes back entirely NULL!
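
A likely explanation (my reading of Hive's behavior, not stated explicitly above): without CASCADE, ALTER TABLE only rewrites the table-level metadata, while the existing partition dt='2019-01-03' keeps its old three-column schema, so the table layout and the partition layout no longer agree and the last column can read back as NULL. The partition-level definition can be inspected directly:

-- the partition keeps its own copy of the column list and storage descriptor
desc formatted dev.dev_test20190225 partition (dt='2019-01-03');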

-------------------------------------------

Solution: append CASCADE to the DDL statements. CASCADE applies the column change not only to the table metadata but also to the metadata of all existing partitions, whereas the default (RESTRICT) changes only the table-level metadata.

alter table dev.dev_test20190225 add columns (c11 int comment 'c11') cascade;

alter table dev.dev_test20190225 change c11 c11 int after c1 cascade;

-------------------------------------------

Further notes:

1. If the data types around the added column are inconsistent with what is already on disk, the data may also become unqueryable and the query fails with an error:

Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text

Reproducing it is simple: change a column to a string type and insert some text. I've had enough of this pit, so I won't go into more detail ^^
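
A minimal sketch of the kind of test described above, assuming a hypothetical extra column c12 of type string (not part of the example table) and no CASCADE, so the old int data is read against a mismatched schema:

-- hypothetical column c12, added and moved without CASCADE
alter table dev.dev_test20190225 add columns (c12 string comment 'c12');
alter table dev.dev_test20190225 change c12 c12 string after c1;
-- reading the old partition can now fail with the ClassCastException above,
-- because int data already on disk is read as if it were a string
select * from dev.dev_test20190225 where dt = '2019-01-03';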

2. What if you don't want to re-run the data?

After fixing the metadata, reload the partition:

show create table dev.dev_test20190225;
load data inpath 'hdfs://*****.db/dev_test20190225/dt=2019-01-03' into table dev.dev_test20190225 PARTITION (dt='2019-01-03');
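
Another option worth considering for an external table (an untested sketch, not verified here): drop and re-add the partition so that it is re-registered against the table's current schema; dropping a partition of an EXTERNAL table does not delete the underlying files.

-- re-register the partition; the location is illustrative, use the partition's real HDFS path
alter table dev.dev_test20190225 drop if exists partition (dt='2019-01-03');
alter table dev.dev_test20190225 add partition (dt='2019-01-03')
    location 'hdfs://*****.db/dev_test20190225/dt=2019-01-03';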

Just a quick note for now; I'll reorganize it into a more readable write-up when I have time.
