Hive全量表和增量表互相转换

一、增量表变全量表

昨天和前天的full join  t1---昨天   t2--前天

INSERT OVERWRITE TABLE table_df PARTITION(ds = '${bizdate}')
select 
IF(t1.id IS NULL, t2.id, t1.id) AS id,
IF(t1.name IS NULL, t2.name, t1.name) AS id
from (select id,name from s_table where ds='${bizdate}') t1
FULL OUTER JOIN
(select id,name from table_di where ds = '${yyyymmdd-1}') t2
on t1.id=t2.id;
 

二。全量表改增量表

昨天和前天的数据之差

@day_2_table := 
    select 
        md5(concat(
            coalesce(name,0),
            coalesce(age,0)
        )) as compare
        ,row_pk 
    from table_df n
    where ds='${yyyymmdd-1}'
;
@day_1_table :=
select 
        md5(concat(
            coalesce(name,0),
            coalesce(age,0)
        )) as compare
        ,row_pk 
    from table_df n
    where ds='${bizdate}'
;

INSERT OVERWRITE TABLE table_di PARTITION(ds = '${bizdate}')
select row_pk,
       name,
       age
from table_df
where ds='${bizdate}'
and row_pk in 
(
    select t1.row_pk
    from 
    @day_1_table t1
    left outer join
    @day_2_table t2
    on t1.row_pk=t2.row_pk
    where t2.row_pk is null or t1.compare<>t2.compare
)
;

 

你可能感兴趣的:(hadoop,Hive,离线大数据)