Validating ID card numbers in Hive

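An ID number is treated as valid only if it meets all of the following conditions; otherwise the result is NULL.
1. After spaces are removed, the ID number is 18 characters long.
2. Characters 1-17 are all digits and are not all the same digit (for example, seventeen 1s is rejected).
3. The last character is a digit, 'X', or 'x'.
4. Characters 1-2 are one of ('11','12','13','14','15','21','22','23','31','32','33','34','35','36','37','41','42','43','44','45','46','50','51','52','53','54','61','62','63','64','65','71','81','82','91').
5. Characters 7-14 hold the date of birth (yyyyMMdd):
(1) For months 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, the maximum day is 31, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 respectively;
(2) If the year is divisible by 4 but not by 100, or is divisible by 400, the maximum day for month 2 is 29; otherwise the maximum for month 2 is 28;
(3) Characters 11-12 are one of ('01','02','03','04','05','06','07','08','09','10','11','12');
(4) Characters 13-14 are in the range '01' to '31';
(5) Character 7 is a digit greater than 0 and less than or equal to 2.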
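
Rule 5(2) is the easiest one to get wrong, so here is a minimal sketch of the leap-year test on its own. The inline years and the column names `y` and `feb_max_day` are made up for illustration, and the query assumes a Hive version that allows `select` without a `from` clause.

```sql
-- Hypothetical sample years; y and feb_max_day are illustrative names only.
select y,
       case when (y % 4 = 0 and y % 100 <> 0) or y % 400 = 0
            then 29      -- leap year
            else 28      -- common year
       end as feb_max_day
from (
    select 1900 as y     -- divisible by 100 but not by 400 -> 28
    union all
    select 2000 as y     -- divisible by 400 -> 29
    union all
    select 2023 as y     -- not divisible by 4 -> 28
    union all
    select 2024 as y     -- divisible by 4, not by 100 -> 29
) t
order by y;
```

A test that only checks `y % 4 = 0` would wrongly accept 1900-02-29, which matters when birth dates back to 1900 are allowed.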

-- Assumes sfzh has already had spaces removed (condition 1); if not, strip them first.
select case when length(sfzh) = 18                                   -- condition 1: 18 characters
        and substring(sfzh, 1, 17) rlike '^\\d{17}$'                 -- condition 2: characters 1-17 are digits
        and not (substring(sfzh, 1, 17) rlike '^(\\d)\\1{16}$')      -- condition 2: not all the same digit
        and substr(sfzh, -1, 1) rlike '^[0-9xX]$'                    -- condition 3: last character is a digit, 'x' or 'X'
        and substr(sfzh,1,2) in ('11','12','13','14','15','21','22','23','31','32','33','34','35','36','37','41','42','43','44','45','46','50','51','52','53','54','61','62','63','64','65','71','81','82','91') -- condition 4
        and substr(sfzh,11,2) in ('01','02','03','04','05','06','07','08','09','10','11','12') -- condition 5(3)
        and substr(sfzh,7,1) > '0'                                   -- condition 5(5)
        and substr(sfzh,7,1) <= '2'
        and substring(sfzh, 7, 8) >= '19000101'                      -- date of birth not before 1900-01-01
        and substring(sfzh, 7, 8) <= date_format('${bizdate}','yyyyMMdd') -- date of birth not after ${bizdate}
        and (
            -- conditions 5(1), 5(2), 5(4): the day must be valid for the month
            (substring(sfzh, 11, 2) in ('01','03','05','07','08','10','12') and substring(sfzh, 13, 2) >= '01' and substring(sfzh, 13, 2) <= '31')
            or (substring(sfzh, 11, 2) in ('04','06','09','11') and substring(sfzh, 13, 2) >= '01' and substring(sfzh, 13, 2) <= '30')
            -- February: days 01-28 are valid in any year
            or (substring(sfzh, 11, 2) = '02' and substring(sfzh, 13, 2) >= '01' and substring(sfzh, 13, 2) <= '28')
            -- February 29 is valid only in a leap year (divisible by 4 and not by 100, or divisible by 400)
            or (substring(sfzh, 11, 2) = '02' and substring(sfzh, 13, 2) = '29'
                and ((cast(substring(sfzh, 7, 4) as int) % 4 = 0 and cast(substring(sfzh, 7, 4) as int) % 100 <> 0)
                     or cast(substring(sfzh, 7, 4) as int) % 400 = 0))
            )
        then sfzh
        else null end as sfzh
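
To see how the "all digits" and "not all the same digit" checks from condition 2 behave in isolation, the sketch below runs them against three made-up strings. The sample values and the column names `s`, `digits_ok` and `all_same` are hypothetical; the backslashes are doubled because Hive unescapes string literals before the pattern reaches the Java regex engine.

```sql
-- Hypothetical sample values; none of them is a real ID number.
select s,
       substr(s, 1, 17) rlike '^\\d{17}$'       as digits_ok,  -- characters 1-17 are all digits
       substr(s, 1, 17) rlike '^(\\d)\\1{16}$'  as all_same    -- one digit repeated 17 times
from (
    select concat(repeat('1', 17), 'X') as s    -- digits_ok = true,  all_same = true
    union all
    select '11010519491231002X' as s            -- digits_ok = true,  all_same = false
    union all
    select '1101051949123100AX' as s            -- digits_ok = false, all_same = false
) t;
```

A row passes condition 2 only when `digits_ok` is true and `all_same` is false. Anchoring both patterns with `^` and `$` keeps the back-reference test tied to characters 1-17 instead of matching any run of 17 identical digits elsewhere in the string.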
