Original article:
https://blog.csdn.net/qq_35440040/article/details/89226297?ops_request_misc=%25257B%252522request%25255Fid%252522%25253A%252522160974589416780302953405%252522%25252C%252522scm%252522%25253A%25252220140713.130102334.pc%25255Fall.%252522%25257D&request_id=160974589416780302953405&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_v2~rank_v29-1-89226297.first_rank_v2_pc_rank_v29&utm_term=%E7%94%B0%E6%85%A7%E6%9D%B0
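Hive stores its metadata in MySQL, so you need to log in to MySQL and create the Hive metastore database there. If the metastore database already exists, change its character set instead.
## Create the Hive metastore database named hive with the utf8 character set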
mysql>create database hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
## Change the character set of an existing Hive metastore database to utf8
mysql>alter database hive character set utf8;
## Switch to the Hive metastore database
mysql>use hive;
## Check the character set of the metastore database
mysql>show variables like 'character_set_database';
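As a cross-check on the variable above, the database default can also be read from information_schema. This is a minimal sketch that assumes the metastore database is named hive, as in the commands above; the result should report utf8, which newer MySQL versions may display as utf8mb3.

```sql
-- Assumes the metastore database is named `hive`.
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'hive';
```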
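Once Hive has been started (so that the metastore tables exist), change the character set of the metastore columns below. The changes take effect without restarting MySQL or Hive.
1). Change the character set of column comments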
mysql>alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
2). Change the character set of table comments
mysql>alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
3). Change the partition parameter and partition key comment columns so that partition keys can be described in Chinese
mysql>alter table PARTITION_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
mysql>alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;
4). Change the character set of index comments
mysql>alter table INDEX_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
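To confirm that the five ALTER statements above took effect, the column character sets can be read back from information_schema. A sketch, again assuming the metastore database is named hive; each row should show utf8 (or utf8mb3) in CHARACTER_SET_NAME.

```sql
-- Lists the character set of each column modified in steps 1) to 4).
SELECT TABLE_NAME, COLUMN_NAME, COLUMN_TYPE, CHARACTER_SET_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'hive'
  AND (TABLE_NAME, COLUMN_NAME) IN (
    ('COLUMNS_V2', 'COMMENT'),
    ('TABLE_PARAMS', 'PARAM_VALUE'),
    ('PARTITION_PARAMS', 'PARAM_VALUE'),
    ('PARTITION_KEYS', 'PKEY_COMMENT'),
    ('INDEX_PARAMS', 'PARAM_VALUE')
  );
```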
Log in to Hive and create a test table:
## Partitioned table
create table page_view
(
page_id bigint comment '页面ID',
page_name string comment '页面名称',
page_url string comment '页面URL'
)
comment '页面视图'
partitioned by (ds string comment '当前时间,用于分区字段') ;
desc page_view;
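Note: only Chinese comments added after the character set change display correctly. Chinese comments that already existed before the change will still not display correctly.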
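For a table whose comments were written before the change, one possible repair is to set the comments again from Hive so that they are stored anew in the converted columns. This is only a sketch: old_table, its column id, and the comment texts are hypothetical and do not come from the original article.

```sql
-- old_table and id are hypothetical names; substitute a real table and column.
-- Re-declare the column with the same name and type to rewrite its comment.
ALTER TABLE old_table CHANGE COLUMN id id bigint COMMENT '编号';
-- Rewrite the table-level comment.
ALTER TABLE old_table SET TBLPROPERTIES ('comment' = '旧表');
```

Running desc old_table afterwards shows whether the rewritten comments display correctly.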
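## Create an index on a table column (Hive removed index support in 3.0, so this applies to earlier versions)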
create index xyz_666_index
on table page_view (page_url)
as 'bitmap'
with deferred rebuild
COMMENT "页面URL";
## Add a partition
alter table page_view add partition(ds='20210101');
## Insert data
insert into page_view partition(ds='20210101') values (1,"张三","李四") ;
## Query the rows that contain Chinese text
select * from page_view where ds="20210101";
## Show the index
SHOW FORMATTED INDEX ON page_view;
## Show the table structure
desc page_view;
Creating a partition whose value contains Chinese characters fails with an error:
hive (default)> alter table page_view add partition(ds='20210101开心');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Exception thrown when executing query)
Solution:
Log in to MySQL.
## Switch to the Hive metastore database
mysql> use hive;
## Inspect the definition of the metastore table that stores partitions
mysql> show create table PARTITIONS;
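A small check along the way: utf8 takes up to three bytes per character and the default maximum index key length is 767 bytes, so varchar(250) is accepted (250 * 3 = 750 bytes) while varchar(260) is rejected (260 * 3 = 780 bytes).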
mysql> alter table PARTITIONS modify column `PART_NAME` varchar(767) character set utf8;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
mysql> alter table PARTITIONS modify column `PART_NAME` varchar(260) character set utf8;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes
## This one succeeds
mysql> alter table PARTITIONS modify column `PART_NAME` varchar(250) character set utf8;
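The three attempts above follow from simple arithmetic on the 767-byte limit reported in the error message. The sketch below only works through that arithmetic and lists the indexes on PARTITIONS; it assumes the limit is the 767 bytes shown in the error and that utf8 here means the three-byte utf8mb3.

```sql
-- utf8 (utf8mb3) needs up to 3 bytes per character; the key limit here is 767 bytes.
SELECT 767 DIV 3 AS max_chars,      -- 255
       255 * 3   AS bytes_at_255,   -- 765, within the limit
       256 * 3   AS bytes_at_256;   -- 768, over the limit
-- List the indexes on PARTITIONS to see which one includes PART_NAME.
SHOW INDEX FROM PARTITIONS;
```

By this arithmetic, 250 is simply a length comfortably under the 255-character bound. Partition names longer than the chosen length will no longer fit in the column.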
Back in Hive, creating the partition with a Chinese value now succeeds:
hive (default)> alter table page_view add partition(ds='20210101开心');