Chinese comments in Hive

After the Hive metastore tables are created in MySQL, a few character-set settings need to be changed.


To resolve Hive's "Specified key was too long; max key length is 767 bytes" error (MySQL limits index keys to 767 bytes, which allows at most 383 characters of a double-byte charset or 255 characters of a three-byte charset; UTF-8 encodes CJK characters as three bytes), change the metastore database's default character set to latin1:

alter database hive character set latin1;
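The byte arithmetic behind the error can be checked directly. The sketch below (purely illustrative, not part of the Hive codebase) counts the encoded size of a 256-character column value under utf8 versus latin1:

```java
import java.nio.charset.StandardCharsets;

public class KeyLengthDemo {
    public static void main(String[] args) {
        // build a 256-character value of a common CJK character
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 256; i++) {
            sb.append('中'); // 3 bytes in UTF-8
        }
        String cjkKey = sb.toString();

        // utf8 needs 3 bytes per CJK character: 256 chars -> 768 bytes,
        // one byte over MySQL's 767-byte index-key limit
        int utf8Bytes = cjkKey.getBytes(StandardCharsets.UTF_8).length;
        System.out.println(utf8Bytes); // 768

        // latin1 is a single-byte charset: the same column length
        // holding Latin characters stays at 256 bytes
        String latinKey = cjkKey.replace('中', 'a');
        int latin1Bytes = latinKey.getBytes(StandardCharsets.ISO_8859_1).length;
        System.out.println(latin1Bytes); // 256
    }
}
```

This is why switching the metastore database to latin1 keeps its indexed varchar columns under the limit.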


The following statements allow Chinese comments to be stored when creating Hive tables:

-- change the character set of column comments
alter table COLUMNS_V2 modify column COMMENT varchar(256) character set utf8;
-- change the character set of table comments
alter table TABLE_PARAMS modify column PARAM_VALUE varchar(4000) character set utf8;
-- change the character set of partition comments
alter table PARTITION_KEYS modify column PKEY_COMMENT varchar(4000) character set utf8;


After creating a table, however, describe table still returns garbled Chinese comments. Fixing this requires patching the Hive source and recompiling.

In hive-0.10, the describeTable method of src/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java delegates to src/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java:

formatter.describeTable(outStream, colPath, tableName, tbl, part, cols, descTbl.isFormatted(), descTbl.isExt());

so it is the describeTable method of the latter class that needs to be modified:

    public void describeTable(DataOutputStream outStream,
                              String colPath, String tableName,
                              Table tbl, Partition part, List cols,
                              boolean isFormatted, boolean isExt)
        throws HiveException
    {
      try {
        if (colPath.equals(tableName)) {
          if (!isFormatted) {
            // was: outStream.writeBytes(MetaDataFormatUtils.displayColsUnformatted(cols));
            outStream.write(MetaDataFormatUtils.displayColsUnformatted(cols).getBytes("UTF-8"));
          } else {
            outStream.writeBytes(
              MetaDataFormatUtils.getAllColumnsInformation(cols,
                tbl.isPartitioned() ? tbl.getPartCols() : null));
          }
        } else {
          if (isFormatted) {
            outStream.writeBytes(MetaDataFormatUtils.getAllColumnsInformation(cols));
          } else {
            // was: outStream.writeBytes(MetaDataFormatUtils.displayColsUnformatted(cols));
            outStream.write(MetaDataFormatUtils.displayColsUnformatted(cols).getBytes("UTF-8"));
          }
        }
        // ... remainder of the method unchanged ...
      } catch (IOException e) {
        throw new HiveException(e);
      }
    }

That is, the affected outStream.writeBytes calls are replaced with outStream.write plus an explicit UTF-8 encoding.
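Why this matters is easy to demonstrate in isolation: DataOutputStream.writeBytes(String) discards the high byte of every char, so any non-Latin character is mangled, while write(byte[]) with an explicit UTF-8 encoding preserves the comment intact. A standalone sketch (class and variable names are illustrative, not from the Hive source):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class WriteBytesDemo {
    public static void main(String[] args) throws IOException {
        String comment = "用户ID"; // a column comment containing Chinese characters

        // writeBytes keeps only the low 8 bits of each char,
        // so '用' (U+7528) degenerates to the single byte 0x28
        ByteArrayOutputStream lossy = new ByteArrayOutputStream();
        new DataOutputStream(lossy).writeBytes(comment);
        System.out.println(lossy.size()); // 4 bytes for 4 chars -> data lost

        // write(getBytes("UTF-8")) emits the full multi-byte encoding
        ByteArrayOutputStream intact = new ByteArrayOutputStream();
        new DataOutputStream(intact).write(comment.getBytes(StandardCharsets.UTF_8));
        System.out.println(intact.size()); // 8 bytes: 3 + 3 + 1 + 1

        // only the second stream round-trips back to the original comment
        String decoded = new String(intact.toByteArray(), StandardCharsets.UTF_8);
        System.out.println(decoded.equals(comment)); // true
    }
}
```

The same reasoning applies to any of the remaining writeBytes calls that may need to emit non-ASCII output.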

After recompiling and repackaging, copy hive-exec-xxx.jar and hive-builtins-xxxx.jar from src/build/dist/lib/ into $HIVE_HOME/lib, overwriting the original files.
  



