Integrating Hadoop 3 MapReduce NativeTask into Hive

Reference Jira: https://issues.apache.org/jira/browse/HIVE-17498?jql=text%20~%20%22HiveKey%20writableutils%22
Two changes are needed:

First change: modify util/WritableUtils.cc in the hadoop-mapreduce-client-nativetask project.
After the change, rebuild the native library and copy the resulting files into hadoop/lib/native/
Files involved:
libnativetask.a
libnativetask.so
libnativetask.so.1.0.0

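Add the HiveKey branch shown below, so that HiveKey is mapped to BytesType in the same way as BytesWritable (the Text and BytesWritable branches around it are shown for context):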

  if (clazz == "org.apache.hadoop.io.Text") {
    return TextType;
  }
  if (clazz == "org.apache.hadoop.hive.ql.io.HiveKey") {
    return BytesType;
  }
  if (clazz == "org.apache.hadoop.io.BytesWritable") {
    return BytesType;
  }

Second change: add a HivePlatform class that registers HiveKey with NativeTask:

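package org.apache.hive.mapreduce.nativetask;

import java.io.IOException;

import org.apache.hadoop.hive.ql.io.HiveKey;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.nativetask.INativeComparable;
import org.apache.hadoop.mapred.nativetask.Platform;
import org.apache.hadoop.mapred.nativetask.serde.BytesWritableSerializer;
import org.apache.hadoop.mapred.nativetask.serde.INativeSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HivePlatform extends Platform {

    // Log under this class rather than Platforms.
    private static final Logger LOG = LoggerFactory.getLogger(HivePlatform.class);

    public HivePlatform() {
    }

    @Override
    public void init() throws IOException {
        // HiveKey extends BytesWritable, so the BytesWritable serializer is reused.
        registerKey(HiveKey.class.getName(), BytesWritableSerializer.class);
    }

    @Override
    public String name() {
        return "hive";
    }

    @Override
    protected boolean support(String keyClassName, INativeSerializer<?> iNativeSerializer, JobConf jobConf) {
        // Supported only for registered key classes whose serializer is natively comparable.
        return keyClassNames.contains(keyClassName)
                && iNativeSerializer instanceof INativeComparable;
    }

    @Override
    protected boolean define(Class<?> aClass) {
        // This platform does not define any custom key comparators.
        return false;
    }
}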
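Why can HiveKey be handled as plain bytes on both the Java and native sides? The sketch below illustrates the reasoning under one assumption: that HiveKey inherits BytesWritable's wire format (a 4-byte big-endian length followed by the raw bytes) and that keys are ordered by unsigned lexicographic comparison of those bytes. The class and method names (`BytesKeyLayoutSketch`, `serialize`, `compareSerialized`) are hypothetical and are not part of Hadoop or Hive.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class BytesKeyLayoutSketch {

    // Size of the length prefix written before the payload.
    private static final int LENGTH_BYTES = 4;

    /** Writes a payload as: 4-byte big-endian length, then the payload bytes. */
    public static byte[] serialize(byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(payload.length);
            out.write(payload, 0, payload.length);
        }
        return buf.toByteArray();
    }

    /**
     * Compares two serialized keys by their payloads only: unsigned
     * byte-by-byte, with the shorter payload first when one is a prefix
     * of the other. The length prefix is skipped.
     */
    public static int compareSerialized(byte[] a, byte[] b) {
        int lenA = a.length - LENGTH_BYTES;
        int lenB = b.length - LENGTH_BYTES;
        int n = Math.min(lenA, lenB);
        for (int i = 0; i < n; i++) {
            int x = a[LENGTH_BYTES + i] & 0xff;
            int y = b[LENGTH_BYTES + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return lenA - lenB;
    }

    public static void main(String[] args) throws IOException {
        byte[] k1 = serialize("apple".getBytes(StandardCharsets.UTF_8));
        byte[] k2 = serialize("banana".getBytes(StandardCharsets.UTF_8));
        System.out.println(compareSerialized(k1, k2) < 0);
    }
}
```

Because nothing in this layout depends on the concrete key class, a key type that only adds in-memory fields on top of BytesWritable can reuse the same serializer and the same native comparison.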

Under the resources directory, create the directory META-INF/services
and add a file to it named: org.apache.hadoop.mapred.nativetask.Platform
with the content: org.apache.hive.mapreduce.nativetask.HivePlatform
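The provider file follows the standard `java.util.ServiceLoader` convention: a file named after the service type, listing implementation class names one per line. NativeTask appears to discover Platform implementations this way, which is why no other registration code is needed. The following self-contained sketch demonstrates the mechanism with hypothetical stand-in classes (`DemoPlatform`, `DemoHivePlatform`); it writes a provider file into a temporary directory instead of a real jar, and it does not use any Hadoop or Hive classes.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLoaderSketch {

    /** Hypothetical stand-in for the NativeTask Platform base class. */
    public abstract static class DemoPlatform {
        public abstract String name();
    }

    /** Hypothetical stand-in for HivePlatform; needs a public no-arg constructor. */
    public static class DemoHivePlatform extends DemoPlatform {
        public DemoHivePlatform() {
        }

        @Override
        public String name() {
            return "hive";
        }
    }

    /** Creates META-INF/services/<service type> in a temp dir and loads providers from it. */
    public static List<String> discover() throws IOException {
        Path root = Files.createTempDirectory("spi-demo");
        Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));

        // File name = binary name of the service type; content = provider class name.
        Files.write(services.resolve(DemoPlatform.class.getName()),
                DemoHivePlatform.class.getName().getBytes(StandardCharsets.UTF_8));

        List<String> names = new ArrayList<>();
        // The temp dir plays the role of the jar's resources root on the classpath.
        try (URLClassLoader loader = new URLClassLoader(
                new URL[] {root.toUri().toURL()},
                ServiceLoaderSketch.class.getClassLoader())) {
            for (DemoPlatform platform : ServiceLoader.load(DemoPlatform.class, loader)) {
                names.add(platform.name());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(discover());
    }
}
```

If the provider file is misnamed or placed in the wrong directory, the loader silently finds nothing, so the file name and path are worth double-checking inside the built jar.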


Then package the code and copy the resulting jar into the hadoop/share/hadoop/mapreduce directory.

Relevant pom dependencies:

<dependencies>
    <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-mapreduce-client-nativetask -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-mapreduce-client-nativetask</artifactId>
        <version>3.2.2</version>
        <scope>compile</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hive/hive-exec -->
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>3.1.0</version>
        <scope>compile</scope>
    </dependency>
</dependencies>
