Calling DataX from Oozie

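I. Key point

For a job submitted through Oozie, every Action (Java, Shell, and so on) must be able to run on any host that runs a NodeManager. The execution environment, dependency files (jars and the like), the executing user and its permissions, and the input and output paths must therefore be set up on every NodeManager host.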

II. NodeManager deployment

Given the requirement above, first set up every NodeManager node in the cluster so that each one can run DataX on its own.

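1. Copy the DataX package to the host

The package has to land on the local filesystem of each NodeManager host (hdfs dfs -put would place it in HDFS, where the tar command in the next step cannot see it), for example with scp, where <nodemanager-host> is a placeholder:

scp datax.tar.gz <nodemanager-host>:/usr/local/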

2. Extract it under /usr/local

cd /usr/local

tar -xvf datax.tar.gz
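Copying and extracting the package has to be repeated on every NodeManager host. A minimal sketch of doing that in a loop is shown here, assuming passwordless SSH to each host and write permission on /usr/local. The function name deploy_datax, the DRY_RUN switch, and the host names in the usage line are illustrative and not part of DataX or Oozie.

```shell
#!/bin/bash
# Sketch: push the DataX tarball to each NodeManager host and unpack it there.
# Hosts are passed as arguments; DRY_RUN=1 only prints the commands.

PACKAGE="${PACKAGE:-datax.tar.gz}"
TARGET_DIR="${TARGET_DIR:-/usr/local}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

deploy_datax() {
    local host archive
    archive="$(basename "$PACKAGE")"
    for host in "$@"; do
        # Copy the tarball to the host's local filesystem.
        run scp "$PACKAGE" "${host}:${TARGET_DIR}/" || return 1
        # Unpack it in place on that host.
        run ssh "$host" "cd ${TARGET_DIR} && tar -xvf ${archive}" || return 1
    done
}

# Usage (hypothetical hosts):
#   DRY_RUN=1 deploy_datax nm1.example.com nm2.example.com
```

Running it first with DRY_RUN=1 shows the exact scp and ssh commands before anything is changed on the hosts.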

3. Write a global launch script

vim /usr/local/datax/datax_start.sh

#!/bin/bash
echo "$1" > "/usr/local/datax/job/$2"
python2.7 /usr/local/datax/bin/datax.py "/usr/local/datax/job/$2"

 

Here $1 is the first argument in workflow.xml: the content of the DataX job configuration.

$2 is the second argument in workflow.xml: the name of the configuration file.
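The wrapper writes its first argument verbatim into a file named by its second argument. A slightly more defensive version of that step might look like the sketch below. The function name write_job_file and the JOB_DIR override are illustrative and not part of DataX or Oozie. The checks shown (argument count, no path separators in the file name) are one possible choice.

```shell
#!/bin/bash
# Sketch of the argument handling in datax_start.sh with basic validation.
# JOB_DIR defaults to the job directory used in this article.

JOB_DIR="${JOB_DIR:-/usr/local/datax/job}"

write_job_file() {
    # $1: DataX job configuration (JSON text), $2: job file name
    if [ "$#" -ne 2 ]; then
        echo "usage: write_job_file <job-json> <file-name>" >&2
        return 2
    fi
    # Reject names that would escape JOB_DIR.
    case "$2" in
        */*|""|.|..)
            echo "invalid job file name: $2" >&2
            return 2
            ;;
    esac
    # printf writes the text as-is, without echo's option and escape handling.
    printf '%s\n' "$1" > "${JOB_DIR}/$2" || return 1
    # Print the path so the caller can hand it to datax.py.
    echo "${JOB_DIR}/$2"
}
```

In the launch script, the printed path could then be passed to datax.py in place of the hard-coded one.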

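4. Create a symbolic link

The script must be executable before it can be called through the link:

chmod +x /usr/local/datax/datax_start.sh

ln -s /usr/local/datax/datax_start.sh /usr/bin/dataxstart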

 

III. Test

dataxstart "{}" "11.json"
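Beyond a manual call like this one, the per-host prerequisites can be checked with a small script. The sketch below assumes the paths used in this article. The function name check_datax_host and the DATAX_HOME and PYTHON_BIN overrides are illustrative. It is best run as the same user the Oozie action will run as, since the job directory must be writable by that user.

```shell
#!/bin/bash
# Sketch: report what is missing on this host for the dataxstart wrapper to work.
# DATAX_HOME and PYTHON_BIN default to the values used in this article.

check_datax_host() {
    local home="${DATAX_HOME:-/usr/local/datax}"
    local python_bin="${PYTHON_BIN:-python2.7}"
    local missing=0

    # The interpreter named in datax_start.sh must be on PATH.
    command -v "$python_bin" >/dev/null 2>&1 \
        || { echo "missing interpreter: $python_bin"; missing=1; }
    # The DataX entry point must exist.
    [ -f "$home/bin/datax.py" ] \
        || { echo "missing file: $home/bin/datax.py"; missing=1; }
    # The wrapper must be executable, or the symlink call fails.
    [ -x "$home/datax_start.sh" ] \
        || { echo "not executable: $home/datax_start.sh"; missing=1; }
    # The wrapper writes the job file here on every run.
    { [ -d "$home/job" ] && [ -w "$home/job" ]; } \
        || { echo "job directory not writable: $home/job"; missing=1; }

    return "$missing"
}

# Usage: check_datax_host && echo "host looks ready"
```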

 

IV. Oozie configuration files

job.properties

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

nameNode=hdfs://hadoop-ha
jobTracker=yarn-ha
queueName=default
examplesRoot=examples

oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/datax
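The application path in this file must exist in HDFS and contain workflow.xml before the job is submitted. A minimal sketch of the upload and submission follows, assuming the hdfs and oozie command-line clients are installed and that job.properties and workflow.xml are in the current directory. The Oozie server URL is a placeholder, and submit_datax_workflow and DRY_RUN are illustrative names.

```shell
#!/bin/bash
# Sketch: upload workflow.xml to the application path and submit the job.
# OOZIE_URL is a placeholder; APP_PATH mirrors oozie.wf.application.path
# from job.properties with examplesRoot=examples.

OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"
APP_PATH="${APP_PATH:-/user/$(id -un)/examples/apps/datax}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

submit_datax_workflow() {
    # Create the application directory in HDFS if needed.
    run hdfs dfs -mkdir -p "$APP_PATH" || return 1
    # Upload (or overwrite) the workflow definition.
    run hdfs dfs -put -f workflow.xml "$APP_PATH/" || return 1
    # Submit and start the workflow using the local job.properties.
    run oozie job -oozie "$OOZIE_URL" -config job.properties -run
}

# Usage: DRY_RUN=1 submit_datax_workflow
```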

 

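workflow.xml

<!-- Tag names follow the stock Oozie shell-action example; the schema versions and the node names other than shell-node are assumed from that example. -->
<workflow-app xmlns="uri:oozie:workflow:0.4" name="shell-wf">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>dataxstart</exec>
            <argument>{"job": {}}</argument>
            <argument>11.json</argument>
            <capture-output/>
        </shell>
        <ok to="check-output"/>
        <error to="fail"/>
    </action>
    <decision name="check-output">
        <switch>
            <case to="end">
                ${wf:actionData('shell-node')['my_output'] eq 'Hello Oozie'}
            </case>
            <default to="fail-output"/>
        </switch>
    </decision>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <kill name="fail-output">
        <message>Incorrect output, expected [Hello Oozie] but was [${wf:actionData('shell-node')['my_output']}]</message>
    </kill>
    <end name="end"/>
</workflow-app>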
 
