Oozie Workflow Components

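Concept definitions

  • Action:

A concrete executable task (for example a MapReduce job, a Hive or Pig job, or a shell command)

  • Workflow:

A collection of actions arranged in a directed acyclic graph (DAG)

  • Workflow Definition:

The description of a workflow that can be executed

  • Workflow Job:

An executable instance of a workflow definition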
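As a minimal sketch of how these terms relate, the workflow definition below wires a single action into a DAG: the start node points at the action, the action moves to the end node on success and to a kill node on error. Submitting this file would create one workflow job. The names `demo-wf`, `cleanup`, and `fail`, the `${nameNode}` property, the deleted path, and the schema version `0.5` are illustrative assumptions, not taken from the original text.

```xml
<!-- Hypothetical minimal workflow definition; all names and paths are illustrative -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <!-- Entry point of the DAG -->
    <start to="cleanup"/>

    <!-- A single action: delete a temporary HDFS directory -->
    <action name="cleanup">
        <fs>
            <!-- ${nameNode} is assumed to be supplied as a job property -->
            <delete path="${nameNode}/tmp/demo-output"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- Reached only when the action fails; ends the job in the KILLED state -->
    <kill name="fail">
        <message>Action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <!-- Successful completion of the workflow job -->
    <end name="end"/>
</workflow-app>
```

Every node is reachable from `start` and no transition points backwards, so the graph stays acyclic.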

Control nodes


Start control node

    ...
    <start to="[NODE-NAME]"/>
    ...

demo

    ...
    <start to="firstHadoopJob"/>
    ...


End control node

    ...
    <end name="[NODE-NAME]"/>
    ...

demo

    ...
    <end name="end"/>


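Kill control node

    ...
    <kill name="[NODE-NAME]">
        <message>[MESSAGE-TO-LOG]</message>
    </kill>
    ...

demo

    ...
    <kill name="killBecauseNoInput">
        <message>Input unavailable</message>
    </kill>
    ...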


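Decision control node

    ...
    <decision name="[NODE-NAME]">
        <switch>
            <case to="[NODE_NAME]">[PREDICATE]</case>
            ...
            <case to="[NODE_NAME]">[PREDICATE]</case>
            <default to="[NODE_NAME]"/>
        </switch>
    </decision>
    ...

demo

    ...
    <decision name="mydecision">
        <switch>
            <case to="reconsolidatejob">
              ${fs:fileSize(secondjobOutputDir) gt 10 * GB}
            </case>
            <case to="rexpandjob">
              ${fs:fileSize(secondjobOutputDir) lt 100 * MB}
            </case>
            <case to="recomputejob">
              ${ hadoop:counters('secondjob')[RECORDS][REDUCE_OUT] lt 1000000 }
            </case>
            <default to="end"/>
        </switch>
    </decision>
    ...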


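Fork and join control nodes

    ...
    <fork name="[FORK-NODE-NAME]">
        <path start="[NODE-NAME]"/>
        ...
        <path start="[NODE-NAME]"/>
    </fork>
    ...
    <join name="[JOIN-NODE-NAME]" to="[NODE-NAME]"/>
    ...

demo

    ...
    <fork name="forking">
        <path start="firstparalleljob"/>
        <path start="secondparalleljob"/>
    </fork>
    <action name="firstparalleljob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job1.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <action name="secondparalleljob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <job-xml>job2.xml</job-xml>
        </map-reduce>
        <ok to="joining"/>
        <error to="kill"/>
    </action>
    <join name="joining" to="nextaction"/>
    ...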

Action nodes


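Fs (HDFS) action

    ...
    <action name="[NODE-NAME]">
        <fs>
            <delete path="[PATH]"/>
            ...
            <mkdir path="[PATH]"/>
            ...
            <move source="[SOURCE-PATH]" target="[TARGET-PATH]"/>
            ...
            <chmod path="[PATH]" permissions="[PERMISSIONS]" dir-files="false"/>
            ...
            <touchz path="[PATH]"/>
            ...
            <chgrp path="[PATH]" group="[GROUP]" dir-files="false"/>
        </fs>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...

demo

    ...
    <action name="hdfscommands">
        <fs>
            <delete path="hdfs://foo:8020/usr/tucu/temp-data"/>
            <mkdir path="archives/${wf:id()}"/>
            <move source="${jobInput}" target="archives/${wf:id()}/processed-input"/>
            <chmod path="${jobOutput}" permissions="-rwxrw-rw-" dir-files="true"><recursive/></chmod>
            <chgrp path="${jobOutput}" group="testgroup" dir-files="true"><recursive/></chgrp>
        </fs>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...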


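Ssh action

    ...
    <action name="[NODE-NAME]">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <host>[USER]@[HOST]</host>
            <command>[SHELL]</command>
            <args>[ARGUMENTS]</args>
            ...
            <capture-output/>
        </ssh>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...

demo

    ...
    <action name="myssjob">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <host>foo@bar.com</host>
            <command>uploaddata</command>
            <args>jdbc:derby://bar.com:1527/myDB</args>
            <args>hdfs://foobar.com:8020/usr/tucu/myData</args>
        </ssh>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...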


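Sub-workflow action

    ...
    <action name="[NODE-NAME]">
        <sub-workflow>
            <app-path>[WF-APPLICATION-PATH]</app-path>
            <propagate-configuration/>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
        </sub-workflow>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...

demo

    ...
    <action name="a">
        <sub-workflow>
            <app-path>child-wf</app-path>
            <configuration>
                <property>
                    <name>input.dir</name>
                    <value>${wf:id()}/second-mr-output</value>
                </property>
            </configuration>
        </sub-workflow>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    ...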


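Java action

    ...
    <action name="[NODE-NAME]">
        <java>
            <job-tracker>[JOB-TRACKER]</job-tracker>
            <name-node>[NAME-NODE]</name-node>
            <prepare>
               <delete path="[PATH]"/>
               ...
               <mkdir path="[PATH]"/>
               ...
            </prepare>
            <job-xml>[JOB-XML]</job-xml>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
            <main-class>[MAIN-CLASS]</main-class>
            <java-opts>[JAVA-STARTUP-OPTS]</java-opts>
            <arg>ARGUMENT</arg>
            ...
            <file>[FILE-PATH]</file>
            ...
            <archive>[FILE-PATH]</archive>
            ...
            <capture-output/>
        </java>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...

demo

    ...
    <action name="myfirstjavajob">
        <java>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="${jobOutput}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.queue.name</name>
                    <value>default</value>
                </property>
            </configuration>
            <main-class>org.apache.oozie.MyFirstMainClass</main-class>
            <java-opts>-Dblah</java-opts>
            <arg>argument1</arg>
            <arg>argument2</arg>
        </java>
        <ok to="myotherjob"/>
        <error to="errorcleanup"/>
    </action>
    ...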


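Map-reduce action

    ...
    <action name="[NODE-NAME]">
        <map-reduce>
            <job-tracker>[JOB-TRACKER]</job-tracker>
            <name-node>[NAME-NODE]</name-node>
            <prepare>
                <delete path="[PATH]"/>
                ...
                <mkdir path="[PATH]"/>
                ...
            </prepare>
            <streaming>
                <mapper>[MAPPER-PROCESS]</mapper>
                <reducer>[REDUCER-PROCESS]</reducer>
                <record-reader>[RECORD-READER-CLASS]</record-reader>
                <record-reader-mapping>[NAME=VALUE]</record-reader-mapping>
                ...
                <env>[NAME=VALUE]</env>
                ...
            </streaming>
            <!-- Either streaming or pipes can be specified for an action, not both -->
            <pipes>
                <map>[MAPPER]</map>
                <reduce>[REDUCER]</reduce>
                <inputformat>[INPUTFORMAT]</inputformat>
                <partitioner>[PARTITIONER]</partitioner>
                <writer>[OUTPUTFORMAT]</writer>
                <program>[EXECUTABLE]</program>
            </pipes>
            <job-xml>[JOB-XML-FILE]</job-xml>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
            <config-class>com.example.MyConfigClass</config-class>
            <file>[FILE-PATH]</file>
            ...
            <archive>[FILE-PATH]</archive>
            ...
        </map-reduce>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...

demo

    ...
    <action name="myfirstHadoopJob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="hdfs://foo:8020/usr/tucu/output-data"/>
            </prepare>
            <job-xml>/myfirstjob.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/usr/tucu/input-data</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/usr/tucu/output-data</value>
                </property>
                <property>
                    <name>mapred.reduce.tasks</name>
                    <value>${firstJobReducers}</value>
                </property>
                <property>
                    <name>oozie.action.external.stats.write</name>
                    <value>true</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="myNextAction"/>
        <error to="errorCleanup"/>
    </action>
    ...

demo (streaming)

    ...
    <action name="firstjob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="${output}"/>
            </prepare>
            <streaming>
                <mapper>/bin/bash testarchive/bin/mapper.sh testfile</mapper>
                <reducer>/bin/bash testarchive/bin/reducer.sh</reducer>
            </streaming>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${input}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${output}</value>
                </property>
                <property>
                    <name>stream.num.map.output.key.fields</name>
                    <value>3</value>
                </property>
            </configuration>
            <file>/users/blabla/testfile.sh#testfile</file>
            <archive>/users/blabla/testarchive.jar#testarchive</archive>
        </map-reduce>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    ...

demo (pipes)

    ...
    <action name="firstjob">
        <map-reduce>
            <job-tracker>foo:8021</job-tracker>
            <name-node>bar:8020</name-node>
            <prepare>
                <delete path="${output}"/>
            </prepare>
            <pipes>
                <program>bin/wordcount-simple#wordcount-simple</program>
            </pipes>
            <configuration>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${input}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${output}</value>
                </property>
            </configuration>
            <archive>/users/blabla/testarchive.jar#testarchive</archive>
        </map-reduce>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    ...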
