Automated deployment of DataX scheduled synchronization in Docker

1. Writing the Dockerfile

(1) Here is an example Dockerfile; I am using Ubuntu 16.04 as the base:

# Base image
FROM ubuntu:16.04

# Install the required packages. I use the pre-built (extracted) DataX release,
# so Maven is not needed; the runtime dependencies are Python, JDK 8, cron,
# rsyslog and tzdata. cron provides crontab for the scheduled tasks.
RUN apt-get update && apt-get install -y python openjdk-8-jre cron rsyslog tzdata

# Set the time zone to Shanghai
ENV TZ=Asia/Shanghai
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN dpkg-reconfigure --frontend noninteractive tzdata

# Copy the prepared files into the image and use /app as the working directory
COPY . /app
WORKDIR /app

# Install the scheduled tasks. The file is installed with the crontab command
# rather than copied to /etc/crontab, because the system crontab expects an
# extra user field that crontabfile does not have.
RUN crontab /app/crontabfile
RUN touch /var/log/cron.log
RUN chmod +x /app/run.sh
CMD ["bash","/app/run.sh"]
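Because the Dockerfile runs COPY . /app, the build context must already contain everything the image needs. A layout along the following lines matches the paths referenced in this article (the directory names are taken from the examples here; adjust them to your own setup):

Dockerfile
crontabfile
run.sh
syntask227.sh
syntask147.sh
syntask149.sh
datax/      (the extracted DataX release, containing bin/datax.py)
auto227/    (the DataX job JSON files)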

(2) The crontabfile:

# Sync the data every five minutes and append the output to a log file

*/5 * * * * bash /app/syntask227.sh cron >> /var/log/cron.log 2>&1
*/5 * * * * bash /app/syntask147.sh cron >> /var/log/cron1.log 2>&1
*/5 * * * * bash /app/syntask149.sh cron >> /var/log/cron2.log 2>&1
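To confirm that cron actually picked up these entries, you can list the installed crontab and watch the log from inside a running container (the container name datax-sync used below is just a placeholder; see section 3 for running the container):

docker exec -it datax-sync crontab -l
docker exec -it datax-sync tail -f /var/log/cron.log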

(3) run.sh:

#!/bin/bash
# Start the syslog and cron daemons
rsyslogd
cron
touch /var/log/cron.log

# tail -F prints the system and cron logs, and because it never exits it also
# keeps the container running.
tail -F /var/log/syslog /var/log/cron.log

(4) An example DataX job JSON file:

{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "mysqlreader", 
                    "parameter": {
                        "column": [
                        "uuid","drain_id","plan_hole_no_id","work_id","org_code","org_name","work_name","survey_water_mileage","drill_hole_place","hole_start_time","hole_class_num_key","hole_class_num_value","principal_name","hole_org_name","hole_no","height","hole_azimuth","hole_obliquity","hole_distance","analysis_hole_distance","hole_condition","tfs_monitor_name","safe_name","cj_monitor_name","check_name","check_time","check_class_num_key","check_class_num_value","start_drill_time","end_drill_time","remark","status","upload_time","create_time","del_flag","diy_column","update_time"
                        ], 
                        "connection": [
                            {
                                "jdbcUrl": ["${jdbcUrlSrc}"], 
                                "table": ["tfs_acceptance_checks"]
                            }
                        ], 
                        "password": "${passwordSrc}", 
                        "username": "${usernameSrc}",
                        "where": "${syn_all}=1 or update_time > FROM_UNIXTIME(${start_time}) and update_time < FROM_UNIXTIME(${end_time})"
                    }
                }, 
                "writer": {
                    "name": "mysqlwriter", 
                    "parameter": {
                        "column": [
                        "uuid","drain_id","plan_hole_no_id","work_id","org_code","org_name","work_name","survey_water_mileage","drill_hole_place","hole_start_time","hole_class_num_key","hole_class_num_value","principal_name","hole_org_name","hole_no","height","hole_azimuth","hole_obliquity","hole_distance","analysis_hole_distance","hole_condition","tfs_monitor_name","safe_name","cj_monitor_name","check_name","check_time","check_class_num_key","check_class_num_value","start_drill_time","end_drill_time","remark","status","upload_time","create_time","del_flag","diy_column","update_time"
                        ], 
                        "writeMode":"update",
                        "connection": [
                            {
                                "jdbcUrl": "${jdbcUrlDest}", 
                                "table": ["tfs_acceptance_checks"]
                            }
                        ], 
                        "password": "${passwordDest}", 
                        "username": "${usernameDest}"
                    }
                }
            }
        ], 
        "setting": {
            "speed": {
                "channel": "4",
                "batchSize":"4096"
            }
        }
    }
}

channel is the number of concurrent transfer channels; typical values are 1, 4, 8, 16 or 32. More channels generally means higher throughput.

batchSize is the number of records written per batch; see the official DataX documentation on GitHub for details.

(5) An example syntask227.sh:

#!/bin/bash
source /etc/profile
# End time: the current Unix timestamp
end_time=$(date +%s)
# Start time: 3600 s (one hour) before the end time
start_time=$(($end_time - 3600))
jdbcUrlSrc="<source database JDBC URL>?useUnicode=true&characterEncoding=utf8"
usernameSrc="<source database user>"
passwordSrc="<source database password>"
jdbcUrlDest="<destination database JDBC URL>?useUnicode=true&characterEncoding=utf8"
usernameDest="<destination database user>"
passwordDest="<destination database password>"
syn_all=1

# Path to datax.py inside the image
dataxPath="/app/datax/bin"

# Path to the job JSON files inside the image
jsonPath="/app/auto227"
$dataxPath/datax.py $jsonPath/tfs_tunnel_designs.json -p "-Dstart_time=$start_time -Dend_time=$end_time -DjdbcUrlSrc='$jdbcUrlSrc' -DusernameSrc='$usernameSrc' -DpasswordSrc='$passwordSrc' -DjdbcUrlDest='$jdbcUrlDest' -DusernameDest='$usernameDest' -DpasswordDest='$passwordDest' -Dsyn_all=$syn_all"
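Before relying on the cron schedule, it is worth executing the script once by hand to confirm that the DataX job finishes without errors. Once the container is running (see section 3), something like:

docker exec -it <container name> bash /app/syntask227.sh cron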

2. Building the image

docker build -t <your-image-name> .

Note that there is a space between the image name and the final dot; the dot tells Docker to use the current directory as the build context.
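For example, with a hypothetical image name of datax-sync:

docker build -t datax-sync .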

3. Running the image

docker run -it <your-image-name>

If you want it to keep running in the background, add -d to run the container as a daemon.

Use docker ps to list the running containers.

You can also enter the container and check the log files to verify that the syncs succeeded: docker exec -it <container name or ID> /bin/bash
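Putting it all together, a typical sequence might look like this (the image and container name datax-sync is a placeholder):

docker run -d --name datax-sync datax-sync
docker ps
docker logs -f datax-sync
docker exec -it datax-sync /bin/bash

Because run.sh keeps tail -F in the foreground, docker logs shows the system and cron logs without entering the container.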