Chapter 4: FlinkSQL + Iceberg Hands-On Case Study: Real-Time MySQL Synchronization

[Figure 1]

1. Enable the binlog

1.1 In MySQL, enable binlog replication for the database you want to monitor in real time

sudo vim /etc/my.cnf

# enable the binlog for the target database
server-id=1
log-bin=mysql-bin
binlog_format=row
binlog-do-db=m_iceberg


# restart the MySQL service
sudo systemctl restart mysqld
  • The configuration looks like this:

[Figure 2]

1.2 Verify that the binlog is enabled

cd /var/lib/mysql
  • Check for the binlog files:

[Figure 3]
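Beyond listing the files on disk, the binlog status can also be checked from inside MySQL with standard commands:

```sql
-- 'ON' confirms the binlog is enabled
SHOW VARIABLES LIKE 'log_bin';
-- current binlog file and write position
SHOW MASTER STATUS;
-- all binlog files the server currently keeps
SHOW BINARY LOGS;
```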

2. Prerequisites for real-time writes

  • Required configuration: set the checkpoint parameters in flink-conf.yaml. The Iceberg sink only commits data when a checkpoint completes, so without checkpointing nothing is ever written to the table.
execution.checkpointing.interval: 3min
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
execution.checkpointing.max-concurrent-checkpoints: 1
execution.checkpointing.min-pause: 0
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.timeout: 10min
execution.checkpointing.tolerable-failed-checkpoints: 0
execution.checkpointing.unaligned: false
#
# Supported backends are 'jobmanager', 'filesystem', 'rocksdb', or the <class-name-of-factory>.
state.backend: filesystem

# Directory for checkpoints filesystem, when using any of the default bundled state backends.
state.checkpoints.dir: hdfs:///flink/flink-checkpoints

# Default target directory for savepoints, optional.
state.savepoints.dir: hdfs:///flink/flink-savepoints
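If editing flink-conf.yaml is not an option, the same options can also be set per session from the Flink SQL client. A minimal sketch (the quoted SET syntax shown here is for Flink 1.14+ clients; older versions use SET key=value without quotes):

```sql
SET 'execution.checkpointing.interval' = '3min';
SET 'execution.checkpointing.mode' = 'EXACTLY_ONCE';
SET 'state.backend' = 'filesystem';
SET 'state.checkpoints.dir' = 'hdfs:///flink/flink-checkpoints';
```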
  • The required dependency jars are:
flink-sql-connector-kafka_2.11-1.13.5.jar
flink-sql-connector-mysql-cdc-1.3.0.jar
# download: https://repo.maven.apache.org/maven2/com/alibaba/ververica/flink-sql-connector-mysql-cdc/1.3.0/

3. Create the source table in MySQL

CREATE TABLE `product_view` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`product_id` int(11) NOT NULL,
`server_id` int(11) NOT NULL,
`duration` int(11) NOT NULL,
`times` varchar(11) NOT NULL,
`time` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `time` (`time`),
KEY `user_product` (`user_id`,`product_id`) USING BTREE,
KEY `times` (`times`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- sample data
INSERT INTO `product_view` VALUES ('1', '1', '1', '1', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('2', '1', '1', '1', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('3', '1', '1', '3', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('4', '1', '1', '2', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('5', '8', '1', '1', '120', '120', '2020-05-14 13:14:00');
INSERT INTO `product_view` VALUES ('6', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');
INSERT INTO `product_view` VALUES ('7', '8', '1', '3', '120', '120', '2020-04-24 13:14:00');
INSERT INTO `product_view` VALUES ('8', '8', '1', '3', '120', '120', '2020-04-23 13:14:00');
INSERT INTO `product_view` VALUES ('9', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');

INSERT INTO `product_view` VALUES (10, 8, 1, 2, 120, '120', '2020-05-13 13:14:00');

4. Create a Flink table mapped to the MySQL table

  • Meaning: operating on this table in the Flink SQL client is equivalent to operating on the corresponding table in MySQL.
CREATE TABLE `default_catalog`.`default_database`.`product_view_source` (
    `id` int,
    `user_id` int,
    `product_id` int,
    `server_id` int,
    `duration` int,
    `times` string,
    `time` timestamp,
    PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
    'connector' = 'mysql-cdc',
    'hostname' = '192.168.6.102',
    'port' = '3306',
    'username' = 'root',
    'password' = '<your-password>',
    'database-name' = 'm_iceberg',
    'table-name' = 'product_view'
);
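A quick way to confirm the CDC connection works is to query the table from the SQL client; this starts a continuous streaming job that prints change events from MySQL as they arrive:

```sql
SELECT * FROM default_catalog.default_database.product_view_source;
```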

5. Sink the MySQL table into the data lake

[Figure 4]

  • Ingest into the lake with Flink CDC
CREATE EXTERNAL TABLE product_view_sink (
    `id` int,
    `user_id` int,
    `product_id` int,
    `server_id` int,
    `duration` int,
    `times` string,
    `time` timestamp,
    PRIMARY KEY (`id`) NOT ENFORCED
) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES ( 'write.metadata.delete-after-commit.enabled'='true',
 'write.metadata.previous-versions-max'='5',
 'format-version'='2');

insert into product_view_sink select * from default_catalog.default_database.product_view_source;
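Note that data in the Iceberg sink only becomes visible after a checkpoint commits, so the sink table is best inspected with a one-shot batch query once at least one checkpoint interval (3 min above) has elapsed. A sketch, assuming a Flink 1.14+ SQL client:

```sql
-- switch the session to batch mode for a one-shot read
SET 'execution.runtime-mode' = 'batch';
SELECT count(*) FROM product_view_sink;
```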
  • Verification: the source table (in default_catalog) can be queried in real time, while the sink table (in hive_Iceberg) cannot be queried in real time

[Figure 5]

6. Synchronizing inserts, updates, and deletes

  • Verified result: Flink 1.14.5 supports synchronizing inserts, updates, and deletes
  • Hive table properties required when creating the table:
TBLPROPERTIES ( 'write.metadata.delete-after-commit.enabled'='true',
 'write.metadata.previous-versions-max'='5',
 'format-version'='2');

6.1 Insert verification

INSERT INTO `product_view` VALUES ('100', '8', '1', '2', '120', '120', '2020-05-13 13:14:00');
  • Result:

[Figure 6]

6.2 Update verification

UPDATE product_view SET user_id = 100 WHERE id = 100;
  • Result:

[Figure 7]

6.3 Delete verification

delete from product_view where id = 100;
  • Result:

[Figure 8]
