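Iceberg Data Lake in Practice, Lesson 1: Getting started
Iceberg Data Lake in Practice, Lesson 2: Iceberg's underlying data format on Hadoop
Iceberg Data Lake in Practice, Lesson 3: Reading from Kafka into Iceberg with SQL in the SQL client
Iceberg Data Lake in Practice, Lesson 4: Reading from Kafka into Iceberg with SQL in the SQL client (upgraded to Flink 1.12.7)
Iceberg Data Lake in Practice, Lesson 5: Characteristics of the Hive catalog
Iceberg Data Lake in Practice, Lesson 6: Fixing failed writes from Kafka to Iceberg
Iceberg Data Lake in Practice, Lesson 7: Real-time writes to Iceberg
Iceberg Data Lake in Practice, Lesson 8: Integrating Hive with Iceberg
Iceberg Data Lake in Practice, Lesson 9: Compacting small files
Iceberg Data Lake in Practice, Lesson 10: Deleting snapshots
Iceberg Data Lake in Practice, Lesson 11: Full workflow test on a partitioned table (generating data, creating the table, compaction, deleting snapshots)
Iceberg Data Lake in Practice, Lesson 12: What is a catalog
Iceberg Data Lake in Practice, Lesson 13: Metadata many times larger than the data files
Iceberg Data Lake in Practice, Lesson 14: Merging metadata (solving metadata bloat over time)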
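Data is written to Iceberg continuously, and compaction and snapshot expiration run as well. The snapshot and manifest files do get cleaned up, but the metadata.json files show no sign of ever being cleaned.
The data directory holds only 6.3 M in 21 files, yet the metadata directory totals 33.1 G in 8715 files. Expiring every snapshot older than 5 minutes before the latest snapshot had no effect on the data.
How can this be solved? Still open; it will be addressed in a later update.
The table was created in the SQL client on a HiveCatalog; see Lesson 11 for the CREATE TABLE statement.
The same problem showed up at the end of Lesson 11. It gets its own article here to underline how important it is.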
File sizes
[root@hadoop103 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/
6.3 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data
33.1 G /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata
File counts
[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data|wc
21 61 2940
[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata|wc
8715 26144 1246221
metadata directory
-rw-r--r-- 2 root supergroup 8118751 2022-01-26 11:19 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08690-b9a3c862-443e-4f6b-a1fc-c17fe3e517dc.metadata.json
-rw-r--r-- 2 root supergroup 8119685 2022-01-26 11:20 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08691-34894f4a-d881-4b8f-b228-7adba992a08f.metadata.json
-rw-r--r-- 2 root supergroup 8120615 2022-01-26 11:21 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08692-1ce25766-4ca5-473e-945f-3fd848cae5e3.metadata.json
-rw-r--r-- 2 root supergroup 8121549 2022-01-26 11:22 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08693-4bd481a5-f32b-4f15-aad7-4cd3a5af6b39.metadata.json
-rw-r--r-- 2 root supergroup 8122483 2022-01-26 11:23 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08694-4f3554aa-4db7-443d-bbb9-ac0871ec02da.metadata.json
-rw-r--r-- 2 root supergroup 8123417 2022-01-26 11:24 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08695-e8bf9bda-44e7-4624-83a2-d64db09f5660.metadata.json
-rw-r--r-- 2 root supergroup 8124351 2022-01-26 11:25 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08696-2b95f1d4-6843-41e6-9e16-77bbe1875b7f.metadata.json
-rw-r--r-- 2 root supergroup 8125285 2022-01-26 11:26 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08697-f11c1b8f-f987-4589-8159-521c65328163.metadata.json
-rw-r--r-- 2 root supergroup 8126219 2022-01-26 11:27 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08698-fb8b744a-db03-4b80-8612-15de1d6278cc.metadata.json
-rw-r--r-- 2 root supergroup 8127153 2022-01-26 11:28 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08699-a6b6683d-d9f1-45a1-a09b-b242a8284b96.metadata.json
-rw-r--r-- 2 root supergroup 8128087 2022-01-26 11:29 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08700-cad78b24-8cd7-464f-95fe-296e96bfd648.metadata.json
-rw-r--r-- 2 root supergroup 8129021 2022-01-26 11:30 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08701-0f702902-b2ae-4029-b8cd-97b5df0474ff.metadata.json
-rw-r--r-- 2 root supergroup 8129955 2022-01-26 11:31 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08702-91dbcc1f-9d40-4662-874e-8f1091c0a52f.metadata.json
-rw-r--r-- 2 root supergroup 8130889 2022-01-26 11:32 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08703-2c78ad8f-69ff-408f-afec-8d707ff944e8.metadata.json
-rw-r--r-- 2 root supergroup 8131823 2022-01-26 11:33 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08704-84085a27-b185-468f-9c23-2984a9330762.metadata.json
-rw-r--r-- 2 root supergroup 8132757 2022-01-26 11:34 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08705-edc7f661-0ed2-4e46-82a0-a2006dd01ad5.metadata.json
-rw-r--r-- 2 root supergroup 8133691 2022-01-26 11:35 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08706-9c3378aa-21cb-48bf-be52-70b25ea59308.metadata.json
-rw-r--r-- 2 root supergroup 8343948 2022-01-27 11:52 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08707-afd79c3c-e280-45c4-9797-2fa9a4fa27f4.metadata.json
-rw-r--r-- 2 root supergroup 8344913 2022-01-27 14:16 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08708-75efd8f6-ba3f-47dc-8b89-b3177c477a62.metadata.json
-rw-r--r-- 2 root supergroup 8345875 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08709-78209251-777c-4a4f-9292-64cf3f2190ae.metadata.json
-rw-r--r-- 2 root supergroup 23219 2022-01-27 15:17 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
-rw-r--r-- 2 root supergroup 5777 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m0.avro
-rw-r--r-- 2 root supergroup 6441 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m1.avro
-rw-r--r-- 2 root supergroup 5771 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m2.avro
-rw-r--r-- 2 root supergroup 3844 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/snap-7762404597294868190-1-6c6d7719-74a9-4817-914a-b0df5eb8f6ba.avro
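The sizes in this listing allow a rough back-of-the-envelope check. The following Scala sketch is not part of the original job, and the object and method names are made up for illustration. It computes how much each successive metadata.json grows, then estimates the directory total under two assumptions: that file k is roughly k times that growth, and that versions 00000 through 08709 are all still on disk.

```scala
object MetadataGrowthEstimate {
  // Sizes in bytes of 08690..08706 metadata.json, copied from the listing above.
  val sizes: Vector[Long] = Vector(
    8118751L, 8119685L, 8120615L, 8121549L, 8122483L, 8123417L,
    8124351L, 8125285L, 8126219L, 8127153L, 8128087L, 8129021L,
    8129955L, 8130889L, 8131823L, 8132757L, 8133691L
  )

  // Growth per commit: difference between consecutive metadata.json sizes.
  def deltas(xs: Seq[Long]): Seq[Long] =
    xs.zip(xs.tail).map { case (a, b) => b - a }

  // Sum of k * perCommit for k = 0 until count (assumed linear growth from zero).
  def triangularTotal(perCommit: Long, count: Long): Long =
    perCommit * (count - 1) * count / 2

  def main(args: Array[String]): Unit = {
    val perCommit = deltas(sizes).max
    val total = triangularTotal(perCommit, 8710L)
    println(s"growth per commit: $perCommit bytes")
    println(f"estimated total: ${total / math.pow(1024, 3)}%.1f GiB")
  }
}
```

Under those assumptions the estimate comes out at about 33 GiB, which is in line with the 33.1 G reported by hadoop fs -du. That would be consistent with every metadata.json version being kept while each new version is slightly larger than the last.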
Sizes in human-readable format
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08684-d4af58ae-4967-48a6-ac40-9308a075fe00.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08685-89f09f2f-6cdf-43d8-acc2-79496dcaf18d.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08686-9be5033f-2592-4696-9c2f-5d1d408910c6.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08687-f111331a-599f-4068-9590-e57c76e46c31.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08688-18779a1c-fd2d-43c2-9c62-4d1efb4caed2.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08689-a1bfd5ea-23a1-431b-8208-a82f2561952e.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08690-b9a3c862-443e-4f6b-a1fc-c17fe3e517dc.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08691-34894f4a-d881-4b8f-b228-7adba992a08f.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08692-1ce25766-4ca5-473e-945f-3fd848cae5e3.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08693-4bd481a5-f32b-4f15-aad7-4cd3a5af6b39.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08694-4f3554aa-4db7-443d-bbb9-ac0871ec02da.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08695-e8bf9bda-44e7-4624-83a2-d64db09f5660.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08696-2b95f1d4-6843-41e6-9e16-77bbe1875b7f.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08697-f11c1b8f-f987-4589-8159-521c65328163.metadata.json
7.7 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08698-fb8b744a-db03-4b80-8612-15de1d6278cc.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08699-a6b6683d-d9f1-45a1-a09b-b242a8284b96.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08700-cad78b24-8cd7-464f-95fe-296e96bfd648.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08701-0f702902-b2ae-4029-b8cd-97b5df0474ff.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08702-91dbcc1f-9d40-4662-874e-8f1091c0a52f.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08703-2c78ad8f-69ff-408f-afec-8d707ff944e8.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08704-84085a27-b185-468f-9c23-2984a9330762.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08705-edc7f661-0ed2-4e46-82a0-a2006dd01ad5.metadata.json
7.8 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08706-9c3378aa-21cb-48bf-be52-70b25ea59308.metadata.json
8.0 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08707-afd79c3c-e280-45c4-9797-2fa9a4fa27f4.metadata.json
8.0 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08708-75efd8f6-ba3f-47dc-8b89-b3177c477a62.metadata.json
8.0 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08709-78209251-777c-4a4f-9292-64cf3f2190ae.metadata.json
22.7 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
5.6 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m0.avro
6.3 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m1.avro
5.6 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m2.avro
3.8 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/snap-7762404597294868190-1-6c6d7719-74a9-4817-914a-b0df5eb8f6ba.avro
data directory:
[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data
169.1 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00001.parquet
169.0 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00002.parquet
169.1 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00003.parquet
3.1 M /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-cdcc5019-0c59-41e4-80c6-1d4185455065-00001.parquet
508 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-dd8bc29f-831a-4904-830e-2ef56e4a4743-08707.parquet
169.0 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00001.parquet
169.1 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00002.parquet
169.1 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00003.parquet
552 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-e9e8a782-fa82-4c4d-9786-c05b8aab251a-08707.parquet
5.9 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-a0f46641-b14d-4f8b-a16e-4c768bcba775-00109.parquet
169.1 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00001.parquet
164.7 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00002.parquet
169.2 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00003.parquet
169.0 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00004.parquet
169.2 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00001.parquet
169.0 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00002.parquet
168.9 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00003.parquet
168.9 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00004.parquet
527.5 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00001.parquet
169.0 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00002.parquet
168.8 K /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00003.parquet
Running the compaction and cleanup code
Expire every snapshot older than 5 minutes before the latest snapshot
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.catalog.{Namespace, TableIdentifier}
import org.apache.iceberg.flink.actions.Actions
import org.apache.iceberg.flink.{CatalogLoader, TableLoader}
import org.apache.log4j.{Level, Logger}
import org.slf4j.LoggerFactory
import java.util
import java.util.concurrent.TimeUnit

object FlinkDataStreamSmallFileCompactTest {
  private var logger: org.slf4j.Logger = _

  def main(args: Array[String]): Unit = {
    logger = LoggerFactory.getLogger(this.getClass.getSimpleName)
    Logger.getLogger("org.apache").setLevel(Level.INFO)
    Logger.getLogger("hive.metastore").setLevel(Level.WARN)
    Logger.getLogger("akka").setLevel(Level.WARN)

    // Hive catalog
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    System.setProperty("HADOOP_USER_NAME", "root")
    val map = new util.HashMap[String, String]()
    map.put("type", "iceberg")
    map.put("catalog-type", "hive")
    map.put("property-version", "2")
    map.put("warehouse", "/user/hive/warehouse")
    // map.put("datanucleus.schema.autoCreateTables", "true")
    map.put("uri", "thrift://hadoop101:9083")
    val iceberg_catalog = CatalogLoader.hive(
      "hive_catalog6", // catalog name
      new Configuration(),
      map // catalog properties built above
    )

    // Alternative table: behavior_with_date_log_ib
    // val identifier = TableIdentifier.of(Namespace.of("iceberg_db6"), // database name
    //   "behavior_with_date_log_ib") // table name
    val identifier = TableIdentifier.of(Namespace.of("iceberg_db6"), // database name
      "behavior_log_ib6") // table name
    val loader = TableLoader.fromCatalog(iceberg_catalog, identifier)
    loader.open()
    val table = loader.loadTable()

    // Compact small files into files of up to 128 MB
    Actions.forTable(env, table)
      .rewriteDataFiles
      .maxParallelism(5)
      .targetSizeInBytes(128 * 1024 * 1024)
      .execute

    // Expire snapshots older than 5 minutes before the current snapshot.
    // The null check comes first: an empty table has no current snapshot.
    val snapshot = table.currentSnapshot
    if (snapshot != null) {
      val old = snapshot.timestampMillis - TimeUnit.MINUTES.toMillis(5)
      table.expireSnapshots
        .expireOlderThan(old)
        .commit()
      println(s" ${identifier.name} table cleanup finished!!!")
    }
  }
}
Cleanup log:
Finding: nothing was cleaned up
22/02/10 19:48:51 INFO conf.HiveConf: Found configuration file file:/E:/workspace/jt_workspace/iceberg-learning/flink-iceberg-learning/target/classes/hive-site.xml
22/02/10 19:48:51 WARN conf.HiveConf: HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
22/02/10 19:48:51 INFO security.JniBasedUnixGroupsMapping: Error getting groups for root: Unknown error.
22/02/10 19:48:51 WARN security.UserGroupInformation: No groups available for user root
22/02/10 19:48:51 INFO iceberg.BaseMetastoreTableOperations: Refreshing table metadata from new version: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
22/02/10 19:48:56 INFO iceberg.BaseMetastoreCatalog: Table loaded by catalog: hive_catalog6.iceberg_db6.behavior_log_ib6
22/02/10 19:48:56 INFO iceberg.BaseTableScan: Scanning table hive_catalog6.iceberg_db6.behavior_log_ib6 snapshot 7762404597294868190 created at 2022-01-27 14:38:10.105 with filter true
22/02/10 19:48:56 INFO iceberg.RemoveSnapshots: Expiring snapshots older than: Thu Jan 27 14:33:10 CST 2022 (1643265190105)
22/02/10 19:48:56 INFO iceberg.BaseMetastoreTableOperations: Nothing to commit.
22/02/10 19:48:56 INFO iceberg.RemoveSnapshots: Committed snapshot changes
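The log above can be read as simple timestamp arithmetic. The sketch below is a simplified model, not Iceberg's implementation: `Snap` and `expirable` are hypothetical names, and the rule that the current snapshot is never expired is an assumption of the model. It reproduces the cutoff printed in the log (1643265190105) and shows that a table whose only remaining snapshot is the current one has nothing left to expire, which matches the "Nothing to commit" line.

```scala
import java.util.concurrent.TimeUnit

object ExpireCutoffSketch {
  // Hypothetical stand-in for a snapshot: just an id and a commit timestamp.
  final case class Snap(id: Long, timestampMillis: Long)

  // Cutoff used by the cleanup code: current snapshot time minus 5 minutes.
  def cutoff(current: Snap): Long =
    current.timestampMillis - TimeUnit.MINUTES.toMillis(5)

  // Snapshots older than the cutoff, never including the current one (model assumption).
  def expirable(all: Seq[Snap], current: Snap): Seq[Snap] =
    all.filter(s => s.id != current.id && s.timestampMillis < cutoff(current))

  def main(args: Array[String]): Unit = {
    // Snapshot id from the log; 1643265490105 is 2022-01-27 14:38:10.105 CST.
    val current = Snap(7762404597294868190L, 1643265490105L)
    println(cutoff(current))
    println(expirable(Seq(current), current).size)
  }
}
```

The metadata listing earlier shows a single snap-*.avro file for this table, so the one-snapshot input used in `main` mirrors the state the job found.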
Expiration log from another table:
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=6848801094803890889, timestamp_ms=1644485336293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18788, added-data-files=3, added-records=5961, added-files-size=51810, changed-partition-count=2, total-records=4985317, total-files-size=43416360, total-data-files=105, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6848801094803890889-1-d96ba7dc-7ff2-40ad-a582-f33c987a6740.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=5895976650901516425, timestamp_ms=1644485396286, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18789, added-data-files=2, added-records=5960, added-files-size=50611, changed-partition-count=1, total-records=4991277, total-files-size=43466971, total-data-files=107, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5895976650901516425-1-a9f423cc-0133-4118-9292-016d5227f57a.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=3903341502082098658, timestamp_ms=1644485457083, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18790, added-data-files=2, added-records=5960, added-files-size=50631, changed-partition-count=1, total-records=4997237, total-files-size=43517602, total-data-files=109, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3903341502082098658-1-7a86b5d3-8c5e-4a9c-96c3-85a0c5fa3df0.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=1095796975631658317, timestamp_ms=1644485516288, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18791, added-data-files=2, added-records=5959, added-files-size=51052, changed-partition-count=1, total-records=5003196, total-files-size=43568654, total-data-files=111, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1095796975631658317-1-b071bfb7-3109-4a92-972d-c620138f7220.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=451594432613548689, timestamp_ms=1644485576287, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18792, added-data-files=2, added-records=5959, added-files-size=50810, changed-partition-count=1, total-records=5009155, total-files-size=43619464, total-data-files=113, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-451594432613548689-1-4de192bc-1b21-445b-903f-a88137b930c5.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=22739922463920002, timestamp_ms=1644485636293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18793, added-data-files=2, added-records=5962, added-files-size=50713, changed-partition-count=1, total-records=5015117, total-files-size=43670177, total-data-files=115, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-22739922463920002-1-1c513718-42d8-41a0-82ea-486d2a4a3bbb.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=5013785705895265232, timestamp_ms=1644485696292, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18794, added-data-files=2, added-records=5961, added-files-size=50652, changed-partition-count=1, total-records=5021078, total-files-size=43720829, total-data-files=117, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5013785705895265232-1-03d4f1b3-c4ee-4217-b0f2-19168a8ed28e.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=2526947968329093048, timestamp_ms=1644485756306, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18795, added-data-files=4, added-records=5961, added-files-size=52941, changed-partition-count=2, total-records=5027039, total-files-size=43773770, total-data-files=121, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2526947968329093048-1-8336f6a1-7039-41a7-b736-229ce5bcf10a.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=2484166318625325659, timestamp_ms=1644485816296, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18796, added-data-files=2, added-records=5959, added-files-size=50849, changed-partition-count=1, total-records=5032998, total-files-size=43824619, total-data-files=123, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2484166318625325659-1-02b973a7-2012-4661-9147-145ea82b5126.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=1992367331685787804, timestamp_ms=1644485876293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18797, added-data-files=2, added-records=5464, added-files-size=46683, changed-partition-count=1, total-records=5038462, total-files-size=43871302, total-data-files=125, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1992367331685787804-1-c0b03758-41c3-46bc-b157-8e846674b1e2.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=3398467964620293154, timestamp_ms=1644485936300, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18798, added-data-files=3, added-records=5960, added-files-size=52223, changed-partition-count=2, total-records=5044422, total-files-size=43923525, total-data-files=128, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3398467964620293154-1-d10ea8c5-3986-45b1-bde6-6ed75148dce2.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Committed snapshot changes; cleaning up expired manifests and data files.
22/02/10 17:44:31 WARN iceberg.RemoveSnapshots: Manifests to delete: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/278e6825-3381-47aa-a08b-4d86a1a0f0e6-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/c00444d4-86e4-4df9-b7b1-29bc15e203a5-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m1.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m4.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/c7fb0523-a144-4bcd-89f3-56c0984561d1-m21.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/92f9c63e-bc85-4965-9a75-b346fe797ad9-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m3.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/b98bd620-63f7-4cc5-8b77-1c6b4ba1cf95-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m6.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/836c917c-2207-400a-a74c-edc562a9603a-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9305f3e1-9e54-4499-ae7e-8bacc7816c31-m0.avro
22/02/10 17:44:31 WARN iceberg.RemoveSnapshots: Manifests Lists to delete: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-541878440800103826-1-1b8107b6-6f58-41b3-bca3-21bf624c4719.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7791706873901858756-1-b98bd620-63f7-4cc5-8b77-1c6b4ba1cf95.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5657396929463700436-1-c00444d4-86e4-4df9-b7b1-29bc15e203a5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3398467964620293154-1-d10ea8c5-3986-45b1-bde6-6ed75148dce2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6570416090976553560-1-e329179f-e202-41f6-852e-2585b46eee2e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4278660516617569111-1-d6a355d7-f7d4-4be6-b640-674aedea38d0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7075499429808392849-1-9ade184e-f771-4413-81a8-a968785638f9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2342072444431983976-1-8a281c33-2d42-4828-adb9-0fcbc49cbacd.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1095796975631658317-1-b071bfb7-3109-4a92-972d-c620138f7220.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6005578883465127048-1-7ccf0fb1-9472-4ec4-8198-0dc6b911bdf7.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5532078125138954836-1-6bb16325-09df-4638-8a61-e02a6f5e53f6.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4262237804586276768-1-e7c26525-29d1-4a1a-867e-cd9790a55068.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1722651238361119409-1-df33539d-c13f-4d07-aa7e-657b42df1f78.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-22739922463920002-1-1c513718-42d8-41a0-82ea-486d2a4a3bbb.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-468106048969373971-1-5d4446d8-d779-426a-8243-8b857383fd3e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8018294029736388458-1-c208faf8-7d30-474a-880c-8191db9cd448.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1470299392035948712-1-d9dd390e-ee51-4ffd-ae30-e91a6d019757.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1992367331685787804-1-c0b03758-41c3-46bc-b157-8e846674b1e2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7058941970938557666-1-9602f8d7-4638-4d23-af5b-64d3382e1644.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5361802753278781380-1-9b81fdec-ecee-4330-8541-aea40c878268.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2183998972431095493-1-1b41dec9-c9e6-441e-bdfd-a5cbd52b11fc.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1078972720570425309-1-be9504a7-fb98-48df-9999-c2857d856af7.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3495751966676651473-1-5dfc4f7b-c16d-4429-8182-a375db8ec903.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2843486521572234923-1-754d9e82-2175-4762-8737-d95ae98200d4.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8309783644936857381-1-9305f3e1-9e54-4499-ae7e-8bacc7816c31.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2484166318625325659-1-02b973a7-2012-4661-9147-145ea82b5126.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6848801094803890889-1-d96ba7dc-7ff2-40ad-a582-f33c987a6740.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1559852676159610002-1-615b94f7-5a3a-42a3-9181-1ad9a2425427.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5487640863335657501-1-cf9145ca-9184-4095-9af5-625307270cde.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3606588897957810627-1-92f9c63e-bc85-4965-9a75-b346fe797ad9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6539673233134379517-1-345c9b77-be31-449c-a67e-970b80078069.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5895976650901516425-1-a9f423cc-0133-4118-9292-016d5227f57a.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3903341502082098658-1-7a86b5d3-8c5e-4a9c-96c3-85a0c5fa3df0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-9158469320395181971-1-7674e415-c2f5-4566-b251-20c2636dfc1f.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2526947968329093048-1-8336f6a1-7039-41a7-b736-229ce5bcf10a.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7955617778669899471-1-1049aed0-3215-4267-82fe-e37df441957f.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7923280809105826466-1-40b70bd0-e8fb-4186-8c1f-97a427649160.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-878441999283792062-1-68955262-5444-4898-8d98-f93736abcd9b.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4364834558723325257-1-00fb7f19-4224-4da8-b1f0-d85ed241d7eb.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-451594432613548689-1-4de192bc-1b21-445b-903f-a88137b930c5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5974895447555666685-1-36baa0b7-0f1f-4e1c-9595-69179fb09aa9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6532101506813450600-1-5898b192-1821-48bd-9c6c-98cd496ba37a.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8489936993001945197-1-353c552c-1595-495f-8e44-641f47ebf250.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8127473867318873076-1-f0905056-841f-496b-a6fd-133ca6f121d2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3511541622291330360-1-9380e713-bd4a-41b4-9140-704a7624d2bf.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4031948148957742647-1-278e6825-3381-47aa-a08b-4d86a1a0f0e6.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3212606031402422010-1-a70f8e62-8c34-432e-b752-9063ed2c902f.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5013785705895265232-1-03d4f1b3-c4ee-4217-b0f2-19168a8ed28e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6026165152411827559-1-5ca27510-3e7c-446e-8f69-eddd80bb2b66.avro
 behavior_with_date_log_ib table cleanup finished!!!
Process finished with exit code 0
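Characteristics of Iceberg file compaction and snapshot expiration:
Compaction: generates new files.
Snapshot expiration: deletes the snap (manifest list) and manifest files, but does not merge the metadata.json files or clean up old ones.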
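One possible direction for the open question, offered as an assumption and not something verified in this article: Iceberg's table configuration documents two properties, write.metadata.delete-after-commit.enabled and write.metadata.previous-versions-max, that control whether old metadata.json versions are deleted after each commit and how many are retained. The sketch below only collects those two settings. `MetadataCleanupProps` and `applyTo` are hypothetical helper names, and the value 10 is an arbitrary illustration (the documented default is 100).

```scala
object MetadataCleanupProps {
  // Property names as listed in Iceberg's table configuration reference.
  val props: Map[String, String] = Map(
    "write.metadata.delete-after-commit.enabled" -> "true",
    "write.metadata.previous-versions-max" -> "10" // assumed value for illustration
  )

  // Feeds every property to a caller-supplied setter, so the helper itself
  // has no dependency on Iceberg classes.
  def applyTo(set: (String, String) => Unit): Unit =
    props.foreach { case (k, v) => set(k, v) }
}
```

With the `table` loaded in the cleanup code above, this could be applied along the lines of `val update = table.updateProperties(); MetadataCleanupProps.applyTo((k, v) => update.set(k, v)); update.commit()`. Metadata files that were written before the property was enabled may still need separate cleanup.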