Backing up StarRocks data to MinIO object storage / querying data in MinIO through external tables

1. Deploy the MinIO environment

docker pull minio/minio

Volume mappings between the host and the container

Host path            Container path
/data/minio/data     /data
/data/minio/config   /root/.minio

Start the container:

docker run -p 9000:9000 -p 9090:9090 --name minio \
-d --restart=always \
-e "MINIO_ACCESS_KEY=admin" \
-e "MINIO_SECRET_KEY=admin123456" \
-v /data/minio/data:/data \
-v /data/minio/config:/root/.minio \
minio/minio \
server /data --console-address ":9090"
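
Before moving on, it can help to confirm that the server is actually up. The following is a minimal sketch, assuming MinIO's liveness probe at /minio/health/live is reachable on the API port 9000 mapped above. The helper names health_url and is_live, and the host you pass in, are illustrative and not part of the original setup.

```python
import urllib.error
import urllib.request


def health_url(host: str, port: int = 9000) -> str:
    """Build the URL of MinIO's liveness probe on the S3 API port."""
    return f"http://{host}:{port}/minio/health/live"


def is_live(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response all count as "not live".
        return False
```

Calling is_live(health_url("localhost")) on the Docker host should return True once the container is running; replace "localhost" with whatever address exposes port 9000 in your environment.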


2. Prepare the StarRocks environment

For deployment, see the StarRocks Docs page on deploying StarRocks with Docker (deploy_with_docker).

3. Querying files in MinIO and full database backup: hands-on

Generate a Parquet file with Python:

xiuchenggong@xiuchengdeMacBook-Pro ~ % python3
Python 3.9.10 (main, Jan 15 2022, 11:48:04) 
[Clang 13.0.0 (clang-1300.0.29.3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pf = pd.read_csv("/Users/xiuchenggong/test.csv")
>>> pf.to_parquet("/Users/xiuchenggong/test.parquet",engine="pyarrow")
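
The session above assumes that test.csv already exists. As a minimal sketch of producing such a file with the standard library, the snippet below writes two columns, name and id, matching the external table defined in section 3.1 and the two rows shown in its query result. The function name write_sample_csv and the output path are illustrative.

```python
import csv
from pathlib import Path

# Rows taken from the query result shown in section 3.1.
ROWS = [("gongxiucheng", 1), ("gongzixi", 2)]


def write_sample_csv(path: Path) -> Path:
    """Write a header line plus the sample rows and return the path."""
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "id"])  # column order matches the table definition
        writer.writerows(ROWS)
    return path
```

For example, write_sample_csv(Path("test.csv")) creates the input that pd.read_csv then loads.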

3.1 Query the Parquet data stored in MinIO (Parquet and ORC files can be queried this way):

StarRocks > CREATE EXTERNAL TABLE table_1
    -> (
    ->     name string,
    ->     id int
    -> )
    -> ENGINE=file
    -> PROPERTIES
    -> (
    -> "path" = "s3a://starrocks/test.parquet",
    -> "format" = "parquet",
    -> "aws.s3.enable_ssl" = "false",
    -> "aws.s3.enable_path_style_access" = "true",
    -> "aws.s3.endpoint" = "172.17.0.3:9000",
    -> "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
    -> "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC"
    -> );
Query OK, 0 rows affected (0.009 sec)

StarRocks > show tables;
+-------------------+
| Tables_in_test_db |
+-------------------+
| table_1           |
| test1             |
| test2             |
+-------------------+
3 rows in set (0.003 sec)
StarRocks > select * from table_1;
+--------------+------+
| name         | id   |
+--------------+------+
| gongxiucheng |    1 |
| gongzixi     |    2 |
+--------------+------+
2 rows in set (0.073 sec)
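
The statement above embeds the access key and secret key directly. As a sketch of one alternative, the snippet below renders the same CREATE EXTERNAL TABLE text from parameters, so the credentials can come from environment variables. The function name external_table_sql and the variable names MINIO_KEY and MINIO_SECRET are hypothetical; the property keys and column list are copied from the statement above.

```python
import os


def external_table_sql(table: str, path: str, endpoint: str,
                       access_key: str, secret_key: str,
                       fmt: str = "parquet") -> str:
    """Render the CREATE EXTERNAL TABLE statement used in this section.

    Assumption: no value contains a double quote, so no escaping is done.
    """
    props = {
        "path": path,
        "format": fmt,
        "aws.s3.enable_ssl": "false",
        "aws.s3.enable_path_style_access": "true",
        "aws.s3.endpoint": endpoint,
        "aws.s3.access_key": access_key,
        "aws.s3.secret_key": secret_key,
    }
    prop_lines = ",\n".join(f'    "{k}" = "{v}"' for k, v in props.items())
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        "(\n    name string,\n    id int\n)\n"
        "ENGINE=file\n"
        f"PROPERTIES\n(\n{prop_lines}\n);"
    )


if __name__ == "__main__":
    # MINIO_KEY and MINIO_SECRET are hypothetical environment variable names.
    print(external_table_sql(
        "table_1", "s3a://starrocks/test.parquet", "172.17.0.3:9000",
        os.environ.get("MINIO_KEY", ""), os.environ.get("MINIO_SECRET", ""),
    ))
```

The printed text can be pasted into the MySQL client session; nothing here connects to StarRocks.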

3.2 Full backup to MinIO (external tables cannot be backed up)

Create the repository:

StarRocks > create repository starrocks_backup_01
    -> with broker
    -> on location "s3a://starrocks"
    -> properties(
    ->     "aws.s3.enable_ssl" = "false",
    ->     "aws.s3.enable_path_style_access" = "true",
    ->     "aws.s3.access_key" = "0OnU8H9YwTNTJUBC2r7F",
    ->     "aws.s3.secret_key" = "vFQ3fIcs90woUS4200L0BYfxelE86iF6cI4vVzYC",
    ->     "aws.s3.endpoint" = "172.17.0.3:9000"
    -> )
    -> ;

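Start the backup. The external table table_1 is dropped first, because external tables cannot be backed up: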

StarRocks > drop table table_1;
Query OK, 0 rows affected (0.010 sec)

StarRocks > backup snapshot test_db.snapshot_minio to starrocks_backup_01 properties("type"="full");
Query OK, 0 rows affected (0.024 sec)


StarRocks > show backup\G
*************************** 1. row ***************************
               JobId: 11047
        SnapshotName: snapshot_minio
              DbName: test_db
               State: SAVE_META
          BackupObjs: [test_db.test1], [test_db.test2]
          CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
  UploadFinishedTime: 2023-09-05 01:58:54
        FinishedTime: NULL
     UnfinishedTasks:
            Progress:
          TaskErrMsg:
              Status: [OK]
             Timeout: 86400
1 row in set (0.003 sec)

StarRocks > show backup\G
*************************** 1. row ***************************
               JobId: 11047
        SnapshotName: snapshot_minio
              DbName: test_db
               State: FINISHED
          BackupObjs: [test_db.test1], [test_db.test2]
          CreateTime: 2023-09-05 01:58:42
SnapshotFinishedTime: 2023-09-05 01:58:48
  UploadFinishedTime: 2023-09-05 01:58:54
        FinishedTime: 2023-09-05 01:59:00
     UnfinishedTasks:
            Progress:
          TaskErrMsg:
              Status: [OK]
             Timeout: 86400
1 row in set (0.004 sec)
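
Running show backup by hand until State changes works for a one-off test; in a script the same check could be looped. The following is a rough sketch of that polling logic only. fetch_state is a hypothetical callable that would run SHOW BACKUP through a MySQL-protocol client and return the State column; FINISHED is taken from the output above, while treating CANCELLED as the failure state is an assumption.

```python
import time
from typing import Callable

# FINISHED appears in the output above; CANCELLED is assumed to mark a failed job.
TERMINAL_STATES = {"FINISHED", "CANCELLED"}


def wait_for_backup(fetch_state: Callable[[], str],
                    interval: float = 5.0,
                    max_polls: int = 120) -> str:
    """Call fetch_state until it reports a terminal state or the polls run out."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(interval)  # wait before asking again
    raise TimeoutError(f"backup still running after {max_polls} polls")
```

Keeping the database access behind fetch_state means the loop can be exercised with a stub before it is pointed at a real cluster.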

Check the files in MinIO:

[Screenshot: the backup files in the MinIO bucket]

The backup succeeded.
