Temporal 部署

1、集群架构

操作部署前请仔细阅读 03 Temporal 详细介绍 篇

2、Temporal Server 部署流程

原理是:启动前先用 dockerize 生成一遍实际使用的配置,然后再启动Server本体

Self-hosted Temporal Cluster guide | Temporal Documentation

新建应用

app:​编辑temporal-eco - 云效应用 - temporal-eco

数据库:temporal_eco 和 temporal_eco_vis 

数据库 Schema 创建及数据初始化

涉及系统环境变量

TEMPORAL_STORE_DB_NAME
TEMPORAL_STORE_DB_HOST
TEMPORAL_STORE_DB_PORT
TEMPORAL_STORE_DB_USER
TEMPORAL_STORE_DB_PWD
TEMPORAL_VISIBILITY_STORE_DB_NAME
TEMPORAL_VISIBILITY_STORE_DB_HOST
TEMPORAL_VISIBILITY_STORE_DB_PORT
TEMPORAL_VISIBILITY_STORE_DB_USER
TEMPORAL_VISIBILITY_STORE_DB_PWD

参考 链接 和 v1.12.0 Makefile

# in https://github.com/temporalio/temporal git repo dir

export SQL_PLUGIN=mysql

export SQL_HOST=mysql_host

export SQL_PORT=3306

export SQL_USER=mysql_user

export SQL_PASSWORD=mysql_password

  

./temporal-sql-tool create-database -database temporal

SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -0.0

SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/mysql/v57/temporal/versioned

  

./temporal-sql-tool create-database -database temporal_visibility

SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -0.0

SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/mysql/v57/visibility/versioned

集群部署

frontend history matching 在创建服务时选 GRPC 服务,worker 选择其他 (它在某个版本后不会暴露 GRPC 端口),然后客户端服务用 GRPC + 公司的服务发现连到 frontend 即可

Shards 数量

参考链接,确定数量为 4k = 4096。Shards 数量一旦确定,后续无法改变(唯一无法改变的配置)。Shards are very lightweight. There are no real implications on the cost of clusters. We (Temporal) have tested the system up to 16k shards.

组件数量及资源量

推荐值(v)参考链接

组件

数量

推荐 CPU 资源量

推荐内存资源量

frontend 3 4 4Gi
history 5 8 8Gi
matching 3 4 4Gi
worker 2 4 4Gi

测试环境数量及资源量

组件

数量 = floor(v/2)

推荐 CPU 资源量 = floor(v/4)

推荐内存资源量 = floor(v/4)

frontend 1 1 1Gi
history 2 2 2Gi
matching 1 1 1Gi
worker 1 1 1Gi

生产环境数量及资源量

组件

数量 = floor(v/2)

推荐 CPU 资源量 = floor(v/2)

推荐内存资源量 = floor(v/2)

frontend 1 2 2Gi
history 2 4 4Gi
matching 1 2 2Gi
worker 1 2 2Gi

对接外部应用的 Mesh 地址

update_at

${unit}--master.app.svc

2023.05.04 temporal-frontend--master.temporal-eco.svc.cluster.local:80

3、部署细节

生产环境如何部署:Temporal Platform production deployments | Temporal Documentation

  • 支持 Cassandra、MySQL 和 PostgreSQL(版本见链接),需要确定与 TiDB 的兼容性
  • 数据库 Schema 升级
  • 通过指标来进行系统性能监控与调优
  • Server 拓扑中的不同服务特性不同,最好独立部署
    • the Frontend service is more CPU bound
    • the History and Matching services require more memory
  • Server 的一些数据大小限制
  • what-is-the-recommended-setup-for-running-cadence-temporal-with-cassandra-on-production
    • Number of history shards is a setting which cannot be updated after the cluster is provisioned. For all other parameters you could start small and scale your cluster based on need with time but this one you have to think upfront about your maximum load
    • Temporal server consists of 4 roles. Although you can run all roles within same process but we highly recommend running them separately as they have completely different concerns and scale characteristics. It also makes it operationally much simpler to isolate problems in production. All of the roles are completely stateless and system scales horizontally as you spin up more instances of role once you identify any bottleneck. Here are some recommendations to use as a starting point:

      • Frontend: Responsible for hosting all service api. All client interaction goes through frontend and mostly scales with rps for the cluster. Start with 3 instances of 4 cores and 4GB memory.
      • History: This hosts the workflow state transition logic. Each history host is running a shard controller which is responsible for activating and passivating shards on that host. If you provision a cluster with 4k shards then they are distributed across all available history hosts within the cluster through shard controller. If history hosts are scalability bottleneck, you just add more history hosts to the cluster. All history hosts form its own membership ring and shards are distributed among available nodes in the hash ring. They are quite memory intensive as they host mutable state and event caches. Start with 5 history instances with 8 cores and 8 GB memory.
      • Matching: They are responsible for hosting TaskQueues within the system. Each TaskQueue partition is placed separately on all available matching hosts. They usually scale with the number of workers connecting for workflow or activity task, throughput of workflow/activity/query task, and number of total active TaskQueues in the system. Start with 3 matching instances each with 4 cores and 4 GB memory.
      • Worker: This is needed for various background logic for ElasticSearch kafka processor, CrossDC consumers, and some system workflows (archival, batch processing, etc). You can just start with 2 instances each with 4 cores and 4 GB memory.
  • 接入 SSO
  • 打指标到 Prometheus
  • 如何配置 Grafana Dashboard
  • 是否部署至 K8s 的讨论

4、Temporal UI 部署流程

https://temporal-eco-ui.pek01.in.zhihu.com/namespaces/default/workflows

步骤 详细 遇到问题
本地部署

 
基础环境
  • apt install nodejs
  • apt install npm
  • npm install pnpm -g
  • apt-get install -y nodejs
  • /usr/bin/node -v
  • npm install n -g
  • n stable # 更新到稳定版本

安装依赖

(会产出temporal web 所需要的前端环境及代码)

安装最新的 temporal server

  • pnpm install
  • pnpm start

构建 ui-server

  • pnpm run build:local
  • pnpm run build:cloud

前端环境问题:nodejs、npm、pnpm

  • 本地安装部署,前端环境及代码整体打包到线上

pnpm build:server 生成 assets 文件,否则报 ui/assets 的 all:assets 找不到

Temporal UI requires Temporal v1.16.0 or later

第三方依赖

(产出通信所需的 proto 文件)

  • make install
  • make build-grpc

grpc 依赖 protobuf 协议(指定了依赖的代码路径)

  • 配置submodule,产出 proto 文件

Temporal 部署_第1张图片

尝试启动
  • go build -o ui-server ./cmd/server/main.go
  • ./ui-server start
然后切到自建的 temporal server 上,再次启动
需关注文件
  • package.json/script # pnpm 执行脚本文件
  • ./env  # 映射环境配置
    • 默认 dev 环境,package.json/script
    • "start": "pnpm run dev:local -- --open"
    • "dev:local": ". ./.env && VITE_TEMPORAL_UI_BUILD_TARGET=local vite dev --port 3000"
  • config/base.yaml # 依赖 temporal server 地址
  • config/development.yaml # 暴露页面端口

目前  temporal server 地址 是写死在 temporal ui 代码库里的(但一般应该不会变)

已做了动态配置生成

线上部署

git clone [email protected]:xujialong01/temporal-eco-server.git

配置 ui-server 服务地址

外部包导入个人仓库,加快下载

与 UI 版本对齐(temporal server upgrade)

参考资料

Temporal Platform production deployments | Temporal Documentation

Temporal 部署方案

你可能感兴趣的:(开源)