Some DAG Workflow Frameworks

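AI is all the rage. Why do I say that? When I took part in my company's campus recruiting last year, 95% of the 50 candidates I interviewed said their schools offered AI-related courses, such as Python or MATLAB, yet very few had done any hands-on application work, let alone worked in an engineering setting. That alone shows how hot AI is. The prospects for AI are indeed good and the market demand is real, but truly mastering it is not as easy as typing out a hello world from a textbook. So, to lower the barrier, I have long wanted to build a tool that covers the whole process from data preparation, feature engineering, and model training through to evaluation, in other words, what is commonly called a pipeline made up of multiple tasks.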

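In a workflow system, tasks often have complex dependencies on one another. To make sure a pipeline executes correctly, those dependencies have to be resolved, and a DAG combined with topological sorting is a powerful tool for this class of problem. A DAG (Directed Acyclic Graph) is a graph in which every edge has a direction and there are no cycles. If the dependency problem is modeled as a DAG, each dependency becomes a directed edge in the graph; topological sorting then repeatedly visits and removes the nodes that have no remaining dependencies, which resolves the dependencies quickly.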
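The idea of repeatedly removing nodes that have no remaining dependencies can be sketched in a few lines of Python. This is only a minimal illustration (a form of Kahn's algorithm), not the scheduler of any of the frameworks listed in this post; the function name `topological_order` and the task names in `pipeline` are made up for the example, borrowing the pipeline stages mentioned earlier.

```python
from collections import deque


def topological_order(dependencies):
    """Return tasks in an order that respects their dependencies.

    `dependencies` maps each task to the list of tasks it depends on
    (a hypothetical input format chosen for this sketch).
    """
    # Collect every task, including ones that appear only as a dependency.
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks.update(deps)

    # indegree[t] = number of unresolved dependencies of t.
    indegree = {t: 0 for t in tasks}
    # dependents[t] = tasks that are waiting on t.
    dependents = {t: [] for t in tasks}
    for task, deps in dependencies.items():
        for dep in deps:
            indegree[task] += 1
            dependents[dep].append(task)

    # Start with the tasks that have no dependencies at all.
    ready = deque(sorted(t for t in tasks if indegree[t] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        # "Remove" the finished task: its dependents lose one dependency.
        for nxt in sorted(dependents[task]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # If some tasks were never released, the graph contains a cycle.
    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected; the graph is not a DAG")
    return order


# Illustrative pipeline with the four stages named in this post.
pipeline = {
    "data_preparation": [],
    "feature_engineering": ["data_preparation"],
    "model_training": ["feature_engineering"],
    "evaluation": ["model_training"],
}

print(topological_order(pipeline))
# ['data_preparation', 'feature_engineering', 'model_training', 'evaluation']
```

A real scheduler would also run independent tasks in parallel and handle retries and failures; the sketch only shows how the execution order falls out of the dependency graph, and how a cycle is detected when some tasks can never be released.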

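To build a machine learning platform, the key is designing a good DAG workflow scheduling system. The following open-source workflow schedulers can be chosen according to your needs: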

  • Airflow

- Python-based platform for running directed acyclic graphs (DAGs) of tasks

  • Argo

- Open source container-native workflow engine for getting work done on Kubernetes

  • Azkaban

- Batch workflow job scheduler created at LinkedIn to run Hadoop jobs.

  • Brigade

- Brigade is a tool for running scriptable, automated tasks in the cloud — as part of your Kubernetes cluster.

  • Cadence

- An orchestration engine to execute asynchronous long-running business logic developed by Uber Engineering.

  • CloudSlang

- Workflow engine to automate your DevOps use cases.

  • Conductor

- Netflix's Conductor is an orchestration engine that runs in the cloud.

  • Cromwell

- Workflow engine written in Scala and designed for simplicity and scalability. Executes workflows written in WDL or CWL.

  • DigDag

- Digdag is a simple tool that helps you to build, run, schedule, and monitor complex pipelines of tasks.

  • Fission Workflows

- A high-performance workflow engine for serverless functions on Kubernetes.

  • Flor

- A workflow engine written in Ruby.

  • Imixs-Workflow

- A powerful human-centric Workflow Engine based on the BPMN 2.0 standard.

  • Kiba

- Data processing & ETL framework for Ruby

  • Mistral

- Workflow service, in OpenStack foundation.

  • Oozie

- Workflow Scheduler for Hadoop.

  • Piper

- A distributed Java workflow engine designed to be dead simple.

  • Pinball

- Scalable workflow manager by Pinterest.

  • RunDeck

- Job Scheduler and Runbook Automation.

  • Titanoboa

- Titanoboa is a platform for creating complex workflows on JVM.

  • Wexflow

- A high-performance, extensible, modular and cross-platform workflow engine.

  • Workflow Engine

- A lightweight .NET and Java workflow engine.

  • Workflow Core

- Workflow Core is a lightweight workflow engine targeting .NET Standard.

  • Copper

- A high performance Java workflow engine.

  • Zeebe

- A workflow engine for microservices orchestration that's capable of executing BPMN models, developed by the team at Camunda
