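DC/OS, the Datacenter Operating System (1): Overview

These notes are translated and condensed from the official DC/OS documentation at https://docs.mesosphere.com/1.9/.

Overview

1. What is DC/OS?

Keywords: distributed system, cluster manager, container platform, operating system

DC/OS is a distributed system, a cluster manager, a container platform, and an operating system for the datacenter.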

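1.1 Distributed system

Nodes are divided into master nodes and agent nodes.

Master nodes coordinate through leader election.

1.2 Cluster manager

Built on Apache Mesos.

1.3 Container platform

Two built-in task schedulers:

  • Marathon
  • Metronome (DC/OS Jobs)

Two container runtimes:

  • Docker
  • Mesos

Every task that runs on DC/OS is containerized. You can use a ready-made image, or a plain executable or script, which is containerized at run time.

Docker currently has to be installed on every node; a future release may no longer require the Docker engine.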

1.4 Operating system

  • abstract resources
  • package management
  • networking
  • logging and metrics
  • storage and volumes
  • identity management

Kernel space and user space

  • system space: resource allocation, security, process isolation
  • user space: applications, jobs, services (package manager)

DC/OS is not a host operating system.

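2. Architecture

Keywords: Software Layer, Platform Layer, Infrastructure Layer, package, component

DC/OS is a platform for running distributed, containerized software. It hides differences in the infrastructure layer and runs on virtual or physical machines, as long as the infrastructure provides compute, storage, and networking.

The overall architecture is divided into a software layer, a platform layer, and an infrastructure layer.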

[Figure: dcos-architecture-layers-1.png]

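Software Layer

The defining feature of the software layer is the package.

Package management and a package repository simplify service installation

(databases, message queues, stream processors, artifact repositories, monitoring solutions, CI tools, source control, log aggregators).

Users can also install custom services.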

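Platform Layer

The platform layer consists mainly of components distributed across the nodes.

Components fall into the following categories: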

  • Cluster Management
  • Container Orchestration
  • Container Runtimes
  • Logging and Metrics
  • Networking
  • Package Management
  • IAM and Security (Identity and Access Management)
  • Storage

Components run on master nodes, private agent nodes, and public agent nodes.

Infrastructure Layer

DC/OS can be installed on x86 machines that share an IPv4 network, on any of:

  • public clouds
  • private clouds
  • on-premises hardware

External components

  • GUI
  • CLI
  • package repository
  • container registry(master.mesos:5000)

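The following topics come next:

  • Node types
  • Task types
  • Components
  • Distributed process management
  • Boot sequence

2.1 Node types

Keywords: master node, public/private agent node, leader election, quorum

A DC/OS node is a virtual or physical machine. There are three types of node: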

  • master
  • private agent
  • public agent

[Figure: dcos-node-types-1.png]

DMZ: demilitarized zone

|Zone|Node type|
|Private|private agent|
|Protected|master|
|Public|public agent|

Master Nodes

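Most DC/OS components run on the master nodes, including the Mesos master process.

Protected zone: master nodes should normally be deployed in a zone with restricted network access.

HA: multiple masters provide high availability and fault tolerance. A single-master cluster is suitable for development.

Leader election: leader election among the master nodes routes traffic to the current leader. Other components run their own independent leader elections, so the leaders of different components may sit on different masters.

Quorum: the quorum is the minimum number of masters that must be available. A strict majority of the master nodes, floor(N/2) + 1, has to stay up: at least 2 of 3 masters, or 3 of 5. The number of master nodes can only be set at installation time, mainly because changing the quorum is complex. This may improve in a later release.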
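The quorum arithmetic above can be sketched in a few lines of Python. This is only an illustration of the majority rule; the function names are made up for this note and are not part of DC/OS.

```python
def quorum_size(masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    if masters < 1:
        raise ValueError("a cluster needs at least one master")
    return masters // 2 + 1


def tolerated_failures(masters: int) -> int:
    """How many masters may be down while a quorum is still available."""
    return masters - quorum_size(masters)


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, "masters: quorum", quorum_size(n),
              "tolerated failures", tolerated_failures(n))
```

Note that an even master count buys nothing: 4 masters need a quorum of 3 and still tolerate only one failure, the same as 3 masters.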

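Agent Nodes

Agent nodes run user tasks.

An agent node hosts only a few components, including a Mesos agent process.

Depending on network configuration, agents are either public or private.

Public Agent Nodes:

They can be reached from outside the cluster.

Resources on public agent nodes are allocated only to tasks with the role slave_public.

The Mesos agent on a public agent node carries the attribute public_ip:true as an identifier.

Public agent nodes mainly host externally facing reverse-proxy load balancers such as Marathon-LB. Exposing only this DMZ to the outside reduces the risk of malicious attack.

A cluster usually has only a few public agent nodes.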

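Private Agent Nodes:

Private agent nodes sit on a network that cannot be reached from outside the cluster.

By default, resources on private agent nodes are allocated without restriction. More precisely, the resources carry the * role and can be allocated to any task that does not specify a role.

Private agent nodes run most tasks and are not exposed to external networks, so they make up the majority of the nodes in a cluster. Most Mesosphere Universe packages are also installed on private agent nodes by default.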
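The role rules for public and private agents can be sketched as a simple membership check. The helper names and the example app below are hypothetical. The sketch also assumes that a task naming no role accepts only `*` resources. `acceptedResourceRoles` is the Marathon app field commonly used to target public agents.

```python
import json

UNRESERVED_ROLE = "*"          # role carried by private agent resources
PUBLIC_ROLE = "slave_public"   # role carried by public agent resources


def agent_resource_role(is_public_agent: bool) -> str:
    """Role attached to the resources an agent offers (simplified)."""
    return PUBLIC_ROLE if is_public_agent else UNRESERVED_ROLE


def can_place(accepted_roles, is_public_agent: bool) -> bool:
    """A task fits an agent only if it accepts that agent's resource role."""
    return agent_resource_role(is_public_agent) in accepted_roles


# Hypothetical Marathon app definition for an edge proxy on public agents.
# The id and cmd are placeholders, not taken from the DC/OS documentation.
public_proxy_app = {
    "id": "/example-proxy",
    "cmd": "sleep 3600",
    "cpus": 0.5,
    "mem": 128,
    "instances": 1,
    "acceptedResourceRoles": [PUBLIC_ROLE],
}

if __name__ == "__main__":
    print(json.dumps(public_proxy_app, indent=2))
    print(can_place(public_proxy_app["acceptedResourceRoles"], True))
    print(can_place([UNRESERVED_ROLE], True))
```

A task that accepts only `*` never lands on a public agent in this model, which matches the note that most packages install on private agents by default.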

2.2 Task types

Keywords: scheduler, executor, task, Marathon, Metronome

Tasks are scheduled by two kinds of scheduler: built-in schedulers and scheduler services.

Executors

When a scheduler launches a task, it specifies a Mesos executor.
In Mesos, a scheduler and its executor are together called a framework.

Built-in executors are available to every scheduler; a scheduler can also use a custom executor.

  • Command Executor: runs a shell command or a Docker container
  • Default Executor (Mesos 1.1): runs a group of shell commands or Docker containers

Schedulers

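Users do not control tasks directly. Schedulers provide a higher-level abstraction for controlling tasks.

Built-in schedulers

  • The Marathon scheduler provides services (apps and pods), which run continuously and in parallel.
  • The Metronome scheduler provides jobs, which run immediately or on a defined schedule.

User space schedulers (custom schedulers)

Users can install additional schedulers as services, for example: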

  • Kafka scheduler provides Kafka brokers
  • Cassandra scheduler provides Cassandra nodes
  • Spark scheduler(dispatcher) provides Spark jobs

2.3 Components

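DC/OS is made up of many open source microservice components.

Mesosphere Enterprise DC/OS includes most of the open source components plus some additional components, modules, and plugins.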

[Figure: dcos-enterprise-components-1.9-1.png]

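From the top: a batteries-included container platform

  • Container orchestration
  • Package management
  • Security

From the bottom: an operating system built on Apache Mesos

  • Cluster management
  • Software-defined networking (SDN)
  • Log and metrics collection

The components are described below by category: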

  • Cluster Management
  • Container Orchestration
  • Container Runtimes
  • Logging and Metrics
  • Networking
  • Package Management
  • IAM and Security (Identity and Access Management)
  • Storage

2.3.1 Cluster Management

|Cluster management component|systemd service|
|Apache Mesos|dcos-mesos-master.service, dcos-mesos-slave.service, dcos-mesos-slave-public.service|
|Apache ZooKeeper|managed by Exhibitor|
|Exhibitor|dcos-exhibitor.service|
|DC/OS Installer|dcos-download.service, dcos-setup.service|
|DC/OS GUI|served by Admin Router|
|DC/OS CLI|a user-downloadable binary|

Apache Mesos

The distributed systems kernel of DC/OS.

Apache ZooKeeper

Consistent, highly available, distributed key-value storage for configuration, synchronization, name registration, and cluster state storage.

ZooKeeper is managed by Exhibitor.

Exhibitor

Manages ZooKeeper and provides a web interface.

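DC/OS Installer

  • dcos_generate_config.ee.sh generates the install artifacts and installs DC/OS.
  • The DC/OS Download service downloads the install artifacts from the bootstrap machine.
  • The DC/OS Setup service installs components using the DC/OS Component Package Manager (Pkgpanda).

DC/OS GUI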

The GUI is served by Admin Router.

DC/OS CLI

a user downloadable binary.

2.3.2 Container Orchestration

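Container orchestration is the continuous, automated scheduling, coordination, and management of containerized processes and resources.

|Container orchestration component|systemd service|
|Marathon|dcos-marathon.service|
|Metronome|dcos-metronome.service|

Marathon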

Orchestrates long-lived containerized services (apps and pods).

Metronome

aka DC/OS Jobs

orchestrates short-lived, scheduled or immediate, containerized jobs

2.3.3 Container Runtimes

|Container runtime component|systemd service|
|Universal Container Runtime|part of the Mesos agent|
|Docker Engine|docker.service|
|Docker GC (since 1.9.0)|dcos-docker-gc.service, dcos-docker-gc.timer|

Universal Container Runtime

Also called the Mesos Containerizer.

  • a logical component built into the Mesos agent
  • not technically a separate process
  • containerizes Mesos tasks with configurable isolators
  • supports multiple image formats, including Docker images, without Docker Engine

Docker Engine

The DC/OS Installer does not install Docker Engine.
Docker Engine is a system dependency of each node and must be installed manually.

The Mesos agent also contains a separate logical component, the Docker Containerizer.

Docker GC

NEW IN 1.9.0

2.3.4 Logging and Metrics

These components aggregate, cache, and stream logs, metrics, and cluster state metadata.

|Logging and metrics component|systemd service|
|DC/OS Network Metrics (Enterprise DC/OS)|dcos-networking_api.service|
|3DT|dcos-3dt.service, dcos-3dt.socket|
|DC/OS Log (since 1.9.0)|dcos-log-master.service, dcos-log-master.socket, dcos-log-agent.service, dcos-log-agent.socket|
|Logrotate|dcos-logrotate-master.service, dcos-logrotate-master.timer, dcos-logrotate-agent.service, dcos-logrotate-agent.timer|
|DC/OS Metrics|dcos-metrics-master.service, dcos-metrics-master.socket, dcos-metrics-agent.service, dcos-metrics-agent.socket|
|DC/OS Signal (consider disabling at install time)|dcos-signal.service, dcos-signal.timer|
|DC/OS History|dcos-history.service|

DC/OS Network Metrics (Enterprise DC/OS)

DC/OS Network Metrics is also known as the DC/OS Networking API.

3DT

3DT: DC/OS Distributed Diagnostics Tool

aggregates and exposes component health.

DC/OS Log

NEW IN 1.9.0

exposes node, component, and container(task) logs.

Logrotate

manages rotation, compression and deletion of historical log files.

DC/OS Metrics

NEW IN 1.9.0

exposes node, container, and application metrics.

DC/OS Signal

reports cluster telemetry and analytics to help improve DC/OS.

It can be disabled at installation time.

DC/OS History

caches and exposes historical system state to facilitate cluster usage statistics in the GUI.

2.3.5 Networking

The DC/OS networking components handle routing, proxying, name resolution, virtual IPs, load balancing, and distributed reconfiguration.

|Networking component|systemd service|
|Admin Router|dcos-adminrouter.service, dcos-adminrouter-reload.service, dcos-adminrouter-reload.timer, dcos-adminrouter-agent.service, dcos-adminrouter-agent-reload.service, dcos-adminrouter-agent-reload.timer|
|Mesos DNS|dcos-mesos-dns.service|
|DNS Forwarder (Spartan)|dcos-spartan.service, dcos-spartan-watchdog.service, dcos-spartan-watchdog.timer|
|Generate resolv.conf|dcos-gen-resolvconf.service, dcos-gen-resolvconf.timer|
|Minuteman|included in Navstar|
|Navstar|dcos-navstar.service|
|Erlang Port Mapping Daemon (EPMD)|dcos-epmd.service|

Admin Router

An endpoint proxy.

Proxies node-specific health, logs, metrics, and package management internal endpoints.

Mesos DNS

Provides domain-name-based service discovery within the cluster.
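As a rough illustration of domain-name-based discovery, the sketch below builds the name a Marathon app would be expected to resolve under. It assumes the commonly described Mesos-DNS convention of `<task>.<framework>.mesos`, with nested Marathon group paths reversed and joined by hyphens. That convention and the helper name are assumptions of this note, so check them against the Mesos-DNS documentation before relying on them.

```python
def mesos_dns_name(app_id: str, framework: str = "marathon",
                   domain: str = "mesos") -> str:
    """Build the assumed Mesos-DNS name for a Marathon app id.

    "/mygroup/myapp" becomes "myapp-mygroup.marathon.mesos" under the
    assumed convention (path components reversed, joined with "-").
    """
    parts = [p for p in app_id.strip("/").split("/") if p]
    if not parts:
        raise ValueError("app id must contain at least one path component")
    task_label = "-".join(reversed(parts))
    return f"{task_label}.{framework}.{domain}"


if __name__ == "__main__":
    print(mesos_dns_name("/myapp"))
    print(mesos_dns_name("/mygroup/myapp"))
```

Real Mesos-DNS also normalizes characters that are not valid in DNS labels; the sketch leaves that out.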

DNS Forwarder (Spartan)

Forwards DNS requests to multiple DNS servers. Spartan Watchdog restarts Spartan when it is unhealthy.

Generate resolv.conf

Generate resolv.conf configures network name resolution by updating /etc/resolv.conf to facilitate DC/OS’s software defined networking.

Minuteman

provides distributed Layer 4 virtual IP load balancing.

Included in Navstar.

Navstar

orchestrates virtual overlay networks using VXLAN and manages distributed Layer 4 virtual IP load balancing.

Erlang Port Mapping Daemon(EPMD)

facilitates communication between distributed Erlang programs.

2.3.6 Package Management

Package management operates at two levels:

  • machine-level for components
  • cluster-level for user services

|Package management component|systemd service|
|DC/OS Package Manager (Cosmos)|dcos-cosmos.service|
|DC/OS Component Package Manager (Pkgpanda)|dcos-pkgpanda-api.service, dcos-pkgpanda-api.socket|

Cosmos

aka DC/OS Package Manager

Installs packages from DC/OS package repositories such as the Mesosphere Universe.

Pkgpanda

aka DC/OS Component Package Manager

Installs and manages DC/OS components.

2.3.7 IAM and Security

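In Enterprise DC/OS, IAM is backed by an internal database that manages users, groups, and permissions.

The following components are included only in Enterprise DC/OS.

|IAM component|systemd service|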
|Bouncer|dcos-bouncer.service|
|DC/OS Certificate Authority|dcos-ca.service|
|DC/OS Secrets|dcos-secrets.service|
|Vault|dcos-vault.service|

Bouncer

The DC/OS Identity and Access Manager.

Supports LDAP, SAML, and OpenID Connect.

DC/OS Certificate Authority

Issues signed digital certificates.
Based on Cloudflare's CFSSL.

DC/OS Secrets

Stores secrets such as API keys, passwords, certificates, and more.

Provides a secure API for storing and retrieving secrets from Vault, a secret store.

Vault

for securely managing secrets

provides a unified interface to any secret.

2.3.8 Storage

|Storage component|systemd service|
|REX-Ray|dcos-rexray.service|

REX-Ray

orchestrates provisioning, attachment, and mounting of external persistent volumes

2.3.9 Legacy Component Changes

The Cluster ID Service was removed in 1.9.0. The DC/OS Setup service now generates the cluster UUID.

The Mesos Persistent Volume Discovery service was removed in 1.9.0. Detection of mounted disk resources is now performed by the DC/OS Setup service.

2.3.10 Sockets and Timers

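Some components are reactive: they are started on demand through systemd sockets instead of running continuously and holding resources.

These sockets exist as separate systemd units but are not counted as separate components.

Some components use systemd timers to run or restart on a schedule. The timers are not counted as separate components either.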

2.3.11 Component Installation

Pkgpanda installs, upgrades, and manages the components.

2.3.12 Systemd Services

Most components run on each node as systemd services.

To see them, list the directory

/etc/systemd/system/dcos.target.wants/

or run systemctl | grep dcos-
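The same inspection can be scripted. The sketch below groups unit file names by component and lists which unit kinds (service, socket, timer) each one has. The helper names are made up for this note. It only reads the directory named above and returns nothing on a machine without DC/OS.

```python
from collections import defaultdict
from pathlib import Path

WANTS_DIR = Path("/etc/systemd/system/dcos.target.wants")
UNIT_SUFFIXES = (".service", ".socket", ".timer")


def group_units(unit_names):
    """Map each component base name to the sorted unit kinds it ships."""
    grouped = defaultdict(set)
    for name in unit_names:
        for suffix in UNIT_SUFFIXES:
            if name.endswith(suffix):
                # Strip the suffix for the key, drop the dot for the kind.
                grouped[name[: -len(suffix)]].add(suffix[1:])
                break  # names with an unknown suffix are ignored
    return {base: sorted(kinds) for base, kinds in sorted(grouped.items())}


def list_dcos_units(directory: Path = WANTS_DIR):
    """Return unit file names in the directory, or [] if it does not exist."""
    if not directory.is_dir():
        return []
    return sorted(entry.name for entry in directory.iterdir())


if __name__ == "__main__":
    for base, kinds in group_units(list_dcos_units()).items():
        print(base, ", ".join(kinds))
```

Components that appear with a socket or timer next to their service are the reactive or scheduled ones described under Sockets and Timers.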

Master Node

[vagrant@m1 ~]$ ls /etc/systemd/system/dcos.target.wants/
dcos-3dt.service                 dcos-marathon.service
dcos-3dt.socket                  dcos-mesos-dns.service
dcos-adminrouter-reload.service  dcos-mesos-master.service
dcos-adminrouter-reload.timer    dcos-metrics-master.service
dcos-adminrouter.service         dcos-metrics-master.socket
dcos-bouncer.service             dcos-metronome.service
dcos-ca.service                  dcos-navstar.service
dcos-cosmos.service              dcos-networking_api.service
dcos-epmd.service                dcos-pkgpanda-api.service
dcos-exhibitor.service           dcos-pkgpanda-api.socket
dcos-gen-resolvconf.service      dcos-secrets.service
dcos-gen-resolvconf.timer        dcos-signal.service
dcos-history.service             dcos-signal.timer
dcos-log-master.service          dcos-spartan.service
dcos-log-master.socket           dcos-spartan-watchdog.service
dcos-logrotate-master.service    dcos-spartan-watchdog.timer
dcos-logrotate-master.timer      dcos-vault.service

Private Agent Node

[vagrant@a1 ~]$ ls /etc/systemd/system/dcos.target.wants/
dcos-3dt.service                       dcos-logrotate-agent.timer
dcos-3dt.socket                        dcos-mesos-slave.service
dcos-adminrouter-agent-reload.service  dcos-metrics-agent.service
dcos-adminrouter-agent-reload.timer    dcos-metrics-agent.socket
dcos-adminrouter-agent.service         dcos-navstar.service
dcos-docker-gc.service                 dcos-pkgpanda-api.service
dcos-docker-gc.timer                   dcos-pkgpanda-api.socket
dcos-epmd.service                      dcos-rexray.service
dcos-gen-resolvconf.service            dcos-signal.timer
dcos-gen-resolvconf.timer              dcos-spartan.service
dcos-log-agent.service                 dcos-spartan-watchdog.service
dcos-log-agent.socket                  dcos-spartan-watchdog.timer
dcos-logrotate-agent.service

Public Agent Node

[vagrant@p1 ~]$ ls /etc/systemd/system/dcos.target.wants/
dcos-3dt.service                       dcos-logrotate-agent.timer
dcos-3dt.socket                        dcos-mesos-slave-public.service
dcos-adminrouter-agent-reload.service  dcos-metrics-agent.service
dcos-adminrouter-agent-reload.timer    dcos-metrics-agent.socket
dcos-adminrouter-agent.service         dcos-navstar.service
dcos-docker-gc.service                 dcos-pkgpanda-api.service
dcos-docker-gc.timer                   dcos-pkgpanda-api.socket
dcos-epmd.service                      dcos-rexray.service
dcos-gen-resolvconf.service            dcos-signal.timer
dcos-gen-resolvconf.timer              dcos-spartan.service
dcos-log-agent.service                 dcos-spartan-watchdog.service
dcos-log-agent.socket                  dcos-spartan-watchdog.timer
dcos-logrotate-agent.service

2.4 Distributed Process Management

Processes communicate both between layers and within a single layer.

[Figure: dcos-architecture-distributed-process-management-concept-1-600x378@2x.png]

Take a Marathon service that deploys a Docker container as an example:

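[Figure: dcos-architecture-distributed-process-management-seq-diagram-1-600x363@2x.png]

|Step|Description|
|1|Client/scheduler initialization: the client needs to know how to reach the scheduler in order to launch a process, for example through Mesos-DNS or the DC/OS CLI.|
|2|The Mesos master sends a resource offer to the scheduler. Offers are computed from agent resources using the Mesos master's DRF (Dominant Resource Fairness) algorithm.|
|3|The scheduler declines the offer because no process launch has been requested. Until a launch is requested, the scheduler keeps declining the master's offers.|
|4|The client initiates a process launch. For example, a user creates a Marathon app through the DC/OS Services page or the HTTP endpoint /v2/apps.|
|5|The Mesos master sends a resource offer, for example cpus(*):1; mem(*):128; ports(*):[21452-21452].|
|6|If the offer satisfies the scheduler's requirements, the scheduler accepts it and sends a launchTask request to the Mesos master.|
|7|The Mesos master directs the Mesos agents to launch the task.|
|8|The Mesos agent launches the task through an executor.|
|9|The executor reports task status to the Mesos agent.|
|10|The Mesos agent reports task status to the Mesos master.|
|11|The Mesos master reports task status to the scheduler.|
|12|The scheduler reports process status to the client.|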
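The offer cycle in steps 2 to 6 can be modelled with a toy scheduler. This is a simplified sketch and does not use the Mesos scheduler API: the class and field names are invented here, offers carry only cpus and mem, and an accepted offer is assumed to be consumed whole.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    agent: str
    cpus: float
    mem: float


@dataclass
class LaunchRequest:
    name: str
    cpus: float
    mem: float


@dataclass
class ToyScheduler:
    pending: list = field(default_factory=list)
    launched: list = field(default_factory=list)

    def request_launch(self, request: LaunchRequest) -> None:
        """Step 4: a client asks the scheduler to launch a process."""
        self.pending.append(request)

    def on_offer(self, offer: Offer) -> str:
        """Steps 3, 5, 6: decline unless a pending request fits the offer."""
        for request in self.pending:
            if request.cpus <= offer.cpus and request.mem <= offer.mem:
                self.pending.remove(request)  # safe: we return right away
                self.launched.append((request.name, offer.agent))
                return "accept"
        return "decline"


if __name__ == "__main__":
    scheduler = ToyScheduler()
    print(scheduler.on_offer(Offer("a1", cpus=1, mem=128)))     # nothing pending
    scheduler.request_launch(LaunchRequest("web", cpus=0.5, mem=64))
    print(scheduler.on_offer(Offer("a1", cpus=0.25, mem=128)))  # too little CPU
    print(scheduler.on_offer(Offer("a1", cpus=1, mem=128)))     # fits
    print(scheduler.launched)
```

The point of the model is that the master pushes offers regardless of demand, and the scheduler alone decides whether an offer is worth accepting.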

2.5 Boot Sequence

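During installation the DC/OS components are installed in parallel, but because they depend on one another they initialize in a particular order.

The 3DT service monitors the health of component services and nodes. A node is marked healthy when all of its component services are healthy.

Master nodes

The component services on a master node start in the following sequence.

  1. 3DT starts
    1. polls systemd for component status
    2. reports the node as unhealthy until all components (systemd services) are healthy
    3. reports the cluster as unhealthy until all master nodes are healthy
  2. Exhibitor starts
    1. creates the ZooKeeper configuration and launches ZooKeeper
  3. Mesos Master starts
    1. registers with the local ZooKeeper
    2. discovers the other Mesos masters from ZooKeeper
    3. elects a leading master
  4. Mesos DNS starts
    1. discovers the leading Mesos master (through ZooKeeper or the Mesos master; the source documentation is ambiguous here)
    2. polls the leading Mesos master for cluster state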
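The health rules that 3DT applies during boot can be sketched as two nested all-checks. This is a simplified model and does not reflect how 3DT is implemented: the function names are hypothetical, unit health is reduced to a boolean, and an empty node or cluster is treated as unhealthy by assumption.

```python
def node_is_healthy(unit_health: dict) -> bool:
    """A node is healthy only when every component unit is healthy."""
    return bool(unit_health) and all(unit_health.values())


def cluster_is_healthy(masters: dict) -> bool:
    """The cluster is healthy only when every master node is healthy."""
    return bool(masters) and all(
        node_is_healthy(units) for units in masters.values()
    )


if __name__ == "__main__":
    # Hypothetical snapshot early in boot: Mesos DNS on m2 is not up yet.
    masters = {
        "m1": {"dcos-exhibitor.service": True,
               "dcos-mesos-master.service": True,
               "dcos-mesos-dns.service": True},
        "m2": {"dcos-exhibitor.service": True,
               "dcos-mesos-master.service": True,
               "dcos-mesos-dns.service": False},
    }
    print(cluster_is_healthy(masters))
    masters["m2"]["dcos-mesos-dns.service"] = True
    print(cluster_is_healthy(masters))
```

Because health is a conjunction at both levels, one slow component on one master keeps the whole cluster reported as unhealthy until it comes up.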

3. Features

4. Concepts

5. High Availability

6. Telemetry

7. Service Discovery

8. Feature Maturity
