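Let's skip the discussion of microservices themselves: you already know what they are and why they make sense. In fact, much of the conversation in recent years has been about how breaking one big thing into many small ones makes it far easier to handle. The question is: once we have split up the monolith, how do we connect the pieces again so that they form a meaningful whole? Istio, Kong, or Kafka will each enthusiastically offer you an answer. There are many answers to this question, and different scenarios and requirements call for different solutions. This post aims to shed light on the various ways of organizing communication between microservices, and on when a Service Mesh, an API Gateway, or a Message Queue is the best fit for your needs.
Before discussing the solutions, let's talk about the problem.
What is the problem?
To work properly, a microservice-based architecture has to solve a number of challenges that are specific to distributed systems: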
Resiliency
Any single microservice may have dozens or even hundreds of instances, and each of them can fail at any time for any number of reasons.
Load Balancing & Auto-Scaling
With potentially hundreds of endpoints capable of fulfilling a request, routing and scaling are anything but trivial. In fact, one of the most effective cost-saving measures for large architectures is to increase the precision of routing and scaling decisions.
Service Discovery
The more complex and distributed an application, the harder it becomes to find existing endpoints and to establish a communication channel with them.
Tracing & Monitoring
A single transaction in a microservice architecture might travel through multiple services, making it hard to trace its journey.
Versioning
As systems mature, it becomes paramount to update available endpoints and APIs while simultaneously ensuring that older versions remain available.
The solutions
Alright, time to meet the contenders for solving these problems: Service Meshes, API Gateways, and Message Queues. Of course, there are also a number of other approaches, ranging from simple static load balancing and fixed IPs to central orchestration servers - but for the purpose of this post, let's look at the currently most popular and in many ways most sophisticated options.
API Gateways
An API Gateway is the bigger brother of the good old reverse proxy for HTTP calls. It is a scalable, usually web-facing server that can receive requests from both the public internet and internal services and forward them to the best-suited microservice instance. API Gateways usually come with a number of helpful features, including load balancing and health checks, API versioning and routing, request authentication & authorization, data transformation, analytics, logging, SSL termination and more. Examples of popular open source API Gateways are Kong and Tyk. Most cloud providers offer their own implementation as well, e.g. AWS API Gateway, Azure API Management or Google Cloud Endpoints.
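To make the routing part of this more tangible, here is a minimal in-memory sketch of what a gateway does with an incoming path: it picks a route by API version and service name, then round-robins over the instances that are currently marked healthy. The `Route` and `Gateway` classes, the path layout (`/<version>/<service>/...`) and the addresses are all assumptions made up for illustration; real gateways such as Kong or Tyk are configured very differently and do far more.

```python
from itertools import count


class Route:
    """One version of one API, backed by several service instances."""

    def __init__(self, instances: list[str]) -> None:
        self.instances = list(instances)
        # Assume every instance is healthy until a health check says otherwise.
        self.healthy = set(instances)
        self._turn = count()

    def pick(self) -> str:
        """Round-robin over the instances that passed their last health check."""
        candidates = [i for i in self.instances if i in self.healthy]
        if not candidates:
            raise LookupError("no healthy instance available")
        return candidates[next(self._turn) % len(candidates)]


class Gateway:
    """Hypothetical in-memory gateway: versioned routing plus load balancing."""

    def __init__(self) -> None:
        self.routes = {}  # (version, service) -> Route

    def register(self, version: str, service: str, instances: list[str]) -> None:
        self.routes[(version, service)] = Route(instances)

    def mark_unhealthy(self, version: str, service: str, instance: str) -> None:
        self.routes[(version, service)].healthy.discard(instance)

    def resolve(self, path: str) -> str:
        """Map a path such as '/v1/users/42' to a backend instance address."""
        parts = path.strip("/").split("/")
        if len(parts) < 2:
            raise LookupError(f"cannot route {path!r}")
        route = self.routes.get((parts[0], parts[1]))
        if route is None:
            raise LookupError(f"no route registered for {path!r}")
        return route.pick()


gateway = Gateway()
gateway.register("v1", "users", ["10.0.0.1:8080", "10.0.0.2:8080"])
gateway.register("v2", "users", ["10.0.1.1:8080"])
print(gateway.resolve("/v1/users/42"))  # one of the two v1 instances
print(gateway.resolve("/v2/users/42"))  # the single v2 instance
```

Note that old and new API versions live side by side in the same routing table, which is how a gateway can keep `/v1` callers working while `/v2` rolls out.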
Benefits
API Gateways are powerful in features, comparatively low in complexity and easily understood by seasoned web veterans. They provide a solid layer of defense against the public internet and offload a lot of repetitive tasks, such as user authentication or data validation.
Downsides
API Gateways are fairly centralized. They can be deployed in a horizontally scalable fashion, but unlike service meshes, they still require a single point to register new APIs or change configuration. Seen from an organizational perspective, they are likely to be maintained by a single team.
Service Meshes
Service Meshes are decentralized and self-organizing networks between microservice instances that handle load balancing, endpoint discovery, health checks, monitoring, and tracing. They work by attaching a small agent, referred to as a "sidecar", to each instance. The sidecar mediates traffic and handles instance registration, metric collection, and upkeep. Whilst conceptually decentralized, most service meshes come with one or more central elements to collect data or provide admin interfaces. Popular examples include Istio, Linkerd and HashiCorp's Consul.
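The sidecar idea can be sketched in a few lines. The example below is a toy model, not how Istio or Linkerd are implemented: `Registry` stands in for the central element that knows about all instances, and `Sidecar` wraps a service's request handler so that registration and metric collection happen without the service itself knowing about them. All class and method names are hypothetical.

```python
import time
from typing import Callable


class Registry:
    """Stand-in for the central element of a mesh that tracks all instances."""

    def __init__(self) -> None:
        self.endpoints = {}  # service name -> list of instance addresses

    def register(self, service: str, address: str) -> None:
        self.endpoints.setdefault(service, []).append(address)

    def deregister(self, service: str, address: str) -> None:
        addresses = self.endpoints.get(service, [])
        if address in addresses:
            addresses.remove(address)


class Sidecar:
    """Sits next to one service instance and mediates every request to it."""

    def __init__(self, service: str, address: str,
                 handler: Callable[[str], str], registry: Registry) -> None:
        self.service = service
        self.address = address
        self.handler = handler
        self.registry = registry
        self.metrics = {"requests": 0, "errors": 0, "seconds": 0.0}
        # Instance registration happens here, not in the service's own code.
        registry.register(service, address)

    def handle(self, request: str) -> str:
        start = time.perf_counter()
        self.metrics["requests"] += 1
        try:
            return self.handler(request)
        except Exception:
            self.metrics["errors"] += 1
            raise
        finally:
            # Record latency whether the call succeeded or failed.
            self.metrics["seconds"] += time.perf_counter() - start

    def shutdown(self) -> None:
        self.registry.deregister(self.service, self.address)


registry = Registry()
sidecar = Sidecar("greeter", "10.0.0.5:9000", lambda name: f"hello {name}", registry)
print(sidecar.handle("world"))
print(registry.endpoints, sidecar.metrics["requests"])
```

Because every instance brings its own sidecar, a team can add a new service by deploying it with its agent; nothing has to be changed in a shared gateway configuration.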
Benefits
Service meshes are more dynamic and can easily shift shape and accommodate new functionalities and endpoints. Their decentralized nature makes it easier to work on microservices within fairly isolated teams.
Downsides
Service meshes can be quite complex and require a lot of moving parts. Fully utilizing Istio, for instance, requires the deployment of a separate traffic manager, a telemetry gatherer, a certificate manager and a sidecar process for each node. They are also a fairly recent development, which makes something that constitutes the very backbone of your architecture worryingly young.
Message Queues
At first glance, comparing service meshes to message queues seems like comparing apples to oranges: they are completely different things, but they solve the same problem, though in very different ways.
Message Queues allow you to establish complex communication patterns amongst services by decoupling sender and receiver. They achieve this using a number of concepts, such as topic-based routing or publish-subscribe messaging, as well as buffered task queues that make it easy for multiple instances to process different aspects of a task over time.
Message Queues have been around for ages, resulting in a wide selection to choose from: popular open source alternatives include Apache Kafka and AMQP brokers like RabbitMQ or HornetQ, while cloud providers offer their own versions such as AWS SQS or Kinesis, Google Pub/Sub or Azure Service Bus.
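As a rough illustration of the decoupling described above, the following sketch models topic-based publish-subscribe with one buffer per subscriber. The `Broker` class and the topic and subscriber names are invented for this example; real brokers such as Kafka or RabbitMQ add persistence, acknowledgements, partitioning and networking, none of which is shown here.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory broker: publishers and subscribers only share topic names."""

    def __init__(self) -> None:
        # topic -> {subscriber name -> buffered messages for that subscriber}
        self.queues = defaultdict(dict)

    def subscribe(self, topic: str, subscriber: str) -> deque:
        """Create (or return) the buffer a subscriber reads from."""
        return self.queues[topic].setdefault(subscriber, deque())

    def publish(self, topic: str, message: str) -> int:
        """Copy the message into every subscriber's buffer; return the count."""
        buffers = self.queues[topic].values()
        for buffer in buffers:
            buffer.append(message)
        return len(buffers)


broker = Broker()
transcoder_inbox = broker.subscribe("video.uploaded", "transcoder")
notifier_inbox = broker.subscribe("video.uploaded", "notifier")

# The uploader does not know who consumes the event, or how many consumers exist.
broker.publish("video.uploaded", "video-123")

# Each consumer drains its own buffer whenever it is ready.
print(transcoder_inbox.popleft())
print(notifier_inbox.popleft())
```

The sender never addresses a receiver directly, so consumers can be added, removed or restarted without the publisher changing at all.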
Benefits
Simply decoupling sender and receiver is a potent concept that makes a number of other concepts, such as health checks, routing, endpoint discovery or load balancing, unnecessary. Instances can pick relevant tasks from a buffered queue as and when they are ready to do so. This becomes especially powerful when auto-orchestration and scaling decisions are based on the message count in each queue, leading to highly resource-efficient systems.
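A scaling decision driven by queue depth can be as simple as the sketch below. The formula, the parameter names and the default limits are assumptions chosen for illustration, not a rule taken from any particular autoscaler: the idea is merely to run enough workers to drain the current backlog within a target time.

```python
import math


def desired_workers(queue_depth: int,
                    msgs_per_worker_per_sec: float,
                    drain_target_sec: float,
                    min_workers: int = 1,
                    max_workers: int = 50) -> int:
    """Return how many workers are needed to drain the backlog in time.

    All parameters are hypothetical tuning knobs for this sketch.
    """
    if msgs_per_worker_per_sec <= 0 or drain_target_sec <= 0:
        raise ValueError("rate and drain target must be positive")
    # How many messages one worker can process within the target window.
    capacity_per_worker = msgs_per_worker_per_sec * drain_target_sec
    needed = math.ceil(queue_depth / capacity_per_worker)
    # Clamp to the allowed range so the system never scales to zero or runs away.
    return max(min_workers, min(max_workers, needed))


# 1200 queued messages, 5 msg/s per worker, drain within 60 s -> 4 workers
print(desired_workers(1200, 5, 60))
```

Since the queue length is a direct measure of outstanding work, this signal tends to track real demand more closely than indirect ones such as CPU load.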
Downsides
Message Queues are not good at request/response communication. Some allow this to be shoehorned on top of existing concepts, but it's not really what they are made for. Due to their buffered nature, they can also add significant latency to a system. They are also fairly centralized (though horizontally scalable) and can be quite costly to run at scale.
So which one should you choose?
Actually, this is not necessarily an either/or decision. In fact, it can make perfect sense to front one's public-facing API with an API gateway, run a service mesh to handle inter-service communication and back things with a message queue for asynchronous task scheduling.
But if we reduce the focus purely to inter-service communication, one possible answer could be:
- If you already run an API Gateway for your public-facing API, you might as well keep complexity low and reuse it for inter-service communication.
- If you work within a large organization with siloed teams and poor communication, a service mesh can give you the highest degree of independence, making it easy to add new services over time.
- If you are designing a system where individual steps are spaced out over time, e.g. a YouTube-like service where upload, processing, and publishing of videos can take a couple of minutes, use a message or task queue.
What the future holds
Despite all the hype, service meshes are a fairly young concept: Istio, for example, the most popular alternative, only reached its 1.0 version in July 2018. My prediction would be that these concepts increasingly merge, resulting in a more decentralized mesh of services providing both external API access and internal communication - maybe even in a buffered, queue-like fashion.