Enterprise Serverless Architecture

This section of the Enterprise Serverless series covers some of the high-level architecture and design aspects of large serverless applications. You can read the ideas behind the series by clicking on the ‘Series Introduction’ link below.

The series is split into the following sections:

Series Introduction
Tooling
Architecture
Hints & Tips
AWS Limits & Limitations
Security
Useful Resources

Architecture

Thinking serverless

At the start of any new serverless project it is key, in my opinion, that the full team ‘thinks serverless’, which can be a very different world from the one they have been living in. Some of the key aspects are listed below:

DevOps built into the team: builds fail (a lot)

With any large-scale serverless project with lots of resources and multiple scrum teams, it is key to have DevOps embedded into the team, as even the most well-designed pipelines will throw up errors daily, and builds fail quite frequently for various spurious reasons on AWS! In my experience, having a dedicated resource to unblock the development teams is a must, whilst also monitoring how the services are running day to day and managing changes. Not securing this resource at the start of the project can have a major knock-on effect on the overall project.

Capacity planning

Before even putting pen to paper on designing the architecture and services that make up the solution, it is key to gain an understanding of anticipated load and estimated future growth, especially in a serverless event-driven architecture.

Serverless, in my opinion, is not a silver bullet that negates the need to think about capacity and load. At a minimum you will still need to plan reserved concurrency for your lambdas. This applies especially to areas such as service-to-service communication, messaging and batch processing, which can fall foul of poor design under load if you are not using patterns such as fan-out and pub/sub, or limiting throughput to downstream services that can’t handle high volume at scale.
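As a rough illustration of the capacity planning described above, the concurrency a function needs can be approximated as requests per second multiplied by the average invocation duration in seconds. The sketch below is a minimal, hypothetical helper: the function names, the traffic figures and the 20% headroom are all assumptions chosen for illustration, not numbers from a real project.

```typescript
// Hypothetical capacity-planning helper; every figure here is an illustrative assumption.

/** Estimated concurrent executions = requests per second x average duration in seconds. */
function estimateConcurrency(requestsPerSecond: number, avgDurationMs: number): number {
  return Math.ceil(requestsPerSecond * (avgDurationMs / 1000));
}

/** Adds a safety margin on top of an estimate, e.g. 20 for 20% headroom. */
function withHeadroom(concurrency: number, headroomPercent: number): number {
  return Math.ceil((concurrency * (100 + headroomPercent)) / 100);
}

// Assumed peak: 200 requests per second, each taking around 250 ms.
const peakConcurrency = estimateConcurrency(200, 250);
const reservedConcurrency = withHeadroom(peakConcurrency, 20);

console.log({ peakConcurrency, reservedConcurrency });
```

An estimate like this is only a starting point for a reserved concurrency value; it would normally be checked against load test results and the limits of any downstream service.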

Developers building infrastructure

One of the biggest shifts in mindset and day-to-day work for developers is building infrastructure through code using frameworks such as the Serverless Framework, work which was previously done by an Ops team with monolithic solutions. In the new world of serverless, the developers themselves, alongside the architects or lead developers, typically have the best grasp of exactly what they want this small piece of infrastructure to do, so it makes sense for them to manage it through IaC within a PR alongside the code it logically supports.

Testers need to think differently

There is also a mindset change for test teams when working on an event-driven serverless project, especially when it comes to serverless security, load testing and acceptable response times (when taking into account concurrency and cold starts, for example).

Team Checklist

Through experience of working on large-scale cloud projects I have built up a checklist of key areas to focus on when working on features. It is fairly self-explanatory and serves team members, from POs and BAs to developers and testers, when thinking about non-functional requirements, ensuring that these are considered at feature inception and not forgotten:

✅ Auditing: what key actions require auditing?
✅ Instrumentation/Key Metrics: what key functions/services require instrumentation and KPIs?
✅ Logging: which key actions require logging and what should be included?
✅ Alerting: what key actions require alerting for DevOps through CloudWatch/PagerDuty or custom dashboards?
✅ Chaos Tests/DR: what testing do we need to put in place?
✅ Load Tests: what load considerations do we have and how do we test them?
✅ Authorisation: who should have access to this resource?
✅ Documentation: what do we need to document and where?
✅ Caching/TTL: could caching in this area benefit the end user and/or reduce cost?

VPC or no VPC

When working with serverless architectures, one of the key considerations at the start of the project is whether or not you will need a VPC and, if so, what additional hoops you will need to jump through and what the limitations are.

In my experience this is largely dictated by the use of an accompanying AWS service in your solution which needs to reside in a VPC, and more often than not it is a database such as Amazon DocumentDB or Amazon RDS.

The first question is whether or not there is an alternative service that can be used; for the Amazon DocumentDB example above, Amazon DynamoDB would be the first choice. If not, then one of the biggest challenges you will face in a VPC is communication between internal and external services, which will now need VPC NAT Gateways, VPC Endpoints and so on. You may also see longer cold start times in a VPC, although this has been massively reduced since around September 2019.

Although this is not insurmountable, it is another thing to manage: it increases the integration code and the work to implement, requires specific IAM roles, and can be tricky to get right first time. If at all possible, in my opinion, it is beneficial to stay outside the realms of a VPC with a serverless solution, and doing so doesn’t make your solution any less secure either. If you do require a VPC, it’s imperative to factor the additional work into planned estimates.

This article articulates the topic very well: https://lumigo.io/blog/to-vpc-or-not-to-vpc-in-aws-lambda/


No monolithic functions

There are various ongoing arguments in the serverless world around individual lambdas per endpoint versus monolithic lambdas, which essentially use a framework such as serverless express to host many endpoints, or sometimes the full application API, in a single function.

The reason the monolithic approach is a poor idea, in my opinion, is that the endpoints are too tightly coupled, and there is no way to scale out or change the reserved or provisioned concurrency of a specific endpoint in the future. From a development perspective, monolithic lambdas carry more risk of cross-contamination of bugs, deployment problems and security issues, as opposed to having one isolated piece of functionality per endpoint which does one thing well.
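To make the concurrency point concrete, here is a minimal sketch of what one function per endpoint could look like, written as a plain TypeScript object in the shape of a Serverless Framework `functions` block. The `reservedConcurrency` and `provisionedConcurrency` properties are function-level Serverless Framework settings; the handler paths, function names and figures are hypothetical and only there for illustration.

```typescript
// Sketch of per-endpoint function definitions. Handler paths, names and
// concurrency figures are hypothetical; the local interface is a simplified
// stand-in for the framework's own types.
interface FunctionDefinition {
  handler: string;
  reservedConcurrency?: number; // cap on concurrent executions for this function only
  provisionedConcurrency?: number; // pre-initialised instances for this function only
  events: Array<{ http: { method: string; path: string } }>;
}

const functions: Record<string, FunctionDefinition> = {
  getOrder: {
    handler: "src/orders/get-order.handler",
    provisionedConcurrency: 5, // assumed latency-sensitive read path
    events: [{ http: { method: "get", path: "orders/{id}" } }],
  },
  createOrder: {
    handler: "src/orders/create-order.handler",
    reservedConcurrency: 20, // assumed cap to protect a downstream data store
    events: [{ http: { method: "post", path: "orders" } }],
  },
};

console.log(Object.keys(functions));
```

With a single monolithic lambda hosting both routes there would be only one function to attach these settings to, so the read and write paths could not be tuned independently.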

Observability

One of the key aspects of a good serverless solution, in my opinion, is being able to observe how your overall application is performing, from the front-end clients through to the back-end data stores, including the communication between them. This is even more important with event-driven architectures, where there are more moving parts and services to observe. When architecting solutions, some of the key services I build in (though not limited to these) are:

AWS X-Ray

AWS X-Ray is an AWS service which allows you to monitor and observe how a distributed system using many services is performing.

Google Analytics

Building in Google Analytics or an equivalent service from the start of a project is key, in my opinion, to understanding how, where and when your users are using the system.

For React applications I typically use react-ga, which is easy to plug in and get going.

New Relic Browser

Services such as New Relic Browser allow you to proactively monitor any JavaScript errors which your end users may be getting, which is key when you may have millions of customers on your service. Errors can be broken down by how many page views are seeing a particular error, as well as by which browsers it is manifesting on.

Sumo Logic

AWS CloudWatch is great for day-to-day logging, but not great for creating dashboards or searching through multiple log groups via correlation IDs in one go. For this reason I have historically used services such as Sumo Logic, which has a far better user experience, with the caveat that you need to stream your logs using CloudWatch events and lambda, or directly in your lambda code through your logging framework, such as Winston with a Sumo transport.

When logging to Sumo Logic directly in the lambda code, we have measured that this can add 100-300ms on average to your overall lambda invocation duration. This is one to watch out for when speed is key to your consumers.
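Searching across log groups by correlation ID only works if every service writes that ID in a consistent place. The sketch below shows one possible way to do that with structured JSON log lines. The field names and the `formatLog` helper are assumptions for illustration; they are not a Sumo Logic or Winston requirement.

```typescript
// Minimal structured-logging sketch. Field names are assumptions, chosen so
// that every service emits the correlation ID under the same key.
type Level = "info" | "warn" | "error";

function formatLog(
  correlationId: string,
  level: Level,
  message: string,
  data: Record<string, unknown> = {}
): string {
  return JSON.stringify({ level, correlationId, message, ...data });
}

// Each service logs the correlation ID it received with the request, so a
// single search on that ID can follow the request across log groups.
const line = formatLog("req-123", "info", "order created", { orderId: "o-1" });
console.log(line);
```

In practice the correlation ID would be read from the incoming event or request headers and passed on to any downstream calls.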

Cloudflare for Ephemeral Environments

A nice approach to ephemeral environments which I have used in the past is to use Cloudflare Workers alongside stages in the Serverless Framework to create short-lived, developer-specific environments, each accessed via its own subdomain. An example could be:

https://pr-123-uk.something.com


This approach works well full-stack for both the APIs and the clients, and it also allows you to split out routing if you need both REST and GraphQL on the same domain:

https://pr-123-uk.something.com/api/v1/
https://pr-123-uk.something.com/graphql
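The routing split above can be sketched as a small pure function that derives the stage from the subdomain and picks an origin from the path. This is only an illustration of the idea: the origin hostnames are hypothetical placeholders, and in a real Worker this logic would sit inside the request handler rather than being called directly.

```typescript
// Routing sketch for an ephemeral environment. The origin hostnames below are
// hypothetical placeholders, not real endpoints.
interface Route {
  stage: string;
  origin: string;
}

function resolveRoute(requestUrl: string): Route {
  const url = new URL(requestUrl);
  // "pr-123-uk.something.com" gives the stage "pr-123-uk"
  const stage = url.hostname.split(".")[0];

  if (url.pathname.startsWith("/graphql")) {
    return { stage, origin: `graphql-${stage}.example.internal` };
  }
  if (url.pathname.startsWith("/api/")) {
    return { stage, origin: `rest-${stage}.example.internal` };
  }
  // Anything else is assumed to be the web client
  return { stage, origin: `web-${stage}.example.internal` };
}

console.log(resolveRoute("https://pr-123-uk.something.com/api/v1/"));
console.log(resolveRoute("https://pr-123-uk.something.com/graphql"));
```

Because the stage is taken from the hostname, the same routing code can serve every short-lived environment without any per-environment configuration.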

The benefit of using Cloudflare Workers in this scenario over an alternative such as AWS CloudFront is the speed at which changes are deployed. For example, in my experience changes to CloudFront could in the past take up to 15 minutes to deploy, whereas Cloudflare is instant. With ephemeral environments it is key that they are deployed as quickly as possible.

Future proofing

When you expect to have 150–200+ lambdas in your solution over time, it is essential to split out your code, projects and files correctly so that you can quickly adapt to future changes where required.

One way I have done this in the past is by splitting out the various logical layers into separate files: for example, specific lambda handlers for API Gateway (REST) and for GraphQL lambda resolvers (as they will have different event objects), which ultimately call through to a separate reusable ‘manager’ file for the main functionality. This means you can share the main business logic between multiple types of service as requirements change (ECS/APIG/AppSync etc).

This simple approach means that at some point in the future you can create an index file in the root of that particular ‘entity’ (for example customer or order), export it as a Node.js Express app and run it on AWS Fargate instead of lambda. Essentially, those individual CRUD(L) calls are easily exported as one microservice rather than individual lambdas.
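The layering described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the event shapes are heavily simplified stand-ins for the real API Gateway and AppSync events, the in-memory store replaces a real data store, and all names are hypothetical. In a real project the manager and each handler would live in separate files.

```typescript
// Sketch of the handler/manager split. Event shapes are simplified assumptions,
// not the full API Gateway or AppSync event types.
interface Customer {
  id: string;
  name: string;
}

// Manager: reusable business logic with no knowledge of how it was invoked.
// An in-memory record stands in for a real data store.
const customers: Record<string, Customer> = { "c-1": { id: "c-1", name: "Ada" } };

async function getCustomer(id: string): Promise<Customer> {
  const customer = customers[id];
  if (!customer) throw new Error(`Customer ${id} not found`);
  return customer;
}

// REST handler: unwraps an API Gateway style event and returns an HTTP-shaped response.
async function restHandler(event: { pathParameters: { id: string } }) {
  const customer = await getCustomer(event.pathParameters.id);
  return { statusCode: 200, body: JSON.stringify(customer) };
}

// GraphQL resolver handler: unwraps resolver arguments and returns the object directly.
async function graphqlHandler(event: { arguments: { id: string } }): Promise<Customer> {
  return getCustomer(event.arguments.id);
}

restHandler({ pathParameters: { id: "c-1" } }).then((res) => console.log(res));
graphqlHandler({ arguments: { id: "c-1" } }).then((res) => console.log(res));
```

Only the thin handlers know about their event format, so exposing the same logic through another entry point, such as an Express route, means adding one more small wrapper around `getCustomer`.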

This was key on a previous project where the same business logic needed to be exposed via AppSync for our own public-facing web clients, as well as through APIG for cloud-connected products (desktop applications).

Next section: Hints & Tips
Previous section: Tooling

Wrapping up

Feel free to also connect with me on:


https://www.linkedin.com/in/lee-james-gilmore/
https://twitter.com/LeeJamesGilmore

If you enjoyed the posts, please follow my profile, Lee James Gilmore, for further posts and series, and don’t forget to connect and say hi.

About me

“Hi, I’m Lee, an AWS certified technical architect and polyglot software engineer based in the UK. I work for a FTSE 100 multinational enterprise software company and have worked primarily in full-stack JavaScript on AWS for the past 5 years.

I consider myself a serverless evangelist with a love of all things AWS, innovation, software architecture and technology.”


** The information provided represents my own personal views and I accept no responsibility for the use of the information.

Translated from: https://medium.com/swlh/enterprise-serverless-architecture-39c4f4ae5aff