While an on-premises environment cannot compete with a public cloud in terms of feature set, elasticity, scale, managed services, geographic reach and support for bursty workloads, there are cases where it makes sense to run part of your workloads on-premises. AWS recognizes the potential benefits of such hybrid deployments, as detailed in its edge offering built on AWS technologies, AWS Outposts (https://aws.amazon.com/outposts/). Microsoft has a similar offering with Azure Stack, providing an edge deployment using hardware and software managed by Microsoft.
In the context of this blog, we will treat AWS Outposts and Azure Stack as synonymous.
The key advantages of AWS Outposts over traditional on-premises IT deployments are:
- The availability of application-centric AWS platform services like container management, big data analytics, and a single management interface for application deployment, monitoring and logging. See https://aws.amazon.com/outposts/features/ for examples.
- An access policy model that enables use of the rest of the platform services from the cloud, like S3, DynamoDB, SQS, Lambda, Azure Storage, AD and so on.
- A unified approach to automation and security of infrastructure in the cloud and on-premises, thus enabling elasticity of workloads.
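As an illustration of what such an access policy can look like, here is a minimal IAM-style policy expressed as a Python dict (so it can be templated and uploaded from automation). The bucket name, queue name, region and account ID are hypothetical; this is a sketch of the pattern, not a policy from any specific product.

```python
import json

# Hypothetical policy granting an on-prem workload access to one S3 bucket
# and one SQS queue -- the "rest of the platform services from the cloud".
EDGE_WORKLOAD_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-edge-bucket/*",
        },
        {
            "Effect": "Allow",
            "Action": ["sqs:SendMessage", "sqs:ReceiveMessage"],
            "Resource": "arn:aws:sqs:us-west-2:123456789012:example-edge-queue",
        },
    ],
}

# Serialized form, ready to attach to a role via automation tooling.
POLICY_JSON = json.dumps(EDGE_WORKLOAD_POLICY, indent=2)
```

Scoping each statement to a specific resource ARN, rather than `*`, is what keeps on-premises machines from becoming an over-privileged back door into the cloud account.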
As enterprises continue their journey to the cloud, they also learn some lessons about the variable costs that come with the pay-as-you-go model. For example, in the public cloud you pay for compute, storage, data transfer, API calls, number of requests and IOPS. It becomes hard to predict the total cost or to eliminate these workload-dependent variable expenses. In addition, some workloads are inherently better suited to running on-premises in an Outposts-like deployment.
Unfortunately, many organizations haven’t been able to realize these benefits, either because of their existing infrastructure or because of the cost barrier in terms of the minimum hardware spend and the logistics of procuring and installing the hardware. This got us thinking about how we could enable a subset of AWS Outposts-like functionality to be seamlessly enabled on any off-the-shelf hardware in a few hours.
In this blog we detail this subset of use cases and outline a software-only implementation that provides an AWS Outposts-like environment on standard commodity servers.
Software Only Outpost Deployment
The smallest AWS Outposts configuration, catering to test and dev workloads, can be deployed for $6,965 per month. See details for larger, more feature-rich and more expensive stacks here: https://aws.amazon.com/outposts/pricing/
As an example, a 2-node (g4dn.12xlarge: 48 vCPUs, 192 GB RAM, 4 GPUs) GPU-ready configuration costs $8,914 per month, which comes to $282,000 if you pay upfront for 3 years. Similar hardware with 4 GPUs and the same number of cores, memory and storage costs around $70,000 as a one-time expense over 3-5 years. On top of that, you have to add in some of the costs of datacenter space, networking and management to get the actual savings.
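As a back-of-the-envelope sanity check, the comparison above can be sketched in a few lines. The figures are the ones quoted in this post; the on-prem datacenter, networking and management overheads mentioned above are deliberately left out, so the gap shown here is an upper bound.

```python
# Figures quoted above for the 2-node GPU configuration.
OUTPOSTS_MONTHLY = 8_914        # 2x g4dn.12xlarge, per month
OUTPOSTS_3YR_UPFRONT = 282_000  # all-upfront, 3-year term
ONPREM_HARDWARE = 70_000        # one-time, similar specs, 3-5 year life

def term_cost(monthly: int, months: int = 36) -> int:
    """Total cost of the pay-monthly option over the term."""
    return monthly * months

pay_monthly = term_cost(OUTPOSTS_MONTHLY)            # 320,904 over 3 years
hw_savings = OUTPOSTS_3YR_UPFRONT - ONPREM_HARDWARE  # 212,000 before
                                                     # datacenter/ops costs
```

Even under the discounted all-upfront Outposts price, the hardware-only delta over three years is roughly 4x the cost of buying the servers outright.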
So if you want to take advantage of your own existing hardware, or to buy something specific to your application’s needs, you can’t do that with Outposts: you have to use both hardware and software from the cloud vendor, and you have no control over either.
Requirements for a Software-Only AWS Outposts
We believe there are two requirements that, if met, allow you to leverage on-premises servers, even in small quantities, as an extension of your public cloud.
These two prerequisites for running an on-premises environment are as follows:
Container adoption or ETL jobs: you should be able to package and run your workload as a set of containers. Alternatively, the workload can be a set of jobs in an ETL pipeline that you want to run against a big data cluster like Spark.
No low-latency access needed to public cloud services: if the application is cloud native and wants to use platform services from the cloud, like DNS, load balancers, S3, DynamoDB or SQS, while running the expensive compute on-premises, then it should be tolerant of the latency incurred when the on-premises compute communicates with those cloud services.
Why these two requirements, you may ask?
Simply put, the first requirement removes the need to run an expensive virtualized infrastructure on-premises and to deal with all the operational complexity that comes with it. The second requirement ensures that your application, or parts of it, can leverage the on-premises compute, offsetting the bulk of the cloud costs. You can continue to use some of the non-compute cloud services like S3, SNS and SQS, as long as accessing them over the WAN is not an issue.
Once these two requirements are met, a few challenges still remain, both in dealing with the hardware and in the operational burden of managing and running workloads on these servers.
Ideally you need management software that makes the on-premises servers look and feel almost like cloud instances (EC2 instances, in the case of AWS). If you can truly achieve that operational simplification, then the edge can look just like an extension of the public cloud, while providing the cost savings, higher performance and locality that your team will love.
Let’s look at some of the use cases first, followed by the specific challenges and how you can actually achieve this cloud + edge nirvana!
Use Cases For Edge Compute
We have successfully provided an edge deployment to several customers, who were able to extend their AWS cloud to an on-premises infrastructure and cut their costs by 60%, while getting higher performance from running native containers with local IO access.
Based on these existing customer scenarios, we have come up with 4 use cases where such an environment can help as long as you can minimize the operational cost.
- Test and dev workloads
- Big data analysis
- AI/ML on local data
- Basic IaaS-based app hosting
Let’s look at each in more detail:
1) Test & dev: test and dev workloads need to run 24x7 as part of a company’s CI environment. They are not mission critical, they do not require reliable replicated storage, and they can provide fast local access to developers. Builds can also complete faster thanks to the lack of virtualization overhead and the use of local disks. Running them on a set of on-premises servers brings several such benefits.
2) Big data analysis: most big data analysis software relies on compute-intensive distributed processing (like Spark), storage of large data sets, and finally lightweight post-processing to produce the final results. These are mostly batch jobs and experiments that are not real-time.
One could build an ETL pipeline that runs the compute-intensive Spark jobs on-premises, transfers the results to S3, does lightweight post-processing via Lambda, and finally exposes the data in S3 via SQL with AWS Athena. Again, you don’t need reliable storage here, as data is typically replicated by the underlying distributed storage system, like HDFS. Since in the cloud you always pay for data at rest and pay extra for IOPS, this can be a cost-effective solution.
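The handoff point in such a pipeline can be sketched as follows. The heavy Spark stage runs on-premises and only the small result set is shipped to S3 for Lambda and Athena to pick up. The bucket name and key layout below are hypothetical, and boto3 is imported lazily so the helper functions stand on their own.

```python
def results_key(job_id: str, date: str) -> str:
    """Lay out results under a dt= prefix so Athena can partition-prune
    by date (an assumed, but common, S3/Athena layout)."""
    return f"etl-results/dt={date}/job={job_id}/part-0000.parquet"

def upload_results(local_path: str, bucket: str, key: str) -> None:
    """Ship the on-prem Spark output to S3 for the cloud-side stages."""
    # Imported here so the sketch is importable without AWS tooling installed.
    import boto3
    boto3.client("s3").upload_file(local_path, bucket, key)

if __name__ == "__main__":
    # After the on-prem Spark job writes its output locally:
    upload_results(
        "/data/out/part-0000.parquet",
        "example-analytics-bucket",               # hypothetical bucket
        results_key("daily-agg", "2020-06-01"),
    )
```

The key point is that only `upload_results` crosses the WAN; the IOPS-heavy shuffle and aggregation stay on local disks.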
3) AI/ML on local data: many companies collect terabytes of data per day from their AI/ML-based applications. Examples are self-driving car companies generating training data from their cars every day, insurance companies going through large numbers of documents, ride-sharing platforms like Uber and Lyft verifying millions of driver documents every day, and real estate or financial companies going through a lot of legal documents for verification. Much of this data is produced at the edge and needs to be analyzed there. You can continue to run some bursty training jobs either on-premises or in the cloud. Ideally, it would be great to have a single cluster spanning local GPU servers and cloud GPU instances; you can then choose to run an AI workload in a specific location based on data and compute availability.
4) Standard IaaS-based web hosting: this workload again needs to run 24x7 and does not require many cloud features beyond a simple load balancer, basic application monitoring and a periodic database backup. These workloads can easily be migrated to a couple of edge servers, where they can run far more cost-effectively. You can continue to use ELB, Route 53, WAF, CDN and monitoring features from the public cloud itself.
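To show how little the application itself needs for this split, here is a minimal, stdlib-only health endpoint of the kind an edge-hosted service could expose so that a cloud-side load balancer (an ALB target, say) can probe it and route traffic to the on-premises servers. The port and path are arbitrary choices for the sketch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Health(BaseHTTPRequestHandler):
    """Tiny handler: 200 on the health path, 404 elsewhere."""

    def do_GET(self):
        if self.path == "/healthz":
            body, status = b"ok", 200
        else:
            body, status = b"not found", 404
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep stderr quiet; in the setup described here, request logs
        # would be shipped by the logging stack (e.g. Filebeat) instead.
        pass

if __name__ == "__main__":
    # Bind on all interfaces so the cloud-side load balancer can reach it.
    HTTPServer(("0.0.0.0", 8080), Health).serve_forever()
```

Everything stateful or managed (DNS, TLS termination at the load balancer, backups) stays in the cloud; the edge box only has to answer this probe and serve the app.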
Table 1 below provides a high-level summary of these workloads and the features that make them uniquely suited to benefit from an edge deployment.
Challenges In Edge Deployments
Let’s say you have some servers in a rack in a colo, or in a lab inside your company. Once you rack and stack these servers, you need to go through a number of steps to make them consumable:
OS and Hardware Management
- OS installation: PXE boot, USB install
- Firmware install or upgrades
- Package install: using internal or external package repos
Application deployment
- Access to a container registry: internal or external
- Application deployment: container management, Spark and Hadoop clusters
Logging and Monitoring
- Logging: Filebeat, Elasticsearch, Kibana
- Monitoring: Nagios, Prometheus, Grafana, InfluxDB
Security and Access Control
- Multi-tenant access for developers or teams from different projects
- Connecting the on-premises deployment securely to AWS services
- Role-based access control without leaking AWS keys
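On the last point, one common pattern (sketched here under stated assumptions, not as any vendor’s implementation) is for on-premises processes to obtain short-lived credentials via STS AssumeRole instead of storing long-lived access keys on the servers. The role ARN and session name below are hypothetical; only the expiry-check helper runs outside AWS.

```python
import datetime

def is_expired(expiration: datetime.datetime,
               now: datetime.datetime,
               skew_seconds: int = 300) -> bool:
    """Treat credentials as expired a little early, so callers refresh
    before the real expiry rather than racing it."""
    return (expiration - now).total_seconds() <= skew_seconds

def assume_edge_role(role_arn: str) -> dict:
    """Fetch temporary credentials for an on-prem worker via STS."""
    # Imported lazily so the expiry helper works without AWS tooling.
    import boto3
    resp = boto3.client("sts").assume_role(
        RoleArn=role_arn,                 # hypothetical role for edge hosts
        RoleSessionName="edge-worker",
        DurationSeconds=3600,
    )
    return resp["Credentials"]            # AccessKeyId, SecretAccessKey,
                                          # SessionToken, Expiration

if __name__ == "__main__":
    creds = assume_edge_role("arn:aws:iam::123456789012:role/edge-worker")
```

Because the keys expire within the hour, a leaked credential from an edge box has a bounded blast radius, unlike a static key baked into the OS image.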
All of this needs to be handled by some software layer, or else you have a big operational task on your hands to automate it all and make this environment really useful. This is exactly what AWS Outposts solves, by providing a combined hardware and software solution managed by AWS. While that is useful, a software-only solution that can work with any hardware is even more powerful and offers a lot more flexibility.
Imagine if you could simply install an OS on these machines and, within a few minutes, have them join your AWS environment as a cluster, where you can deploy your applications using containers and get monitoring, logging, multi-tenancy and upgrades handled automatically by management software running in the cloud as part of your own account.
The goal is to make these servers on the edge appear like EC2 instances, with tags, similar networking, and roles that even let them access other AWS services. This is where DuploCloud comes in, stitching all of this together to create a seamless edge cloud. Let’s look at this technology in the next section.
Seamless Edge Management with DuploCloud
With DuploCloud, we have built an edge computing solution that can convert any set of edge servers into AWS EC2 instances, with local storage used as EBS volumes. You simply install the operating system on the edge servers, which can be a standard Linux or Windows install, and add them to the cluster managed by DuploCloud. Other machines in the cluster can also be standard EC2 instances. DuploCloud itself is self-hosted software that you run as part of your own cloud account in AWS or Azure.
Once DuploCloud takes over a machine, it automatically installs all the containers needed for logging, monitoring, security and compliance. It also assigns roles to the machine so that it looks and behaves like an EC2 instance with specific access to AWS resources. As you launch containers, whether as part of your CI/CD, big data analytics, AI/ML workloads or any general hosting, you can launch them either in the public cloud or on an on-premises machine, from the same management plane built by DuploCloud.
Operational Model
A company that already has on-premises hardware can continue to use its racks, servers and teams to do the minimal setup needed. Others can use a colocation facility that provides bare-metal servers and takes care of power, cooling, OS install and networking. Once the servers are available, DuploCloud takes care of installing all the packages related to monitoring, logging, security and compliance, and shows these machines as part of a built-in SIEM dashboard for your specific compliance regime. The dashboard also shows you when packages are out of date and need updating. So the tasks that a customer has to do are:
- Installing the operating system
- Any firmware upgrades, if needed
- Software package upgrades on the host
These are all infrequent or one-time operations that can be handled by someone with a basic understanding of Linux or Windows.
Everything else is taken care of by DuploCloud’s management and orchestration software.
Conclusion
If you have any on-premises hardware that you want to use as an extension of your cloud environment, you can use DuploCloud for that. You also get built-in application deployment, CI flow, monitoring, logging, security controls, connectivity to your VPC, and a SIEM dashboard for compliance. All of this can take months to set up manually, or even with automation scripts using Ansible, Chef or Puppet. And if you choose to do the automation yourself, getting a true extension of your public cloud account, where local machines have secure access to cloud resources and services, is a lot of extra work on top of that, all needed for seamless deployments and the ability to burst.
If you want a seamless edge cloud but don’t want to deal with all the operational challenges that come with it, please check out https://www.duplocloud.com/out-of-box-edge-computing.html
Translated from: https://medium.com/@duplocloud/convert-your-on-premise-servers-into-edge-cloud-like-aws-outpost-d10a7bd2c010