The State of the Serverless Art

Over the past 2 years, the Hydro team I lead at Berkeley’s RISELab has been running hard and fast in the area of serverless computing. We designed and built a stateful Functions-as-a-Service platform called Cloudburst. I like to say “Cloudburst puts the state into the state-of-the-art” in serverless computing.

In addition to the software prototype, papers on Cloudburst and serverless computing have been rolling out at quite a clip:

  • Cloudburst system architecture in VLDB 2020. (overview below)

  • Transactional causally-consistent caching in SIGMOD 2020. (overview below)

  • Atomic Fault Tolerance (AFT) in Eurosys 2020. (overview below)

  • Optimized Serverless ML Prediction Serving using CloudFlow over Cloudburst at arXiv. (overview below)

  • A critique of the (prior) state of the art serverless systems in CIDR 2019. (overview below)

  • All this on top of the two award-winning papers on the Anna serverless KVS at ICDE18 and VLDB19. (overview below)

In this post I go into the background of the serverless computing space, and how we got to where we are today. For better or worse, this is a long post. If you want to skip the background, you can jump straight to descriptions of the new results.

Programming for The Biggest Computer Ever Built

I got interested in serverless computing because of my ongoing obsession with techniques for programming the cloud. Why obsess about that, you might ask?

To put a fine point on it, the public cloud is the most powerful general-purpose computer ever assembled. Forget the race for the “biggest supercomputer”, secreted away in a government lab. The cloud is orders of magnitude bigger, and growing every day. Even better, it’s not locked up in a lab — anyone can rent out its power at any scale, on demand. Everyone in computer science should be excited about this, as it’s arguably the biggest game-changer in computing access since the rise of the PDP-11 and UNIX in the early 1970s.

Unfortunately, raw computing power does not translate directly to useful computation. The cloud is not just massive, it is also a data-rich distributed system, which raises notoriously difficult computer science problems, including parallel programming, mid-flight failures of participating nodes, and data inconsistencies across distributed machines. For general-purpose cloud programming, developers today are forced to (a) write sequential programs to run on each machine they want to use, (b) ensure that code works together in concert to achieve desired outcomes in the face of the core CS problems described above, and (c) figure out how to deploy and manage all that complexity. As a result, it is very difficult today for developers to harness the power of the cloud at scale. To continue the analogy, the cloud is like the PDP-11 without UNIX and C — we’re programming it with the distributed systems equivalent of assembly code (though honestly it’s far harder than that from a correctness perspective).

Background: Where Did a Decade Go?

One of the reasons we’re moving so fast in my Hydro team recently is because my students and I have been beavering away in this space for over a decade at Berkeley. Ten years ago this fall, at ACM PODS 2010, I issued a call to arms in a keynote talk:

It is now within the budget of individual developers to rent massive resources in the world’s largest computing centers. But … this computing potential will go untapped unless those developers can write programs that harness parallelism, while managing the heterogeneity and component failures endemic to very large clusters of distributed computers.

Given that imperative, I assembled the BOOM project team back in 2010 to explore and demonstrate new ways to write programs. We started designing programming languages like Dedalus and Bloom that use the data in the cloud to drive computation, rather than worrying about which computer is doing what and when. Our early message was not lost on the tech press, which covered the ideas and flagged the promise of our work quite a bit.

But the agenda of general-purpose cloud programming got surprisingly little uptake in the ensuing decade, either in practice or research.

In retrospect, the likely distraction was easier money. Amazon Web Services spent the better part of the ‘teens demonstrating that well-capitalized firms could disrupt the enterprise software market without third-party developers or radical new software. Forget cultivating an iPhone-style “app for that” developer community! It was easier to go after aging giants like Oracle and IBM, and offer traditional software to traditional use cases, exploiting the radical new platform solely to lower administrative overheads.

And so a decade went by, and we wrote a bunch of papers, built some prototypes, and graduated some new PhDs. We felt pretty excited about the work, and we got plenty of academic recognition. But as the old joke goes, “if you’re so smart, why ain’t you rich?” I have to admit that Jeff Bezos made more money on AWS in the last decade than I did at Berkeley doing research. So to be clear, I’m not arguing that the hundreds of billions of dollars of “boring” cloud revenue was a bad play for businesses.

Nonetheless, the deeper technical revolution in cloud programming still awaits. Now that the cloud market has real competition, and the on-premises software market is back on its heels, we’re entering a new era where enabling the new stuff is going to matter.

Commercial Serverless: FaaS

As part of that new era, the cloud vendors have finally made some moves to empower developers outside their walls. The moniker they’ve chosen? Serverless computing. It’s not my favorite term, but it will have to do for now.

In its first incarnation, the idea of serverless computing has been embodied with an API called Functions as a Service (FaaS). As expected, Amazon was first with their AWS Lambda offering, but Microsoft Azure Functions and Google Cloud Functions followed quickly. The idea is simple: a developer writes a function in their favorite traditional programming language. They then upload the function to the cloud, and are given APIs to invoke the function remotely at will. Whenever data arrives at the function input, computation spins up in the cloud, and the result is passed to the output. The developer spends zero time configuring servers. The cloud resources auto-scale up and down dynamically according to usage, and the developer pays as they go, according to that usage.

To be clear, FaaS is only a first step in cloud programming. It is targeted at launching single-threaded sequential code in traditional languages, i.e. the “assembly language of distributed programming” I mention above. Still, while programming may be rudimentary, at least I don’t need to be a cloud devops wizard as well! And I only pay for what I use. That is, without question, progress.
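The FaaS model described above is easy to see in miniature. The sketch below is our own illustration, not any vendor’s actual API: `handler` is the kind of single-threaded function a developer writes, and `invoke` is a hypothetical stand-in for the platform’s remote-invocation endpoint.

```python
# A minimal sketch of the FaaS programming model: the developer writes an
# ordinary function mapping an input event to an output, and the platform
# invokes it on demand, autoscaling instances behind the scenes.

def handler(event):
    """A developer-written cloud function: pure input -> output."""
    name = event.get("name", "world")
    return {"greeting": f"hello, {name}"}

def invoke(fn, event):
    """Hypothetical stand-in for the platform's invoke API. In reality this
    is an HTTP call, and the platform decides where and when fn runs."""
    return fn(event)

result = invoke(handler, {"name": "serverless"})
```

The key point is what the developer does *not* write: no server configuration, no placement, no scaling logic.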

In late 2018, a bunch of us in the RISELab at Berkeley started looking at serverless computing. The systems folks in the lab began a writing-by-committee effort to describe the movement of this bandwagon in one of their “Berkeley View” assessment papers. Having already spent a decade thinking about the future of cloud programming, I had stronger opinions. As a counterpoint to the committee effort, my team laid out our frank assessment of the basic pros and cons of first-generation FaaS in a paper entitled Serverless Computing: One Step Forward, Two Steps Back. In a nutshell:

  • Forward: Autoscaling. Third-party software is automatically scaled up and down according to usage patterns, in a pay-as-you go manner.

  • Back: Slow Data Access. Serverless functions see embarrassingly high-latency and costly access to stored data.

  • Back: No Distributed Computing. Functions are not allowed to communicate with one another except through high-latency storage, making most distributed computing techniques impossible.

Some folks, especially at the orange website, cast the article as a hit job from clueless academics. But the Morning Paper, which has followed our work since the beginning, got the spirit of it:

[this is] an appeal from the heart to not stop where we are today, but to continue to pursue infrastructure and programming models truly designed for cloud platforms

Also I like to think we’re not totally clueless (nor totally academic). While writing that paper we were already moving forward, getting past the challenges that the first-gen serverless offerings had dodged. In the papers and prototypes we’ve released since then, we are demonstrating what’s possible.

Stateful Serverless Infrastructure 1: Storage

In the early days of the RISElab, we wanted to demonstrate that the lessons of the BOOM project — notably avoiding coordination in the style of the CALM Theorem — could be realized in a high-performance system. So Chenggang Wu set out to build a key-value storage (KVS) database called Anna that embraced and extended those lessons.

The first goal of Anna—and the name of the original paper—was to perform well at any scale. What did we mean by that? Well, conventional wisdom said that systems have to be rearchitected every time they expand 10x beyond plan. Anna was designed to demonstrate that the lessons of coordination-freeness could result in a system that offered world-beating performance at the small scale on a single multicore box, and at massive scale on machines distributed across the globe.

The Anna story is richer than just the any-scale story. Anna is the subject of two earlier posts of mine (here and here) and two award-winning research papers (ICDE18 and VLDB19), and given the length of this post I’ll be brief here, focusing on technical headlines:

  • Anna is crazy fast. In simple workloads Anna is as fast as anything around at any scale. Under contention, Anna is orders of magnitude faster than the fastest KVSes out there, including Redis, Masstree, and Intel’s TBB hashtable. This is because Anna never coordinates (no locks, no atomics, no consensus protocols!), whereas those systems spend 90+% of their time coordinating under contention.

  • Anna offers flexible autoscaling. This is the hallmark of a good serverless infrastructure: scales up when you use it hard, scales down to save money and power when you don’t. Again, coordination-freeness is key: there’s no need to maintain distributed membership information, so the cost to add or remove nodes remains low at every scale.

  • Anna provides rich data consistency. Even under parallel and distributed execution, Anna can offer various consistency guarantees to allow programmers to reason about data across machines, including powerful classical notions such as causal consistency and repeatable read transactional isolation.

  • Anna provides unified caching/tiering. Many KVS systems today are designed for one level of storage: either disks, or RAM. In contrast, you can deploy Anna as a caching tier in memory, as a database on disk, or as a multitiered system with a smaller cache on top of a larger database. Anna moves data up and down the tiers, and provides uniform consistency guarantees across both.

There is no storage offering from any cloud vendor today that compares with what Chenggang has done with Anna. I believe Anna identifies and can fill a significant hole in the current cloud architectures.
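The coordination-freeness at the heart of Anna comes from representing state as lattices, whose merge functions are associative, commutative, and idempotent, so replicas converge without locks or consensus. The sketch below is our own illustration of the principle (not Anna’s code), using a last-writer-wins register as the lattice.

```python
# Illustrative sketch of lattice-based, coordination-free state: because
# merge is associative, commutative, and idempotent, replicas can apply
# updates in any order and still converge to the same value.

class LWWRegister:
    """Last-writer-wins register: a simple lattice over (timestamp, value)."""
    def __init__(self, timestamp=0, value=""):
        self.timestamp, self.value = timestamp, value

    def merge(self, other):
        # Keep the entry with the larger timestamp; ties are broken by
        # comparing values, keeping merge deterministic on every replica.
        winner = max((self.timestamp, self.value),
                     (other.timestamp, other.value))
        return LWWRegister(*winner)

# Two replicas see the same writes in different orders, yet converge.
a = LWWRegister(1, "x").merge(LWWRegister(3, "z")).merge(LWWRegister(2, "y"))
b = LWWRegister(2, "y").merge(LWWRegister(1, "x")).merge(LWWRegister(3, "z"))
```

Anna generalizes this idea to a family of composable lattices, which is what lets it avoid coordination at every scale.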

Stateful Serverless Infrastructure 2: Stateful Compute

As Anna was maturing, we were ready to move up the stack and contemplate programming. As our first phase, we decided to try and build a FaaS system that tackles the “two steps backward” that plague the commercial FaaS services. This means two things:

  1. Allow cloud functions to communicate with each other over the network. Commercial FaaS systems prevent 2 functions from communicating directly; they have to share any information via some slow distributed storage system. This is true even for simple stuff like passing the results of g(x) to another function f so you can compute f(g(x)). Beyond the basics, fast point-to-point communication is absolutely essential if you hope to do any non-trivial distributed computing other than batch jobs. The potential problem here is that serverless functions come and go pretty often, so their IP addresses aren’t reliable endpoints. This is solved with a classic level of indirection: a lookup service, implemented as some kind of lightweight distributed database. DNS is arguably too heavyweight to deploy for this setting, which is perhaps why the cloud vendors refuse to support networking for FaaS. Fortunately we have Anna—a lightweight autoscaling database. So functions can look each other up by “name” in Anna, and get a current IP address for that name. In a direct sense, Anna serves both as a database and as a Distributed Hash Table overlay network, a duality we explored years ago.

  2. Provide cloud functions with low-latency data access (LDPC). All the interesting challenges in distributed computing begin with data, or as some people like to say, the state of a program. Commercial FaaS vendors are targeted at stateless programs that simply map inputs to outputs with no “side effects” like data updates. But most applications of note these days manage data (state), often in complex ways. Adding to the complexity here is the trend towards disaggregation of storage from compute. In a big cloud environment, you don’t know when and how you need to scale out or upgrade your storage tier or your compute tier, so it’s best to keep them separate. The challenge is that storage services like DynamoDB or ElastiCache become very “far away” in latency terms. To get good latency, we still want some physical colocation of storage near our functions, even if the two tiers are managed and scaled separately. This is what we call Logical Disaggregation with Physical Colocation (LDPC). On this front we needed to innovate, and colocate a data cache on the same machines as the cloud functions, while still providing consistency in concert with Anna.
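The name-based indirection in step 1 can be sketched in a few lines. In the sketch below, which is our illustration rather than the Cloudburst API, a plain dict stands in for Anna; the real registry is a replicated, autoscaling KVS, and addresses go stale as function instances come and go, so callers must be prepared to re-resolve.

```python
# Sketch of function discovery via a lookup service: functions register
# their current address under a stable name, and peers resolve the name
# before connecting. A dict stands in for Anna here.

registry = {}  # name -> current "ip:port" of the function instance

def register(name, address):
    registry[name] = address

def resolve(name):
    # In Cloudburst this lookup would hit Anna; a miss (or a stale entry)
    # means the instance moved, and the caller should re-resolve and retry.
    return registry.get(name)

register("f", "10.0.0.5:7000")
register("f", "10.0.1.9:7000")  # instance moved: same name, new address
```

This is the classic "level of indirection" fix: names are stable even though the IP addresses behind them are not.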

This is where a lot of our energy has been spent in the last year. I’ve learned a lot along the way — while the programming problem remains, the system infrastructure space was interesting in its own right, and I think we got a good handle on the big issues. Here is a rundown of the recent results:

  • Cloudburst System Architecture: The big ideas, overall architecture and some of the details are spelled out in our VLDB 20 paper on Cloudburst. We argue for the LDPC principle and describe the resulting architecture. Then the paper goes into detail on how we automatically encapsulate a developer’s mutable Python state in coordination-avoiding, composable lattice structures so arbitrary Python objects can be integrated into the coordination-free consistency model of Anna. We also describe how we achieve a simple version of causal consistency through these caches. Microbenchmarks show that we can outperform commercial serverless platforms by 1–2 orders of magnitude, and compete with hand-managed serverful distributed frameworks like Dask. We also show end-to-end numbers for two applications: ML prediction serving, and the Retwis twitter clone. Although we did nothing special to tune for ML prediction serving, we outperform AWS Sagemaker, a system specially designed for the task. (We also outperform AWS Lambda by quite a bit more.)

  • Hydrocache and TCC: The Hydrocache paper in SIGMOD 2020 delves deeper into the ways we keep caches and the database consistent, while still providing low latency. We set the consistency bar even higher in this paper, with the goal of offering transactional causal consistency (TCC). You do not get this level of consistency from the typical distributed caches or KVS systems (looking at you, Redis Cluster!) Yet we show it can be done with very low latency. There’s no question that this paper is quite technical, though. Enjoy :-)

  • Atomic Fault Tolerance (AFT): The question of fault tolerance should be on your mind when reading about any distributed system. The FaaS vendors are quite naive about it right now — they tell developers that any function may fail and have to be retried, so it’s up to the developer to ensure that their code is idempotent, meaning it has the same effect whether run once or more than once. That’s not very nice, nor is it very likely to be guaranteed. (OK pop quiz time. Stop what you’re doing. Did you write any code this week? Cool. Is it idempotent? How do you know? Is it reasonable to expect you to worry about that? I thought not!) But it gets worse. If your function modifies stored state (say by issuing a call to a database), and it fails a fraction of the way through execution, it will have visibly run a fractional number of times. That is, the partial execution of your function is now exposed in storage and may be read by other code. This paper points out that what’s needed for FaaS fault tolerance is Atomicity, i.e. the “A” from the ACID guarantees. All your function’s external effects should occur, or none should. Idempotence then becomes easy — just include a unique ID for the request, and regardless of how messy it is, we can run it 0 or at most 1 times. That’s how idempotence is supposed to be exposed. This paper leans on our prior work on Read Atomic isolation, and provides a surprisingly simple implementation as a “shim” layer that works in any FaaS architecture. We have it running in the Cloudburst/Anna stack, but the paper shows how to deploy it in the AWS Lambda/S3 stack.

  • Model Serving. Our first foray into model serving in the VLDB 20 Cloudburst architecture paper whet our appetite to do better. A few years back, when my co-conspirator Joey Gonzalez was leading the Clipper model serving project, I needled him by saying “hey I think all these optimizations you’re exploring — cascades and ensembles and whatnot — could be written as simple Bloom programs”. And I proceeded to sketch them as dataflows on a whiteboard. Well, with the Cloudburst infrastructure under his belt, Vikram Sreekanti took up that idea and made it real. He implemented a simple dataflow language called Cloudflow, and deployed it over Cloudburst. Then he proceeded to explore optimization opportunities exposed by the combination of explicit dataflow and stateful serverless computing, including things like (a) placing code on the right HW resources (i.e. GPUs) or colocated with the right data (i.e. in a Hydrocache), (b) autoscaling different stages of an ML pipeline differently, (c) fusing operators so they run colocated with each other, and (d) running competing versions of operators in parallel to let the fastest execution win. What’s really nice here is that the ML code remains a black box, so this is compatible with your favorite ML libraries (Tensorflow, PyTorch, MXNet, Scikit-Learn, etc.) Joey and I feel like Vikram really made the case that this is the right way to architect a model serving system.
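The request-ID discipline that AFT builds on can be sketched simply. The code below is our illustration of the idea, not AFT’s implementation: each request carries a unique ID, and its effect on storage is applied at most once no matter how many times the platform retries the function. (AFT itself goes further, making all of a function’s updates atomically visible or not visible at all.)

```python
# Sketch of at-most-once effects via unique request IDs: a retried request
# finds its ID already recorded and skips re-applying its update, so
# platform-level retries cannot double-apply side effects.

storage = {}         # stand-in for the storage tier
applied_ids = set()  # IDs whose effects have already been applied

def transfer(request_id, account, amount):
    if request_id in applied_ids:
        return storage[account]  # retry: effect already applied, just reply
    storage[account] = storage.get(account, 0) + amount
    applied_ids.add(request_id)
    return storage[account]

transfer("req-1", "alice", 100)
transfer("req-1", "alice", 100)  # a platform retry of the same request
```

Note that in a real system the ID check and the update must be made atomic and durable together, which is exactly the kind of guarantee the AFT shim provides.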

In sum, Cloudburst is our answer to the critiques of FaaS we raised 2 years ago. Cloudburst shows that FaaS can provide 3 steps forward, and provide an underpinning for general-purpose cloud programming. Most programming tasks that can benefit from the world’s biggest computer absolutely require efficient and consistent management of program state, and that’s where much of the hard computer science lies in this space.
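The dataflow style Cloudflow takes with model pipelines can be sketched as follows. The `Flow` class here is a hypothetical miniature, not the real Cloudflow API: because a pipeline is built as explicit structure before anything runs, an optimizer is free to fuse stages, pin them to GPUs, or autoscale them independently, all while the operators stay black boxes.

```python
# Sketch of dataflow-style pipeline composition: stages are black-box
# functions, and the pipeline is data (a list of stages) that an optimizer
# could inspect and rewrite before execution.

class Flow:
    def __init__(self, stages=None):
        self.stages = stages or []

    def map(self, fn):
        # Lazily append a stage; nothing executes until run() is called.
        return Flow(self.stages + [fn])

    def run(self, x):
        for stage in self.stages:
            x = stage(x)
        return x

# Toy "model serving" stages: normalize pixel values, then classify.
preprocess = lambda img: [p / 255 for p in img]
predict = lambda feats: "cat" if sum(feats) > 1 else "dog"

pipeline = Flow().map(preprocess).map(predict)
label = pipeline.run([128, 200, 64])
```

Separating pipeline construction from execution is the design choice that exposes the optimization opportunities listed above.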

Summing Up

Obviously all this work was done by a team. The lion’s share was done by the lead PhD students, Vikram Sreekanti and Chenggang Wu, who are truly a dynamic duo. Joey Gonzalez was my co-conspirator as faculty advisor. Other contributors include Saurav Chhatrapati, Charles Lin, Yihan Lin, and Hari Subbaraj, with wise input from Jose Faleiro, Johann Schleier-Smith, and Alexey Tumanov.

Our ability to slay some dragons in this space in recent years is also thanks to a long line of research from an even bigger group of collaborators from BOOM and P2 days. There’s more to come from our end, and I expect to see more good stuff from the community at large. Programming the cloud is one of the biggest challenges and opportunities in computer science, and we’ll continue pushing forward.

In addition to NSF CISE Expeditions Award CCF-1730628, this research is supported by gifts from Alibaba, Amazon Web Services, Ant Financial, CapitalOne, Ericsson, Facebook, Futurewei, Google, Intel, Microsoft, Nvidia, Scotiabank, Splunk and VMware.

Translated from: https://medium.com/riselab/the-state-of-the-serverless-art-78a4f02951eb
