Using AWS Lambda
Originally published at https://www.philschmid.de on August 12, 2020.
Introduction
“Just like wireless internet has wires somewhere, serverless architectures still have servers somewhere. What ‘serverless’ really means is that as a developer you don’t have to think about those servers. You just focus on code.” (serverless.com)
This focus is only possible if we accept some tradeoffs. Currently, all serverless FaaS services such as AWS Lambda, Google Cloud Functions, and Azure Functions have limits. For example, there is no real state, and memory cannot be configured without limit.
These limitations have led to serverless architectures being used more for software development and less for machine learning, especially deep learning.
A big hurdle to overcome in serverless deep learning with tools like AWS Lambda, Google Cloud Functions, and Azure Functions is storage. Tensorflow and Pytorch are huge, and newer “state of the art” models like BERT are over 300MB in size. So far, it was only possible to use them with some compression techniques. You can check out two of my posts on how to do this:
Scaling Machine Learning from ZERO to HERO
Serverless BERT with HuggingFace and AWS Lambda
But last month AWS announced mountable storage for serverless functions. They added support for Amazon Elastic File System (EFS), a scalable and elastic NFS file system. This allows you to mount your AWS EFS filesystem into your AWS Lambda function.
In their blog post, they explain how to connect an AWS Lambda function to AWS EFS. The blog post is very nice, definitely check it out.
In this post, we are going to do the same, but a bit better: using the Serverless Framework and without the manual work.
I am building a CLI tool called efsync, which enables you to automatically upload files (pip packages, ML models, ...) to an EFS file system.
Until efsync is finished, you can use AWS DataSync to upload your data to an AWS EFS file system.
What is AWS Lambda?
You are probably familiar with AWS Lambda, but to make things clear: AWS Lambda is a computing service that lets you run code without managing servers. It executes your code only when required and scales automatically, from a few requests per day to thousands per second. You only pay for the compute time you consume; there is no charge when your code is not running.
What is AWS EFS?
Amazon EFS is a fully-managed service that makes it easy to set up, scale, and cost-optimize file storage in the Amazon Cloud. Amazon EFS-filesystems can automatically scale from gigabytes to petabytes of data without needing to provision storage. Amazon EFS is designed to be highly durable and highly available. With Amazon EFS, there is no minimum fee or setup costs, and you pay only for what you use.
Serverless Framework
The Serverless Framework helps us develop and deploy AWS Lambda functions. It’s a CLI that offers structure, automation, and best practices right out of the box. It also allows us to focus on building sophisticated, event-driven, serverless architectures, comprised of functions and events.
If you aren’t familiar with the Serverless Framework or haven’t set it up yet, take a look at this quick-start with the Serverless Framework.
Tutorial
We build an AWS Lambda function with python3.8 as runtime, which is going to import and use pip packages located on our EFS-filesystem. As an example, we use pandas and pyjokes. They could easily be replaced by Tensorflow or Pytorch.
Before we get started, make sure you have the Serverless Framework configured and an EFS-filesystem set up with the required dependencies. We are not going to cover the steps on how to install the dependencies and upload them to EFS in this blog post. You can either use AWS DataSync or start an ec2-instance, connect with ssh, mount the EFS-filesystem with amazon-efs-utils, and use pip install -t to install the pip packages on EFS.
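For the EC2 route, the install step can be scripted. The following is only a minimal sketch, not part of the original setup: it assumes the EFS-filesystem is already mounted on the instance at /mnt/efs and that the packages go into a lib/ subdirectory (the layout used later in this post). The function names are made up for illustration.

```python
import subprocess
import sys


def pip_install_command(packages, target="/mnt/efs/lib"):
    """Build a `pip install -t <target>` command for the given packages."""
    # sys.executable makes pip run under the same interpreter as this script.
    return [sys.executable, "-m", "pip", "install", "-t", target, *packages]


def install_on_efs(packages, target="/mnt/efs/lib"):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(packages, target), check=True)


# On the EC2 instance with EFS mounted, one would call:
# install_on_efs(["pandas", "pyjokes"])
```

Packages with compiled parts, such as pandas, should be installed with the same Python version as the Lambda runtime (python3.8 here), otherwise the import may fail inside the function.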
We are going to:
- create a Python Lambda function with the Serverless Framework
- configure the serverless.yaml and add our EFS-filesystem as mount volume
- adjust the handler.py and import pandas and pyjokes from EFS
- deploy & test the function
Create a Python Lambda function
First, we create our AWS Lambda function by using the Serverless CLI with the aws-python3 template.
This CLI command creates a new directory containing a handler.py, .gitignore, and serverless.yaml file. The handler.py contains some basic boilerplate code.
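For orientation, the generated handler.py has roughly the following shape. This is a sketch written from memory of the template, so the function name and the message text are assumptions and may differ between framework versions.

```python
import json


def hello(event, context):
    # Echo the incoming event back so the deployment can be verified.
    body = {
        "message": "Your function executed successfully!",
        "input": event,
    }
    # An HTTP-triggered function returns a status code and a string body.
    return {"statusCode": 200, "body": json.dumps(body)}
```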
Configure the serverless.yaml and add our EFS-filesystem as mount volume
I provide the complete serverless.yaml for this example, but we only go through the details we need for our EFS-filesystem and leave out all standard configurations. If you want to learn more about the serverless.yaml, I suggest you check out Scaling Machine Learning from ZERO to HERO. In that article, I went through each configuration and explained its usage.
First, we need to install the serverless-pseudo-parameters plugin with the following command.
We use the serverless-pseudo-parameters plugin to reference our AWS::AccountId in the serverless.yaml. All custom variables we need are defined under custom.
- efsAccessPoint should be the value of your EFS access point. You can find it in the AWS Management Console under EFS. It should look similar to this: fsap-0a31095162dd0ca44
- LocalMountPath is the path under which EFS is mounted in the AWS Lambda function.
- subnetsId should be the id of a subnet the EFS-filesystem is deployed in. If you started your filesystem in multiple Availability Zones, you can choose the one you want.
- securityGroup can be any security group in the AWS account. We need this to deploy our AWS Lambda function into the required subnet. We can use the default security group id. It should look like this: sg-1018g448
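Since a typo in one of these four values only shows up at deploy time, a quick sanity check can help. The sketch below is an illustration, not something from the original project: the class name, the field names, and the subnet id are hypothetical, and the checks only look at the well-known id prefixes and at the rule that a Lambda mount path has to start with /mnt/.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EfsSettings:
    efs_access_point: str
    local_mount_path: str
    subnets_id: str
    security_group: str

    def problems(self):
        """Return a list of readable problems; an empty list means it looks fine."""
        found = []
        if not self.efs_access_point.startswith("fsap-"):
            found.append("efsAccessPoint should start with 'fsap-'")
        # Lambda only accepts mount paths directly under /mnt/.
        if not re.fullmatch(r"/mnt/[A-Za-z0-9_.\-]+", self.local_mount_path):
            found.append("LocalMountPath should look like /mnt/<name>")
        if not self.subnets_id.startswith("subnet-"):
            found.append("subnetsId should start with 'subnet-'")
        if not self.security_group.startswith("sg-"):
            found.append("securityGroup should start with 'sg-'")
        return found


settings = EfsSettings(
    efs_access_point="fsap-0a31095162dd0ca44",  # example id from this post
    local_mount_path="/mnt/efs",
    subnets_id="subnet-00000000",  # placeholder, not a real subnet
    security_group="sg-1018g448",  # example id from this post
)
print(settings.problems())
```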
We utilize CloudFormation extensions to mount the EFS-filesystem after our Lambda is created. For this, we use a little snippet. Extensions can be used to override CloudFormation resources.
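The override lives in the serverless.yaml as YAML. To show its shape without guessing the exact snippet, here is the equivalent structure built in Python. FileSystemConfigs, Arn, and LocalMountPath are the CloudFormation property names of AWS::Lambda::Function; the helper names, the region, and the account id are placeholders chosen for this sketch. The account id is the part that the pseudo-parameters plugin fills in at deploy time.

```python
import json


def efs_access_point_arn(region, account_id, access_point_id):
    """Build the ARN of an EFS access point."""
    return (
        f"arn:aws:elasticfilesystem:{region}:{account_id}"
        f":access-point/{access_point_id}"
    )


def file_system_override(region, account_id, access_point_id, local_mount_path):
    """Properties to merge into the function's AWS::Lambda::Function resource."""
    return {
        "Properties": {
            "FileSystemConfigs": [
                {
                    "Arn": efs_access_point_arn(region, account_id, access_point_id),
                    "LocalMountPath": local_mount_path,
                }
            ]
        }
    }


override = file_system_override(
    region="eu-central-1",  # assumed region
    account_id="123456789012",  # placeholder account id
    access_point_id="fsap-0a31095162dd0ca44",  # example id from this post
    local_mount_path="/mnt/efs",
)
print(json.dumps(override, indent=2))
```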
Adjust the handler.py and import pandas and pyjokes from EFS
The last step before we can deploy is to adjust our handler.py and import pandas and pyjokes from EFS. In my example, I used /mnt/efs as localMountPath and installed my pip packages in lib/.
To use the dependencies from our EFS-filesystem, we have to add our localMountPath path to our PYTHONPATH. Therefore, we add a small try/except statement at the top of our handler.py, which appends /mnt/efs/lib to the PYTHONPATH. Lastly, we add some demo calls to show that our 2 dependencies work.
Deploy & Test the function
To deploy the function, we only have to run serverless deploy.
After this process is done we should see something like this.
To test our Lambda function, we can use Insomnia, Postman, or any other REST client. Just send a GET request to the endpoint we created. The response should look like this.
The first request to the cold AWS Lambda function took around 8 seconds. Once it is warmed up, it takes around 100–150 ms, as you can see in the screenshot.
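The GET request and the cold versus warm timing can also be checked from a script instead of a REST client. The sketch below uses only the standard library; the function name is made up, and the URL has to be replaced with the endpoint printed by serverless deploy.

```python
import json
import time
import urllib.request


def timed_get(url, timeout=30, opener=urllib.request.urlopen):
    """Send a GET request and return (status, parsed JSON body, seconds)."""
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        status = response.status
        payload = json.loads(response.read().decode("utf-8"))
    return status, payload, time.perf_counter() - start


# Calling it twice in a row shows the cold start and the warm response time:
# for attempt in ("cold", "warm"):
#     status, payload, seconds = timed_get("https://<your-endpoint>")
#     print(attempt, status, f"{seconds:.2f}s")
```

The opener parameter is only there so the function can be exercised without a network connection.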
The best thing is that our AWS Lambda function scales up automatically, up to thousands of parallel requests, when several requests come in at once.
If you rebuild this, keep in mind that the first request can take a while.
You can find the GitHub repository with the complete code here.
Thanks for reading. If you have any questions, feel free to contact me or comment on this article. You can also connect with me on Twitter or LinkedIn.
Translated from: https://medium.com/swlh/mount-your-aws-efs-volume-into-aws-lambda-with-the-serverless-framework-470b1c6b1b2d