Node.js event loop
In this article, we will look at some ways to quickly block or slow down the Node.js Event-loop. If you are not familiar with the concept of the “Event-loop” in Node.js, I recommend reading a few articles on the subject first. The most important thing to remember is that the Event-loop is single-threaded, so if you block it or slow it down, your entire application is affected.
Event loop quick overview
I will not go into a long explanation of the Event-loop, since many have done it before and better than me. To follow this article you just have to remember two things:
- the Event-loop is the heart of Node.js; it can be seen as an abstraction of how Node.js executes code and runs your application
- it must run without interruption and without slowing down; otherwise, your users will quickly become frustrated
The Event-loop can be schematized as follows (thanks to Bert Belder for the diagram):
Event loop limitations & dangers
As previously said, it is crucial to keep the Event-loop running and its latency as low as possible. The latency is basically the average time separating two successive iterations of your Event-loop.
A potential point of failure in a Node.js application can come from two factors:
1. The Event-loop is single-threaded
Also known as the “one instruction to block them all” factor. In Node.js, a single blocking instruction in one request is enough to block every other request. A good code review should always start by distinguishing blocking code from non-blocking code.
Non-blocking code (simple instructions):
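A minimal sketch of what simple, non-blocking instructions can look like. The variable names are illustrative assumptions, not taken from the original example.

```javascript
// Each instruction finishes in well under a millisecond, so the
// Event-loop gets control back almost immediately.
const user = { firstName: 'Ada', lastName: 'Lovelace' };
const fullName = `${user.firstName} ${user.lastName}`;
const total = [1, 2, 3].reduce((sum, n) => sum + n, 0);

// Scheduling work is non-blocking too: the callback runs later.
setTimeout(() => console.log(`hello ${fullName}`), 0);
console.log(total); // logged before the timer callback fires
```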
Blocking code (long operation):
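As a rough illustration, a long synchronous computation is enough to hold the single thread. The helper `sumTo` below is hypothetical, and the elapsed time depends on the machine.

```javascript
// Hypothetical CPU-bound helper: a synchronous loop that keeps the
// single thread busy until it returns.
function sumTo(n) {
  let sum = 0;
  for (let i = 1; i <= n; i++) sum += i;
  return sum;
}

const start = Date.now();
const result = sumTo(5e7); // no other request can be served during this call
const elapsedMs = Date.now() - start;
console.log(`sumTo finished in ${elapsedMs} ms, result ${result}`);
```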
Observation 1: do not confuse blocking code with an infinite loop; blocking code is generally a long operation (more than a few milliseconds).
Observation 2: try to differentiate long operations from operations that will slow down or block the Event-loop. Some long operations, like database access, can be handled asynchronously without disturbing your app.
Observation 3: a long response time does not necessarily mean you have a blocking task (it can be related to slow DB access, external API calls, etc.).
2. Thread pool limit
Node.js always tries to process a blocking operation either with async APIs or with a thread pool. In this manner, some blocking operations become non-blocking from your application’s point of view. It uses async APIs whenever possible, since they are more powerful and lightweight, and falls back to the thread pool only when there is no other choice. Why? Because a thread has a bigger footprint on your system and consumes more resources.
There are a few cases where Node.js has to use the thread pool:
- all fs (file system) operations, except fs.FSWatcher()
- some functions from the crypto lib
- almost all zlib functions
- dns.lookup(), dns.lookupService()
This thread pool has a size limit: by default, Node.js has access to only 4 threads, so only 4 of these operations can run in parallel.
This value can be customized with the environment variable UV_THREADPOOL_SIZE.
UV_THREADPOOL_SIZE=16 node index.js
In any case, every operation that uses the thread pool behind the scenes is a potential performance bottleneck.
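The pool limit can be observed with a small experiment. This is a sketch under assumptions: the password, salt and iteration count are arbitrary, and the timings depend on the machine. With the default pool size, one would expect the eight hashes to complete in roughly two waves of four.

```javascript
const crypto = require('crypto');

const started = Date.now();
const finishedAt = [];

// crypto.pbkdf2 (async form) runs on the thread pool.
function hashOnce() {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('secret', 'salt', 50000, 64, 'sha512', (err, key) => {
      if (err) return reject(err);
      finishedAt.push(Date.now() - started); // completion time in ms
      resolve(key);
    });
  });
}

// Start 8 hashes at once: more tasks than the 4 default threads.
const allDone = Promise.all(Array.from({ length: 8 }, () => hashOnce()));
allDone.then(() => console.log('completion times (ms):', finishedAt));
```

Running the same script with `UV_THREADPOOL_SIZE=8` should bring the completion times closer together, provided the machine has enough cores.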
How to slow down the event loop
CPU-intensive operations: crypto
The Node.js crypto lib is known to have many functions that use a lot of CPU. In a real case, it means you can quickly slow down your application. The problem becomes critical when this lib is used on every incoming request. It will:
- slow down every individual request (and frustrate your users)
- push you to run more instances to compensate for the increase in CPU consumption
In this example, we generate a token on each request, which is probably unnecessary.
It is better to generate it only once and then reuse it. That way, you don’t slow down the event loop on each new request.
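A minimal sketch of the two approaches. The token derivation below (`pbkdf2Sync` with arbitrary parameters) is an assumption standing in for whatever expensive crypto call the application makes, and the handler names are hypothetical.

```javascript
const crypto = require('crypto');

// Hypothetical expensive token derivation (assumed for illustration).
function generateToken() {
  return crypto.pbkdf2Sync('secret', 'salt', 10000, 32, 'sha512').toString('hex');
}

// Slow: pays the CPU cost on every request.
function handleRequestSlow() {
  return { token: generateToken() };
}

// Faster: the token is computed once at startup, then reused.
const cachedToken = generateToken();
function handleRequestFast() {
  return { token: cachedToken };
}
```

This only applies when the token does not have to differ between requests.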
Of course, it’s a simple example, but here is the difference in terms of performance:
We go from 195 requests in 10 seconds (before) to 39,434 (after): there is no comparison!
In a real case, it means you need fewer instances to serve the same number of requests, and/or you can use smaller servers to do the same work.
JSON.parse / JSON.stringify
Another interesting point is the famous JSON parser. We commonly use the JSON.stringify and JSON.parse functions, but both have a complexity of O(n), where n is the length of your JSON input.
Let’s see the difference when we use JSON.stringify with a small JSON file (~0.4 KB) and a large JSON file (~9 MB).
We go from 252 requests in 10 seconds with the large file to 75k with the small one. The solution can be to work with small files only or to load large files only once.
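The effect of payload size can be reproduced with a rough measurement. The payloads below are hypothetical stand-ins, not the files used in the benchmark, and the timings vary by machine. The last lines sketch the "load only once" idea for a static payload.

```javascript
// Hypothetical payloads; sizes are illustrative only.
const small = { id: 1, name: 'small' };
const large = Array.from({ length: 200000 }, (_, i) => ({ id: i, name: `item-${i}` }));

// Time a single JSON.stringify call and report the serialized size.
function timeStringify(value) {
  const start = process.hrtime.bigint();
  const json = JSON.stringify(value);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { ms, bytes: Buffer.byteLength(json) };
}

console.log('small:', timeStringify(small));
console.log('large:', timeStringify(large));

// Serialize the large, static payload once and reuse the string per request.
const largeJson = JSON.stringify(large);
function handleRequest() {
  return largeJson;
}
```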
If you really need to work with large JSON objects you should take a look at these solutions:
- JSONStream
- Big-Friendly JSON
- Fastify stringify
Reading a file instead of memory
As said before, each time you read a file you potentially create a performance bottleneck, especially if you read a file each time a request is processed. Sometimes this operation is hidden inside a dependency and is hard to detect.
I will talk about a concrete example we encountered in one of our projects at Voodoo. We use the MaxMind database to extract the user’s country from the IP address. To do that we simply use an existing npm module. Under the hood, it uses readFile from the Node.js core fs module. It’s an asynchronous operation, so it should be a piece of cake, right?
But for every new incoming request, we read the DB file (remember, we have a limited number of threads for this). So in a high-traffic API, it tends to slow down the Event-loop.
Solution: load the whole DB into memory during server startup.
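The difference between the two patterns can be sketched as follows. The file path, the JSON format and the function names are assumptions for illustration; the real MaxMind database and npm module work differently.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Stand-in for the real database file (hypothetical path and contents).
const dbPath = path.join(os.tmpdir(), `geo-demo-${process.pid}.json`);
fs.writeFileSync(dbPath, JSON.stringify({ '203.0.113.7': 'FR' }));

// Slow pattern: one thread-pool read (plus a parse) per request.
async function countryPerRequest(ip) {
  const db = JSON.parse(await fs.promises.readFile(dbPath, 'utf8'));
  return db[ip];
}

// Startup: a one-off read before the server accepts traffic.
const dbInMemory = JSON.parse(fs.readFileSync(dbPath, 'utf8'));

// Fast pattern: a plain in-memory lookup per request.
function countryFromMemory(ip) {
  return dbInMemory[ip];
}
```

The trade-off is memory usage: the whole database stays resident for the lifetime of the process.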
The following chart should speak for itself concerning the performance gain.
Vulnerable regexp
A vulnerable regular expression is one on which your regular expression engine might take exponential time.
Most of the time your regexp complexity will be O(n) (where n is the length of your input), but in some cases it can be O(n^2) or even exponential, which can lead to ReDoS (regular expression denial of service).
Let’s look at a simple regexp that checks whether an email address is valid.
[a-z]+@[a-z]+([a-z\.]+\.)+[a-z]+
Now we can measure the execution time with a simple email address and with a fake one.
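One way to run this measurement is sketched below. The two inputs are assumptions (the original test strings are not shown), and the dot count is kept small so the script finishes quickly; raising it makes the second call dramatically slower.

```javascript
const { performance } = require('perf_hooks');

// The pattern from the text; the nested quantifier ([a-z\.]+\.)+ is
// what makes the engine backtrack.
const emailPattern = /[a-z]+@[a-z]+([a-z\.]+\.)+[a-z]+/;

// Run the pattern once and report the result and the elapsed time.
function timeMatch(input) {
  const start = performance.now();
  const matched = emailPattern.test(input);
  return { matched, ms: performance.now() - start };
}

const valid = timeMatch('john@example.com');
// Hypothetical fake address: 20 trailing dots and no final letters.
const fake = timeMatch('john@example' + '.'.repeat(20));
console.log(valid, fake);
```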
If you add some dots at the end of the input it will quickly block your app. In this simple example, we go from 0.05s to 8.4s. And you can add a few more dots to completely block your Node.js instance.
To avoid this, you can check your regexp with tools like safe-regex, or use solutions that handle regexps for you, like validator.js.
How to block the event loop
Programmatic errors
Of course, the easiest way to block your application is to insert an infinite loop. It seems easy to detect and avoid, but it still happens, especially when you work a lot with modules or events.
Sometimes this kind of behavior is created faster than you might think, even by good programmers. Let’s see an example with a date and a while loop.
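A hypothetical sketch of how a date comparison can turn a loop into an infinite one. The helper names are made up for illustration, and the `maxIterations` cap exists only so that the sketch terminates.

```javascript
// Hypothetical helper meant to wait until `deadline`.
// The safety cap (maxIterations) is added here so the sketch ends.
function waitUntilBuggy(deadline, maxIterations) {
  let iterations = 0;
  // Bug: `new Date()` creates a fresh object on every pass, and two
  // distinct objects are never strictly equal, so the first condition
  // is always true. Without the cap this loop would never end.
  while (new Date() !== deadline && iterations < maxIterations) {
    iterations++;
  }
  return iterations;
}

function waitUntilFixed(deadline) {
  // Comparing timestamps terminates, although it still blocks the
  // Event-loop until the deadline is reached.
  while (Date.now() < deadline.getTime()) { /* busy wait */ }
  return Date.now() >= deadline.getTime();
}

const deadline = new Date(Date.now() + 20);
console.log(waitUntilBuggy(deadline, 1e6)); // stops only because of the cap
console.log(waitUntilFixed(deadline));
```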
Still not convinced? What about process.nextTick()?
process.nextTick() & infinite loop
process.nextTick() will invoke a callback at the end of the current operation, before the next event loop tick starts.
It can be used in some cases, but the problem is:
- it prevents the event loop from continuing its cycle until your callback is finished
- it allows you to block every I/O by making recursive process.nextTick() calls. It’s not technically an infinite loop but it produces the same effect, like a bad recursive function without a termination condition.
Recursion has something to do with infinity
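The starvation effect can be shown with a bounded version of the recursion. In this sketch the `spin` helper is hypothetical and the depth of 100,000 is arbitrary; an unbounded version would keep the timer from ever firing.

```javascript
const order = [];

// A timer that is due immediately.
setTimeout(() => order.push('timer'), 0);

// Hypothetical recursive helper: each callback schedules the next one
// with process.nextTick, so the nextTick queue never empties until
// `remaining` reaches 0.
function spin(remaining) {
  if (remaining === 0) {
    order.push('nextTick chain done');
    return;
  }
  process.nextTick(spin, remaining - 1);
}

spin(100000);

// The timer can only run once the whole nextTick chain has finished.
setTimeout(() => console.log(order), 50);
```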
Sync operations
This is no surprise: synchronous operations in Node.js are bad practice. If you have read this whole article it should be obvious to you! Every time you use them, you block your entire application until the operation is finished. Node.js cannot use the thread pool or async APIs, and the Event-loop activity is suspended.
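A small sketch contrasting the two styles with the fs module. The temporary file is an assumption made so the example is self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical temp file so the sketch is self-contained.
const file = path.join(os.tmpdir(), `sync-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello');

const order = [];

// Asynchronous read: the call returns at once, the result arrives later.
const asyncRead = fs.promises.readFile(file, 'utf8').then((text) => {
  order.push('async read done');
  return text;
});
order.push('after async call');

// Synchronous read: nothing else runs until the data is returned.
const syncText = fs.readFileSync(file, 'utf8');
order.push('sync read done');

asyncRead.then(() => console.log(order));
```

On a tiny file the difference is invisible; on a large file or a slow disk, the synchronous call is where every other request would wait.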
How to create an infinite event loop (your program will never exit)
Let’s say you want to create a simple program that needs to exit after a simple task is finished, like a worker or a simple script. Those programs are supposed to stop in any case, and very quickly. But you can create a situation where the Event-loop never exits. Do you remember the first diagram? There is a ref, which is a simple counter of all pending tasks in the Event-loop. If this ref is greater than 0, the program will not exit and Node.js will check every pending task. When a task is finished, the ref is decremented. So your program can only exit once all the tasks are finished, that is, once the ref is equal to 0.
setInterval
Timers are the best example! If you introduce a simple setInterval inside a script and never clear it, the timer will run forever, and so will your program.
To avoid this, you can:
- clear all your timers when they are no longer useful
- use process.exit(), process.abort() or process.kill()
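A minimal sketch of the first option. The interval of 10 ms and the limit of three ticks are arbitrary choices for the example.

```javascript
let ticks = 0;

const timer = setInterval(() => {
  ticks++;
  console.log(`tick ${ticks}`);
  if (ticks === 3) {
    // Without this call the interval keeps the process alive forever.
    clearInterval(timer);
  }
}, 10);
```

Another option for background timers is `timer.unref()`, which tells Node.js not to keep the process alive just for that timer.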
Event listeners (no problem)
An event listener can be seen as a background task that goes on forever until you clean it up. We could assume it increments the ref counter of the Event-loop and so creates a kind of infinite loop. But that is not the case, even if you forget to remove your handlers.
Even though you don’t block the Event-loop with an EventEmitter, it’s always a best practice to clean up your listeners. You can use the removeListener or removeAllListeners methods.
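Cleaning up a listener can be sketched like this. The event name `data` and the handler are illustrative.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const received = [];

function onData(value) {
  received.push(value);
}

emitter.on('data', onData);
emitter.emit('data', 1);

// Remove the handler once it is no longer needed.
emitter.removeListener('data', onData);
emitter.emit('data', 2); // no listener left, nothing is recorded

console.log(received, emitter.listenerCount('data'));
```

The script exits on its own even if `removeListener` is never called, since a registered listener does not keep the process alive by itself.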
Monitoring
Modules
Some tools can help you inspect the Event-loop state and visualize its behavior:
- wtfnode is a simple module that generates a “dump” of the Event-loop: https://www.npmjs.com/package/wtfnode
- you can directly use the internal methods process._getActiveRequests() and process._getActiveHandles(), which will give you the raw data about tasks inside your Event-loop
- Clinic.js can also provide some valuable data
APM
Some APM solutions provide information about the Event-loop and its latency. This can be useful to detect an instance in a bad state.
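For a rough idea of what such a latency metric looks like, Node.js ships `monitorEventLoopDelay` in the `perf_hooks` module. The sketch below is not what any particular APM does; the `blockFor` helper, the 10 ms resolution and the 100 ms block are assumptions chosen for the example.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample the Event-loop delay every 10 ms.
const histogram = monitorEventLoopDelay({ resolution: 10 });
histogram.enable();

// Hypothetical helper that simulates a blocking task.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* busy wait */ }
}

let report = null;
setTimeout(() => {
  blockFor(100);
  setTimeout(() => {
    histogram.disable();
    // The histogram reports nanoseconds; convert to milliseconds.
    report = { meanMs: histogram.mean / 1e6, maxMs: histogram.max / 1e6 };
    console.log(report);
  }, 50);
}, 50);
```

A maximum far above the mean suggests that something held the Event-loop for a long stretch.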
Some of them also display information about the garbage collector, which is another key concept for understanding Node.js and debugging your application. If you want to learn more about it you can read my article about GC.
Sources
Translated from: https://medium.com/voodoo-engineering/node-js-lots-of-ways-to-block-your-event-loop-and-how-to-avoid-it-b41f41deecf5