How, in general, does Node.js handle 10,000 concurrent requests?

I understand that Node.js uses a single thread and an event loop to process requests, processing only one at a time (which is non-blocking). But still, how does that work for, let's say, 10,000 concurrent requests? Will the event loop process all the requests? Wouldn't that take too long?

I cannot understand (yet) how it can be faster than a multi-threaded web server. I understand that a multi-threaded web server will be more expensive in resources (memory, CPU), but wouldn't it still be faster? I am probably wrong; please explain how this single thread is faster with lots of requests, and what it typically does (at a high level) when servicing lots of requests, like 10,000.

Also, will that single thread scale well with that large a load? Please bear in mind that I am just starting to learn Node.js.


Reply #1

Reference: https://stackoom.com/question/2MFT6/通常-Node-js如何处理-个并发请求


Reply #2

You seem to be thinking that most of the processing is handled in the Node event loop. Node actually farms the I/O work out, either to the operating system's non-blocking I/O facilities or to a small pool of threads. I/O operations typically take orders of magnitude longer than CPU operations, so why have the CPU wait for them? Besides, the OS can already handle I/O tasks very well. In fact, because Node does not wait around, it achieves much higher CPU utilisation.

By way of analogy, think of Node.js as a waiter taking customer orders while the I/O chefs prepare them in the kitchen. Other systems have multiple chefs, each of whom takes a customer's order, prepares the meal, clears the table and only then attends to the next customer.


Reply #3

If you have to ask this question then you're probably unfamiliar with what most web applications/services do. You're probably thinking that all software does this:

user does an action
       │
       v
 application starts processing action
   └──> loop ...
          └──> busy processing
 end loop
   └──> send result to user

However, this is not how web applications, or indeed any application with a database as the back end, work. Web apps do this:

user does an action
       │
       v
 application starts processing action
   └──> make database request
          └──> do nothing until request completes
 request complete
   └──> send result to user

In this scenario, the software spends most of its running time using 0% CPU, waiting for the database to return.

Multithreaded network app:

Multithreaded network apps handle the above workload like this:

request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request
request ──> spawn thread
              └──> wait for database request
                     └──> answer request

So the threads spend most of their time using 0% CPU, waiting for the database to return data. While doing so, they have had to allocate the memory required for a thread, which includes a completely separate program stack for each thread, etc. Also, they have had to start a thread, which, while not as expensive as starting a full process, is still not exactly cheap.

Singlethreaded event loop

Since we spend most of our time using 0% CPU, why not run some code when we're not using the CPU? That way, each request will still get the same amount of CPU time as in a multithreaded application, but we don't need to start a thread. So we do this:

request ──> make database request
request ──> make database request
request ──> make database request
database request complete ──> send response
database request complete ──> send response
database request complete ──> send response

In practice, both approaches return data with roughly the same latency, since it's the database response time that dominates the processing.

The main advantage here is that we don't need to spawn a new thread, so we don't need to do lots and lots of malloc, which would slow us down.
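A minimal sketch of the single-threaded flow, assuming a database call that takes about 100 ms. `fakeDbQuery` is a made-up stand-in built on a timer, not a real driver, and `handleRequest` is likewise hypothetical.

```javascript
// Hypothetical stand-in for a database driver: resolves after `ms` milliseconds.
function fakeDbQuery(id, ms) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`row for request ${id}`), ms);
  });
}

// One request: issue the query, then build the response when it completes.
async function handleRequest(id) {
  const row = await fakeDbQuery(id, 100); // the thread is free while we wait
  return `response ${id}: ${row}`;
}

async function main() {
  const started = Date.now();
  // Start all three requests before any of them has finished.
  const pending = [1, 2, 3].map((id) => handleRequest(id));
  const responses = await Promise.all(pending);
  console.log(responses);
  console.log(`3 requests took about ${Date.now() - started} ms in total`);
}

main();
```

Because the three waits overlap, the total should be close to the time of one query rather than three times that, even though only one thread runs JavaScript.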

Magic, invisible threading

The seemingly mysterious thing is how both of the approaches above manage to run the workload in "parallel". The answer is that the database is threaded. So our single-threaded app is actually leveraging the multi-threaded behaviour of another process: the database.

Where singlethreaded approach fails

A singlethreaded app fails big if you need to do lots of CPU calculations before returning the data. Now, I don't mean a for loop processing the database result. That's still mostly O(n). What I mean is things like doing a Fourier transform (mp3 encoding, for example), ray tracing (3D rendering), etc.

Another pitfall of singlethreaded apps is that they will only utilise a single CPU core. So if you have a quad-core server (not uncommon nowadays) you're not using the other 3 cores.
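The CPU-bound failure mode is easy to reproduce. The sketch below uses a naive recursive Fibonacci as a stand-in for heavy computation; the function and the input size are arbitrary choices for illustration.

```javascript
// Deliberately CPU-heavy: naive recursive Fibonacci.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

const scheduledAt = Date.now();

// Ask for a callback as soon as possible.
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - scheduledAt} ms`);
}, 0);

// This computation occupies the only JavaScript thread, so the timer
// callback above cannot run until it returns.
const result = fib(30);
console.log(`fib(30) = ${result}`);
```

The timer message is printed only after the Fibonacci result, however short the requested delay. In a server, every other request would be stuck behind the computation in the same way.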

Where multithreaded approach fails

A multithreaded app fails big if you need to allocate lots of RAM per thread. First, the RAM usage itself means you can't handle as many requests as a singlethreaded app. Worse, malloc is slow. Allocating lots and lots of objects (which is common for modern web frameworks) means we can potentially end up being slower than singlethreaded apps. This is where Node.js usually wins.

One use case that ends up making multithreading worse is when you need to run another scripting language in your thread. First you usually need to malloc the entire runtime for that language, then you need to malloc the variables used by your script.

So if you're writing network apps in C or Go or Java then the overhead of threading will usually not be too bad. If you're writing a C web server to serve PHP or Ruby then it's very easy to write a faster server in JavaScript or Ruby or Python.

Hybrid approach

Some web servers use a hybrid approach. Nginx and Apache2, for example, implement their network processing code as a thread pool of event loops. Each thread runs an event loop, processing its own requests single-threaded, while requests are load-balanced among the threads.

Some single-threaded architectures also use a hybrid approach. Instead of launching multiple threads from a single process, you can launch multiple applications - for example, 4 Node.js servers on a quad-core machine. Then you use a load balancer to spread the workload amongst the processes.

In effect, the two approaches are technically identical mirror images of each other.
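One way to get the multi-process variant without an external load balancer is Node's built-in `cluster` module, which is a different mechanism from the separate load balancer described above. The sketch below assumes Node.js 16 or later (for `cluster.isPrimary`); `startCluster` and `PORT` are names and values chosen for this example.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

// Arbitrary port for this sketch.
const PORT = 8000;

// Hypothetical entry point: one worker process per CPU core by default.
function startCluster(workerCount = os.cpus().length) {
  if (cluster.isPrimary) {
    // The primary process only supervises the workers.
    for (let i = 0; i < workerCount; i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited`);
    });
  } else {
    // Each worker is an ordinary single-threaded Node.js server.
    http
      .createServer((req, res) => {
        res.end(`handled by process ${process.pid}\n`);
      })
      .listen(PORT);
  }
}

// Call startCluster() from the program's entry point to launch the workers.
```

The workers all listen on the same port, and incoming connections are spread across them, so each core runs its own event loop.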


Reply #4

"I understand that Node.js uses a single thread and an event loop to process requests, processing only one at a time (which is non-blocking)."

I could be misunderstanding what you've said here, but "one at a time" sounds like you may not fully understand the event-based architecture.

In a "conventional" (non event-driven) application architecture, the process spends a lot of time sitting around waiting for something to happen. In an event-based architecture such as Node.js, the process doesn't just wait; it can get on with other work.

For example: you get a connection from a client, you accept it, you read the request headers (in the case of HTTP), then you start to act on the request. You might read the request body, and you will generally end up sending some data back to the client (this is a deliberate simplification of the procedure, just to demonstrate the point).

At each of these stages, most of the time is spent waiting for some data to arrive from the other end - the actual time spent processing in the main JS thread is usually fairly minimal.

When the state of an I/O object (such as a network connection) changes such that it needs processing (e.g. data is received on a socket, a socket becomes writable, etc.), the main Node.js JS thread is woken with a list of items needing to be processed.

It finds the relevant data structure and emits some event on that structure, which causes callbacks to be run; these process the incoming data, or write more data to a socket, etc. Once all of the I/O objects in need of processing have been processed, the main Node.js JS thread will wait again until it's told that more data is available (or some other operation has completed or timed out).

The next time that it is woken, it could well be due to a different I/O object needing to be processed - for example, a different network connection. Each time, the relevant callbacks are run and then it goes back to sleep, waiting for something else to happen.

The important point is that the processing of different requests is interleaved; it doesn't process one request from start to end and then move on to the next.
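The interleaving can be shown with a toy model. This is not Node's actual event loop, only a synchronous simulation: the list of ready events, the three-step request and all names here are invented for the illustration.

```javascript
// Toy model: a list of I/O events, handled one callback at a time.
const log = [];

// In this model each request needs three events before it is finished.
const steps = ['headers read', 'body read', 'response written'];
const progress = new Map(); // connection id -> number of steps completed

// The callback run when a connection has something ready to process.
function onEvent(connId) {
  const done = progress.get(connId) ?? 0;
  log.push(`conn ${connId}: ${steps[done]}`);
  progress.set(connId, done + 1);
}

// Events arrive in whatever order the network delivers them.
const readyEvents = ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'C'];

// The "event loop": take the next ready event, run its callback, repeat.
for (const connId of readyEvents) onEvent(connId);

console.log(log.join('\n'));
```

Connection A finishes on the sixth event, while B and C are still part-way through: no request is handled from start to end in one go, yet only one callback ever runs at a time.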

To my mind, the main advantage of this is that a slow request (e.g. you're trying to send 1MB of response data to a mobile phone over a 2G data connection, or you're doing a really slow database query) won't block faster ones.

In a conventional multi-threaded web server, you will typically have a thread for each request being handled, and it will process ONLY that request until it's finished. What happens if you have a lot of slow requests? You end up with a lot of your threads hanging around processing these requests, and other requests (which might be very simple requests that could be handled very quickly) get queued behind them.

There are plenty of other event-based systems apart from Node.js, and they tend to have similar advantages and disadvantages compared with the conventional model.

I wouldn't claim that event-based systems are faster in every situation or with every workload - they tend to work well for I/O-bound workloads, not so well for CPU-bound ones.


Reply #5

Adding to slebetman's answer: when you say Node.js can handle 10,000 concurrent requests, they are essentially non-blocking requests, i.e. these requests mostly pertain to database queries.

Internally, the event loop of Node.js manages a thread pool, where each thread handles a non-blocking request, and the event loop continues to listen for more requests after delegating work to one of the threads of the thread pool. When one of the threads completes its work, it sends a signal to the event loop that it has finished, a.k.a. a callback. The event loop then processes this callback and sends the response back.
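The delegation described above can be observed with an API that is documented to use the internal thread pool, such as the asynchronous form of `crypto.pbkdf2`. The password, salt and iteration count below are arbitrary, and the `events` array exists only to record the sequence.

```javascript
const crypto = require('node:crypto');

// Records the sequence in which things happen (illustrative only).
const events = [];

// The asynchronous pbkdf2 does its hashing on a worker thread from the
// internal pool; the callback runs on the main thread when it is done.
crypto.pbkdf2('secret', 'salt', 10000, 32, 'sha256', (err, key) => {
  if (err) throw err;
  events.push(`hash ready (${key.length} bytes)`);
  console.log(events.join(' -> '));
});

// The main thread carries on immediately instead of waiting for the hash.
events.push('event loop keeps going');
```

Note that ordinary network I/O generally does not need a pool thread at all; the pool is used for work such as file system access and this kind of hashing.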

As you are new to Node.js, do read more about nextTick to understand how the event loop works internally. Read the blogs on http://javascriptissexy.com; they were really helpful for me when I started with JavaScript/Node.js.
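As a starting point for that reading, here is a small sketch of where `process.nextTick` callbacks fit relative to synchronous code and timers. The `seen` array is just a recorder for this example.

```javascript
// Records the order in which the three pieces of code run.
const seen = [];

// A timer callback runs on a later pass of the event loop.
setTimeout(() => {
  seen.push('timeout');
  console.log(seen.join(' -> '));
}, 0);

// A nextTick callback runs right after the current synchronous code
// finishes, before the event loop moves on to timers.
process.nextTick(() => seen.push('nextTick'));

seen.push('synchronous code');
```

This prints `synchronous code -> nextTick -> timeout`, even though the timer was registered first.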


Reply #6

Single Threaded Event Loop Model Processing Steps:

  • Clients send requests to the Web Server.

  • The Node.js Web Server internally maintains a limited thread pool to provide services to the client requests.

  • The Node.js Web Server receives those requests and places them into a queue. It is known as the "Event Queue".

  • The Node.js Web Server internally has a component known as the "Event Loop". It got this name because it uses an indefinite loop to receive requests and process them.

  • The Event Loop uses a single thread only. It is the heart of the Node.js platform processing model.

  • The Event Loop checks whether any client request has been placed in the Event Queue. If not, it waits for incoming requests indefinitely.

  • If yes, it picks up one client request from the Event Queue:

    1. It starts processing that client request.
    2. If that client request does not require any blocking IO operations, it processes everything, prepares the response and sends it back to the client.
    3. If that client request requires some blocking IO operations, such as interacting with a database, the file system or external services, it follows a different approach:
       • It checks thread availability in the internal thread pool.
       • It picks up one thread and assigns the client request to that thread.
       • That thread is responsible for taking the request, processing it, performing the blocking IO operations, preparing the response and sending it back to the Event Loop.

    Very nicely explained by @Rambabu Posa; for more explanation, go through this link.
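The steps above can be sketched with the `worker_threads` module. This is a loose illustration under several assumptions: the request shape `{ id, needsBlockingWork }` and the `handle` function are invented, a busy loop stands in for blocking work, and a new thread is started per request for brevity, whereas the pool described above is managed internally and reused.

```javascript
const { Worker } = require('node:worker_threads');

// Source of the code run on the worker thread (a stand-in for blocking work).
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 0; i < 1e7; i++) sum += i;
  parentPort.postMessage('request ' + workerData.id + ': answered via a worker thread');
`;

// Hypothetical handler: answer directly, or delegate the blocking part.
function handle(request, reply) {
  if (!request.needsBlockingWork) {
    reply(`request ${request.id}: answered on the event loop`);
    return;
  }
  const worker = new Worker(workerSource, {
    eval: true,
    workerData: { id: request.id },
  });
  worker.once('message', reply);
  worker.once('error', (err) => reply(`request ${request.id}: failed (${err.message})`));
}

const replies = [];
const eventQueue = [
  { id: 1, needsBlockingWork: true },
  { id: 2, needsBlockingWork: false },
];

// The "event loop": pick up each queued request in turn.
for (const request of eventQueue) {
  handle(request, (text) => {
    replies.push(text);
    console.log(text);
  });
}
```

Request 2 is answered first even though request 1 was queued ahead of it, because request 1's blocking part runs on another thread and reports back later.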
