[Translated repost] Source engine: latency compensating methods in client/server in-game protocol design and optimization [Part 1]

Latency Compensating Methods in Client/Server In-game Protocol Design and Optimization


1 Overview (this part)
2 Basic Architecture of a Client / Server Game (this part)
3 Contents of the User Input messages (this part)
4 Client Side Prediction (this part)
5 Client-Side Prediction of Weapon Firing
6 Umm, This is a Lot of Work
7 Display of Targets
8 Lag Compensation
9 Game Design Implications of Lag Compensation
10 Conclusion
11 Footnotes (placed at the end of each part)

Overview

Paragraph 1

Designing first-person action games for Internet play is a challenging process. Having robust on-line gameplay in your action title, however, is becoming essential to the success and longevity of the title. In addition, the PC space is well known for requiring developers to support a wide variety of customer setups. Often, customers are running on less than state-of-the-art hardware. The same holds true for their network connections.


Paragraph 2

While broadband has been held out as a panacea for all of the current woes of on-line gaming, broadband is not a simple solution allowing developers to ignore the implications of latency and other network factors in game designs. It will be some time before broadband truly becomes adopted in the United States, and much longer before it can be assumed to exist for your clients in the rest of the world. In addition, there are a lot of poor broadband solutions, where users may occasionally have high bandwidth, but more often than not also have significant latency and packet loss in their connections.

[Translator's note] This article is fairly old, so the remark above about broadband adoption can be disregarded.

Paragraph 3

Your game must behave well in this world. This discussion will give you a sense of some of the tradeoffs required to deliver a cutting-edge action experience on the Internet. The discussion will provide some background on how client / server architectures work in many on-line action games. In addition, the discussion will show how predictive modeling can be used to mask the effects of latency. Finally, the discussion will describe a specific mechanism, lag compensation, for allowing the game to compensate for connection quality.


Basic Architecture of a Client / Server Game

Paragraph 1

Most action games played on the net today are modified client / server games. Games such as Half-Life, including its mods such as Counter-Strike and Team Fortress Classic, operate on such a system, as do games based on the Quake3 engine and the Unreal Tournament engine. In these games, there is a single, authoritative server that is responsible for running the main game logic. To this are connected one or more “dumb” clients. These clients, initially, were nothing more than a way for the user input to be sampled and forwarded to the server for execution. The server would execute the input commands, move around other objects, and then send back to the client a list of objects to render. Of course, the real world system has more components to it, but the simplified breakdown is useful for thinking about prediction and lag compensation.


Paragraph 2

With this in mind, the typical client / server game engine architecture generally looks like this:
[Figure: typical client / server game engine architecture]

For this discussion, all of the messaging and coordination needed to start up the connection between client and server is omitted. The client’s frame loop looks something like the following:
1. Sample clock to find start time
2. Sample user input (mouse, keyboard, joystick)
3. Package up and send movement command using simulation time
4. Read any packets from the server from the network system
5. Use packets to determine visible objects and their state
6. Render Scene
7. Sample clock to find end time
8. End time minus start time is the simulation time for the next frame


Each time the client makes a full pass through this loop, the “frametime” is used for determining how much simulation is needed on the next frame. If your framerate is totally constant then frametime will be a correct measure. Otherwise, the frametimes will be incorrect, but there isn’t really a solution to this (unless you could deterministically figure out exactly how long it was going to take to run the next frame loop iteration before running it…).
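
The client loop above can be sketched in C as follows. This is only an illustration under stated assumptions, not engine code: `client_loop_t`, `client_frame`, the `do_work` callback and the scripted `fake_now` clock are hypothetical names introduced here, and steps 2 through 6 are collapsed into a single callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clock callback: returns the current time in milliseconds. */
typedef double (*clock_fn)(void *ctx);

typedef struct {
    double frametime_ms;   /* simulation time to use for the next frame */
    int    frames_run;
} client_loop_t;

/* One full pass through the client loop. */
static void client_frame(client_loop_t *loop, clock_fn now, void *ctx,
                         void (*do_work)(double msec, void *ctx))
{
    assert(loop != NULL && now != NULL);
    double start = now(ctx);               /* 1. sample clock for start time */
    if (do_work != NULL)
        do_work(loop->frametime_ms, ctx);  /* 2-6. input, send, read, render */
    double end = now(ctx);                 /* 7. sample clock for end time */
    loop->frametime_ms = end - start;      /* 8. simulation time for next frame */
    loop->frames_run++;
}

/* Demonstration clock: every call advances the time by a fixed step. */
typedef struct { double t; double step; } fake_clock_t;

static double fake_now(void *ctx)
{
    fake_clock_t *c = (fake_clock_t *)ctx;
    double t = c->t;
    c->t += c->step;
    return t;
}
```

Note that the frametime measured on one pass is only applied on the next pass, which is why it is exact only when the framerate is constant.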

The server has a somewhat similar loop:
1. Sample clock to find start time
2. Read client user input messages from network
3. Execute client user input messages
4. Simulate server-controlled objects using simulation time from last full pass
5. For each connected client, package up visible objects/world state and send to client
6. Sample clock to find end time
7. End time minus start time is the simulation time for the next frame


In this model, non-player objects run purely on the server, while player objects drive their movements based on incoming packets. Of course, this is not the only possible way to accomplish this task, but it does make sense.
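
As a rough sketch of steps 3 and 4 of the server loop, the fragment below moves a player from its received commands and moves a server-controlled object using the previous pass's frametime. The reduced `cmd_t` and `entity_t` types, the one-dimensional position and both function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float forwardmove; unsigned char msec; } cmd_t; /* reduced user command */
typedef struct { float x; } entity_t;                            /* 1-D position only */

/* Step 3: run each received command for its own duration (msec). */
static void execute_commands(entity_t *player, const cmd_t *cmds, size_t n)
{
    assert(player != NULL);
    for (size_t i = 0; i < n; i++)
        player->x += cmds[i].forwardmove * (cmds[i].msec / 1000.0f);
}

/* Step 4: server-controlled objects advance by the last pass's frametime. */
static void simulate_npc(entity_t *npc, float speed, float frametime_ms)
{
    assert(npc != NULL);
    npc->x += speed * (frametime_ms / 1000.0f);
}
```

The point of the split is that player movement is driven by the time recorded in each incoming command, while everything else is driven by the server's own clock.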


Contents of the User Input messages

Paragraph 1

In Half-Life engine games, the user input message format is quite simple and is encapsulated in a data structure containing just a few essential fields:


typedef struct usercmd_s
{
    // Interpolation time on client
    // (translator's note: see the multiplayer network synchronization model for interpolation time)
    short       lerp_msec;
    // Duration in ms of command
    byte        msec;
    // Command view angles.
    vec3_t  viewangles;
    // intended velocities
    // Forward velocity.
    float       forwardmove;
    // Sideways velocity.
    float       sidemove;
    // Upward velocity.
    float       upmove;
    // Attack buttons
    unsigned short buttons;
    // Additional fields omitted...
} usercmd_t;

The critical fields here are msec, viewangles, forwardmove, sidemove, upmove, and buttons. The msec field corresponds to the number of milliseconds of simulation that the command corresponds to (it’s the frametime). The viewangles field is a vector representing the direction the player was looking during the frame. The forwardmove, sidemove, and upmove fields are the impulses determined by examining the keyboard, mouse, and joystick to see if any movement keys were held down. Finally, the buttons field is just a bit field with one or more bits set for each button that is being held down.
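
To make the field descriptions concrete, here is a small sketch of filling such a command from one frame of sampled input. Everything in it is an assumption for illustration: `cmd_sketch_t` is a reduced stand-in for `usercmd_t`, the `BTN_*` bit values are invented (the engine defines its own), and clamping long frames to 255 ms is simply one way to fit the frametime into a one-byte `msec` field.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative button bits; the real values are defined by the engine. */
enum { BTN_ATTACK = 1 << 0, BTN_JUMP = 1 << 1, BTN_DUCK = 1 << 2 };

typedef struct {
    uint8_t  msec;                          /* duration of the command in ms */
    float    viewangles[3];                 /* direction the player looked */
    float    forwardmove, sidemove, upmove; /* intended velocities */
    uint16_t buttons;                       /* one bit per held button */
} cmd_sketch_t;

/* Build a command from one frame of sampled input. */
static cmd_sketch_t make_cmd(double frametime_ms, const float angles[3],
                             float fwd, float side, float up, uint16_t held)
{
    cmd_sketch_t c;
    assert(frametime_ms >= 0.0);
    /* msec is a single byte, so this sketch clamps long frames to 255 ms. */
    c.msec = (uint8_t)(frametime_ms > 255.0 ? 255.0 : frametime_ms);
    for (int i = 0; i < 3; i++)
        c.viewangles[i] = angles[i];
    c.forwardmove = fwd;
    c.sidemove = side;
    c.upmove = up;
    c.buttons = held;
    return c;
}
```

A held button is then tested with a mask, for example `(cmd.buttons & BTN_ATTACK) != 0`.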


Paragraph 2

Using the above data structures and client / server architecture, the core of the simulation is as follows. First, the client creates and sends a user command to the server. The server then executes the user command and sends updated positions of everything back to client. Finally, the client renders the scene with all of these objects. This core, though quite simple, does not react well under real world situations, where users can experience significant amounts of latency in their Internet connections. The main problem is that the client truly is “dumb” and all it does is the simple task of sampling movement inputs and waiting for the server to tell it the results. If the client has 500 milliseconds of latency in its connection to the server, then it will take 500 milliseconds for any client actions to be acknowledged by the server and for the results to be perceptible on the client. While this round trip delay may be acceptable on a Local Area Network (LAN), it is not acceptable on the Internet.


Client Side Prediction

Paragraph 1

One method for ameliorating this problem is to perform the client’s movement locally and just assume, temporarily, that the server will accept and acknowledge the client commands directly. This method is labeled as client-side prediction.


Paragraph 2

Client-side prediction of movements requires us to let go of the “dumb” or minimal client principle. That’s not to say that the client is fully in control of its simulation, as in a peer-to-peer game with no central server. There still is an authoritative server running the simulation just as noted above. Having an authoritative server means that even if the client simulates different results than the server, the server’s results will eventually correct the client’s incorrect simulation. Because of the latency in the connection, the correction might not occur until a full round trip’s worth of time has passed. The downside is that this can cause a very perceptible shift in the player’s position due to the fixing up of the prediction error that occurred in the past.


Paragraph 3

To implement client-side prediction of movement, the following general procedure is used. As before, client inputs are sampled and a user command is generated. Also as before, this user command is sent off to the server. However, each user command (and the exact time it was generated) is stored on the client. The prediction algorithm uses these stored commands.


Paragraph 4

For prediction, the last acknowledged movement from the server is used as a starting point. The acknowledgement indicates which user command was last acted upon by the server and also tells us the exact position (and other state data) of the player after that movement command was simulated on the server. The last acknowledged command will be somewhere in the past if there is any lag in the connection. For instance, if the client is running at 50 frames per second (fps) and has 100 milliseconds of latency (roundtrip), then the client will have stored up five user commands ahead of the last one acknowledged by the server. These five user commands are simulated on the client as a part of client-side prediction. Assuming full prediction (note 1), the client will want to start with the latest data from the server, and then run the five user commands through “similar logic” to what the server uses for simulation of client movement. Running these commands should produce an accurate final state on the client (final player position is most important) that can be used to determine from what position to render the scene during the current frame.
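
The arithmetic behind the "five user commands" figure can be written out as a one-line helper. The function name `commands_in_flight` is hypothetical, and the sketch assumes a steady framerate with one command generated per frame.

```c
#include <assert.h>

/* Commands generated during one round trip:
   frames per second times round-trip latency in seconds.
   Integer division rounds down to whole commands. */
static int commands_in_flight(int fps, int latency_ms)
{
    assert(fps >= 0 && latency_ms >= 0);
    return (fps * latency_ms) / 1000;
}
```

With the numbers from the text, `commands_in_flight(50, 100)` gives 5: a command is produced every 20 ms, and 100 ms pass before any of them is acknowledged.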


Paragraph 5

In Half-Life, minimizing discrepancies between client and server in the prediction logic is accomplished by sharing the identical movement code for players in both the server-side game code and the client-side game code. These are the routines in the pm_shared/ (which stands for “player movement shared”) folder of the HL SDK. The input to the shared routines is encapsulated by the user command and a “from” player state. The output is the new player state after issuing the user command. The general algorithm on the client is as follows:

"from state" <- state after last user command acknowledged by the server;

"command" <- first command after last user command acknowledged by server;

while (true)
{
    run "command" on "from state" to generate "to state";
    if (this was the most up to date "command")
        break;

    "from state" = "to state";
    "command" = next "command";
}

Paragraph 6

The origin and other state info in the final “to state” is the prediction result and is used for rendering the scene that frame. The portion where the command is run is simply the portion where all of the player state data is copied into the shared data structure, the user command is processed (by executing the common code in the pm_shared routines in Half-Life’s case), and the resulting data is copied back out to the “to state”.
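
The prediction procedure described here might look like the following in C. This is a sketch under assumptions, not the Half-Life code: `run_command` is a toy stand-in for the shared movement routines, the state is one-dimensional, and `cmd_t`, `pstate_t` and `predict` are names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float forwardmove; unsigned char msec; } cmd_t; /* reduced user command */
typedef struct { float origin; float velocity; } pstate_t;       /* 1-D for brevity */

/* Stand-in for the shared movement code: the result depends only on
   the "from" state and the command, so client and server agree. */
static pstate_t run_command(pstate_t from, cmd_t cmd)
{
    pstate_t to = from;
    to.velocity = cmd.forwardmove;
    to.origin  += to.velocity * (cmd.msec / 1000.0f);
    return to;
}

/* Replay every unacknowledged command on top of the last acknowledged state. */
static pstate_t predict(pstate_t acked, const cmd_t *pending, size_t count)
{
    assert(pending != NULL || count == 0);
    pstate_t from = acked;
    for (size_t i = 0; i < count; i++)
        from = run_command(from, pending[i]); /* "to state" becomes the next "from state" */
    return from;                              /* final state used for rendering */
}
```

Because the replay always starts from the server's acknowledged state, any earlier misprediction is discarded as soon as a newer acknowledgement arrives.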


Paragraph 7

There are a few important caveats to this system. First, you’ll notice that, depending upon the client’s latency and how fast the client is generating user commands (i.e., the client’s framerate), the client will most often end up running the same commands over and over again until they are finally acknowledged by the server and dropped from the list (a sliding window in Half-Life’s case) of commands yet to be acknowledged. The first consideration is how to handle any sound effects and visual effects that are created in the shared code. Because commands can be run over and over again, it’s important not to create footstep sounds, etc. multiple times as the old commands are re-run to update the predicted position. In addition, it’s important for the server not to send the client effects that are already being predicted on the client. However, the client still must re-run the old commands or else there will be no way for the server to correct any erroneous prediction by the client. The solution to this problem is easy: the client just marks those commands which have not been predicted yet on the client and only plays effects if the user command is being run for the first time on the client.
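
The "mark commands that have not been predicted yet" idea can be sketched with a per-command flag. The `pending_cmd_t` type, its `predicted_once` field and the `rerun_pending` function are hypothetical; the sketch counts effects instead of playing sounds so that the behaviour is easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int  sequence;        /* command number */
    bool predicted_once;  /* set after the first local run */
} pending_cmd_t;

/* Re-run all pending commands; return how many effects were triggered. */
static int rerun_pending(pending_cmd_t *cmds, size_t count)
{
    int effects_played = 0;
    assert(cmds != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        bool first_time = !cmds[i].predicted_once;
        /* Movement would be simulated here on every pass... */
        if (first_time)
            effects_played++;   /* ...but sounds and effects fire only once. */
        cmds[i].predicted_once = true;
    }
    return effects_played;
}
```

Re-running the same window on a later frame yields zero new effects for the old commands and one for each newly added command.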


Paragraph 8

The other caveat is with respect to state data that exists solely on the client and is not part of the authoritative update data from the server. If you don’t have any of this type of data, then you can simply use the last acknowledged state from the server as a starting point, and run the prediction user commands “in-place” on that data to arrive at a final state (which includes your position for rendering). In this case, you don’t need to keep all of the intermediate results along the route for predicting from the last acknowledged state to the current time. However, if you are doing any logic totally client side (this logic could include functionality such as determining where the eye position is when you are in the process of crouching—and it’s not really totally client side since the server still simulates this data also) that affects fields that are not replicated from the server to the client by the networking layer handling the player’s state info, then you will need to store the intermediate results of prediction. This can be done with a sliding window, where the “from state” is at the start and then each time you run a user command through prediction, you fill in the next state in the window. When the server finally acknowledges receiving one or more commands that had been predicted, it is a simple matter of looking up which state the server is acknowledging and copying over the data that is totally client side to the new starting or “from state”.
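
One possible shape for the sliding window of intermediate states is shown below. It is a sketch only: the window size of 64, the two-field `pstate_t`, and the choice of `view_offset` as the client-only field are assumptions made for the example, as are the function names.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW 64  /* assumed size; must exceed the number of commands in flight */

typedef struct {
    float origin;       /* replicated from the server */
    float view_offset;  /* client-only in this sketch (e.g. eye height while crouching) */
} pstate_t;

typedef struct { pstate_t states[WINDOW]; } predict_window_t;

/* Store the state produced by predicting command number seq. */
static void window_store(predict_window_t *w, unsigned seq, pstate_t s)
{
    assert(w != NULL);
    w->states[seq % WINDOW] = s;
}

/* Build the new "from state" when the server acknowledges command seq:
   take the server's authoritative fields, keep the client-only ones. */
static pstate_t window_ack(const predict_window_t *w, unsigned seq,
                           pstate_t from_server)
{
    assert(w != NULL);
    pstate_t from = from_server;
    from.view_offset = w->states[seq % WINDOW].view_offset;
    return from;
}
```

Indexing by `seq % WINDOW` lets old slots be overwritten automatically once their commands have been acknowledged.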


Paragraph 9

So far, the above procedure describes how to accomplish client side prediction of movements. This system is similar to the system used in QuakeWorld.


Footnotes

Note 1 (full prediction): In the Half-Life engine, it is possible to ask the client-side prediction algorithm to account for some, but not all, of the latency in performing prediction. The user could control the amount of prediction by changing the value of the “pushlatency” console variable to the engine. This variable is a negative number indicating the maximum number of milliseconds of prediction to perform. If the number is greater (in the negative) than the user’s current latency, then full prediction up to the current time occurs. In this case, the user feels zero latency in his or her movements. Based upon some erroneous superstition in the community, many users insisted that setting pushlatency to minus one-half of the current average latency was the proper setting. Of course, this would still leave the player’s movements lagged (often described as if you are moving around on ice skates) by half of the user’s latency. All of this confusion has brought us to the conclusion that full prediction should occur all of the time and that the pushlatency variable should be removed from the Half-Life engine.
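
The effect of the pushlatency setting described in this note can be modelled with two small helpers. This is a simplified model of the description above, not the engine's implementation; `predicted_ms` and `residual_lag_ms` are hypothetical names.

```c
#include <assert.h>

/* pushlatency_ms is zero or negative, as described in the note:
   its magnitude is the maximum amount of prediction to perform. */
static int predicted_ms(int latency_ms, int pushlatency_ms)
{
    assert(latency_ms >= 0 && pushlatency_ms <= 0);
    int max_predict = -pushlatency_ms;
    return max_predict > latency_ms ? latency_ms : max_predict;
}

/* Latency the player still feels in his or her own movement. */
static int residual_lag_ms(int latency_ms, int pushlatency_ms)
{
    return latency_ms - predicted_ms(latency_ms, pushlatency_ms);
}
```

Under this model, a player with 100 ms of latency who sets pushlatency to -50 still feels 50 ms of lag, while any value of -100 or lower removes it entirely.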





