python 套接字
Sockets and the socket API are used to send messages across a network. They provide a form of inter-process communication (IPC). The network can be a logical, local network to the computer, or one that’s physically connected to an external network, with its own connections to other networks. The obvious example is the Internet, which you connect to via your ISP.
套接字和套接字API用于通过网络发送消息。 它们提供了一种进程间通信(IPC)的形式 。 该网络可以是计算机的逻辑本地网络,也可以是物理连接到外部网络的计算机,并且可以自己连接到其他网络。 显而易见的示例是Internet,您可以通过ISP连接到Internet。
This tutorial has three different iterations of building a socket server and client with Python:
本教程具有使用Python构建套接字服务器和客户端的三个不同的迭代:
By the end of this tutorial, you’ll understand how to use the main functions and methods in Python’s socket module to write your own client-server applications. This includes showing you how to use a custom class to send messages and data between endpoints that you can build upon and utilize for your own applications.
在本教程结束时,您将了解如何使用Python 套接字模块中的主要功能和方法来编写自己的客户端-服务器应用程序。 这包括向您展示如何使用自定义类在可构建并用于自己的应用程序的端点之间发送消息和数据。
The examples in this tutorial use Python 3.6. You can find the source code on GitHub.
本教程中的示例使用Python 3.6。 您可以在GitHub上找到源代码 。
Networking and sockets are large subjects. Literal volumes have been written about them. If you’re new to sockets or networking, it’s completely normal if you feel overwhelmed with all of the terms and pieces. I know I did!
网络和套接字是大问题。 关于它们的文字量已经有记载。 如果您不熟悉套接字或网络,则对所有条款和条件不知所措是完全正常的。 我知道我做到了!
Don’t be discouraged though. I’ve written this tutorial for you. As we did with Python, we can learn a little bit at a time. Use your browser’s bookmark feature and come back when you’re ready for the next section.
不过不要气our。 我已经为您编写了本教程。 就像使用Python一样,我们可以一次学习一些知识。 使用浏览器的书签功能,并在准备下一部分时返回。
Let’s get started!
让我们开始吧!
Sockets have a long history. Their use originated with ARPANET in 1971 and later became an API in the Berkeley Software Distribution (BSD) operating system released in 1983 called Berkeley sockets.
套接字历史悠久。 它们的使用始于 1971年的ARPANET ,后来成为1983年发布的称为Berkeley套接字的Berkeley软件分发(BSD)操作系统中的API。
When the Internet took off in the 1990s with the World Wide Web, so did network programming. Web servers and browsers weren’t the only applications taking advantage of newly connected networks and using sockets. Client-server applications of all types and sizes came into widespread use.
当Internet在1990年代随着万维网兴起时,网络编程也开始兴起。 并非只有Web服务器和浏览器才能利用新连接的网络并使用套接字。 各种类型和大小的客户端-服务器应用程序已广泛使用。
Today, although the underlying protocols used by the socket API have evolved over the years, and we’ve seen new ones, the low-level API has remained the same.
如今,尽管套接字API使用的底层协议已经发展了许多年,并且我们已经看到了新的协议,但底层API仍然保持不变。
The most common type of socket applications are client-server applications, where one side acts as the server and waits for connections from clients. This is the type of application that I’ll be covering in this tutorial. More specifically, we’ll look at the socket API for Internet sockets, sometimes called Berkeley or BSD sockets. There are also Unix domain sockets, which can only be used to communicate between processes on the same host.
套接字应用程序最常见的类型是客户端服务器应用程序,其中一侧充当服务器并等待来自客户端的连接。 这是本教程将介绍的应用程序类型。 更具体地说,我们将研究Internet套接字 (有时称为Berkeley或BSD套接字)的套接字API。 还有Unix域套接字 ,只能用于在同一主机上的进程之间进行通信。
Python’s socket module provides an interface to the Berkeley sockets API. This is the module that we’ll use and discuss in this tutorial.
Python的套接字模块提供了与Berkeley套接字API的接口 。 这是我们将在本教程中使用和讨论的模块。
The primary socket API functions and methods in this module are:
此模块中的主要套接字API功能和方法是:
socket()
bind()
listen()
accept()
connect()
connect_ex()
send()
recv()
close()
socket()
bind()
listen()
accept()
connect()
connect_ex()
send()
recv()
close()
Python provides a convenient and consistent API that maps directly to these system calls, their C counterparts. We’ll look at how these are used together in the next section.
Python提供了一个方便且一致的API,可直接映射到这些系统调用(与C对应)。 在下一节中,我们将研究如何将它们一起使用。
As part of its standard library, Python also has classes that make using these low-level socket functions easier. Although it’s not covered in this tutorial, see the socketserver module, a framework for network servers. There are also many modules available that implement higher-level Internet protocols like HTTP and SMTP. For an overview, see Internet Protocols and Support.
作为其标准库的一部分,Python还具有一些类,这些类使使用这些低级套接字函数更加容易。 尽管本教程未涵盖它,但请参阅socketserver模块 (用于网络服务器的框架)。 还有许多模块可以实现更高级别的Internet协议,例如HTTP和SMTP。 有关概述,请参见Internet协议和支持 。
As you’ll see shortly, we’ll create a socket object using socket.socket()
and specify the socket type as socket.SOCK_STREAM
. When you do that, the default protocol that’s used is the Transmission Control Protocol (TCP). This is a good default and probably what you want.
稍后您将看到,我们将使用socket.socket()
创建一个套接字对象,并将套接字类型指定为socket.SOCK_STREAM
。 当您这样做时,使用的默认协议是传输控制协议(TCP) 。 这是一个很好的默认值,可能也是您想要的。
Why should you use TCP? The Transmission Control Protocol (TCP):
为什么要使用TCP? 传输控制协议(TCP):
In contrast, User Datagram Protocol (UDP) sockets created with socket.SOCK_DGRAM
aren’t reliable, and data read by the receiver can be out-of-order from the sender’s writes.
相反, 使用 socket.SOCK_DGRAM
创建的用户数据报协议(UDP)套接字不可靠,并且接收方读取的数据可能与发送方的写入乱序。
Why is this important? Networks are a best-effort delivery system. There’s no guarantee that your data will reach its destination or that you’ll receive what’s been sent to you.
为什么这很重要? 网络是尽力而为的交付系统。 不能保证您的数据将到达目的地,也不保证您会收到发送给您的信息。
Network devices (for example, routers and switches), have finite bandwidth available and their own inherent system limitations. They have CPUs, memory, buses, and interface packet buffers, just like our clients and servers. TCP relieves you from having to worry about packet loss, data arriving out-of-order, and many other things that invariably happen when you’re communicating across a network.
网络设备(例如,路由器和交换机)具有有限的可用带宽及其自身固有的系统限制。 它们具有CPU,内存,总线和接口数据包缓冲区,就像我们的客户端和服务器一样。 TCP使您不必担心数据包丢失 ,数据乱序到达以及在通过网络进行通信时总是发生的许多其他事情。
In the diagram below, let’s look at the sequence of socket API calls and data flow for TCP:
在下图中,让我们看一下套接字API调用和TCP数据流的顺序:
Image source) 图像源 )The left-hand column represents the server. On the right-hand side is the client.
左列代表服务器。 右边是客户。
Starting in the top left-hand column, note the API calls the server makes to setup a “listening” socket:
从左上角的列开始,请注意服务器调用以设置“侦听”套接字的API:
socket()
bind()
listen()
accept()
socket()
bind()
listen()
accept()
A listening socket does just what it sounds like. It listens for connections from clients. When a client connects, the server calls accept()
to accept, or complete, the connection.
监听套接字确实可以听起来像。 它侦听来自客户端的连接。 当客户端连接时,服务器调用accept()
接受或完成连接。
The client calls connect()
to establish a connection to the server and initiate the three-way handshake. The handshake step is important since it ensures that each side of the connection is reachable in the network, in other words that the client can reach the server and vice-versa. It may be that only one host, client or server, can reach the other.
客户端调用connect()
建立与服务器的连接并启动三向握手。 握手步骤很重要,因为它可以确保连接的每一端在网络中都可以访问,换句话说,客户端可以访问服务器,反之亦然。 可能只有一台主机,客户端或服务器可以访问另一台。
In the middle is the round-trip section, where data is exchanged between the client and server using calls to send()
and recv()
.
中间是往返部分,其中使用send()
和recv()
调用在客户端和服务器之间交换数据。
At the bottom, the client and server close()
their respective sockets.
在底部,客户端和服务器close()
各自的套接字。
Now that you’ve seen an overview of the socket API and how the client and server communicate, let’s create our first client and server. We’ll begin with a simple implementation. The server will simply echo whatever it receives back to the client.
既然您已经了解了套接字API以及客户端和服务器如何通信的概述,那么让我们创建第一个客户端和服务器。 我们将从一个简单的实现开始。 服务器将简单地将收到的任何内容回显给客户端。
Here’s the server, echo-server.py
:
这是服务器echo-server.py
:
#!/usr/bin/env python3
#!/usr/bin/env python3
import import socket
socket
HOST HOST = = '127.0.0.1' '127.0.0.1' # Standard loopback interface address (localhost)
# Standard loopback interface address (localhost)
PORT PORT = = 65432 65432 # Port to listen on (non-privileged ports are > 1023)
# Port to listen on (non-privileged ports are > 1023)
with with socketsocket .. socketsocket (( socketsocket .. AF_INETAF_INET , , socketsocket .. SOCK_STREAMSOCK_STREAM ) ) as as ss :
:
ss .. bindbind (((( HOSTHOST , , PORTPORT ))
))
ss .. listenlisten ()
()
connconn , , addr addr = = ss .. acceptaccept ()
()
with with connconn :
:
printprint (( 'Connected by''Connected by' , , addraddr )
)
while while TrueTrue :
:
data data = = connconn .. recvrecv (( 10241024 )
)
if if not not datadata :
:
break
break
connconn .. sendallsendall (( datadata )
)
Note: Don’t worry about understanding everything above right now. There’s a lot going on in these few lines of code. This is just a starting point so you can see a basic server in action.
注意:不必担心现在就了解以上内容。 这几行代码中发生了很多事情。 这只是一个起点,因此您可以看到运行中的基本服务器。
There’s a reference section at the end of this tutorial that has more information and links to additional resources. I’ll link to these and other resources throughout the tutorial.
本教程末尾有一个参考部分 ,其中包含更多信息以及指向其他资源的链接。 在整个教程中,我将链接到这些资源和其他资源。
Let’s walk through each API call and see what’s happening.
让我们遍历每个API调用,看看发生了什么。
socket.socket()
creates a socket object that supports the context manager type, so you can use it in a with
statement. There’s no need to call s.close()
:
socket.socket()
创建一个支持上下文管理器类型的套接字对象,因此可以在with
语句中使用它。 无需调用s.close()
:
The arguments passed to socket()
specify the address family and socket type. AF_INET
is the Internet address family for IPv4. SOCK_STREAM
is the socket type for TCP, the protocol that will be used to transport our messages in the network.
传递给socket()
的参数指定地址族和套接字类型。 AF_INET
是IPv4的Internet地址族。 SOCK_STREAM
是TCP的套接字类型,该协议将用于在网络中传输消息。
bind()
is used to associate the socket with a specific network interface and port number:
bind()
用于将套接字与特定的网络接口和端口号关联:
HOST HOST = = '127.0.0.1' '127.0.0.1' # Standard loopback interface address (localhost)
# Standard loopback interface address (localhost)
PORT PORT = = 65432 65432 # Port to listen on (non-privileged ports are > 1023)
# Port to listen on (non-privileged ports are > 1023)
# ...
# ...
ss .. bindbind (((( HOSTHOST , , PORTPORT ))
))
The values passed to bind()
depend on the address family of the socket. In this example, we’re using socket.AF_INET
(IPv4). So it expects a 2-tuple: (host, port)
.
传递给bind()
的值取决于套接字的地址族 。 在此示例中,我们使用socket.AF_INET
(IPv4)。 因此,它需要一个2元组: (host, port)
。
host
can be a hostname, IP address, or empty string. If an IP address is used, host
should be an IPv4-formatted address string. The IP address 127.0.0.1
is the standard IPv4 address for the loopback interface, so only processes on the host will be able to connect to the server. If you pass an empty string, the server will accept connections on all available IPv4 interfaces.
host
可以是主机名,IP地址或空字符串。 如果使用IP地址,则host
应为IPv4格式的地址字符串。 IP地址127.0.0.1
是回送接口的标准IPv4地址,因此只有主机上的进程才能连接到服务器。 如果传递空字符串,则服务器将接受所有可用IPv4接口上的连接。
port
should be an integer from 1
–65535
(0
is reserved). It’s the TCP port number to accept connections on from clients. Some systems may require superuser privileges if the port is < 1024
.
port
应该是1
到65535
的整数(保留0
)。 这是接受客户端连接的TCP端口号。 如果端口< 1024
则某些系统可能需要超级用户特权。
Here’s a note on using hostnames with bind()
:
这是关于将主机名与bind()
一起使用的说明:
“If you use a hostname in the host portion of IPv4/v6 socket address, the program may show a non-deterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in host portion.” (Source)
“如果您在IPv4 / v6套接字地址的主机部分中使用主机名,则该程序可能会显示不确定的行为,因为Python使用从DNS解析返回的第一个地址。 套接字地址将根据DNS解析和/或主机配置的结果不同地解析为实际的IPv4 / v6地址。 对于确定性行为,请在主机部分使用数字地址。” (资源)
I’ll discuss this more later in Using Hostnames, but it’s worth mentioning here. For now, just understand that when using a hostname, you could see different results depending on what’s returned from the name resolution process.
我将在稍后的“ 使用主机名”中进行更多讨论,但是这里值得一提。 现在,仅了解使用主机名时,您会看到不同的结果,具体取决于名称解析过程返回的内容。
It could be anything. The first time you run your application, it might be the address 10.1.2.3
. The next time it’s a different address, 192.168.0.1
. The third time, it could be 172.16.7.8
, and so on.
可能是任何东西。 第一次运行应用程序时,它的地址可能是10.1.2.3
。 下次是另一个地址192.168.0.1
。 第三次,可能是172.16.7.8
,依此类推。
Continuing with the server example, listen()
enables a server to accept()
connections. It makes it a “listening” socket:
继续服务器示例, listen()
使服务器能够accept()
连接。 它使其成为“监听”套接字:
listen()
has a backlog
parameter. It specifies the number of unaccepted connections that the system will allow before refusing new connections. Starting in Python 3.5, it’s optional. If not specified, a default backlog
value is chosen.
listen()
有一个backlog
参数。 它指定系统在拒绝新连接之前允许的不可接受的连接数。 从Python 3.5开始,它是可选的。 如果未指定,则选择默认的backlog
值。
If your server receives a lot of connection requests simultaneously, increasing the backlog
value may help by setting the maximum length of the queue for pending connections. The maximum value is system dependent. For example, on Linux, see /proc/sys/net/ipv4/tcp_max_syn_backlog
.
如果您的服务器同时接收到许多连接请求,则通过设置待处理连接的队列的最大长度,增加backlog
值可能会有所帮助。 最大值取决于系统。 例如,在Linux上,请参阅/proc/sys/net/ipv4/tcp_max_syn_backlog
。
accept()
blocks and waits for an incoming connection. When a client connects, it returns a new socket object representing the connection and a tuple holding the address of the client. The tuple will contain (host, port)
for IPv4 connections or (host, port, flowinfo, scopeid)
for IPv6. See Socket Address Families in the reference section for details on the tuple values.
accept()
阻止并等待传入连接。 客户端连接时,它将返回一个表示连接的新套接字对象和一个保存客户端地址的元组。 该元组将包含用于IPv4连接的(host, port)
或用于IPv6的(host, port, flowinfo, scopeid)
。 有关元组值的详细信息,请参见参考部分中的套接字地址系列 。
One thing that’s imperative to understand is that we now have a new socket object from accept()
. This is important since it’s the socket that you’ll use to communicate with the client. It’s distinct from the listening socket that the server is using to accept new connections:
必须了解的一件事是,我们现在从accept()
获得了一个新的套接字对象。 这很重要,因为它是您用来与客户端通信的套接字。 它与服务器用来接受新连接的侦听套接字不同:
connconn , , addr addr = = ss .. acceptaccept ()
()
with with connconn :
:
printprint (( 'Connected by''Connected by' , , addraddr )
)
while while TrueTrue :
:
data data = = connconn .. recvrecv (( 10241024 )
)
if if not not datadata :
:
break
break
connconn .. sendallsendall (( datadata )
)
After getting the client socket object conn
from accept()
, an infinite while
loop is used to loop over blocking calls to conn.recv()
. This reads whatever data the client sends and echoes it back using conn.sendall()
.
从accept()
获得客户端套接字对象conn
之后,将使用无限while
循环来循环阻塞对conn.recv()
调用 。 这将读取客户端发送的所有数据,并使用conn.sendall()
将其回conn.sendall()
。
If conn.recv()
returns an empty bytes
object, b''
, then the client closed the connection and the loop is terminated. The with
statement is used with conn
to automatically close the socket at the end of the block.
如果conn.recv()
返回一个空bytes
对象b''
,则客户端关闭连接并终止循环。 with
语句与conn
一起使用,以自动关闭该块末尾的套接字。
Now let’s look at the client, echo-client.py
:
现在让我们看一下客户端echo-client.py
:
In comparison to the server, the client is pretty simple. It creates a socket object, connects to the server and calls s.sendall()
to send its message. Lastly, it calls s.recv()
to read the server’s reply and then prints it.
与服务器相比,客户端非常简单。 它创建一个套接字对象,连接到服务器并调用s.sendall()
发送其消息。 最后,它调用s.recv()
读取服务器的回复,然后打印出来。
Let’s run the client and server to see how they behave and inspect what’s happening.
让我们运行客户端和服务器,看看它们的行为并检查发生了什么。
Note: If you’re having trouble getting the examples or your own code to run from the command line, read How Do I Make My Own Command-Line Commands Using Python? If you’re on Windows, check the Python Windows FAQ.
注意:如果您无法从命令行运行示例或自己的代码,请阅读如何使用Python编写自己的命令行命令? 如果您使用的是Windows,请查看Python Windows常见问题解答 。
Open a terminal or command prompt, navigate to the directory that contains your scripts, and run the server:
打开终端或命令提示符,导航到包含脚本的目录,然后运行服务器:
$ ./echo-server.py
$ ./echo-server.py
Your terminal will appear to hang. That’s because the server is blocked (suspended) in a call:
您的终端将似乎挂起。 这是因为服务器在呼叫中被阻止 (挂起):
It’s waiting for a client connection. Now open another terminal window or command prompt and run the client:
它正在等待客户端连接。 现在打开另一个终端窗口或命令提示符并运行客户端:
$ ./echo-client.py
$ ./echo-client.py
Received b'Hello, world'
Received b'Hello, world'
In the server window, you should see:
在服务器窗口中,您应该看到:
In the output above, the server printed the addr
tuple returned from s.accept()
. This is the client’s IP address and TCP port number. The port number, 64623
, will most likely be different when you run it on your machine.
在上面的输出中,服务器打印了从s.accept()
返回的addr
元组。 这是客户端的IP地址和TCP端口号。 在计算机上运行时,端口号64623
可能会有所不同。
To see the current state of sockets on your host, use netstat
. It’s available by default on macOS, Linux, and Windows.
要查看主机上套接字的当前状态,请使用netstat
。 在macOS,Linux和Windows上默认情况下可用。
Here’s the netstat output from macOS after starting the server:
这是启动服务器后macOS的netstat输出:
$ netstat -an
$ netstat -an
Active Internet connections (including servers)
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 127.0.0.1.65432 *.* LISTEN
tcp4 0 0 127.0.0.1.65432 *.* LISTEN
Notice that Local Address
is 127.0.0.1.65432
. If echo-server.py
had used HOST = ''
instead of HOST = '127.0.0.1'
, netstat would show this:
请注意, Local Address
是127.0.0.1.65432
。 如果echo-server.py
使用HOST = ''
而不是HOST = '127.0.0.1'
,则netstat将显示以下内容:
Local Address
is *.65432
, which means all available host interfaces that support the address family will be used to accept incoming connections. In this example, in the call to socket()
, socket.AF_INET
was used (IPv4). You can see this in the Proto
column: tcp4
.
Local Address
是*.65432
,这意味着将使用支持地址族的所有可用主机接口来接受传入的连接。 在此示例中,在对socket()
socket.AF_INET
,使用了socket.AF_INET
(IPv4)。 您可以在Proto
列中看到: tcp4
。
I’ve trimmed the output above to show the echo server only. You’ll likely see much more output, depending on the system you’re running it on. The things to notice are the columns Proto
, Local Address
, and (state)
. In the last example above, netstat shows the echo server is using an IPv4 TCP socket (tcp4
), on port 65432 on all interfaces (*.65432
), and it’s in the listening state (LISTEN
).
我已经整理了上面的输出以仅显示回显服务器。 您可能会看到更多的输出,具体取决于运行它的系统。 需要注意的是Proto
, Local Address
和(state)
。 在上面,netstat显示的回波服务器使用IPv4的TCP套接字(最后一个例子tcp4
上的所有接口(),在端口65432 *.65432
),它是处于监听状态( LISTEN
)。
Another way to see this, along with additional helpful information, is to use lsof
(list open files). It’s available by default on macOS and can be installed on Linux using your package manager, if it’s not already:
另一种有用的查看方法以及其他有用的信息是使用lsof
(列出打开的文件)。 它默认在macOS上可用,如果尚未安装,则可以使用包管理器在Linux上安装:
$ lsof -i -n
$ lsof -i -n
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
Python 67982 nathan 3u IPv4 0xecf272 0t0 TCP *:65432 (LISTEN)
Python 67982 nathan 3u IPv4 0xecf272 0t0 TCP *:65432 (LISTEN)
lsof
gives you the COMMAND
, PID
(process id), and USER
(user id) of open Internet sockets when used with the -i
option. Above is the echo server process.
当与-i
选项一起使用时, lsof
会为您提供打开的Internet套接字的COMMAND
, PID
(进程ID)和USER
(用户ID)。 上面是回显服务器进程。
netstat
and lsof
have a lot of options available and differ depending on the OS you’re running them on. Check the man
page or documentation for both. They’re definitely worth spending a little time with and getting to know. You’ll be rewarded. On macOS and Linux, use man netstat
and man lsof
. For Windows, use netstat /?
.
netstat
和lsof
有很多可用的选项,并且取决于运行它们的操作系统。 检查man
页或说明文件。 他们绝对值得花一点时间来了解。 您会得到奖励的。 在macOS和Linux上,使用man netstat
和man lsof
。 对于Windows,请使用netstat /?
。
Here’s a common error you’ll see when a connection attempt is made to a port with no listening socket:
在没有监听套接字的端口上进行连接尝试时,会看到一个常见错误:
Either the specified port number is wrong or the server isn’t running. Or maybe there’s a firewall in the path that’s blocking the connection, which can be easy to forget about. You may also see the error Connection timed out
. Get a firewall rule added that allows the client to connect to the TCP port!
指定的端口号错误或服务器未运行。 也许在路径上有防火墙阻止了连接,这很容易忘记。 您可能还会看到错误Connection timed out
。 获取添加的防火墙规则,该规则允许客户端连接到TCP端口!
There’s a list of common errors in the reference section.
参考部分中列出了一些常见错误 。
Let’s take a closer look at how the client and server communicated with each other:
让我们仔细看看客户端和服务器之间如何通信:
When using the loopback interface (IPv4 address 127.0.0.1
or IPv6 address ::1
), data never leaves the host or touches the external network. In the diagram above, the loopback interface is contained inside the host. This represents the internal nature of the loopback interface and that connections and data that transit it are local to the host. This is why you’ll also hear the loopback interface and IP address 127.0.0.1
or ::1
referred to as “localhost.”
使用回送接口(IPv4地址127.0.0.1
或IPv6地址::1
)时,数据永远不会离开主机或接触外部网络。 在上图中,环回接口包含在主机内部。 这表示环回接口的内部性质,并且传递该接口的连接和数据对于主机而言是本地的。 这就是为什么您还会听到回送接口和IP地址127.0.0.1
或::1
称为“ localhost”)的原因。
Applications use the loopback interface to communicate with other processes running on the host and for security and isolation from the external network. Since it’s internal and accessible only from within the host, it’s not exposed.
应用程序使用环回接口与主机上运行的其他进程进行通信,并实现安全性和与外部网络的隔离。 由于它是内部的,只能从主机内部访问,因此不会公开。
You can see this in action if you have an application server that uses its own private database. If it’s not a database used by other servers, it’s probably configured to listen for connections on the loopback interface only. If this is the case, other hosts on the network can’t connect to it.
如果您有一个使用其自己的私有数据库的应用程序服务器,则可以看到它的作用。 如果不是其他服务器使用的数据库,则可能已配置为仅侦听回送接口上的连接。 在这种情况下,网络上的其他主机将无法连接到它。
When you use an IP address other than 127.0.0.1
or ::1
in your applications, it’s probably bound to an Ethernet interface that’s connected to an external network. This is your gateway to other hosts outside of your “localhost” kingdom:
当您在应用程序中使用127.0.0.1
或::1
以外的IP地址时,它可能绑定到连接到外部网络的以太网接口。 这是您通往“本地主机”王国以外的其他主机的网关:
Be careful out there. It’s a nasty, cruel world. Be sure to read the section Using Hostnames before venturing from the safe confines of “localhost.” There’s a security note that applies even if you’re not using hostnames and using IP addresses only.
小心点。 这是一个令人讨厌的残酷世界。 在脱离“ localhost”的安全范围之前,请务必阅读使用主机名一节。 即使您不使用主机名并且仅使用IP地址,也有一个安全注意事项适用。
The echo server definitely has its limitations. The biggest being that it serves only one client and then exits. The echo client has this limitation too, but there’s an additional problem. When the client makes the following call, it’s possible that s.recv()
will return only one byte, b'H'
from b'Hello, world'
:
回声服务器肯定有其局限性。 最大的好处是,它仅服务一个客户,然后退出。 回显客户端也有此限制,但是还有另一个问题。 当客户端进行以下调用时, s.recv()
可能s.recv()
b'Hello, world'
返回一个字节b'H'
b'Hello, world'
:
data data = = ss .. recvrecv (( 10241024 )
)
The bufsize
argument of 1024
used above is the maximum amount of data to be received at once. It doesn’t mean that recv()
will return 1024
bytes.
上面使用的bufsize
参数1024
是一次要接收的最大数据量。 这并不意味着recv()
将返回1024
个字节。
send()
also behaves this way. send()
returns the number of bytes sent, which may be less than the size of the data passed in. You’re responsible for checking this and calling send()
as many times as needed to send all of the data:
send()
也具有这种行为。 send()
返回send()
的字节数,该字节数可能小于传入数据的大小。您负责检查此字节并根据需要多次调用send()
来发送所有数据:
“Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data.” (Source)
“应用程序负责检查是否已发送所有数据; 如果仅传输了一些数据,则应用程序需要尝试传送其余数据。” (资源)
We avoided having to do this by using sendall()
:
我们避免使用sendall()
来执行此操作:
“Unlike send(), this method continues to send data from bytes until either all data has been sent or an error occurs. None is returned on success.” (Source)
与send()不同,此方法将继续从字节发送数据,直到所有数据都已发送或发生错误为止。 成功一无所获。” (资源)
We have two problems at this point:
此时,我们有两个问题:
send()
and recv()
until all data is sent or received.send()
和recv()
直到发送或接收所有数据为止。 What do we do? There are many approaches to concurrency. More recently, a popular approach is to use Asynchronous I/O. asyncio
was introduced into the standard library in Python 3.4. The traditional choice is to use threads.
我们做什么? 并发方法很多。 最近,一种流行的方法是使用异步I / O。 asyncio
在Python 3.4中引入了标准库。 传统的选择是使用线程 。
The trouble with concurrency is it’s hard to get right. There are many subtleties to consider and guard against. All it takes is for one of these to manifest itself and your application may suddenly fail in not-so-subtle ways.
并发的麻烦在于很难正确。 有许多需要考虑和注意的微妙之处。 所有这些只是其中之一要体现出来,您的应用程序可能突然以不太巧妙的方式失败。
I don’t say this to scare you away from learning and using concurrent programming. If your application needs to scale, it’s a necessity if you want to use more than one processor or one core. However, for this tutorial, we’ll use something that’s more traditional than threads and easier to reason about. We’re going to use the granddaddy of system calls: select()
.
我并不是说这会让您远离学习和使用并发编程。 如果您的应用程序需要扩展,则需要使用多个处理器或一个内核。 但是,在本教程中,我们将使用比线程更传统且更易于推理的东西。 我们将使用系统调用的祖父: select()
。
select()
allows you to check for I/O completion on more than one socket. So you can call select()
to see which sockets have I/O ready for reading and/or writing. But this is Python, so there’s more. We’re going to use the selectors module in the standard library so the most efficient implementation is used, regardless of the operating system we happen to be running on:
select()
允许您检查多个套接字上的I / O完成。 因此,您可以调用select()
来查看哪些套接字已准备好进行读取和/或写入的I / O。 但这是Python,因此还有更多。 我们将使用标准库中的选择器模块,以便使用最高效的实现,而不管我们碰巧在哪个操作系统上运行:
“This module allows high-level and efficient I/O multiplexing, built upon the select module primitives. Users are encouraged to use this module instead, unless they want precise control over the OS-level primitives used.” (Source)
“该模块允许在选择模块原语的基础上进行高层且高效的I / O复用。 鼓励用户改用此模块,除非他们希望对所使用的OS级原语进行精确控制。” (资源)
Even though, by using select()
, we’re not able to run concurrently, depending on your workload, this approach may still be plenty fast. It depends on what your application needs to do when it services a request and the number of clients it needs to support.
即使通过使用select()
,我们也无法同时运行,但根据您的工作量,此方法可能仍然足够快。 这取决于您的应用程序在处理请求时需要执行的操作以及需要支持的客户端数量。
asyncio
uses single-threaded cooperative multitasking and an event loop to manage tasks. With select()
, we’ll be writing our own version of an event loop, albeit more simply and synchronously. When using multiple threads, even though you have concurrency, we currently have to use the GIL with CPython and PyPy. This effectively limits the amount of work we can do in parallel anyway.
asyncio
使用单线程协作式多任务处理和事件循环来管理任务。 使用select()
,我们将编写我们自己的事件循环版本,尽管更简单,更同步。 当使用多个线程时,即使您具有并发性,我们目前也必须将GIL与CPython和PyPy结合使用 。 无论如何,这有效地限制了我们可以并行进行的工作量。
I say all of this to explain that using select()
may be a perfectly fine choice. Don’t feel like you have to use asyncio
, threads, or the latest asynchronous library. Typically, in a network application, your application is I/O bound: it could waiting on the local network, endpoints on the other side of the network, on a disk, and so forth.
我说这些都是为了说明使用select()
可能是一个很好的选择。 不需要使用asyncio
,线程或最新的异步库。 通常,在网络应用程序中,您的应用程序受I / O约束:它可以在本地网络,网络另一端的端点,磁盘等上等待。
If you’re getting requests from clients that initiate CPU bound work, look at the concurrent.futures module. It contains the class ProcessPoolExecutor that uses a pool of processes to execute calls asynchronously.
如果您从启动CPU绑定工作的客户端收到请求,请查看parallel.futures模块。 它包含ProcessPoolExecutor类,该类使用进程池异步执行调用。
If you use multiple processes, the operating system is able to schedule your Python code to run in parallel on multiple processors or cores, without the GIL. For ideas and inspiration, see the PyCon talk John Reese – Thinking Outside the GIL with AsyncIO and Multiprocessing – PyCon 2018.
如果使用多个进程,则操作系统可以安排Python代码在没有GIL的情况下在多个处理器或内核上并行运行。 有关想法和灵感,请参阅PyCon演讲John Reese –使用AsyncIO和多处理在GIL之外进行思考– PyCon 2018 。
In the next section, we’ll look at examples of a server and client that address these problems. They use select()
to handle multiple connections simultaneously and call send()
and recv()
as many times as needed.
在下一节中,我们将介绍解决这些问题的服务器和客户端的示例。 他们使用select()
同时处理多个连接,并根据需要多次调用send()
和recv()
。
In the next two sections, we’ll create a server and client that handles multiple connections using a selector
object created from the selectors module.
在接下来的两节中,我们将创建一个服务器和客户端,这些服务器和客户端将使用从选择器模块创建的选择 selector
对象来处理多个连接。
First, let’s look at the multi-connection server, multiconn-server.py
. Here’s the first part that sets up the listening socket:
首先,让我们看一下多连接服务器multiconn-server.py
。 这是设置侦听套接字的第一部分:
The biggest difference between this server and the echo server is the call to lsock.setblocking(False)
to configure the socket in non-blocking mode. Calls made to this socket will no longer block. When it’s used with sel.select()
, as you’ll see below, we can wait for events on one or more sockets and then read and write data when it’s ready.
此服务器与回显服务器之间的最大区别是调用lsock.setblocking(False)
将套接字配置为非阻塞模式。 对此套接字的调用将不再阻塞 。 当它与sel.select()
,如您将在下面看到的那样,我们可以在一个或多个套接字上等待事件,然后在就绪时读取和写入数据。
sel.register()
registers the socket to be monitored with sel.select()
for the events you’re interested in. For the listening socket, we want read events: selectors.EVENT_READ
.
sel.register()
注册要监视的套接字,以sel.select()
您感兴趣的事件。对于监听套接字,我们需要读取事件: selectors.EVENT_READ
。
data
is used to store whatever arbitrary data you’d like along with the socket. It’s returned when select()
returns. We’ll use data
to keep track of what’s been sent and received on the socket.
data
用于与套接字一起存储您想要的任意数据。 当select()
返回时返回。 我们将使用data
来跟踪套接字上已发送和接收的内容。
Next is the event loop:
接下来是事件循环:
import import selectors
selectors
sel sel = = selectorsselectors .. DefaultSelectorDefaultSelector ()
()
# ...
# ...
while while TrueTrue :
:
events events = = selsel .. selectselect (( timeouttimeout == NoneNone )
)
for for keykey , , mask mask in in eventsevents :
:
if if keykey .. data data is is NoneNone :
:
accept_wrapperaccept_wrapper (( keykey .. fileobjfileobj )
)
elseelse :
:
service_connectionservice_connection (( keykey , , maskmask )
)
sel.select(timeout=None)
blocks until there are sockets ready for I/O. It returns a list of (key, events) tuples, one for each socket. key
is a SelectorKey namedtuple
that contains a fileobj
attribute. key.fileobj
is the socket object, and mask
is an event mask of the operations that are ready.
sel.select(timeout=None)
阻塞,直到有可用于I / O的套接字为止。 它返回(键,事件)元组的列表,每个套接字一个。 key
是SelectorKey namedtuple
包含fileobj
属性。 key.fileobj
是套接字对象,而mask
是已准备好的操作的事件掩码。
If key.data
is None
, then we know it’s from the listening socket and we need to accept()
the connection. We’ll call our own accept()
wrapper function to get the new socket object and register it with the selector. We’ll look at it in a moment.
如果key.data
为None
,那么我们知道它来自监听套接字,我们需要accept()
连接。 我们将调用我们自己的accept()
包装函数来获取新的套接字对象,并将其注册到选择器中。 我们待会儿看。
If key.data
is not None
, then we know it’s a client socket that’s already been accepted, and we need to service it. service_connection()
is then called and passed key
and mask
, which contains everything we need to operate on the socket.
如果key.data
不是None
,那么我们知道它是一个已经被接受的客户端套接字,我们需要为其提供服务。 然后调用service_connection()
并传递key
和mask
,其中包含我们需要在套接字上进行操作的所有内容。
Let’s look at what our accept_wrapper()
function does:
让我们看一下accept_wrapper()
函数的作用:
Since the listening socket was registered for the event selectors.EVENT_READ
, it should be ready to read. We call sock.accept()
and then immediately call conn.setblocking(False)
to put the socket in non-blocking mode.
由于侦听套接字已为事件selectors.EVENT_READ
.EVENT_READ注册,因此应该可以读取了。 我们调用sock.accept()
,然后立即调用conn.setblocking(False)
将套接字置于非阻塞模式。
Remember, this is the main objective in this version of the server since we don’t want it to block. If it blocks, then the entire server is stalled until it returns. Which means other sockets are left waiting. This is the dreaded “hang” state that you don’t want your server to be in.
请记住,这是此版本服务器的主要目标,因为我们不希望其阻塞 。 如果阻塞,则整个服务器将停止运行,直到返回为止。 这意味着其他套接字正在等待。 这是您不希望服务器进入的可怕的“挂起”状态。
Next, we create an object to hold the data we want included along with the socket using the class types.SimpleNamespace
. Since we want to know when the client connection is ready for reading and writing, both of those events are set using the following:
接下来,我们使用类types.SimpleNamespace
创建一个对象,以保存我们想要包含在套接字中的types.SimpleNamespace
。 由于我们想知道客户端连接何时可以进行读取和写入,因此可以使用以下方法设置这两个事件:
events events = = selectorsselectors .. EVENT_READ EVENT_READ | | selectorsselectors .. EVENT_WRITE
EVENT_WRITE
The events
mask, socket, and data objects are then passed to sel.register()
.
然后将events
掩码,套接字和数据对象传递给sel.register()
。
Now let’s look at service_connection()
to see how a client connection is handled when it’s ready:
现在让我们看一下service_connection()
以了解客户端连接准备就绪后如何处理:
This is the heart of the simple multi-connection server. key
is the namedtuple
returned from select()
that contains the socket object (fileobj
) and data object. mask
contains the events that are ready.
这是简单的多连接服务器的核心。 key
是namedtuple
从返回select()
包含插座对象( fileobj
)和数据对象。 mask
包含已准备好的事件。
If the socket is ready for reading, then mask & selectors.EVENT_READ
is true, and sock.recv()
is called. Any data that’s read is appended to data.outb
so it can be sent later.
如果套接字已准备好读取,则mask & selectors.EVENT_READ
为true,并sock.recv()
。 读取的所有数据都会附加到data.outb
以便以后发送。
Note the else:
block if no data is received:
注意else:
如果没有接收到数据则阻塞:
if if recv_datarecv_data :
:
datadata .. outb outb += += recv_data
recv_data
elseelse :
:
printprint (( 'closing connection to''closing connection to' , , datadata .. addraddr )
)
selsel .. unregisterunregister (( socksock )
)
socksock .. closeclose ()
()
This means that the client has closed their socket, so the server should too. But don’t forget to first call sel.unregister()
so it’s no longer monitored by select()
.
这意味着客户端已关闭其套接字,因此服务器也应关闭。 但是不要忘记先调用sel.unregister()
这样它就不再受select()
监视。
When the socket is ready for writing, which should always be the case for a healthy socket, any received data stored in data.outb
is echoed to the client using sock.send()
. The bytes sent are then removed from the send buffer:
当套接字准备好进行写操作时(对于正常的套接字应该始终如此),使用sock.send()
将存储在data.outb
所有接收到的数据回显到客户端。 然后从发送缓冲区中删除已发送的字节:
Now let’s look at the multi-connection client, multiconn-client.py
. It’s very similar to the server, but instead of listening for connections, it starts by initiating connections via start_connections()
:
现在,让我们看一下多连接客户端multiconn-client.py
。 它与服务器非常相似,但是它不是通过侦听连接,而是通过start_connections()
启动连接开始的:
messages messages = = [[ bb 'Message 1 from client.''Message 1 from client.' , , bb 'Message 2 from client.''Message 2 from client.' ]
]
def def start_connectionsstart_connections (( hosthost , , portport , , num_connsnum_conns ):
):
server_addr server_addr = = (( hosthost , , portport )
)
for for i i in in rangerange (( 00 , , num_connsnum_conns ):
):
connid connid = = i i + + 1
1
printprint (( 'starting connection''starting connection' , , connidconnid , , 'to''to' , , server_addrserver_addr )
)
sock sock = = socketsocket .. socketsocket (( socketsocket .. AF_INETAF_INET , , socketsocket .. SOCK_STREAMSOCK_STREAM )
)
socksock .. setblockingsetblocking (( FalseFalse )
)
socksock .. connect_exconnect_ex (( server_addrserver_addr )
)
events events = = selectorsselectors .. EVENT_READ EVENT_READ | | selectorsselectors .. EVENT_WRITE
EVENT_WRITE
data data = = typestypes .. SimpleNamespaceSimpleNamespace (( connidconnid == connidconnid ,
,
msg_totalmsg_total == sumsum (( lenlen (( mm ) ) for for m m in in messagesmessages ),
),
recv_totalrecv_total == 00 ,
,
messagesmessages == listlist (( messagesmessages ),
),
outboutb == bb '''' )
)
selsel .. registerregister (( socksock , , eventsevents , , datadata == datadata )
)
num_conns
is read from the command-line, which is the number of connections to create to the server. Just like the server, each socket is set to non-blocking mode.
num_conns
读取num_conns
,这是与服务器建立的连接数。 就像服务器一样,每个套接字都设置为非阻塞模式。
connect_ex()
is used instead of connect()
since connect()
would immediately raise a BlockingIOError
exception. connect_ex()
initially returns an error indicator, errno.EINPROGRESS
, instead of raising an exception while the connection is in progress. Once the connection is completed, the socket is ready for reading and writing and is returned as such by select()
.
connect_ex()
代替connect()
因为connect()
会立即引发BlockingIOError
异常。 connect_ex()
最初返回错误指示符errno.EINPROGRESS
,而不是在进行连接时引发异常。 一旦连接完成,套接字就可以进行读写了,并由select()
返回。
After the socket is setup, the data we want stored with the socket is created using the class types.SimpleNamespace
. The messages the client will send to the server are copied using list(messages)
since each connection will call socket.send()
and modify the list. Everything needed to keep track of what the client needs to send, has sent and received, and the total number of bytes in the messages is stored in the object data
.
设置套接字后,使用类types.SimpleNamespace
创建要与套接字一起存储的数据。 客户端将发送到服务器的list(messages)
是使用list(messages)
复制的,因为每个连接都会调用socket.send()
并修改列表。 跟踪客户端需要发送,已发送和已接收的内容以及消息中的字节总数所需的所有内容都存储在对象data
。
Let’s look at service_connection()
. It’s fundamentally the same as the server:
让我们看一下service_connection()
。 它与服务器基本相同:
There’s one important difference. It keeps track of the number of bytes it’s received from the server so it can close its side of the connection. When the server detects this, it closes its side of the connection too.
有一个重要的区别。 它跟踪从服务器接收到的字节数,以便可以关闭其连接的一侧。 当服务器检测到此情况时,它也会关闭其连接的一侧。
Note that by doing this, the server depends on the client being well behaved: the server expects the client to close its side of the connection when it’s done sending messages. If the client doesn’t close, the server will leave the connection open. In a real application, you may want to guard against this in your server and prevent client connections from accumulating if they don’t send a request after a certain amount of time.
请注意,通过这样做,服务器取决于客户端的行为是否良好:服务器希望客户端在完成发送消息后关闭其连接的一侧。 如果客户端没有关闭,服务器将保持连接打开状态。 在实际的应用程序中,您可能要防止服务器中的这种情况发生,并防止客户端连接在一定时间后未发送请求的情况下累积。
Now let’s run multiconn-server.py
and multiconn-client.py
. They both use command-line arguments. You can run them without arguments to see the options.
现在让我们运行multiconn-server.py
和multiconn-client.py
。 它们都使用命令行参数。 您可以在不带参数的情况下运行它们以查看选项。
For the server, pass a host
and port
number:
对于服务器,请传递host
和port
号:
$ ./multiconn-server.py
$ ./multiconn-server.py
usage: ./multiconn-server.py
usage: ./multiconn-server.py
For the client, also pass the number of connections to create to the server, num_connections
:
对于客户端,还要将要创建的连接数传递给服务器num_connections
:
Below is the server output when listening on the loopback interface on port 65432:
下面是侦听端口65432上的环回接口时的服务器输出:
$ ./multiconn-server.py $ ./multiconn-server.py 127.0.0.1 127 .0.0.1 65432
65432
listening on ('127.0.0.1', 65432)
listening on ('127.0.0.1', 65432)
accepted connection from ('127.0.0.1', 61354)
accepted connection from ('127.0.0.1', 61354)
accepted connection from ('127.0.0.1', 61355)
accepted connection from ('127.0.0.1', 61355)
echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 61354)
echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 61354)
echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 61355)
echoing b'Message 1 from client.Message 2 from client.' to ('127.0.0.1', 61355)
closing connection to ('127.0.0.1', 61354)
closing connection to ('127.0.0.1', 61354)
closing connection to ('127.0.0.1', 61355)
closing connection to ('127.0.0.1', 61355)
Below is the client output when it creates two connections to the server above:
下面是客户端与上面的服务器创建两个连接时的输出:
The multi-connection client and server example is definitely an improvement compared with where we started. However, let’s take one more step and address the shortcomings of the previous “multiconn” example in a final implementation: the application client and server.
与我们开始时相比,多连接客户端和服务器示例绝对是一个改进。 但是,让我们再采取一步,并在最终的实现中解决前面的“ multiconn”示例的缺点:应用程序客户端和服务器。
We want a client and server that handles errors appropriately so other connections aren’t affected. Obviously, our client or server shouldn’t come crashing down in a ball of fury if an exception isn’t caught. This is something we haven’t discussed up until now. I’ve intentionally left out error handling for brevity and clarity in the examples.
我们希望客户端和服务器能够正确处理错误,以便其他连接不受影响。 显然,如果未捕获到异常,我们的客户端或服务器不应崩溃。 到目前为止,我们还没有讨论过这一点。 为了示例的简洁和清楚起见,我故意省略了错误处理。
Now that you’re familiar with the basic API, non-blocking sockets, and select()
, we can add some error handling and discuss the “elephant in the room” that I’ve kept hidden from you behind that large curtain over there. Yes, I’m talking about the custom class I mentioned way back in the introduction. I knew you wouldn’t forget.
现在您已经熟悉了基本的API,无阻塞套接字和select()
,我们可以添加一些错误处理并讨论我一直藏在那间大窗帘后面的“房间里的大象” 。 是的,我说的是我在引言中提到的自定义类。 我知道你不会忘记的。
First, let’s address the errors:
首先,让我们解决错误:
“All errors raise exceptions. The normal exceptions for invalid argument types and out-of-memory conditions can be raised; starting from Python 3.3, errors related to socket or address semantics raise
OSError
or one of its subclasses.” (Source)“所有错误都会引发异常。 无效参数类型和内存不足条件的普通异常可以被提出; 从Python 3.3开始,与套接字或地址语义相关的错误引发了
OSError
或其子类之一。” (资源)
We need to catch OSError
. Another thing I haven’t mentioned in relation to errors is timeouts. You’ll see them discussed in many places in the documentation. Timeouts happen and are a “normal” error. Hosts and routers are rebooted, switch ports go bad, cables go bad, cables get unplugged, you name it. You should be prepared for these and other errors and handle them in your code.
我们需要捕获OSError
。 关于错误,我没有提到的另一件事是超时。 您会在文档中的许多地方看到它们的讨论。 发生超时,是“正常”错误。 主机和路由器会重新启动,交换机端口会损坏,电缆会损坏,电缆会被拔出,这就是您的名字。 您应该为这些错误和其他错误做好准备,并在代码中进行处理。
What about the “elephant in the room?” As hinted by the socket type socket.SOCK_STREAM
, when using TCP, you’re reading from a continuous stream of bytes. It’s like reading from a file on disk, but instead you’re reading bytes from the network.
那“房间里的大象”呢? 如套接字类型socket.SOCK_STREAM
所暗示的,使用TCP时,您正在读取连续的字节流。 这就像从磁盘上的文件中读取内容,而是从网络中读取字节。
However, unlike reading a file, there’s no f.seek()
. In other words, you can’t reposition the socket pointer, if there was one, and move randomly around the data reading whatever, whenever you’d like.
但是,与读取文件不同,没有f.seek()
。 换句话说,您无法重新定位套接字指针(如果有的话),并在数据读取时随心所欲地随意移动。
When bytes arrive at your socket, there are network buffers involved. Once you’ve read them, they need to be saved somewhere. Calling recv()
again reads the next stream of bytes available from the socket.
当字节到达您的套接字时,将涉及网络缓冲区。 阅读它们后,需要将它们保存在某个地方。 再次调用recv()
会从套接字读取下一个可用字节流。
What this means is that you’ll be reading from the socket in chunks. You need to call recv()
and save the data in a buffer until you’ve read enough bytes to have a complete message that makes sense to your application.
这意味着您将在套接字中分块读取。 您需要调用recv()
并将数据保存在缓冲区中,直到您读取足够的字节以获取对应用程序有意义的完整消息为止。
It’s up to you to define and keep track of where the message boundaries are. As far as the TCP socket is concerned, it’s just sending and receiving raw bytes to and from the network. It knows nothing about what those raw bytes mean.
由您决定并跟踪消息边界在哪里。 就TCP套接字而言,它只是与网络之间发送和接收原始字节。 对于这些原始字节的含义,它一无所知。
This bring us to defining an application-layer protocol. What’s an application-layer protocol? Put simply, your application will send and receive messages. These messages are your application’s protocol.
这使我们能够定义应用程序层协议。 什么是应用层协议? 简而言之,您的应用程序将发送和接收消息。 这些消息是您的应用程序的协议。
In other words, the length and format you choose for these messages define the semantics and behavior of your application. This is directly related to what I explained in the previous paragraph regarding reading bytes from the socket. When you’re reading bytes with recv()
, you need to keep up with how many bytes were read and figure out where the message boundaries are.
换句话说,您为这些消息选择的长度和格式定义了应用程序的语义和行为。 这与我在上一段中解释的有关从套接字读取字节的内容直接相关。 当使用recv()
读取字节时,需要跟上读取的字节数,并弄清楚消息边界在哪里。
How is this done? One way is to always send fixed-length messages. If they’re always the same size, then it’s easy. When you’ve read that number of bytes into a buffer, then you know you have one complete message.
怎么做? 一种方法是始终发送固定长度的消息。 如果它们总是相同的大小,那很容易。 将一定数量的字节读入缓冲区后,您便知道了一条完整的消息。
However, using fixed-length messages is inefficient for small messages where you’d need to use padding to fill them out. Also, you’re still left with the problem of what to do about data that doesn’t fit into one message.
但是,对于需要使用填充来填充小消息的小消息,使用定长消息效率低下。 同样,您仍然面临着如何处理不适合一条消息的数据的问题。
In this tutorial, we’ll take a generic approach. An approach that’s used by many protocols, including HTTP. We’ll prefix messages with a header that includes the content length as well as any other fields we need. By doing this, we’ll only need to keep up with the header. Once we’ve read the header, we can process it to determine the length of the message’s content and then read that number of bytes to consume it.
在本教程中,我们将采用通用方法。 许多协议(包括HTTP)使用的一种方法。 我们将为邮件添加一个标题,该标题包括内容长度以及我们需要的任何其他字段。 这样,我们只需要跟头即可。 读取标头后,我们可以对其进行处理以确定消息内容的长度,然后读取该字节数以使用它。
We’ll implement this by creating a custom class that can send and receive messages that contain text or binary data. You can improve and extend it for your own applications. The most important thing is that you’ll be able to see an example of how this is done.
我们将通过创建一个自定义类来实现此目的,该类可以发送和接收包含文本或二进制数据的消息。 您可以针对自己的应用程序进行改进和扩展。 最重要的是,您将能够看到一个有关如何完成此操作的示例。
I need to mention something regarding sockets and bytes that may affect you. As we talked about earlier, when sending and receiving data via sockets, you’re sending and receiving raw bytes.
我需要提及一些可能影响您的套接字和字节。 如前所述,通过套接字发送和接收数据时,您正在发送和接收原始字节。
If you receive data and want to use it in a context where it’s interpreted as multiple bytes, for example a 4-byte integer, you’ll need to take into account that it could be in a format that’s not native to your machine’s CPU. The client or server on the other end could have a CPU that uses a different byte order than your own. If this is the case, you’ll need to convert it to your host’s native byte order before using it.
如果您接收数据并想在将其解释为多个字节(例如4字节整数)的上下文中使用它,则需要考虑到它的格式可能不是计算机CPU固有的格式。 另一端的客户端或服务器可能具有使用与您自己的字节顺序不同的字节顺序的CPU。 如果是这种情况,则需要在使用前将其转换为主机的本机字节顺序。
This byte order is referred to as a CPU’s endianness. See Byte Endianness in the reference section for details. We’ll avoid this issue by taking advantage of Unicode for our message header and using the encoding UTF-8. Since UTF-8 uses an 8-bit encoding, there are no byte ordering issues.
This byte order is referred to as a CPU's endianness . See Byte Endianness in the reference section for details. We'll avoid this issue by taking advantage of Unicode for our message header and using the encoding UTF-8. Since UTF-8 uses an 8-bit encoding, there are no byte ordering issues.
You can find an explanation in Python’s Encodings and Unicode documentation. Note that this applies to the text header only. We’ll use an explicit type and encoding defined in the header for the content that’s being sent, the message payload. This will allow us to transfer any data we’d like (text or binary), in any format.
You can find an explanation in Python's Encodings and Unicode documentation. Note that this applies to the text header only. We'll use an explicit type and encoding defined in the header for the content that's being sent, the message payload. This will allow us to transfer any data we'd like (text or binary), in any format.
You can easily determine the byte order of your machine by using sys.byteorder
. For example, on my Intel laptop, this happens:
You can easily determine the byte order of your machine by using sys.byteorder
. For example, on my Intel laptop, this happens:
$ python3 -c $ python3 -c 'import sys; print(repr(sys.byteorder))'
'import sys; print(repr(sys.byteorder))'
'little'
'little'
If I run this in a virtual machine that emulates a big-endian CPU (PowerPC), then this happens:
If I run this in a virtual machine that emulates a big-endian CPU (PowerPC), then this happens:
In this example application, our application-layer protocol defines the header as Unicode text with a UTF-8 encoding. For the actual content in the message, the message payload, you’ll still have to swap the byte order manually if needed.
In this example application, our application-layer protocol defines the header as Unicode text with a UTF-8 encoding. For the actual content in the message, the message payload, you'll still have to swap the byte order manually if needed.
This will depend on your application and whether or not it needs to process multi-byte binary data from a machine with a different endianness. You can help your client or server implement binary support by adding additional headers and using them to pass parameters, similar to HTTP.
This will depend on your application and whether or not it needs to process multi-byte binary data from a machine with a different endianness. You can help your client or server implement binary support by adding additional headers and using them to pass parameters, similar to HTTP.
Don’t worry if this doesn’t make sense yet. In the next section, you’ll see how all of this works and fits together.
Don't worry if this doesn't make sense yet. In the next section, you'll see how all of this works and fits together.
Let’s fully define the protocol header. The protocol header is:
Let's fully define the protocol header. The protocol header is:
The required headers, or sub-headers, in the protocol header’s dictionary are as follows:
The required headers, or sub-headers, in the protocol header's dictionary are as follows:
Name | 名称 | Description | 描述 |
---|---|---|---|
byteorder byteorder |
sys.byteorder ). This may not be required for your application.sys.byteorder ). This may not be required for your application. |
||
content-length content-length |
The length of the content in bytes. | The length of the content in bytes. | |
content-type content-type |
text/json or text/json or binary/my-binary-type .binary/my-binary-type . |
||
content-encoding content-encoding |
utf-8 for Unicode text or utf-8 for Unicode text or binary for binary data.binary for binary data. |
These headers inform the receiver about the content in the payload of the message. This allows you to send arbitrary data while providing enough information so the content can be decoded and interpreted correctly by the receiver. Since the headers are in a dictionary, it’s easy to add additional headers by inserting key/value pairs as needed.
These headers inform the receiver about the content in the payload of the message. This allows you to send arbitrary data while providing enough information so the content can be decoded and interpreted correctly by the receiver. Since the headers are in a dictionary, it's easy to add additional headers by inserting key/value pairs as needed.
There’s still a bit of a problem. We have a variable-length header, which is nice and flexible, but how do you know the length of the header when reading it with recv()
?
There's still a bit of a problem. We have a variable-length header, which is nice and flexible, but how do you know the length of the header when reading it with recv()
?
When we previously talked about using recv()
and message boundaries, I mentioned that fixed-length headers can be inefficient. That’s true, but we’re going to use a small, 2-byte, fixed-length header to prefix the JSON header that contains its length.
When we previously talked about using recv()
and message boundaries, I mentioned that fixed-length headers can be inefficient. That's true, but we're going to use a small, 2-byte, fixed-length header to prefix the JSON header that contains its length.
You can think of this as a hybrid approach to sending messages. In effect, we’re bootstrapping the message receive process by sending the length of the header first. This makes it easy for our receiver to deconstruct the message.
You can think of this as a hybrid approach to sending messages. In effect, we're bootstrapping the message receive process by sending the length of the header first. This makes it easy for our receiver to deconstruct the message.
To give you a better idea of the message format, let’s look at a message in its entirety:
To give you a better idea of the message format, let's look at a message in its entirety:
A message starts with a fixed-length header of 2 bytes that’s an integer in network byte order. This is the length of the next header, the variable-length JSON header. Once we’ve read 2 bytes with recv()
, then we know we can process the 2 bytes as an integer and then read that number of bytes before decoding the UTF-8 JSON header.
A message starts with a fixed-length header of 2 bytes that's an integer in network byte order. This is the length of the next header, the variable-length JSON header. Once we've read 2 bytes with recv()
, then we know we can process the 2 bytes as an integer and then read that number of bytes before decoding the UTF-8 JSON header.
The JSON header contains a dictionary of additional headers. One of those is content-length
, which is the number of bytes of the message’s content (not including the JSON header). Once we’ve called recv()
and read content-length
bytes, we’ve reached a message boundary and read an entire message.
The JSON header contains a dictionary of additional headers. One of those is content-length
, which is the number of bytes of the message's content (not including the JSON header). Once we've called recv()
and read content-length
bytes, we've reached a message boundary and read an entire message.
Finally, the payoff! Let’s look at the Message
class and see how it’s used with select()
when read and write events happen on the socket.
Finally, the payoff! Let's look at the Message
class and see how it's used with select()
when read and write events happen on the socket.
For this example application, I had to come up with an idea for what types of messages the client and server would use. We’re far beyond toy echo clients and servers at this point.
For this example application, I had to come up with an idea for what types of messages the client and server would use. We're far beyond toy echo clients and servers at this point.
To keep things simple and still demonstrate how things would work in a real application, I created an application protocol that implements a basic search feature. The client sends a search request and the server does a lookup for a match. If the request sent by the client isn’t recognized as a search, the server assumes it’s a binary request and returns a binary response.
To keep things simple and still demonstrate how things would work in a real application, I created an application protocol that implements a basic search feature. The client sends a search request and the server does a lookup for a match. If the request sent by the client isn't recognized as a search, the server assumes it's a binary request and returns a binary response.
After reading the following sections, running the examples, and experimenting with the code, you’ll see how things work. You can then use the Message
class as a starting point and modify it for you own use.
After reading the following sections, running the examples, and experimenting with the code, you'll see how things work. You can then use the Message
class as a starting point and modify it for you own use.
We’re really not that far off from the “multiconn” client and server example. The event loop code stays the same in app-client.py
and app-server.py
. What I’ve done is move the message code into a class named Message
and added methods to support reading, writing, and processing of the headers and content. This is a great example for using a class.
We're really not that far off from the “multiconn” client and server example. The event loop code stays the same in app-client.py
and app-server.py
. What I've done is move the message code into a class named Message
and added methods to support reading, writing, and processing of the headers and content. This is a great example for using a class.
As we discussed before and you’ll see below, working with sockets involves keeping state. By using a class, we keep all of the state, data, and code bundled together in an organized unit. An instance of the class is created for each socket in the client and server when a connection is started or accepted.
As we discussed before and you'll see below, working with sockets involves keeping state. By using a class, we keep all of the state, data, and code bundled together in an organized unit. An instance of the class is created for each socket in the client and server when a connection is started or accepted.
The class is mostly the same for both the client and the server for the wrapper and utility methods. They start with an underscore, like Message._json_encode()
. These methods simplify working with the class. They help other methods by allowing them to stay shorter and support the DRY principle.
The class is mostly the same for both the client and the server for the wrapper and utility methods. They start with an underscore, like Message._json_encode()
. These methods simplify working with the class. They help other methods by allowing them to stay shorter and support the DRY principle.
The server’s Message
class works in essentially the same way as the client’s and vice-versa. The difference being that the client initiates the connection and sends a request message, followed by processing the server’s response message. Conversely, the server waits for a connection, processes the client’s request message, and then sends a response message.
The server's Message
class works in essentially the same way as the client's and vice-versa. The difference being that the client initiates the connection and sends a request message, followed by processing the server's response message. Conversely, the server waits for a connection, processes the client's request message, and then sends a response message.
It looks like this:
看起来像这样:
Step | 步 | Endpoint | 终点 | Action / Message Content | Action / Message Content |
---|---|---|---|---|---|
1 | 1个 | Client | 客户 | Message containing request contentMessage containing request content |
|
2 | 2 | Server | 服务器 | Message Message |
|
3 | 3 | Server | 服务器 | Message containing response contentMessage containing response content |
|
4 | 4 | Client | 客户 | Message Message |
Here’s the file and code layout:
Here's the file and code layout:
Application | 应用 | File | 文件 | Code | 码 |
---|---|---|---|---|---|
Server | 服务器 | app-server.py app-server.py |
The server’s main script | The server's main script | |
Server | 服务器 | libserver.py libserver.py |
Message classMessage class |
||
Client | 客户 | app-client.py app-client.py |
The client’s main script | The client's main script | |
Client | 客户 | libclient.py libclient.py |
Message classMessage class |
I’d like to discuss how the Message
class works by first mentioning an aspect of its design that wasn’t immediately obvious to me. Only after refactoring it at least five times did I arrive at what it is currently. Why? Managing state.
I'd like to discuss how the Message
class works by first mentioning an aspect of its design that wasn't immediately obvious to me. Only after refactoring it at least five times did I arrive at what it is currently. 为什么? Managing state.
After a Message
object is created, it’s associated with a socket that’s monitored for events using selector.register()
:
After a Message
object is created, it's associated with a socket that's monitored for events using selector.register()
:
message message = = libserverlibserver .. MessageMessage (( selsel , , connconn , , addraddr )
)
selsel .. registerregister (( connconn , , selectorsselectors .. EVENT_READEVENT_READ , , datadata == messagemessage )
)
Note: Some of the code examples in this section are from the server’s main script and Message
class, but this section and discussion applies equally to the client as well. I’ll show and explain the client’s version when it differs.
Note: Some of the code examples in this section are from the server's main script and Message
class, but this section and discussion applies equally to the client as well. I'll show and explain the client's version when it differs.
When events are ready on the socket, they’re returned by selector.select()
. We can then get a reference back to the message object using the data
attribute on the key
object and call a method in Message
:
When events are ready on the socket, they're returned by selector.select()
. We can then get a reference back to the message object using the data
attribute on the key
object and call a method in Message
:
Looking at the event loop above, you’ll see that sel.select()
is in the driver’s seat. It’s blocking, waiting at the top of the loop for events. It’s responsible for waking up when read and write events are ready to be processed on the socket. Which means, indirectly, it’s also responsible for calling the method process_events()
. This is what I mean when I say the method process_events()
is the entry point.
Looking at the event loop above, you'll see that sel.select()
is in the driver's seat. It's blocking, waiting at the top of the loop for events. It's responsible for waking up when read and write events are ready to be processed on the socket. Which means, indirectly, it's also responsible for calling the method process_events()
. This is what I mean when I say the method process_events()
is the entry point.
Let’s see what the process_events()
method does:
Let's see what the process_events()
method does:
def def process_eventsprocess_events (( selfself , , maskmask ):
):
if if mask mask & & selectorsselectors .. EVENT_READEVENT_READ :
:
selfself .. readread ()
()
if if mask mask & & selectorsselectors .. EVENT_WRITEEVENT_WRITE :
:
selfself .. writewrite ()
()
That’s good: process_events()
is simple. It can only do two things: call read()
and write()
.
That's good: process_events()
is simple. It can only do two things: call read()
and write()
.
This brings us back to managing state. After a few refactorings, I decided that if another method depended on state variables having a certain value, then they would only be called from read()
and write()
. This keeps the logic as simple as possible as events come in on the socket for processing.
This brings us back to managing state. After a few refactorings, I decided that if another method depended on state variables having a certain value, then they would only be called from read()
and write()
. This keeps the logic as simple as possible as events come in on the socket for processing.
This may seem obvious, but the first few iterations of the class were a mix of some methods that checked the current state and, depending on their value, called other methods to process data outside read()
or write()
. In the end, this proved too complex to manage and keep up with.
This may seem obvious, but the first few iterations of the class were a mix of some methods that checked the current state and, depending on their value, called other methods to process data outside read()
or write()
. In the end, this proved too complex to manage and keep up with.
You should definitely modify the class to suit your own needs so it works best for you, but I’d recommend that you keep the state checks and the calls to methods that depend on that state to the read()
and write()
methods if possible.
You should definitely modify the class to suit your own needs so it works best for you, but I'd recommend that you keep the state checks and the calls to methods that depend on that state to the read()
and write()
methods if possible.
Let’s look at read()
. This is the server’s version, but the client’s is the same. It just uses a different method name, process_response()
instead of process_request()
:
Let's look at read()
. This is the server's version, but the client's is the same. It just uses a different method name, process_response()
instead of process_request()
:
The _read()
method is called first. It calls socket.recv()
to read data from the socket and store it in a receive buffer.
The _read()
method is called first. It calls socket.recv()
to read data from the socket and store it in a receive buffer.
Remember that when socket.recv()
is called, all of the data that makes up a complete message may not have arrived yet. socket.recv()
may need to be called again. This is why there are state checks for each part of the message before calling the appropriate method to process it.
Remember that when socket.recv()
is called, all of the data that makes up a complete message may not have arrived yet. socket.recv()
may need to be called again. This is why there are state checks for each part of the message before calling the appropriate method to process it.
Before a method processes its part of the message, it first checks to make sure enough bytes have been read into the receive buffer. If there are, it processes its respective bytes, removes them from the buffer and writes its output to a variable that’s used by the next processing stage. Since there are three components to a message, there are three state checks and process
method calls:
Before a method processes its part of the message, it first checks to make sure enough bytes have been read into the receive buffer. If there are, it processes its respective bytes, removes them from the buffer and writes its output to a variable that's used by the next processing stage. Since there are three components to a message, there are three state checks and process
method calls:
Message Component | Message Component | Method | 方法 | Output | 输出量 |
---|---|---|---|---|---|
Fixed-length header | Fixed-length header | process_protoheader() process_protoheader() |
self._jsonheader_len self._jsonheader_len |
||
JSON header | JSON header | process_jsonheader() process_jsonheader() |
self.jsonheader self.jsonheader |
||
Content | 内容 | process_request() process_request() |
self.request self.request |
Next, let’s look at write()
. This is the server’s version:
Next, let's look at write()
. This is the server's version:
def def writewrite (( selfself ):
):
if if selfself .. requestrequest :
:
if if not not selfself .. response_createdresponse_created :
:
selfself .. create_responsecreate_response ()
()
selfself .. _write_write ()
()
write()
checks first for a request
. If one exists and a response hasn’t been created, create_response()
is called. create_response()
sets the state variable response_created
and writes the response to the send buffer.
write()
checks first for a request
. If one exists and a response hasn't been created, create_response()
is called. create_response()
sets the state variable response_created
and writes the response to the send buffer.
The _write()
method calls socket.send()
if there’s data in the send buffer.
The _write()
method calls socket.send()
if there's data in the send buffer.
Remember that when socket.send()
is called, all of the data in the send buffer may not have been queued for transmission. The network buffers for the socket may be full, and socket.send()
may need to be called again. This is why there are state checks. create_response()
should only be called once, but it’s expected that _write()
will need to be called multiple times.
Remember that when socket.send()
is called, all of the data in the send buffer may not have been queued for transmission. The network buffers for the socket may be full, and socket.send()
may need to be called again. This is why there are state checks. create_response()
should only be called once, but it's expected that _write()
will need to be called multiple times.
The client version of write()
is similar:
The client version of write()
is similar:
Since the client initiates a connection to the server and sends a request first, the state variable _request_queued
is checked. If a request hasn’t been queued, it calls queue_request()
. queue_request()
creates the request and writes it to the send buffer. It also sets the state variable _request_queued
so it’s only called once.
Since the client initiates a connection to the server and sends a request first, the state variable _request_queued
is checked. If a request hasn't been queued, it calls queue_request()
. queue_request()
creates the request and writes it to the send buffer. It also sets the state variable _request_queued
so it's only called once.
Just like the server, _write()
calls socket.send()
if there’s data in the send buffer.
Just like the server, _write()
calls socket.send()
if there's data in the send buffer.
The notable difference in the client’s version of write()
is the last check to see if the request has been queued. This will be explained more in the section Client Main Script, but the reason for this is to tell selector.select()
to stop monitoring the socket for write events. If the request has been queued and the send buffer is empty, then we’re done writing and we’re only interested in read events. There’s no reason to be notified that the socket is writable.
The notable difference in the client's version of write()
is the last check to see if the request has been queued. This will be explained more in the section Client Main Script , but the reason for this is to tell selector.select()
to stop monitoring the socket for write events. If the request has been queued and the send buffer is empty, then we're done writing and we're only interested in read events. There's no reason to be notified that the socket is writable.
I’ll wrap up this section by leaving you with one thought. The main purpose of this section was to explain that selector.select()
is calling into the Message
class via the method process_events()
and to describe how state is managed.
I'll wrap up this section by leaving you with one thought. The main purpose of this section was to explain that selector.select()
is calling into the Message
class via the method process_events()
and to describe how state is managed.
This is important because process_events()
will be called many times over the life of the connection. Therefore, make sure that any methods that should only be called once are either checking a state variable themselves, or the state variable set by the method is checked by the caller.
This is important because process_events()
will be called many times over the life of the connection. Therefore, make sure that any methods that should only be called once are either checking a state variable themselves, or the state variable set by the method is checked by the caller.
In the server’s main script app-server.py
, arguments are read from the command line that specify the interface and port to listen on:
In the server's main script app-server.py
, arguments are read from the command line that specify the interface and port to listen on:
$ ./app-server.py
$ ./app-server.py
usage: ./app-server.py
usage: ./app-server.py
For example, to listen on the loopback interface on port 65432
, enter:
For example, to listen on the loopback interface on port 65432
, enter:
Use an empty string for
to listen on all interfaces.
Use an empty string for
to listen on all interfaces.
After creating the socket, a call is made to socket.setsockopt()
with the option socket.SO_REUSEADDR
:
After creating the socket, a call is made to socket.setsockopt()
with the option socket.SO_REUSEADDR
:
# Avoid bind() exception: OSError: [Errno 48] Address already in use
# Avoid bind() exception: OSError: [Errno 48] Address already in use
lsocklsock .. setsockoptsetsockopt (( socketsocket .. SOL_SOCKETSOL_SOCKET , , socketsocket .. SO_REUSEADDRSO_REUSEADDR , , 11 )
)
Setting this socket option avoids the error Address already in use
. You’ll see this when starting the server and a previously used TCP socket on the same port has connections in the TIME_WAIT state.
Setting this socket option avoids the error Address already in use
. You'll see this when starting the server and a previously used TCP socket on the same port has connections in the TIME_WAIT state.
For example, if the server actively closed a connection, it will remain in the TIME_WAIT
state for two minutes or more, depending on the operating system. If you try to start the server again before the TIME_WAIT
state expires, you’ll get an OSError
exception of Address already in use
. This is a safeguard to make sure that any delayed packets in the network aren’t delivered to the wrong application.
For example, if the server actively closed a connection, it will remain in the TIME_WAIT
state for two minutes or more, depending on the operating system. If you try to start the server again before the TIME_WAIT
state expires, you'll get an OSError
exception of Address already in use
. This is a safeguard to make sure that any delayed packets in the network aren't delivered to the wrong application.
The event loop catches any errors so the server can stay up and continue to run:
The event loop catches any errors so the server can stay up and continue to run:
When a client connection is accepted, a Message
object is created:
When a client connection is accepted, a Message
object is created:
def def accept_wrapperaccept_wrapper (( socksock ):
):
connconn , , addr addr = = socksock .. acceptaccept () () # Should be ready to read
# Should be ready to read
printprint (( 'accepted connection from''accepted connection from' , , addraddr )
)
connconn .. setblockingsetblocking (( FalseFalse )
)
message message = = libserverlibserver .. MessageMessage (( selsel , , connconn , , addraddr )
)
selsel .. registerregister (( connconn , , selectorsselectors .. EVENT_READEVENT_READ , , datadata == messagemessage )
)
The Message
object is associated with the socket in the call to sel.register()
and is initially set to be monitored for read events only. Once the request has been read, we’ll modify it to listen for write events only.
The Message
object is associated with the socket in the call to sel.register()
and is initially set to be monitored for read events only. Once the request has been read, we'll modify it to listen for write events only.
An advantage of taking this approach in the server is that in most cases, when a socket is healthy and there are no network issues, it will always be writable.
An advantage of taking this approach in the server is that in most cases, when a socket is healthy and there are no network issues, it will always be writable.
If we told sel.register()
to also monitor EVENT_WRITE
, the event loop would immediately wakeup and notify us that this is the case. However, at this point, there’s no reason to wake up and call send()
on the socket. There’s no response to send since a request hasn’t been processed yet. This would consume and waste valuable CPU cycles.
If we told sel.register()
to also monitor EVENT_WRITE
, the event loop would immediately wakeup and notify us that this is the case. However, at this point, there's no reason to wake up and call send()
on the socket. There's no response to send since a request hasn't been processed yet. This would consume and waste valuable CPU cycles.
In the section Message Entry Point, we looked at how the Message
object was called into action when socket events were ready via process_events()
. Now let’s look at what happens as data is read on the socket and a component, or piece, of the message is ready to be processed by the server.
In the section Message Entry Point , we looked at how the Message
object was called into action when socket events were ready via process_events()
. Now let's look at what happens as data is read on the socket and a component, or piece, of the message is ready to be processed by the server.
The server’s message class is in libserver.py
. You can find the source code on GitHub.
The server's message class is in libserver.py
. You can find the source code on GitHub .
The methods appear in the class in the order in which processing takes place for a message.
The methods appear in the class in the order in which processing takes place for a message.
When the server has read at least 2 bytes, the fixed-length header can be processed:
When the server has read at least 2 bytes, the fixed-length header can be processed:
The fixed-length header is a 2-byte integer in network (big-endian) byte order that contains the length of the JSON header. struct.unpack() is used to read the value, decode it, and store it in self._jsonheader_len
. After processing the piece of the message it’s responsible for, process_protoheader()
removes it from the receive buffer.
The fixed-length header is a 2-byte integer in network (big-endian) byte order that contains the length of the JSON header. struct.unpack() is used to read the value, decode it, and store it in self._jsonheader_len
. After processing the piece of the message it's responsible for, process_protoheader()
removes it from the receive buffer.
Just like the fixed-length header, when there’s enough data in the receive buffer to contain the JSON header, it can be processed as well:
Just like the fixed-length header, when there's enough data in the receive buffer to contain the JSON header, it can be processed as well:
def def process_jsonheaderprocess_jsonheader (( selfself ):
):
hdrlen hdrlen = = selfself .. _jsonheader_len
_jsonheader_len
if if lenlen (( selfself .. _recv_buffer_recv_buffer ) ) >= >= hdrlenhdrlen :
:
selfself .. jsonheader jsonheader = = selfself .. _json_decode_json_decode (( selfself .. _recv_buffer_recv_buffer [:[: hdrlenhdrlen ],
],
'utf-8''utf-8' )
)
selfself .. _recv_buffer _recv_buffer = = selfself .. _recv_buffer_recv_buffer [[ hdrlenhdrlen :]
:]
for for reqhdr reqhdr in in (( 'byteorder''byteorder' , , 'content-length''content-length' , , 'content-type''content-type' ,
,
'content-encoding''content-encoding' ):
):
if if reqhdr reqhdr not not in in selfself .. jsonheaderjsonheader :
:
raise raise ValueErrorValueError (( ff 'Missing required header "'Missing required header " {reqhdr}{reqhdr} ".'".' )
)
The method self._json_decode()
is called to decode and deserialize the JSON header into a dictionary. Since the JSON header is defined as Unicode with a UTF-8 encoding, utf-8
is hardcoded in the call. The result is saved to self.jsonheader
. After processing the piece of the message it’s responsible for, process_jsonheader()
removes it from the receive buffer.
The method self._json_decode()
is called to decode and deserialize the JSON header into a dictionary. Since the JSON header is defined as Unicode with a UTF-8 encoding, utf-8
is hardcoded in the call. The result is saved to self.jsonheader
. After processing the piece of the message it's responsible for, process_jsonheader()
removes it from the receive buffer.
Next is the actual content, or payload, of the message. It’s described by the JSON header in self.jsonheader
. When content-length
bytes are available in the receive buffer, the request can be processed:
Next is the actual content, or payload, of the message. It's described by the JSON header in self.jsonheader
. When content-length
bytes are available in the receive buffer, the request can be processed:
After saving the message content to the data
variable, process_request()
removes it from the receive buffer. Then, if the content type is JSON, it decodes and deserializes it. If it’s not, for this example application, it assumes it’s a binary request and simply prints the content type.
After saving the message content to the data
variable, process_request()
removes it from the receive buffer. Then, if the content type is JSON, it decodes and deserializes it. If it's not, for this example application, it assumes it's a binary request and simply prints the content type.
The last thing process_request()
does is modify the selector to monitor write events only. In the server’s main script, app-server.py
, the socket is initially set to monitor read events only. Now that the request has been fully processed, we’re no longer interested in reading.
The last thing process_request()
does is modify the selector to monitor write events only. In the server's main script, app-server.py
, the socket is initially set to monitor read events only. Now that the request has been fully processed, we're no longer interested in reading.
A response can now be created and written to the socket. When the socket is writable, create_response()
is called from write()
:
A response can now be created and written to the socket. When the socket is writable, create_response()
is called from write()
:
def def create_responsecreate_response (( selfself ):
):
if if selfself .. jsonheaderjsonheader [[ 'content-type''content-type' ] ] == == 'text/json''text/json' :
:
response response = = selfself .. _create_response_json_content_create_response_json_content ()
()
elseelse :
:
# Binary or unknown content-type
# Binary or unknown content-type
response response = = selfself .. _create_response_binary_content_create_response_binary_content ()
()
message message = = selfself .. _create_message_create_message (( **** responseresponse )
)
selfself .. response_created response_created = = True
True
selfself .. _send_buffer _send_buffer += += message
message
A response is created by calling other methods, depending on the content type. In this example application, a simple dictionary lookup is done for JSON requests when action == 'search'
. You can define other methods for your own applications that get called here.
A response is created by calling other methods, depending on the content type. In this example application, a simple dictionary lookup is done for JSON requests when action == 'search'
. You can define other methods for your own applications that get called here.
After creating the response message, the state variable self.response_created
is set so write()
doesn’t call create_response()
again. Finally, the response is appended to the send buffer. This is seen by and sent via _write()
.
After creating the response message, the state variable self.response_created
is set so write()
doesn't call create_response()
again. Finally, the response is appended to the send buffer. This is seen by and sent via _write()
.
One tricky bit to figure out was how to close the connection after the response is written. I put the call to close()
in the method _write()
:
One tricky bit to figure out was how to close the connection after the response is written. I put the call to close()
in the method _write()
:
Although it’s somewhat “hidden,” I think it’s an acceptable trade-off given that the Message
class only handles one message per connection. After the response is written, there’s nothing left for the server to do. It’s completed its work.
Although it's somewhat “hidden,” I think it's an acceptable trade-off given that the Message
class only handles one message per connection. After the response is written, there's nothing left for the server to do. It's completed its work.
In the client’s main script app-client.py
, arguments are read from the command line and used to create requests and start connections to the server:
In the client's main script app-client.py
, arguments are read from the command line and used to create requests and start connections to the server:
$ ./app-client.py
$ ./app-client.py
usage: ./app-client.py
usage: ./app-client.py
Here’s an example:
这是一个例子:
After creating a dictionary representing the request from the command-line arguments, the host, port, and request dictionary are passed to start_connection()
:
After creating a dictionary representing the request from the command-line arguments, the host, port, and request dictionary are passed to start_connection()
:
def def start_connectionstart_connection (( hosthost , , portport , , requestrequest ):
):
addr addr = = (( hosthost , , portport )
)
printprint (( 'starting connection to''starting connection to' , , addraddr )
)
sock sock = = socketsocket .. socketsocket (( socketsocket .. AF_INETAF_INET , , socketsocket .. SOCK_STREAMSOCK_STREAM )
)
socksock .. setblockingsetblocking (( FalseFalse )
)
socksock .. connect_exconnect_ex (( addraddr )
)
events events = = selectorsselectors .. EVENT_READ EVENT_READ | | selectorsselectors .. EVENT_WRITE
EVENT_WRITE
message message = = libclientlibclient .. MessageMessage (( selsel , , socksock , , addraddr , , requestrequest )
)
selsel .. registerregister (( socksock , , eventsevents , , datadata == messagemessage )
)
A socket is created for the server connection as well as a Message
object using the request
dictionary.
A socket is created for the server connection as well as a Message
object using the request
dictionary.
Like the server, the Message
object is associated with the socket in the call to sel.register()
. However, for the client, the socket is initially set to be monitored for both read and write events. Once the request has been written, we’ll modify it to listen for read events only.
Like the server, the Message
object is associated with the socket in the call to sel.register()
. However, for the client, the socket is initially set to be monitored for both read and write events. Once the request has been written, we'll modify it to listen for read events only.
This approach gives us the same advantage as the server: not wasting CPU cycles. After the request has been sent, we’re no longer interested in write events, so there’s no reason to wake up and process them.
This approach gives us the same advantage as the server: not wasting CPU cycles. After the request has been sent, we're no longer interested in write events, so there's no reason to wake up and process them.
In the section Message Entry Point, we looked at how the message object was called into action when socket events were ready via process_events()
. Now let’s look at what happens after data is read and written on the socket and a message is ready to be processed by the client.
In the section Message Entry Point , we looked at how the message object was called into action when socket events were ready via process_events()
. Now let's look at what happens after data is read and written on the socket and a message is ready to be processed by the client.
The client’s message class is in libclient.py
. You can find the source code on GitHub.
The client's message class is in libclient.py
. You can find the source code on GitHub .
The methods appear in the class in the order in which processing takes place for a message.
The methods appear in the class in the order in which processing takes place for a message.
The first task for the client is to queue the request:
The first task for the client is to queue the request:
The dictionaries used to create the request, depending on what was passed on the command line, are in the client’s main script, app-client.py
. The request dictionary is passed as an argument to the class when a Message
object is created.
The dictionaries used to create the request, depending on what was passed on the command line, are in the client's main script, app-client.py
. The request dictionary is passed as an argument to the class when a Message
object is created.
The request message is created and appended to the send buffer, which is then seen by and sent via _write()
. The state variable self._request_queued
is set so queue_request()
isn’t called again.
The request message is created and appended to the send buffer, which is then seen by and sent via _write()
. The state variable self._request_queued
is set so queue_request()
isn't called again.
After the request has been sent, the client waits for a response from the server.
After the request has been sent, the client waits for a response from the server.
The methods for reading and processing a message in the client are the same as the server. As response data is read from the socket, the process
header methods are called: process_protoheader()
and process_jsonheader()
.
The methods for reading and processing a message in the client are the same as the server. As response data is read from the socket, the process
header methods are called: process_protoheader()
and process_jsonheader()
.
The difference is in the naming of the final process
methods and the fact that they’re processing a response, not creating one: process_response()
, _process_response_json_content()
, and _process_response_binary_content()
.
The difference is in the naming of the final process
methods and the fact that they're processing a response, not creating one: process_response()
, _process_response_json_content()
, and _process_response_binary_content()
.
Last, but certainly not least, is the final call for process_response()
:
Last, but certainly not least, is the final call for process_response()
:
def def process_responseprocess_response (( selfself ):
):
# ...
# ...
# Close when response has been processed
# Close when response has been processed
selfself .. closeclose ()
()
I’ll conclude the Message
class discussion by mentioning a couple of things that are important to notice with a few of the supporting methods.
I'll conclude the Message
class discussion by mentioning a couple of things that are important to notice with a few of the supporting methods.
Any exceptions raised by the class are caught by the main script in its except
clause:
Any exceptions raised by the class are caught by the main script in its except
clause:
Note the last line: message.close()
.
Note the last line: message.close()
.
This is a really important line, for more than one reason! Not only does it make sure that the socket is closed, but message.close()
also removes the socket from being monitored by select()
. This greatly simplifies the code in the class and reduces complexity. If there’s an exception or we explicitly raise one ourselves, we know close()
will take care of the cleanup.
This is a really important line, for more than one reason! Not only does it make sure that the socket is closed, but message.close()
also removes the socket from being monitored by select()
. This greatly simplifies the code in the class and reduces complexity. If there's an exception or we explicitly raise one ourselves, we know close()
will take care of the cleanup.
The methods Message._read()
and Message._write()
also contain something interesting:
The methods Message._read()
and Message._write()
also contain something interesting:
def def _read_read (( selfself ):
):
trytry :
:
# Should be ready to read
# Should be ready to read
data data = = selfself .. socksock .. recvrecv (( 40964096 )
)
except except BlockingIOErrorBlockingIOError :
:
# Resource temporarily unavailable (errno EWOULDBLOCK)
# Resource temporarily unavailable (errno EWOULDBLOCK)
pass
pass
elseelse :
:
if if datadata :
:
selfself .. _recv_buffer _recv_buffer += += data
data
elseelse :
:
raise raise RuntimeErrorRuntimeError (( 'Peer closed.''Peer closed.' )
)
Note the except
line: except BlockingIOError:
.
Note the except
line: except BlockingIOError:
.
_write()
has one too. These lines are important because they catch a temporary error and skip over it using pass
. The temporary error is when the socket would block, for example if it’s waiting on the network or the other end of the connection (its peer).
_write()
has one too. These lines are important because they catch a temporary error and skip over it using pass
. The temporary error is when the socket would block , for example if it's waiting on the network or the other end of the connection (its peer).
By catching and skipping over the exception with pass
, select()
will eventually call us again, and we’ll get another chance to read or write the data.
By catching and skipping over the exception with pass
, select()
will eventually call us again, and we'll get another chance to read or write the data.
After all of this hard work, let’s have some fun and run some searches!
After all of this hard work, let's have some fun and run some searches!
In these examples, I’ll run the server so it listens on all interfaces by passing an empty string for the host
argument. This will allow me to run the client and connect from a virtual machine that’s on another network. It emulates a big-endian PowerPC machine.
In these examples, I'll run the server so it listens on all interfaces by passing an empty string for the host
argument. This will allow me to run the client and connect from a virtual machine that's on another network. It emulates a big-endian PowerPC machine.
First, let’s start the server:
First, let's start the server:
Now let’s run the client and enter a search. Let’s see if we can find him:
Now let's run the client and enter a search. Let's see if we can find him:
$ ./app-client.py $ ./app-client.py 10.0.1.1 10 .0.1.1 65432 search morpheus
65432 search morpheus
starting connection to ('10.0.1.1', 65432)
starting connection to ('10.0.1.1', 65432)
sending b'x00d{"byteorder": "big", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 41}{"action": "search", "value": "morpheus"}' to ('10.0.1.1', 65432)
sending b'x00d{"byteorder": "big", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 41}{"action": "search", "value": "morpheus"}' to ('10.0.1.1', 65432)
received response {'result': 'Follow the white rabbit. '} from ('10.0.1.1', 65432)
received response {'result': 'Follow the white rabbit. '} from ('10.0.1.1', 65432)
got result: Follow the white rabbit.
got result: Follow the white rabbit.
closing connection to ('10.0.1.1', 65432)
closing connection to ('10.0.1.1', 65432)
My terminal is running a shell that’s using a text encoding of Unicode (UTF-8), so the output above prints nicely with emojis.
My terminal is running a shell that's using a text encoding of Unicode (UTF-8), so the output above prints nicely with emojis.
Let’s see if we can find the puppies:
Let's see if we can find the puppies:
Notice the byte string sent over the network for the request in the sending
line. It’s easier to see if you look for the bytes printed in hex that represent the puppy emoji: xf0x9fx90xb6
. I was able to enter the emoji for the search since my terminal is using Unicode with the encoding UTF-8.
Notice the byte string sent over the network for the request in the sending
line. It's easier to see if you look for the bytes printed in hex that represent the puppy emoji: xf0x9fx90xb6
. I was able to enter the emoji for the search since my terminal is using Unicode with the encoding UTF-8.
This demonstrates that we’re sending raw bytes over the network and they need to be decoded by the receiver to be interpreted correctly. This is why we went to all of the trouble to create a header that contains the content type and encoding.
This demonstrates that we're sending raw bytes over the network and they need to be decoded by the receiver to be interpreted correctly. This is why we went to all of the trouble to create a header that contains the content type and encoding.
Here’s the server output from both client connections above:
Here's the server output from both client connections above:
accepted connection from ('10.0.2.2', 55340)
accepted connection from ('10.0.2.2', 55340)
received request {'action': 'search', 'value': 'morpheus'} from ('10.0.2.2', 55340)
received request {'action': 'search', 'value': 'morpheus'} from ('10.0.2.2', 55340)
sending b'x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 43}{"result": "Follow the white rabbit. xf0x9fx90xb0"}' to ('10.0.2.2', 55340)
sending b'x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 43}{"result": "Follow the white rabbit. xf0x9fx90xb0"}' to ('10.0.2.2', 55340)
closing connection to ('10.0.2.2', 55340)
closing connection to ('10.0.2.2', 55340)
accepted connection from ('10.0.2.2', 55338)
accepted connection from ('10.0.2.2', 55338)
received request {'action': 'search', 'value': ''} from ('10.0.2.2', 55338)
received request {'action': 'search', 'value': ''} from ('10.0.2.2', 55338)
sending b'x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 37}{"result": "xf0x9fx90xbe Playing ball! xf0x9fx8fx90"}' to ('10.0.2.2', 55338)
sending b'x00g{"byteorder": "little", "content-type": "text/json", "content-encoding": "utf-8", "content-length": 37}{"result": "xf0x9fx90xbe Playing ball! xf0x9fx8fx90"}' to ('10.0.2.2', 55338)
closing connection to ('10.0.2.2', 55338)
closing connection to ('10.0.2.2', 55338)
Look at the sending
line to see the bytes that were written to the client’s socket. This is the server’s response message.
Look at the sending
line to see the bytes that were written to the client's socket. This is the server's response message.
You can also test sending binary requests to the server if the action
argument is anything other than search
:
You can also test sending binary requests to the server if the action
argument is anything other than search
:
Since the request’s content-type
is not text/json
, the server treats it as a custom binary type and doesn’t perform JSON decoding. It simply prints the content-type
and returns the first 10 bytes to the client:
Since the request's content-type
is not text/json
, the server treats it as a custom binary type and doesn't perform JSON decoding. It simply prints the content-type
and returns the first 10 bytes to the client:
$ ./app-server.py $ ./app-server.py '' '' 65432
65432
listening on ('', 65432)
listening on ('', 65432)
accepted connection from ('10.0.2.2', 55320)
accepted connection from ('10.0.2.2', 55320)
received binary/custom-client-binary-type request from ('10.0.2.2', 55320)
received binary/custom-client-binary-type request from ('10.0.2.2', 55320)
sending b'x00x7f{"byteorder": "little", "content-type": "binary/custom-server-binary-type", "content-encoding": "binary", "content-length": 37}First 10 bytes of request: binaryxf0x9fx98x83' to ('10.0.2.2', 55320)
sending b'x00x7f{"byteorder": "little", "content-type": "binary/custom-server-binary-type", "content-encoding": "binary", "content-length": 37}First 10 bytes of request: binaryxf0x9fx98x83' to ('10.0.2.2', 55320)
closing connection to ('10.0.2.2', 55320)
closing connection to ('10.0.2.2', 55320)
Inevitably, something won’t work, and you’ll be wondering what to do. Don’t worry, it happens to all of us. Hopefully, with the help of this tutorial, your debugger, and favorite search engine, you’ll be able to get going again with the source code part.
Inevitably, something won't work, and you'll be wondering what to do. Don't worry, it happens to all of us. Hopefully, with the help of this tutorial, your debugger, and favorite search engine, you'll be able to get going again with the source code part.
If not, your first stop should be Python’s socket module documentation. Make sure you read all of the documentation for each function or method you’re calling. Also, read through the Reference section for ideas. In particular, check the Errors section.
If not, your first stop should be Python's socket module documentation. Make sure you read all of the documentation for each function or method you're calling. Also, read through the Reference section for ideas. In particular, check the Errors section.
Sometimes, it’s not all about the source code. The source code might be correct, and it’s just the other host, the client or server. Or it could be the network, for example, a router, firewall, or some other networking device that’s playing man-in-the-middle.
Sometimes, it's not all about the source code. The source code might be correct, and it's just the other host, the client or server. Or it could be the network, for example, a router, firewall, or some other networking device that's playing man-in-the-middle.
For these types of issues, additional tools are essential. Below are a few tools and utilities that might help or at least provide some clues.
For these types of issues, additional tools are essential. Below are a few tools and utilities that might help or at least provide some clues.
ping
will check if a host is alive and connected to the network by sending an ICMP echo request. It communicates directly with the operating system’s TCP/IP protocol stack, so it works independently from any application running on the host.
ping
will check if a host is alive and connected to the network by sending an ICMP echo request. It communicates directly with the operating system's TCP/IP protocol stack, so it works independently from any application running on the host.
Below is an example of running ping on macOS:
Below is an example of running ping on macOS:
Note the statistics at the end of the output. This can be helpful when you’re trying to discover intermittent connectivity problems. For example, is there any packet loss? How much latency is there (see the round-trip times)?
Note the statistics at the end of the output. This can be helpful when you're trying to discover intermittent connectivity problems. For example, is there any packet loss? How much latency is there (see the round-trip times)?
If there’s a firewall between you and the other host, a ping’s echo request may not be allowed. Some firewall administrators implement policies that enforce this. The idea being that they don’t want their hosts to be discoverable. If this is the case and you have firewall rules added to allow the hosts to communicate, make sure that the rules also allow ICMP to pass between them.
If there's a firewall between you and the other host, a ping's echo request may not be allowed. Some firewall administrators implement policies that enforce this. The idea being that they don't want their hosts to be discoverable. If this is the case and you have firewall rules added to allow the hosts to communicate, make sure that the rules also allow ICMP to pass between them.
ICMP is the protocol used by ping
, but it’s also the protocol TCP and other lower-level protocols use to communicate error messages. If you’re experiencing strange behavior or slow connections, this could be the reason.
ICMP is the protocol used by ping
, but it's also the protocol TCP and other lower-level protocols use to communicate error messages. If you're experiencing strange behavior or slow connections, this could be the reason.
ICMP messages are identified by type and code. To give you an idea of the important information they carry, here are a few:
ICMP messages are identified by type and code. To give you an idea of the important information they carry, here are a few:
ICMP Type | ICMP Type | ICMP Code | ICMP Code | Description | 描述 |
---|---|---|---|---|---|
8 | 8 | 0 | 0 | Echo request | Echo request |
0 | 0 | 0 | 0 | Echo reply | Echo reply |
3 | 3 | 0 | 0 | Destination network unreachable | Destination network unreachable |
3 | 3 | 1 | 1个 | Destination host unreachable | Destination host unreachable |
3 | 3 | 2 | 2 | Destination protocol unreachable | Destination protocol unreachable |
3 | 3 | 3 | 3 | Destination port unreachable | Destination port unreachable |
3 | 3 | 4 | 4 | Fragmentation required, and DF flag set | Fragmentation required, and DF flag set |
11 | 11 | 0 | 0 | TTL expired in transit | TTL expired in transit |
See the article Path MTU Discovery for information regarding fragmentation and ICMP messages. This is an example of something that can cause strange behavior that I mentioned previously.
See the article Path MTU Discovery for information regarding fragmentation and ICMP messages. This is an example of something that can cause strange behavior that I mentioned previously.
In the section Viewing Socket State, we looked at how netstat
can be used to display information about sockets and their current state. This utility is available on macOS, Linux, and Windows.
In the section Viewing Socket State , we looked at how netstat
can be used to display information about sockets and their current state. This utility is available on macOS, Linux, and Windows.
I didn’t mention the columns Recv-Q
and Send-Q
in the example output. These columns will show you the number of bytes that are held in network buffers that are queued for transmission or receipt, but for some reason haven’t been read or written by the remote or local application.
I didn't mention the columns Recv-Q
and Send-Q
in the example output. These columns will show you the number of bytes that are held in network buffers that are queued for transmission or receipt, but for some reason haven't been read or written by the remote or local application.
In other words, the bytes are waiting in network buffers in the operating system’s queues. One reason could be the application is CPU bound or is otherwise unable to call socket.recv()
or socket.send()
and process the bytes. Or there could be network issues affecting communications like congestion or failing network hardware or cabling.
In other words, the bytes are waiting in network buffers in the operating system's queues. One reason could be the application is CPU bound or is otherwise unable to call socket.recv()
or socket.send()
and process the bytes. Or there could be network issues affecting communications like congestion or failing network hardware or cabling.
To demonstrate this and see how much data I could send before seeing an error, I wrote a test client that connects to a test server and repeatedly calls socket.send()
. The test server never calls socket.recv()
. It just accepts the connection. This causes the network buffers on the server to fill, which eventually raises an error on the client.
To demonstrate this and see how much data I could send before seeing an error, I wrote a test client that connects to a test server and repeatedly calls socket.send()
. The test server never calls socket.recv()
. It just accepts the connection. This causes the network buffers on the server to fill, which eventually raises an error on the client.
First, I started the server:
First, I started the server:
$ ./app-server-test.py $ ./app-server-test.py 127.0.0.1 127 .0.0.1 65432
65432
listening on ('127.0.0.1', 65432)
listening on ('127.0.0.1', 65432)
Then I ran the client. Let’s see what the error is:
Then I ran the client. Let's see what the error is:
Here’s the netstat
output while the client and server were still running, with the client printing out the error message above multiple times:
Here's the netstat
output while the client and server were still running, with the client printing out the error message above multiple times:
$ netstat -an $ netstat -an | grep | grep 65432
65432
Proto Recv-Q Send-Q Local Address Foreign Address (state)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 408300 0 127.0.0.1.65432 127.0.0.1.53225 ESTABLISHED
tcp4 408300 0 127.0.0.1.65432 127.0.0.1.53225 ESTABLISHED
tcp4 0 269868 127.0.0.1.53225 127.0.0.1.65432 ESTABLISHED
tcp4 0 269868 127.0.0.1.53225 127.0.0.1.65432 ESTABLISHED
tcp4 0 0 127.0.0.1.65432 *.* LISTEN
tcp4 0 0 127.0.0.1.65432 *.* LISTEN
The first entry is the server (Local Address
has port 65432):
The first entry is the server ( Local Address
has port 65432):
Notice the Recv-Q
: 408300
.
Notice the Recv-Q
: 408300
.
The second entry is the client (Foreign Address
has port 65432):
The second entry is the client ( Foreign Address
has port 65432):
Proto Recv-Q Send-Q Local Address Foreign Address (state)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 269868 127.0.0.1.53225 127.0.0.1.65432 ESTABLISHED
tcp4 0 269868 127.0.0.1.53225 127.0.0.1.65432 ESTABLISHED
Notice the Send-Q
: 269868
.
Notice the Send-Q
: 269868
.
The client sure was trying to write bytes, but the server wasn’t reading them. This caused the server’s network buffer queue to fill on the receive side and the client’s network buffer queue to fill on the send side.
The client sure was trying to write bytes, but the server wasn't reading them. This caused the server's network buffer queue to fill on the receive side and the client's network buffer queue to fill on the send side.
If you work with Windows, there’s a suite of utilities that you should definitely check out if you haven’t already: Windows Sysinternals.
If you work with Windows, there's a suite of utilities that you should definitely check out if you haven't already: Windows Sysinternals .
One of them is TCPView.exe
. TCPView is a graphical netstat
for Windows. In addition to addresses, port numbers, and socket state, it will show you running totals for the number of packets and bytes, sent and received. Like the Unix utility lsof
, you also get the process name and ID. Check the menus for other display options.
One of them is TCPView.exe
. TCPView is a graphical netstat
for Windows. In addition to addresses, port numbers, and socket state, it will show you running totals for the number of packets and bytes, sent and received. Like the Unix utility lsof
, you also get the process name and ID. Check the menus for other display options.
Sometimes you need to see what’s happening on the wire. Forget about what the application log says or what the value is that’s being returned from a library call. You want to see what’s actually being sent or received on the network. Just like debuggers, when you need to see it, there’s no substitute.
Sometimes you need to see what's happening on the wire. Forget about what the application log says or what the value is that's being returned from a library call. You want to see what's actually being sent or received on the network. Just like debuggers, when you need to see it, there's no substitute.
Wireshark is a network protocol analyzer and traffic capture application that runs on macOS, Linux, and Windows, among others. There’s a GUI version named wireshark
, and also a terminal, text-based version named tshark
.
Wireshark is a network protocol analyzer and traffic capture application that runs on macOS, Linux, and Windows, among others. There's a GUI version named wireshark
, and also a terminal, text-based version named tshark
.
Running a traffic capture is a great way to watch how an application behaves on the network and gather evidence about what it sends and receives, and how often and how much. You’ll also be able to see when a client or server closes or aborts a connection or stops responding. This information can be extremely helpful when you are troubleshooting.
Running a traffic capture is a great way to watch how an application behaves on the network and gather evidence about what it sends and receives, and how often and how much. You'll also be able to see when a client or server closes or aborts a connection or stops responding. This information can be extremely helpful when you are troubleshooting.
There are many good tutorials and other resources on the web that will walk you through the basics of using Wireshark and TShark.
There are many good tutorials and other resources on the web that will walk you through the basics of using Wireshark and TShark.
Here’s an example of a traffic capture using Wireshark on the loopback interface:
Here's an example of a traffic capture using Wireshark on the loopback interface:
Here’s the same example shown above using tshark
:
Here's the same example shown above using tshark
:
This section serves as a general reference with additional information and links to external resources.
This section serves as a general reference with additional information and links to external resources.
The following is from Python’s socket
module documentation:
The following is from Python's socket
module documentation:
“All errors raise exceptions. The normal exceptions for invalid argument types and out-of-memory conditions can be raised; starting from Python 3.3, errors related to socket or address semantics raise
OSError
or one of its subclasses.” (Source)“All errors raise exceptions. The normal exceptions for invalid argument types and out-of-memory conditions can be raised; starting from Python 3.3, errors related to socket or address semantics raise
OSError
or one of its subclasses.” (资源)
Here are some common errors you’ll probably encounter when working with sockets:
Here are some common errors you'll probably encounter when working with sockets:
Exception | 例外 | errno Constanterrno Constant |
Description | 描述 | |
---|---|---|---|---|---|
BlockingIOError | BlockingIOError | EWOULDBLOCK | EWOULDBLOCK | send() and the peer is busy and not reading, the send queue (network buffer) is full. Or there are issues with the network. Hopefully this is a temporary condition.send() and the peer is busy and not reading, the send queue (network buffer) is full. Or there are issues with the network. Hopefully this is a temporary condition. |
|
OSError | OSError | EADDRINUSE | EADDRINUSE | SO_REUSEADDR : SO_REUSEADDR : socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) .socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) . |
|
ConnectionResetError | ConnectionResetError | ECONNRESET | ECONNRESET | Connection reset by peer. The remote process crashed or did not close its socket properly (unclean shutdown). Or there’s a firewall or other device in the network path that’s missing rules or misbehaving. | Connection reset by peer. The remote process crashed or did not close its socket properly (unclean shutdown). Or there's a firewall or other device in the network path that's missing rules or misbehaving. |
TimeoutError | TimeoutError | ETIMEDOUT | ETIMEDOUT | Operation timed out. No response from peer. | Operation timed out. No response from peer. |
ConnectionRefusedError | ConnectionRefusedError | ECONNREFUSED | ECONNREFUSED | Connection refused. No application listening on specified port. | Connection refused. No application listening on specified port. |
socket.AF_INET
and socket.AF_INET6
represent the address and protocol families used for the first argument to socket.socket()
. APIs that use an address expect it to be in a certain format, depending on whether the socket was created with socket.AF_INET
or socket.AF_INET6
.
socket.AF_INET
and socket.AF_INET6
represent the address and protocol families used for the first argument to socket.socket()
. APIs that use an address expect it to be in a certain format, depending on whether the socket was created with socket.AF_INET
or socket.AF_INET6
.
Address Family | Address Family | Protocol | 协议 | Address Tuple | Address Tuple | Description | 描述 |
---|---|---|---|---|---|---|---|
socket.AF_INET socket.AF_INET |
IPv4 | IPv4 | (host, port) (host, port) |
host is a string with a hostname like host is a string with a hostname like 'www.example.com' or an IPv4 address like 'www.example.com' or an IPv4 address like '10.1.2.3' . '10.1.2.3' . port is an integer.port is an integer. |
|||
socket.AF_INET6 socket.AF_INET6 |
IPv6 | IPv6 | (host, port, flowinfo, scopeid) (host, port, flowinfo, scopeid) |
host is a string with a hostname like host is a string with a hostname like 'www.example.com' or an IPv6 address like 'www.example.com' or an IPv6 address like 'fe80::6203:7ab:fe88:9c23' . 'fe80::6203:7ab:fe88:9c23' . port is an integer. port is an integer. flowinfo and flowinfo and scopeid represent the scopeid represent the sin6_flowinfo and sin6_flowinfo and sin6_scope_id members in the C struct sin6_scope_id members in the C struct sockaddr_in6 .sockaddr_in6 . |
Note the excerpt below from Python’s socket module documentation regarding the host
value of the address tuple:
Note the excerpt below from Python's socket module documentation regarding the host
value of the address tuple:
“For IPv4 addresses, two special forms are accepted instead of a host address: the empty string represents
INADDR_ANY
, and the string'
represents' INADDR_BROADCAST
. This behavior is not compatible with IPv6, therefore, you may want to avoid these if you intend to support IPv6 with your Python programs.” (Source)“For IPv4 addresses, two special forms are accepted instead of a host address: the empty string represents
INADDR_ANY
, and the string'
represents' INADDR_BROADCAST
. This behavior is not compatible with IPv6, therefore, you may want to avoid these if you intend to support IPv6 with your Python programs.” (资源)
See Python’s Socket families documentation for more information.
See Python's Socket families documentation for more information.
I’ve used IPv4 sockets in this tutorial, but if your network supports it, try testing and using IPv6 if possible. One way to support this easily is by using the function socket.getaddrinfo(). It translates the host
and port
arguments into a sequence of 5-tuples that contains all of the necessary arguments for creating a socket connected to that service. socket.getaddrinfo()
will understand and interpret passed-in IPv6 addresses and hostnames that resolve to IPv6 addresses, in addition to IPv4.
I've used IPv4 sockets in this tutorial, but if your network supports it, try testing and using IPv6 if possible. One way to support this easily is by using the function socket.getaddrinfo() . It translates the host
and port
arguments into a sequence of 5-tuples that contains all of the necessary arguments for creating a socket connected to that service. socket.getaddrinfo()
will understand and interpret passed-in IPv6 addresses and hostnames that resolve to IPv6 addresses, in addition to IPv4.
The following example returns address information for a TCP connection to example.org
on port 80
:
The following example returns address information for a TCP connection to example.org
on port 80
:
>>> >>> socketsocket .. getaddrinfogetaddrinfo (( "example.org""example.org" , , 8080 , , protoproto == socketsocket .. IPPROTO_TCPIPPROTO_TCP )
)
[(, ,
[(, ,
6, '', ('2606:2800:220:1:248:1893:25c8:1946', 80, 0, 0)),
6, '', ('2606:2800:220:1:248:1893:25c8:1946', 80, 0, 0)),
(, ,
(, ,
6, '', ('93.184.216.34', 80))]
6, '', ('93.184.216.34', 80))]
Results may differ on your system if IPv6 isn’t enabled. The values returned above can be used by passing them to socket.socket()
and socket.connect()
. There’s a client and server example in the Example section of Python’s socket module documentation.
Results may differ on your system if IPv6 isn't enabled. The values returned above can be used by passing them to socket.socket()
and socket.connect()
. There's a client and server example in the Example section of Python's socket module documentation.
For context, this section applies mostly to using hostnames with bind()
and connect()
, or connect_ex()
, when you intend to use the loopback interface, “localhost.” However, it applies any time you’re using a hostname and there’s an expectation of it resolving to a certain address and having a special meaning to your application that affects its behavior or assumptions. This is in contrast to the typical scenario of a client using a hostname to connect to a server that’s resolved by DNS, like www.example.com.
For context, this section applies mostly to using hostnames with bind()
and connect()
, or connect_ex()
, when you intend to use the loopback interface, “localhost.” However, it applies any time you're using a hostname and there's an expectation of it resolving to a certain address and having a special meaning to your application that affects its behavior or assumptions. This is in contrast to the typical scenario of a client using a hostname to connect to a server that's resolved by DNS, like www.example.com.
The following is from Python’s socket
module documentation:
The following is from Python's socket
module documentation:
“If you use a hostname in the host portion of IPv4/v6 socket address, the program may show a non-deterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in host portion.” (Source)
“If you use a hostname in the host portion of IPv4/v6 socket address, the program may show a non-deterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in host portion.” (资源)
The standard convention for the name “localhost” is for it to resolve to 127.0.0.1
or ::1
, the loopback interface. This will more than likely be the case for you on your system, but maybe not. It depends on how your system is configured for name resolution. As with all things IT, there are always exceptions, and there are no guarantees that using the name “localhost” will connect to the loopback interface.
The standard convention for the name “ localhost ” is for it to resolve to 127.0.0.1
or ::1
, the loopback interface. This will more than likely be the case for you on your system, but maybe not. It depends on how your system is configured for name resolution. As with all things IT, there are always exceptions, and there are no guarantees that using the name “localhost” will connect to the loopback interface.
For example, on Linux, see man nsswitch.conf
, the Name Service Switch configuration file. Another place to check on macOS and Linux is the file /etc/hosts
. On Windows, see C:WindowsSystem32driversetchosts
. The hosts
file contains a static table of name to address mappings in a simple text format. DNS is another piece of the puzzle altogether.
For example, on Linux, see man nsswitch.conf
, the Name Service Switch configuration file. Another place to check on macOS and Linux is the file /etc/hosts
. On Windows, see C:WindowsSystem32driversetchosts
. The hosts
file contains a static table of name to address mappings in a simple text format. DNS is another piece of the puzzle altogether.
Interestingly enough, as of this writing (June 2018), there’s an RFC draft Let ‘localhost’ be localhost that discusses the conventions, assumptions and security around using the name “localhost.”
Interestingly enough, as of this writing (June 2018), there's an RFC draft Let 'localhost' be localhost that discusses the conventions, assumptions and security around using the name “localhost.”
What’s important to understand is that when you use hostnames in your application, the returned address(es) could literally be anything. Don’t make assumptions regarding a name if you have a security-sensitive application. Depending on your application and environment, this may or may not be a concern for you.
What's important to understand is that when you use hostnames in your application, the returned address(es) could literally be anything. Don't make assumptions regarding a name if you have a security-sensitive application. Depending on your application and environment, this may or may not be a concern for you.
Note: Security precautions and best practices still apply, even if your application isn’t “security-sensitive.” If your application accesses the network, it should be secured and maintained. This means, at a minimum:
Note: Security precautions and best practices still apply, even if your application isn't “security-sensitive.” If your application accesses the network, it should be secured and maintained. This means, at a minimum:
System software updates and security patches are applied regularly, including Python. Are you using any third party libraries? If so, make sure those are checked and updated too.
If possible, use a dedicated or host-based firewall to restrict connections to trusted systems only.
What DNS servers are configured? Do you trust them and their administrators?
Make sure that request data is sanitized and validated as much as possible prior to calling other code that processes it. Use (fuzz) tests for this and run them regularly.
System software updates and security patches are applied regularly, including Python. Are you using any third party libraries? If so, make sure those are checked and updated too.
If possible, use a dedicated or host-based firewall to restrict connections to trusted systems only.
What DNS servers are configured? Do you trust them and their administrators?
Make sure that request data is sanitized and validated as much as possible prior to calling other code that processes it. Use (fuzz) tests for this and run them regularly.
Regardless of whether or not you’re using hostnames, if your application needs to support secure connections (encryption and authentication), you’ll probably want to look into using TLS. This is its own separate topic and beyond the scope of this tutorial. See Python’s ssl module documentation to get started. This is the same protocol that your web browser uses to connect securely to web sites.
Regardless of whether or not you're using hostnames, if your application needs to support secure connections (encryption and authentication), you'll probably want to look into using TLS . This is its own separate topic and beyond the scope of this tutorial. See Python's ssl module documentation to get started. This is the same protocol that your web browser uses to connect securely to web sites.
With interfaces, IP addresses, and name resolution to consider, there are many variables. What should you do? Here are some recommendations that you can use if you don’t have a network application review process:
With interfaces, IP addresses, and name resolution to consider, there are many variables. 你该怎么办? Here are some recommendations that you can use if you don't have a network application review process:
Application | 应用 | Usage | 用法 | Recommendation | 建议 |
---|---|---|---|---|---|
Server | 服务器 | loopback interface | loopback interface | 127.0.0.1 or 127.0.0.1 or ::1 .::1 . |
|
Server | 服务器 | ethernet interface | ethernet interface | 10.1.2.3 . To support more than one interface, use an empty string for all interfaces/addresses. See the security note above.10.1.2.3 . To support more than one interface, use an empty string for all interfaces/addresses. See the security note above. |
|
Client | 客户 | loopback interface | loopback interface | 127.0.0.1 or 127.0.0.1 or ::1 .::1 . |
|
Client | 客户 | ethernet interface | ethernet interface | Use an IP address for consistency and non-reliance on name resolution. For the typical case, use a hostname. See the security note above. | Use an IP address for consistency and non-reliance on name resolution. For the typical case, use a hostname. See the security note above. |
For clients or servers, if you need to authenticate the host you’re connecting to, look into using TLS.
For clients or servers, if you need to authenticate the host you're connecting to, look into using TLS.
A socket function or method that temporarily suspends your application is a blocking call. For example, accept()
, connect()
, send()
, and recv()
“block.” They don’t return immediately. Blocking calls have to wait on system calls (I/O) to complete before they can return a value. So you, the caller, are blocked until they’re done or a timeout or other error occurs.
A socket function or method that temporarily suspends your application is a blocking call. For example, accept()
, connect()
, send()
, and recv()
“block.” They don't return immediately. Blocking calls have to wait on system calls (I/O) to complete before they can return a value. So you, the caller, are blocked until they're done or a timeout or other error occurs.
Blocking socket calls can be set to non-blocking mode so they return immediately. If you do this, you’ll need to at least refactor or redesign your application to handle the socket operation when it’s ready.
Blocking socket calls can be set to non-blocking mode so they return immediately. If you do this, you'll need to at least refactor or redesign your application to handle the socket operation when it's ready.
Since the call returns immediately, data may not be ready. The callee is waiting on the network and hasn’t had time to complete its work. If this is the case, the current status is the errno
value socket.EWOULDBLOCK
. Non-blocking mode is supported with setblocking().
Since the call returns immediately, data may not be ready. The callee is waiting on the network and hasn't had time to complete its work. If this is the case, the current status is the errno
value socket.EWOULDBLOCK
. Non-blocking mode is supported with setblocking() .
By default, sockets are always created in blocking mode. See Notes on socket timeouts for a description of the three modes.
By default, sockets are always created in blocking mode. See Notes on socket timeouts for a description of the three modes.
An interesting thing to note with TCP is it’s completely legal for the client or server to close their side of the connection while the other side remains open. This is referred to as a “half-open” connection. It’s the application’s decision whether or not this is desirable. In general, it’s not. In this state, the side that’s closed their end of the connection can no longer send data. They can only receive it.
An interesting thing to note with TCP is it's completely legal for the client or server to close their side of the connection while the other side remains open. This is referred to as a “half-open” connection. It's the application's decision whether or not this is desirable. In general, it's not. In this state, the side that's closed their end of the connection can no longer send data. They can only receive it.
I’m not advocating that you take this approach, but as an example, HTTP uses a header named “Connection” that’s used to standardize how applications should close or persist open connections. For details, see section 6.3 in RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing.
I'm not advocating that you take this approach, but as an example, HTTP uses a header named “Connection” that's used to standardize how applications should close or persist open connections. For details, see section 6.3 in RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing .
When designing and writing your application and its application-layer protocol, it’s a good idea to go ahead and work out how you expect connections to be closed. Sometimes this is obvious and simple, or it’s something that can take some initial prototyping and testing. It depends on the application and how the message loop is processed with its expected data. Just make sure that sockets are always closed in a timely manner after they complete their work.
When designing and writing your application and its application-layer protocol, it's a good idea to go ahead and work out how you expect connections to be closed. Sometimes this is obvious and simple, or it's something that can take some initial prototyping and testing. It depends on the application and how the message loop is processed with its expected data. Just make sure that sockets are always closed in a timely manner after they complete their work.
See Wikipedia’s article on endianness for details on how different CPUs store byte orderings in memory. When interpreting individual bytes, this isn’t a problem. However, when handling multiple bytes that are read and processed as a single value, for example a 4-byte integer, the byte order needs to be reversed if you’re communicating with a machine that uses a different endianness.
See Wikipedia's article on endianness for details on how different CPUs store byte orderings in memory. When interpreting individual bytes, this isn't a problem. However, when handling multiple bytes that are read and processed as a single value, for example a 4-byte integer, the byte order needs to be reversed if you're communicating with a machine that uses a different endianness.
Byte order is also important for text strings that are represented as multi-byte sequences, like Unicode. Unless you’re always using “true,” strict ASCII and control the client and server implementations, you’re probably better off using Unicode with an encoding like UTF-8 or one that supports a byte order mark (BOM).
Byte order is also important for text strings that are represented as multi-byte sequences, like Unicode. Unless you're always using “true,” strict ASCII and control the client and server implementations, you're probably better off using Unicode with an encoding like UTF-8 or one that supports a byte order mark (BOM) .
It’s important to explicitly define the encoding used in your application-layer protocol. You can do this by mandating that all text is UTF-8 or using a “content-encoding” header that specifies the encoding. This prevents your application from having to detect the encoding, which you should avoid if possible.
It's important to explicitly define the encoding used in your application-layer protocol. You can do this by mandating that all text is UTF-8 or using a “content-encoding” header that specifies the encoding. This prevents your application from having to detect the encoding, which you should avoid if possible.
This becomes problematic when there is data involved that’s stored in files or a database and there’s no metadata available that specifies its encoding. When the data is transferred to another endpoint, it will have to try to detect the encoding. For a discussion, see Wikipedia’s Unicode article that references RFC 3629: UTF-8, a transformation format of ISO 10646:
This becomes problematic when there is data involved that's stored in files or a database and there's no metadata available that specifies its encoding. When the data is transferred to another endpoint, it will have to try to detect the encoding. For a discussion, see Wikipedia's Unicode article that references RFC 3629: UTF-8, a transformation format of ISO 10646 :
“However RFC 3629, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8, but discusses the cases where this may not be possible. In addition, the large restriction on possible patterns in UTF-8 (for instance there cannot be any lone bytes with the high bit set) means that it should be possible to distinguish UTF-8 from other character encodings without relying on the BOM.” (Source)
“However RFC 3629, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8, but discusses the cases where this may not be possible. In addition, the large restriction on possible patterns in UTF-8 (for instance there cannot be any lone bytes with the high bit set) means that it should be possible to distinguish UTF-8 from other character encodings without relying on the BOM.” (资源)
The takeaway from this is to always store the encoding used for data that’s handled by your application if it can vary. In other words, try to somehow store the encoding as metadata if it’s not always UTF-8 or some other encoding with a BOM. Then you can send that encoding in a header along with the data to tell the receiver what it is.
The takeaway from this is to always store the encoding used for data that's handled by your application if it can vary. In other words, try to somehow store the encoding as metadata if it's not always UTF-8 or some other encoding with a BOM. Then you can send that encoding in a header along with the data to tell the receiver what it is.
The byte ordering used in TCP/IP is big-endian and is referred to as network order. Network order is used to represent integers in lower layers of the protocol stack, like IP addresses and port numbers. Python’s socket module includes functions that convert integers to and from network and host byte order:
The byte ordering used in TCP/IP is big-endian and is referred to as network order. Network order is used to represent integers in lower layers of the protocol stack, like IP addresses and port numbers. Python's socket module includes functions that convert integers to and from network and host byte order:
Function | 功能 | Description | 描述 |
---|---|---|---|
socket.ntohl(x) socket.ntohl(x) |
Convert 32-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. | Convert 32-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. | |
socket.ntohs(x) socket.ntohs(x) |
Convert 16-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. | Convert 16-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. | |
socket.htonl(x) socket.htonl(x) |
Convert 32-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. | Convert 32-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. | |
socket.htons(x) socket.htons(x) |
Convert 16-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. | Convert 16-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. |
You can also use the struct module to pack and unpack binary data using format strings:
You can also use the struct module to pack and unpack binary data using format strings:
We covered a lot of ground in this tutorial. Networking and sockets are large subjects. If you’re new to networking or sockets, don’t be discouraged by all of the terms and acronyms.
We covered a lot of ground in this tutorial. Networking and sockets are large subjects. If you're new to networking or sockets, don't be discouraged by all of the terms and acronyms.
There are a lot of pieces to become familiar with in order to understand how everything works together. However, just like Python, it will start to make more sense as you get to know the individual pieces and spend more time with them.
There are a lot of pieces to become familiar with in order to understand how everything works together. However, just like Python, it will start to make more sense as you get to know the individual pieces and spend more time with them.
We looked at the low-level socket API in Python’s socket
module and saw how it can be used to create client-server applications. We also created our own custom class and used it as an application-layer protocol to exchange messages and data between endpoints. You can use this class and build upon it to learn and help make creating your own socket applications easier and faster.
We looked at the low-level socket API in Python's socket
module and saw how it can be used to create client-server applications. We also created our own custom class and used it as an application-layer protocol to exchange messages and data between endpoints. You can use this class and build upon it to learn and help make creating your own socket applications easier and faster.
You can find the source code on GitHub.
You can find the source code on GitHub .
Congratulations on making it to the end! You are now well on your way to using sockets in your own applications.
Congratulations on making it to the end! You are now well on your way to using sockets in your own applications.
I hope this tutorial has given you the information, examples, and inspiration needed to start you on your sockets development journey.
I hope this tutorial has given you the information, examples, and inspiration needed to start you on your sockets development journey.
[ Improve Your Python With Python Tricks – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]
[通过Python技巧Improve改进Python –每两天将简短而可爱的Python技巧传递到您的收件箱。 >>单击此处了解更多信息,并查看示例 ]
翻译自: https://www.pybloggers.com/2018/08/socket-programming-in-python-guide/
python 套接字