Unix Network Programming Episode 29

Finally, we note that the POSIX standard specifies that the first argument to socket be a PF_ value, and the AF_ value be used for a socket address structure. But it then defines only one family value in the addrinfo structure, intended for use either in a call to socket or in a socket address structure!

‘connect’ Function

The connect function is used by a TCP client to establish a connection with a TCP server.

int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);

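sockfd is a socket descriptor returned by the socket function. The second and third arguments are a pointer to a socket address structure and its size; the structure must contain the IP address and port number of the server.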
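As a minimal sketch of how these arguments fit together, the helper below fills in an IPv4 socket address structure and passes it to connect. The name connect_ipv4 and its error convention (return -1 with errno set) are assumptions made for this illustration, not part of the sockets API.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect a TCP socket to a dotted-decimal IPv4
 * address and port. Returns the connected descriptor, or -1 with errno
 * describing the failure. */
int connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    int sockfd;

    /* Build the server's socket address: family, port, and IP address. */
    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1) {
        errno = EINVAL;          /* not a valid IPv4 address string */
        return -1;
    }

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* For a blocking TCP socket, connect returns only when the
     * connection is established or an error occurs. */
    if (connect(sockfd, (const struct sockaddr *)&servaddr,
                sizeof(servaddr)) < 0) {
        int saved = errno;       /* close may overwrite errno */
        close(sockfd);
        errno = saved;
        return -1;
    }
    return sockfd;
}
```

A caller might write connect_ipv4("127.0.0.1", 7) and check for a negative return before using the descriptor.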

In the case of a TCP socket, the connect function initiates TCP’s three-way handshake (Section 2.6(See 7.2.6)). The function returns only when the connection is established or an error occurs. There are several different error returns possible.

1. If the client TCP receives no response to its SYN segment, ETIMEDOUT is returned. 4.4BSD, for example, sends one SYN when connect is called, another 6 seconds later, and another 24 seconds later (p. 828 of TCPv2). If no response is received after a total of 75 seconds, the error is returned.
Some systems provide administrative control over this timeout; see Appendix E of TCPv1.

2. If the server’s response to the client’s SYN is a reset (RST), this indicates that no process is waiting for connections on the server host at the port specified (i.e., the server process is probably not running). This is a hard error and the error ECONNREFUSED is returned to the client as soon as the RST is received.
An RST is a type of TCP segment that is sent by TCP when something is wrong. Three conditions that generate an RST are: when a SYN arrives for a port that has no listening server (what we just described), when TCP wants to abort an existing connection, and when TCP receives a segment for a connection that does not exist. (TCPv1 [pp. 246–250] contains additional information.)

3. If the client’s SYN elicits an ICMP “destination unreachable” from some intermediate router, this is considered a soft error. The client kernel saves the message but keeps sending SYNs with the same time between each SYN as in the first scenario. If no response is received after some fixed amount of time (75 seconds for 4.4BSD), the saved ICMP error is returned to the process as either EHOSTUNREACH or ENETUNREACH. It is also possible that the remote system is not reachable by any route in the local system’s forwarding table, or that the connect call returns without waiting at all.
Many earlier systems, such as 4.2BSD, incorrectly aborted the connection establishment attempt when the ICMP “destination unreachable” was received. This is wrong because this ICMP error can indicate a transient condition. For example, it could be that the condition is caused by a routing problem that will be corrected.
Notice that ENETUNREACH is not listed, even when the error indicates that the destination network is unreachable. Network unreachables are considered obsolete, and applications should just treat ENETUNREACH and EHOSTUNREACH as the same error.
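The three error scenarios above can be sketched as a small classification of the errno value left by a failed connect. The enum, its constants, and both function names are hypothetical, introduced only for this illustration; the one design point taken from the text is that ENETUNREACH and EHOSTUNREACH are treated as the same error.

```c
#include <errno.h>

/* Hypothetical categories for a failed connect call. */
enum connect_failure {
    CONNECT_FAIL_TIMEOUT,      /* no response to the SYN (ETIMEDOUT) */
    CONNECT_FAIL_REFUSED,      /* RST received: hard error (ECONNREFUSED) */
    CONNECT_FAIL_UNREACHABLE,  /* ICMP unreachable: soft error */
    CONNECT_FAIL_OTHER         /* anything not covered above */
};

/* Map the errno value from a failed connect to one of the categories. */
enum connect_failure classify_connect_errno(int err)
{
    switch (err) {
    case ETIMEDOUT:
        return CONNECT_FAIL_TIMEOUT;
    case ECONNREFUSED:
        return CONNECT_FAIL_REFUSED;
    case EHOSTUNREACH:         /* treated identically, as the text advises */
    case ENETUNREACH:
        return CONNECT_FAIL_UNREACHABLE;
    default:
        return CONNECT_FAIL_OTHER;
    }
}

/* Short human-readable description of each category. */
const char *connect_failure_text(enum connect_failure f)
{
    switch (f) {
    case CONNECT_FAIL_TIMEOUT:     return "no response from server";
    case CONNECT_FAIL_REFUSED:     return "no process listening on port";
    case CONNECT_FAIL_UNREACHABLE: return "destination unreachable";
    default:                       return "other error";
    }
}
```

After connect returns -1, a caller could pass errno to classify_connect_errno and decide, for example, whether the attempt is worth retrying later.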

In terms of the TCP state transition diagram, connect moves from the CLOSED state (the state in which a socket begins when it is created by the socket function) to the SYN_SENT state, and then, on success, to the ESTABLISHED state. If connect fails, the socket is no longer usable and must be closed. We cannot call connect again on the socket. In Figure 11.10(See 8.9.12), we will see that when we call connect in a loop, trying each IP address for a given host until one works, each time connect fails, we must close the socket descriptor and call socket again.
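The close-and-recreate rule can be sketched with a loop over the addresses returned by getaddrinfo. This is an illustration written in the spirit of that description, not the listing from the figure; the function name tcp_connect_any and its return convention are assumptions.

```c
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try each address for host/service in turn.
 * Returns a connected descriptor, or -1 if every attempt failed. */
int tcp_connect_any(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int sockfd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        /* A fresh socket is created for every attempt. */
        sockfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (sockfd < 0)
            continue;                  /* this family may be unsupported */

        if (connect(sockfd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* success: keep this descriptor */

        /* After a failed connect the socket is no longer usable:
         * close it and let the next iteration call socket again. */
        close(sockfd);
        sockfd = -1;
    }

    freeaddrinfo(res);
    return sockfd;
}
```

Note that the single ai_family member is passed to socket here, while the same value also sits inside the address structure that ai_addr points to.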

‘bind’ Function

The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address, along with a 16-bit TCP or UDP port number.

int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);

Historically, the man page description of bind has said “bind assigns a name to an unnamed socket.” The use of the term “name” is confusing and gives the connotation of domain names (Chapter 11(See 8.9)) such as foo.bar.com. The bind function has nothing to do with names. bind assigns a protocol address to a socket, and what that protocol address means depends on the protocol.
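To make the idea of a protocol address concrete, here is a minimal sketch that binds a TCP socket to an IPv4 address and a 16-bit port. The helper names bind_tcp_ipv4 and local_port are hypothetical. Two choices here go beyond what the text above has covered and should be read as assumptions of the sketch: it binds the wildcard address INADDR_ANY, and it allows port 0 so that the kernel picks a port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket and bind it to the IPv4
 * wildcard address and the given port (0 asks the kernel to choose).
 * Returns the bound descriptor, or -1 on error. */
int bind_tcp_ipv4(uint16_t port)
{
    struct sockaddr_in myaddr;
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);

    if (sockfd < 0)
        return -1;

    /* The protocol address: a 32-bit IPv4 address plus a 16-bit port,
     * both stored in network byte order. */
    memset(&myaddr, 0, sizeof(myaddr));
    myaddr.sin_family = AF_INET;
    myaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    myaddr.sin_port = htons(port);

    if (bind(sockfd, (const struct sockaddr *)&myaddr,
             sizeof(myaddr)) < 0) {
        close(sockfd);
        return -1;
    }
    return sockfd;
}

/* Hypothetical helper: report the local port of a bound IPv4 socket
 * in host byte order, or -1 on error. */
int local_port(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(sockfd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Calling local_port on the descriptor returned by bind_tcp_ipv4(0) shows which port the kernel assigned, which underlines that bind deals in addresses and ports rather than names.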
