Sockets Tutorial
This is a simple tutorial on using sockets for interprocess communication.
The client-server model
Most interprocess communication uses the client-server model. These terms refer to the two processes that communicate with each other. One of the two processes, the client, connects to the other process, the server, typically to make a request for information. A good analogy is a person who makes a phone call to another person. Notice that the client needs to know of the existence and the address of the server, but the server does not need to know the address of (or even the existence of) the client before the connection is established.
Notice also that once a connection is established, both sides can send and receive information.
The system calls for establishing a connection are somewhat different for the client and the server, but both involve the basic construct of a socket.
A socket is one end of an interprocess communication channel. The two processes each establish their own socket.
The steps involved in establishing a socket on the client side are as follows:
- Create a socket with the socket() system call
- Connect the socket to the address of the server using the connect() system call
- Send and receive data. There are a number of ways to do this, but the simplest is to use the read() and write() system calls.
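The client-side steps above can be sketched as a small helper. This is only an illustration, not the tutorial's client.c: the function name connect_to_server is hypothetical, and the sketch assumes the server is identified by a numeric IPv4 address such as "127.0.0.1" rather than by a host name.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a connected socket, or -1 on failure.
   Assumes ip is a dotted-quad IPv4 string such as "127.0.0.1". */
int connect_to_server(const char *ip, unsigned short port)
{
    struct sockaddr_in serv_addr;
    int sockfd;

    /* Step 1: create a stream socket in the Internet domain. */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* Describe the server's address (port in network byte order). */
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &serv_addr.sin_addr) != 1) {
        close(sockfd);
        return -1;
    }

    /* Step 2: connect the socket to the server's address. */
    if (connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        close(sockfd);
        return -1;
    }

    /* Step 3: the caller can now use read() and write() on sockfd. */
    return sockfd;
}
```

A caller would check the result for -1 before using it, and close() the descriptor when the conversation is over.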
The steps involved in establishing a socket on the server side are as follows:
- Create a socket with the socket() system call
- Bind the socket to an address using the bind() system call. For a server socket on the Internet, an address consists of a port number on the host machine.
- Listen for connections with the listen() system call
- Accept a connection with the accept() system call. This call typically blocks until a client connects with the server.
- Send and receive data
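The first three server-side steps can be sketched in the same spirit. The helper name make_listener and the backlog value of 5 are assumptions made for illustration. Accepting a connection and exchanging data are left to the caller.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening socket bound to the given
   port on all local interfaces, or -1 on failure. */
int make_listener(unsigned short port)
{
    struct sockaddr_in serv_addr;
    int sockfd;

    /* Step 1: create the socket. */
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;

    /* Step 2: bind it to an address (any local interface, given port). */
    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(port);
    if (bind(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0) {
        close(sockfd);
        return -1;
    }

    /* Step 3: listen for connections; 5 is an arbitrary backlog. */
    if (listen(sockfd, 5) < 0) {
        close(sockfd);
        return -1;
    }

    /* Steps 4 and 5 are left to the caller: call accept() on sockfd,
       which blocks until a client connects, then read() and write()
       on the descriptor that accept() returns. */
    return sockfd;
}
```

Passing port 0 asks the operating system to pick any free port, which is handy for experiments but not for a server whose clients must know the port in advance.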
Socket Types
When a socket is created, the program has to specify the address domain and the socket type. Two processes can communicate with each other only if their sockets are of the same type and in the same domain.
There are two widely used address domains: the Unix domain, in which two processes that share a common file system communicate, and the Internet domain, in which two processes running on any two hosts on the Internet communicate. Each of these has its own address format.
The address of a socket in the Unix domain is a character string which is basically an entry in the file system.
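As a rough sketch of what such an address looks like in C, the structure sockaddr_un from sys/un.h holds the path in its sun_path field. The helper name make_unix_addr is hypothetical, as is any path passed to it.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fills in a Unix-domain address for the given
   file system path. Returns 0 on success, -1 if the path is too long. */
int make_unix_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path))
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);   /* length was checked above */
    return 0;
}
```

The length check matters because sun_path is a fixed-size array whose size differs between systems.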
The address of a socket in the Internet domain consists of the Internet address of the host machine (under IPv4, which the examples in this tutorial use, each host is identified by a 32-bit address, often referred to as its IP address).
In addition, each socket needs a port number on that host.
Port numbers are 16-bit unsigned integers.
The lower numbers are reserved in Unix for standard services. For example, the port number for the FTP server is 21. It is important that standard services be at the same port on all computers so that clients will know their addresses.
However, port numbers above 2000 are generally available.
There are two widely used socket types: stream sockets and datagram sockets. Stream sockets treat communications as a continuous stream of characters, while datagram sockets have to read entire messages at once. Each uses its own communications protocol.
Stream sockets use TCP (Transmission Control Protocol), which is a reliable, stream-oriented protocol, and datagram sockets use UDP (User Datagram Protocol), which is unreliable and message-oriented.
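The difference between stream and message orientation can be observed with a small experiment. The sketch below is an assumption-laden illustration: it uses socketpair() with Unix domain sockets rather than TCP and UDP so that it runs inside a single process, and the function name first_read_size is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: sends "ab" and then "cd" over a connected pair of
   sockets of the given type and returns how many bytes the first
   read() on the other end delivers, or -1 on error. */
ssize_t first_read_size(int type)
{
    int sv[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;

    /* Two separate two-byte messages. */
    if (write(sv[0], "ab", 2) != 2 || write(sv[0], "cd", 2) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Ask for more than either message contains. */
    n = read(sv[1], buf, sizeof(buf));

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the first read delivers exactly one message, so the function returns 2. With SOCK_STREAM there are no message boundaries, so the read may deliver anything up to all 4 bytes at once.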
The examples in this tutorial will use sockets in the Internet domain using the TCP protocol.
Sample code
C code for a very simple client and server is provided for you. These communicate using stream sockets in the Internet domain. The code is described in detail below. However, before you read the descriptions and look at the code, you should compile and run the two programs to see what they do.
Download these into files called server.c and client.c and compile them separately into two executables called server and client.
They probably won't require any special compiling flags, but on some Solaris systems you may need to link to the socket library by appending -lsocket to your compile command.
Ideally, you should run the client and the server on separate hosts on the Internet. Start the server first. Suppose the server is running on a machine called cheerios. When you run the server, you need to pass the port number in as an argument. You can choose any number between 2000 and 65535. If this port is already in use on that machine, the server will tell you this and exit. If this happens, just choose another port and try again. If the port is available, the server will block until it receives a connection from the client. Don't be alarmed if the server doesn't do anything; it's not supposed to do anything until a connection is made.
Here is a typical command line:
server 51717
To run the client you need to pass in two arguments: the name of the host on which the server is running and the port number on which the server is listening for connections.
Here is the command line to connect to the server described above:
client cheerios 51717
The client will prompt you to enter a message.
If everything works correctly, the server will display your message on stdout, send an acknowledgement message to the client and terminate.
The client will print the acknowledgement message from the server and then terminate.
You can simulate this on a single machine by running the server in one window and the client in another. In this case, you can use the keyword localhost as the first argument to the client.
The server code uses a number of ugly programming constructs, and so we will go through it line by line.
#include <stdio.h>
This header file contains declarations used in most input and output and is typically included in all C programs.
#include <sys/types.h>
This header file contains definitions of a number of data types used in system calls. These types are used in the next two include files.
#include <sys/socket.h>
The header file socket.h includes a number of definitions of structures needed for sockets.
#include <netinet/in.h>
The header file in.h contains constants and structures needed for internet domain addresses.
void error(char *msg)
{
    perror(msg);
    exit(1);
}
This function is called when a system call fails. It displays a message about the error on stderr and then aborts the program. The perror man page gives more information. Note that exit() is declared in <stdlib.h>, so the program should include that header as well.
int main(int argc, char *argv[])
{
int sockfd, newsockfd, portno, clilen, n;
sockfd and newsockfd are file descriptors, i.e. array subscripts into the file descriptor table. These two variables store the values returned by the socket system call and the accept system call.
portno stores the port number on which the server accepts connections.
clilen stores the size of the address of the client. This is needed for the accept system call. On current systems, accept() expects this value to be of type socklen_t rather than int.
n is the return value for the read() and write() calls; i.e. it contains the number of characters read or written.
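The count stored in n is worth checking, because read() and write() may transfer fewer characters than requested, particularly on sockets. As a sketch of how a program might deal with this, the hypothetical helper write_all below keeps writing until the whole buffer has been sent. It is not part of the tutorial's server.c.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write() until all len bytes have
   been written. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal; retry */
            return -1;
        }
        done += (size_t)n;      /* n may be less than requested */
    }
    return 0;
}
```

The same pattern applies to read() when a program expects a known number of characters.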
char buffer[256];
The server reads characters from the socket connection into this buffer.
struct sockaddr_in serv_addr, cli_addr;
A sockaddr_in is a structure containing an internet address. This structure is defined in netinet/in.h.
Here is the definition:
struct sockaddr_in
{
    short sin_family; /* must be AF_INET */
    u_short sin_port;
    struct in_addr sin_addr;
    char sin_zero[8]; /* Not used, must be zero */
};
An in_addr structure, defined in the same header file, contains only one field, an unsigned long called s_addr.
The variable serv_addr will contain the address of the server, and cli_addr will contain the address of the client which connects to the server.
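To make the fields of this structure more concrete, here is a sketch of how a program might turn a filled-in sockaddr_in, such as cli_addr after a client has connected, into readable text. The helper name format_inet_addr and the "a.b.c.d:port" output format are assumptions chosen for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: writes addr as "a.b.c.d:port" into out.
   Returns 0 on success, -1 on failure. */
int format_inet_addr(const struct sockaddr_in *addr, char *out, size_t outlen)
{
    char ip[INET_ADDRSTRLEN];
    int n;

    /* Convert the binary address in sin_addr to dotted-quad text. */
    if (inet_ntop(AF_INET, &addr->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;

    /* sin_port is stored in network byte order, so convert it back. */
    n = snprintf(out, outlen, "%s:%u", ip, (unsigned)ntohs(addr->sin_port));
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

Both the address and the port are kept in network byte order inside the structure, which is why the sketch converts them before printing.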
if (argc < 2)
{
    fprintf(stderr,"ERROR, no port provided\n");
    exit(1);
}
The user needs to pass in the port number on which the server will accept connections as an argument. This code displays an error message if the user fails to do this.
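Checking argc only confirms that an argument is present, not that it is a usable port number. A stricter check might look like the sketch below. The helper name parse_port is hypothetical, and rejecting port 0 is a choice made for this example.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts a command-line argument to a port
   number. Returns the port (1-65535), or -1 if arg is not valid. */
int parse_port(const char *arg)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0')
        return -1;              /* not a plain decimal number */
    if (value < 1 || value > 65535)
        return -1;              /* does not fit a 16-bit port */
    return (int)value;
}
```

A server could call this on argv[1] and print an error message if the result is -1.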
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
    error("ERROR opening socket");
The socket() system call creates a new socket. It takes three arguments. The first is the address domain of the socket.
Recall that there are two possible address domains, the Unix domain for two processes which share a common file system, and the Internet domain for any two hosts on the Internet. The symbolic constant AF_UNIX is used for the former, and AF_INET for the latter (there are actually many other options which can be used here for specialized purposes).
The second argument is the type of socket. Recall that there are two choices here, a stream socket in which characters are read in a continuous stream as if from a file or pipe, and a datagram socket, in which messages are read in chunks. The two symbolic constants are SOCK_STREAM and SOCK_DGRAM.
The third argument is the protocol. If this argument is zero (and it always should be except for unusual circumstances), the operating system will choose the most appropriate protocol. It will choose TCP for stream sockets and UDP for datagram sockets.
The socket system call returns an entry into the file descriptor table (i.e. a small integer). This value is used for all subsequent references to this socket. If the socket call fails, it returns -1.
In this case the program displays an error message and exits. However, this system call is unlikely to fail.
This is a simplified description of the socket call; there are numerous other choices for domains and types, but these are the most common. The socket() man page has more information.
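The following sketch exercises the socket() call with different arguments and checks its return value as described above. The helper name created_socket_type is hypothetical. It uses getsockopt() with SO_TYPE to ask the system which type of socket was actually created, and reports failure with perror() instead of exiting.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: creates a socket with protocol 0 and returns the
   socket type the system reports for it (SOCK_STREAM, SOCK_DGRAM, ...),
   or -1 if the socket could not be created or queried. */
int created_socket_type(int domain, int type)
{
    int sockfd, actual;
    socklen_t len = sizeof(actual);

    sockfd = socket(domain, type, 0);
    if (sockfd < 0) {
        perror("ERROR opening socket");
        return -1;
    }

    /* Ask the system what kind of socket this descriptor refers to. */
    if (getsockopt(sockfd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;

    close(sockfd);
    return actual;
}
```

Calling it with AF_INET and SOCK_STREAM or SOCK_DGRAM should report the same type back, while an unsupported domain makes socket() return -1 and triggers the error message.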