Generic Netlink HOW-TO based on Jamal's original doc

An Introduction To Using Generic Netlink
===============================================================================

Last Updated: November 10, 2006

Table of Contents

 1. Introduction
 1.1. Document Overview
 1.2. Netlink And Generic Netlink
 2. Architectural Overview
 3. Generic Netlink Families
    3.1. Family Overview
         3.1.1. The genl_family Structure
         3.1.2. The genl_ops Structure
    3.2. Registering A Family
 4. Generic Netlink Communications
    4.1. Generic Netlink Message Format
    4.2. Kernel Communication
         4.2.1. Sending Messages
         4.2.2. Receiving Messages
    4.3. Userspace Communication
 5. Recommendations
    5.1. Attributes And Message Payloads
    5.2. Operation Granularity
    5.3. Acknowledgment And Error Reporting


1. Introduction
------------------------------------------------------------------------------

1.1. Document Overview
------------------------------------------------------------------------------

This document gives a brief introduction to Generic Netlink, some simple
examples on how to use it, and some recommendations on how to make the most of
the Generic Netlink communications interface.  While this document does not
require that the reader have a detailed understanding of what Netlink is
and how it works, some basic Netlink knowledge is assumed.  As usual, the
kernel source code is your best friend here.

While this document talks briefly about Generic Netlink from a userspace point
of view, its primary focus is on the kernel's Generic Netlink API.  It is
recommended that application developers who are interested in using Generic
Netlink make use of the libnl library[1].

[1] http://people.suug.ch/~tgr/libnl

1.2. Netlink And Generic Netlink
------------------------------------------------------------------------------

Netlink is a flexible, robust wire-format communications channel typically
used for kernel to user communication although it can also be used for
user to user and kernel to kernel communications.  Netlink communication
channels are associated with families or "busses", where each bus deals with a
specific service; for example, different Netlink busses exist for routing,
XFRM, netfilter, and several other kernel subsystems.  More information about
Netlink can be found in RFC 3549[1].

Over the years, Netlink has become very popular, which has brought about a
very real concern that the number of Netlink family numbers may be exhausted
in the near future.  In response, the Generic Netlink family was created; it
acts as a Netlink multiplexer, allowing multiple services to use a single
Netlink bus.

[1] ftp://ftp.rfc-editor.org/in-notes/rfc3549.txt

2. Architectural Overview
------------------------------------------------------------------------------

Figure #1 illustrates the basic Generic Netlink architecture, which is
composed of five different types of components.

 1) The Netlink subsystem which serves as the underlying transport layer for
    all of the Generic Netlink communications.

 2) The Generic Netlink bus which is implemented inside the kernel, but which
    is available to userspace through the socket API and inside the kernel via
    the normal Netlink and Generic Netlink APIs.

 3) The Generic Netlink users who communicate with each other over the Generic
    Netlink bus; users can exist both in kernel and user space.

 4) The Generic Netlink controller which is part of the kernel and is
    responsible for dynamically allocating Generic Netlink communication
    channels and other management tasks.  The Generic Netlink controller is
    implemented as a standard Generic Netlink user; however, it listens on a
    special, pre-allocated Generic Netlink channel.

 5) The kernel socket API.  Generic Netlink sockets are created with the
    PF_NETLINK domain and the NETLINK_GENERIC protocol values.

      +---------------------+      +---------------------+
      | (3) application "A" |      | (3) application "B" |
      +------+--------------+      +--------------+------+
             |                                    |
             \                                    /
              \                                  /
               |                                |
       +-------+--------------------------------+-------+
       |        :                               :       |   user-space
  =====+        :   (5)  Kernel socket API      :       +================
       |        :                               :       |   kernel-space
       +--------+-------------------------------+-------+
                |                               |
          +-----+-------------------------------+----+
          |        (1)  Netlink subsystem            |
          +---------------------+--------------------+
                                |
          +---------------------+--------------------+
          |       (2) Generic Netlink bus            |
          +--+--------------------------+-------+----+
             |                          |       |
     +-------+---------+                |       |
     |  (4) Controller |               /         \
     +-----------------+              /           \
                                      |           |
                   +------------------+--+     +--+------------------+
                   | (3) kernel user "X" |     | (3) kernel user "Y" |
                   +---------------------+     +---------------------+

  Figure 1: Generic Netlink Architecture

When looking at figure #1 it is important to note that any Generic Netlink
user can communicate with any other user over the bus using the same API
regardless of where the user resides in relation to the kernel/userspace
boundary.
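
As a rough userspace illustration of component (5), the sketch below opens a
Generic Netlink socket with the plain socket API.  The helper name
open_genl_socket() is made up for this example, and most applications would
let libnl take care of this step instead.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: open and bind a Generic Netlink socket.
 * Returns the socket descriptor, or -1 on failure (errno is set). */
int open_genl_socket(void)
{
        struct sockaddr_nl addr;
        int fd;

        /* PF_NETLINK domain, NETLINK_GENERIC protocol */
        fd = socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
        if (fd < 0)
                return -1;

        /* nl_pid = 0 asks the kernel to pick a unique address for us */
        memset(&addr, 0, sizeof(addr));
        addr.nl_family = AF_NETLINK;
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}
```

The descriptor returned here is an ordinary socket, so it can be used with
send(), recv() and close() like any other.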

Generic Netlink communications are essentially a series of different
communication channels which are multiplexed on a single Netlink family.
Communication channels are uniquely identified by channel numbers which are
dynamically allocated by the Generic Netlink controller.  The controller is a
special Generic Netlink user which listens on a fixed communication channel,
number 0x10, which is always present.  Kernel or userspace users which provide
services over the Generic Netlink bus establish new communication channels by
registering their services with the Generic Netlink controller.  Users who
want to use an existing service query the controller to see if it exists and
determine the correct channel number.
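
To make the controller query a little more concrete, here is a minimal
userspace sketch which builds a "get family" request addressed to the
controller's fixed channel.  It relies on the GENL_ID_CTRL, CTRL_CMD_GETFAMILY
and CTRL_ATTR_FAMILY_NAME definitions from the kernel's exported
linux/genetlink.h header; the helper name build_getfamily_request() and the
version value are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Hypothetical helper: build a CTRL_CMD_GETFAMILY request for the family
 * called "name" in buf, which must be 4 byte aligned.  Returns the total
 * message length, or 0 if the name or the buffer has an unusable size. */
size_t build_getfamily_request(void *buf, size_t buflen, const char *name,
                               unsigned int seq)
{
        size_t name_len = strlen(name) + 1;     /* include the NUL byte */
        size_t attr_len = NLA_HDRLEN + name_len;
        size_t total = NLMSG_HDRLEN + GENL_HDRLEN + NLA_ALIGN(attr_len);
        struct nlmsghdr *nlh = buf;
        struct genlmsghdr *gnlh;
        struct nlattr *nla;

        if (name_len > GENL_NAMSIZ || buflen < total)
                return 0;
        memset(buf, 0, total);

        /* Netlink header: the message type carries the channel number and
         * the controller always listens on GENL_ID_CTRL (0x10) */
        nlh->nlmsg_len = (unsigned int)total;
        nlh->nlmsg_type = GENL_ID_CTRL;
        nlh->nlmsg_flags = NLM_F_REQUEST;
        nlh->nlmsg_seq = seq;

        /* Generic Netlink header: which controller operation we want */
        gnlh = (struct genlmsghdr *)((char *)buf + NLMSG_HDRLEN);
        gnlh->cmd = CTRL_CMD_GETFAMILY;
        gnlh->version = 1;      /* assumed value for this sketch */

        /* payload: a single attribute holding the family name */
        nla = (struct nlattr *)((char *)gnlh + GENL_HDRLEN);
        nla->nla_type = CTRL_ATTR_FAMILY_NAME;
        nla->nla_len = (unsigned short)attr_len;
        memcpy((char *)nla + NLA_HDRLEN, name, name_len);

        return total;
}
```

The resulting buffer could be written to a Generic Netlink socket; the
controller's reply would then carry the channel number assigned to the family.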

3. Generic Netlink Families
------------------------------------------------------------------------------

The Generic Netlink mechanism is based on a client-server model.  The Generic
Netlink servers register families, which are a collection of well defined
services, with the controller and the clients communicate with the server
through these service registrations.  This section explains how Generic Netlink
families are defined, created and registered.

3.1. Family Overview
------------------------------------------------------------------------------

Generic Netlink family service registrations are defined by two structures,
genl_family and genl_ops.  The genl_family structure defines the family and
its associated communication channel.  The genl_ops structure defines
an individual service or operation which the family provides to other Generic
Netlink users.

This section focuses on Generic Netlink families as they are represented in
the kernel.  A similar API exists for userspace applications using the libnl
library[1].

[1] http://people.suug.ch/~tgr/libnl

3.1.1. The genl_family Structure

Generic Netlink services are defined by the genl_family structure, which is
shown below:

  struct genl_family
  {
        unsigned int            id;
        unsigned int            hdrsize;
        char                    name[GENL_NAMSIZ];
        unsigned int            version;
        unsigned int            maxattr;
        struct nlattr **        attrbuf;
        struct list_head        ops_list;
        struct list_head        family_list;
  };

  Figure 2: The genl_family structure

The genl_family structure fields are used in the following manner:

 * unsigned int id

   This is the dynamically allocated channel number.  A value of 0x0 signifies
   that the channel number should be assigned by the controller and the 0x10
   value is reserved for use by the controller.  Users should always use
   value 0x0 when registering a new family.

 * unsigned int hdrsize

   If the family makes use of a family specific header, its size is stored
   here.  If there is no family specific header this value should be zero.

 * char name[GENL_NAMSIZ]

   This string should be unique to the family as it is the key that the
   controller uses to look up channel numbers when requested.

 * unsigned int version

   Family specific version number.

 * unsigned int maxattr

   Generic Netlink makes use of the standard Netlink attributes; this value
   holds the maximum number of attributes defined for the Generic Netlink
   family.

 * struct nlattr **attrbuf
 * struct list_head ops_list
 * struct list_head family_list

   These are private fields and should not be modified.

3.1.2. The genl_ops Structure

  struct genl_ops
  {
        u8                      cmd;
        unsigned int            flags;
        struct nla_policy       *policy;
        int                     (*doit)(struct sk_buff *skb,
                                        struct genl_info *info);
        int                     (*dumpit)(struct sk_buff *skb,
                                          struct netlink_callback *cb);
        struct list_head        ops_list;
  };

  Figure 3: The genl_ops structure

The genl_ops structure fields are used in the following manner:

 * u8 cmd

   This value is unique across the corresponding Generic Netlink family and is
   used to reference the operation.

 * unsigned int flags

   This field is used to specify any special attributes of the operation.  The
   following flags may be used; multiple flags can be OR'd together:

   - GENL_ADMIN_PERM

     The operation requires the CAP_NET_ADMIN privilege

 * struct nla_policy *policy

   This field defines the Netlink attribute policy for the operation request
   message.  If specified, the Generic Netlink mechanism uses this policy to
   verify all of the attributes in an operation request message before calling
   the operation handler.

   The attribute policy is defined as an array of nla_policy structures indexed
   by the attribute number.  The nla_policy structure is defined in figure #4.

     struct nla_policy
     {
        u16             type;
        u16             len;
     };

     Figure 4: The nla_policy structure

   The fields are used in the following manner:

   - u16 type

     This specifies the type of the attribute; presently the following types
     are defined for general use:

     o NLA_UNSPEC

       Undefined type

     o NLA_U8

       An 8 bit unsigned integer

     o NLA_U16

       A 16 bit unsigned integer

     o NLA_U32

       A 32 bit unsigned integer

     o NLA_U64

       A 64 bit unsigned integer

     o NLA_FLAG

       A simple boolean flag

     o NLA_MSECS

       A 64 bit time value in msecs

     o NLA_STRING

       A variable length string

     o NLA_NUL_STRING

       A variable length NUL terminated string

     o NLA_NESTED

       A stream of attributes

   - u16 len

     When the attribute type is one of the string types then this field should
     be set to the maximum length of the string, not including the terminating
     NUL byte.  If the attribute type is unknown or NLA_UNSPEC then this field
     should be set to the exact length of the attribute's payload.

     Unless the attribute type is one of the fixed length types above, a value
     of zero indicates that no validation of the attribute should be performed.

 * int (*doit)(struct sk_buff *skb, struct genl_info *info)

   This callback is similar in use to the standard Netlink 'doit' callback; the
   primary difference is the change in parameters.

   The 'doit' handler receives two parameters: the first is the message buffer
   which triggered the handler and the second is a Generic Netlink genl_info
   structure which is defined in figure #5.

     struct genl_info
     {
        u32                     snd_seq;
        u32                     snd_pid;
        struct nlmsghdr *       nlhdr;
        struct genlmsghdr *     genlhdr;
        void *                  userhdr;
        struct nlattr **        attrs;
     };

     Figure 5: The genl_info structure

   The fields are populated in the following manner:

   - u32 snd_seq

     This is the Netlink sequence number of the request.

   - u32 snd_pid

     This is the PID of the client which issued the request.

   - struct nlmsghdr *nlhdr

     This is set to point to the Netlink message header of the request.

   - struct genlmsghdr *genlhdr

     This is set to point to the Generic Netlink message header of the request.

   - void *userhdr

     If the Generic Netlink family makes use of a family specific header, this
     pointer will be set to point to the start of the family specific header.

   - struct nlattr **attrs

     The parsed Netlink attributes from the request.  If the Generic Netlink
     family definition specified a Netlink attribute policy then the
     attributes will have already been validated.

   The 'doit' handler should do whatever processing is necessary and return
   zero on success, or a negative value on failure.  Negative return values
   will cause a NLMSG_ERROR message to be sent while a zero return value will
   only cause a NLMSG_ERROR message to be sent if the request is received with
   the NLM_F_ACK flag set.

 * int (*dumpit)(struct sk_buff *skb, struct netlink_callback *cb)

   This callback is similar in use to the standard Netlink 'dumpit' callback.
   The 'dumpit' callback is invoked when a Generic Netlink message is received
   with the NLM_F_DUMP flag set.

   The main difference between a 'dumpit' handler and a 'doit' handler is
   that a 'dumpit' handler does not allocate a message buffer for a response;
   a pre-allocated sk_buff is passed to the 'dumpit' handler as the first
   parameter.  The 'dumpit' handler should fill the message buffer with the
   appropriate response message and return the size of the sk_buff,
   i.e. sk_buff->len; the message buffer will then automatically be sent to
   the Generic Netlink client that initiated the request.  As long as the
   'dumpit' handler returns a value greater than zero it will be called again
   with a newly allocated message buffer to fill.  When the handler has no
   more data to send it should return zero; error conditions are indicated by
   returning a negative value.  If necessary, state can be preserved in the
   netlink_callback parameter which is passed to the 'dumpit' handler; the
   netlink_callback parameter values will be preserved across handler calls
   for a single request.

 * struct list_head ops_list

   This is a private field and should not be modified.
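
The 'dumpit' calling convention described above can be easier to follow as a
small userspace simulation.  The sketch below uses made-up stand-in types
(mock_skb and mock_cb) in place of the kernel's sk_buff and netlink_callback
structures, and a toy driver loop in place of the Generic Netlink core; it is
only meant to show the "return the length until done, then return zero"
protocol and how state carries over between calls.

```c
#include <assert.h>
#include <string.h>

#define MOCK_BUF_SIZE 16        /* deliberately tiny to force several calls */

/* Stand-ins for struct sk_buff and struct netlink_callback */
struct mock_skb {
        char data[MOCK_BUF_SIZE];
        int len;
};
struct mock_cb {
        long args[4];           /* per-request state, kept across calls */
};

static const char *items[] = { "alpha", "beta", "gamma", "delta", "epsilon" };
#define ITEM_COUNT ((int)(sizeof(items) / sizeof(items[0])))

/* 'dumpit' style handler: fill skb with as many items as fit, remember
 * where we stopped in cb->args[0], and return skb->len (0 when done). */
int mock_dumpit(struct mock_skb *skb, struct mock_cb *cb)
{
        int idx = (int)cb->args[0];

        while (idx < ITEM_COUNT) {
                int need = (int)strlen(items[idx]) + 1;

                if (skb->len + need > MOCK_BUF_SIZE)
                        break;          /* buffer full; resume on next call */
                memcpy(skb->data + skb->len, items[idx], (size_t)need);
                skb->len += need;
                idx++;
        }
        cb->args[0] = idx;
        return skb->len;
}

/* Toy version of the caller: hand the handler a fresh buffer until it
 * returns zero.  Returns the number of buffers that would have been sent,
 * or a negative value if the handler reports an error. */
int mock_run_dump(void)
{
        struct mock_cb cb;
        int sent = 0;

        memset(&cb, 0, sizeof(cb));
        for (;;) {
                struct mock_skb skb;
                int rc;

                memset(&skb, 0, sizeof(skb));
                rc = mock_dumpit(&skb, &cb);
                if (rc < 0)
                        return rc;
                if (rc == 0)
                        break;          /* nothing left to dump */
                sent++;                 /* the buffer would be sent here */
        }
        return sent;
}
```

With the five sample items and a 16 byte buffer, mock_run_dump() ends up
calling the handler four times: three calls that each fill a buffer and a
final call that returns zero.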

3.2. Registering A Family
------------------------------------------------------------------------------

Registering a Generic Netlink family is a simple four step process: define the
family, define the operations, register the family, register the operations.
To help demonstrate these steps, a simple example is broken down and explained
in detail below.

The first step is to define the family itself, which we do by creating an
instance of the genl_family structure which we explained in section 3.1.1.
In our simple example we are going to create a new Generic Netlink family
named "DOC_EXMPL".

  /* attributes */
  enum {
        DOC_EXMPL_A_UNSPEC,
        DOC_EXMPL_A_MSG,
        __DOC_EXMPL_A_MAX,
  };
  #define DOC_EXMPL_A_MAX (__DOC_EXMPL_A_MAX - 1)

  /* attribute policy */
  static struct nla_policy doc_exmpl_genl_policy[DOC_EXMPL_A_MAX + 1] = {
        [DOC_EXMPL_A_MSG] = { .type = NLA_NUL_STRING },
  };

  /* family definition */
  static struct genl_family doc_exmpl_gnl_family = {
        .id = GENL_ID_GENERATE,
        .hdrsize = 0,
        .name = "DOC_EXMPL",
        .version = 1,
        .maxattr = DOC_EXMPL_A_MAX,

  };

  Figure 6: The DOC_EXMPL family, attributes, and policy

You can see above that we defined a new family and the family recognizes a
single attribute, DOC_EXMPL_A_MSG, which is a NUL terminated string.  The
GENL_ID_GENERATE macro/constant is really just the value 0x0 and it signifies
that we want the Generic Netlink controller to assign the channel number when
we register the family.

The second step is to define the operations for the family, which we do by
creating at least one instance of the genl_ops structure which we explained in
section 3.1.2.  In this example we are only going to define one operation but
you can define up to 255 unique operations for each family.

  /* handler */
  int doc_exmpl_echo(struct sk_buff *skb, struct genl_info *info)
  {
        /* message handling code goes here; return 0 on success, negative
         * values on failure */
        return 0;
  }

  /* commands */
  enum {
        DOC_EXMPL_C_UNSPEC,
        DOC_EXMPL_C_ECHO,
        __DOC_EXMPL_C_MAX,
  };
  #define DOC_EXMPL_C_MAX (__DOC_EXMPL_C_MAX - 1)

  /* operation definition */
  struct genl_ops doc_exmpl_gnl_ops_echo = {
        .cmd = DOC_EXMPL_C_ECHO,
        .flags = 0,
        .policy = doc_exmpl_genl_policy,
        .doit = doc_exmpl_echo,
        .dumpit = NULL,
  };

  Figure 7: The DOC_EXMPL_C_ECHO operation

Here we have defined a single operation, DOC_EXMPL_C_ECHO, which uses the
Netlink attribute policy we defined above.  Once registered, this particular
operation would call the doc_exmpl_echo() function whenever a
DOC_EXMPL_C_ECHO message is sent to the DOC_EXMPL family over the Generic
Netlink bus.

The third step is to register the DOC_EXMPL family with the Generic Netlink
mechanism.  We do this with a single function call:

  genl_register_family(&doc_exmpl_gnl_family);

This call registers the new family name with the Generic Netlink mechanism and
requests a new channel number which is stored in the genl_family struct,
replacing the GENL_ID_GENERATE value.  It is important to remember to
unregister Generic Netlink families when done as the kernel does allocate
resources for each registered family.

The fourth and final step is to register the operations for the family.  Once
again this is a simple function call:

  genl_register_ops(&doc_exmpl_gnl_family, &doc_exmpl_gnl_ops_echo);

This call registers the DOC_EXMPL_C_ECHO operation in association with the
DOC_EXMPL family.  The process is now complete; other Generic Netlink users
can now issue DOC_EXMPL_C_ECHO commands and they will be handled as desired.

4.  Generic Netlink Communications
------------------------------------------------------------------------------

This section deals with the Generic Netlink messages themselves and how to
send and receive messages.

4.1. Generic Netlink Message Format
------------------------------------------------------------------------------

Generic Netlink uses the standard Netlink subsystem as a transport layer which
means that the foundation of the Generic Netlink message is the standard
Netlink message format; the only difference is the inclusion of a Generic
Netlink message header.  The format of the message is defined below:

   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                Netlink message header (nlmsghdr)              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |           Generic Netlink message header (genlmsghdr)         |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |             Optional user specific message header             |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |           Optional Generic Netlink message payload            |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

  Figure 8: Generic Netlink message format

Figure #8 is included only to give you a rough idea of how Generic Netlink
messages are formatted and sent on the "wire".  In practice the Netlink and
Generic Netlink API should insulate most users from the details of the message
format and the Netlink message headers.
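
For readers who do want to see the layout in figure #8 spelled out, the sketch
below walks a received message by hand: it skips the Netlink header, the
Generic Netlink header and the optional family specific header, then scans the
attributes in the payload.  The helper name genl_find_attr() is made up for
this example; in real code the Netlink attribute parsing interfaces should be
used instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/* Hypothetical helper: return a pointer to the payload of the first
 * attribute of the given type, or NULL if it is absent or the message is
 * malformed.  user_hdrlen is the size of the family specific header, zero
 * if the family does not use one. */
const void *genl_find_attr(const struct nlmsghdr *nlh, size_t user_hdrlen,
                           int type, size_t *payload_len)
{
        size_t hdrs = NLMSG_HDRLEN + GENL_HDRLEN + NLMSG_ALIGN(user_hdrlen);
        const char *pos;
        size_t remaining;

        if (nlh->nlmsg_len < hdrs)
                return NULL;
        pos = (const char *)nlh + hdrs;
        remaining = nlh->nlmsg_len - hdrs;

        while (remaining >= (size_t)NLA_HDRLEN) {
                const struct nlattr *nla = (const struct nlattr *)pos;
                size_t step = NLA_ALIGN((size_t)nla->nla_len);

                if (nla->nla_len < NLA_HDRLEN || nla->nla_len > remaining)
                        return NULL;    /* truncated or corrupt attribute */
                if ((nla->nla_type & NLA_TYPE_MASK) == type) {
                        *payload_len = nla->nla_len - (size_t)NLA_HDRLEN;
                        return pos + NLA_HDRLEN;
                }
                if (step > remaining)
                        break;          /* last attribute, no padding */
                pos += step;
                remaining -= step;
        }
        return NULL;
}
```

For the DOC_EXMPL family, which has no family specific header, a call such as
genl_find_attr(nlh, 0, DOC_EXMPL_A_MSG, &len) would locate the string carried
in the message.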

4.2. Kernel Communication
------------------------------------------------------------------------------

The kernel provides two sets of interfaces for sending, receiving, and
processing Generic Netlink messages.  The majority of the API consists of the
general purpose Netlink interfaces; however, there are a small number of
interfaces specific to Generic Netlink.  The following two include files
define the Netlink and Generic Netlink API for the kernel.

 * include/net/netlink.h
 * include/net/genetlink.h

4.2.1. Sending Messages

Sending Generic Netlink messages is a three step process: allocate memory for
the message buffer, create the message, send the message.  To help demonstrate
these steps, below is a simple example using the DOC_EXMPL family
shown in section 3.

The first step is to allocate a Netlink message buffer; the easiest way to do
this is with the nlmsg_new() function.

  struct sk_buff *skb;

  skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
  if (skb == NULL)
      goto failure;

  Figure 9: Allocating a Generic Netlink message buffer

The NLMSG_GOODSIZE macro/constant is a good value to use when you do not know
the size of the message buffer at the time of allocation.  Don't forget that
the message buffer needs to be big enough to hold the message payload and both
the Netlink and Generic Netlink message headers.

The second step is to actually create the message payload.  This is obviously
something which is very specific to each service, but a simple example is
shown below.

  int rc;
  void *msg_head;

  /* create the message headers */
  msg_head = genlmsg_put(skb, pid, seq, type, 0, flags, DOC_EXMPL_C_ECHO, 1);
  if (msg_head == NULL) {
      rc = -ENOMEM;
      goto failure;
  }
  /* add a DOC_EXMPL_A_MSG attribute */
  rc = nla_put_string(skb, DOC_EXMPL_A_MSG, "Generic Netlink Rocks");
  if (rc != 0)
      goto failure;
  /* finalize the message */
  genlmsg_end(skb, msg_head);

  Figure 10: Creating a Generic Netlink message payload

The genlmsg_put() function creates the required Netlink and Generic Netlink
message headers, populating them with the given values; see the Generic
Netlink header file for a description of the parameters.  The nla_put_string()
function is a standard Netlink attribute function which adds a string
attribute to the end of the Netlink message; see the Netlink header file for a
description of the parameters.  The genlmsg_end() function updates the Netlink
message header once the message payload has been finalized; this function
should be called before sending the message.

The third and final step is to send the Generic Netlink message which can be
done with a single function call.  The example below is for a unicast send,
but interfaces exist for doing a multicast send of Generic Netlink messages.

  int rc;

  rc = genlmsg_unicast(skb, pid);
  if (rc != 0)
      goto failure;

  Figure 11: Sending Generic Netlink messages

4.2.2. Receiving Messages

Typically, the kernel acts as a Generic Netlink server which means that the act of
receiving messages is handled automatically by the Generic Netlink bus.  Once
the bus receives the message and determines the correct routing, the message
is passed directly to the family specific operation callback for processing.
If the kernel is acting as a Generic Netlink client, server response messages
can be received over the Generic Netlink socket using standard kernel socket
interfaces.

4.3. Userspace Communication
------------------------------------------------------------------------------

While Generic Netlink messages can be sent and received using the standard
socket API it is recommended that user space applications use the libnl
library[1].  The libnl library insulates applications from many of the low
level Netlink tasks and uses an API which is very similar to the kernel API
shown above.

[1] http://people.suug.ch/~tgr/libnl

5. Recommendations
------------------------------------------------------------------------------

The Generic Netlink mechanism is a very flexible communications mechanism and
as a result there are many different ways it can be used.  The following
recommendations are based on conventions within the Linux kernel and should be
followed whenever possible.  While not all existing kernel code follows the
recommendations outlined here all new code should consider these
recommendations as requirements.

5.1. Attributes And Message Payloads
------------------------------------------------------------------------------

When defining new Generic Netlink message formats you must make use of the
Netlink attributes wherever possible.  The Netlink attribute mechanism has
been carefully designed to allow for future message expansion while preserving
backward compatibility.  There are also additional benefits to using Netlink
attributes which include developer familiarity and basic input checking.

Most common data structures can be represented with Netlink attributes:

 * scalar values

   Most scalar values already have well defined attribute types, see section 3
   for details

 * structures

   Structures can be represented using a nested attribute with the structure
   fields represented as attributes in the payload of the container attribute

 * arrays

   Arrays can be represented by using a single nested attribute as a container
   with several of the same attribute type inside each representing a spot in
   the array

It is also important to use unique attributes as much as possible.  This helps
make the most of the Netlink attribute mechanisms and provides for easy changes
to the message format in the future.
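
To illustrate the array case, here is a minimal userspace sketch which encodes
an array of 32 bit values as one nested container attribute holding several
attributes of the same type.  The attr_buf structure and the attr_put(),
attr_nest_end() and put_u32_array() helpers are made up for this example and
only mimic what the Netlink attribute interfaces do; they are not the kernel
API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>

/* A tiny attribute writer over a caller supplied, 4 byte aligned buffer */
struct attr_buf {
        char *data;
        size_t len;     /* bytes used so far */
        size_t cap;     /* total size of data */
};

/* Append one attribute; returns it, or NULL if it does not fit.  A NULL
 * payload with plen == 0 writes a bare header, which is how a nested
 * container starts out. */
struct nlattr *attr_put(struct attr_buf *b, unsigned short type,
                        const void *payload, size_t plen)
{
        size_t total = NLA_ALIGN(NLA_HDRLEN + plen);
        struct nlattr *nla;

        if (plen > 0xffff - NLA_HDRLEN || b->cap - b->len < total)
                return NULL;
        nla = (struct nlattr *)(b->data + b->len);
        memset(nla, 0, total);
        nla->nla_type = type;
        nla->nla_len = (unsigned short)(NLA_HDRLEN + plen);
        if (plen > 0)
                memcpy((char *)nla + NLA_HDRLEN, payload, plen);
        b->len += total;
        return nla;
}

/* Close a nested container: its length covers everything appended since */
void attr_nest_end(struct attr_buf *b, struct nlattr *container)
{
        container->nla_len = (unsigned short)
                ((b->data + b->len) - (char *)container);
}

/* Encode vals[0..n-1] as one container of type array_type whose children
 * all use elem_type.  Returns 0 on success, -1 if the buffer is too small
 * (the buffer contents are then incomplete and should be discarded). */
int put_u32_array(struct attr_buf *b, unsigned short array_type,
                  unsigned short elem_type, const unsigned int *vals, int n)
{
        struct nlattr *nest = attr_put(b, array_type, NULL, 0);
        int i;

        if (nest == NULL)
                return -1;
        for (i = 0; i < n; i++)
                if (attr_put(b, elem_type, &vals[i], sizeof(vals[i])) == NULL)
                        return -1;
        attr_nest_end(b, nest);
        return 0;
}
```

A structure can be encoded the same way; the only difference is that each
child attribute uses its own attribute type, one per structure field.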

5.2. Operation Granularity
------------------------------------------------------------------------------

While it may be tempting to register a single operation for a Generic Netlink
family and multiplex multiple sub-commands on the single operation this
is strongly discouraged for security reasons.  Combining multiple behaviors
into one operation makes it difficult to restrict the operations using the
existing Linux kernel security mechanisms.

5.3. Acknowledgment And Error Reporting
------------------------------------------------------------------------------

It is often necessary for Generic Netlink services to return an ACK or error
code to the client.  It is not necessary to implement an explicit
acknowledgment message as Netlink already provides a flexible acknowledgment
and error reporting message type called NLMSG_ERROR.  When an error occurs a
NLMSG_ERROR message is returned to the client with the error code returned by
the Generic Netlink operation handler.  Clients can also request a NLMSG_ERROR
message when no error has occurred by setting the NLM_F_ACK flag on requests.
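
On the client side, telling an ACK apart from an error comes down to
inspecting the NLMSG_ERROR payload, which is a struct nlmsgerr as defined in
linux/netlink.h.  The sketch below shows one way a userspace client might do
this; the helper name genl_check_reply() and its return convention are
assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <linux/netlink.h>

/* Hypothetical helper: classify a received Netlink message.
 * Returns 1 if it is not an NLMSG_ERROR message (i.e. a normal reply),
 * 0 if it is a plain ACK, or the negative error code reported by the
 * operation handler.  A truncated message is reported as -EBADMSG. */
int genl_check_reply(const struct nlmsghdr *nlh)
{
        const struct nlmsgerr *err;

        if (nlh->nlmsg_type != NLMSG_ERROR)
                return 1;
        if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
                return -EBADMSG;

        /* the payload starts right after the Netlink message header */
        err = (const struct nlmsgerr *)((const char *)nlh + NLMSG_HDRLEN);
        return err->error;      /* 0 means ACK, negative means failure */
}
```

A client that set NLM_F_ACK on its request would typically keep reading
messages until this helper returns something other than 1.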
