vswitchd is the userspace daemon whose core job is to run the ofproto logic. OVS is implemented against the OpenFlow switch specification. Take L2 forwarding as an example: a traditional switch (including the Linux bridge implementation) looks up a CAM table to find the port matching the destination MAC, while Open vSwitch takes the incoming skb and looks for a matching flow. If a flow exists, the skb is not the first packet of that flow, and the output port can be found in flow->action. The point worth stressing is the SDN idea that every packet must map to a flow, and the packet's behavior is given by the flow's actions. A traditional switch's actions boil down to forward, receive, or drop; SDN defines many more: rewrite the skb's contents, reroute the packet, clone several copies onto different paths, and so on.
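As a conceptual contrast (illustrative pseudocode only; every helper named here is hypothetical, not an actual OVS or kernel API):
/* Traditional L2 switch: forward on a CAM lookup keyed by destination MAC. */
void traditional_forward(struct sk_buff *skb)
{
    int port = cam_lookup(dst_mac(skb));    /* hypothetical helpers */
    forward(skb, port);
}
/* Flow-based switch: match the whole flow key, act on the flow's actions. */
void flow_based_forward(struct sk_buff *skb)
{
    struct sw_flow *flow = flow_lookup(extract_key(skb));
    if (flow)
        execute_actions(skb, flow->action); /* forward, mangle, clone, ... */
    else
        upcall_to_userspace(skb);           /* first packet of the flow */
}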
If the skb has no matching flow, it is the first packet of a flow and one must be created for it. vswitchd loops forever checking for incoming ofproto requests; they may come from ovs-ofctl, or arrive as upcall requests that openvswitch.ko sends over netlink. In the common case they are flow-miss requests asking for a flow to be created, and vswitchd builds the flow and its actions according to the OpenFlow specification. Let's walk through that path:
Since Open vSwitch models an L2 switch, every packet first arrives on some port, which ends up in ovs_dp_process_received_packet. That function first generates a key from the skb via ovs_flow_extract, then calls ovs_flow_tbl_lookup to look the flow up by key. If no flow is found, it calls ovs_dp_upcall, which ships a dp_upcall_info structure to vswitchd over netlink (via genlmsg_unicast).
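Condensed into code, the miss path looks roughly like this. This is a sketch modeled on the OVS 1.x kernel module, not the verbatim source; error handling, stats, and locking are stripped, and exact signatures vary between versions.
/* Sketch of the datapath receive path (simplified). */
void ovs_dp_process_received_packet(struct vport *p, struct sk_buff *skb)
{
    struct datapath *dp = p->dp;
    struct sw_flow_key key;
    struct sw_flow *flow;
    int key_len;

    /* Build the flow key from the packet headers. */
    if (ovs_flow_extract(skb, p->port_no, &key, &key_len)) {
        kfree_skb(skb);
        return;
    }

    /* Exact-match lookup in the kernel flow table. */
    flow = ovs_flow_tbl_lookup(rcu_dereference(dp->table), &key, key_len);
    if (!flow) {
        /* Miss: hand the packet up to vswitchd over netlink. */
        struct dp_upcall_info upcall = {
            .cmd = OVS_PACKET_CMD_MISS,
            .key = &key,
        };
        ovs_dp_upcall(dp, skb, &upcall);
        return;
    }

    /* Hit: run the flow's actions (output, mangle, clone, ...). */
    OVS_CB(skb)->flow = flow;
    ovs_execute_actions(dp, skb);
}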
vswitchd handles those netlink requests in handle_upcalls. For flow-table misses it calls handle_miss_upcalls, which in turn calls handle_flow_miss. Let's look at the implementation of handle_miss_upcalls first (excerpted below):
static void
handle_miss_upcalls(struct dpif_backer *backer, struct dpif_upcall *upcalls,
size_t n_upcalls)
{
/* Construct the to-do list.
*
* This just amounts to extracting the flow from each packet and sticking
* the packets that have the same flow in the same "flow_miss" structure so
* that we can process them together. */
hmap_init(&todo);
n_misses = 0;
The comment spells it out: the loop below walks the struct dpif_upcall instances that netlink delivered to userspace. Each one carries the missed packet and the flow key generated from it, and packets sharing a flow key are batched up to be processed together.
for (upcall = upcalls; upcall < &upcalls[n_upcalls]; upcall++) {
fitness = odp_flow_key_to_flow(upcall->key, upcall->key_len, &flow);
port = odp_port_to_ofport(backer, flow.in_port);
odp_flow_key_to_flow first calls the parse_flow_nlattrs helper (under lib/) to parse upcall->key/upcall->key_len, recording each attribute found in the present_attrs bitmap and stashing the corresponding struct nlattr pointers into struct nlattr *attrs[]. It then walks each bit set in present_attrs, pulls the value out of upcall->key, and stores it into flow. VLAN parsing gets special treatment via parse_8021q_onward.
odp_port_to_ofport converts flow.in_port, the datapath port number, into an OpenFlow port, i.e. a struct ofport_dpif *.
flow_extract(upcall->packet, flow.skb_priority,
&flow.tunnel, flow.in_port, &miss->flow);
This parses the packet itself into miss->flow; the function overlaps in places with odp_flow_key_to_flow.
/* Add other packets to a to-do list. */
hash = flow_hash(&miss->flow, 0);
existing_miss = flow_miss_find(&todo, &miss->flow, hash);
if (!existing_miss) {
hmap_insert(&todo, &miss->hmap_node, hash);
miss->ofproto = ofproto;
miss->key = upcall->key;
miss->key_len = upcall->key_len;
miss->upcall_type = upcall->type;
list_init(&miss->packets);
n_misses++;
} else {
miss = existing_miss;
}
list_push_back(&miss->packets, &upcall->packet->list_node);
}
flow_hash computes the hash of miss->flow, which is then used to look up a struct flow_miss * in the todo hmap. If nothing is found, this is the first miss for the flow, so the flow_miss is initialized and inserted into todo; finally the packet is appended to the flow_miss->packets list. This confirms the earlier conclusion: for a batch of upcalls, packets belonging to the same flow_miss are chained under that flow_miss and processed together.
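flow_miss_find is the standard OVS find-in-hmap idiom; it is likely implemented along these lines (a sketch matching the 1.x source, minor details may differ):
static struct flow_miss *
flow_miss_find(struct hmap *todo, const struct flow *flow, uint32_t hash)
{
    struct flow_miss *miss;

    /* Visit only the entries with a matching hash, then confirm with a
     * full flow comparison. */
    HMAP_FOR_EACH_WITH_HASH (miss, hmap_node, hash, todo) {
        if (flow_equal(&miss->flow, flow)) {
            return miss;
        }
    }
    return NULL;
}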
OVS defines a facet to represent a userspace program's (i.e. vswitchd's) view of a matched flow. The kernel has its own view of the same flow; the facet captures the part the two views agree on, and each differing part is represented by a subfacet. struct subfacet is where the datapath actions live.
If the flow key computed by the datapath matches the flow key vswitchd computes from the packet exactly, the facet contains a single subfacet. If the datapath's flow key carries more fields than the one vswitchd derives from the packet, each extra variant becomes a subfacet of its own.
struct subfacet {
/* Owners. */
struct hmap_node hmap_node; /* In struct ofproto_dpif 'subfacets' list. */
struct list list_node; /* In struct facet's 'subfacets' list. */
struct facet *facet; /* Owning facet. */
/* Key.
*
* To save memory in the common case, 'key' is NULL if 'key_fitness' is
* ODP_FIT_PERFECT, that is, odp_flow_key_from_flow() can accurately
* regenerate the ODP flow key from ->facet->flow. */
enum odp_key_fitness key_fitness;
struct nlattr *key;
int key_len;
long long int used; /* Time last used; time created if not used. */
uint64_t dp_packet_count; /* Last known packet count in the datapath. */
uint64_t dp_byte_count; /* Last known byte count in the datapath. */
/* Datapath actions.
*
* These should be essentially identical for every subfacet in a facet, but
* may differ in trivial ways due to VLAN splinters. */
size_t actions_len; /* Number of bytes in actions[]. */
struct nlattr *actions; /* Datapath actions. */
enum slow_path_reason slow; /* 0 if fast path may be used. */
enum subfacet_path path; /* Installed in datapath? */
};
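To make the ownership concrete, here is a purely illustrative walk over a facet's subfacets, using the real LIST_FOR_EACH macro from lib/list.h and the fields shown above (the helper itself is hypothetical):
/* Hypothetical helper: dump every subfacet owned by one facet. */
static void
facet_dump_subfacets(const struct facet *facet)
{
    struct subfacet *subfacet;

    LIST_FOR_EACH (subfacet, list_node, &facet->subfacets) {
        /* Each subfacet carries its own datapath key and actions; the
         * actions are essentially identical across subfacets of a facet. */
        printf("subfacet: key_len=%d actions_len=%zu\n",
               subfacet->key_len, subfacet->actions_len);
    }
}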
Now let's look at handle_flow_miss:
/* Handles flow miss 'miss' on 'ofproto'. May add any required datapath
* operations to 'ops', incrementing '*n_ops' for each new op. */
static void
handle_flow_miss(struct ofproto_dpif *ofproto, struct flow_miss *miss,
struct flow_miss_op *ops, size_t *n_ops)
{
struct facet *facet;
uint32_t hash;
/* The caller must ensure that miss->hmap_node.hash contains
* flow_hash(miss->flow, 0). */
hash = miss->hmap_node.hash;
facet = facet_lookup_valid(ofproto, &miss->flow, hash);
This looks up the flow in struct ofproto_dpif *ofproto, the structure representing the datapath. ofproto->facets is a hash map: the miss flow's hash is computed first, then the hmap_node chain for that hash is scanned for a matching flow. The comparison is brute force: a straight memcmp.
if (!facet) {
struct rule_dpif *rule = rule_dpif_lookup(ofproto, &miss->flow);
if (!flow_miss_should_make_facet(ofproto, miss, hash)) {
handle_flow_miss_without_facet(miss, rule, ops, n_ops);
At this point it is judged not worth creating a flow facet: for trivial traffic, setting up a facet costs more than it saves.
return;
}
facet = facet_create(rule, &miss->flow, hash);
Otherwise, we create a facet for this flow.
}
handle_flow_miss_with_facet(miss, facet, ops, n_ops);
}
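The "brute force" comparison mentioned above is flow_equal, which in lib/flow.h is little more than a memcmp (paraphrased; the exact form may differ by version):
static inline bool
flow_equal(const struct flow *a, const struct flow *b)
{
    return !memcmp(a, b, sizeof *a);
}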
struct flow_miss is a wrapper around a flow that speeds up batched handling of missed flows. In most cases the facet does get created:
2012-10-26T07:15:43Z|22522|ofproto_dpif|INFO|[qinq] miss flow, create facet: vlan_tci 0, proto 0x806, in_port 1, src mac 0:16:3e:83:0:1, dst mac 0:25:9e:5d:62:53
2012-10-26T07:15:43Z|22529|ofproto_dpif|INFO|[qinq] miss flow, create facet: vlan_tci 0, proto 0x806, in_port 2, src mac 0:25:9e:5d:62:53, dst mac 0:16:3e:83:0:1
The logs show one full-duplex conversation creating two flows, each with its own facet.
Next comes handle_flow_miss_with_facet, which calls subfacet_make_actions to generate the actions. That function first calls action_xlate_ctx_init to initialize an action_xlate_ctx structure, defined as follows:
struct action_xlate_ctx {
/* action_xlate_ctx_init() initializes these members. */
/* The ofproto. */
struct ofproto_dpif *ofproto;
/* Flow to which the OpenFlow actions apply. xlate_actions() will modify
* this flow when actions change header fields. */
struct flow flow;
/* The packet corresponding to 'flow', or a null pointer if we are
* revalidating without a packet to refer to. */
const struct ofpbuf *packet;
/* Should OFPP_NORMAL update the MAC learning table? Should "learn"
* actions update the flow table?
*
* We want to update these tables if we are actually processing a packet,
* or if we are accounting for packets that the datapath has processed, but
* not if we are just revalidating. */
bool may_learn;
/* The rule that we are currently translating, or NULL. */
struct rule_dpif *rule;
/* Union of the set of TCP flags seen so far in this flow. (Used only by
* NXAST_FIN_TIMEOUT. Set to zero to avoid updating rules'
* timeouts.) */
uint8_t tcp_flags;
/* xlate_actions() initializes and uses these members. The client might want
* to look at them after it returns. */
struct ofpbuf *odp_actions; /* Datapath actions. */
tag_type tags; /* Tags associated with actions. */
enum slow_path_reason slow; /* 0 if fast path may be used. */
bool has_learn; /* Actions include NXAST_LEARN? */
bool has_normal; /* Actions output to OFPP_NORMAL? */
bool has_fin_timeout; /* Actions include NXAST_FIN_TIMEOUT? */
uint16_t nf_output_iface; /* Output interface index for NetFlow. */
mirror_mask_t mirrors; /* Bitmap of associated mirrors. */
/* xlate_actions() initializes and uses these members, but the client has no
* reason to look at them. */
int recurse; /* Recursion level, via xlate_table_action. */
bool max_resubmit_trigger; /* Recursed too deeply during translation. */
struct flow base_flow; /* Flow at the last commit. */
uint32_t orig_skb_priority; /* Priority when packet arrived. */
uint8_t table_id; /* OpenFlow table ID where flow was found. */
uint32_t sflow_n_outputs; /* Number of output ports. */
uint16_t sflow_odp_port; /* Output port for composing sFlow action. */
uint16_t user_cookie_offset;/* Used for user_action_cookie fixup. */
bool exit; /* No further actions should be processed. */
struct flow orig_flow; /* Copy of original flow. */
};
Then xlate_actions is called. OpenFlow 1.0 defines the following actions:
enum ofp10_action_type {
OFPAT10_OUTPUT, /* Output to switch port. */
OFPAT10_SET_VLAN_VID, /* Set the 802.1q VLAN id. */
OFPAT10_SET_VLAN_PCP, /* Set the 802.1q priority. */
OFPAT10_STRIP_VLAN, /* Strip the 802.1q header. */
OFPAT10_SET_DL_SRC, /* Ethernet source address. */
OFPAT10_SET_DL_DST, /* Ethernet destination address. */
OFPAT10_SET_NW_SRC, /* IP source address. */
OFPAT10_SET_NW_DST, /* IP destination address. */
OFPAT10_SET_NW_TOS, /* IP ToS (DSCP field, 6 bits). */
OFPAT10_SET_TP_SRC, /* TCP/UDP source port. */
OFPAT10_SET_TP_DST, /* TCP/UDP destination port. */
OFPAT10_ENQUEUE, /* Output to queue. */
OFPAT10_VENDOR = 0xffff
};
Each action type takes its own data structure, e.g.:
/* Action structure for OFPAT10_SET_VLAN_VID. */
struct ofp_action_vlan_vid {
ovs_be16 type; /* OFPAT10_SET_VLAN_VID. */
ovs_be16 len; /* Length is 8. */
ovs_be16 vlan_vid; /* VLAN id. */
uint8_t pad[2];
};
/* Action structure for OFPAT10_SET_VLAN_PCP. */
struct ofp_action_vlan_pcp {
ovs_be16 type; /* OFPAT10_SET_VLAN_PCP. */
ovs_be16 len; /* Length is 8. */
uint8_t vlan_pcp; /* VLAN priority. */
uint8_t pad[3];
};
union ofp_action {
ovs_be16 type;
struct ofp_action_header header;
struct ofp_action_vendor_header vendor;
struct ofp_action_output output;
struct ofp_action_vlan_vid vlan_vid;
struct ofp_action_vlan_pcp vlan_pcp;
struct ofp_action_nw_addr nw_addr;
struct ofp_action_nw_tos nw_tos;
struct ofp_action_tp_port tp_port;
};
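For a feel of the wire format, here is how one of these actions might be filled in by a controller or a test harness (hypothetical helper; the values follow the struct comments above):
/* Build an OFPAT10_SET_VLAN_VID action that rewrites traffic onto the
 * given VLAN; 2 + 2 + 2 + 2 pad = 8 bytes, matching "Length is 8." */
static void
make_set_vlan_vid(struct ofp_action_vlan_vid *vid, uint16_t vlan)
{
    memset(vid, 0, sizeof *vid);
    vid->type = htons(OFPAT10_SET_VLAN_VID);
    vid->len = htons(sizeof *vid);
    vid->vlan_vid = htons(vlan);
}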
do_xlate_actions takes an array of struct ofp_action and performs a different operation for each one, e.g.:
case OFPUTIL_OFPAT10_OUTPUT:
xlate_output_action(ctx, &ia->output);
break;
case OFPUTIL_OFPAT10_SET_VLAN_VID:
ctx->flow.vlan_tci &= ~htons(VLAN_VID_MASK);
ctx->flow.vlan_tci |= ia->vlan_vid.vlan_vid | htons(VLAN_CFI);
break;
case OFPUTIL_OFPAT10_SET_VLAN_PCP:
ctx->flow.vlan_tci &= ~htons(VLAN_PCP_MASK);
ctx->flow.vlan_tci |= htons(
(ia->vlan_pcp.vlan_pcp << VLAN_PCP_SHIFT) | VLAN_CFI);
break;
case OFPUTIL_OFPAT10_STRIP_VLAN:
ctx->flow.vlan_tci = htons(0);
break;
For packet forwarding the key case is xlate_output_action, which calls xlate_output_action__. The port passed in is an OpenFlow port number or one of the reserved control values, as the definition of enum ofp_port shows:
enum ofp_port {
/* Maximum number of physical switch ports. */
OFPP_MAX = 0xff00,
/* Fake output "ports". */
OFPP_IN_PORT = 0xfff8, /* Send the packet out the input port. This
virtual port must be explicitly used
in order to send back out of the input
port. */
OFPP_TABLE = 0xfff9, /* Perform actions in flow table.
NB: This can only be the destination
port for packet-out messages. */
OFPP_NORMAL = 0xfffa, /* Process with normal L2/L3 switching. */
OFPP_FLOOD = 0xfffb, /* All physical ports except input port and
those disabled by STP. */
OFPP_ALL = 0xfffc, /* All physical ports except input port. */
OFPP_CONTROLLER = 0xfffd, /* Send to controller. */
OFPP_LOCAL = 0xfffe, /* Local openflow "port". */
OFPP_NONE = 0xffff /* Not associated with a physical port. */
};
In xlate_output_action__ most traffic falls into the OFPP_NORMAL case, which calls xlate_normal. There, mac_learning_lookup searches the MAC table for the packet's output port, then output_normal is called, which ultimately lands in compose_output_action:
static void
compose_output_action__(struct action_xlate_ctx *ctx, uint16_t ofp_port,
bool check_stp)
{
const struct ofport_dpif *ofport = get_ofp_port(ctx->ofproto, ofp_port);
uint16_t odp_port = ofp_port_to_odp_port(ofp_port);
ovs_be16 flow_vlan_tci = ctx->flow.vlan_tci;
uint8_t flow_nw_tos = ctx->flow.nw_tos;
uint16_t out_port;
...
out_port = vsp_realdev_to_vlandev(ctx->ofproto, odp_port,
ctx->flow.vlan_tci);
if (out_port != odp_port) {
ctx->flow.vlan_tci = htons(0);
}
commit_odp_actions(&ctx->flow, &ctx->base_flow, ctx->odp_actions);
nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port);
ctx->sflow_odp_port = odp_port;
ctx->sflow_n_outputs++;
ctx->nf_output_iface = ofp_port;
ctx->flow.vlan_tci = flow_vlan_tci;
ctx->flow.nw_tos = flow_nw_tos;
}
commit_odp_actions encodes all of the accumulated actions into nlattr format in ctx->odp_actions; the nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_OUTPUT, out_port) that follows appends the packet's output port. With that, a flow's action list is more or less fully assembled.
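To make the encoding concrete, here is a hedged sketch that composes a minimal datapath action list with the lib/ netlink helpers (nl_msg_put_u32 and ofpbuf_use_stub are real OVS functions; the two-port output sequence is invented for illustration):
/* Compose a datapath action list that clones the packet to ports 1 and 2. */
struct ofpbuf odp_actions;
uint64_t stub[64];

ofpbuf_use_stub(&odp_actions, stub, sizeof stub);

/* Each OVS_ACTION_ATTR_OUTPUT attribute means "send a copy out this
 * datapath port". */
nl_msg_put_u32(&odp_actions, OVS_ACTION_ATTR_OUTPUT, 1);
nl_msg_put_u32(&odp_actions, OVS_ACTION_ATTR_OUTPUT, 2);

/* odp_actions.data / odp_actions.size are now ready to be handed to the
 * kernel as the actions of a flow-setup request. */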
Next let's discuss the CAM table inside vswitchd; the code lives in lib/mac-learning.h and lib/mac-learning.c.
vswitchd maintains an internal MAC/port CAM table whose entries age out after 300 seconds. The table also defines the notion of a flooding VLAN: if a VLAN is marked flooding, no addresses are learned on it and all forwarding within that VLAN is done by flooding.
/* A MAC learning table entry. */
struct mac_entry {
struct hmap_node hmap_node; /* Node in a mac_learning hmap. */
struct list lru_node; /* Element in 'lrus' list. */
time_t expires; /* Expiration time. */
time_t grat_arp_lock; /* Gratuitous ARP lock expiration time. */
uint8_t mac[ETH_ADDR_LEN]; /* Known MAC address. */
uint16_t vlan; /* VLAN tag. */
tag_type tag; /* Tag for this learning entry. */
/* Learned port. */
union {
void *p;
int i;
} port;
};
/* MAC learning table. */
struct mac_learning {
    struct hmap table;          /* Learning table. */
    struct list lrus;           /* In-use entries, least recently used at the
                                   front, most recently used at the back. */
    uint32_t secret;            /* Secret for randomizing hash table. */
    unsigned long *flood_vlans; /* Bitmap of learning disabled VLANs. */
    unsigned int idle_time;     /* Max age before deleting an entry. */
};
Each mac_entry hangs off mac_learning->table by its hmap_node and sits on the mac_learning->lrus LRU list by its lru_node; idle_time is the maximum aging time before an entry is deleted.
static uint32_t
mac_table_hash(const struct mac_learning *ml, const uint8_t mac[ETH_ADDR_LEN],
uint16_t vlan)
{
unsigned int mac1 = get_unaligned_u32((uint32_t *) mac);
unsigned int mac2 = get_unaligned_u16((uint16_t *) (mac + 4));
return hash_3words(mac1, mac2 | (vlan << 16), ml->secret);
}
The mac_entry hash is computed by hash_3words over the MAC address, the VLAN, and mac_learning->secret.
mac_entry_lookup: checks whether a mac_entry already exists for a given MAC address and VLAN.
get_lru: returns the first mac_entry on the LRU list.
mac_learning_create/mac_learning_destroy: create and destroy the mac_learning table.
mac_learning_may_learn: returns true if the VLAN is not a flooding VLAN and the MAC address is not a multicast address.
mac_learning_insert: inserts a mac_entry into mac_learning. It first checks via mac_entry_lookup whether an entry for the MAC/VLAN pair already exists; if not, and the table already holds MAC_MAX entries, the oldest entry is aged out, then a new mac_entry is created and inserted into the CAM table.
mac_learning_lookup: calls mac_entry_lookup to look up the entry for a MAC address on a given VLAN in the CAM table.
mac_learning_run: sweeps the table, expiring mac_entry structures whose time is up. A sketch of how these calls combine in the OFPP_NORMAL path follows.
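Putting it together, the learn-then-forward pattern in xlate_normal looks roughly like this (a condensed sketch modeled on ofproto/ofproto-dpif.c of the same era; bundle selection, gratuitous-ARP locking, and error paths are omitted, and flood_packets is a hypothetical stand-in for the real flood loop):
/* Learn the source MAC, then look up the destination; flood on a miss. */
static void
xlate_normal_sketch(struct action_xlate_ctx *ctx, uint16_t vlan,
                    struct ofbundle *in_bundle)
{
    struct mac_learning *ml = ctx->ofproto->ml;
    struct mac_entry *mac;

    /* Learning side: remember which bundle the source MAC came from. */
    if (ctx->may_learn && mac_learning_may_learn(ml, ctx->flow.dl_src, vlan)) {
        mac = mac_learning_insert(ml, ctx->flow.dl_src, vlan);
        mac->port.p = in_bundle;
    }

    /* Forwarding side: known destination -> one port, unknown -> flood. */
    mac = mac_learning_lookup(ml, ctx->flow.dl_dst, vlan, &ctx->tags);
    if (mac) {
        output_normal(ctx, mac->port.p, vlan);
    } else {
        flood_packets(ctx, vlan);   /* hypothetical stand-in */
    }
}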
How to Port Open vSwitch to New Software or Hardware
====================================================

Open vSwitch (OVS) is intended to be easily ported to new software and
hardware platforms. This document describes the types of changes that are
most likely to be necessary in porting OVS to Unix-like platforms. (Porting
OVS to other kinds of platforms is likely to be more difficult.)

Vocabulary
----------

For historical reasons, different words are used for essentially the same
concept in different areas of the Open vSwitch source tree. Here is a
concordance, indexed by the area of the source tree:

    datapath/               vport        ---
    vswitchd/               iface        port
    ofproto/                port         bundle
    ofproto/bond.c          slave        bond
    lib/lacp.c              slave        lacp
    lib/netdev.c            netdev       ---
    database                Interface    Port

Open vSwitch Architectural Overview
-----------------------------------

The following diagram shows the very high-level architecture of Open vSwitch
from a porter's perspective.

    +-------------------+
    |    ovs-vswitchd   |<-->ovsdb-server
    +-------------------+
    |      ofproto      |<-->OpenFlow controllers
    +--------+-+--------+
    | netdev | | ofproto|
    +--------+ |provider|
    | netdev | +--------+
    |provider|
    +--------+

Some of the components are generic. Modulo bugs or inadequacies, these
components should not need to be modified as part of a port:

- "ovs-vswitchd" is the main Open vSwitch userspace program, in vswitchd/.
  It reads the desired Open vSwitch configuration from the ovsdb-server
  program over an IPC channel and passes this configuration down to the
  "ofproto" library. It also passes certain status and statistical
  information from ofproto back into the database.

- "ofproto" is the Open vSwitch library, in ofproto/, that implements an
  OpenFlow switch. It talks to OpenFlow controllers over the network and to
  switch hardware or software through an "ofproto provider", explained
  further below.

- "netdev" is the Open vSwitch library, in lib/netdev.c, that abstracts
  interacting with network devices, that is, Ethernet interfaces. The netdev
  library is a thin layer over "netdev provider" code, explained further
  below.

The other components may need attention during a port. You will almost
certainly have to implement a "netdev provider". Depending on the type of
port you are doing and the desired performance, you may also have to
implement an "ofproto provider" or a lower-level component called a "dpif"
provider.

The following sections talk about these components in more detail.

Writing a netdev Provider
-------------------------

A "netdev provider" implements an operating system and hardware specific
interface to "network devices", e.g. eth0 on Linux. Open vSwitch must be
able to open each port on a switch as a netdev, so you will need to
implement a "netdev provider" that works with your switch hardware and
software.

struct netdev_class, in lib/netdev-provider.h, defines the interfaces
required to implement a netdev. That structure contains many function
pointers, each of which has a comment that is meant to describe its behavior
in detail. If the requirements are unclear, please report this as a bug.

The netdev interface can be divided into a few rough categories:

* Functions required to properly implement OpenFlow features. For example,
  OpenFlow requires the ability to report the Ethernet hardware address of a
  port. These functions must be implemented for minimally correct operation.

* Functions required to implement optional Open vSwitch features. For
  example, the Open vSwitch support for in-band control requires netdev
  support for inspecting the TCP/IP stack's ARP table. These functions must
  be implemented if the corresponding OVS features are to work, but may be
  omitted initially.

* Functions needed in some implementations but not in others. For example,
  most kinds of ports (see below) do not need functionality to receive
  packets from a network device.

The existing netdev implementations may serve as useful examples during a
port:

* lib/netdev-linux.c implements netdev functionality for Linux network
  devices, using Linux kernel calls. It may be a good place to start for
  full-featured netdev implementations.

* lib/netdev-vport.c provides support for "virtual ports" implemented by the
  Open vSwitch datapath module for the Linux kernel. This may serve as a
  model for minimal netdev implementations.

* lib/netdev-dummy.c is a fake netdev implementation useful only for
  testing.

Porting Strategies
------------------

After a netdev provider has been implemented for a system's network devices,
you may choose among three basic porting strategies.

The lowest-effort strategy is to use the "userspace switch" implementation
built into Open vSwitch. This ought to work, without writing any more code,
as long as the netdev provider that you implemented supports receiving
packets. It yields poor performance, however, because every packet passes
through the ovs-vswitchd process. See [INSTALL.userspace.md] for
instructions on how to configure a userspace switch.

If the userspace switch is not the right choice for your port, then you will
have to write more code. You may implement either an "ofproto provider" or a
"dpif provider". Which you should choose depends on a few different factors:

* Only an ofproto provider can take full advantage of hardware with built-in
  support for wildcards (e.g. an ACL table or a TCAM).

* A dpif provider can take advantage of the Open vSwitch built-in
  implementations of bonding, LACP, 802.1ag, 802.1Q VLANs, and other
  features. An ofproto provider has to provide its own implementations, if
  the hardware can support them at all.

* A dpif provider is usually easier to implement, but most appropriate for
  software switching. It "explodes" wildcard rules into exact-match entries
  (with an optional wildcard mask). This allows fast hash lookups in
  software, but makes inefficient use of TCAMs in hardware that support
  wildcarding.

The following sections describe how to implement each kind of port.

ofproto Providers
-----------------

An "ofproto provider" is what ofproto uses to directly monitor and control
an OpenFlow-capable switch. struct ofproto_class, in
ofproto/ofproto-provider.h, defines the interfaces to implement an ofproto
provider for new hardware or software. That structure contains many function
pointers, each of which has a comment that is meant to describe its behavior
in detail. If the requirements are unclear, please report this as a bug.

The ofproto provider interface is preliminary. Please let us know if it
seems unsuitable for your purpose. We will try to improve it.

Writing a dpif Provider
-----------------------

Open vSwitch has a built-in ofproto provider named "ofproto-dpif", which is
built on top of a library for manipulating datapaths, called "dpif". A
"datapath" is a simple flow table, one that is only required to support
exact-match flows, that is, flows without wildcards. When a packet arrives
on a network device, the datapath looks for it in this table. If there is a
match, then it performs the associated actions. If there is no match, the
datapath passes the packet up to ofproto-dpif, which maintains the full
OpenFlow flow table. If the packet matches in this flow table, then
ofproto-dpif executes its actions and inserts a new entry into the dpif flow
table. (Otherwise, ofproto-dpif passes the packet up to ofproto to send the
packet to the OpenFlow controller, if one is configured.)

When calculating the dpif flow, ofproto-dpif generates an exact-match flow
that describes the missed packet. It makes an effort to figure out what
fields can be wildcarded based on the switch's configuration and OpenFlow
flow table. The dpif is free to ignore the suggested wildcards and only
support the exact-match entry. However, if the dpif supports wildcarding,
then it can use the masks to match multiple flows with fewer entries and
potentially significantly reduce the number of flow misses handled by
ofproto-dpif.

The "dpif" library in turn delegates much of its functionality to a "dpif
provider". The following diagram shows how dpif providers fit into the Open
vSwitch architecture:

                _
               |   +-------------------+
               |   |    ovs-vswitchd   |<-->ovsdb-server
               |   +-------------------+
               |   |      ofproto      |<-->OpenFlow controllers
               |   +--------+-+--------+  _
               |   | netdev | |ofproto-| |
     userspace |   +--------+ |  dpif  | |
               |   | netdev | +--------+ |
               |   |provider| |  dpif  | |
               |   +---||---+ +--------+ |
               |       ||     |  dpif  | | implementation of
               |       ||     |provider| | ofproto provider
               |_      ||     +---||---+ |
                       ||         ||     |
                _  +---||-----+---||---+ |
               |   |          |datapath| |
        kernel |   |          +--------+_|
               |   |                   |
               |_  +--------||---------+
                            ||
                   physical NIC

struct dpif_class, in lib/dpif-provider.h, defines the interfaces required
to implement a dpif provider for new hardware or software. That structure
contains many function pointers, each of which has a comment that is meant
to describe its behavior in detail. If the requirements are unclear, please
report this as a bug.

There are two existing dpif implementations that may serve as useful
examples during a port:

* lib/dpif-netlink.c is a Linux-specific dpif implementation that talks to
  an Open vSwitch-specific kernel module (whose sources are in the
  "datapath" directory). The kernel module performs all of the switching
  work, passing packets that do not match any flow table entry up to
  userspace. This dpif implementation is essentially a wrapper around calls
  into the kernel module.

* lib/dpif-netdev.c is a generic dpif implementation that performs all
  switching internally. This is how the Open vSwitch userspace switch is
  implemented.
vswitchd is the heart of OVS: all of the OpenFlow logic is implemented there. OVS generally breaks down into three parts: the datapath, vswitchd, and ovsdb. The datapath is tied to a concrete data-plane platform, say a whitebox switch or the Linux kernel, and it is not a mandatory component. ovsdb stores the vswitch's own configuration: ports, topology, rules, and so on. In the OVS distribution vswitchd ships as a userspace process, but that is not set in stone; the excerpt above describes how to port OVS to other platforms and is, for now, about the only official document that outlines the OVS architecture.
vswitchd itself is layered. The daemon layer at the top talks to ovsdb, pushing configuration down and propagating updates. In the middle sits the ofproto layer, which talks to OpenFlow controllers and exposes the ofproto provider interface via ofproto_class, so that the platform-specific OpenFlow implementations all share one interface.
In OVS terms, a netdev stands for a platform's device implementation, e.g. the Linux kernel's net_device, or a port on a switch platform after porting. struct netdev_class defines the interface a netdev provider must implement; a platform supplies these uniform entry points to create, destroy, open, and close netdev devices, among other operations. Each netdev type is registered via netdev_register_provider, and vswitchd keeps every registered type in a struct cmap named netdev_classes.
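As a rough illustration of the provider pattern (a schematic sketch only: the real struct netdev_class in lib/netdev-provider.h has many more callbacks, and their names and signatures shift between OVS versions; "myplatform" is a made-up provider):
/* Schematic provider: only .type is shown; the real structure has dozens
 * of callbacks (construct/destruct, send/receive, get_etheraddr, ...). */
static const struct netdev_class myplatform_netdev_class = {
    .type = "myplatform",   /* name Interface records use to pick it */
    /* ... remaining callbacks wired to the platform's primitives ... */
};

void
myplatform_netdev_register(void)
{
    /* Adds the class to vswitchd's table of netdev types (the cmap
     * netdev_classes mentioned above). */
    netdev_register_provider(&myplatform_netdev_class);
}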
The ofproto layer defines the OpenFlow interface through ofproto_class. Several other data structures matter alongside it: struct ofproto, struct ofport, struct rule, struct oftable, and struct ofgroup (their relationships are sketched after this list).
1. struct ofproto represents one OpenFlow switch; it holds the struct ofproto_class, a hash map of struct ofport, the struct oftable array, a hash map of struct ofgroup, etc.
2. struct ofport represents one port of the OpenFlow switch and is associated with a struct netdev device abstraction.
3. struct rule represents one OpenFlow rule; a rule contains a set of struct rule_actions.
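A rough containment sketch of those relationships (illustrative pseudo-declarations, simplified from ofproto/ofproto-provider.h, not the actual definitions):
struct ofproto {                        /* One OpenFlow switch. */
    const struct ofproto_class *ofproto_class; /* Provider hooks. */
    struct hmap ports;                  /* Contains "struct ofport"s. */
    struct oftable *tables;             /* Array of flow tables... */
    int n_tables;                       /* ...this many of them. */
    struct hmap groups;                 /* Contains "struct ofgroup"s. */
};

struct ofport {                         /* One switch port. */
    struct ofproto *ofproto;            /* Owning switch. */
    struct netdev *netdev;              /* Underlying device abstraction. */
};

struct rule {                           /* One OpenFlow flow entry. */
    struct ofproto *ofproto;            /* Owning switch. */
    const struct rule_actions *actions; /* The rule's action set. */
};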