During kernel protocol stack initialization, inet_init() registers the built-in protocols (TCP, UDP, RAW, PING) and creates a slab cache for each of them.
static int __init inet_init(void)
{
    ...
    rc = proto_register(&tcp_prot, 1);
    if (rc)
        goto out_free_reserved_ports;
    rc = proto_register(&udp_prot, 1);
    if (rc)
        goto out_unregister_tcp_proto;
    rc = proto_register(&raw_prot, 1);
    if (rc)
        goto out_unregister_udp_proto;
    rc = proto_register(&ping_prot, 1);
    if (rc)
        goto out_unregister_raw_proto;
    ...
}
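The goto labels in inet_init() follow the usual kernel unwinding pattern: when a later registration fails, the earlier ones are undone in reverse order. Here is a minimal user-space sketch of that pattern. Every demo_* name is a hypothetical stand-in invented for illustration; none of it is kernel code, and the fail_at parameter exists only to simulate a failure.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX 4

/* Hypothetical registry standing in for the kernel's protocol list. */
static const char *demo_registered[DEMO_MAX];
static int demo_count;

/* Stand-in for proto_register(); should_fail simulates an error. */
static int demo_register(const char *name, int should_fail)
{
    if (should_fail || demo_count == DEMO_MAX)
        return -1;
    demo_registered[demo_count++] = name;
    return 0;
}

/* Stand-in for proto_unregister(). */
static void demo_unregister(const char *name)
{
    /* Unwinding runs in reverse order, so the newest entry must match. */
    assert(demo_count > 0);
    assert(strcmp(demo_registered[demo_count - 1], name) == 0);
    demo_count--;
}

/* fail_at: index (0..3) of the registration that fails, or -1 for none. */
static int demo_init(int fail_at)
{
    int rc;

    rc = demo_register("TCP", fail_at == 0);
    if (rc)
        goto out;
    rc = demo_register("UDP", fail_at == 1);
    if (rc)
        goto out_unregister_tcp;
    rc = demo_register("RAW", fail_at == 2);
    if (rc)
        goto out_unregister_udp;
    rc = demo_register("PING", fail_at == 3);
    if (rc)
        goto out_unregister_raw;
    return 0;

    /* Each label undoes one step and falls through to the next. */
out_unregister_raw:
    demo_unregister("RAW");
out_unregister_udp:
    demo_unregister("UDP");
out_unregister_tcp:
    demo_unregister("TCP");
out:
    return rc;
}
```

Calling demo_init(2) makes the RAW step fail, so UDP and TCP are unregistered again and the registry ends up empty; demo_init(-1) leaves all four entries registered.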
Before looking at the registration function, let's see what the TCP protocol object being registered, tcp_prot, actually contains.
struct proto tcp_prot = {
    .name = "TCP",
    .owner = THIS_MODULE,
    .close = tcp_close,
    .connect = tcp_v4_connect,
    .disconnect = tcp_disconnect,
    .accept = inet_csk_accept,
    .ioctl = tcp_ioctl,
    .init = tcp_v4_init_sock,
    .destroy = tcp_v4_destroy_sock,
    .shutdown = tcp_shutdown,
    .setsockopt = tcp_setsockopt,
    .getsockopt = tcp_getsockopt,
    .recvmsg = tcp_recvmsg,
    .sendmsg = tcp_sendmsg,
    .sendpage = tcp_sendpage,
    .backlog_rcv = tcp_v4_do_rcv,
    .release_cb = tcp_release_cb,
    .hash = inet_hash,
    .unhash = inet_unhash,
    .get_port = inet_csk_get_port,
    .enter_memory_pressure = tcp_enter_memory_pressure,
    .sockets_allocated = &tcp_sockets_allocated,
    .orphan_count = &tcp_orphan_count,
    .memory_allocated = &tcp_memory_allocated,
    .memory_pressure = &tcp_memory_pressure,
    .sysctl_wmem = sysctl_tcp_wmem,
    .sysctl_rmem = sysctl_tcp_rmem,
    .max_header = MAX_TCP_HEADER,
    .obj_size = sizeof(struct tcp_sock),
    .slab_flags = SLAB_DESTROY_BY_RCU,
    .twsk_prot = &tcp_timewait_sock_ops,
    .rsk_prot = &tcp_request_sock_ops,
    .h.hashinfo = &tcp_hashinfo,
    .no_autobind = true,
#ifdef CONFIG_COMPAT
    .compat_setsockopt = compat_tcp_setsockopt,
    .compat_getsockopt = compat_tcp_getsockopt,
#endif
#ifdef CONFIG_MEMCG_KMEM
    .init_cgroup = tcp_init_cgroup,
    .destroy_cgroup = tcp_destroy_cgroup,
    .proto_cgroup = tcp_proto_cgroup,
#endif
};
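This protocol structure holds almost everything TCP needs. We will use the name, init, and obj_size members shortly, and close, connect, accept, and the others will come up later, so it is worth getting familiar with this structure.
Now let's look at the registration function: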
int proto_register(struct proto *prot, int alloc_slab)
{
    if (alloc_slab) { /* non-zero: allocate a slab cache when registering the protocol */
        prot->slab = kmem_cache_create(prot->name, prot->obj_size, 0,
                                       SLAB_HWCACHE_ALIGN | prot->slab_flags,
                                       NULL);
        if (prot->slab == NULL) {
            pr_crit("%s: Can't create sock SLAB cache!\n",
                    prot->name);
            goto out;
        }
        ...
}
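So it is ultimately kmem_cache_create() that builds the slab cache for TCP. The two arguments prot->name and prot->obj_size were introduced with tcp_prot above: prot->name (value "TCP") is the name shown in /proc/slabinfo, and prot->obj_size (value sizeof(struct tcp_sock)) is the size of each object in the cache. The result is a slab cache named TCP whose objects are sizeof(struct tcp_sock) bytes each.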
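To make the data flow concrete, here is a user-space sketch of how the name and obj_size fields of a protocol descriptor end up describing its cache. The demo_* types and functions are hypothetical stand-ins invented for this illustration. They only mimic the shape of proto_register(), and the "cache" is a plain record of a name and an object size, not a real slab allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache descriptor: records only what the article discusses. */
struct demo_cache {
    const char *name;   /* label, like the name column in /proc/slabinfo */
    size_t obj_size;    /* size of every object handed out by the cache */
};

/* Hypothetical, heavily reduced counterpart of struct proto. */
struct demo_proto {
    const char *name;
    size_t obj_size;
    struct demo_cache *slab;
};

/* Stand-in for kmem_cache_create(): remembers the name and object size. */
static struct demo_cache *demo_cache_create(const char *name, size_t obj_size)
{
    struct demo_cache *c;

    assert(obj_size > 0);
    c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->name = name;
    c->obj_size = obj_size;
    return c;
}

/* Mirrors the visible part of proto_register(): the cache is optional. */
static int demo_proto_register(struct demo_proto *prot, int alloc_slab)
{
    if (alloc_slab) {
        prot->slab = demo_cache_create(prot->name, prot->obj_size);
        if (prot->slab == NULL)
            return -1;
    }
    return 0;
}

/* Releases the cache descriptor again. */
static void demo_proto_unregister(struct demo_proto *prot)
{
    free(prot->slab);
    prot->slab = NULL;
}
```

After demo_proto_register(&p, 1), p.slab->name and p.slab->obj_size are exactly the values taken from the descriptor, which is the same relationship the article describes between tcp_prot and the TCP slab cache.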
Here struct tcp_sock shows up again. Together with struct sock and struct socket seen earlier, this can look confusing, so let's see how the three relate.
First, the relevant structure definitions:
struct tcp_sock {
    /* inet_connection_sock has to be the first member of tcp_sock */
    struct inet_connection_sock inet_conn;
    u16 tcp_header_len; /* Bytes of tcp header to send */
    u16 xmit_size_goal_segs; /* Goal for segmenting output packets */
    ...
};
struct inet_connection_sock {
    /* inet_sock has to be the first member! */
    struct inet_sock icsk_inet;
    struct request_sock_queue icsk_accept_queue;
    struct inet_bind_bucket *icsk_bind_hash;
    ...
};
struct inet_sock {
    /* sk and pinet6 has to be the first two members of inet_sock */
    struct sock sk;
#if IS_ENABLED(CONFIG_IPV6)
    struct ipv6_pinfo *pinet6;
#endif
    ...
};
struct socket {
    socket_state state;
    ...
    struct sock *sk;
    const struct proto_ops *ops;
};
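Notice that tcp_sock, inet_connection_sock, and inet_sock each carry a comment saying that a particular member must come first in the structure. Why? Comparing the three gives the answer: this is containment. tcp_sock embeds inet_connection_sock, which embeds inet_sock, which in turn embeds sock. Because each embedded structure sits at offset 0, a pointer to the outer structure has the same address as every structure nested at its start. This resembles inheritance in C++: sock is the base class, and inet_sock, inet_connection_sock, and tcp_sock are successively derived classes.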
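The "must be the first member" rule can be checked mechanically. The sketch below uses reduced, hypothetical demo_* structures (only the nesting matches the kernel definitions; the extra fields are placeholders) and asserts at compile time that each embedded structure sits at offset 0.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical versions of the kernel structures. */
struct demo_sock {
    int sk_state;
};

struct demo_inet_sock {
    struct demo_sock sk;                    /* must be first */
    unsigned short inet_sport;
};

struct demo_inet_connection_sock {
    struct demo_inet_sock icsk_inet;        /* must be first */
    int icsk_retransmits;
};

struct demo_tcp_sock {
    struct demo_inet_connection_sock inet_conn; /* must be first */
    unsigned int snd_nxt;
};

/* Compile-time checks: every embedded "base" lives at offset 0. */
static_assert(offsetof(struct demo_inet_sock, sk) == 0,
              "sk must be the first member");
static_assert(offsetof(struct demo_inet_connection_sock, icsk_inet) == 0,
              "icsk_inet must be the first member");
static_assert(offsetof(struct demo_tcp_sock, inet_conn) == 0,
              "inet_conn must be the first member");

/* Returns 1 when the outermost object and the innermost base share an address. */
static int demo_same_address(struct demo_tcp_sock *tp)
{
    void *outer = tp;
    void *base = &tp->inet_conn.icsk_inet.sk;

    return outer == base;
}
```

If someone moved a field in front of one of the embedded structures, the corresponding static_assert would stop the build, which is the guarantee the kernel comments ask developers to preserve by hand.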
This also explains why the following functions can get away with a plain cast:
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
{
    return (struct tcp_sock *)sk;
}
static inline struct inet_connection_sock *inet_csk(const struct sock *sk)
{
    return (struct inet_connection_sock *)sk;
}
static inline struct inet_sock *inet_sk(const struct sock *sk)
{
    return (struct inet_sock *)sk;
}
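Doesn't casting from the smaller structure to the larger one risk an out-of-bounds access? No. Recall that the TCP slab cache was created with sizeof(struct tcp_sock) as its object size, so every sock allocated for TCP from that cache is really a tcp_sock, the largest structure in this chain, and the cast stays inside the allocated object. The guarantee covers only socks created through tcp_prot; tcp_sk() must not be applied to a sock that belongs to another protocol.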
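A small user-space sketch of why the cast is safe: the object is allocated with the size of the largest structure and handed around as a pointer to the base. All demo_* names are hypothetical, the structures are cut down to one field each, and calloc() stands in for the slab allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base and derived structures. */
struct demo_sock {
    int sk_protocol;
};

struct demo_tcp_sock {
    struct demo_sock sk;    /* base must be first for the cast to be valid */
    unsigned int snd_nxt;
};

/* Stand-in for allocating from the protocol's cache: obj_size decides the size. */
static struct demo_sock *demo_sk_alloc(size_t obj_size)
{
    assert(obj_size >= sizeof(struct demo_sock));
    return calloc(1, obj_size);
}

/* Same idea as tcp_sk(): a plain cast from base to derived. */
static struct demo_tcp_sock *demo_tcp_sk(struct demo_sock *sk)
{
    return (struct demo_tcp_sock *)sk;
}

/* Stores seq in the derived part through a base pointer and reads it back. */
static unsigned int demo_roundtrip(unsigned int seq)
{
    struct demo_sock *sk = demo_sk_alloc(sizeof(struct demo_tcp_sock));
    unsigned int out;

    if (sk == NULL)
        return 0;
    /* Valid only because the allocation was sized for demo_tcp_sock. */
    demo_tcp_sk(sk)->snd_nxt = seq;
    out = demo_tcp_sk(sk)->snd_nxt;
    free(sk);
    return out;
}
```

Had demo_sk_alloc() been called with sizeof(struct demo_sock) instead, the write to snd_nxt would land outside the allocation. That is the out-of-bounds access the paragraph above rules out for TCP.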
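As for struct socket versus struct sock: struct socket is the side that faces user space, the uniform structure presented to the application layer, that is, the general BSD socket. struct sock is the protocol-side object inside the kernel, the network layer representation of sockets. The two correspond one to one, and struct socket holds a pointer (sk) to its struct sock.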
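The one-to-one pairing can be sketched as two objects that point at each other. The demo_* types and demo_sock_graft() are hypothetical and reduced to the link fields. The forward pointer corresponds to the sk member shown above; the back pointer is modeled on the sk_socket field of the kernel's struct sock, which this article does not show, so treat it as an assumption of the sketch.

```c
#include <assert.h>

struct demo_socket;

/* Protocol-side object; sk_socket is the assumed back pointer. */
struct demo_sock {
    struct demo_socket *sk_socket;
};

/* User-facing object; sk points at the protocol-side object. */
struct demo_socket {
    int state;
    struct demo_sock *sk;
};

/* Hypothetical helper that pairs one demo_socket with one demo_sock. */
static void demo_sock_graft(struct demo_sock *sk, struct demo_socket *parent)
{
    assert(sk && parent);
    sk->sk_socket = parent;
    parent->sk = sk;
}
```

Once the two are linked, code holding the user-facing object reaches protocol state through sock->sk, and code on the protocol side can find its way back through sk->sk_socket.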