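AF_UNIX sockets are used for local communication through a socket file. The data never goes through packet parsing in the network stack or out to a network card; the kernel copies it directly into the receiving socket's buffer. When a client and a server talk through a socket file, can netstat show which client is connected to which server?
Consider the following test. The server (server.pl):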
#!/usr/bin/perl -w
use strict;
use IO::Socket::UNIX qw( SOCK_STREAM SOMAXCONN );

my $SOCK_PATH = '/tmp/test.sock';
unlink($SOCK_PATH) if -e $SOCK_PATH;    # remove a stale socket file

my $server = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM(),
    Local  => $SOCK_PATH,
    Listen => SOMAXCONN,
) or die("Can't create server socket: $!\n");

while (1) {
    my $connection = $server->accept or next;
    my $pid = fork();
    die("Can't fork: $!\n") unless defined $pid;
    if ($pid == 0) {
        # Child: echo every line back, prefixed with a counter
        print "** New connection received **\n";
        $connection->autoflush(1);
        my $count = 1;
        while (my $line = <$connection>) {
            chomp($line);
            $connection->print($count . ' -> ' . $line . "\n");
            print "Received and replied to $count '$line'\n";
            $count++;
        }
        close $connection;
        exit;
    }
    close $connection;    # parent: the child owns this connection now
}
The client (client.pl):
#!/usr/bin/perl -w
use strict;
use IO::Socket::UNIX qw( SOCK_STREAM );

my $SOCK_PATH = '/tmp/test.sock';
my $client = IO::Socket::UNIX->new(
    Type => SOCK_STREAM(),
    Peer => $SOCK_PATH,
) or die("Can't connect to server: $!\n");
$client->autoflush(1);

## Listen for replies in a child process
my $pid = fork();
die("Can't fork: $!\n") unless defined $pid;
if ($pid == 0) {
    while (my $line = <$client>) {
        chomp($line);
        print("Recv: '" . $line . "'\n");
    }
    exit;    # without this the child would fall through into the send loop
}

## Send something every 5 seconds
while (1) {
    for my $itm ('Alpha', 'Beta', 'Gamma', 'Delta') {
        print("Send: " . $itm . "\n");
        print($client $itm . "\n") or warn("Can't send: $!\n");    # "\n" terminates each message
    }
    sleep 5;
}
Find the process IDs:
[root@localhost ~]# ps auxf | grep server.pl
root 19083 0.0 0.1 135592 4324 pts/1 S+ 19:18 0:00 | \_ perl server.pl
root 19092 0.0 0.1 135592 2700 pts/1 S+ 19:18 0:00 | \_ perl server.pl
root 19157 0.0 0.0 112708 984 pts/3 S+ 19:18 0:00 \_ grep --color=auto server.pl
[root@localhost ~]# ps auxf | grep client.pl
root 19090 0.0 0.1 135592 4308 pts/2 S+ 19:18 0:00 | \_ perl client.pl
root 19091 0.0 0.0 135592 2344 pts/2 S+ 19:18 0:00 | \_ perl client.pl
root 19211 0.0 0.0 112708 984 pts/3 S+ 19:19 0:00 | \_ grep --color=auto client.pl
Check which files the server has open:
[root@localhost ~]# lsof -p 19092 | grep sock
perl 19092 root 3u unix 0xffff93dbd27ba400 0t0 70144 /tmp/test.sock
perl 19092 root 4u unix 0xffff93dbd1ee5800 0t0 70145 /tmp/test.sock
[root@localhost ~]# netstat -anlp | grep 19092
unix 3 [ ] STREAM CONNECTED 70145 19092/perl /tmp/test.sock
Check which files the client has open:
[root@localhost ~]# lsof -p 19090 |grep sock
perl 19090 root 3u unix 0xffff93dbd1ee3c00 0t0 70162 socket
[root@localhost ~]# lsof -p 19091 |grep sock
perl 19091 root 3u unix 0xffff93dbd1ee3c00 0t0 70162 socket
[root@localhost ~]# netstat -anlp | grep 19090
unix 3 [ ] STREAM CONNECTED 70162 19090/perl
[root@localhost ~]# netstat -anlp | grep 19091
[root@localhost ~]#
[root@localhost ~]# ll /proc/19090/fd/
total 0
lrwx------. 1 root root 64 Apr 27 19:21 0 -> /dev/pts/2
lrwx------. 1 root root 64 Apr 27 19:21 1 -> /dev/pts/2
lrwx------. 1 root root 64 Apr 27 19:18 2 -> /dev/pts/2
lrwx------. 1 root root 64 Apr 27 19:21 3 -> socket:[70162]
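The /proc/<pid>/fd listing can also be read from a script. The following is a minimal Python sketch, not part of the original test; the function names socket_inode_from_link and socket_inodes are made up for illustration. It assumes a Linux /proc where socket descriptors appear as links of the form socket:[inode].

```python
import os
import re

# Link targets of socket descriptors look like "socket:[70162]".
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode_from_link(target):
    """Return the socket inode encoded in a /proc/<pid>/fd link target, or None."""
    match = SOCKET_LINK.match(target)
    return int(match.group(1)) if match else None


def socket_inodes(pid):
    """Map fd number -> socket inode for every socket the process has open."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were scanning
        inode = socket_inode_from_link(target)
        if inode is not None:
            result[int(name)] = inode
    return result
```

For the client above, socket_inodes(19090) should report fd 3 with inode 70162, which is the same information as the ll output and still says nothing about the other end.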
All of these views of the client tell us only that it holds a socket connection, not which socket file it is connected to. lsof reports DEVICE=0xffff93dbd1ee3c00 and NODE=70162 for it; can we find the socket file from those?
[root@localhost ~]# stat /tmp/test.sock
File: ‘/tmp/test.sock’
Size: 0 Blocks: 0 IO Block: 4096 socket
Device: fd00h/64768d Inode: 67785956 Links: 1
Access: (0755/srwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:user_tmp_t:s0
Access: 2020-04-27 19:18:33.922022962 +0800
Modify: 2020-04-27 19:18:28.245022727 +0800
Change: 2020-04-27 19:18:28.245022727 +0800
Birth: -
[root@localhost ~]#
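The actual inode of /tmp/test.sock is 67785956, so the two are clearly different things: 70162 is the inode of the kernel socket object, not of the file on disk.
Moving on: 0xffff93dbd1ee3c00 looks like a kernel memory address, so let's search for it under /proc.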
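To see the two kinds of inode side by side, here is a small Python sketch (my own illustration, with a hypothetical function name compare_inodes and a throwaway socket path). It binds a Unix socket in a temporary directory and compares stat on the path with fstat on the descriptor. On Linux the first is the file's inode on the filesystem, while the second is the socket's own inode, the number that lsof and netstat print.

```python
import os
import socket
import stat
import tempfile


def compare_inodes():
    """Bind a Unix socket and return (filesystem inode, socket inode)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical path, removed afterwards
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            server.bind(path)
            path_stat = os.stat(path)              # the socket file on disk
            sock_stat = os.fstat(server.fileno())  # the kernel socket object
            # Both report the "socket" file type, yet they are separate inodes.
            assert stat.S_ISSOCK(path_stat.st_mode)
            assert stat.S_ISSOCK(sock_stat.st_mode)
            return path_stat.st_ino, sock_stat.st_ino
        finally:
            server.close()


if __name__ == "__main__":
    fs_inode, sock_inode = compare_inodes()
    print("path inode:", fs_inode, "socket inode:", sock_inode)
```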
[root@localhost ~]# grep -Irn "ffff93dbd1ee3c0" /proc/net
/proc/net/unix:80:ffff93dbd1ee3c00: 00000003 00000000 00000000 0001 03 70162
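This does not show the file either (for a TCP connection, by contrast, /proc/net/tcp does list the ip:port of both ends).
So how do we find the socket file on the other end?
I eventually found the answer on Stack Exchange: https://unix.stackexchange.com/questions/16300/whos-got-the-other-end-of-this-unix-socketpair/190606#190606
On kernel 3.3 or newer, the ss command can show it: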
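For reference, the columns of a /proc/net/unix line are Num, RefCount, Protocol, Flags, Type, St, Inode and an optional Path. The Python sketch below (the function name parse_unix_line is made up for illustration) decodes one such line, assuming the usual Linux encoding where Type 0001 is SOCK_STREAM and St 03 is the connected state.

```python
SOCK_TYPES = {1: "STREAM", 2: "DGRAM", 5: "SEQPACKET"}
SOCK_STATES = {1: "UNCONNECTED", 2: "CONNECTING", 3: "CONNECTED", 4: "DISCONNECTING"}


def parse_unix_line(line):
    """Parse one data line of /proc/net/unix into a dict.

    Column order: Num RefCount Protocol Flags Type St Inode [Path]
    """
    fields = line.split(None, 7)
    if len(fields) < 7:
        raise ValueError(f"unexpected /proc/net/unix line: {line!r}")
    sock_type = int(fields[4], 16)  # hex field
    state = int(fields[5], 16)      # hex field
    return {
        "address": fields[0].rstrip(":"),
        "refcount": int(fields[1], 16),
        "type": SOCK_TYPES.get(sock_type, sock_type),
        "state": SOCK_STATES.get(state, state),
        "inode": int(fields[6]),    # decimal field
        "path": fields[7].strip() if len(fields) > 7 else None,
    }
```

Applied to the client's line (without the grep prefix), this gives a connected stream socket with inode 70162 and no path, which confirms that the file on the other end is simply not recorded here.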
[root@localhost ~]# ss | grep 70162
u_str ESTAB 0 0 * 70162 * 70145
u_str ESTAB 0 0 /tmp/test.sock 70145 * 70162
[root@localhost ~]#
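ss labels 70162 as Port, while netstat labels it I-Node. Is there a difference? No: for Unix sockets, ss prints the socket inode in its Port column, so both tools show the same number.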
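Pairing the two ss rows by hand gets tedious when there are many sockets. The following Python sketch (a hypothetical helper named find_peer, not something ss provides) does the lookup on captured ss output. It assumes the whitespace-separated layout shown above, that is Netid, State, Recv-Q, Send-Q, local address, local inode, peer address, peer inode, and it will not handle socket paths that contain spaces.

```python
def find_peer(ss_output, inode):
    """Return (peer_inode, peer_path) for the Unix socket with the given inode.

    peer_path is None when the peer is unbound (ss prints "*").
    Returns None when the inode does not appear in the output.
    """
    rows = {}
    for line in ss_output.splitlines():
        fields = line.split()
        # Expect: Netid State Recv-Q Send-Q LocalAddr LocalInode PeerAddr PeerInode
        if len(fields) < 8 or not fields[0].startswith("u_"):
            continue
        local_addr, local_inode, _peer_addr, peer_inode = fields[4:8]
        rows[local_inode] = (local_addr, peer_inode)

    key = str(inode)
    if key not in rows:
        return None
    peer_inode = rows[key][1]
    # The peer's own row carries the path it is bound to, if any.
    peer_addr = rows.get(peer_inode, ("*", None))[0]
    return int(peer_inode), (None if peer_addr == "*" else peer_addr)
```

Fed the two lines from the session above, find_peer(output, 70162) yields inode 70145 bound to /tmp/test.sock, which is the server side of the connection.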
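On systems where ss is unavailable, the answer can be recovered from the kernel address found earlier, 0xffff93dbd1ee3c00.
Using the kernel debugging environment set up in an earlier post (see https://www.jianshu.com/p/caff00d28b5e), we attach gdb to /proc/kcore this time to inspect the running kernel: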
gdb /usr/lib/debug/lib/modules/3.10.0-957.el7.x86_64/vmlinux /proc/kcore
Core was generated by `BOOT_IMAGE=/vmlinuz-3.10.0-957.el7.x86_64 root=/dev/mapper/centos-root ro crashk'.
#0 0x0000000000000000 in irq_stack_union ()
(gdb) print ((struct unix_sock*) 0xffff93dbd1ee3c00)->peer
$1 = (struct sock *) 0xffff93dbd1ee5800
(gdb) quit
[root@localhost ~]# lsof | grep 0xffff93dbd1ee5800
perl 19092 root 4u unix 0xffff93dbd1ee5800 0t0 70145 /tmp/test.sock
[root@localhost ~]#
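The final lsof | grep step can be scripted in the same way. This Python sketch (the helper name find_socket_owner is invented for illustration) scans captured lsof output for a given kernel address in the DEVICE column. It assumes the nine-column layout shown above (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME); lsof versions that print an extra TID column would need the indices adjusted.

```python
def find_socket_owner(lsof_output, kernel_address):
    """Return a list of (command, pid, fd, name) for Unix sockets at kernel_address."""
    wanted = kernel_address.lower()
    owners = []
    for line in lsof_output.splitlines():
        fields = line.split(None, 8)
        # Expect: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(fields) < 9 or fields[4] != "unix":
            continue
        if fields[5].lower() == wanted:
            owners.append((fields[0], int(fields[1]), fields[3], fields[8].strip()))
    return owners
```

Given the peer address printed by gdb, 0xffff93dbd1ee5800, it returns the perl process 19092 with fd 4u and the name /tmp/test.sock, matching the manual result.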
The Linux kernel's definition of unix_sock:
/* The AF_UNIX socket */
struct unix_sock {
    /* WARNING: sk has to be the first member */
    struct sock sk;
    struct unix_address *addr;
    struct path path;
    struct mutex readlock;
    struct sock *peer;
    struct list_head link;
    atomic_long_t inflight;
    spinlock_t lock;
    unsigned char recursion_level;
    unsigned long gc_flags;
#define UNIX_GC_CANDIDATE 0
#define UNIX_GC_MAYBE_CYCLE 1
    struct socket_wq peer_wq;
    wait_queue_t peer_wake;
};
The peer field matches the Peer argument in our client code:
my $client = IO::Socket::UNIX->new(
    Type => SOCK_STREAM(),
    Peer => $SOCK_PATH
) or die("Can't connect to server: $!\n");
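This method prints the memory address of the peer's sock; looking that address up in lsof yields the path it is bound to. With these techniques we can determine which socket file, and which server process, a client is connected to, which makes troubleshooting easier.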