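RoCE OFED Environment Setup and Testing
I. Downloading the installation package:
Mellanox driver download location:
1. Go to: https://cn.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
2. On the page that opens, find your platform, for example under Linux SW/Drivers; CentOS 8 is used as the example here.
3. Near the bottom of the page, find the version that matches your system and download it.
This guide uses the tar.gz (.tgz) archive.
II. Installation
1. Upload the downloaded driver package to the server; the upload itself is not covered here.
2. Extract the uploaded package and wait for extraction to finish: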
[root@localhost ~]# tar xzvf MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64.tgz
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/RPM-GPG-KEY-Mellanox
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/uninstall.sh
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/.mlnx
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/.arch
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/distro
………………………
3. Install:
[root@localhost ~]# cd MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/
[root@localhost MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64]# ./mlnxofedinstall
Logs dir: /tmp/MLNX_OFED_LINUX.10300.logs
General log file: /tmp/MLNX_OFED_LINUX.10300.logs/general.log
Verifying KMP rpms compatibility with target kernel...
……………………….
Complete!
Wait for the installer to finish.
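The extract-and-install steps above can be collected into a small script. This is only a minimal sketch: the helper names `ofed_dir_from_tarball` and `install_ofed` are hypothetical, and it assumes the archive unpacks into a directory named after the file (as the transcript above shows). Only the `tar` and `./mlnxofedinstall` invocations come from the steps above.

```shell
#!/bin/sh
# Derive the directory name the archive unpacks to, assuming it equals
# the file name minus its .tgz / .tar.gz suffix.
ofed_dir_from_tarball() {
    name=$(basename "$1")
    name=${name%.tgz}
    name=${name%.tar.gz}
    printf '%s\n' "$name"
}

# Unpack the archive into the current directory and run the bundled
# installer (requires root). Hypothetical wrapper, not part of MLNX_OFED.
install_ofed() {
    tarball=$1
    dir=$(ofed_dir_from_tarball "$tarball")
    tar xzf "$tarball" || return 1
    (cd "$dir" && ./mlnxofedinstall)
}

# Usage (not run automatically):
#   install_ofed MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64.tgz
```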
III. Common checks and configuration
1. InfiniBand status (ibstat):
[root@centos222 ~]# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.42.5000
Hardware version: 1
Node GUID: 0xf452140300880760
System image GUID: 0xf452140300880760
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x00010000
Port GUID: 0xf65214fffe880760
Link layer: Ethernet
2. InfiniBand status (ibstatus):
[root@centos222 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:f652:14ff:fe88:0760
base lid: 0x0
sm lid: 0x0
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (1X QDR)
link_layer: Ethernet
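For RoCE, the fields that matter in this output are `state`, `phys state` and `link_layer`. The check can be scripted roughly as follows. This is a sketch with a hypothetical helper name, `check_roce_port`. It assumes the output of a single port is piped in; with several ports the flags would be merged.

```shell
#!/bin/sh
# Read `ibstatus` output for one port on stdin. Print "ok" when the port
# is ACTIVE, the physical link is up and the link layer is Ethernet;
# print "not ready" otherwise.
check_roce_port() {
    awk '
        /^[ \t]*state:/      { if ($0 ~ /ACTIVE/)   active = 1 }
        /^[ \t]*phys state:/ { if ($0 ~ /LinkUp/)   up = 1 }
        /^[ \t]*link_layer:/ { if ($0 ~ /Ethernet/) eth = 1 }
        END { if (active && up && eth) print "ok"; else print "not ready" }
    '
}

# Usage:
#   ibstatus mlx4_0:1 | check_roce_port
```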
3. Mapping between RDMA devices and network interfaces:
[root@centos222 ~]# ibdev2netdev
mlx4_0 port 1 ==> enp2s0 (Up)
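When scripting later steps, it can help to look up the interface name automatically instead of reading it off by eye. The sketch below uses a hypothetical helper, `netdev_for_ibdev`, and assumes the line layout shown above (`<device> port <n> ==> <interface> (<state>)`).

```shell
#!/bin/sh
# Print the network interface behind a given RDMA device and port,
# reading `ibdev2netdev` output on stdin. The port defaults to 1.
netdev_for_ibdev() {
    awk -v dev="$1" -v port="${2:-1}" \
        '$1 == dev && $2 == "port" && $3 == port { print $5 }'
}

# Usage:
#   ibdev2netdev | netdev_for_ibdev mlx4_0 1
```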
4. Link negotiation information for the NIC:
[root@centos222 ~]# ethtool enp2s0
Settings for enp2s0:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseKX/Full
10000baseKR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 1000baseKX/Full
10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000014 (20)
link ifdown
Link detected: yes
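Of everything ethtool prints, the negotiated speed and the link state are usually what you want before a RoCE test. A sketch that pulls out just those two fields is shown below; `link_summary` is a hypothetical helper, and it relies on the `Speed:` and `Link detected:` labels appearing as in the output above.

```shell
#!/bin/sh
# Summarise speed and link state from `ethtool <iface>` output on stdin.
link_summary() {
    awk -F': ' '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END { printf "speed=%s link=%s\n", speed, link }
    '
}

# Usage:
#   ethtool enp2s0 | link_summary
```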
5. GIDs supported by the NIC and related information:
[root@centos222 ~]# show_gids
DEV     PORT  INDEX  GID                                      IPv4          VER  DEV
---     ----  -----  ---                                      ------------  ---  ---
mlx4_0  1     0      fe80:0000:0000:0000:f652:14ff:fe88:0760                v1   enp2s0
n_gids_found=1
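The INDEX column is the value later passed to `ib_send_bw -x`, and the VER column shows whether an entry is RoCE v1 or v2. Picking the index can be scripted as sketched below. The helper name `gid_index_for` is hypothetical. It reads the version from the second-to-last column because the IPv4 column may be empty, as in the row above.

```shell
#!/bin/sh
# Print the GID index of the first entry that matches a device and a
# RoCE version ("v1" or "v2"), reading `show_gids` output on stdin.
# Prints nothing when no entry matches.
gid_index_for() {
    awk -v dev="$1" -v ver="$2" \
        '$1 == dev && $(NF-1) == ver { print $3; exit }'
}

# Usage:
#   show_gids | gid_index_for mlx4_0 v1
```

On the host shown above only a v1 entry exists, so a lookup for `v2` would return nothing.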
6. NIC operating mode:
[root@centos222 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:f652:14ff:fe88:0760
base lid: 0x0
sm lid: 0x0
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 10 Gb/sec (1X QDR)
link_layer: Ethernet
Check the NIC's current link mode:
[root@centos7221 ~]# connectx_port_config -s
--------------------------------
Port configuration for PCI device: 0000:86:00.0 is:
eth
--------------------------------
[root@centos7221 ~]# connectx_port_config
ConnectX PCI devices :
|----------------------------|
| 1 0000:86:00.0 |
|----------------------------|
Before port change:
eth
|----------------------------|
| Possible port modes: |
| 1: Infiniband |
| 2: Ethernet |
| 3: AutoSense |
|----------------------------|
Select mode for port 1 (1,2,3):
Select the mode you need.
Note:
ConnectX-3 EN cards support only Ethernet mode; the Infiniband and AutoSense options apply to VPI cards.
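7. Test commands explained:
$ ib_send_bw -d mlx5_4 -x 3    # start the receiving side on one server; -x 3 selects GID index 3, which is RoCE v2 in this example. Pay attention to the index value: it must match an entry listed by show_gids.
$ sudo ib_send_bw -d mlx5_4 192.168.1.1 --report_gbits -F -x 3    # send from the other server; 192.168.1.1 is the receiver's address
Note:
Remember to assign an IP address to the NIC and make sure the two hosts can ping each other: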
[root@centos222 ~]# nmcli c modify enp2s0 ipv4.addresses 10.10.10.222/24 autoconnect yes ipv4.method manual
[root@centos222 ~]# nmcli c down enp2s0
Connection 'enp2s0' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/439)
[root@centos222 ~]# nmcli c reload enp2s0
[root@centos222 ~]# nmcli c up enp2s0
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/440)
[root@centos222 ~]# ip a
1: lo:
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1:
link/ether 6c:92:bf:70:97:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.101.222/24 brd 192.168.101.255 scope global noprefixroute eno1
valid_lft forever preferred_lft forever
inet6 fe80::aad6:ae47:a954:b791/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp4s0f1:
link/ether 6c:92:bf:70:97:cd brd ff:ff:ff:ff:ff:ff
4: enp132s0f0:
link/ether 68:91:d0:61:57:2e brd ff:ff:ff:ff:ff:ff
5: enp132s0f1:
link/ether 68:91:d0:61:57:2f brd ff:ff:ff:ff:ff:ff
6: enp2s0:
link/ether f4:52:14:88:07:60 brd ff:ff:ff:ff:ff:ff
inet 10.10.10.222/24 brd 10.10.10.255 scope global noprefixroute enp2s0
valid_lft forever preferred_lft forever
inet6 fe80::da78:33ac:bf32:8856/64 scope link noprefixroute
valid_lft forever preferred_lft forever
7: virbr0:
link/ether 52:54:00:53:81:7e brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
8: virbr0-nic:
link/ether 52:54:00:53:81:7e brd ff:ff:ff:ff:ff:ff
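With the IP configured, the two `ib_send_bw` commands from this step can be adapted to your own device, GID index and address. The sketch below only prints the command lines and runs nothing. `bw_cmd` is a hypothetical helper, and the flags are the ones used in the commands above.

```shell
#!/bin/sh
# Print the ib_send_bw command line for one side of the test.
#   $1 = role ("server" or "client")
#   $2 = RDMA device, $3 = GID index, $4 = server IP (client only)
bw_cmd() {
    case $1 in
        server) printf 'ib_send_bw -d %s -x %s\n' "$2" "$3" ;;
        client) printf 'ib_send_bw -d %s %s --report_gbits -F -x %s\n' "$2" "$4" "$3" ;;
        *)      echo "usage: bw_cmd server|client DEV GID_INDEX [SERVER_IP]" >&2
                return 1 ;;
    esac
}

# Usage, assuming the host configured above (mlx4_0, GID index 0,
# 10.10.10.222) acts as the receiver:
#   bw_cmd server mlx4_0 0
#   bw_cmd client mlx4_0 0 10.10.10.222
```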
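IV. Q&A
Many problems can come up during installation. The most common is missing packages: install the missing packages first, then run the driver installer again.
The driver installation requires the following packages:
tcl tcsh gcc-gfortran tk python36 perl
On CentOS, install them directly over the network: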
yum install tcl tcsh gcc-gfortran tk python36 perl
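To find out which of these packages are actually missing before running the installer, a check along the following lines may help. It is a sketch: `missing_packages` is a hypothetical helper, and `PKG_QUERY` is a hypothetical override hook so the logic can be tried on a machine without rpm. By default it runs `rpm -q <name>`.

```shell
#!/bin/sh
# Print every package from the argument list that is not installed.
# PKG_QUERY (hypothetical hook) may name another command that succeeds
# when a package is present; it is left unquoted on purpose so that
# the default "rpm -q" splits into command and option.
missing_packages() {
    for pkg in "$@"; do
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

# Usage:
#   missing=$(missing_packages tcl tcsh gcc-gfortran tk python36 perl)
#   [ -z "$missing" ] || yum install -y $missing
```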