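We had a high-spec R730xd server, fully populated with large-capacity hard drives, plus SSDs and two QLogic HBA cards. One day a friend and I discussed whether this server could be configured as an FC storage server and presented to VMware or other clients. I had never done this before and found it interesting, so we spent a little over a week researching it and eventually got it working. We are writing up the process, the problems we ran into, and the fixes, in the hope that it helps other ops engineers with the same need.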
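After a lot of Googling, we settled on this approach: deploy ZFS on Linux, define the SSD as the ZFS cache disk, set the HBA cards to target mode, and then use targetcli to export a file-backed block device on ZFS to VMware or other clients.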
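Finding a Linux distribution with good compatibility took us a long time. The 2.6 kernel in CentOS 6 cannot put the HBA card into target mode. CentOS 7.1 does support target mode, but the qla2xxx path does not show up in targetcli. We also tried building a 3.x or 4.x kernel on CentOS 6, and qla2xxx still did not appear in targetcli; we never solved that. Ubuntu 15.10 did not have this problem, but when installing ZFS we found that one package was not compatible with its kernel and could not be installed. In the end, everything went smoothly only on Ubuntu 14.04 LTS. See the error notes at the end of this article.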
Final environment: Dell R730xd + QLE2650 (8G) + Intel SSD 750 + Ubuntu 14.04 LTS
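First, install ZFS file system support
sudo su -
apt-add-repository --yes ppa:zfs-native/stable
apt-get update
apt-get install debootstrap spl-dkms zfs-dkms ubuntu-zfs
# Load the zfs module at boot: add the following line to /etc/rc.local
vi /etc/rc.local
/sbin/modprobe zfs
# Install the tools for the NVMe SSD (nvme-cli)
apt-get install git-core build-essential libncurses5-dev
git clone   # nvme-cli repository URL not given here
cd nvme-cli
make && make install
# Create the pool
zpool create zfspool /dev/sdb
# Add the SSD as a cache device (L2ARC)
zpool add zfspool cache /dev/nvme0n1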
Install targetcli
apt-get install targetcli
Set the HBA cards to target mode
# Add the following line to /etc/modprobe.d/qla2xxx.conf
vi /etc/modprobe.d/qla2xxx.conf
options qla2xxx qlini_mode="disabled"
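Since a typo in this file only shows up after a reboot, it can be worth checking the option line first. The following is only a minimal sketch: the function name parse_module_options is made up for illustration, and it assumes the simple "options <module> key=value" syntax used above rather than reproducing modprobe's real parser.

```python
import shlex


def parse_module_options(conf_text, module):
    """Collect key=value options for `module` from modprobe.d-style text.

    Hypothetical helper: handles only plain `options <module> k=v ...` lines.
    """
    options = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        tokens = shlex.split(line)  # also strips the quotes around "disabled"
        if len(tokens) < 3 or tokens[0] != "options" or tokens[1] != module:
            continue
        for token in tokens[2:]:
            key, sep, value = token.partition("=")
            if sep:
                options[key] = value
    return options


sample = 'options qla2xxx qlini_mode="disabled"\n'
print(parse_module_options(sample, "qla2xxx"))
```

In practice you would pass in the contents of /etc/modprobe.d/qla2xxx.conf and check that qlini_mode comes back as disabled.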
Reboot, then check that targetcli has loaded the qla2xxx module correctly. If qla2xxx appears in the output below, everything is fine.
targetcli
/> ls
o- / ..................................................................... [...]
  o- backstores .......................................................... [...]
  | o- fileio ............................................... [0 Storage Object]
  | o- iblock ............................................... [0 Storage Object]
  | o- pscsi ................................................ [0 Storage Object]
  | o- rd_dr ................................................ [0 Storage Object]
  | o- rd_mcp ............................................... [0 Storage Object]
  o- ib_srpt ........................................................ [0 Target]
  o- iscsi .......................................................... [0 Target]
  o- loopback ....................................................... [0 Target]
  o- qla2xxx ........................................................ [0 Target]
/>
Create the backstore
# The official documentation says devices, files, and flash disks can serve as a backstore; here I create a file-backed backstore
/> cd backstores/
/backstores> fileio/ create name=test file_or_dev=/zfspool/test size=1T
Using buffered mode.
Created fileio test.
/backstores> ls
o- backstores ............................................................ [...]
  o- fileio ................................................. [1 Storage Object]
  | o- test ............................... [1.0T, /zfspool/test, not in use]
  o- iblock ................................................. [0 Storage Object]
  o- pscsi .................................................. [0 Storage Object]
  o- rd_mcp ................................................. [0 Storage Object]
Create the targets. Two HBAs are installed, so two WWNs are listed.
/> qla2xxx/ info
Fabric module name: qla2xxx
ConfigFS path: /sys/kernel/config/target/qla2xxx
Allowed WWNs list (free type): 21:00:00:24:ff:0e:1e:30, 21:00:00:24:ff:0e:7c:f5
Fabric module specfile: /var/target/fabric/qla2xxx.spec
Fabric module features: acls
Corresponding kernel module: tcm_qla2xxx
/> qla2xxx/ create 21:00:00:24:ff:0e:1e:30
Created target 21:00:00:24:ff:0e:1e:30.
/> qla2xxx/ create 21:00:00:24:ff:0e:7c:f5
Created target 21:00:00:24:ff:0e:7c:f5.
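The WWNs that targetcli lists can be cross-checked against what the kernel reports for each FC port. Below is a rough sketch, assuming the HBAs show up under /sys/class/fc_host and that port_name holds a hex string such as 0x21000024ff0e1e30; the helper names format_wwn and local_port_wwns are made up for illustration.

```python
import glob


def format_wwn(raw):
    """Turn '0x21000024ff0e1e30' into '21:00:00:24:ff:0e:1e:30'."""
    digits = raw.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("not a 64-bit WWN: %r" % raw)
    return ":".join(digits[i:i + 2] for i in range(0, 16, 2))


def local_port_wwns(pattern="/sys/class/fc_host/host*/port_name"):
    """Read every FC port name matching `pattern` (assumed sysfs layout)."""
    wwns = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            wwns.append(format_wwn(handle.read()))
    return wwns


if __name__ == "__main__":
    for wwn in local_port_wwns():
        print(wwn)
```

On a machine without FC HBAs the script simply prints nothing.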
Export the LUN
/> cd qla2xxx/21:00:00:24:ff:0e:1e:30/
/qla2xxx/21:0...4:ff:0e:1e:30> luns/ create /backstores/fileio/test
Selected LUN 0.
Created LUN 0.
/qla2xxx/21:0...4:ff:0e:1e:30> cd ../
/qla2xxx> cd 21:00:00:24:ff:0e:
21:00:00:24:ff:0e:1e:30/  21:00:00:24:ff:0e:7c:f5/
.............path
/qla2xxx> cd 21:00:00:24:ff:0e:7c:f5/
/qla2xxx/21:0...4:ff:0e:7c:f5> luns/ create /backstores/fileio/test
Selected LUN 0.
Created LUN 0.
Set the access permissions. The client side also has two HBAs, so each target gets two ACLs.
/qla2xxx/21:0...4:ff:0e:7c:f5> acls/ create 20:00:74:e6:e2:65:c8:0b
Created Node ACL for 20:00:74:e6:e2:65:c8:0b
Created mapped LUN 0.
/qla2xxx/21:0...4:ff:0e:7c:f5> acls/ create 20:01:74:e6:e2:65:c8:0b
Created Node ACL for 20:01:74:e6:e2:65:c8:0b
Created mapped LUN 0.
/qla2xxx/21:0...4:ff:0e:7c:f5> cd ..
/qla2xxx> cd 21:00:00:24:ff:0e:1e:30/
/qla2xxx/21:0...4:ff:0e:1e:30> acls/ create 20:00:74:e6:e2:65:c8:0b
Created Node ACL for 20:00:74:e6:e2:65:c8:0b
Created mapped LUN 0.
/qla2xxx/21:0...4:ff:0e:1e:30> acls/ create 20:01:74:e6:e2:65:c8:0b
Created Node ACL for 20:01:74:e6:e2:65:c8:0b
Created mapped LUN 0.
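With more targets or clients, typing every ACL by hand gets tedious, since each target needs one ACL per client WWN. The sketch below only builds the command strings for every (target, client) pair using the WWNs from this setup. The one-line "/path command" form is an assumption on my part (the session above used cd followed by "acls/ create"), so try it on a test box before relying on it.

```python
from itertools import product

# WWNs taken from the session above
TARGET_WWNS = ["21:00:00:24:ff:0e:1e:30", "21:00:00:24:ff:0e:7c:f5"]
INITIATOR_WWNS = ["20:00:74:e6:e2:65:c8:0b", "20:01:74:e6:e2:65:c8:0b"]


def acl_commands(targets, initiators):
    """Return one command string per (target, initiator) pair.

    The single-line '/path command' form is assumed, not verified.
    """
    return [
        "/qla2xxx/%s/acls create %s" % (target, initiator)
        for target, initiator in product(targets, initiators)
    ]


for command in acl_commands(TARGET_WWNS, INITIATOR_WWNS):
    print(command)
```

Two targets and two client ports give four commands, which matches the four ACLs created by hand above.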
The final targetcli tree looks like this:
/> ls
o- / ..................................................................... [...]
  o- backstores .......................................................... [...]
  | o- fileio ............................................... [1 Storage Object]
  | | o- test .................................... [1.0T, /zfspool/test, in use]
  | o- iblock ............................................... [0 Storage Object]
  | o- pscsi ................................................ [0 Storage Object]
  | o- rd_mcp ............................................... [0 Storage Object]
  o- ib_srpt ....................................................... [0 Targets]
  o- iscsi ......................................................... [0 Targets]
  o- loopback ...................................................... [0 Targets]
  o- qla2xxx ....................................................... [2 Targets]
  | o- 21:00:00:24:ff:0e:1e:30 ....................................... [enabled]
  | | o- acls ......................................................... [2 ACLs]
  | | | o- 20:00:74:e6:e2:65:c8:0b .............................. [1 Mapped LUN]
  | | | | o- mapped_lun0 ........................................... [lun0 (rw)]
  | | | o- 20:01:74:e6:e2:65:c8:0b .............................. [1 Mapped LUN]
  | | |   o- mapped_lun0 ........................................... [lun0 (rw)]
  | | o- luns .......................................................... [1 LUN]
  | |   o- lun0 .................................. [fileio/test (/zfspool/test)]
  | o- 21:00:00:24:ff:0e:7c:f5 ....................................... [enabled]
  |   o- acls ......................................................... [2 ACLs]
  |   | o- 20:00:74:e6:e2:65:c8:0b .............................. [1 Mapped LUN]
  |   | | o- mapped_lun0 ........................................... [lun0 (rw)]
  |   | o- 20:01:74:e6:e2:65:c8:0b .............................. [1 Mapped LUN]
  |   |   o- mapped_lun0 ........................................... [lun0 (rw)]
  |   o- luns .......................................................... [1 LUN]
  |     o- lun0 .................................. [fileio/test (/zfspool/test)]
  o- tcm_fc ........................................................ [0 Targets]
  o- usb_gadget .................................................... [0 Targets]
  o- vhost ......................................................... [0 Targets]
Finally, don't forget to save. While saving we hit the error below. Looking it up showed that it is a targetcli bug, and a fix is available.
/> saveconfig
Save configuration? [Y/n]: y
Saving new startup configuration
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/configshell/shell.py", line 990, in run_interactive
    self._cli_loop()
  File "/usr/lib/python2.7/dist-packages/configshell/shell.py", line 820, in _cli_loop
    self.run_cmdline(cmdline)
  File "/usr/lib/python2.7/dist-packages/configshell/shell.py", line 934, in run_cmdline
    self._execute_command(path, command, pparams, kparams)
  File "/usr/lib/python2.7/dist-packages/configshell/shell.py", line 909, in _execute_command
    result = target.execute_command(command, pparams, kparams)
  File "/usr/lib/python2.7/dist-packages/targetcli/ui_node.py", line 104, in execute_command
    pparams, kparams)
  File "/usr/lib/python2.7/dist-packages/configshell/node.py", line 1416, in execute_command
    result = method(*pparams, **kparams)
  File "/usr/lib/python2.7/dist-packages/targetcli/ui_node.py", line 123, in ui_command_saveconfig
    CliConfig.save_running_config()
  File "/usr/lib/python2.7/dist-packages/targetcli/cli_config.py", line 65, in save_running_config
    config.load_live()
  File "/usr/lib/python2.7/dist-packages/rtslib/config.py", line 565, in load_live
    source=source, allow_new_attrs=True)
  File "/usr/lib/python2.7/dist-packages/rtslib/config.py", line 190, in _load_parse_tree
    token = self.validate_obj(token, cur)
  File "/usr/lib/python2.7/dist-packages/rtslib/config.py", line 377, in validate_obj
    valid_value = self.validate_val(valid_token['key'][1], id_type)
  File "/usr/lib/python2.7/dist-packages/rtslib/config.py", line 355, in validate_val
    % (val_type, value))
ConfigError: Unknown value type 'qla2xxx_wwn' when validating 21:00:00:24:ff:0e:7c:f5
For the fix, see https://github.com/bootc/rtslib/commit/727c345bd18137c424e4fba62bfab7bcfabfc024
vi /usr/share/pyshared/rtslib/config.py
# Add the lines marked with + at line 349
             elif val_type == 'naa':
                 if is_valid_wwn('naa', value):
                     valid_value = value
+            elif val_type == 'qla2xxx_wwn':
+                if is_valid_wwn('qla2xxx_wwn', value):
+                    valid_value = value
             elif val_type == 'backend':
                 if is_valid_backend(value, parent):
                     valid_value = value
vi /usr/share/pyshared/rtslib/utils.py
# Add the lines marked with + at line 562
             and re.match(
                 "[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$", wwn):
         return True
+    elif wwn_type == 'qla2xxx_wwn' \
+            and re.match(
+                "[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){7}$", wwn):
+        return True
     else:
         return False
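To see what the added check accepts, the regular expression from the patch can be tried on its own. This is just a standalone sketch: the function name looks_like_qla2xxx_wwn is made up here, and only the pattern itself comes from the patch.

```python
import re

# Pattern copied from the rtslib patch: eight hex byte pairs separated by colons
QLA2XXX_WWN_RE = re.compile(r"[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){7}$")


def looks_like_qla2xxx_wwn(wwn):
    """Return True if `wwn` has the colon-separated form the patch accepts."""
    return QLA2XXX_WWN_RE.match(wwn) is not None


for candidate in ("21:00:00:24:ff:0e:7c:f5", "21:00:00:24:ff:0e:7c", "21000024ff0e7cf5"):
    print(candidate, looks_like_qla2xxx_wwn(candidate))
```

Only the first candidate, the WWN from the traceback, matches; the other two are rejected because they are too short or lack the colons.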
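Save again, and the error is gone.
Below is a performance comparison between a VM running on this setup and VMs running on an FC storage array and on VMware vSAN.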
R730
NetApp FC storage
VMware VSAN
Thanks to the Intel SSD 750 cache, performance is several times higher than on the FC storage array.
Thanks to the following references:
http://linux-iscsi.org/wiki/Fibre_Channel
http://linux-iscsi.org/wiki/Targetcli
http://iori.tw/透過targetcli設定linux-io的fiber-channel-w-qlogic-cards/
https://github.com/bootc/rtslib/commit/727c345bd18137c424e4fba62bfab7bcfabfc024
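Common errors and fixes:
qla2xxx: Unknown parameter `qlini_mode' # I hit this error on CentOS; it turned out the kernel did not support the parameter. Installing CentOS 7.1 or building a 3.x or later kernel gets rid of this error, but qla2xxx still does not show up in targetcli, so in the end I switched to Ubuntu.
E: Package 'ubuntu-zfs' has no installation candidate # This error appeared when installing ZFS on Ubuntu. At the time, the package did not yet support Ubuntu 15.10, so an older Ubuntu release is needed.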