


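xfs_admin: change parameters of an XFS filesystem

xfs_copy: copy the contents of an XFS filesystem to one or more targets (in parallel)

xfs_db: debug or examine an XFS filesystem (for example, to inspect fragmentation)

xfs_check: check the integrity of an XFS filesystem

xfs_bmap: print the block mapping of a file

xfs_repair: attempt to repair a damaged XFS filesystem

xfs_fsr: defragment (reorganize) a filesystem

xfs_quota: manage disk quotas on XFS filesystems

xfs_metadump: copy the metadata of an XFS filesystem to a file

xfs_mdrestore: restore metadata from a file to an XFS filesystem

xfs_growfs: resize an XFS filesystem (grow only)

xfs_freeze: suspend (-f) and resume (-u) access to an XFS filesystem

xfs_logprint: print the log of an XFS filesystem

xfs_mkfile: create an XFS file (the filesystem itself is created with mkfs.xfs)

xfs_info: show detailed filesystem information

xfs_ncheck: generate pathnames from i-numbers for XFS

xfs_rtcp: copy a file to the real-time partition of an XFS filesystem

xfs_io: debug the XFS I/O path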

A minimal Debian installation does not include the XFS tools; install them separately: apt-get install xfsprogs

Usage: xfs_admin [-efjlpuV] [-c 0|1] [-L label] [-U uuid] device
xfs_admin -L media /dev/sdb1

xfs_admin -l dev_name
# xfs_admin -l /dev/mapper/vg0-lv_home 
label = "home"

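Growing a filesystem is done with 'xfs_growfs'. When the filesystem sits on an LVM logical volume and that volume has been extended, xfs_growfs is what tells the filesystem about the new size. It is as simple as growing ext4, and the filesystem does not need to be unmounted. For example, if /dev/mapper/vg0-lv_home is already mounted on /home, operate directly on the '/home' mount point: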
 xfs_growfs /home



[root@localhost ~]# xfs_info /dev/sda3
meta-data=/dev/sda3              isize=256    agcount=32, agsize=228641200 blks
         =                       sectsz=4096  attr=2, projid32bit=0
data     =                       bsize=4096   blocks=7316518400, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
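
As a rough cross-check of the xfs_info output, the size of the data section follows from blocks and bsize, and in this output agcount times agsize gives the same block count. A minimal sketch in Python, with the values copied by hand from the output; the function name is only illustrative:

```python
def data_section_bytes(blocks: int, bsize: int) -> int:
    """Size of the data section in bytes: block count times block size."""
    return blocks * bsize


# Values copied by hand from the xfs_info output for /dev/sda3.
agcount, agsize = 32, 228641200
blocks, bsize = 7316518400, 4096
log_blocks = 521728

# In this output the allocation groups cover the data section exactly.
print(agcount * agsize == blocks)

# Data section size in TiB and internal log size in MiB.
print(data_section_bytes(blocks, bsize) / 2**40)
print(log_blocks * bsize / 2**20)
```

On other filesystems the last allocation group may be shorter than agsize, so agcount times agsize can be slightly larger than blocks.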


mkfs.xfs -b size=2k -n size=4k /dev/sda3

mkfs.xfs -b size=2k /dev/sdxn

Putting the examples together, the command to run is: mkfs.xfs -d su=64k,sw=4 /dev/sda3
Specify the 'su' and 'sw' parameters:
# mkfs.xfs -d su=64k,sw=4 /dev/sda3

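Common mkfs.xfs options:

-b option: size= sets the filesystem (logical) block size. The default block size is 4096 bytes (4 KB).

-n option: size= sets the directory block size. The directory block size cannot be smaller than the filesystem block size.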

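-l option: every XFS filesystem has a journal (log), which needs dedicated disk space. This space is not shown by df and cannot be accessed through a file name. The log can be external or internal: an external log lives on a separate device, while an internal log occupies a dedicated region of the filesystem's own device. For an internal log, the size is set with -l size=. The default log size scales with the filesystem size, up to the maximum log size of 128 MB on a 1 TB filesystem.

For filesystems with a very high transaction activity, a large log size is recommended. You should avoid making your log too large because a large log can increase filesystem mount time after a crash.

-d option: sets the parameters of the data section, for example how data is laid out on a RAID device.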

For a RAID device, the default stripe unit is 0, indicating that the feature is disabled. You should configure the stripe unit and width sizes of RAID devices in order to avoid unexpected performance anomalies caused by the filesystem doing non-optimal I/O operations to the RAID unit. For example, if a block write is not aligned on a RAID stripe unit boundary and is not a full stripe unit, the RAID will be forced to do a read/modify/write cycle to write the data. This can have a significant performance impact. By setting the stripe unit size properly, XFS will avoid unaligned accesses.
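
The alignment condition in the paragraph above can be sketched in Python. This is a deliberately simplified model, not how any particular RAID controller decides: it only tests whether a write starts on a stripe unit boundary and covers whole stripe units. The function name and the sample offsets are assumptions made for illustration.

```python
def is_full_aligned_write(offset: int, length: int, stripe_unit: int) -> bool:
    """True if the write starts on a stripe-unit boundary and spans whole stripe units.

    Simplified model: writes that fail this test are the ones the text
    describes as forcing a read/modify/write cycle.
    """
    if stripe_unit <= 0:
        raise ValueError("stripe_unit must be positive")
    return length > 0 and offset % stripe_unit == 0 and length % stripe_unit == 0


STRIPE_UNIT = 64 * 1024  # 64k stripe unit, as used elsewhere in these notes

print(is_full_aligned_write(0, STRIPE_UNIT, STRIPE_UNIT))     # aligned, full unit
print(is_full_aligned_write(4096, STRIPE_UNIT, STRIPE_UNIT))  # starts mid-unit
print(is_full_aligned_write(0, 4096, STRIPE_UNIT))            # partial unit
```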



Start with your RAID stripe size.  Let’s use 64k which is a common default.  In this case 64K = 2^16 = 65536 bytes.

Get your sector size from fdisk.  In this case 512 bytes.

Calculate how many sectors fit in a RAID stripe. 65536 / 512 = 128 sectors per stripe.

Get start boundary of our mysql partition from fdisk: 27344896.

See if the Start boundary for our mysql partition falls on a stripe boundary by dividing the start sector of the partition by the sectors per stripe: 27344896 / 128 = 213632. This is a whole number, so we are good. If it had a remainder, then our partition would not start on a RAID stripe boundary.
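
The arithmetic in the steps above can be sketched in a few lines of Python. The function names are illustrative; the stripe size, sector size and start sector are the ones quoted in the text, and would come from the RAID configuration and fdisk on a real system.

```python
def sectors_per_stripe(stripe_bytes: int, sector_bytes: int) -> int:
    """Number of sectors that fit in one RAID stripe."""
    if stripe_bytes % sector_bytes != 0:
        raise ValueError("stripe size must be a multiple of the sector size")
    return stripe_bytes // sector_bytes


def starts_on_stripe_boundary(start_sector: int, stripe_bytes: int, sector_bytes: int) -> bool:
    """True if the partition's start sector divides evenly by the sectors per stripe."""
    return start_sector % sectors_per_stripe(stripe_bytes, sector_bytes) == 0


stripe = 64 * 1024   # 64k stripe = 65536 bytes
sector = 512         # sector size reported by fdisk
start = 27344896     # start sector of the partition, from fdisk

print(sectors_per_stripe(stripe, sector))                # 128
print(start // sectors_per_stripe(stripe, sector))       # 213632
print(starts_on_stripe_boundary(start, stripe, sector))  # True
```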

XFS requires a little massaging (or a lot).  For a standard server, it’s fairly simple.  We need to know two things:

RAID stripe size
Number of unique, utilized disks in the RAID.  This turns out to be the same as the size formulas I gave above:
 RAID 1+0: is a set of mirrored drives, so the number here is num drives / 2.
 RAID 5: is striped drives plus one full drive of parity, so the number here is num drives – 1.

In our case, it is RAID 1+0 64k stripe with 8 drives. Since those drives each have a mirror, there are really 4 sets of unique drives that are striped over the top. Using these numbers, we set the ‘su’ and ‘sw’ options in mkfs.xfs with those two values respectively.
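
As a sketch of how the two values map onto the mkfs.xfs options, the following Python derives 'sw' from the RAID level and drive count using the two rules listed above, and formats the -d argument. The function names and the string labels for the RAID levels are assumptions for illustration; only RAID 1+0 and RAID 5 are covered because those are the only cases the text describes.

```python
def data_disks(raid_level: str, num_drives: int) -> int:
    """Number of unique, utilized disks for the RAID levels described in the text."""
    if raid_level == "10":
        # RAID 1+0: every drive has a mirror, so half the drives hold unique data.
        if num_drives < 2 or num_drives % 2 != 0:
            raise ValueError("RAID 1+0 needs an even number of drives")
        return num_drives // 2
    if raid_level == "5":
        # RAID 5: one drive's worth of capacity goes to parity.
        if num_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return num_drives - 1
    raise ValueError("unsupported RAID level: " + raid_level)


def stripe_options(raid_level: str, num_drives: int, stripe_kib: int) -> str:
    """Format the value passed to 'mkfs.xfs -d' (su = stripe size, sw = data disks)."""
    return f"su={stripe_kib}k,sw={data_disks(raid_level, num_drives)}"


# RAID 1+0, 64k stripe, 8 drives, as in the example above.
print("mkfs.xfs -d " + stripe_options("10", 8, 64) + " /dev/sda3")
```

For the 8-drive RAID 1+0 case this produces su=64k,sw=4.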

# mkfs.xfs -b size=1k -l size=10m /dev/sdc1

# mkfs.xfs -l logdev=/dev/sdh,size=65536b /dev/sdc1

# mkfs.xfs -b size=2k -n size=4k /dev/sdc1



/home/pgsql xfs nobarrier,noatime,nodiratime

[root@freeoa ~]# df -hT
Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sdb1      xfs     9T   6T  2.4T  88% /oabackup

[root@freeoa ~]# df -hi
Filesystem  Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb1   9.3G    3.4M    9.3G    1% /backup


The XFS FAQ describes this situation as follows:

Q: What is the inode64 mount option for?

By default, with 32bit inodes, XFS places inodes only in the first 1TB of a disk. If you have a disk with 100TB, all inodes will be stuck in the first TB. This can lead to strange things like "disk full" when you still have plenty space free, but there's no more place in the first TB to create a new inode. Also, performance sucks.
To come around this, use the inode64 mount options for filesystems >1TB. Inodes will then be placed in the location where their data is, minimizing disk seeks.
Beware that some old programs might have problems reading 64bit inodes, especially over NFS. Your editor used inode64 for over a year with recent (openSUSE 11.1 and higher) distributions using NFS and Samba without any corruptions, so that might be a recent enough distro.


mount -o remount -o noatime,nodiratime,inode64,nobarrier /dev/sdb1 /oabackup


Show a file's block mapping: xfs_bmap -v file.tar.bz2
Show the fragmentation level of a filesystem: xfs_db -c frag -r /dev/sda1
Defragment: xfs_fsr /dev/sda1

xfs_repair -n /dev/cciss/cpd0p
xfs_repair -n (no-modify mode)



Phase 1 - find and verify superblock...
Phase 2 - zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- clearing existing "lost+found" inode
- deleting existing "lost+found" entry
- check for inodes claiming duplicate blocks...
- agno = 0
imap claims in-use inode 242000 is free, correcting imap
- agno = 1
- agno = 2
Phase 5 - rebuild AG headers and trees...
- reset superblock counters...
Phase 6 - check inode connectivity...
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
- traversal finished ...
- traversing all unattached subtrees ...
- traversals finished ...
- moving disconnected inodes to lost+found ...
disconnected inode 242000, moving to lost+found    
Phase 7 - verify and correct link counts...

In this example, inode 242000 was an inode that was moved to lost+found during a previous xfs_repair run. This run of xfs_repair found that the filesystem is consistent. If the lost+found directory had been empty, in phase 4 only the messages about clearing and deleting the lost+found directory would have appeared. The imap claims and disconnected inode messages appear (one pair of messages per inode) if there are inodes in the lost+found directory.



# xfs_admin -u /dev/sdb
UUID = 88b6b5ee-1125-4a52-ae8a-48bf3ced1c18

# xfs_admin -U 893e121c-582e-4851-bad7-cf46f01167b3 /dev/sde
Clearing log and setting UUID
writing all SBs
new UUID = 893e121c-582e-4851-bad7-cf46f01167b3

# xfs_admin -U nil /dev/sdb
Clearing log and setting UUID
writing all SBs
new UUID = 00000000-0000-0000-0000-000000000000

# xfs_admin -U generate /dev/sdb
writing all SBs
new UUID = c1b9d5a2-6789-11ab-9101-0020afc76f16


tune2fs /dev/sdc1 -U f0acce91-a416-1234-abcd-43f3ed3768f9


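When mounting the disk, mount reports "mount: Structure needs cleaning".
Repair the affected partition with "xfs_repair -L /dev/sdb1". Note that -L zeroes the log, so any metadata changes still held in the log are discarded.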
