DRBD Status Monitoring

Environment

CentOS Linux release 7.4.1708 (Core)

DRBDADM_BUILDTAG=GIT-hash:\ fed9a1df82015e52c14c912fa4b93336e2ab4fcc\ build\ by\ root@drbd-node3\,\ 2018-07-05\ 17:40:03
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION_CODE=0x09000e
DRBD_KERNEL_VERSION=9.0.14
DRBDADM_VERSION_CODE=0x090301
DRBDADM_VERSION=9.3.1
 

Procedure

drbd-overview

drbd-overview gives a quick summary of the state of a DRBD cluster:

[root@drbd-node3 ~]# drbd-overview 
NOTE: drbd-overview will be deprecated soon.
Please consider using drbdtop.

 0:scsivol/0  Connected(2*) Primar/Second UpToDa/UpToDa /mnt/nfs xfs 500G 11G 490G 3% 

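The output shows the following:

scsivol is the name of the DRBD resource.

Primar/Second means that the local host, drbd-node3, is the primary node. Running drbd-overview on the secondary node shows Second/Primar instead.

UpToDa/UpToDa means that the data on both nodes is up to date, i.e. the two nodes are in sync.

The remaining fields describe the file system: the mount point is /mnt/nfs, the file system type is xfs, the device size is 500 GB, 11 GB is used, 490 GB is free, and utilization is 3%.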

Viewing information through /proc/drbd

[root@drbd-node3 ~]# cat /proc/drbd 
version: 9.0.14-1 (api:2/proto:86-113)
GIT-hash: 62f906cf44ef02a30ce0c148fec223b40c51c533 build by root@drbd-node3, 2018-07-05 17:33:27
Transports (api:16): tcp (9.0.14-1)

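/proc/drbd mainly shows the DRBD version and the network transport in use, TCP here; in DRBD 9 it no longer carries per-resource status. Note that besides TCP, DRBD can also replicate over RDMA.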
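If a script needs the version or the loaded transports, the three lines above are regular enough to parse. The following is a minimal Python sketch, assuming /proc/drbd has exactly the layout shown in the sample; the function name parse_proc_drbd is made up for illustration.

```python
import re

def parse_proc_drbd(text):
    """Extract version, API/protocol range and transports from /proc/drbd text."""
    info = {}
    m = re.search(r"^version:\s*(\S+)\s*\(api:(\d+)/proto:([\d-]+)\)", text, re.M)
    if m:
        info["version"] = m.group(1)
        info["api"] = int(m.group(2))
        info["proto"] = m.group(3)
    m = re.search(r"^Transports \(api:\d+\):\s*(.*)$", text, re.M)
    if m:
        # Assumption: each transport is printed as "name (version)".
        info["transports"] = re.findall(r"(\w+) \(", m.group(1))
    return info

# Sample copied from the output above; on a real node read /proc/drbd instead.
sample = """version: 9.0.14-1 (api:2/proto:86-113)
GIT-hash: 62f906cf44ef02a30ce0c148fec223b40c51c533 build by root@drbd-node3, 2018-07-05 17:33:27
Transports (api:16): tcp (9.0.14-1)
"""
info = parse_proc_drbd(sample)
print(info)
```

On a live system the text would come from open("/proc/drbd").read() rather than from the embedded sample.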

drbdadm

[root@drbd-node3 ~]# drbdadm status scsivol
scsivol role:Primary
  disk:UpToDate
  drbd-node1 role:Secondary
    peer-disk:UpToDate

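scsivol role: shows the role of the local node, Primary here; on the secondary node it reads Secondary.

disk: shows the state of the local (primary) node's data.

drbd-node1 role: drbd-node1 is the name of the peer node, and its role is Secondary.

peer-disk: shows the state of the peer node's data.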
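The indentation in the drbdadm status output carries the structure: the resource line is unindented, and lines below a peer name belong to that peer. The following is a rough Python sketch of a parser for that layout, assuming a single resource with a single volume as in the sample; parse_status and the dictionary layout are illustrative choices, not part of DRBD.

```python
def parse_status(text):
    """Parse `drbdadm status <resource>` output for one resource into a dict."""
    resource = None
    peer = None
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        tokens = raw.split()
        fields = dict(t.split(":", 1) for t in tokens if ":" in t)
        names = [t for t in tokens if ":" not in t]
        if indent == 0:
            # "scsivol role:Primary" starts the resource.
            resource = {"name": names[0], "peers": {}, **fields}
            peer = None
        elif names:
            # "  drbd-node1 role:Secondary" starts a peer section.
            peer = resource["peers"].setdefault(names[0], {})
            peer.update(fields)
        elif peer is not None:
            peer.update(fields)      # e.g. "    peer-disk:UpToDate"
        else:
            resource.update(fields)  # e.g. "  disk:UpToDate"
    return resource

sample = """scsivol role:Primary
  disk:UpToDate
  drbd-node1 role:Secondary
    peer-disk:UpToDate
"""
status = parse_status(sample)
print(status["role"], status["disk"], status["peers"]["drbd-node1"])
```

A monitoring check could then alert whenever disk or any peer-disk is not UpToDate.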

Besides drbdadm, you can run drbdsetup with the --verbose and --statistics options to display more information:

[root@drbd-node3 ~]# drbdsetup status scsivol --verbose --statistics
scsivol node-id:1 role:Primary suspended:no
    write-ordering:flush
  volume:0 minor:0 disk:UpToDate quorum:yes
      size:524271964 read:23027 written:10488103 al-writes:94 bm-writes:0 upper-pending:0 lower-pending:0 al-suspended:no blocked:no
  drbd-node1 node-id:0 connection:Connected role:Secondary congested:no
    volume:0 replication:Established peer-disk:UpToDate resync-suspended:no
        received:0 sent:10506515 out-of-sync:0 pending:0 unacked:0
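The counters in this output are plain key:value tokens, so they are easy to pick apart in a script. The following is a minimal Python sketch, assuming the token format shown above; the helper name parse_fields is made up for illustration.

```python
def parse_fields(line):
    """Turn one line of `drbdsetup status --verbose --statistics` output
    into a dict, converting purely numeric values to int."""
    fields = {}
    for token in line.split():
        if ":" not in token:
            continue  # bare names such as the resource or peer host name
        key, value = token.split(":", 1)
        fields[key] = int(value) if value.isdigit() else value
    return fields

# Last line of the sample output above.
sample = "received:0 sent:10506515 out-of-sync:0 pending:0 unacked:0"
stats = parse_fields(sample)
print(stats["sent"], stats["out-of-sync"])
```

Sampling such counters at a fixed interval and taking the difference gives a throughput figure per interval.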

Viewing the current DRBD state with drbdsetup events2

[root@drbd-node3 ~]# drbdsetup events2 --now scsivol
exists resource name:scsivol role:Primary suspended:no
exists connection name:scsivol peer-node-id:0 conn-name:drbd-node1 connection:Connected role:Secondary
exists device name:scsivol volume:0 minor:0 disk:UpToDate client:no quorum:yes
exists peer-device name:scsivol peer-node-id:0 conn-name:drbd-node1 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists -

Without the --now option, the command keeps running and prints state information continuously:

[root@drbd-node3 ~]# drbdsetup events2 scsivol
exists resource name:scsivol role:Primary suspended:no
exists connection name:scsivol peer-node-id:0 conn-name:drbd-node1 connection:Connected role:Secondary
exists device name:scsivol volume:0 minor:0 disk:UpToDate client:no quorum:yes
exists peer-device name:scsivol peer-node-id:0 conn-name:drbd-node1 volume:0 replication:Established peer-disk:UpToDate peer-client:no resync-suspended:no
exists -
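Because events2 prints one event per line, it suits a small stream parser. The following is a Python sketch, assuming every line has the form "action kind key:value ..." as in the sample and that "exists -" closes the initial state dump; parse_events is a hypothetical helper name.

```python
def parse_events(lines):
    """Yield (action, kind, fields) for each events2 line.

    The line "exists -" is reported with kind None and no fields.
    """
    for line in lines:
        tokens = line.split()
        if len(tokens) < 2:
            continue  # skip blank or incomplete lines
        action, kind = tokens[0], tokens[1]
        if kind == "-":
            yield action, None, {}
            continue
        fields = dict(t.split(":", 1) for t in tokens[2:] if ":" in t)
        yield action, kind, fields

# Shortened copy of the sample output above.
sample = """exists resource name:scsivol role:Primary suspended:no
exists device name:scsivol volume:0 minor:0 disk:UpToDate client:no quorum:yes
exists -
"""
events = list(parse_events(sample.splitlines()))
for event in events:
    print(event)
```

For continuous monitoring, the same generator can be fed the standard output of a drbdsetup events2 process started with subprocess.Popen, since it consumes any iterable of lines.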

Checking the connection state

[root@drbd-node3 ~]# drbdadm cstate scsivol
Connected

The connection state is one of the following:

  • StandAlone
  • Disconnecting
  • Unconnected
  • Timeout
  • BrokenPipe
  • NetworkFailure
  • ProtocolError
  • TearDown
  • Connecting
  • Connected

StandAlone
No network configuration available. The resource has not yet been connected, or has been administratively disconnected (using drbdadm disconnect), or has dropped its connection due to failed authentication or split brain.

Disconnecting
Temporary state during disconnection. The next state is StandAlone.

Unconnected
Temporary state, prior to a connection attempt. Possible next states: Connecting.

Timeout
Temporary state following a timeout in the communication with the peer. Next state: Unconnected.

BrokenPipe
Temporary state after the connection to the peer was lost. Next state: Unconnected.

NetworkFailure
Temporary state after the connection to the partner was lost. Next state: Unconnected.

ProtocolError
Temporary state after the connection to the partner was lost. Next state: Unconnected.

TearDown
Temporary state. The peer is closing the connection. Next state: Unconnected.

Connecting
This node is waiting until the peer node becomes visible on the network.

Connected
A DRBD connection has been established, data mirroring is now active. This is the normal state.
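The "next state" notes above form a small transition table. The following Python sketch encodes only the transitions listed in this section and follows them until a state with no listed successor is reached. Treating StandAlone as the one state that needs an administrator is an assumption of this sketch, and the names NEXT_STATE, settle and needs_attention are illustrative.

```python
# Next state for each temporary connection state, as listed above.
NEXT_STATE = {
    "Disconnecting": "StandAlone",
    "Unconnected": "Connecting",
    "Timeout": "Unconnected",
    "BrokenPipe": "Unconnected",
    "NetworkFailure": "Unconnected",
    "ProtocolError": "Unconnected",
    "TearDown": "Unconnected",
}

def settle(state):
    """Follow the temporary states until one with no listed successor."""
    path = [state]
    while path[-1] in NEXT_STATE:
        path.append(NEXT_STATE[path[-1]])
    return path

def needs_attention(state):
    """Assumption: only StandAlone requires manual action (drbdadm connect)."""
    return state == "StandAlone"

print(" -> ".join(settle("Timeout")))
print(needs_attention("StandAlone"))
```

A check built on this would tolerate the temporary states for a short grace period and alert on StandAlone immediately.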

Checking the replication state

The replication state is one of the following:

  • Off
  • Established
  • StartingSyncS
  • StartingSyncT
  • WFBitMapS
  • WFBitMapT
  • WFSyncUUID
  • SyncSource
  • SyncTarget
  • PausedSyncS
  • PausedSyncT
  • VerifyS
  • VerifyT
  • Ahead
  • Behind

Off
The volume is not replicated over this connection, since the connection is not Connected.

Established
All writes to that volume are replicated online. This is the normal state.

StartingSyncS
Full synchronization, initiated by the administrator, is just starting. The next possible states are: SyncSource or PausedSyncS.

StartingSyncT
Full synchronization, initiated by the administrator, is just starting. Next state: WFSyncUUID.

WFBitMapS
Partial synchronization is just starting. Next possible states: SyncSource or PausedSyncS.

WFBitMapT
Partial synchronization is just starting. Next possible state: WFSyncUUID.

WFSyncUUID
Synchronization is about to begin. Next possible states: SyncTarget or PausedSyncT.

SyncSource
Synchronization is currently running, with the local node being the source of synchronization.

SyncTarget
Synchronization is currently running, with the local node being the target of synchronization.

PausedSyncS
The local node is the source of an ongoing synchronization, but synchronization is currently paused. This may be due to a dependency on the completion of another synchronization process, or due to synchronization having been manually interrupted by drbdadm pause-sync.

PausedSyncT
The local node is the target of an ongoing synchronization, but synchronization is currently paused. This may be due to a dependency on the completion of another synchronization process, or due to synchronization having been manually interrupted by drbdadm pause-sync.

VerifyS
On-line device verification is currently running, with the local node being the source of verification.

VerifyT
On-line device verification is currently running, with the local node being the target of verification.

Ahead
Data replication was suspended, since the link cannot cope with the load. This state is enabled by the on-congestion configuration option.

Behind
Data replication was suspended by the peer, since the link cannot cope with the load. This state is enabled by the on-congestion configuration option on the peer node.
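For alerting it is often enough to know whether a resync is in progress and on which side the local node sits. The following Python sketch groups the replication states accordingly. The grouping follows the descriptions above (WFSyncUUID is counted as a target state because its listed successors are SyncTarget and PausedSyncT); the summary strings and function names are this sketch's own.

```python
SOURCE_STATES = {"StartingSyncS", "WFBitMapS", "SyncSource", "PausedSyncS"}
TARGET_STATES = {"StartingSyncT", "WFBitMapT", "WFSyncUUID", "SyncTarget", "PausedSyncT"}

def sync_side(replication):
    """Return 'source', 'target' or None for a replication state."""
    if replication in SOURCE_STATES:
        return "source"
    if replication in TARGET_STATES:
        return "target"
    return None

def summarize(replication):
    """Map a replication state to a short human-readable summary."""
    if replication == "Established":
        return "replicating normally"
    side = sync_side(replication)
    if side:
        paused = " (paused)" if replication.startswith("Paused") else ""
        return f"resync as {side}{paused}"
    if replication in ("VerifyS", "VerifyT"):
        return "online verification running"
    if replication in ("Ahead", "Behind"):
        return "replication suspended by congestion"
    return "not replicating"

for state in ("Established", "PausedSyncT", "Ahead", "Off"):
    print(state, "=>", summarize(state))
```

The replication value itself can be taken from the replication: field of the drbdsetup status or events2 output shown earlier.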

Checking the resource role

[root@drbd-node3 ~]# drbdadm role scsivol
Primary

The resource role is one of the following:

  • Primary
  • Secondary
  • Unknown

Primary
The resource is currently in the primary role, and may be read from and written to. This role only occurs on one of the two nodes, unless dual-primary mode is enabled.

Secondary
The resource is currently in the secondary role. It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. This role may occur on one or both nodes.

Unknown
The resource’s role is currently unknown. The local resource role never has this status. It is only displayed for the peer’s resource role, and only in disconnected mode.

Checking the disk state

[root@drbd-node3 ~]# drbdadm dstate scsivol
UpToDate/UpToDate

The disk state is one of the following:

  • Diskless
  • Attaching
  • Detaching
  • Failed
  • Negotiating
  • Inconsistent
  • Outdated
  • DUnknown
  • Consistent
  • UpToDate

Diskless
No local block device has been assigned to the DRBD driver. This may mean that the resource has never attached to its backing device, that it has been manually detached using drbdadm detach, or that it automatically detached after a lower-level I/O error.

Attaching
Transient state while reading meta data.

Detaching
Transient state while detaching and waiting for ongoing IOs to complete.

Failed
Transient state following an I/O failure report by the local block device. Next state: Diskless.

Negotiating
Transient state when an Attach is carried out on an already-Connected DRBD device.

Inconsistent
The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). It is also found on one node (the synchronization target) during synchronization.

Outdated
Resource data is consistent, but outdated.

DUnknown
This state is used for the peer disk if no network connection is available.

Consistent
Consistent data of a node without connection. When the connection is established, it is decided whether the data is UpToDate or Outdated.

UpToDate
Consistent, up-to-date state of the data. This is the normal state.
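A health check usually only needs to distinguish "everything UpToDate" from the rest. The following Python sketch does so, assuming the two-node local/peer format printed by drbdadm dstate above. The labels ok, transient and degraded, and the function names, are this sketch's own convention.

```python
# States described above as transient.
TRANSIENT = {"Attaching", "Detaching", "Failed", "Negotiating"}

def parse_dstate(output):
    """Split 'Local/Peer' as printed by `drbdadm dstate` into two names."""
    local, _, peer = output.strip().partition("/")
    return local, peer

def check_disks(output):
    """Classify a dstate line as 'ok', 'transient' or 'degraded'."""
    local, peer = parse_dstate(output)
    if local == "UpToDate" and peer == "UpToDate":
        return "ok"
    if local in TRANSIENT or peer in TRANSIENT:
        return "transient"
    return "degraded"

print(check_disks("UpToDate/UpToDate"))
print(check_disks("UpToDate/DUnknown"))
```

DUnknown on the peer side typically coincides with a connection state other than Connected, so the two checks are best read together.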
 

Performance indicators

  • read
  • written
  • al-writes
  • sent
  • received
  • bm-writes
  • lower-pending
  • pending
  • unacked
  • upper-pending
  • write-ordering
  • out-of-sync
  • resync-suspended
  • blocked

sent (network send)
Volume of net data sent to the partner via the network connection; in KiB.

received (network receive)
Volume of net data received from the partner via the network connection; in KiB.

read (disk read)
Net data read from the local hard disk; in KiB.

written (disk write)
Net data written to the local hard disk; in KiB.

al-writes (activity log)
Number of updates of the activity log area of the meta data.

bm-writes (bit map)
Number of updates of the bitmap area of the meta data.

lower-pending (local count)
Number of open requests to the local I/O sub-system issued by DRBD.

pending
Number of requests sent to the partner, but that have not yet been answered by the latter.

unacked (unacknowledged)
Number of requests received from the partner via the network connection, but that have not yet been answered.

upper-pending (application pending)
Number of block I/O requests forwarded to DRBD, but not yet answered by DRBD.

write-ordering (write order)
Currently used write ordering method: barrier, flush, drain or none.

out-of-sync
Amount of storage currently out of sync; in Kibibytes.

resync-suspended
Whether the resynchronization is currently suspended or not. Possible values are no, user, peer, dependency.

blocked
Shows local I/O congestion. Possible values:

  • no: No congestion.
  • upper: I/O above the DRBD device is blocked, i.e. to the filesystem. Typical causes are:
      ◦ I/O suspension by the administrator, see the suspend-io command in drbdadm.
      ◦ Transient blocks, e.g. during attach/detach.
      ◦ Buffers depleted.
      ◦ Waiting for bitmap I/O.
  • lower: Backing device is congested.

It’s possible to see a value of upper,lower, too.
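Two of these indicators benefit from light post-processing: the KiB counters are hard to read at a glance, and blocked may carry more than one value. The following Python sketch shows both helpers; the names kib_to_human and parse_blocked and the one-decimal formatting are choices of this sketch.

```python
def kib_to_human(kib):
    """Format a KiB counter (read, written, sent, received, out-of-sync)."""
    value = float(kib)
    for unit in ("KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that keeps the number below 1024,
        # or at TiB, the largest unit handled here.
        if value < 1024 or unit == "TiB":
            return f"{value:.1f} {unit}"
        value /= 1024

def parse_blocked(value):
    """Return the set of congested layers; 'no' means none."""
    return set() if value == "no" else set(value.split(","))

# written:10488103 is taken from the statistics output earlier in this post.
print(kib_to_human(10488103))
print(parse_blocked("upper,lower"))
```

An alert on a non-empty parse_blocked result, or on an out-of-sync value that stays above zero, covers the most common trouble signs among these indicators.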

 
