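File Monitoring on Linux with inotify

1. Requirements

The project needs to monitor the files under a directory and react whenever a file or subdirectory changes.

The usual approach is to scan the directory in a loop. With many files, however, each pass takes a long time, and a scan cannot tell whether a file is still in use (for example, still being written).

Fortunately, since Linux 2.6 (kernel 2.6.13 and later) the kernel provides inotify for monitoring the file system. It reports changes to you as events, replacing the scan loop.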

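2. Interface

2.1 API description

1) int inotify_init(void);

Creates an inotify instance and returns a file descriptor for it. On failure it returns -1 and, as with socket(), the error type is available in errno. The man page excerpt also covers inotify_init1(int flags), the variant that accepts flags.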

DESCRIPTION
       inotify_init()  initializes a new inotify instance and returns a file descriptor associated with a new inotify
       event queue.

       If flags is 0, then inotify_init1() is the same as inotify_init().  The following values can be  bitwise  ORed
       in flags to obtain different behavior:

       IN_NONBLOCK Set the O_NONBLOCK file status flag on the new open file description.  Using this flag saves extra
                   calls to fcntl(2) to achieve the same result.

       IN_CLOEXEC  Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor.  See the  description  of  the
                   O_CLOEXEC flag in open(2) for reasons why this may be useful.

RETURN VALUE
       On  success,  these  system calls return a new file descriptor.  On error, -1 is returned, and errno is set to
       indicate the error.

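2) int inotify_add_watch(int fd, const char *pathname, uint32_t mask);

Adds the directory (or file) to be monitored to the inotify instance and sets the event types in mask. On success it returns the watch descriptor for that path, which later identifies the source of each event. A watch descriptor is a plain integer ID, not a file descriptor.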

DESCRIPTION
       inotify_add_watch()  adds a new watch, or modifies an existing watch, for the file whose location is specified
       in pathname; the caller must have read permission for this file.  The fd argument is a file descriptor
       referring to the inotify instance whose watch list is to be modified.  The events to be monitored for pathname are
       specified in the mask bit-mask argument.  See inotify(7) for a description of the bits  that  can  be  set  in
       mask.

       A successful call to inotify_add_watch() returns the unique watch descriptor associated with pathname for this
       inotify instance.  If pathname was not previously being watched by  this  inotify  instance,  then  the  watch
       descriptor  is  newly  allocated.  If pathname was already being watched, then the descriptor for the existing
       watch is returned.

       The watch descriptor is returned by later read(2)s from the inotify file descriptor.  These reads fetch
       inotify_event structures (see inotify(7)) indicating file system events; the watch descriptor inside this
       structure identifies the object for which the event occurred.

RETURN VALUE
       On success, inotify_add_watch() returns a non-negative watch descriptor.  On error -1 is returned and errno is
       set appropriately.
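
A short sketch of adding a watch with error reporting follows. The helper name watch_dir is hypothetical, and the mask is simply the one used by the example program later in this article.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper: watches path on inotify instance fd.
 * Returns the watch descriptor, or -1 on error. */
int watch_dir(int fd, const char *path)
{
    uint32_t mask = IN_CREATE | IN_DELETE | IN_MOVED_FROM |
                    IN_MOVED_TO | IN_CLOSE_WRITE;
    int wd = inotify_add_watch(fd, path, mask);
    if (wd == -1)
        fprintf(stderr, "inotify_add_watch(%s): %s\n", path, strerror(errno));
    return wd;
}
```

Calling watch_dir twice with the same path on the same instance returns the same watch descriptor, as the excerpt above states.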

3) int inotify_rm_watch(int fd, int wd);

Removes a watch from the inotify instance.

DESCRIPTION
       inotify_rm_watch() removes the watch associated with the watch descriptor wd from the inotify instance
       associated with the file descriptor fd.

       Removing a watch causes an IN_IGNORED event to be generated for this watch descriptor.  (See inotify(7).)

RETURN VALUE
       On success, inotify_rm_watch() returns zero, or -1 if an error occurred (in which case, errno is set
       appropriately).
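
The IN_IGNORED behaviour mentioned in the excerpt can be observed with a sketch like the one below. The helper name remove_watch_and_read is hypothetical. It assumes a blocking descriptor with no other events pending, so the first event read is the one generated by the removal.

```c
#include <stdint.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: removes watch wd from inotify instance fd and reads
 * the next queued event. Returns that event's mask, or 0 on failure. */
uint32_t remove_watch_and_read(int fd, int wd)
{
    /* Buffer aligned for struct inotify_event, big enough for one event. */
    char buf[sizeof(struct inotify_event) + 256]
        __attribute__((aligned(__alignof__(struct inotify_event))));

    if (inotify_rm_watch(fd, wd) == -1)
        return 0;

    ssize_t n = read(fd, buf, sizeof buf);
    if (n < (ssize_t)sizeof(struct inotify_event))
        return 0;

    return ((const struct inotify_event *)buf)->mask;
}
```

IN_IGNORED is delivered even though it was never requested in the watch mask, so an event loop should expect it after every removal.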

2.2 Event types

sys/inotify.h shows which bits can be set in mask. They cover the usual create, delete and modify actions:

/* Supported events suitable for MASK parameter of INOTIFY_ADD_WATCH.  */
#define IN_ACCESS    0x00000001 /* File was accessed. */
#define IN_MODIFY    0x00000002 /* File was modified. */
#define IN_ATTRIB    0x00000004 /* Metadata changed (file attributes). */
#define IN_CLOSE_WRITE   0x00000008 /* Writable file was closed (a file opened for writing was closed). */
#define IN_CLOSE_NOWRITE 0x00000010 /* Unwritable file closed (a file not opened for writing was closed). */
#define IN_CLOSE     (IN_CLOSE_WRITE | IN_CLOSE_NOWRITE) /* Close (either of the two above). */
#define IN_OPEN      0x00000020 /* File was opened. */
#define IN_MOVED_FROM    0x00000040 /* File was moved from X (moved out of the watched directory). */
#define IN_MOVED_TO      0x00000080 /* File was moved to Y (moved into the watched directory). */
#define IN_MOVE      (IN_MOVED_FROM | IN_MOVED_TO) /* Moves (either of the two above). */
#define IN_CREATE    0x00000100 /* Subfile was created (file or subdirectory created in the watched directory). */
#define IN_DELETE    0x00000200 /* Subfile was deleted (file or subdirectory deleted from the watched directory). */
#define IN_DELETE_SELF   0x00000400 /* Self was deleted (the watched directory itself). */
#define IN_MOVE_SELF     0x00000800 /* Self was moved (the watched directory itself). */
/* All events which a program can wait on.  */
#define IN_ALL_EVENTS    (IN_ACCESS | IN_MODIFY | IN_ATTRIB | IN_CLOSE_WRITE  \
              | IN_CLOSE_NOWRITE | IN_OPEN | IN_MOVED_FROM        \
              | IN_MOVED_TO | IN_CREATE | IN_DELETE           \
              | IN_DELETE_SELF | IN_MOVE_SELF)
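
Since mask is a bit set, several of these flags can be present in one value. The sketch below turns a mask into a readable string. The function name describe_mask is hypothetical, and only the twelve basic events listed above are decoded.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper: writes the names of the event bits set in mask into
 * out, separated by '|'. Returns out. Stops early if out is too small. */
const char *describe_mask(uint32_t mask, char *out, size_t outsz)
{
    static const struct { uint32_t bit; const char *name; } table[] = {
        { IN_ACCESS, "IN_ACCESS" },           { IN_MODIFY, "IN_MODIFY" },
        { IN_ATTRIB, "IN_ATTRIB" },           { IN_CLOSE_WRITE, "IN_CLOSE_WRITE" },
        { IN_CLOSE_NOWRITE, "IN_CLOSE_NOWRITE" }, { IN_OPEN, "IN_OPEN" },
        { IN_MOVED_FROM, "IN_MOVED_FROM" },   { IN_MOVED_TO, "IN_MOVED_TO" },
        { IN_CREATE, "IN_CREATE" },           { IN_DELETE, "IN_DELETE" },
        { IN_DELETE_SELF, "IN_DELETE_SELF" }, { IN_MOVE_SELF, "IN_MOVE_SELF" },
    };
    size_t used = 0;

    if (outsz == 0)
        return out;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (!(mask & table[i].bit))
            continue;
        int n = snprintf(out + used, outsz - used, "%s%s",
                         used ? "|" : "", table[i].name);
        if (n < 0 || (size_t)n >= outsz - used)
            break;                      /* output buffer exhausted */
        used += (size_t)n;
    }
    return out;
}
```

For instance, passing IN_CLOSE yields the two flags it is composed of, IN_CLOSE_WRITE and IN_CLOSE_NOWRITE.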

2.3 Event structure

/* Structure describing an inotify event.  */
struct inotify_event
{
  int wd;       /* Watch descriptor.  */
  uint32_t mask;    /* Watch mask.  */
  uint32_t cookie;  /* Cookie to synchronize two events.  */
  uint32_t len;     /* Length (including NULs) of name.  */
  char name __flexarr;  /* Name.  */
};

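Events are read from the inotify file descriptor once it becomes readable. The data read is a sequence of records laid out roughly as follows:

[Figure 1: layout of the inotify_event records in the read buffer]

wd identifies the watch within the inotify instance, mask holds the event bits, cookie links two related events, and len is the length of the name field that follows.

Note that name is a flexible array member: each event header is followed by a variable-length name (the file or directory the event refers to), padded with NUL bytes up to len.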

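For example:

Suppose an inotify instance (fd=1) watches directory a (wd=2) and directory b (wd=3).

If fd=1 delivers an event with wd=2, mask=IN_CREATE, cookie=0, name="1", then file 1 was created under directory a.

If fd=1 delivers an event with wd=3, mask=IN_MOVED_FROM, cookie=0x1234, name="2", followed by an event with wd=3, mask=IN_MOVED_TO, cookie=0x1234, name="3", then file 2 under directory b was renamed to file 3.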
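
The pairing rule in the rename example can be written as a small predicate. This is only a sketch; the name is_rename_pair is hypothetical. The wd values are deliberately not compared, since a move between two watched directories produces the two halves on different watches.

```c
#include <stdbool.h>
#include <sys/inotify.h>

/* True if `from` and `to` are the two halves of one rename: an
 * IN_MOVED_FROM event and an IN_MOVED_TO event sharing a non-zero cookie. */
bool is_rename_pair(const struct inotify_event *from,
                    const struct inotify_event *to)
{
    return (from->mask & IN_MOVED_FROM) &&
           (to->mask & IN_MOVED_TO) &&
           from->cookie != 0 &&
           from->cookie == to->cookie;
}
```

An IN_MOVED_FROM event that never finds a partner means the file left the watched area, and a lone IN_MOVED_TO means it arrived from outside.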


3. Example

An inotify instance is a file descriptor and behaves much like a socket, so on Linux (the only platform that supports inotify) it can be multiplexed with epoll or select.

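This means it combines well with the libevent network library: its bufferevent mechanism handles the buffering (see "Implementing a TCP proxy on Linux with bufferevent") and greatly simplifies the code.

First, an internal structure, struct string, which provides a buffer large enough to hold one inotify event while it is being assembled: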

#define SIZE_IEVENT sizeof(struct inotify_event)

struct string
{
    char str[SIZE_IEVENT + 1024];
    size_t len;
};

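Next is main(). It creates the base and bev instances, obtains fd through the inotify API, and hands fd over to bev.

The call bufferevent_setcb(bev, on_recv, NULL, NULL, &string); registers the callback invoked when fd becomes readable: whenever events arrive, on_recv is called to process them.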

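int main(int argc, char *argv[])
{
    int fd = 0;
    struct bufferevent *bev = NULL;
    struct event_base *base = NULL;

    struct string string = {{0}, 0};

    if ( argc < 2 ) {
        printf("Usage: %s <directory>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    base = event_base_new();
    assert(base);

    fd = inotify_init();
    if ( fd == -1 ) {
        perror("inotify_init");
        exit(EXIT_FAILURE);
    }

    /* Watch the directory for create, delete, rename and write-close events */
    if ( inotify_add_watch(fd, argv[1],
            IN_CREATE |
            IN_DELETE |
            IN_MOVED_FROM |
            IN_MOVED_TO |
            IN_CLOSE_WRITE) == -1 ) {
        perror("inotify_add_watch");
        exit(EXIT_FAILURE);
    }

    /* Hand the inotify fd to a bufferevent, which buffers incoming data */
    bev = bufferevent_socket_new(base, fd, 0);
    assert(bev);

    /* Call on_recv only once at least one event header is buffered */
    bufferevent_setwatermark(bev, EV_READ, SIZE_IEVENT, 0);
    bufferevent_setcb(bev, on_recv, NULL, NULL, &string);
    bufferevent_enable(bev, EV_READ);

    event_base_dispatch(base);

    return EXIT_SUCCESS;
}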

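Now the core function, on_recv. When events arrive on fd (possibly several at once), bufferevent first receives the data into its buffer

and then triggers our on_recv callback. Inside it we only need to call bufferevent_read to move an event from the buffer into pstr.

Because name has variable length, the function loops in two steps: it first reads the sizeof(struct inotify_event) header bytes (16 bytes on Linux, not 32), then reads the name that follows according to pevent->len.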

void display(struct inotify_event *pevent);

void on_recv(struct bufferevent *bev, void *args)
{
    size_t length = 0;
    struct string *pstr = (struct string *)args;
    struct inotify_event *pevent = (struct inotify_event *)pstr->str;

    while ( 1 ) {
        length = evbuffer_get_length(bufferevent_get_input(bev));
        if ( pstr->len == 0 ) {
            /* Step 1: wait for a complete fixed-size header */
            if ( length < SIZE_IEVENT ) {
                printf("Retry head\n");
                return;
            }
            pstr->len += bufferevent_read(bev, pevent, SIZE_IEVENT);
            assert(pstr->len == SIZE_IEVENT);
        }
        else {
            /* Step 2: wait for the pevent->len bytes of name */
            if ( length < pevent->len ) {
                printf("Retry body\n");
                return;
            }
            pstr->len += bufferevent_read(bev, pevent->name, pevent->len);
            assert(pstr->len == pevent->len + SIZE_IEVENT);
            pstr->len = 0;

            /* An event without a name must not show the previous one */
            if ( pevent->len == 0 ) {
                pevent->name[0] = '\0';
            }

            /* Done */
            display(pevent);
        }
    }
}

void display(struct inotify_event *pevent)
{
/* Print the name of each event type whose bit is set in the mask */
#define DISPLAY_FLAG(mask, type) do { \
    if ( (mask) & (type) ) { \
        printf("%-15s, ", #type); \
    } \
} while ( 0 )
    DISPLAY_FLAG(pevent->mask, IN_CREATE);
    DISPLAY_FLAG(pevent->mask, IN_DELETE);
    DISPLAY_FLAG(pevent->mask, IN_CLOSE_WRITE);
    DISPLAY_FLAG(pevent->mask, IN_MOVED_FROM);
    DISPLAY_FLAG(pevent->mask, IN_MOVED_TO);
#undef DISPLAY_FLAG

    printf("%s\n", pevent->name);
}
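
The header-then-name reassembly used by on_recv does not depend on libevent. The sketch below applies the same two-phase idea to an arbitrary byte stream delivered in chunks. All names here (struct assembler, assembler_feed, print_event, EVENT_HDR_SIZE) are hypothetical, and the 1024-byte name limit mirrors struct string above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

#define EVENT_HDR_SIZE sizeof(struct inotify_event)

/* Reassembly state: the event currently being collected. */
struct assembler {
    _Alignas(struct inotify_event) char bytes[EVENT_HDR_SIZE + 1024];
    size_t have;                 /* bytes of the current event collected */
};

/* Example callback: prints one completed event. */
void print_event(const struct inotify_event *ev)
{
    printf("wd=%d mask=0x%08x name=%s\n",
           ev->wd, (unsigned)ev->mask, ev->len ? ev->name : "");
}

/* Feeds n bytes into the assembler and calls cb (if not NULL) once per
 * completed event. Returns the number of events completed by this call,
 * or -1 if an event's name does not fit the buffer. */
int assembler_feed(struct assembler *a, const char *data, size_t n,
                   void (*cb)(const struct inotify_event *))
{
    struct inotify_event *ev = (struct inotify_event *)a->bytes;
    int done = 0;

    for (;;) {
        /* Phase 1 needs the header; phase 2 needs header plus name. */
        size_t need = a->have < EVENT_HDR_SIZE ? EVENT_HDR_SIZE
                                               : EVENT_HDR_SIZE + ev->len;
        if (need > sizeof a->bytes)
            return -1;

        size_t take = need - a->have;
        if (take > n)
            take = n;
        memcpy(a->bytes + a->have, data, take);
        a->have += take;
        data += take;
        n -= take;

        if (a->have < need)
            return done;         /* wait for more input */
        if (need == EVENT_HDR_SIZE && ev->len > 0)
            continue;            /* header complete, now collect the name */

        if (cb)
            cb(ev);
        a->have = 0;             /* start over with the next event */
        done++;
    }
}
```

A caller would zero-initialize one struct assembler per descriptor and pass every chunk it reads to assembler_feed, regardless of where the chunk boundaries fall.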
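
The header names were lost from the original text here. The code above needs at least <sys/inotify.h>, <event2/event.h>, <event2/bufferevent.h> and <event2/buffer.h>, plus <stdio.h>, <stdlib.h> and <assert.h>; they are not discussed further.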

Once the program is running, use a few simple commands to create, rename and delete entries in bulk:

mkdir a1 a2 a3 a4 a5; rename a b a*; rm -rf *

The result is as follows:

[Figure 2: screenshot of the program output for the commands above]

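4. Conclusion

For real-time file monitoring, inotify clearly outperforms scanning, but there are several points to watch when using the API:

1) An inotify_event only gives you the wd and the name of the affected entry; you need to maintain your own data structure (wd to directory path) to reconstruct full paths.

2) inotify_add_watch monitors a single directory level only (a common complaint about the API); to monitor a tree, you must call add_watch on every subdirectory as well.

3) Related to the previous point, test multi-level creation thoroughly with mkdir -p a/b/c/d/e/f. When you add a watch on directory a, also scan it, otherwise subdirectories created before the watch was in place are missed.

4) For renames, you usually have to check that cookie is non-zero and pair the IN_MOVED_FROM and IN_MOVED_TO events yourself.

5) Under heavy load (hundreds of thousands of events in a burst), check the settings under /proc/sys/fs/inotify/max_* so that the kernel queue does not fill up and drop events.

6) Others (no further pitfalls come to mind; the handling of fd itself also deserves some care).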


If you work in operations and would rather not write C, look into using the inotifywait tool from shell scripts.


