Author: 刘昊昱
Blog: http://blog.csdn.net/liuhaoyutz
Kernel version: 3.10.1
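1. Definition of the kobject structure
kobject is the lowest-level data structure of the Linux device model; it represents a kernel object.
struct kobject is defined in include/linux/kobject.h: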
60 struct kobject {
61     const char *name;
62     struct list_head entry;
63     struct kobject *parent;
64     struct kset *kset;
65     struct kobj_type *ktype;
66     struct sysfs_dirent *sd;
67     struct kref kref;
68     unsigned int state_initialized:1;
69     unsigned int state_in_sysfs:1;
70     unsigned int state_add_uevent_sent:1;
71     unsigned int state_remove_uevent_sent:1;
72     unsigned int uevent_suppress:1;
73 };
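name is the name of the kernel object; in sysfs it shows up as a directory with that name.
entry links the kernel object into the kobject list of the kset it belongs to.
parent points to the parent of the kernel object and is used to build the kobject hierarchy.
kset is the "set of kernel objects" that this kernel object belongs to.
ktype describes the type of the kernel object: its release function, its sysfs operations and its default attributes.
sd is the sysfs directory entry that corresponds to the kernel object.
kref is the reference count of the kernel object; its core data is an atomic variable.
state_initialized records whether the kernel object has been initialized; 1 means it has.
state_in_sysfs records whether an entry for the kernel object has been created in sysfs.
state_add_uevent_sent records whether the "add" uevent has already been sent for the kernel object.
state_remove_uevent_sent records whether the "remove" uevent has already been sent for the kernel object.
uevent_suppress controls whether uevents are sent when the state of the kernel object changes; when it is 1, they are suppressed.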
2. Analysis of kobject initialization
We start the analysis of kobject with kobject_init_and_add. This function initializes a kobject, places it in the kobject hierarchy and adds it to sysfs. It is defined in lib/kobject.c as follows:
360 /**
361  * kobject_init_and_add - initialize a kobject structure and add it to the kobject hierarchy
362  * @kobj: pointer to the kobject to initialize
363  * @ktype: pointer to the ktype for this kobject.
364  * @parent: pointer to the parent of this kobject.
365  * @fmt: the name of the kobject.
366  *
367  * This function combines the call to kobject_init() and
368  * kobject_add(). The same type of error handling after a call to
369  * kobject_add() and kobject lifetime rules are the same here.
370  */
371 int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype,
372                          struct kobject *parent, const char *fmt, ...)
373 {
374     va_list args;
375     int retval;
376
377     kobject_init(kobj, ktype);
378
379     va_start(args, fmt);
380     retval = kobject_add_varg(kobj, parent, fmt, args);
381     va_end(args);
382
383     return retval;
384 }
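As the comment indicates, kobject_init_and_add consists of two parts: kobject_init, which initializes the kobject, and kobject_add_varg, which adds the kobject to the kobject hierarchy.
Let us look at kobject_init first. It is defined as follows: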
256 /**
257  * kobject_init - initialize a kobject structure
258  * @kobj: pointer to the kobject to initialize
259  * @ktype: pointer to the ktype for this kobject.
260  *
261  * This function will properly initialize a kobject such that it can then
262  * be passed to the kobject_add() call.
263  *
264  * After this function is called, the kobject MUST be cleaned up by a call
265  * to kobject_put(), not by a call to kfree directly to ensure that all of
266  * the memory is cleaned up properly.
267  */
268 void kobject_init(struct kobject *kobj, struct kobj_type *ktype)
269 {
270     char *err_str;
271
272     if (!kobj) {
273         err_str = "invalid kobject pointer!";
274         goto error;
275     }
276     if (!ktype) {
277         err_str = "must have a ktype to be initialized properly!\n";
278         goto error;
279     }
280     if (kobj->state_initialized) {
281         /* do not error out as sometimes we can recover */
282         printk(KERN_ERR "kobject (%p): tried to init an initialized "
283                "object, something is seriously wrong.\n", kobj);
284         dump_stack();
285     }
286
287     kobject_init_internal(kobj);
288     kobj->ktype = ktype;
289     return;
290
291 error:
292     printk(KERN_ERR "kobject (%p): %s\n", kobj, err_str);
293     dump_stack();
294 }
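Lines 276-279: if ktype is NULL, the function prints an error message, dumps the stack and returns without initializing the kobject. kobject_init returns void, so no error code reaches the caller; kobject_init_and_add must therefore always be given a non-NULL ktype.
Lines 280-285: if the kobject has already been initialized, an error message is printed and the stack is dumped, but initialization still goes ahead.
Line 288: the ktype passed in is stored in kobj->ktype.
Line 287: before that assignment, kobject_init_internal is called. It is defined in lib/kobject.c as follows: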
143 static void kobject_init_internal(struct kobject *kobj)
144 {
145     if (!kobj)
146         return;
147     kref_init(&kobj->kref);
148     INIT_LIST_HEAD(&kobj->entry);
149     kobj->state_in_sysfs = 0;
150     kobj->state_add_uevent_sent = 0;
151     kobj->state_remove_uevent_sent = 0;
152     kobj->state_initialized = 1;
153 }
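This function initializes the members of the kobject. The only part worth a closer look is the initialization of kobj->kref, the reference count, which is done by kref_init. That function is defined in include/linux/kref.h as follows: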
24 struct kref {
25     atomic_t refcount;
26 };
27
28 /**
29  * kref_init - initialize object.
30  * @kref: object in question.
31  */
32 static inline void kref_init(struct kref *kref)
33 {
34     atomic_set(&kref->refcount, 1);
35 }
The core of the reference count is an atomic variable, which is initialized to 1.
This completes the analysis of kobject_init.
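The reference-counting idea behind kref can be sketched in user space. The following is only a model under stated assumptions: struct kref_model, the kref_model_* functions and count_release are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the put function follows the general shape of the kernel's kref_put (drop one reference, call a release callback at zero) rather than reproducing it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for struct kref (hypothetical model, not kernel code). */
struct kref_model {
	atomic_int refcount;
};

/* Mirrors kref_init(): a freshly initialized object holds one reference. */
void kref_model_init(struct kref_model *k)
{
	atomic_store(&k->refcount, 1);
}

/* Mirrors the idea of kref_get(): take one more reference. */
void kref_model_get(struct kref_model *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/*
 * Mirrors the idea of kref_put(): drop one reference and call release()
 * when the last one goes away. Returns 1 if release() was called.
 */
int kref_model_put(struct kref_model *k, void (*release)(struct kref_model *k))
{
	assert(release != NULL);
	/* atomic_fetch_sub returns the value held before the decrement. */
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Demo release callback: only counts how often it ran. */
int released_count;

void count_release(struct kref_model *k)
{
	(void)k;
	released_count++;
}
```

Starting the count at 1 means the creator of the object already owns a reference, which is why the kobject_init comment insists on cleaning up with kobject_put() rather than a direct kfree.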
Back in kobject_init_and_add:
Line 379: va_start begins processing of the variable arguments; this is not explained in detail here.
Line 380: kobject_add_varg is called. It is defined in lib/kobject.c as follows:
297 static int kobject_add_varg(struct kobject *kobj, struct kobject *parent,
298                             const char *fmt, va_list vargs)
299 {
300     int retval;
301
302     retval = kobject_set_name_vargs(kobj, fmt, vargs);
303     if (retval) {
304         printk(KERN_ERR "kobject: can not set name properly!\n");
305         return retval;
306     }
307     kobj->parent = parent;
308     return kobject_add_internal(kobj);
309 }
Line 302: kobject_set_name_vargs processes the format string and variable arguments and sets kobject.name.
Line 307: kobject.parent is set to parent.
Line 308: kobject_add_internal is called. It is defined in lib/kobject.c as follows:
156 static int kobject_add_internal(struct kobject *kobj)
157 {
158     int error = 0;
159     struct kobject *parent;
160
161     if (!kobj)
162         return -ENOENT;
163
164     if (!kobj->name || !kobj->name[0]) {
165         WARN(1, "kobject: (%p): attempted to be registered with empty "
166              "name!\n", kobj);
167         return -EINVAL;
168     }
169
170     parent = kobject_get(kobj->parent);
171
172     /* join kset if set, use it as parent if we do not already have one */
173     if (kobj->kset) {
174         if (!parent)
175             parent = kobject_get(&kobj->kset->kobj);
176         kobj_kset_join(kobj);
177         kobj->parent = parent;
178     }
179
180     pr_debug("kobject: '%s' (%p): %s: parent: '%s', set: '%s'\n",
181              kobject_name(kobj), kobj, __func__,
182              parent ? kobject_name(parent) : "<NULL>",
183              kobj->kset ? kobject_name(&kobj->kset->kobj) : "<NULL>");
184
185     error = create_dir(kobj);
186     if (error) {
187         kobj_kset_leave(kobj);
188         kobject_put(parent);
189         kobj->parent = NULL;
190
191         /* be noisy on error issues */
192         if (error == -EEXIST)
193             WARN(1, "%s failed for %s with "
194                  "-EEXIST, don't try to register things with "
195                  "the same name in the same directory.\n",
196                  __func__, kobject_name(kobj));
197         else
198             WARN(1, "%s failed for %s (error: %d parent: %s)\n",
199                  __func__, kobject_name(kobj), error,
200                  parent ? kobject_name(parent) : "'none'");
201     } else
202         kobj->state_in_sysfs = 1;
203
204     return error;
205 }
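Lines 164-168: check whether kobject.name has been set; if it is NULL or empty, the function returns -EINVAL. The name must therefore have been set beforehand.
Line 170: kobject.parent is obtained through kobject_get. That function is defined in lib/kobject.c as follows: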
521 /**
522  * kobject_get - increment refcount for object.
523  * @kobj: object.
524  */
525 struct kobject *kobject_get(struct kobject *kobj)
526 {
527     if (kobj)
528         kref_get(&kobj->kref);
529     return kobj;
530 }
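Note that besides returning the kobject, this function also increments its reference count by calling kref_get.
Lines 173-178: if kobject.kset is not NULL, kobj_kset_join is called to link kobject.entry into kobject.kset.list. In addition, if parent is NULL, the kobject embedded in the kset (kset.kobj) becomes kobject.parent.
Line 185: create_dir is called to create the sysfs directory for the kobject. It is defined in lib/kobject.c as follows: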
47 static int create_dir(struct kobject *kobj)
48 {
49     int error = 0;
50     error = sysfs_create_dir(kobj);
51     if (!error) {
52         error = populate_dir(kobj);
53         if (error)
54             sysfs_remove_dir(kobj);
55     }
56     return error;
57 }
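sysfs_create_dir creates the directory in sysfs; if that succeeds, populate_dir fills the directory with the kobject's default attribute files, and the directory is removed again if populate_dir fails. We do not follow these calls any further here; sysfs_create_dir will be covered later in the analysis of sysfs.
Line 202: if the directory was created successfully in sysfs, kobj->state_in_sysfs is set to 1.
This completes the analysis of kobject_add_internal, and with it kobject_add_varg and kobject_init_and_add. As we have seen, kobject_init_and_add initializes the kobject, places it in the kobject hierarchy and adds it to sysfs.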
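The whole flow of this section can be condensed into a small user-space model. This is a sketch under assumptions, not kernel code: kobj_model, kset_model, model_get, model_init and model_init_and_add are hypothetical names, a plain int replaces struct kref, the name lives in a fixed 32-byte array, the kset list handling (kobj_kset_join) is left out, and directory creation is assumed to succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct kset_model;

/* Hypothetical userspace model; field names follow struct kobject. */
struct kobj_model {
	char name[32];
	struct kobj_model *parent;
	struct kset_model *kset;
	int refcount;			/* plain int stands in for struct kref */
	unsigned int state_initialized:1;
	unsigned int state_in_sysfs:1;
};

/* A kset embeds a kobject of its own, as kobj->kset->kobj suggests. */
struct kset_model {
	struct kobj_model kobj;
};

/* Like kobject_get(): bump the count and hand the pointer back. */
struct kobj_model *model_get(struct kobj_model *k)
{
	if (k)
		k->refcount++;
	return k;
}

/* Like kobject_init_internal(): one reference, not yet in sysfs. */
void model_init(struct kobj_model *k)
{
	k->refcount = 1;
	k->state_in_sysfs = 0;
	k->state_initialized = 1;
}

int model_init_and_add(struct kobj_model *k, struct kobj_model *parent,
		       const char *fmt, ...)
{
	va_list args;
	int n;

	model_init(k);

	/* Like kobject_set_name_vargs(): build the name from fmt and args. */
	va_start(args, fmt);
	n = vsnprintf(k->name, sizeof(k->name), fmt, args);
	va_end(args);
	if (n < 0 || k->name[0] == '\0')
		return -EINVAL;

	/* Like kobject_add_internal(): pin the parent, fall back to the kset. */
	k->parent = parent;
	parent = model_get(k->parent);
	if (k->kset) {
		if (!parent)
			parent = model_get(&k->kset->kobj);
		k->parent = parent;
	}

	/* Assumption of the model: creating the directory never fails. */
	k->state_in_sysfs = 1;
	return 0;
}
```

A caller would zero the object, set its kset if it has one, and then call model_init_and_add(&child, NULL, "dev%d", 3); with no explicit parent, the kset's embedded object becomes the parent and gains one reference.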
3. kobject attributes
kobject.ktype describes the type of the kobject. It is of type struct kobj_type, which is defined in include/linux/kobject.h as follows:
108 struct kobj_type {
109     void (*release)(struct kobject *kobj);
110     const struct sysfs_ops *sysfs_ops;
111     struct attribute **default_attrs;
112     const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj);
113     const void *(*namespace)(struct kobject *kobj);
114 };
Line 109: release performs cleanup work when the kobject is freed.
Line 110: sysfs_ops is a pointer to struct sysfs_ops, which is defined in include/linux/sysfs.h as follows:
124 struct sysfs_ops {
125     ssize_t (*show)(struct kobject *, struct attribute *,char *);
126     ssize_t (*store)(struct kobject *,struct attribute *,const char *, size_t);
127     const void *(*namespace)(struct kobject *, const struct attribute *);
128 };
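sysfs_ops defines the functions that operate on a struct attribute. show fills the memory pointed to by its third parameter with the content to be displayed, which is eventually delivered to user space; store sets the content of the attribute.
Line 111: default_attrs is a NULL-terminated array of pointers to struct attribute; it holds the default attributes of the kobject. struct attribute is defined in include/linux/sysfs.h as follows: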
26 struct attribute {
27     const char *name;
28     umode_t mode;
29 #ifdef CONFIG_DEBUG_LOCK_ALLOC
30     bool ignore_lockdep:1;
31     struct lock_class_key *key;
32     struct lock_class_key skey;
33 #endif
34 };
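Once a kobject has been added to sysfs, attribute files are created in its directory, one for each attribute listed in default_attrs. sysfs_ops->show is called when an attribute file is read, and sysfs_ops->store is called when it is written. Let us now look at how these two functions get called.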
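To make the roles of default_attrs, show and store concrete, here is a small user-space model. Everything in it is an assumption made for illustration: attr_model, ops_model, struct counter and the counter_* functions are hypothetical names, and the object is passed as void * instead of struct kobject *. Only the overall shape (a NULL-terminated attribute array plus a show/store pair that receives the attribute and a buffer) follows the structures quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-ins for struct attribute and struct sysfs_ops. */
struct attr_model {
	const char *name;
	unsigned int mode;
};

struct ops_model {
	ssize_t (*show)(void *obj, struct attr_model *attr, char *buf);
	ssize_t (*store)(void *obj, struct attr_model *attr,
			 const char *buf, size_t count);
};

/* Hypothetical object that exposes one integer as attribute "value". */
struct counter {
	int value;
};

struct attr_model value_attr = { "value", 0644 };

/* NULL-terminated, in the same way as kobj_type.default_attrs. */
struct attr_model *counter_default_attrs[] = { &value_attr, NULL };

/* show: format the attribute into buf and return the byte count. */
ssize_t counter_show(void *obj, struct attr_model *attr, char *buf)
{
	struct counter *c = obj;

	if (strcmp(attr->name, "value") != 0)
		return -1;
	return sprintf(buf, "%d\n", c->value);
}

/* store: parse buf into the attribute and report all bytes as consumed. */
ssize_t counter_store(void *obj, struct attr_model *attr,
		      const char *buf, size_t count)
{
	struct counter *c = obj;

	if (strcmp(attr->name, "value") != 0 || sscanf(buf, "%d", &c->value) != 1)
		return -1;
	return (ssize_t)count;
}

struct ops_model counter_ops = { counter_show, counter_store };

/* Walk the array until the NULL terminator; one file per entry. */
int count_default_attrs(struct attr_model **attrs)
{
	int n = 0;

	while (attrs && attrs[n])
		n++;
	return n;
}
```

Because a single show/store pair serves every attribute of the type, the callbacks receive the attribute as a parameter and decide what to do based on it; the model does this by comparing names.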
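When a user-space program wants to read or write an attribute file, it first has to open the file with open(). After a chain of function calls, sysfs_open_file is invoked. It is defined in fs/sysfs/file.c as follows: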
326 static int sysfs_open_file(struct inode *inode, struct file *file)
327 {
328     struct sysfs_dirent *attr_sd = file->f_path.dentry->d_fsdata;
329     struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
330     struct sysfs_buffer *buffer;
331     const struct sysfs_ops *ops;
332     int error = -EACCES;
333
334     /* need attr_sd for attr and ops, its parent for kobj */
335     if (!sysfs_get_active(attr_sd))
336         return -ENODEV;
337
338     /* every kobject with an attribute needs a ktype assigned */
339     if (kobj->ktype && kobj->ktype->sysfs_ops)
340         ops = kobj->ktype->sysfs_ops;
341     else {
342         WARN(1, KERN_ERR "missing sysfs attribute operations for "
343              "kobject: %s\n", kobject_name(kobj));
344         goto err_out;
345     }
346
347     /* File needs write support.
348      * The inode's perms must say it's ok,
349      * and we must have a store method.
350      */
351     if (file->f_mode & FMODE_WRITE) {
352         if (!(inode->i_mode & S_IWUGO) || !ops->store)
353             goto err_out;
354     }
355
356     /* File needs read support.
357      * The inode's perms must say it's ok, and we there
358      * must be a show method for it.
359      */
360     if (file->f_mode & FMODE_READ) {
361         if (!(inode->i_mode & S_IRUGO) || !ops->show)
362             goto err_out;
363     }
364
365     /* No error? Great, allocate a buffer for the file, and store it
366      * it in file->private_data for easy access.
367      */
368     error = -ENOMEM;
369     buffer = kzalloc(sizeof(struct sysfs_buffer), GFP_KERNEL);
370     if (!buffer)
371         goto err_out;
372
373     mutex_init(&buffer->mutex);
374     buffer->needs_read_fill = 1;
375     buffer->ops = ops;
376     file->private_data = buffer;
377
378     /* make sure we have open dirent struct */
379     error = sysfs_get_open_dirent(attr_sd, buffer);
380     if (error)
381         goto err_free;
382
383     /* open succeeded, put active references */
384     sysfs_put_active(attr_sd);
385     return 0;
386
387  err_free:
388     kfree(buffer);
389  err_out:
390     sysfs_put_active(attr_sd);
391     return error;
392 }
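Line 340: kobj->ktype->sysfs_ops is assigned to ops. This gives us the sysfs_ops structure and therefore the sysfs_ops.show and sysfs_ops.store functions.
Line 374: buffer->needs_read_fill is set to 1; this value is used later when a read is performed.
Line 375: buffer->ops is set to ops.
Line 376: buffer is saved in file->private_data so that other functions can reach it easily.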
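Lines 351-363 of sysfs_open_file also decide whether the open is allowed at all. That check can be modelled in user space as follows; model_open_check and the MODEL_* macros are hypothetical names, and the two masks are written out as the octal values 0444 and 0222, which is what S_IRUGO and S_IWUGO expand to.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Read and write permission bits for user, group and other. */
#define MODEL_S_IRUGO 0444u
#define MODEL_S_IWUGO 0222u

/*
 * Hypothetical model of the checks in sysfs_open_file():
 * opening for write needs a writable mode and a store method,
 * opening for read needs a readable mode and a show method.
 * Returns 0 if the open may proceed, -EACCES otherwise.
 */
int model_open_check(unsigned int inode_mode, bool want_read, bool want_write,
		     bool has_show, bool has_store)
{
	if (want_write && (!(inode_mode & MODEL_S_IWUGO) || !has_store))
		return -EACCES;
	if (want_read && (!(inode_mode & MODEL_S_IRUGO) || !has_show))
		return -EACCES;
	return 0;
}
```

For example, a 0444 attribute can be opened read-only when a show method exists, while a 0644 attribute whose ktype has no store method still cannot be opened for writing.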
When a user-space program reads an attribute file, the call eventually reaches sysfs_read_file. It is defined in fs/sysfs/file.c as follows:
108 /**
109  * sysfs_read_file - read an attribute.
110  * @file: file pointer.
111  * @buf: buffer to fill.
112  * @count: number of bytes to read.
113  * @ppos: starting offset in file.
114  *
115  * Userspace wants to read an attribute file. The attribute descriptor
116  * is in the file's ->d_fsdata. The target object is in the directory's
117  * ->d_fsdata.
118  *
119  * We call fill_read_buffer() to allocate and fill the buffer from the
120  * object's show() method exactly once (if the read is happening from
121  * the beginning of the file). That should fill the entire buffer with
122  * all the data the object has to offer for that attribute.
123  * We then call flush_read_buffer() to copy the buffer to userspace
124  * in the increments specified.
125  */
126
127 static ssize_t
128 sysfs_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos)
129 {
130     struct sysfs_buffer * buffer = file->private_data;
131     ssize_t retval = 0;
132
133     mutex_lock(&buffer->mutex);
134     if (buffer->needs_read_fill || *ppos == 0) {
135         retval = fill_read_buffer(file->f_path.dentry,buffer);
136         if (retval)
137             goto out;
138     }
139     pr_debug("%s: count = %zd, ppos = %lld, buf = %s\n",
140              __func__, count, *ppos, buffer->page);
141     retval = simple_read_from_buffer(buf, count, ppos, buffer->page,
142                                      buffer->count);
143 out:
144     mutex_unlock(&buffer->mutex);
145     return retval;
146 }
Line 134: because sysfs_open_file set buffer->needs_read_fill to 1, fill_read_buffer on line 135 is called. It calls sysfs_ops.show to produce the data to be shown to the user. fill_read_buffer is also defined in fs/sysfs/file.c, as follows:
56 /**
57  * fill_read_buffer - allocate and fill buffer from object.
58  * @dentry: dentry pointer.
59  * @buffer: data buffer for file.
60  *
61  * Allocate @buffer->page, if it hasn't been already, then call the
62  * kobject's show() method to fill the buffer with this attribute's
63  * data.
64  * This is called only once, on the file's first read unless an error
65  * is returned.
66  */
67 static int fill_read_buffer(struct dentry * dentry, struct sysfs_buffer * buffer)
68 {
69     struct sysfs_dirent *attr_sd = dentry->d_fsdata;
70     struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
71     const struct sysfs_ops * ops = buffer->ops;
72     int ret = 0;
73     ssize_t count;
74
75     if (!buffer->page)
76         buffer->page = (char *) get_zeroed_page(GFP_KERNEL);
77     if (!buffer->page)
78         return -ENOMEM;
79
80     /* need attr_sd for attr and ops, its parent for kobj */
81     if (!sysfs_get_active(attr_sd))
82         return -ENODEV;
83
84     buffer->event = atomic_read(&attr_sd->s_attr.open->event);
85     count = ops->show(kobj, attr_sd->s_attr.attr, buffer->page);
86
87     sysfs_put_active(attr_sd);
88
89     /*
90      * The code works fine with PAGE_SIZE return but it's likely to
91      * indicate truncated result or overflow in normal use cases.
92      */
93     if (count >= (ssize_t)PAGE_SIZE) {
94         print_symbol("fill_read_buffer: %s returned bad count\n",
95                      (unsigned long)ops->show);
96         /* Try to struggle along */
97         count = PAGE_SIZE - 1;
98     }
99     if (count >= 0) {
100         buffer->needs_read_fill = 0;
101         buffer->count = count;
102     } else {
103         ret = count;
104     }
105     return ret;
106 }
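Lines 75-78: if buffer->page is NULL, that is, no memory has been allocated for it yet, get_zeroed_page is called to allocate one page.
Line 85: ops->show(kobj, attr_sd->s_attr.attr, buffer->page) is called. Note the arguments of ops->show, that is, sysfs_ops.show: the third one is the buffer->page allocated above, and the content of buffer->page is what eventually gets copied to user space. This makes the job of sysfs_ops.show clear: it fills the memory given by its third parameter (one page) with the content to be shown to user space.
Back in sysfs_read_file, after fill_read_buffer on line 135 has run, line 141 calls simple_read_from_buffer(buf, count, ppos, buffer->page, buffer->count), which copies the data to user space. Note that the first argument, buf, is the address of the buffer passed in from user space, and the fourth argument, buffer->page, is the page of memory filled by sysfs_ops.show. The function is defined in fs/libfs.c as follows: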
568 /**
569  * simple_read_from_buffer - copy data from the buffer to user space
570  * @to: the user space buffer to read to
571  * @count: the maximum number of bytes to read
572  * @ppos: the current position in the buffer
573  * @from: the buffer to read from
574  * @available: the size of the buffer
575  *
576  * The simple_read_from_buffer() function reads up to @count bytes from the
577  * buffer @from at offset @ppos into the user space address starting at @to.
578  *
579  * On success, the number of bytes read is returned and the offset @ppos is
580  * advanced by this number, or negative value is returned on error.
581  **/
582 ssize_t simple_read_from_buffer(void __user *to, size_t count, loff_t *ppos,
583                                 const void *from, size_t available)
584 {
585     loff_t pos = *ppos;
586     size_t ret;
587
588     if (pos < 0)
589         return -EINVAL;
590     if (pos >= available || !count)
591         return 0;
592     if (count > available - pos)
593         count = available - pos;
594     ret = copy_to_user(to, from + pos, count);
595     if (ret == count)
596         return -EFAULT;
597     count -= ret;
598     *ppos = pos + count;
599     return count;
600 }
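As we can see, this function copies the content of buffer->page, starting at offset *ppos, into the buffer supplied by the user, so user space obtains the data it asked for.
At this point the purpose of sysfs_ops.show and the path along which it is called should be clear.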
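The read path (fill the page once through show, then hand it out in chunks while advancing the offset) can be modelled in user space. This is a sketch under assumptions: buf_model, demo_show and model_read are hypothetical names, memcpy replaces copy_to_user, MODEL_PAGE_SIZE is fixed at 4096 for the model only, and the extra fills counter exists purely so that the fill-once behaviour can be observed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

#define MODEL_PAGE_SIZE 4096	/* assumption; the real page size varies */

/* Hypothetical stand-in for struct sysfs_buffer. */
struct buf_model {
	char page[MODEL_PAGE_SIZE];
	size_t count;		/* number of valid bytes in page */
	int needs_read_fill;
	int fills;		/* how many times show ran (model only) */
};

/* Stand-in for a show() method: always produces "hello\n". */
ssize_t demo_show(char *page)
{
	return sprintf(page, "hello\n");
}

/* Models sysfs_read_file() followed by simple_read_from_buffer(). */
ssize_t model_read(struct buf_model *b, char *to, size_t count, long *ppos)
{
	long pos;

	/* Fill the page on the first read or when reading from offset 0. */
	if (b->needs_read_fill || *ppos == 0) {
		ssize_t n = demo_show(b->page);

		if (n < 0)
			return n;
		b->count = (size_t)n;
		b->needs_read_fill = 0;
		b->fills++;
	}

	pos = *ppos;
	if (pos < 0)
		return -1;
	if ((size_t)pos >= b->count || count == 0)
		return 0;			/* end of data */
	if (count > b->count - (size_t)pos)
		count = b->count - (size_t)pos;	/* clamp to what is left */
	memcpy(to, b->page + pos, count);	/* copy_to_user in the kernel */
	*ppos = pos + (long)count;
	return (ssize_t)count;
}
```

Reading the 6-byte value with a 4-byte buffer takes three calls in this model: the first fills the page and returns 4 bytes, the second returns the remaining 2 without calling demo_show again, and the third returns 0 to signal the end of the data.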
In the same way, let us look at how sysfs_ops.store gets called.
A write to an attribute file eventually reaches sysfs_write_file. It is defined in fs/sysfs/file.c as follows:
210 /**
211  * sysfs_write_file - write an attribute.
212  * @file: file pointer
213  * @buf: data to write
214  * @count: number of bytes
215  * @ppos: starting offset
216  *
217  * Similar to sysfs_read_file(), though working in the opposite direction.
218  * We allocate and fill the data from the user in fill_write_buffer(),
219  * then push it to the kobject in flush_write_buffer().
220  * There is no easy way for us to know if userspace is only doing a partial
221  * write, so we don't support them. We expect the entire buffer to come
222  * on the first write.
223  * Hint: if you're writing a value, first read the file, modify only the
224  * the value you're changing, then write entire buffer back.
225  */
226
227 static ssize_t
228 sysfs_write_file(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
229 {
230     struct sysfs_buffer * buffer = file->private_data;
231     ssize_t len;
232
233     mutex_lock(&buffer->mutex);
234     len = fill_write_buffer(buffer, buf, count);
235     if (len > 0)
236         len = flush_write_buffer(file->f_path.dentry, buffer, len);
237     if (len > 0)
238         *ppos += len;
239     mutex_unlock(&buffer->mutex);
240     return len;
241 }
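Line 234: fill_write_buffer(buffer, buf, count) is called to save the data passed in from user space into buffer->page. At most PAGE_SIZE - 1 bytes are copied, and the data is terminated with a NUL byte. The function is defined in fs/sysfs/file.c as follows: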
148 /**
149  * fill_write_buffer - copy buffer from userspace.
150  * @buffer: data buffer for file.
151  * @buf: data from user.
152  * @count: number of bytes in @userbuf.
153  *
154  * Allocate @buffer->page if it hasn't been already, then
155  * copy the user-supplied buffer into it.
156  */
157
158 static int
159 fill_write_buffer(struct sysfs_buffer * buffer, const char __user * buf, size_t count)
160 {
161     int error;
162
163     if (!buffer->page)
164         buffer->page = (char *)get_zeroed_page(GFP_KERNEL);
165     if (!buffer->page)
166         return -ENOMEM;
167
168     if (count >= PAGE_SIZE)
169         count = PAGE_SIZE - 1;
170     error = copy_from_user(buffer->page,buf,count);
171     buffer->needs_read_fill = 1;
172     /* if buf is assumed to contain a string, terminate it by \0,
173        so e.g. sscanf() can scan the string easily */
174     buffer->page[count] = 0;
175     return error ? -EFAULT : count;
176 }
Line 236: flush_write_buffer is called. It calls sysfs_ops.store to write the data passed in from user space into the kobject. flush_write_buffer is defined in fs/sysfs/file.c as follows:
179 /**
180  * flush_write_buffer - push buffer to kobject.
181  * @dentry: dentry to the attribute
182  * @buffer: data buffer for file.
183  * @count: number of bytes
184  *
185  * Get the correct pointers for the kobject and the attribute we're
186  * dealing with, then call the store() method for the attribute,
187  * passing the buffer that we acquired in fill_write_buffer().
188  */
189
190 static int
191 flush_write_buffer(struct dentry * dentry, struct sysfs_buffer * buffer, size_t count)
192 {
193     struct sysfs_dirent *attr_sd = dentry->d_fsdata;
194     struct kobject *kobj = attr_sd->s_parent->s_dir.kobj;
195     const struct sysfs_ops * ops = buffer->ops;
196     int rc;
197
198     /* need attr_sd for attr and ops, its parent for kobj */
199     if (!sysfs_get_active(attr_sd))
200         return -ENODEV;
201
202     rc = ops->store(kobj, attr_sd->s_attr.attr, buffer->page, count);
203
204     sysfs_put_active(attr_sd);
205
206     return rc;
207 }
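The write path can be modelled in user space in the same spirit. This is only a sketch under assumptions: model_fill_write, demo_store and model_write are hypothetical names, memcpy replaces copy_from_user, MODEL_PAGE_SIZE is fixed at 4096 for the model, and demo_store is an invented store method that parses a decimal integer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

#define MODEL_PAGE_SIZE 4096	/* assumption; the real page size varies */

/*
 * Models fill_write_buffer(): clamp the length to one page minus one
 * byte, copy the data and terminate it with a NUL byte.
 */
size_t model_fill_write(char *page, const char *user, size_t count)
{
	if (count >= MODEL_PAGE_SIZE)
		count = MODEL_PAGE_SIZE - 1;
	memcpy(page, user, count);	/* copy_from_user in the kernel */
	page[count] = '\0';
	return count;
}

/* Hypothetical store() method: parse a decimal integer from the page. */
ssize_t demo_store(int *value, const char *page, size_t count)
{
	if (sscanf(page, "%d", value) != 1)
		return -1;
	return (ssize_t)count;	/* report every byte as consumed */
}

/* Models sysfs_write_file(): fill the page, then push it to store(). */
ssize_t model_write(int *value, const char *user, size_t count)
{
	static char page[MODEL_PAGE_SIZE];
	size_t len = model_fill_write(page, user, count);

	if (len > 0)
		return demo_store(value, page, len);
	return (ssize_t)len;
}
```

The NUL terminator is what lets a store method use string functions such as sscanf on the page safely, and the clamp to one byte less than a page is why a single write to an attribute cannot deliver more than that.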