Linux Thermal

The text and figures are reposted from http://kernel.meizu.com/linux-thermal-framework-intro.html
The code analysis is my own.

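Linux Thermal is the Linux subsystem responsible for temperature control. Its main job is to manage the heat the chips generate while the system runs, keeping both the chip temperature and the device enclosure within a safe range.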

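The main Thermal framework
Implementing temperature control requires a device that measures temperature, a device that controls temperature, and policies that decide how the controlling devices are used.
Temperature-measuring devices are abstracted in the Thermal framework as Thermal Zone Devices;
Temperature-controlling devices are abstracted in the Thermal framework as Thermal Cooling Devices;
Policies are abstracted in the Thermal framework as Thermal Governors.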

[Figure 1: Linux Thermal]

Thermal Zone Device
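As noted above, a Thermal Zone Device is the abstraction of a temperature-measuring device. How is it abstracted? RTFSC.
The code shows that a device able to report temperature mainly provides three kinds of operations: bind functions, a get-temperature function, and a get-trip-point-temperature function.
Bind functions: used by the Thermal core for binding, covered later;
Get-temperature function: reads the device temperature. An SoC usually has internal temperature sensors, and some thermistors yield a temperature through an ADC; this function returns those values;
Get-trip-point-temperature function: this is a key point of the thermal framework. To control temperature, something has to describe when control should kick in, and that something is the trip point. Each thermal zone device defines a number of trip points, and the temperature of each trip point is obtained through this function;
The structures are defined in ./include/linux/thermal.h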

struct thermal_zone_device {
	int id;
	char type[THERMAL_NAME_LENGTH];
	struct device device;
	struct thermal_attr *trip_temp_attrs;
	struct thermal_attr *trip_type_attrs;
	struct thermal_attr *trip_hyst_attrs;
	void *devdata;
	int trips;
	unsigned long trips_disabled;	/* bitmap for disabled trips */
	/* polling intervals, in milliseconds */
	int passive_delay;
	int polling_delay;
	int temperature;
	int last_temperature;
	int emul_temperature;
	int passive;
	unsigned int forced_passive;
	atomic_t need_update;
	/* device operation callbacks */
	struct thermal_zone_device_ops *ops;
	struct thermal_zone_params *tzp;
	/* cooling policy */
	struct thermal_governor *governor;
	void *governor_data;
	/* important: list head of this zone's instances; @thermal_instances: list of &struct thermal_instance of this thermal zone */
	struct list_head thermal_instances;
	struct idr idr;
	struct mutex lock;
	struct list_head node;
	/* delayed_work used for the periodic polling loop */
	struct delayed_work poll_queue;
	struct sensor_threshold tz_threshold[2];
	struct sensor_info sensor;
};
struct thermal_zone_device_ops {
	/* bind functions */
	int (*bind) (struct thermal_zone_device *,struct thermal_cooling_device *);
	int (*unbind) (struct thermal_zone_device *,struct thermal_cooling_device *);
	/* get-temperature function */
	int (*get_temp) (struct thermal_zone_device *, unsigned long *);
	int (*get_mode) (struct thermal_zone_device *,enum thermal_device_mode *);
	int (*set_mode) (struct thermal_zone_device *,enum thermal_device_mode);
	int (*get_trip_type) (struct thermal_zone_device *, int,enum thermal_trip_type *);
	int (*activate_trip_type) (struct thermal_zone_device *, int,enum thermal_trip_activation_mode);
	/* get the temperature of a trip point */
	int (*get_trip_temp) (struct thermal_zone_device *, int,unsigned long *);
	int (*set_trip_temp) (struct thermal_zone_device *, int,unsigned long);
	int (*get_trip_hyst) (struct thermal_zone_device *, int,unsigned long *);
	int (*set_trip_hyst) (struct thermal_zone_device *, int,unsigned long);
	int (*get_crit_temp) (struct thermal_zone_device *, unsigned long *);
	int (*set_emul_temp) (struct thermal_zone_device *, unsigned long);
	int (*get_trend) (struct thermal_zone_device *, int,enum thermal_trend *);
	int (*notify) (struct thermal_zone_device *, int,enum thermal_trip_type);
};
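
The callback pattern described above can be tried out in userspace. The following is a minimal sketch, not kernel code: `struct mock_zone`, `struct mock_zone_ops` and `mock_count_crossed_trips()` are hypothetical names made up for illustration, and temperatures are assumed to be in millidegrees Celsius.

```c
/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
struct mock_zone;

struct mock_zone_ops {
	/* read the current temperature of the zone */
	int (*get_temp)(struct mock_zone *tz, long *temp);
	/* read the temperature of trip point number 'trip' */
	int (*get_trip_temp)(struct mock_zone *tz, int trip, long *temp);
};

struct mock_zone {
	const struct mock_zone_ops *ops;
	int trips;              /* number of trip points */
	const long *trip_temps; /* assumed unit: millidegrees Celsius */
	long raw_reading;       /* pretend sensor value */
};

static int mock_get_temp(struct mock_zone *tz, long *temp)
{
	*temp = tz->raw_reading;
	return 0;
}

static int mock_get_trip_temp(struct mock_zone *tz, int trip, long *temp)
{
	if (trip < 0 || trip >= tz->trips)
		return -1;
	*temp = tz->trip_temps[trip];
	return 0;
}

static const struct mock_zone_ops mock_ops = {
	.get_temp = mock_get_temp,
	.get_trip_temp = mock_get_trip_temp,
};

/* Count how many trip points the current temperature has reached. */
static int mock_count_crossed_trips(struct mock_zone *tz)
{
	long temp, trip_temp;
	int i, crossed = 0;

	if (tz->ops->get_temp(tz, &temp))
		return -1;
	for (i = 0; i < tz->trips; i++) {
		if (tz->ops->get_trip_temp(tz, i, &trip_temp))
			return -1;
		if (temp >= trip_temp)
			crossed++;
	}
	return crossed;
}
```

The caller only ever goes through `ops`, which is the point of the abstraction: a different sensor just supplies a different pair of callbacks.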

Thermal Cooling Devices
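Thermal Cooling Devices are the abstraction of devices that can lower the temperature. A fan is an obvious example, but how should a CPU or GPU be understood as a Cooling Device?
CPU and GPU cooling devices lower the temperature by generating less heat, whereas fans and heat sinks speed up heat dissipation.
The abstraction treats every cooling-capable device as having a number of individually selectable states; a fan, for example, has different speed states.
A CPU/GPU cooling device has different maximum-operating-frequency states, so when the temperature rises it can be brought down by switching between these states;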

struct thermal_cooling_device {
	int id;
	char type[THERMAL_NAME_LENGTH];
	struct device device;
	struct device_node *np;
	void *devdata;
	/* operation callbacks */
	const struct thermal_cooling_device_ops *ops;
	bool updated; /* true if the cooling device does not need update */
	struct mutex lock; /* protect thermal_instances list */
	/* as above: list head of this cooling device's instances */
	struct list_head thermal_instances;
	struct list_head node;
};
struct thermal_cooling_device_ops {
	int (*get_max_state) (struct thermal_cooling_device *, unsigned long *);
	int (*get_cur_state) (struct thermal_cooling_device *, unsigned long *);
	/* set the cooling state (level) */
	int (*set_cur_state) (struct thermal_cooling_device *, unsigned long);
	int (*get_requested_power)(struct thermal_cooling_device *,struct thermal_zone_device *, u32 *);
	int (*state2power)(struct thermal_cooling_device *,struct thermal_zone_device *, unsigned long, u32 *);
	int (*power2state)(struct thermal_cooling_device *,struct thermal_zone_device *, u32, unsigned long *);
};
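
To make the "discrete states" idea concrete, here is a small userspace sketch of a CPU-style cooling device whose states map to frequency caps. It is an illustration under assumptions, not kernel code: `struct mock_cdev`, its helpers and the frequency table are all hypothetical, and state 0 is assumed to mean "no throttling".

```c
/* Hypothetical userspace model of a cooling device with discrete states. */
struct mock_cdev {
	unsigned long max_state;           /* highest valid state */
	unsigned long cur_state;           /* currently selected state */
	const unsigned long *freq_cap_khz; /* one cap per state, state 0 first */
};

static int mock_get_max_state(struct mock_cdev *cdev, unsigned long *state)
{
	*state = cdev->max_state;
	return 0;
}

/* Select a cooling state; states above max_state are rejected. */
static int mock_set_cur_state(struct mock_cdev *cdev, unsigned long state)
{
	if (state > cdev->max_state)
		return -1;
	cdev->cur_state = state;
	return 0;
}

/* Frequency cap implied by the current state (deeper state, lower cap). */
static unsigned long mock_current_cap_khz(const struct mock_cdev *cdev)
{
	return cdev->freq_cap_khz[cdev->cur_state];
}
```

A fan would use the same three helpers with a table of fan speeds instead of frequency caps.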

Thermal Governor
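A Thermal Governor is the abstraction of a cooling policy: a method for choosing thermal cooling device states based on temperature. A simple example: if the temperature is rising quickly, run the fan at speed 3; if it is rising slowly, run it at speed 1. That is a Governor.
The structure is simple: every policy is implemented through the throttle callback. The kernel already ships several policies, including step_wise, user_space, power_allocator and bang_bang; the details of each algorithm are not covered here.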

/**
 * struct thermal_governor - structure that holds thermal governor information
 * @name:	name of the governor
 * @bind_to_tz: callback called when binding to a thermal zone.  If it
 *		returns 0, the governor is bound to the thermal zone,
 *		otherwise it fails.
 * @unbind_from_tz:	callback called when a governor is unbound from a
 *			thermal zone.
 * @throttle:	callback called for every trip point even if temperature is
 *		below the trip point temperature
 * @governor_list:	node in thermal_governor_list (in thermal_core.c)
 */
struct thermal_governor {
	char name[THERMAL_NAME_LENGTH];
	int (*bind_to_tz)(struct thermal_zone_device *tz);
	void (*unbind_from_tz)(struct thermal_zone_device *tz);
	/* policy callback */
	int (*throttle)(struct thermal_zone_device *tz, int trip);
	struct list_head	governor_list;
};
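
As a rough illustration of what a throttle decision can look like, the sketch below steps a cooling state up or down by one level. It is a deliberately simplified, hypothetical policy written for userspace; it is not the kernel's step_wise implementation, and `mock_next_state()` and `enum mock_trend` are invented names.

```c
/* Hypothetical temperature trend, loosely modelled on the idea of a trend. */
enum mock_trend {
	MOCK_TREND_STABLE,
	MOCK_TREND_RAISING,
	MOCK_TREND_DROPPING,
};

/*
 * Pick the next cooling state for one trip point:
 *  - at or above the trip temperature and still rising: one step deeper,
 *    but never beyond 'upper';
 *  - at or above the trip temperature but not rising: hold the state;
 *  - below the trip temperature: one step shallower, never below 'lower'.
 */
static unsigned long mock_next_state(unsigned long cur,
				     unsigned long lower, unsigned long upper,
				     long temp, long trip_temp,
				     enum mock_trend trend)
{
	if (temp >= trip_temp) {
		if (trend == MOCK_TREND_RAISING && cur < upper)
			return cur + 1;
		return cur;
	}
	if (cur > lower)
		return cur - 1;
	return cur;
}
```

Calling this once per polling cycle moves the device gradually toward deeper cooling while the zone is hot and gradually back once it has cooled.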

Thermal Core
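With temperature-measuring devices, temperature-controlling devices and control policies in place, the Thermal Core is responsible for tying them together. RTFSC.
1. Registration functions: the Thermal Core exposes registration interfaces through which thermal zone devices, thermal cooling devices and thermal governors register themselves.
thermal_zone_device_register() adds a new thermal zone device (sensor) under /sys/class/thermal, named thermal_zone[0-*], and at the same time tries to bind all registered thermal cooling devices to it. It returns a pointer to the newly created thermal_zone_device.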

struct thermal_zone_device *thermal_zone_device_register(const char *type,int trips, int mask, void *devdata,struct thermal_zone_device_ops *ops,struct thermal_zone_params *tzp,int passive_delay, int polling_delay)
thermal_zone_device_register() - register a new thermal zone device
@type:	the thermal zone device type
@trips:	the number of trip points the thermal zone supports
@mask:	a bit string indicating the writeability of trip points
@devdata:	private device data
@ops:	standard thermal zone device callbacks
@tzp:	thermal zone platform parameters
@passive_delay: number of milliseconds to wait between polls when performing passive cooling
@polling_delay: number of milliseconds to wait between polls when checking whether trip points have been crossed (0 for interrupt driven systems)
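
Once a zone is registered, it can be inspected from userspace through sysfs. The sketch below reads the `temp` attribute of a zone, which reports millidegrees Celsius. The helper names `zone_temp_path()` and `read_zone_temp()` are hypothetical, and whether a given zone number exists depends on the machine.

```c
#include <stdio.h>

/* Build the sysfs path of a zone's temperature attribute. */
static int zone_temp_path(int id, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/class/thermal/thermal_zone%d/temp", id);

	/* fail on encoding errors and on truncation */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the zone temperature in millidegrees Celsius; 0 on success. */
static int read_zone_temp(int id, long *millicelsius)
{
	char path[64];
	FILE *f;
	int ok;

	if (zone_temp_path(id, path, sizeof(path)))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1; /* zone does not exist or is not readable */
	ok = fscanf(f, "%ld", millicelsius) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would typically try `read_zone_temp(0, &t)` and divide `t` by 1000 to print degrees Celsius.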

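thermal_cooling_device_register() adds a new thermal cooling device (fan/processor/…) to the /sys/class/thermal/ folder as cooling_device[0-*] and tries to bind itself to all thermal zone devices registered at that time. It returns a pointer to the thermal_cooling_device structure.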

struct thermal_cooling_device * thermal_cooling_device_register(char *type, void *devdata,const struct thermal_cooling_device_ops *ops)
thermal_cooling_device_register() - register a new thermal cooling device
@type:	the thermal cooling device type.
@devdata:	device private data.
@ops:		standard thermal cooling devices callbacks.

This interface registers a thermal governor:

int thermal_register_governor(struct thermal_governor *governor)

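2. While a thermal zone or cooling device is being registered, the thermal core calls the bind functions. The essence of binding is attaching a cooling device to a trip point of a thermal zone.
thermal_zone_bind_cooling_device() links a thermal cooling device to a specific trip point of a thermal zone device. It returns 0 on success.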

// First, the structure involved
/*
 * This structure is used to describe the behavior of
 * a certain cooling device on a certain trip point
 * in a certain thermal zone
 */
struct thermal_instance {
	int id;
	char name[THERMAL_NAME_LENGTH];
	struct thermal_zone_device *tz;
	struct thermal_cooling_device *cdev;
	int trip;
	bool initialized;
	unsigned long upper;	/* Highest cooling state for this trip point */
	unsigned long lower;	/* Lowest cooling state for this trip point */
	unsigned long target;	/* expected cooling state */
	char attr_name[THERMAL_NAME_LENGTH];
	struct device_attribute attr;
	char weight_attr_name[THERMAL_NAME_LENGTH];
	struct device_attribute weight_attr;
	struct list_head tz_node; /* important: node in tz->thermal_instances */
	struct list_head cdev_node; /* important: node in cdev->thermal_instances */
	unsigned int weight; /* The weight of the cooling device */
};
thermal_zone_bind_cooling_device() - bind a cooling device to a thermal zone
@tz:	pointer to struct thermal_zone_device
@trip:	indicates which trip point the cooling device is associated with in this thermal zone.
@cdev:	pointer to struct thermal_cooling_device
@upper:	the maximum cooling state for this trip point. THERMAL_NO_LIMIT means no upper limit, and the cooling device can be in max_state.
@lower:	the minimum cooling state that can be used for this trip point. THERMAL_NO_LIMIT means no lower limit, and the cooling device can be in cooling state 0.
@weight:	the weight of the cooling device to be bound to the thermal zone. Use THERMAL_WEIGHT_DEFAULT for the default value.
int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
				     int trip,
				     struct thermal_cooling_device *cdev,
				     unsigned long upper, unsigned long lower,
				     unsigned int weight)
{
	struct thermal_instance *dev; // describes the relationship between a zone and a cooling device at one trip point
	struct thermal_instance *pos;
	struct thermal_zone_device *pos1;
	struct thermal_cooling_device *pos2;
	unsigned long max_state;
	int result;

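	// look up tz and cdev in the global lists, so that pos1 points to the zone device and pos2 to the cooling device
	list_for_each_entry(pos1, &thermal_tz_list, node) { if (pos1 == tz) break; }
	list_for_each_entry(pos2, &thermal_cdev_list, node) { if (pos2 == cdev) break; }

	// call the cooling device's get_max_state callback to obtain its highest state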
	cdev->ops->get_max_state(cdev, &max_state);

	/* lower default 0, upper default max_state */
	lower = lower == THERMAL_NO_LIMIT ? 0 : lower;
	upper = upper == THERMAL_NO_LIMIT ? max_state : upper;
	
	dev = kzalloc(sizeof(struct thermal_instance), GFP_KERNEL); // allocate the instance
	dev->tz = tz; // the zone device
	dev->cdev = cdev; // the cooling device
	dev->trip = trip; // the trip point this binding applies to
	dev->upper = upper; // upper state limit
	dev->lower = lower; // lower state limit
	dev->target = THERMAL_NO_TARGET; // no target cooling state chosen yet; a governor sets it later
	dev->weight = weight; // weight of this cooling device
	
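	// get_idr() calls idr_alloc() to allocate an id dynamically and stores it as the instance id
	result = get_idr(&tz->idr, &tz->lock, &dev->id);
	
	sprintf(dev->name, "cdev%d", dev->id); // build the instance name from the id
	// a kobject corresponds to one entry in sysfs and represents the device in the driver model
	// create a symlink named dev->name in the tz->device.kobj directory, pointing to the cdev->device.kobj directory
	result =sysfs_create_link(&tz->device.kobj, &cdev->device.kobj, dev->name);

	sprintf(dev->attr_name, "cdev%d_trip_point", dev->id);// build attr_name from the id
	sysfs_attr_init(&dev->attr.attr);// initialize the dynamically allocated sysfs attribute
	// fill in the attribute
	dev->attr.attr.name = dev->attr_name;
	dev->attr.attr.mode = 0444;
	dev->attr.show = thermal_cooling_device_trip_point_show; // show callback, invoked when the file node is read, e.g. with cat
	// device_create_file() calls sysfs_create_file() to create the attribute file in the directory of the kobject
	result = device_create_file(&tz->device, &dev->attr);

	// same pattern as above; the weight expresses this cooling device's relative share in the zone and is writable from userspace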
	sprintf(dev->weight_attr_name, "cdev%d_weight", dev->id);
	sysfs_attr_init(&dev->weight_attr.attr);
	dev->weight_attr.attr.name = dev->weight_attr_name;
	dev->weight_attr.attr.mode = S_IWUSR | S_IRUGO;
	dev->weight_attr.show = thermal_cooling_device_weight_show;
	dev->weight_attr.store = thermal_cooling_device_weight_store;
	result = device_create_file(&tz->device, &dev->weight_attr);

	mutex_lock(&tz->lock);  // lock the zone's instance list
	mutex_lock(&cdev->lock);  // lock the cooling device's instance list
	// walk the zone's thermal_instances list to see whether an identical instance already exists
	list_for_each_entry(pos, &tz->thermal_instances, tz_node)
		if (pos->tz == tz && pos->trip == trip && pos->cdev == cdev) {
			result = -EEXIST; // it does
			break;
		}
	if (!result) {  // if not, add the new instance to both lists
		list_add_tail(&dev->tz_node, &tz->thermal_instances); // add to the zone's instance list
		list_add_tail(&dev->cdev_node, &cdev->thermal_instances); // add to the cooling device's instance list
		atomic_set(&tz->need_update, 1); // atomically flag that the zone needs an update
	}
	mutex_unlock(&cdev->lock);  // unlock the cooling device's instance list
	mutex_unlock(&tz->lock);    // unlock the zone's instance list

	if (!result)
		return 0;
		
	// error path: undo the steps above in reverse order
	device_remove_file(&tz->device, &dev->weight_attr);
remove_trip_file:
	device_remove_file(&tz->device, &dev->attr);
remove_symbol_link:
	sysfs_remove_link(&tz->device.kobj, dev->name);
release_idr:
	release_idr(&tz->idr, &tz->lock, dev->id);
free_mem:
	kfree(dev);

	return result;
}
EXPORT_SYMBOL_GPL(thermal_zone_bind_cooling_device); // export the symbol so that other kernel code and modules can call it
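
The two core ideas of the bind function, resolving the "no limit" defaults and refusing duplicate (zone, trip, cooling device) triples, can be sketched in userspace as follows. Everything here is hypothetical: `MOCK_NO_LIMIT` only stands in for THERMAL_NO_LIMIT, a fixed array replaces the kernel's linked lists, the range check is an assumption of this sketch, and the return codes are arbitrary.

```c
#include <limits.h>

#define MOCK_NO_LIMIT      ULONG_MAX /* stand-in for THERMAL_NO_LIMIT */
#define MOCK_MAX_INSTANCES 8

/* One binding of a cooling device to a trip point of a zone. */
struct mock_instance {
	int zone_id;
	int cdev_id;
	int trip;
	unsigned long upper;
	unsigned long lower;
};

struct mock_table {
	struct mock_instance items[MOCK_MAX_INSTANCES];
	int count;
};

/* Returns 0 on success, -1 on a bad range, -2 on a duplicate, -3 if full. */
static int mock_bind(struct mock_table *t, int zone_id, int trip, int cdev_id,
		     unsigned long upper, unsigned long lower,
		     unsigned long max_state)
{
	struct mock_instance *inst;
	int i;

	/* lower defaults to 0, upper defaults to max_state */
	lower = (lower == MOCK_NO_LIMIT) ? 0 : lower;
	upper = (upper == MOCK_NO_LIMIT) ? max_state : upper;
	if (lower > upper || upper > max_state)
		return -1;

	/* refuse a second binding of the same triple */
	for (i = 0; i < t->count; i++) {
		inst = &t->items[i];
		if (inst->zone_id == zone_id && inst->trip == trip &&
		    inst->cdev_id == cdev_id)
			return -2;
	}
	if (t->count == MOCK_MAX_INSTANCES)
		return -3;

	inst = &t->items[t->count++];
	inst->zone_id = zone_id;
	inst->cdev_id = cdev_id;
	inst->trip = trip;
	inst->upper = upper;
	inst->lower = lower;
	return 0;
}
```

The same cooling device may still be bound to a different trip point of the same zone, since only the full triple has to be unique.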

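3. The Thermal core starts a delayed_work polling loop that keeps the whole thermal control flow running; when the temperature rises above a trip point, the corresponding cooling device is activated to bring it down.
First, thermal_zone_device_register() does the following:
a. bind_tz(tz) -> __bind() -> thermal_zone_bind_cooling_device() binds the zone to cooling devices.
b. INIT_DELAYED_WORK(&(tz->poll_queue), thermal_zone_device_check); initializes the work item poll_queue with thermal_zone_device_check as its work function.
c. if (!tz->ops->get_temp) thermal_zone_device_set_polling(tz, 0); if the zone has no get_temp callback, thermal_zone_device_set_polling() is called with a delay of 0, which calls cancel_delayed_work(&tz->poll_queue); to cancel the delayed work.
d. thermal_zone_device_reset(tz); resets the zone device: tz->temperature = THERMAL_TEMP_INVALID; tz->passive = 0; and pos->initialized = false; for every instance.
e. Then comes the key part:
atomic_cmpxchg() is an atomic compare-and-exchange. It checks whether need_update equals 1; if so, it stores 0 into need_update, otherwise it leaves the value unchanged. Its return value is the value need_update held before the operation.
If an earlier bind succeeded, need_update was atomically set to 1,
so thermal_zone_device_update(tz) is then called: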

if (atomic_cmpxchg(&tz->need_update, 1, 0))
		thermal_zone_device_update(tz);
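
The "test the flag and clear it in one step" behaviour can be reproduced in userspace with C11 atomics. This is only an analogy under assumptions: the kernel's atomic_cmpxchg() returns the old value, whereas C11's atomic_compare_exchange_strong() returns whether the exchange happened, and `mock_test_and_clear()` is a hypothetical helper name.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Atomically clear the flag if it is currently 1.
 * Returns true only for the caller that actually cleared it,
 * so the pending update is handled exactly once.
 */
static bool mock_test_and_clear(atomic_int *flag)
{
	int expected = 1;

	return atomic_compare_exchange_strong(flag, &expected, 0);
}
```

If two threads race on the same flag, only one of them sees `true`, which is why this pattern is preferred over a separate read followed by a write.
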
// Inside thermal_zone_device_update(tz):
First update_temperature() runs:
	thermal_zone_get_temp(tz, &temp) -> tz->ops->get_temp(tz, temp) obtains the temperature,
	and the previous reading is saved before the new one is stored:
	tz->last_temperature = tz->temperature;
Then each trip point is processed, and this is where the governor gets called:
for (count = 0; count < tz->trips; count++) handle_thermal_trip(tz, count);
	handle_thermal_trip() first calls tz->ops->get_trip_type(tz, trip, &type); to get the type of the trip point,
	then dispatches on that type: handle_critical_trips(tz, trip, type); for critical and hot trips, or handle_non_critical_trips(tz, trip, type);, which invokes the governor's throttle callback, for the rest.
	After a trip point has been handled, monitor_thermal_zone(tz) is called to restart the monitor.
	Before looking at monitor_thermal_zone(), here are the zone device fields it uses:
	passive: 1 if you've crossed a passive trip point, 0 otherwise. It becomes 1 once such a trip temperature is reached; the earlier reset set it to 0.
	passive_delay: number of milliseconds to wait between polls when performing passive cooling, i.e. the polling interval while cooling is active.
	polling_delay: number of milliseconds to wait between polls when checking whether trip points have been crossed (0 for interrupt driven systems), i.e. the interval for routine checks.
	Based on these three fields, monitor_thermal_zone() calls thermal_zone_device_set_polling(), shown here:
static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
					    int delay)
{
	if (delay > 1000)
		// long delay: queue tz->poll_queue on the system_freezable_wq workqueue after delay ms; round_jiffies() rounds the timeout to a whole second so the wakeup can be batched with other timers
		mod_delayed_work(system_freezable_wq, &tz->poll_queue, round_jiffies(msecs_to_jiffies(delay)));
	else if (delay) // short delay: queue the work with the exact timeout
		mod_delayed_work(system_freezable_wq, &tz->poll_queue,msecs_to_jiffies(delay));
	else // a delay of 0 cancels the work
		cancel_delayed_work(&tz->poll_queue);
}
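
How the three fields translate into the delay passed to thermal_zone_device_set_polling() can be sketched as a pure function. This reflects my reading of the field descriptions above rather than a copy of monitor_thermal_zone(); `struct mock_poll_cfg` and `mock_pick_delay()` are hypothetical names.

```c
/* Hypothetical subset of the zone fields that drive polling. */
struct mock_poll_cfg {
	int passive;       /* non-zero once a passive trip point was crossed */
	int passive_delay; /* ms between polls while passive cooling is active */
	int polling_delay; /* ms between routine polls, 0 if interrupt driven */
};

/* Choose the next polling delay in ms; 0 means "stop polling". */
static int mock_pick_delay(const struct mock_poll_cfg *cfg)
{
	if (cfg->passive)
		return cfg->passive_delay;
	if (cfg->polling_delay)
		return cfg->polling_delay;
	return 0;
}
```

With the result in hand, a delay of 0 maps to the cancel branch of the function above and any other value to one of the two mod_delayed_work() branches.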

The following shows what the delayed work does.

static void thermal_zone_device_check(struct work_struct *work)
{
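	// recover the zone structure from the work item
	struct thermal_zone_device *tz = container_of(work, struct thermal_zone_device, poll_queue.work);
	// the same function as above is called again: it reads and stores the temperature, runs the governor, restarts the monitor and sets the polling delay, so the work runs again after that delay (passive_delay while cooling, polling_delay for routine checks) in an endless loop
	thermal_zone_device_update(tz);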
}
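
The container_of() step is the least obvious part of this function, so here is a userspace sketch of the same trick using offsetof(). The macro and the `mock_` structures are hypothetical simplifications; the kernel's container_of() adds type checking that is omitted here.

```c
#include <stddef.h>

/* Simplified userspace version of the container_of idea. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_work {
	int pending;
};

struct mock_delayed_work {
	struct mock_work work;
	int delay_ms;
};

struct mock_zone_dev {
	int id;
	struct mock_delayed_work poll_queue; /* embedded, not a pointer */
	int updates;                         /* how often the zone was updated */
};

/* Work function: it only receives the embedded work item. */
static void mock_zone_check(struct mock_work *work)
{
	/* step back from the member to the start of the enclosing zone */
	struct mock_zone_dev *tz =
		mock_container_of(work, struct mock_zone_dev, poll_queue.work);

	tz->updates++; /* stands in for thermal_zone_device_update(tz) */
}
```

Because the work item is embedded in the zone structure, no extra pointer back to the zone has to be stored or passed around.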

[Figure 2: Linux Thermal]
