cpufreq framework提供机制(cpufreq driver)与策略(cpufreq governor),此外提供了cpufreq core来对机制和策略进行管理。
主要代码路径:
driver/cpufreq/cpufreq.c
include/linux/cpufreq.h
drivers/cpufreq/cpufreq_userspace.c
看起来与cpuidle framework的图很像,但是有些差别如下:
用户层与cpufreq framework的交互,主要是通过sysfs,这个可以在/sys下看到很多文件,而Kernel Module也可以使用某些接口来回调它;
系统只允许有一个Platform Drivers,为全局变量cpufreq_driver,cpufreq core通过它去回调驱动;
驱动与硬件的交互,通过如set_clk_rate/regulator_set_voltage等接口去设置CPU的时钟和电压,而不再是cpu_ops;
有一个全局的governor链表cpufreq_governor_list,可以通过查找链表来选择合适的governor;
核心的数据结构有三个:
struct cpufreq_policy:用于描述不同的policy,涉及到频率表、cpuinfo等各种信息,并且每个policy都会指向某个governor;
struct cpufreq_governor:用于对policy的管理;
struct cpufreq_driver:用于描述具体的驱动程序;
如下图:
cpufreq core是cpufreq framework的核心模块,和kernel其它framework类似,它主要实现三类功能:
对上,以sysfs的形式向用户空间提供统一的接口,以notifier的形式向其它driver提供频率变化的通知;
对下,提供CPU core频率和电压控制的驱动框架,方便底层driver的开发;同时,提供governor框架,用于实现不同的频率调整机制;
内部,封装各种逻辑,实现所需功能。这些逻辑主要围绕struct cpufreq_driver、struct cpufreq_policy和struct cpufreq_governor三个数据结构进行。
1)struct cpufreq_driver
struct cpufreq_driver用于抽象cpufreq驱动,是平台驱动工程师关注最多的结构,其定义如下:
/* include/linux/cpufreq.h */
struct cpufreq_driver {
char name[CPUFREQ_NAME_LEN];
u8 flags;
void *driver_data;
/* needed by all drivers */
int (*init) (struct cpufreq_policy *policy);
int (*verify) (struct cpufreq_policy *policy);
/* define one out of two */
int (*setpolicy) (struct cpufreq_policy *policy);
/*
* On failure, should always restore frequency to policy->restore_freq
* (i.e. old freq).
*/
int (*target) (struct cpufreq_policy *policy, /* Deprecated */
unsigned int target_freq,
unsigned int relation);
int (*target_index) (struct cpufreq_policy *policy,
unsigned int index);
/*
* Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION
* unset.
*
* get_intermediate should return a stable intermediate frequency
* platform wants to switch to and target_intermediate() should set CPU
* to to that frequency, before jumping to the frequency corresponding
* to 'index'. Core will take care of sending notifications and driver
* doesn't have to handle them in target_intermediate() or
* target_index().
*
* Drivers can return '0' from get_intermediate() in case they don't
* wish to switch to intermediate frequency for some target frequency.
* In that case core will directly call ->target_index().
*/
unsigned int (*get_intermediate)(struct cpufreq_policy *policy,
unsigned int index);
int (*target_intermediate)(struct cpufreq_policy *policy,
unsigned int index);
/* should be defined, if possible */
unsigned int (*get) (unsigned int cpu);
/* optional */
int (*bios_limit) (int cpu, unsigned int *limit);
int (*exit) (struct cpufreq_policy *policy);
void (*stop_cpu) (struct cpufreq_policy *policy);
int (*suspend) (struct cpufreq_policy *policy);
int (*resume) (struct cpufreq_policy *policy);
struct freq_attr **attr;
/* platform specific boost support code */
bool boost_supported;
bool boost_enabled;
int (*set_boost) (int state);
};
介绍该结构之前,我们先思考一个问题:由设备模型可知,driver是用来驱动设备的,那么struct cpufreq_driver所对应的设备是什么?也许从该结构中回调函数的参数可以猜到,是struct cpufreq_policy。
name,该driver的名字,需要唯一,因为cpufreq framework允许同时注册多个driver,用户可以根据实际情况选择使用哪个driver。driver的标识,就是name。
init,driver的入口,由cpufreq core在设备枚举的时候调用,driver需要根据硬件情况,填充policy的内容。
verify,验证policy中的内容是否符合硬件要求。它和init接口都是必须实现的接口。
setpolicy,driver需要提供这个接口,用于设置CPU core动态频率调整的范围(即policy)。
target、target_index,driver需要实现这两个接口中的一个(target为旧接口,不推荐使用),用于设置CPU core为指定频率(同时修改为对应的电压)。
后面的接口都是可选的。
有关struct cpufreq_driver的API包括:
int cpufreq_register_driver(struct cpufreq_driver *driver_data);
int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
const char *cpufreq_get_current_driver(void);
void *cpufreq_get_driver_data(void);
分别为driver的注册、注销。获取当前所使用的driver名称,以及该driver的私有数据结构(driver_data字段)。
2)struct cpufreq_policy
struct cpufreq_policy是比较抽象的一个数据结构,我们需要借助cpufreq core中的一些实现逻辑,去分析、理解它。
前面我们提到过一个问题,cpufreq driver对应的设备是什么?kernel是这样抽象cpufreq的:
cpufreq model抽象出一个CPU bus(对应的sysfs目录为/sys/devices/system/cpu/,所有的CPU device都挂在这个bus上。cpufreq是CPU device的一类特定功能,被抽象为一个subsys interface。
当CPU device和CPU driver匹配时,bus core会调用subsys interface的add_dev回调函数,相当于为该特定功能添加一个“device”,进而和该特定功能的“driver”(这里为cpufreq driver)匹配,执行driver的初始化(probe,或者其它)接口。
那么该“特定功能”应该用什么样的“device”表示呢?应具体功能具体对待。kernel使用cpufreq policy(即“调频策略”)来抽象cpufreq。所谓的调频策略,即频率调整的范围,它从一定程度上,代表了cpufreq的属性。这就是struct cpufreq_policy结构的现实意义:
struct cpufreq_policy {
/* CPUs sharing clock, require sw coordination */
cpumask_var_t cpus; /* Online CPUs only */
cpumask_var_t related_cpus; /* Online + Offline CPUs */
unsigned int shared_type; /* ACPI: ANY or ALL affected CPUs
should set cpufreq */
unsigned int cpu; /* cpu nr of CPU managing this policy */
unsigned int last_cpu; /* cpu nr of previous CPU that managed
* this policy */
struct clk *clk;
struct cpufreq_cpuinfo cpuinfo;/* see above */
unsigned int min; /* in kHz */
unsigned int max; /* in kHz */
unsigned int cur; /* in kHz, only needed if cpufreq
* governors are used */
unsigned int restore_freq; /* = policy->cur before transition */
unsigned int suspend_freq; /* freq to set during suspend */
unsigned int policy; /* see above */
struct cpufreq_governor *governor; /* see below */
void *governor_data;
bool governor_enabled; /* governor start/stop flag */
struct work_struct update; /* if update_policy() needs to be
* called, but you're in IRQ context */
struct cpufreq_real_policy user_policy;
struct cpufreq_frequency_table *freq_table;
struct list_head policy_list;
struct kobject kobj;
struct completion kobj_unregister;
/*
* The rules for this semaphore:
* - Any routine that wants to read from the policy structure will
* do a down_read on this semaphore.
* - Any routine that will write to the policy structure and/or may take away
* the policy altogether (eg. CPU hotplug), will hold this lock in write
* mode before doing so.
*
* Additional rules:
* - Lock should not be held across
* __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
*/
struct rw_semaphore rwsem;
/* Synchronization for frequency transitions */
bool transition_ongoing; /* Tracks transition status */
spinlock_t transition_lock;
wait_queue_head_t transition_wait;
struct task_struct *transition_task; /* Task which is doing the transition */
/* For cpufreq driver's internal use */
void *driver_data;
};
该结构看着很复杂,现在只需要关心几个事情:
min/max frequency,调频范围,对于可以自动调频的CPU而言,只需要这两个参数就够了。
current frequency和governor,对于不能自动调频的CPU,需要governor设置具体的频率值。下面介绍一下governor。
struct cpufreq_policy不会直接对外提供API。
3) cpufreq governors
对于不能自动调频的CPU core,必须由软件设定具体的频率值。根据使用场景的不同,会有不同的调整方案,这是由governor模块负责的,如下:
struct cpufreq_governor {
char name[CPUFREQ_NAME_LEN];
int initialized;
int (*governor) (struct cpufreq_policy *policy,
unsigned int event);
ssize_t (*show_setspeed) (struct cpufreq_policy *policy,
char *buf);
int (*store_setspeed) (struct cpufreq_policy *policy,
unsigned int freq);
unsigned int max_transition_latency; /* HW must be able to switch to
next freq faster than this value in nano secs or we
will fallback to performance governor */
struct list_head governor_list;
struct module *owner;
};
name,该governor的名称。
governor,用于governor状态切换的回调函数。
show_setspeed、store_setspeed,用于提供sysfs “setspeed” attribute文件的回调函数。
max_transition_latency,该governor所能容忍的最大频率切换延迟。
cpufreq governors主要向具体的governor模块提供governor的注册和注销接口。
4)通过sysfs向用户空间提供的接口
学习wiki:
Linux cpufreq framework
linux cpufreq framework(3)_cpufreq core
深入浅出CPUFreq