How can you use quantiles and other statistics of your data to bin observations automatically in SAS?

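Background: sometimes we want to look at users in different score ranges and check whether they behave differently on various features. To create the groups, besides defining the intervals by hand with PROC FORMAT, you can also use PROC RANK or PROC UNIVARIATE to bin and order observations automatically, based on quantiles or other statistics of the score (or any other numeric variable).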

PROC RANK


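proc rank data=test out=r_test groups=10;  /* out= names the output dataset; groups=10 bins spend into deciles */
var spend;                                 /* the variable to rank */
ranks r_spend;                             /* the rank variable is named r_spend */
run;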

PROC UNIVARIATE


proc univariate data=events noprint;
var neg_score;
/* pctlpre=P_ sets the prefix of the percentile variable names, giving P_10, P_20, ..., P_100 */
output out=p pctlpre=P_ pctlpts=10 to 100 by 10;
weight SamplingWeight;
run;

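/* Turn the single row of percentiles into one row per percentile; the values land in COL1 */
proc transpose data=p out=pt;
run;

/* Sort the cutpoints; nodupkey drops percentiles that share the same value */
proc sort data=pt nodupkey force noequals;
by COL1;
run;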
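The sorted cutpoints in pt can then be turned into a format, so that the binning is applied automatically wherever the score is used. The following is a minimal sketch of one way to do that, not necessarily the author's method. It assumes a PROC FORMAT CNTLIN= control dataset is an acceptable route. The demo data, the format name scorebin, and the bin_01-style labels are made up for illustration.

```sas
/* Hypothetical demo data standing in for the events dataset */
data events;
    call streaminit(2024);
    do id = 1 to 500;
        neg_score = round(rand('uniform') * 100, 0.1);
        SamplingWeight = 1;   /* equal weights, for illustration only */
        output;
    end;
run;

/* Same steps as above: weighted deciles, transposed and de-duplicated */
proc univariate data=events noprint;
    var neg_score;
    weight SamplingWeight;
    output out=p pctlpre=P_ pctlpts=10 to 100 by 10;
run;

proc transpose data=p out=pt;
run;

proc sort data=pt nodupkey;
    by COL1;
run;

/* Build a control dataset: each bin is (previous cutpoint, current cutpoint] */
data cntl (rename=(lo=start hi=end));
    set pt end=is_last;
    length fmtname $8 type $1 label $16 hlo $2 sexcl eexcl $1;
    fmtname = 'scorebin';            /* assumed format name */
    type    = 'N';                   /* numeric format */
    lo      = lag(COL1);             /* lower bound = previous cutpoint */
    hi      = COL1;                  /* upper bound = current cutpoint */
    sexcl   = 'Y';                   /* lower bound excluded */
    eexcl   = 'N';                   /* upper bound included */
    hlo     = '';
    if _n_ = 1 then hlo = 'L';               /* first bin starts at LOW */
    if is_last then hlo = cats(hlo, 'H');    /* last bin extends to HIGH */
    label   = cats('bin_', put(_n_, z2.));   /* assumed label scheme */
    keep fmtname type lo hi sexcl eexcl hlo label;
run;

proc format cntlin=cntl;
run;

/* Apply the format: scores are tabulated by bin rather than by raw value */
proc freq data=events;
    tables neg_score;
    format neg_score scorebin.;
run;
```

Because the first bin is opened to LOW and the last to HIGH, scores outside the observed range still fall into a bin when the format is applied to new data.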


PROC RANK can generate deciles, quartiles, percentiles, or other groups from numeric variables. Its GROUPS= option specifies the number of bins: GROUPS=10 creates deciles, GROUPS=4 creates quartiles, and GROUPS=100 creates percentiles. The resulting group numbers start at 0, so GROUPS=10 yields values 0 through 9.
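As a small illustration of the GROUPS= option, the sketch below bins a made-up spend variable into quartiles and then compares the groups. The demo dataset and its values are assumptions for illustration only.

```sas
/* Hypothetical demo data: 100 customers with a numeric spend variable */
data test;
    call streaminit(2024);
    do id = 1 to 100;
        spend = round(rand('uniform') * 1000, 0.01);
        output;
    end;
run;

/* groups=4 assigns each row a quartile number from 0 (lowest) to 3 (highest) */
proc rank data=test out=r_test groups=4;
    var spend;
    ranks r_spend;
run;

/* Compare spend across the four groups */
proc means data=r_test n min max mean;
    class r_spend;
    var spend;
run;
```

Changing groups=4 to groups=10 or groups=100 switches the same code to deciles or percentiles.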
