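In day-to-day work, double programming involves comparing the results of both sides to confirm that they agree. When the results differ, the cause is identified and the programs are updated.
The first question to settle in this process is how to display the differences found when two data sets do not match.
The results of PROC COMPARE can be displayed in the following 4 ways:
- The SAS log
- The return value of the automatic macro variable SYSINFO
- The procedure's Results output
- Output to a data set
1. SAS log
When the WARNING, PRINTALL, or ERROR option is used, PROC COMPARE writes a description of the differences to the SAS log. Taking the ERROR option as an example, the demonstration program is as follows: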
data base;
set sashelp.class;
run;
data comp;
set sashelp.class;
if _n_ = 1 then height =100;
run;
proc compare base = base comp=comp error;
run;
With the ERROR option, SAS reports the differences in the log as an ERROR message.
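If only the log message is of interest, the printed report can be switched off. The following is a minimal sketch, assuming the base and comp data sets created above; it uses WARNING instead of ERROR, so the message is issued as a warning.

```sas
/* Sketch only: WARNING writes a warning to the log when differences are found; */
/* NOPRINT suppresses the printed comparison report.                            */
proc compare base=base comp=comp warning noprint;
run;
```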
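2. Return value of the automatic macro variable SYSINFO
When PROC COMPARE finishes, it stores a value in the automatic macro variable SYSINFO that records the outcome of the comparison. By checking SYSINFO after PROC COMPARE has run and before any other step runs, you can find out exactly which kinds of differences occurred. Many companies' compare macros also rely on SYSINFO to report the comparison result.
The official SAS documentation describes the return values in detail. It lists 16 conditions, and the Code of the nth condition is 2^(n-1), where n is the bit position of that condition, so the codes run from 1 to 32768. (Source: SAS Help Center: Results: PROC COMPARE)
If more than one of the 16 conditions occurs in a single comparison, SAS returns the sum of the Codes of all conditions that occurred.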
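Because SYSINFO has to be read before the next step runs, it is convenient to capture it right after the procedure. The following is a minimal sketch, assuming the base and comp data sets from section 1; the macro name check_compare and its parameters are hypothetical and are used here only for illustration.

```sas
/* Hypothetical helper macro; the name check_compare is illustrative only */
%macro check_compare(base=, comp=);
    proc compare base=&base comp=&comp noprint;
    run;

    /* Capture SYSINFO immediately: the next step would overwrite it */
    %local rc;
    %let rc = &sysinfo;

    /* 0 means that none of the 16 conditions was raised */
    %if &rc = 0 %then %put NOTE: No differences reported for &base and &comp..;
    %else %put WARNING: &base and &comp differ, SYSINFO=&rc..;
%mend check_compare;

%check_compare(base=base, comp=comp)
```

Copying the value into a local macro variable keeps it available even if later steps inside the macro reset SYSINFO.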
Example 1, different variable values:
data base;
set sashelp.class;
run;
data comp;
set sashelp.class;
if _n_ = 1 then height =100;
run;
proc compare base = base comp=comp;
run;
%let comres=&sysinfo.;
%put Compare result code: &comres.;
The log shows 4096, which corresponds to condition 13 (2^12), A value comparison was unequal.
Example 2, different variable values and a different label:
data base;
set sashelp.class;
run;
data comp;
set sashelp.class;
if _n_ = 1 then height =100;
label weight = "W";
run;
proc compare base = base comp=comp;
run;
%let comres=&sysinfo.;
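%put Compare result code: &comres.;
The log shows 4128, which is 4096 + 32 and corresponds to conditions 6 and 13: Variable has different label, and A value comparison was unequal.
To tell the individual conditions apart when several occur together, the SAS documentation provides a method based on bit testing: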
%let rc=&sysinfo;
data _null_;
   /* 1. Test for data set label */
   if &rc = '1'b then
      put '<<<< Data sets have different labels';
   /* 2. Test for data set types */
   if &rc = '1.'b then
      put '<<<< Data set types differ';
   /* 3. Test for variable informats */
   if &rc = '1..'b then
      put '<<<< Variable has different informat';
   /* 4. Test for variable formats */
   if &rc = '1...'b then
      put '<<<< Variable has different format';
   /* 5. Test for length */
   if &rc = '1....'b then
      put '<<<< Variable has different lengths between the base data set and the comparison data set';
   /* 6. Test for label */
   if &rc = '1.....'b then
      put '<<<< Variable has different label';
   /* 7. Test for base observation */
   if &rc = '1......'b then
      put '<<<< Base data set has observation not in comparison data set';
   /* 8. Test for comparison observation */
   if &rc = '1.......'b then
      put '<<<< Comparison data set has observation not in base';
   /* 9. Test for base BY group */
   if &rc = '1........'b then
      put '<<<< Base data set has BY group not in comparison';
   /* 10. Test for comparison BY group */
   if &rc = '1.........'b then
      put '<<<< Comparison data set has BY group not in base';
   /* 11. Variable in base data set not in compare data set */
   if &rc = '1..........'b then
      put '<<<< Variable in base data set not found in comparison data set';
   /* 12. Comparison data set has variable not in base data set */
   if &rc = '1...........'b then
      put '<<<< Comparison data set has variable not contained in the base data set';
   /* 13. Test for values */
   if &rc = '1............'b then
      put '<<<< A value comparison was unequal';
   /* 14. Conflicting variable types */
   if &rc = '1.............'b then
      put '<<<< Conflicting variable types between the two data sets being compared';
   /* 15. Test for BY variables */
   if &rc = '1..............'b then
      put '<<<< BY variables do not match';
   /* 16. Fatal error */
   if &rc = '1...............'b then
      put '<<<< Fatal error: comparison not done';
run;
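The previous PROC COMPARE step returned 4128. Running the code above right after it writes every condition that occurred to the SAS log, here Variable has different label and A value comparison was unequal. The design is neat: because each condition occupies its own bit, every condition encoded in the return value can be reported unambiguously.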
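The same decoding can be written more compactly with a loop. This is a minimal sketch, not the method from the SAS documentation: it uses the BAND function (bitwise AND) on each power of two, and the value 4128 is hard-coded only to mirror Example 2.

```sas
/* Value returned in Example 2; in practice use the captured &sysinfo instead */
%let rc = 4128;

data _null_;
    do n = 1 to 16;
        code = 2 ** (n - 1);
        /* BAND returns the bitwise AND; a nonzero result means condition n is set */
        if band(&rc, code) > 0 then put n= code=;
    end;
run;
```

For 4128 this flags n=6 (code 32) and n=13 (code 4096), matching the decomposition 4096 + 32 described above.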
3. The procedure's Results output
After PROC COMPARE runs in SAS, the Results page also shows summaries of the comparison. The summaries include the following:
- Data Set Summary
- Variables Summary
- Observation Summary
- Values Comparison Summary
- Value Comparison Results
- Table of Summary Statistics
- Comparison Results for Observations (Using the TRANSPOSE Option)
Example code:
data base;
set sashelp.class;
run;
data comp;
set sashelp.class;
if _n_ = 1 then height =100;
label weight = "W";
run;
proc compare base = base comp=comp;
run;
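The Results page then displays these summaries for the two data sets.
The content of the Results page can also be written to an external document: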
ods rtf file = "E:\99_Test\Test\Compare_results.rtf";
proc compare base = base comp=comp;
run;
ods rtf close;
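The report is written to the specified target file, which contains the same summaries as the Results page.
For how to keep or drop specific summaries, see the official SAS documentation: SAS Help Center: Results: PROC COMPARE.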
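As a rough illustration of trimming the report, the sketch below uses a few PROC COMPARE options, assuming the base and comp data sets from the example code above. Which options suit a given report is a judgment call; these two calls are only examples.

```sas
/* BRIEFSUMMARY (alias BRIEF) prints only a short summary of the differences */
proc compare base=base comp=comp briefsummary;
run;

/* NOVALUES drops the value comparison results;                          */
/* LISTALL lists variables and observations found in only one data set   */
proc compare base=base comp=comp novalues listall;
run;
```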
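4. Output to a data set (the out= option)
This approach was covered in an earlier post; see "SAS编程:分享数据集Compare的小经验". The output data set contains the records whose variable values differ between the two data sets being compared.
The options I usually set are as follows: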
proc compare base = base comp = comp out=df
outbase outcomp outdif outnoequal;
run;
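In the output data set, each observation that differs is written as a BASE row, a COMPARE row, and a DIF row, identified by the _TYPE_ variable; OUTNOEQUAL leaves out observations that are judged equal.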
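By default the observations are matched by position. When the two data sets have a key, an ID statement matches them by that key instead. The following is a minimal sketch, assuming name uniquely identifies each record in base and comp; the sort steps are included because ID expects both data sets to be sorted by the ID variable.

```sas
/* Sort both sides by the key so that the ID statement can match records */
proc sort data=base; by name; run;
proc sort data=comp; by name; run;

/* Same OUT= options as above, with matching by name; NOPRINT skips the report */
proc compare base=base comp=comp out=df
             outbase outcomp outdif outnoequal noprint;
    id name;
run;

/* Inspect the differing records; _TYPE_ marks BASE, COMPARE, and DIF rows */
proc print data=df;
run;
```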
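Summary
This post covered the 4 ways to output PROC COMPARE results in SAS. Readers can combine them as their own work requires.
Most companies' compare macros are also built on these output methods, so I hope this helps readers understand how their own company's macro works.
Thanks for reading!
If you have any questions, feel free to leave a comment.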