PROC IMPORT
PROC IMPORT reads data from an external source and writes it to a SAS data set.
The syntax is:
PROC IMPORT
DATAFILE="filename" | DATATABLE="tablename" (Not used for Microsoft Excel files)
<DBMS=data-source-identifier>
<OUT=libref.SAS data-set-name> <SAS data-set-option(s)>
<REPLACE>;
<file-format-specific-statements>;
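As the syntax shows, only one argument is required: DATAFILE | DATATABLE. DATAFILE can be written with the alias FILE, and DATATABLE with the alias TABLE. The remaining arguments are:
DBMS=data-source-identifier: the type of data to import.
SAS data-set-option(s): options for the output SAS data set; for example, the WHERE= data set option can be used here.
OUT=: the name of the output SAS data set.
REPLACE: overwrites the output data set if it already exists.
<file-format-specific-statements>: statements that depend on the file format. For an Excel file, for example, GETNAMES=YES | NO controls whether the first row of the file supplies the SAS variable names, and SHEET=sheet-name names the sheet to read. Each statement ends with a semicolon.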
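As a rough sketch of how these arguments fit together, the step below reads a comma-separated file, keeps only matching rows through the WHERE= data set option, and overwrites any existing output. The path C:\data\sales.csv, the data set name WORK.SALES_SOUTH, and the variable Region are hypothetical names chosen for illustration, and the filter assumes Region is imported as a character variable.

```sas
/* Sketch only: the path, the data set name, and the variable Region are hypothetical */
proc import datafile="C:\data\sales.csv"              /* DATAFILE= (alias FILE=) */
    dbms=csv                                          /* source type: comma-separated values */
    out=work.sales_south (where=(Region='Southern'))  /* output data set plus a data set option */
    replace;                                          /* overwrite WORK.SALES_SOUTH if it exists */
    getnames=yes;                                     /* file-format-specific statement */
run;
```

Note that the PROC IMPORT statement itself ends at the semicolon after REPLACE; GETNAMES= is a separate statement with its own semicolon.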
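Example 1: Excel file
The following example reads sheet2 of C:\Users\qingsong\Desktop\test.xlsx into the data set TEST2. The sheet contains the following:
[Figure: contents of sheet2]
The SAS program is: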
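PROC IMPORT
    DATAFILE="C:\Users\qingsong\Desktop\test.xlsx"
    DBMS=XLSX
    OUT=TEST2
    REPLACE;
    GETNAMES=YES;    /* use the first row as variable names */
    SHEET="sheet2";  /* read the sheet named sheet2 */
RUN;
/* Print the imported data set (TEST2, as named in OUT=) */
PROC PRINT DATA=TEST2;
RUN;
/* List the variables and their attributes */
PROC CONTENTS DATA=TEST2;
RUN;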
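Part of the output is shown in the figure. Note that the column heading Birth Date becomes the variable name Birth_Date: under the VALIDVARNAME=V7 naming rules a SAS variable name cannot contain a space, so the space is replaced with an underscore.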
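To keep the original column heading rather than the converted name, one option is to set the VALIDVARNAME=ANY system option before importing. This is a minimal sketch that assumes the same workbook as above; the output data set name TEST_ANY is hypothetical.

```sas
options validvarname=any;      /* allow variable names that contain spaces */

proc import datafile="C:\Users\qingsong\Desktop\test.xlsx"
    dbms=xlsx
    out=test_any               /* hypothetical output data set name */
    replace;
    getnames=yes;
    sheet="sheet2";
run;

/* A name that contains a space must be written as a name literal: 'Birth Date'n */
proc print data=test_any;
    var 'Birth Date'n;
run;

options validvarname=v7;       /* switch back to the V7 naming rules */
```

The trade-off is that every later reference to such a variable needs the name-literal form, so the underscore version is often easier to work with.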
Example 2: Delimited file
Suppose the file J:\sas_guide_data\base-guide-practice-data\cert\delimiter.txt contains the following:
Region&State&Month&Expenses&Revenue
Southern&GA&JAN2001&2000&8000
Southern&GA&FEB2001&1200&6000
Southern&FL&FEB2001&8500&11000
Northern&NY&FEB2001&3000&4000
Northern&NY&MAR2001&6000&5000
Southern&FL&MAR2001&9800&13500
Northern&MA&MAR2001&1500&1000
The program is:
proc import datafile='J:\sas_guide_data\base-guide-practice-data\cert\delimiter.txt'
DBMS=DLM
OUT=TEST2
REPLACE;
DELIMITER='&';
GETNAMES=YES;
run;
proc print data=TEST2;
run;
Here DBMS=DLM identifies the source as a delimited file. The DELIMITER statement sets the delimiter to '&'; if it is omitted, the default delimiter is a blank.
The output is as follows:
[Figure: PROC PRINT output for TEST2]
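For comparison, the same file can be read with a hand-written DATA step, which makes the role of the delimiter and of the header row explicit. The sketch below is an approximation, not the code that PROC IMPORT generates: the data set name TEST2_MANUAL, the variable lengths, and the choice to read Month as a character value are all assumptions.

```sas
data test2_manual;                       /* hypothetical data set name */
    infile 'J:\sas_guide_data\base-guide-practice-data\cert\delimiter.txt'
        dlm='&'                          /* same delimiter as DELIMITER='&' */
        dsd                              /* treat consecutive delimiters as a missing value */
        firstobs=2                       /* skip the header row that GETNAMES=YES uses for names */
        truncover;                       /* do not move to the next line on short records */
    /* lengths are assumptions based on the sample rows shown above */
    input Region :$8. State :$2. Month :$7. Expenses Revenue;
run;

proc print data=test2_manual;
run;
```

PROC IMPORT saves this effort by scanning the file and choosing names, types, and lengths itself, at the cost of less control over the result.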
Example 3: CSV file
A CSV (comma-separated values) file separates its fields with commas, for example:
"Africa","Boot","Addis Ababa","12","$29,761","$191,821","$769"
"Asia","Boot","Bangkok","1","$1,996","$9,576","$80"
"Canada","Boot","Calgary","8","$17,720","$63,280","$472"
"Central America/Caribbean","Boot","Kingston","33","$102,372","$393,376","$4,454"
"Eastern Europe","Boot","Budapest","22","$74,102","$317,515","$3,341"
"Middle East","Boot","Al-Khobar","10","$15,062","$44,658","$765"
"Pacific","Boot","Auckland","12","$20,141","$97,919","$962"
"South America","Boot","Bogota","19","$15,312","$35,805","$1,229"
"United States","Boot","Chicago","16","$82,483","$305,061","$3,735"
"Western Europe","Boot","Copenhagen","2","$1,663","$4,657","$129"
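You can either specify DBMS=DLM together with DELIMITER=',' or specify DBMS=CSV directly. The program below contains two PROC IMPORT steps, one for each approach, and both produce the same result: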
title "Hello";
proc import datafile='J:\sas_guide_data\base-guide-practice-data\cert\boot.csv'
DBMS=CSV
OUT=TEST2
REPLACE;
GETNAMES=NO;
run;
proc print data=TEST2;
run;
title "world";
proc import datafile='J:\sas_guide_data\base-guide-practice-data\cert\boot.csv'
DBMS=DLM
OUT=TEST3
REPLACE;
DELIMITER=',';
GETNAMES=NO;
run;
proc print data=TEST3;
run;
The result is as follows:
[Figure: PROC PRINT output for TEST2 and TEST3]
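Because GETNAMES=NO is used, PROC IMPORT assigns the generic variable names VAR1 through VAR7. If more descriptive names are wanted, they can be assigned afterwards, for example with PROC DATASETS. The new names below are hypothetical, guessed from the sample rows rather than taken from the file.

```sas
proc datasets library=work nolist;   /* NOLIST suppresses the directory listing */
    modify test2;
    /* the names on the right-hand side are illustrative assumptions */
    rename VAR1=Region
           VAR2=Product
           VAR3=Subsidiary
           VAR4=Stores
           VAR5=Sales
           VAR6=Inventory
           VAR7=Returns;
quit;

proc contents data=test2;            /* confirm the renamed variables */
run;
```

Renaming with PROC DATASETS changes only the data set header, so the data itself is not copied.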
Reference: SAS Certified Specialist Prep Guide