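I have been using Weka for data analysis lately, and working with Weka means knowing the ARFF file format. An ARFF (Attribute-Relation File Format) file is an ASCII text file. Because the format is uncommon, it helps to have a tool that converts files in other formats directly to ARFF.
First, a brief introduction to ARFF files (for details, see https://blog.csdn.net/buaalei/article/details/7103055):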
@relation vote2
@attribute handicapped-infants {n,y}
@attribute water-project-cost-sharing {y}
@attribute adoption-of-the-budget-resolution {n,y}
@attribute physician-fee-freeze {y,n}
@attribute el-salvador-aid {y}
@attribute religious-groups-in-schools {y}
@attribute anti-satellite-test-ban {n}
@attribute aid-to-nicaraguan-contras {n}
@attribute mx-missile {n}
@attribute immigration {y,n}
@attribute synfuels-corporation-cutback {n,y}
@attribute education-spending {y,n}
@attribute superfund-right-to-sue {y}
@attribute crime {y,n}
@attribute duty-free-exports {n,y}
@attribute export-administration-act-south-africa {y,n}
@attribute Class string
@data
n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y,?
n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,?,?
?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n,?
n,y,y,n,?,y,n,n,n,n,y,n,y,n,n,y,?
y,y,y,n,y,y,n,n,n,n,y,?,y,y,y,y,?
n,y,y,n,y,y,n,n,n,n,n,n,y,y,y,y,?
n,y,n,y,y,y,n,n,n,n,n,n,?,y,y,y,?
n,y,n,y,y,y,n,n,n,n,n,n,y,y,?,y,?
n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,y,?
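As the example shows, the text of an ARFF file is divided into three parts: 1. the relation declaration, @relation; 2. the attribute declarations, @attribute; 3. the data section, @data.
Each attribute declaration follows this format: @attribute <attribute-name> <datatype>
The data section holds the data itself in CSV format, with values separated by commas. A ? marks a missing value.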
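The nominal declarations in the example above, such as {n,y}, list the values a column may take. As a rough illustration, the sketch below collects the distinct values in each CSV column and builds matching @attribute lines. The class name NominalAttributeSketch and the method declare are hypothetical and are not part of Weka; the sketch assumes plain values without quotes or embedded commas, and at least one non-missing value per column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Weka.
final class NominalAttributeSketch {

    // Builds one "@attribute <name> {v1,v2,...}" line per column.
    // Values appear in first-seen order; "?" (missing) and empty cells are skipped.
    static List<String> declare(List<String> names, List<String> csvRows) {
        List<Set<String>> seen = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            seen.add(new LinkedHashSet<>());
        }
        for (String row : csvRows) {
            // limit -1 keeps trailing empty cells instead of dropping them
            String[] cells = row.split(",", -1);
            for (int i = 0; i < names.size() && i < cells.length; i++) {
                String value = cells[i].trim();
                if (!value.isEmpty() && !value.equals("?")) {
                    seen.get(i).add(value);
                }
            }
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            lines.add("@attribute " + names.get(i)
                    + " {" + String.join(",", seen.get(i)) + "}");
        }
        return lines;
    }

    public static void main(String[] args) {
        // First two columns of three data rows from the example above
        List<String> names = Arrays.asList(
                "handicapped-infants", "water-project-cost-sharing");
        List<String> rows = Arrays.asList("n,y", "?,y", "y,y");
        for (String line : declare(names, rows)) {
            System.out.println(line);
        }
    }
}
```

With these three rows, the first column yields the values n and y and the second yields only y, which corresponds to the first two declarations in the example file.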
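Once you know what the ARFF format requires, creating an ARFF file is straightforward: use a BufferedWriter to write the content to a file with the .arff extension.
Here is the code: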
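import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Converts a CSV file without a header row into an ARFF file
// that declares `column` string attributes named attr0, attr1, ...
public static void arff(String sourceFile, String targetFile, int column)
{
    // try-with-resources closes both streams even if an exception is thrown
    try (BufferedReader in = new BufferedReader(new FileReader(sourceFile));
         BufferedWriter out = new BufferedWriter(new FileWriter(targetFile, false)))
    {
        // Relation declaration
        out.write("@relation spam");
        out.newLine();
        // Attribute declarations
        for (int i = 0; i < column; i++)
        {
            out.write("@attribute attr" + i + " string");
            out.newLine();
        }
        // Data section
        out.write("@data");
        out.newLine();
        // Read the CSV file and copy each row through unchanged
        String temp = in.readLine();
        while (temp != null)
        {
            out.write(temp);
            out.newLine();
            temp = in.readLine();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}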
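
The method above needs the caller to pass the column count. One possible variation, sketched below, infers it from the first CSV line and works on a Reader and a Writer so it can be tried in memory without touching the file system. The names ArffFromCsvSketch and convert are hypothetical; the sketch assumes the CSV has no header row and no quoted values containing commas, and it declares every attribute as string, as the method above does.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical variation for illustration only; not part of Weka.
final class ArffFromCsvSketch {

    // Writes an ARFF header followed by the CSV rows, counting the
    // columns from the first row instead of taking them as a parameter.
    static void convert(Reader csv, Writer arff, String relation) throws IOException {
        BufferedReader in = new BufferedReader(csv);
        BufferedWriter out = new BufferedWriter(arff);

        String first = in.readLine();
        // limit -1 counts trailing empty cells as columns too
        int columns = (first == null) ? 0 : first.split(",", -1).length;

        out.write("@relation " + relation);
        out.newLine();
        for (int i = 0; i < columns; i++) {
            out.write("@attribute attr" + i + " string");
            out.newLine();
        }
        out.write("@data");
        out.newLine();
        // Copy every row, starting with the one already read
        for (String line = first; line != null; line = in.readLine()) {
            out.write(line);
            out.newLine();
        }
        // Flush but do not close: the caller owns both streams
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter result = new StringWriter();
        convert(new StringReader("n,y,?\ny,y,n\n"), result, "spam");
        System.out.print(result);
    }
}
```

For real files, the same method can be called with a FileReader and a FileWriter opened in a try-with-resources block.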