Supervised Discretization in Weka

Reference: "Machine Learning: Variable Discretization with MDLP"

This post walks through weka.filters.supervised.attribute.Discretize in Weka.
Other classes involved include weka.filters.Filter.
The top-level code under analysis:

// data must already have its class attribute set, since this is a supervised filter
Discretize disc = new Discretize();
disc.setInputFormat(data);
Instances afterDiscretize = Filter.useFilter(data, disc);

Filter.useFilter()

public static Instances useFilter(Instances data, Filter filter) throws Exception {
    for (int i = 0; i < data.numInstances(); i++) {
      filter.input(data.instance(i));
      // This in turn calls bufferInput(), which copies the instance and adds it to the
      // Filter's m_InputFormat variable, so m_InputFormat ends up holding a complete copy of the input data.
    }
    filter.batchFinished(); // runs the core discretization step, calculateCutPoints(); analyzed in detail below
    Instances newData = filter.getOutputFormat();
    Instance processed;
    while ((processed = filter.output()) != null) {
      newData.add(processed);
    }
    return newData;
}
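
The buffer-then-convert flow of useFilter() can be illustrated without Weka. The sketch below is a toy stand-in, not Weka code: ToyBatchFilter, its single mean-based cut point, and the use of plain double values instead of Instance objects are all assumptions made for illustration. It only mirrors the call order input(), batchFinished(), output().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical toy filter (not a Weka class): it buffers values until the batch
// is finished, then derives one cut point and queues the converted values.
final class ToyBatchFilter {
    private final List<Double> buffered = new ArrayList<>();       // stands in for the buffered input copy
    private final Queue<Integer> outputQueue = new ArrayDeque<>(); // converted values waiting for output()
    private Double cutPoint = null;                                 // unknown until batchFinished()

    // Before the cut point is known, only buffer; afterwards convert right away.
    boolean input(double value) {
        if (cutPoint == null) {
            buffered.add(value);
            return false; // nothing is available for output yet
        }
        outputQueue.add(convert(value));
        return true;
    }

    // "Training" step. Assumption for illustration: the cut point is the mean.
    void batchFinished() {
        if (cutPoint == null) {
            double sum = 0.0;
            for (double v : buffered) sum += v;
            cutPoint = buffered.isEmpty() ? 0.0 : sum / buffered.size();
            for (double v : buffered) outputQueue.add(convert(v));
            buffered.clear();
        }
    }

    // Returns the next converted value, or null once the queue is empty.
    Integer output() {
        return outputQueue.poll();
    }

    private int convert(double value) {
        return value <= cutPoint ? 0 : 1; // index of the interval the value falls into
    }

    // Same loop structure as useFilter(): feed everything, finish the batch, drain the output.
    static List<Integer> useFilter(double[] data, ToyBatchFilter filter) {
        for (double v : data) filter.input(v);
        filter.batchFinished();
        List<Integer> result = new ArrayList<>();
        Integer processed;
        while ((processed = filter.output()) != null) result.add(processed);
        return result;
    }
}
```

Calling ToyBatchFilter.useFilter(new double[]{1.0, 2.0, 9.0}, new ToyBatchFilter()) buffers all three values, computes the cut point 4.0 in batchFinished(), and returns the interval indices [0, 0, 1].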

Discretize.calculateCutPoints()

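protected void calculateCutPoints() {
    m_CutPoints = new double[getInputFormat().numAttributes()][];
    Instances copy = new Instances(getInputFormat()); // work on a copy so the original order is preserved
    // Simplified: the real method only handles numeric attributes selected for discretization
    for (int i = getInputFormat().numAttributes() - 1; i >= 0; i--) {
        calculateCutPointsByMDL(i, copy); // discretize the i-th attribute
    }
}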
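
To give a feel for what an MDL-based cut-point search does for a single attribute, here is a minimal sketch of the textbook Fayyad-Irani criterion in plain Java. It is not a copy of Weka's calculateCutPointsByMDL(), and details of Weka's implementation may differ. The class MdlCutPoints and its method names are hypothetical, and the sketch assumes the values are already sorted ascending with class labels encoded as 0..numClasses-1.

```java
import java.util.ArrayList;
import java.util.List;

// Textbook Fayyad-Irani MDL discretization of one numeric attribute (illustrative sketch).
final class MdlCutPoints {

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Class entropy (base 2) of the labels in [from, to).
    static double entropy(int[] labels, int from, int to, int numClasses) {
        int[] counts = new int[numClasses];
        for (int i = from; i < to; i++) counts[labels[i]]++;
        double n = to - from;
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) h -= (c / n) * log2(c / n);
        }
        return h;
    }

    // Number of distinct classes present in [from, to).
    static int distinctClasses(int[] labels, int from, int to, int numClasses) {
        boolean[] seen = new boolean[numClasses];
        int k = 0;
        for (int i = from; i < to; i++) {
            if (!seen[labels[i]]) { seen[labels[i]] = true; k++; }
        }
        return k;
    }

    // values must be sorted ascending; labels[i] is the class of values[i].
    static List<Double> cutPoints(double[] values, int[] labels, int numClasses) {
        List<Double> cuts = new ArrayList<>();
        split(values, labels, 0, values.length, numClasses, cuts);
        return cuts;
    }

    private static void split(double[] v, int[] y, int from, int to, int numClasses, List<Double> cuts) {
        int n = to - from;
        if (n < 2) return;
        double hAll = entropy(y, from, to, numClasses);
        double bestH = Double.MAX_VALUE;
        int bestIdx = -1;
        for (int i = from + 1; i < to; i++) {
            if (v[i - 1] == v[i]) continue; // a cut is only possible between distinct values
            double h = ((i - from) * entropy(y, from, i, numClasses)
                      + (to - i) * entropy(y, i, to, numClasses)) / n;
            if (h < bestH) { bestH = h; bestIdx = i; }
        }
        if (bestIdx < 0) return; // all values identical, nothing to cut
        double gain = hAll - bestH;
        int k = distinctClasses(y, from, to, numClasses);
        int k1 = distinctClasses(y, from, bestIdx, numClasses);
        int k2 = distinctClasses(y, bestIdx, to, numClasses);
        double delta = log2(Math.pow(3, k) - 2)
                - (k * hAll - k1 * entropy(y, from, bestIdx, numClasses)
                            - k2 * entropy(y, bestIdx, to, numClasses));
        // MDL stopping rule: accept the cut only if the gain pays for its encoding cost.
        if (gain <= (log2(n - 1) + delta) / n) return;
        split(v, y, from, bestIdx, numClasses, cuts);   // recurse left first so cuts stay sorted
        cuts.add((v[bestIdx - 1] + v[bestIdx]) / 2.0);  // midpoint between the neighbouring values
        split(v, y, bestIdx, to, numClasses, cuts);
    }
}
```

For the values {1, 2, 3, 4, 10, 11, 12, 13} with labels {0, 0, 0, 0, 1, 1, 1, 1}, cutPoints(...) returns the single cut 7.0. For {1, 2, 3, 4} with alternating labels {0, 1, 0, 1}, no candidate cut passes the MDL test and the list is empty.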
