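Every function that a lower-level library exposes to the Renascence framework must supply, for each of its input and output types, a class derived from IStatusType. That class is used to load, save, map, and free values of the type.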
/*Basic API*/
class IStatusType
{
public:
    IStatusType(const std::string name):mName(name){}
    virtual ~IStatusType(){}
    inline std::string name() const {return mName;}
    /*GPStream abstracts an input stream; the input may come from a file, memory, the network, etc.*/
    virtual void* vLoad(GPStream* input) const = 0;
    /*GPWStream abstracts an output stream; the output may go to a file, memory, the network, etc.*/
    virtual void vSave(void* contents, GPWStream* output) const = 0;
    virtual void vFree(void* contents) const = 0;
    /* Map
     * Modify contents according to values.
     * Returns the number of parameters needed.
     * If value is NULL, only return the number of parameters.
     * If *content is NULL and value is not NULL, allocate a new one.
     */
    virtual int vMap(void** content, double* value) const = 0;
    /* Check (optional)
     * For continuous data (streams), check whether the data is complete; content must not be NULL.
     */
    virtual bool vCheckCompleted(void* content) const {return NULL!=content;}
    /* Merge (optional)
     * For continuous data (streams), merge the src data into dst; dst and src must not be NULL.
     * Normally, dst and src are freed after this API is called.
     * Returning NULL means the data cannot be merged.
     */
    virtual void* vMerge(void* dst, void* src) const {return NULL;}
private:
    std::string mName;
};
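To make the vMap contract concrete, here is a minimal sketch of a derived type. Everything outside IStatusType is an assumption made for illustration: Threshold is a made-up payload with a single tunable value, and the GPStream / GPWStream declarations with their vRead / vWrite methods are stand-ins so the sketch compiles on its own, not the real Renascence stream API. The IStatusType shown is an abridged copy of the interface above.

```cpp
#include <cstddef>
#include <string>

// Stand-in stream interfaces (assumed, NOT the real Renascence declarations).
struct GPStream  { virtual ~GPStream() {}  virtual std::size_t vRead(void* dst, std::size_t n) = 0; };
struct GPWStream { virtual ~GPWStream() {} virtual std::size_t vWrite(const void* src, std::size_t n) = 0; };

// Abridged copy of the interface documented above.
class IStatusType
{
public:
    IStatusType(const std::string name):mName(name){}
    virtual ~IStatusType(){}
    inline std::string name() const {return mName;}
    virtual void* vLoad(GPStream* input) const = 0;
    virtual void vSave(void* contents, GPWStream* output) const = 0;
    virtual void vFree(void* contents) const = 0;
    virtual int vMap(void** content, double* value) const = 0;
private:
    std::string mName;
};

// Hypothetical payload: one tunable number.
struct Threshold { double value; };

class ThresholdType : public IStatusType
{
public:
    ThresholdType():IStatusType("Threshold"){}
    virtual void* vLoad(GPStream* input) const
    {
        Threshold* t = new Threshold;
        t->value = 0.0;
        input->vRead(&t->value, sizeof(double));
        return t;
    }
    virtual void vSave(void* contents, GPWStream* output) const
    {
        output->vWrite(&static_cast<Threshold*>(contents)->value, sizeof(double));
    }
    virtual void vFree(void* contents) const
    {
        delete static_cast<Threshold*>(contents);
    }
    virtual int vMap(void** content, double* value) const
    {
        const int parameterCount = 1;
        if (NULL == value) return parameterCount;        // query only: report how many values are needed
        if (NULL == *content) *content = new Threshold;  // allocate on demand
        static_cast<Threshold*>(*content)->value = value[0];
        return parameterCount;
    }
};
```

Calling vMap(NULL, NULL) on this type only reports the parameter count, while passing a pointer to a NULL content together with a value array allocates a Threshold and fills it in.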
A lower-level library must provide standard functions of the following form:
typedef GPContents*(*computeFunction)(GPContents* inputs);
GPContents holds a set of typed data items:
struct GPContents
{
    struct GP_Unit
    {
        void* content;
        const IStatusType* type;
    };
    std::vector<GP_Unit> contents;
};
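For reference, a hand-written function of this form might look like the sketch below. The addDoubles function and its wrapper are hypothetical and only illustrate the unpack / compute / repack pattern. The _GPpackage suffix mirrors the entry names in the function table, but the exact code the generator script emits is not shown in this document. Ownership of the inputs is also an assumption: here the wrapper leaves them untouched and the caller frees the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class IStatusType;  // provided by Renascence; only used through pointers here

struct GPContents
{
    struct GP_Unit
    {
        void* content;
        const IStatusType* type;
    };
    std::vector<GP_Unit> contents;
};

typedef GPContents*(*computeFunction)(GPContents* inputs);

// Hypothetical library function: single output, inputs not modified, no global state.
double* addDoubles(const double* a, const double* b)
{
    return new double(*a + *b);
}

// Hand-written wrapper matching the computeFunction signature.
GPContents* addDoubles_GPpackage(GPContents* inputs)
{
    assert(NULL != inputs);
    assert(2u == inputs->contents.size());
    const double* a = static_cast<const double*>(inputs->contents[0].content);
    const double* b = static_cast<const double*>(inputs->contents[1].content);

    GPContents::GP_Unit unit;
    unit.content = addDoubles(a, b);
    // In this toy case the output has the same type as the inputs.
    unit.type = inputs->contents[0].type;

    GPContents* outputs = new GPContents;
    outputs->contents.push_back(unit);
    return outputs;
}
```

Each real exported function needs this same boilerplate with its own casts and output type, which is why generating it from annotated declarations is attractive.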
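Writing functions of this form by hand is tedious. Instead, put /*GP FUNCTION*/ in front of each function to be exported (the function itself must have a single output, must not modify its inputs, and must not modify global or static variables), then run tools/makeGPFunction.py to generate the corresponding wrapper functions automatically.
Below are the exports from the machine learning library: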
/*GP FUNCTION*/ALFloatMatrix* ALPackageValidateMatrix(ALIMatrixPredictor* l, ALFloatMatrix* m);
/*GP FUNCTION*/ALFloatMatrix* ALPackageValidateChain(ALFloatPredictor* l, ALLabeldData* c);
/*GP FUNCTION*/ALIMatrixPredictor* ALPackageSuperLearning(ALISuperviseLearner* l, ALFloatMatrix* m);
/*GP FUNCTION*/ALIMatrixPredictor* ALPackageUnSuperLearning(ALIUnSuperLearner* l, ALFloatMatrix* m);
/*GP FUNCTION*/ALFloatPredictor* ALPackageLearn(ALIChainLearner* l, ALLabeldData* d);
/*GP FUNCTION*/ALIChainLearner* ALPackageCreateDivider(ALIChainLearner* l, ALIChainLearner* r, ALDividerParameter* p/*S*/);
/*GP FUNCTION*/ALIChainLearner* ALPackageCreateCGP(ALCGPParameter* p/*S*/);
/*GP FUNCTION*/ALISuperviseLearner* ALPackageCreateRegress();
/*GP FUNCTION*/ALLabeldData* ALPackageLabled(ALFloatDataChain* c, double delay);
/*GP FUNCTION*/double ALPackageCrossValidate(ALIChainLearner* l, ALLabeldData* c);
/*GP FUNCTION*/ALIChainLearner* ALPackageCombine(ALARStructure* ar/*S*/, ALISuperviseLearner* l);
/*GP FUNCTION*/ALClassifierCreator* ALPackageCreateSVM(ALSVMParameter* p/*S*/);
/*GP FUNCTION*/ALClassifierCreator* ALPackageCreateGMM();
/*GP FUNCTION*/ALClassifierCreator* ALPackageCreateLogicalRegress();
/*GP FUNCTION*/ALClassifierCreator* ALPackageCreateDecisionTree(ALDecisionTreeParameter* p/*S*/);
/*GP FUNCTION*/ALFloatMatrix* ALPackageMatrixMerge(ALFloatMatrix* A, ALFloatMatrix* B, double aleft, double aright, double bleft, double bright);
/*GP FUNCTION*/ALFloatMatrix* ALPackageMatrixCrop(ALFloatMatrix* A, double aleft, double aright);
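Finally, a function table must be provided; it can be generated automatically when running makeGPFunction.py.
For the library above, the table looks like this: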
<libAbstract_learning>
<ALPackageCreateClassify_GPpackage>
<shortName>CreateClassify</shortName>
<output>ALClassifier</output>
<status></status>
<input>ALClassifierCreator ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageCreateClassify_GPpackage>
<ALPackageRamdomForest_GPpackage>
<shortName>RamdomForest</shortName>
<output>ALClassifier</output>
<status></status>
<input>ALFloatMatrix </input>
<inputNeedComplete>False </inputNeedComplete>
</ALPackageRamdomForest_GPpackage>
<ALPackageClassify_GPpackage>
<shortName>Classify</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALClassifier ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageClassify_GPpackage>
<ALPackageClassifyProb_GPpackage>
<shortName>ClassifyProb</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALClassifier ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageClassifyProb_GPpackage>
<ALPackageClassifyProbValues_GPpackage>
<shortName>ClassifyProbValues</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALClassifier </input>
<inputNeedComplete>False </inputNeedComplete>
</ALPackageClassifyProbValues_GPpackage>
<ALPackageCrossValidateClassify_GPpackage>
<shortName>CrossValidateClassify</shortName>
<output>double</output>
<status></status>
<input>ALClassifierCreator ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageCrossValidateClassify_GPpackage>
<ALPackageValidateMatrix_GPpackage>
<shortName>ValidateMatrix</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALIMatrixPredictor ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageValidateMatrix_GPpackage>
<ALPackageValidateChain_GPpackage>
<shortName>ValidateChain</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALFloatPredictor ALLabeldData </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageValidateChain_GPpackage>
<ALPackageSuperLearning_GPpackage>
<shortName>SuperLearning</shortName>
<output>ALIMatrixPredictor</output>
<status></status>
<input>ALISuperviseLearner ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageSuperLearning_GPpackage>
<ALPackageUnSuperLearning_GPpackage>
<shortName>UnSuperLearning</shortName>
<output>ALIMatrixPredictor</output>
<status></status>
<input>ALIUnSuperLearner ALFloatMatrix </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageUnSuperLearning_GPpackage>
<ALPackageLearn_GPpackage>
<shortName>Learn</shortName>
<output>ALFloatPredictor</output>
<status></status>
<input>ALIChainLearner ALLabeldData </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageLearn_GPpackage>
<ALPackageCreateDivider_GPpackage>
<shortName>CreateDivider</shortName>
<output>ALIChainLearner</output>
<status>ALDividerParameter </status>
<input>ALIChainLearner ALIChainLearner </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageCreateDivider_GPpackage>
<ALPackageCreateCGP_GPpackage>
<shortName>CreateCGP</shortName>
<output>ALIChainLearner</output>
<status>ALCGPParameter </status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateCGP_GPpackage>
<ALPackageCreateRegress_GPpackage>
<shortName>CreateRegress</shortName>
<output>ALISuperviseLearner</output>
<status></status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateRegress_GPpackage>
<ALPackageLabled_GPpackage>
<shortName>Labled</shortName>
<output>ALLabeldData</output>
<status></status>
<input>ALFloatDataChain double </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageLabled_GPpackage>
<ALPackageCrossValidate_GPpackage>
<shortName>CrossValidate</shortName>
<output>double</output>
<status></status>
<input>ALIChainLearner ALLabeldData </input>
<inputNeedComplete>False False </inputNeedComplete>
</ALPackageCrossValidate_GPpackage>
<ALPackageCombine_GPpackage>
<shortName>Combine</shortName>
<output>ALIChainLearner</output>
<status>ALARStructure </status>
<input>ALISuperviseLearner </input>
<inputNeedComplete>False </inputNeedComplete>
</ALPackageCombine_GPpackage>
<ALPackageCreateSVM_GPpackage>
<shortName>CreateSVM</shortName>
<output>ALClassifierCreator</output>
<status>ALSVMParameter </status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateSVM_GPpackage>
<ALPackageCreateGMM_GPpackage>
<shortName>CreateGMM</shortName>
<output>ALClassifierCreator</output>
<status></status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateGMM_GPpackage>
<ALPackageCreateLogicalRegress_GPpackage>
<shortName>CreateLogicalRegress</shortName>
<output>ALClassifierCreator</output>
<status></status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateLogicalRegress_GPpackage>
<ALPackageCreateDecisionTree_GPpackage>
<shortName>CreateDecisionTree</shortName>
<output>ALClassifierCreator</output>
<status>ALDecisionTreeParameter </status>
<input></input>
<inputNeedComplete></inputNeedComplete>
</ALPackageCreateDecisionTree_GPpackage>
<ALPackageMatrixMerge_GPpackage>
<shortName>MatrixMerge</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALFloatMatrix ALFloatMatrix double double double double </input>
<inputNeedComplete>False False False False False False </inputNeedComplete>
</ALPackageMatrixMerge_GPpackage>
<ALPackageMatrixCrop_GPpackage>
<shortName>MatrixCrop</shortName>
<output>ALFloatMatrix</output>
<status></status>
<input>ALFloatMatrix double double </input>
<inputNeedComplete>False False False </inputNeedComplete>
</ALPackageMatrixCrop_GPpackage>
</libAbstract_learning>
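Comparing the declarations with the table suggests how each entry is laid out: parameters marked /*S*/ are listed under status, the remaining parameters under input, and inputNeedComplete carries one flag per input. The sketch below rebuilds one entry from those pieces. It is an illustration inferred from the table shown here, not the logic of makeGPFunction.py; makeTableEntry and its parameters are hypothetical, and every flag is written as False because no other value appears in this table.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Join names the way the table does: each name is followed by one space.
static std::string joinWithTrailingSpace(const std::vector<std::string>& items)
{
    std::string out;
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        out += items[i] + " ";
    }
    return out;
}

// Build one function-table entry (hypothetical helper, for illustration only).
std::string makeTableEntry(const std::string& functionName,
                           const std::string& shortName,
                           const std::string& output,
                           const std::vector<std::string>& status,
                           const std::vector<std::string>& inputs)
{
    const std::string tag = functionName + "_GPpackage";
    // One "False" per input, as in every entry of the table above.
    const std::vector<std::string> needComplete(inputs.size(), "False");

    std::string xml;
    xml += "<" + tag + ">\n";
    xml += "<shortName>" + shortName + "</shortName>\n";
    xml += "<output>" + output + "</output>\n";
    xml += "<status>" + joinWithTrailingSpace(status) + "</status>\n";
    xml += "<input>" + joinWithTrailingSpace(inputs) + "</input>\n";
    xml += "<inputNeedComplete>" + joinWithTrailingSpace(needComplete) + "</inputNeedComplete>\n";
    xml += "</" + tag + ">\n";
    return xml;
}
```

Fed with the pieces of the ALPackageCreateDivider declaration (one /*S*/ parameter, two ordinary inputs), this reproduces the CreateDivider entry of the table.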
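See doc/formula.txt
1. Basic form:
f(x0, g(x1, h(x2), f(x2, x3)));
f(x0, ADF[NAME, x0,x1,f(x0,x3)])
2. Notation:
f, g, h: short or full function names, as provided in the XML metadata
x0-xn: input variable names; they must start at x0 with no gaps. The ADF obtained once this function is built must be given its input variables in ascending order
ADF: marker for automatic generation
NAME: alias of the ADF; it must be unique so that it can be referred to later
f(x0,x3): indicates that one input of the ADF is a fixed function
3. Automatic generation
Automatic generation replaces an input variable with a sub-function whose inputs are specified; the sub-function itself is found by the GP library through search
The format is ADF[NAME, x1, x2, x3, f(x0,x1), …]
The inputs of an auto-generated function are described by the contents of [], listed as type: input variable index; its output type is the type expected of the input at that position
An auto-generated function is guaranteed to use all of its inputs, though it may use some of them more than once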
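Since every ADF alias has to be unique, a formula can be checked before it is handed to the library. The sketch below is a plain string scan written for this document, not part of Renascence; collectAdfNames and adfNamesAreUnique are hypothetical helpers. It accepts both the ADF[NAME, ...] form described here and the ADF(NAME) form used in the Python example.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

static bool isIdentifierChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Return the alias of every ADF marker in the formula, in order of appearance.
std::vector<std::string> collectAdfNames(const std::string& formula)
{
    std::vector<std::string> names;
    std::size_t pos = 0;
    while ((pos = formula.find("ADF", pos)) != std::string::npos)
    {
        // "ADF" must start a token, so a name such as "MyADF" is not a marker.
        const bool startsToken = (pos == 0) || !isIdentifierChar(formula[pos - 1]);
        std::size_t i = pos + 3;
        pos = i;
        if (!startsToken || i >= formula.size() || (formula[i] != '[' && formula[i] != '('))
        {
            continue;
        }
        ++i;  // skip the opening bracket
        while (i < formula.size() && std::isspace(static_cast<unsigned char>(formula[i]))) ++i;
        const std::size_t begin = i;
        while (i < formula.size() && isIdentifierChar(formula[i])) ++i;
        if (i > begin) names.push_back(formula.substr(begin, i - begin));
        pos = i;
    }
    return names;
}

// True when no alias is used by more than one ADF marker.
bool adfNamesAreUnique(const std::string& formula)
{
    const std::vector<std::string> names = collectAdfNames(formula);
    const std::set<std::string> unique(names.begin(), names.end());
    return unique.size() == names.size();
}
```

For the second basic form above the scan yields the single alias NAME, and a formula that reuses one alias in two ADF markers is reported as not unique.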
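Use it like an ordinary C++ library; all public interfaces are wrapped in
include/user/GPAPI.h
Install swig first, then go to python-renascence/module and run ./build_source.sh && sudo python setup.py install to complete the installation.
The following example uses the Renascence framework to call the machine learning library for a time-series prediction: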
#!/usr/bin/python
import Renascence
producer = Renascence.init(["./libAbstract_learning.xml"])
print(producer.listAllFunctions())
print(producer.listAllTypes())
x0 = producer.load('ALFloatDataChain', './bao.txt')
formula = 'CrossValidate(ADF(GodTrain), Labled(x0, x1))'
#formula = 'CrossValidate(CreateCGP(), Labled(x0, x1))'
#[trainedformula, bestValue] = producer.train(formula, producer.merge(x0, 1.0))
[trainedformula, bestValue] = producer.train(formula, producer.merge(x0, 1.0), times=100000, cacheFile='temp.txt')
print(bestValue)
print(trainedformula.ADF('GodTrain'))
#print(trainedformula.parameters())
p = trainedformula.parameters()
with open('temp_parameter.txt', 'w') as f:
    f.write(p)