How to use LibRec

http://wiki.librec.net/doku.php?id=introduction

Running LibRec directly

1. Download and unpack the release.

2. Change into the bin directory.

3. Run the command line.

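LibRec is well encapsulated: you can run an algorithm directly from the command line by loading configuration items, or you can instantiate the corresponding Java classes in another project to do the computation.

On the command line, the rec argument tells the program to run a recommendation. Other parameters follow the -exec option and are set with -D or -jobconf. dfs.data.dir and dfs.result.dir give the path to read data from and the path to store results in, and rec.recommender.class selects the algorithm to run. For other command-line usage see the CLI walkthrough; for other algorithms and their configuration items see the Algorithm list; for the meaning of each configuration item see the configuration file librec.properties.

LibRec takes its parameters from the command line and submits a computation job. Progress is printed to the terminal as log output, and the final recommendations are saved in the result folder under the current directory.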

librec rec -exec -D rec.recommender.class=itemcluster -D rec.pgm.number=10 -D rec.iterator.maximum=20
Alternatively, load the settings from a configuration file:

librec rec -exec -conf itemcluster-test.properties

You need to write itemcluster-test.properties yourself and place it in the bin directory.
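
As a rough illustration of what such a file could hold, the sketch below assumes it carries the same key=value pairs that the first command passes with -D. It parses the text with the JDK's java.util.Properties, not with LibRec's own Configuration class, and the class name ItemClusterConfigSketch is made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of LibRec: parses properties text with the JDK only.
class ItemClusterConfigSketch {

    // Assumed file contents: the same key=value pairs as the -D flags shown earlier.
    static final String FILE_TEXT =
            "# itemcluster-test.properties (assumed contents)\n"
          + "rec.recommender.class=itemcluster\n"
          + "rec.pgm.number=10\n"
          + "rec.iterator.maximum=20\n";

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            // not expected when reading from an in-memory string
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(FILE_TEXT);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

Lines that start with # are comments in this format, so notes in the file should go on their own lines.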


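Command-line reference

This section shows how to use the command line to load and save data and models, and how to pass parameters on the command line to run a recommendation.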

Usage

Usage: librec <command> [options]...
commands:
  rec                       run recommender
  data                      load data

global options:
  --help                    display this help text
  --exec                    run Recommender
  --version                 show Librec version info

job options:
  -conf               path to config file
  -D, -jobconf        set configuration items (key=value)
  -libjars                  add entend jar files to classpath

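rec / data select what the program does: run a recommendation algorithm or load data.

-exec runs the recommendation algorithm. It is a reserved option in version 2.0.

-D | -jobconf [options] loads configuration items. For the individual items see ConfigurationList and AlgorithmList.

-conf [path/to/properties] loads a configuration file.

-libjars adds jar files from other paths to the classpath. Jars under lib are loaded automatically. The current script serves as an example.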
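
To make the -D / -jobconf convention concrete, here is a minimal sketch of how a launcher could collect key=value pairs from an argument list. It is an assumption-laden illustration, not LibRec's actual launcher code; the class name JobOptionSketch and its parse method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not LibRec's actual launcher code.
class JobOptionSketch {

    // Collects the key=value pair that follows each -D or -jobconf flag.
    static Map<String, String> parse(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("-D") || args[i].equals("-jobconf")) {
                String pair = args[++i]; // consume the value that follows the flag
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("expected key=value, got: " + pair);
                }
                // split on the first '=' only, so values may contain '='
                conf.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        String[] demo = {"rec", "-exec", "-D", "rec.recommender.class=itemcluster",
                         "-jobconf", "rec.pgm.number=10"};
        System.out.println(parse(demo));
    }
}
```

Arguments that are not preceded by -D or -jobconf (such as rec and -exec) are simply skipped by this sketch.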


Using LibRec as part of your own project

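Add the following dependency to pom.xml:

    <dependency>
        <groupId>net.librec</groupId>
        <artifactId>librec-core</artifactId>
        <version>2.0.0</version>
    </dependency>

Use code like the following to run the recommender: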

public static void main(String[] args) throws Exception {
    
    // recommender configuration
    Configuration conf = new Configuration();
    Resource resource = new Resource("rec/cf/userknn-test.properties");
    conf.addResource(resource);

    // build data model
    DataModel dataModel = new TextDataModel(conf);
    dataModel.buildDataModel();
    
    // set recommendation context
    RecommenderContext context = new RecommenderContext(conf, dataModel);
    RecommenderSimilarity similarity = new PCCSimilarity();
    similarity.buildSimilarityMatrix(dataModel);
    context.setSimilarity(similarity);

    // training
    Recommender recommender = new UserKNNRecommender();
    recommender.recommend(context);

    // evaluation
    RecommenderEvaluator evaluator = new MAEEvaluator();
    recommender.evaluate(evaluator);

    // recommendation results
    List<RecommendedItem> recommendedItemList = recommender.getRecommendedList();
    RecommendedFilter filter = new GenericRecommendedFilter();
    recommendedItemList = filter.filter(recommendedItemList);
}
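
The example above plugs in PCCSimilarity, where PCC stands for the Pearson correlation coefficient. As a hedged illustration of the underlying idea, the sketch below computes the textbook Pearson correlation between two users over the items both have rated. It is a stand-in written with the JDK only; LibRec's own implementation may differ in details, and the names PearsonSketch and pearson are invented for this example.

```java
// Textbook Pearson correlation; a stand-in for illustration, not LibRec's PCCSimilarity.
class PearsonSketch {

    // x[i] and y[i] are two users' ratings of the same i-th co-rated item.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            return Double.NaN; // undefined without at least two co-rated items
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        double denom = Math.sqrt(varX * varY);
        // a user with constant ratings has zero variance, so the value is undefined
        return denom == 0 ? Double.NaN : cov / denom;
    }

    public static void main(String[] args) {
        double[] alice = {4.0, 3.0, 5.0};
        double[] bob = {2.0, 1.0, 3.0}; // same preferences, shifted down by 2
        System.out.println(pearson(alice, bob));
    }
}
```

Because the correlation is computed on deviations from each user's mean, the two users in main are perfectly correlated even though one rates everything two points lower.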

import java.util.ArrayList;
import java.util.List;

import net.librec.conf.Configuration;
import net.librec.data.model.TextDataModel;
import net.librec.eval.RecommenderEvaluator;
import net.librec.eval.rating.RMSEEvaluator;
import net.librec.filter.GenericRecommendedFilter;
import net.librec.recommender.Recommender;
import net.librec.recommender.RecommenderContext;
import net.librec.recommender.cf.ItemKNNRecommender;
import net.librec.recommender.item.RecommendedItem;
import net.librec.similarity.PCCSimilarity;
import net.librec.similarity.RecommenderSimilarity;

// the class name is arbitrary
public class ItemKNNExample {

    public static void main(String[] args) throws Exception {

        // build data model
        Configuration conf = new Configuration();
        conf.set("dfs.data.dir", "G:/LibRec/librec/data");
        TextDataModel dataModel = new TextDataModel(conf);
        dataModel.buildDataModel();

        // build recommender context
        RecommenderContext context = new RecommenderContext(conf, dataModel);

        // build similarity (between items, because ItemKNN is used below)
        conf.set("rec.recommender.similarity.key", "item");
        RecommenderSimilarity similarity = new PCCSimilarity();
        similarity.buildSimilarityMatrix(dataModel);
        context.setSimilarity(similarity);

        // build recommender
        conf.set("rec.neighbors.knn.number", "5");
        Recommender recommender = new ItemKNNRecommender();
        recommender.setContext(context);

        // run recommender algorithm
        recommender.recommend(context);

        // evaluate the recommended result
        RecommenderEvaluator evaluator = new RMSEEvaluator();
        System.out.println("RMSE:" + recommender.evaluate(evaluator));

        // set the user and item ids to keep when filtering
        List<String> userIdList = new ArrayList<>();
        List<String> itemIdList = new ArrayList<>();
        userIdList.add("1");
        itemIdList.add("70");

        // filter the recommended result
        List<RecommendedItem> recommendedItemList = recommender.getRecommendedList();
        GenericRecommendedFilter filter = new GenericRecommendedFilter();
        filter.setUserIdList(userIdList);
        filter.setItemIdList(itemIdList);
        recommendedItemList = filter.filter(recommendedItemList);

        // print filter result
        for (RecommendedItem recommendedItem : recommendedItemList) {
            System.out.println(
                    "user:" + recommendedItem.getUserId() + " " +
                    "item:" + recommendedItem.getItemId() + " " +
                    "value:" + recommendedItem.getValue()
            );
        }
    }
}
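
The two examples evaluate with MAEEvaluator and RMSEEvaluator. For readers unfamiliar with these metrics, the sketch below shows the standard definitions of mean absolute error and root mean squared error on plain arrays. It is only an illustration of the formulas, assuming paired actual and predicted ratings; the class ErrorMetricSketch is hypothetical and does not reproduce LibRec's evaluator classes.

```java
// Illustrative definitions only; not LibRec's MAEEvaluator or RMSEEvaluator.
class ErrorMetricSketch {

    // Mean absolute error: the average of |actual - predicted|.
    static double mae(double[] actual, double[] predicted) {
        checkInput(actual, predicted);
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    // Root mean squared error: the square root of the average squared difference.
    static double rmse(double[] actual, double[] predicted) {
        checkInput(actual, predicted);
        double sum = 0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }

    private static void checkInput(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("need two non-empty arrays of equal length");
        }
    }

    public static void main(String[] args) {
        double[] actual = {4.0, 3.0, 5.0};
        double[] predicted = {3.0, 3.0, 3.0};
        System.out.println("MAE:" + mae(actual, predicted));
        System.out.println("RMSE:" + rmse(actual, predicted));
    }
}
```

With the sample data in main, the errors are 1, 0 and 2, so the MAE is 1.0 while the RMSE is about 1.29: squaring makes RMSE weigh the single large error more heavily.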

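Example code for reading a configuration file. You can use a relative path, as in the example, to access a configuration file inside the jar, or point to a configuration file you edited yourself.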

public void testRecommender() throws ClassNotFoundException, LibrecException, IOException {
    // recommender configuration
    Configuration conf = new Configuration();
    Resource resource = new Resource("rec/cf/itemknn-test.properties");
    conf.addResource(resource);
    RecommenderJob job = new RecommenderJob(conf);
    job.runJob();
}

The configuration lives in librec.properties under conf. It sets the data source; the default dataset is filmtrust.

# set data directory
dfs.data.dir=../data
# set result directory
# recommender result will output in this folder
dfs.result.dir=../result
# set log directory
dfs.log.dir=../log

# convertor
# load data and splitting data 
# into two (or three) set
# setting dataset name
data.input.path=filmtrust
# setting dataset format(UIR, UIRT)
data.column.format=UIR
# setting method of split data
# value can be ratio, loocv, given, KCV
data.model.splitter=ratio
#data.splitter.cv.number=5
# using rating to split dataset
data.splitter.ratio=rating
# filmtrust dataset is saved by text
# text, arff is accepted
data.model.format=text
# the ratio of the training set
# this value should be in (0,1)
data.splitter.trainset.ratio=0.8

# Detailed configuration of loocv, given, KCV 
# is written in User Guide 

# set the random seed for reproducing the results (data splitting, parameter initialization and other methods that use randomness)
# the default is 1
# if it is not set, System.currentTimeMillis() is used as the seed and the results cannot be reproduced
rec.random.seed=1

# binarize threshold mainly used in ranking
# -1.0 - maxRate, binarize rate into -1.0 and 1.0
# binThold = -1.0, do nothing
# binThold = value: ratings > value are changed to 1.0, all others to 0.0; mainly used in ranking
# for PGM, 0.0 may be a better choice
data.convert.binarize.threshold=-1.0

# evaluation the result or not
rec.eval.enable=true

# specifies evaluators
# rec.eval.classes=auc,precision,recall...
# if rec.eval.classes is blank,
# every evaluator will be calculated
# rec.eval.classes=auc,precision,recall

# evaluator value set is written in User Guide
# if this algorithm is ranking only true or false
rec.recommender.isranking=false

#can use user,item,social similarity, default value is user, maximum values:user,item,social
#rec.recommender.similarities=user
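
The settings data.model.splitter=ratio, data.splitter.trainset.ratio=0.8 and rec.random.seed=1 together describe a reproducible random split into a training set and a test set. The sketch below shows one simple way such a split can work, using a seeded shuffle followed by a cut. This is an assumption for illustration: LibRec's own splitter may assign ratings differently, and RatioSplitSketch and its split method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of a seeded ratio split; not LibRec's splitter.
class RatioSplitSketch {

    // Returns a two-element list: index 0 is the training set, index 1 the test set.
    static <T> List<List<T>> split(List<T> ratings, double trainRatio, long seed) {
        if (trainRatio <= 0 || trainRatio >= 1) {
            throw new IllegalArgumentException("trainRatio must be in (0,1)");
        }
        List<T> shuffled = new ArrayList<>(ratings);
        // a fixed seed gives the same shuffle, and so the same split, on every run
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * trainRatio);

        List<List<T>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> ratings = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ratings.add(i);
        }
        List<List<Integer>> parts = split(ratings, 0.8, 1L);
        System.out.println("train size: " + parts.get(0).size());
        System.out.println("test size: " + parts.get(1).size());
    }
}
```

Calling split twice with the same seed returns identical partitions, which is the property the rec.random.seed comment describes; seeding from the current time instead would change the split on every run.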

Here User-Item-Rating is abbreviated UIR and User-Item-Rating-Date is abbreviated UIRT. When the input data is in text format, set the following configuration items:

data.model.format=text
# or UIRT
data.column.format=UIR
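
To show what the two column formats mean for a line of input, here is a small sketch that parses one text line as UIR or UIRT. Several things are assumptions made for illustration: that columns are separated by spaces, tabs or commas, that the fourth column is a numeric timestamp, and the names RatingLineSketch, Rating and parse, none of which come from LibRec.

```java
// Hypothetical parser for one rating line; not LibRec's data convertor.
class RatingLineSketch {

    static final class Rating {
        final String user;
        final String item;
        final double value;
        final Long time; // null when the format is UIR

        Rating(String user, String item, double value, Long time) {
            this.user = user;
            this.item = item;
            this.value = value;
            this.time = time;
        }
    }

    // columnFormat is "UIR" (user, item, rating) or "UIRT" (plus a timestamp).
    static Rating parse(String line, String columnFormat) {
        if (!columnFormat.equals("UIR") && !columnFormat.equals("UIRT")) {
            throw new IllegalArgumentException("unknown column format: " + columnFormat);
        }
        int expected = columnFormat.length(); // 3 columns for UIR, 4 for UIRT
        // assumed separators: spaces, tabs or commas
        String[] cols = line.trim().split("[ \t,]+");
        if (cols.length < expected) {
            throw new IllegalArgumentException(
                    "expected " + expected + " columns, got " + cols.length + ": " + line);
        }
        Long time = expected == 4 ? Long.valueOf(cols[3]) : null;
        return new Rating(cols[0], cols[1], Double.parseDouble(cols[2]), time);
    }

    public static void main(String[] args) {
        Rating r = parse("1 70 3.5", "UIR");
        System.out.println("user:" + r.user + " item:" + r.item + " value:" + r.value);
    }
}
```

User and item ids are kept as strings here, matching the way the filter example earlier passes "1" and "70" as string ids.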
