Author: Jack Zhang, from 开拓者部落 (QQ group: 248087140). You are welcome to join us!
This article may be reposted freely; please credit the source: http://my.oschina.net/u/1866370/blog/287907
1. Mahout official site: http://mahout.apache.org/
2. The page on the Mahout site that covers Mahout dependencies: http://mahout.apache.org/general/downloads.html
Page 2 lists the Maven coordinates for Mahout:
<dependency>
    <groupId>org.apache.mahout</groupId>
    <artifactId>mahout-core</artifactId>
    <version>${mahout.version}</version>
</dependency>
Maven makes setting up a Mahout environment simple: just add the following to the pom.
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <mahout.version>0.6</mahout.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-core</artifactId>
        <version>${mahout.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-integration</artifactId>
        <version>${mahout.version}</version>
        <exclusions>
            <exclusion>
                <groupId>org.mortbay.jetty</groupId>
                <artifactId>jetty</artifactId>
            </exclusion>
            <exclusion>
                <groupId>org.apache.cassandra</groupId>
                <artifactId>cassandra-all</artifactId>
            </exclusion>
            <exclusion>
                <groupId>me.prettyprint</groupId>
                <artifactId>hector-core</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
</dependencies>
Notes:
1. Maven configuration
<mahout.version>0.6</mahout.version>
This defines the Mahout version as a Maven property for the whole pom. It can be referenced anywhere else in the pom as <version>${mahout.version}</version>.
The <exclusion> tag
When a dependency is imported, the <exclusion> tag excludes the jar named inside it from that dependency's transitive dependencies. Here three jars are excluded: org.mortbay.jetty.jetty, org.apache.cassandra.cassandra-all, and me.prettyprint.hector-core. The second of these will come up in a later article and is not discussed here.
2. The jars
mahout-core: the Mahout core library.
mahout-integration: the jar for integrating Mahout into other projects.
Adding these coordinates pulls all of the following jars into the project (Mahout's own jars, httpclient, solr, lucene, mongodb, and some utility libraries commonly used in Java projects):
org\apache\mahout\mahout-core\0.6\mahout-core-0.6.jar
org\apache\mahout\mahout-math\0.6\mahout-math-0.6.jar
org\uncommons\maths\uncommons-maths\1.2.2\uncommons-maths-1.2.2.jar
jfree\jcommon\1.0.12\jcommon-1.0.12.jar
com\google\guava\guava\r09\guava-r09.jar
org\apache\mahout\mahout-collections\1.0\mahout-collections-1.0.jar
org\apache\hadoop\hadoop-core\0.20.204.0\hadoop-core-0.20.204.0.jar
commons-cli\commons-cli\1.2\commons-cli-1.2.jar
commons-httpclient\commons-httpclient\3.0.1\commons-httpclient-3.0.1.jar
commons-logging\commons-logging\1.0.3\commons-logging-1.0.3.jar
commons-codec\commons-codec\1.4\commons-codec-1.4.jar
commons-configuration\commons-configuration\1.6\commons-configuration-1.6.jar
commons-collections\commons-collections\3.2.1\commons-collections-3.2.1.jar
commons-digester\commons-digester\1.8\commons-digester-1.8.jar
commons-beanutils\commons-beanutils\1.7.0\commons-beanutils-1.7.0.jar
commons-beanutils\commons-beanutils-core\1.8.0\commons-beanutils-core-1.8.0.jar
org\codehaus\jackson\jackson-core-asl\1.8.2\jackson-core-asl-1.8.2.jar
org\codehaus\jackson\jackson-mapper-asl\1.8.2\jackson-mapper-asl-1.8.2.jar
org\slf4j\slf4j-api\1.6.1\slf4j-api-1.6.1.jar
commons-lang\commons-lang\2.6\commons-lang-2.6.jar
org\uncommons\watchmaker\watchmaker-framework\0.6.2\watchmaker-framework-0.6.2.jar
com\thoughtworks\xstream\xstream\1.3.1\xstream-1.3.1.jar
xpp3\xpp3_min\1.1.4c\xpp3_min-1.1.4c.jar
org\apache\lucene\lucene-core\3.4.0\lucene-core-3.4.0.jar
org\apache\lucene\lucene-analyzers\3.4.0\lucene-analyzers-3.4.0.jar
org\apache\mahout\commons\commons-cli\2.0-mahout\commons-cli-2.0-mahout.jar
org\apache\commons\commons-math\2.2\commons-math-2.2.jar
org\apache\mahout\mahout-integration\0.6\mahout-integration-0.6.jar
commons-dbcp\commons-dbcp\1.4\commons-dbcp-1.4.jar
commons-pool\commons-pool\1.5.6\commons-pool-1.5.6.jar
org\apache\solr\solr-commons-csv\3.4.0\solr-commons-csv-3.4.0.jar
org\mongodb\mongo-java-driver\2.5\mongo-java-driver-2.5.jar
org\mongodb\bson\2.5\bson-2.5.jar
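For installing Hadoop, see the related articles under the Hadoop category of this blog.
Writing a Mahout program
Step 1: Create a text file named intro.txt, copy the following content into it, and change the extension to .csv. Each line has the form userID,itemID,preference. In real development we would obtain similar log data to use as the input file.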
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,3.0
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0
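To make the record layout concrete, here is a minimal JDK-only sketch that reads such lines as userID,itemID,preference and groups them by user. The class and method names (PreferenceParser, parse) are made up for this illustration and are not part of Mahout; in the actual program this work is done by FileDataModel.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not a Mahout class.
class PreferenceParser {

    /** Parses "userID,itemID,preference" lines into userID -> (itemID -> preference). */
    static Map<Long, Map<Long, Double>> parse(List<String> lines) {
        Map<Long, Map<Long, Double>> prefs = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split(",");
            long userId = Long.parseLong(fields[0]);
            long itemId = Long.parseLong(fields[1]);
            double value = Double.parseDouble(fields[2]);
            // Create the per-user map on first sight of the user, then store the rating.
            prefs.computeIfAbsent(userId, k -> new LinkedHashMap<>()).put(itemId, value);
        }
        return prefs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,101,5.0", "1,102,3.0", "1,103,2.5", "2,101,2.0");
        // prints {1={101=5.0, 102=3.0, 103=2.5}, 2={101=2.0}}
        System.out.println(parse(lines));
    }
}
```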
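Step 2: Write a user-based collaborative filtering program
Requirement: recommend items to the user with ID 1; the number of items to recommend is 1.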
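import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

class RecommenderIntro {

    final static int NEIGHBORHOOD_NUM = 2;
    final static int USER_ID = 1;
    final static int RECOMMEND_NUM = 1;

    public static void main(String[] args) throws IOException, TasteException {
        /** Build the data model from the file; make sure the file path is correct */
        DataModel model = new FileDataModel(new File("intro.csv"));
        /** User similarity */
        UserSimilarity user = new PearsonCorrelationSimilarity(model);
        /** Neighborhood: the NEIGHBORHOOD_NUM most similar users */
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(NEIGHBORHOOD_NUM, user, model);
        /** Create the recommender */
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, user);
        /** Recommend 1 item to the user with ID 1 */
        List<RecommendedItem> recommendations = recommender.recommend(USER_ID, RECOMMEND_NUM);
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}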
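To show roughly what the similarity and neighborhood steps compute, here is a JDK-only sketch of a plain Pearson correlation over the items two users have both rated, followed by picking the two most similar users. The names PearsonSketch, sampleData, pearson, and nearest are invented for this illustration, and Mahout's own classes may differ in detail; treat it as an approximation of the idea rather than the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not Mahout code.
class PearsonSketch {

    /** The sample ratings from intro.csv: userID -> (itemID -> preference). */
    static Map<Integer, Map<Integer, Double>> sampleData() {
        double[][] rows = {
            {1, 101, 5.0}, {1, 102, 3.0}, {1, 103, 2.5},
            {2, 101, 2.0}, {2, 102, 3.0}, {2, 103, 5.0}, {2, 104, 2.0},
            {3, 101, 2.5}, {3, 104, 4.0}, {3, 105, 4.5}, {3, 107, 5.0},
            {4, 101, 5.0}, {4, 103, 3.0}, {4, 104, 4.5}, {4, 106, 4.0},
            {5, 101, 4.0}, {5, 102, 3.0}, {5, 103, 2.0}, {5, 104, 4.0},
            {5, 105, 3.5}, {5, 106, 4.0}
        };
        Map<Integer, Map<Integer, Double>> data = new HashMap<>();
        for (double[] r : rows) {
            data.computeIfAbsent((int) r[0], k -> new HashMap<>()).put((int) r[1], r[2]);
        }
        return data;
    }

    /** Pearson correlation over co-rated items; NaN when it is undefined. */
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<double[]> pairs = new ArrayList<>();
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                pairs.add(new double[] {e.getValue(), other});
            }
        }
        int n = pairs.size();
        if (n == 0) {
            return Double.NaN;
        }
        double meanX = 0.0, meanY = 0.0;
        for (double[] p : pairs) {
            meanX += p[0] / n;
            meanY += p[1] / n;
        }
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (double[] p : pairs) {
            double dx = p[0] - meanX, dy = p[1] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        double denom = Math.sqrt(sxx * syy);
        return denom == 0.0 ? Double.NaN : sxy / denom;
    }

    /** The n users most similar to userId, most similar first; undefined similarities are skipped. */
    static List<Integer> nearest(Map<Integer, Map<Integer, Double>> data, int userId, int n) {
        List<Integer> candidates = new ArrayList<>();
        for (Integer other : data.keySet()) {
            if (other != userId && !Double.isNaN(pearson(data.get(userId), data.get(other)))) {
                candidates.add(other);
            }
        }
        candidates.sort((u, v) -> Double.compare(
                pearson(data.get(userId), data.get(v)),
                pearson(data.get(userId), data.get(u))));
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }

    public static void main(String[] args) {
        // prints [4, 5]
        System.out.println(nearest(sampleData(), 1, 2));
    }
}
```

On the sample data, user 3 shares only item 101 with user 1, so their correlation is undefined and user 3 is skipped. User 2 correlates negatively with user 1, which leaves users 4 and 5 as the two neighbors.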
Recommendation result
RecommendedItem[item:104, value:4.257081]
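The value 4.257081 can be reproduced by hand, assuming the recommender estimates a preference as the similarity-weighted average of the neighbors' ratings. With the Pearson formula applied to the co-rated items, the two nearest neighbors of user 1 are user 4 (similarity 1.0) and user 5 (similarity 2.5/√7 ≈ 0.9449); they rated item 104 as 4.5 and 4.0. The sketch below uses only the JDK, and the names EstimateSketch and estimate are hypothetical.

```java
import java.util.Locale;

// Hypothetical illustration; not Mahout code.
class EstimateSketch {

    /** Similarity-weighted average of the neighbors' ratings for one item. */
    static double estimate(double[] similarities, double[] ratings) {
        double weightedSum = 0.0;
        double totalSimilarity = 0.0;
        for (int i = 0; i < ratings.length; i++) {
            weightedSum += similarities[i] * ratings[i];
            totalSimilarity += similarities[i];
        }
        return weightedSum / totalSimilarity;
    }

    public static void main(String[] args) {
        // Assumed Pearson similarities between user 1 and its two neighbors.
        double simUser4 = 1.0;
        double simUser5 = 2.5 / Math.sqrt(7.0);
        double[] sims = {simUser4, simUser5};

        // Users 4 and 5 rated item 104 as 4.5 and 4.0.
        // prints item 104: 4.257081
        System.out.println(String.format(Locale.ROOT, "item 104: %.6f",
                estimate(sims, new double[] {4.5, 4.0})));
    }
}
```

By the same calculation item 106, rated 4.0 by both neighbors, comes out at 4.0, so item 104 ranks first and is the single item returned.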