Prerequisite: the distributed Hadoop environment is already installed.
Download the mahout-distribution-0.8.tar.gz package from
http://archive.apache.org/dist/mahout/
Extract the archive and rename the resulting directory
tar -zxvf mahout-distribution-0.8.tar.gz
mv mahout-distribution-0.8 mahout
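The download and extract steps can be sketched as one shell function. This is only a minimal sketch: the helper name install_mahout is hypothetical, the 0.8/ subdirectory in the URL is an assumption about how the archive is laid out, and wget is assumed to be installed.

```shell
# Assumptions: the archive keeps the tarball under a 0.8/ subdirectory,
# and wget is available. install_mahout is a hypothetical helper name.
MAHOUT_VERSION=0.8
MAHOUT_TARBALL="mahout-distribution-${MAHOUT_VERSION}.tar.gz"
MAHOUT_URL="http://archive.apache.org/dist/mahout/${MAHOUT_VERSION}/${MAHOUT_TARBALL}"

# Usage: install_mahout <target-dir>   e.g. install_mahout /home/hadoop
# The body runs in a subshell so the caller's working directory is untouched.
install_mahout() (
  cd "$1" || exit 1
  # Download only if the tarball is not already present.
  [ -f "$MAHOUT_TARBALL" ] || wget "$MAHOUT_URL" || exit 1
  # Unpack, then rename the versioned directory to plain "mahout".
  tar -zxf "$MAHOUT_TARBALL" || exit 1
  mv "mahout-distribution-${MAHOUT_VERSION}" mahout
)
```

Calling install_mahout /home/hadoop would leave the distribution in /home/hadoop/mahout, which is the location the environment variables in this document point to.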
Declare the environment variables
vim .bash_profile
#mahout-distribution-0.8
export MAHOUT_HOME=/home/hadoop/mahout
export MAHOUT_BIN=/home/hadoop/mahout/bin
export PATH=$PATH:$MAHOUT_HOME/bin
source .bash_profile
mahout --version
Running on hadoop, using /home/hadoop/hadoop/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /home/hadoop/mahout/mahout-examples-0.8-job.jar
16/02/03 15:55:44 WARN driver.MahoutDriver: Unable to add class: --version
16/02/03 15:55:44 WARN driver.MahoutDriver: No --version.props found on classpath, will use command-line arguments only
Unknown program '--version' chosen.
Valid program names are:.............
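The output above shows the launcher starting but rejecting --version as an unknown program, so it is not a clean installation check. The variables declared in .bash_profile can be checked directly instead. A minimal sketch, where check_mahout_env is a hypothetical helper and not part of Mahout:

```shell
# check_mahout_env is a hypothetical helper, not shipped with Mahout.
check_mahout_env() {
  if [ -z "${MAHOUT_HOME:-}" ]; then
    echo "MAHOUT_HOME is not set"
    return 1
  fi
  if [ ! -x "$MAHOUT_HOME/bin/mahout" ]; then
    echo "no executable launcher at $MAHOUT_HOME/bin/mahout"
    return 1
  fi
  # PATH must contain $MAHOUT_HOME/bin as a whole entry.
  case ":$PATH:" in
    *":$MAHOUT_HOME/bin:"*) ;;
    *) echo "$MAHOUT_HOME/bin is not on PATH"; return 1 ;;
  esac
  echo "MAHOUT_HOME=$MAHOUT_HOME looks consistent"
}
```

Running check_mahout_env after source .bash_profile reports the first problem it finds, or confirms that the launcher is reachable through PATH.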
cp /home/hadoop/hadoop-core-1.2.1.jar mahout/lib/hadoop
Then delete the hadoop-core jar that was originally in mahout/lib/hadoop
Download the data file synthetic_control.data to use as input
Download URL: http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
Upload it to HDFS
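The notes do not show the upload commands themselves. One way to do it, as a sketch: the HDFS directory is taken from the listing in this document, while the helper name upload_testdata is hypothetical.

```shell
# The URL and HDFS path are taken from this document; upload_testdata is a
# hypothetical helper name.
DATA_URL="http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data"
HDFS_DIR="/user/hadoop/testdata"

# Usage: upload_testdata <local-file>
upload_testdata() {
  # Create the HDFS directory only if it is missing.
  hadoop fs -test -d "$HDFS_DIR" || hadoop fs -mkdir "$HDFS_DIR" || return 1
  # Copy the local file into the HDFS directory.
  hadoop fs -put "$1" "$HDFS_DIR"
}

# Typical use on the cluster (not executed here):
#   wget "$DATA_URL"
#   upload_testdata synthetic_control.data
```

The existence test keeps the function safe to call again when the directory was already created by an earlier attempt.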
hadoop fs -ls /user/hadoop/testdata
Found 1 items
-rw-r--r-- 1 hadoop supergroup 288374 2016-02-03 15:38 /user/hadoop/testdata/synthetic_control.data
Run the job
hadoop jar /home/hadoop/mahout/mahout-examples-0.8-job.jar org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
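The job is started without arguments, and judging by the listings in this document it reads from testdata and writes to output under /user/hadoop. A small wrapper can refuse to start when the input is missing. This is a sketch only; run_kmeans_example is a hypothetical name, and the paths are the ones used in these notes.

```shell
# Paths come from this document; run_kmeans_example is a hypothetical wrapper.
MAHOUT_JOB_JAR="/home/hadoop/mahout/mahout-examples-0.8-job.jar"
KMEANS_CLASS="org.apache.mahout.clustering.syntheticcontrol.kmeans.Job"
INPUT_FILE="/user/hadoop/testdata/synthetic_control.data"

run_kmeans_example() {
  # Refuse to start if the input file is not in HDFS yet.
  if ! hadoop fs -test -e "$INPUT_FILE"; then
    echo "missing input: $INPUT_FILE"
    return 1
  fi
  hadoop jar "$MAHOUT_JOB_JAR" "$KMEANS_CLASS"
}
```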
Output directory generated by the job
hadoop fs -ls /user/hadoop/output
Found 14 items
-rw-r--r-- 1 hadoop supergroup 194 2016-02-03 15:45 /user/hadoop/output/_policy
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:45 /user/hadoop/output/clusteredPoints
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:41 /user/hadoop/output/clusters-0
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:41 /user/hadoop/output/clusters-1
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:42 /user/hadoop/output/clusters-2
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:42 /user/hadoop/output/clusters-3
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:42 /user/hadoop/output/clusters-4
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:43 /user/hadoop/output/clusters-5
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:43 /user/hadoop/output/clusters-6
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:44 /user/hadoop/output/clusters-7
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:44 /user/hadoop/output/clusters-8
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:45 /user/hadoop/output/clusters-9-final
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:41 /user/hadoop/output/data
drwxr-xr-x - hadoop supergroup 0 2016-02-03 15:41 /user/hadoop/output/random-seeds
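In the listing above, the last iteration is the directory whose name ends in -final (clusters-9-final here). The number of iterations can differ between runs, so the name can be picked out of the listing rather than hard-coded. A minimal sketch; final_clusters_dir is a hypothetical helper.

```shell
# final_clusters_dir is a hypothetical helper; the "-final" suffix is taken
# from the listing in this document (clusters-9-final).
# Usage: final_clusters_dir <hdfs-output-dir>
final_clusters_dir() {
  # The path is the last field of each "hadoop fs -ls" line.
  hadoop fs -ls "$1" | awk '$NF ~ /\/clusters-[0-9]+-final$/ { print $NF }'
}
```

For the run shown in these notes, final_clusters_dir /user/hadoop/output would print the clusters-9-final path.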