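1. Extract and install
Extract pig-0.10.0.tar.gz into /opt/, rename the directory, give ownership to the hadoop user, and switch to that user:
tar -zxvf pig-0.10.0.tar.gz -C /opt/
mv /opt/pig-0.10.0 /opt/pig
chown -R hadoop:hadoop /opt/pig
su hadoop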
2. Configure the /opt/pig/bin/pig file
Open the pig file and add the export lines shown after its header comment:
#
# The Pig command script
#
# Environment Variables
#
# JAVA_HOME The java implementation to use. Overrides JAVA_HOME.
#
# PIG_CLASSPATH Extra Java CLASSPATH entries.
#
# HADOOP_HOME/HADOOP_PREFIX Environment HADOOP_HOME/HADOOP_PREFIX(0.20.205)
#
# HADOOP_CONF_DIR Hadoop conf dir
#
# PIG_HEAPSIZE The maximum amount of heap to use, in MB.
# Default is 1000.
#
# PIG_OPTS Extra Java runtime options.
#
# PIG_CONF_DIR Alternate conf dir. Default is ${PIG_HOME}/conf.
#
# HBASE_CONF_DIR - Optionally, the HBase configuration to run against
#
export JAVA_HOME=/usr/java/jdk/
export PIG_INSTALL=/opt/pig
export HADOOP_INSTALL=/opt/hadoop
export PATH=$PIG_INSTALL/bin:$HADOOP_INSTALL/bin:$PATH
export PIG_CLASSPATH=$HADOOP_INSTALL/conf
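Before starting Pig, it can help to confirm that these variables point at real directories. The following is a minimal sketch, not part of the original setup: check_dir_var is a hypothetical helper name, and it assumes the same exports are also loaded in the current shell (for example from the hadoop user's profile).

```shell
#!/bin/sh
# Hypothetical helper: report whether the named environment variable
# is set and points at an existing directory.
check_dir_var() {
    name=$1
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory"
        return 1
    fi
    echo "$name=$value looks fine"
}

for v in JAVA_HOME PIG_INSTALL HADOOP_INSTALL; do
    check_dir_var "$v" || true   # keep going so every variable is reported
done
```

If any line reports a missing variable or directory, fix the corresponding export before running ./pig.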
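Copy the test data into HDFS and start Pig with the following commands:
cd /opt/hadoop/bin
./hadoop fs -copyFromLocal /opt/data/test.txt /opt/data/test.txt
cd /opt/pig/bin
./pig
Example: extract the user names and store them in dist.txt. At the grunt prompt, enter:
A = LOAD '/opt/data/test.txt' USING PigStorage('\t') AS (id,name);
dump A;
B = FOREACH A GENERATE name;
STORE B INTO '/opt/data/dist.txt' USING PigStorage();
Then check the result in HDFS (STORE creates dist.txt as a directory of part files):
cd /opt/hadoop/bin
./hadoop fs -ls /opt/data
./hadoop fs -ls /opt/data/dist.txt
./hadoop fs -cat /opt/data/dist.txt/part-m-00000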
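To preview what the name projection should produce, the same column can be pulled out of a local copy of the file with standard tools. This is only a rough local equivalent, not how Pig runs the job: extract_names is a hypothetical helper, and the two sample rows are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the second tab-separated field (the "name"
# column) of each line, roughly what B = FOREACH A GENERATE name; yields.
extract_names() {
    cut -f2 "$1"   # cut splits on tabs by default
}

# Invented sample data in the (id, name) layout used by the example.
sample=$(mktemp)
printf '1\talice\n2\tbob\n' > "$sample"

extract_names "$sample"   # prints "alice" then "bob"

rm -f "$sample"
```

Comparing this output with the contents of the part file in HDFS is a quick way to spot a wrong delimiter or column order.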
Common Pig Latin commands:
LOAD ...... USING PigStorage('') ...... AS ......;
FOREACH ...... GENERATE ......;
FILTER ...... BY ......;
DUMP;
STORE ...... INTO;
GROUP ...... BY;
AND, OR (logical operators for conditions)
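The commands above can be chained into a single script file instead of being typed at the prompt. The sketch below writes such a file from the shell. The typed schema (id:int, name:chararray), the filter condition, and the helper name write_pig_script are assumptions made for illustration; the input path is reused from the earlier example.

```shell
#!/bin/sh
# Hypothetical helper: write a small Pig Latin script to the given path.
# The script body is illustrative and assumes tab-separated (id, name) rows.
write_pig_script() {
    cat > "$1" <<'EOF'
-- Load tab-separated records with an assumed typed schema.
A = LOAD '/opt/data/test.txt' USING PigStorage('\t') AS (id:int, name:chararray);
-- Keep rows that have a positive id and a non-null name.
B = FILTER A BY id > 0 AND name IS NOT NULL;
-- Group by name and count the rows in each group.
C = GROUP B BY name;
D = FOREACH C GENERATE group AS name, COUNT(B) AS n;
DUMP D;
EOF
}

script=$(mktemp)
write_pig_script "$script"
cat "$script"   # show the generated script
```

Assuming Pig is set up as described earlier, the generated file can then be passed to the launcher as an argument, for example ./pig followed by the script path, from /opt/pig/bin.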