
Installing Hadoop on 64-bit Ubuntu in VMware Workstation

  • 1. Install VMware Workstation Pro 15 on Windows 10.
  • 2. Download 64-bit Ubuntu from the official website
  • 3. Install Ubuntu in the virtual machine
  • 4. Install Hadoop
    • 4.1 Install Hadoop on Ubuntu (PDF)
    • 4.2 Install the JDK. Choose Oracle JDK; OpenJDK did not work for me.
    • 4.3 Install Hadoop (PDF)
    • 4.4 Install Eclipse on Ubuntu
    • 4.5 Install Hadoop-Eclipse-Plugin (PDF)
    • 4.6 Install Apache Pig
    • 4.7 Run Apache Pig Script

1. Install VMware Workstation Pro 15 on Windows 10.


2. Download 64-bit Ubuntu from the official website


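3. Install Ubuntu in the virtual machine

Just follow the installer prompts step by step. To copy, paste, and share files between the host and the virtual machine, set up a shared folder and install VMware Tools. This step caused many problems; the fix is to apply the VMware Tools patch. If the shared folder cannot be mounted, see here: Ubuntu forum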

4. Install Hadoop

4.1 Install Hadoop on Ubuntu (PDF)

Download the stable Hadoop release:

Note: If the VM cannot connect to the Wi-Fi network, reconnect the Wi-Fi so that it re-authorizes. Switching to a bridged connection did not work for me.

4.2 Install the JDK. Choose Oracle JDK; OpenJDK did not work for me.


Use the second method: go to the official site and download the latest version of the JDK: http://www.oracle.com/technetwork/java/javase/downloads/index.html

sudo mkdir /usr/lib/jvm
sudo tar -zxvf jdk-9.0.1_linux-x64_bin.tar.gz -C /usr/lib/jvm
vim ~/.bashrc (in vim: press i to enter insert mode, Esc to leave it, and :wq to save and exit)

#set oracle jdk environment
export JAVA_HOME=/usr/lib/jvm/jdk-9.0.1
export PATH=$JAVA_HOME/bin:$PATH
source ~/.bashrc
java -version
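
As a sketch of the environment setup above, the following helper prints the two lines that would go into ~/.bashrc for a given JDK directory. The function name jdk_env_lines is hypothetical, and the PATH line is an assumption added so that the java command resolves to the unpacked JDK.

```shell
# Hypothetical helper (not part of the JDK or Ubuntu): print the ~/.bashrc
# lines for a JDK unpacked under the given directory.
jdk_env_lines() {
  jdk_dir="$1"
  # JAVA_HOME must not contain a space between the directory and the JDK name.
  printf 'export JAVA_HOME=%s\n' "$jdk_dir"
  # Assumption: put the JDK's bin directory first so `java` resolves to it.
  printf '%s\n' 'export PATH=$JAVA_HOME/bin:$PATH'
}

# Preview the lines; append them with `>> ~/.bashrc` once they look right.
jdk_env_lines /usr/lib/jvm/jdk-9.0.1
```

Printing the lines first makes it easy to spot a stray space or a wrong directory name before the file is changed.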

4.3 Install Hadoop (PDF)

4.4 Install Eclipse on Ubuntu

In Ubuntu Software, search for Eclipse and install it. The default directory is /usr/lib/eclipse

4.5 Install Hadoop-Eclipse-Plugin (PDF)

Note: Plugin version 2.6 also worked with Hadoop 2.9.0 in this case.

4.6 Install Apache Pig

Download a release: http://apache.mirrors.spacedump.net/pig/
Note: To restore the PATH environment variable to its default:
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
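
After changing PATH, it can help to confirm that the tools used in the next step are still found. A minimal sketch, assuming a hypothetical helper named missing_tools; it relies only on the standard `command -v` shell builtin:

```shell
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Any name printed here still needs its bin directory added to PATH.
missing_tools java hadoop hdfs pig
```

Tools whose bin directories were only added in ~/.bashrc will be reported until those export lines are sourced again.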

4.7 Run Apache Pig Script

hadoop version
cd $HADOOP_HOME/bin/
hdfs dfs -mkdir hdfs://localhost:9000/Pig_Data
hdfs dfs -put /home/Hadoop/Pig/Pig_Data/student_data.txt hdfs://localhost:9000/Pig_Data/
pig -x local Sample_script.pig
pig -x mapreduce Sample_script.pig
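
The contents of Sample_script.pig are not shown here. As an illustration only, the sketch below writes a minimal Pig Latin script that loads the uploaded file and prints it. The helper name write_sample_script, the comma delimiter, and the column names are all assumptions; adjust them to the real layout of student_data.txt. The HDFS path assumes the file was uploaded to the Pig_Data directory created with mkdir.

```shell
# Hypothetical helper: write a minimal Pig Latin script to the given path
# (defaults to Sample_script.pig in the current directory).
write_sample_script() {
  dest="${1:-Sample_script.pig}"
  cat > "$dest" <<'EOF'
-- Assumed schema and delimiter; change to match student_data.txt.
student = LOAD 'hdfs://localhost:9000/Pig_Data/student_data.txt'
          USING PigStorage(',')
          AS (id:int, firstname:chararray, lastname:chararray);
DUMP student;
EOF
}

# Write a copy to a scratch location for inspection before running it.
write_sample_script "${TMPDIR:-/tmp}/Sample_script.pig"
```

The hdfs:// path is meant for the mapreduce run; for a local-mode run the LOAD path would typically point at a file on the local file system instead.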

Processing time: about 15 minutes for 10 GB of data (4 GB RAM, 2 cores)
