I only have an English input method here, so I will keep these notes in English.
Recently I borrowed a book about Spark from my company's library.
I have learned some basic concepts about Hadoop and Spark. Here I take some notes for Learning Spark.
1. Install
Go to the official site and download the latest Spark tar file:
http://spark.apache.org/downloads.html
You can un-tar the file and move it to wherever you want Spark installed.
Here I put it under /usr/local/spark.
Commands:
cd /usr/local/
sudo cp /home/kevin/Downloads/sparkXXX.tgz ./
sudo tar -xf sparkXXX.tgz
sudo mv sparkXXX spark
Now you can do a quick test to verify your installation.
cd spark
There is a README.md file in this directory. We will count its lines in the Python shell (count() returns the number of lines in the file, not the number of words).
./bin/pyspark
>>> test = sc.textFile("README.md")
>>> test.count()
[here show the log message and the result]
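Note that count() on the result of sc.textFile gives the number of lines, since each element is one line of the file. To get a word count inside the pyspark shell, one option is to split every line first, e.g. test.flatMap(lambda line: line.split()).count(). The difference can be sketched in plain Python without Spark. The function name and the sample lines below are made up for illustration and are not taken from the real README.md.

```python
def count_lines_and_words(lines):
    """Return (number of lines, number of whitespace-separated words)."""
    line_count = 0
    word_count = 0
    for line in lines:
        line_count += 1
        # split() with no argument splits on any whitespace and
        # returns an empty list for a blank line
        word_count += len(line.split())
    return line_count, word_count


# Hypothetical sample text, standing in for the lines of a file
sample = ["# Apache Spark", "", "Spark is a fast engine"]
print(count_lines_and_words(sample))  # prints (3, 8)
```

The first number corresponds to what test.count() reports, the second to what the flatMap version would report.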
You can also open the Spark web UI at 127.0.0.1:4040 while the shell is still running.
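If the page does not load, it can help to check whether anything is listening on that port at all. Below is a minimal sketch using only the Python standard library. The helper name port_is_open is my own, and it only tests that a TCP connection succeeds, not that the listener is really Spark.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (e.g. connection refused,
        # timeout) when nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 4040 is the port used for the web UI in these notes
    print(port_is_open("127.0.0.1", 4040))
```

Run it from a second terminal while ./bin/pyspark is still open. Once the shell exits, the UI goes away with it.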