DevOps(5) Spark Deployment on VM
1. Old Environment
1.1 JDK
java version "1.6.0_45"
Switch the Java version on Ubuntu.
>sudo update-alternatives --config java
Set JAVA_HOME on Ubuntu.
>vi ~/.profile
export JAVA_HOME="/usr/lib/jvm/java-6-oracle"
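Java Compile Version Problem
[warn] Error reading API from class file : java.lang.UnsupportedClassVersionError: com/digby/localpoint/auth/util/Base64$OutputStream : Unsupported major.minor version 51.0
Class file version 51.0 corresponds to Java 7, so this class was compiled for a newer Java than the Java 6 runtime in use. Make sure java and javac point to the same version.
>sudo update-alternatives --config java
>sudo update-alternatives --config javac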
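To find out which classes trigger this error, it can help to read the version straight from the class file. The sketch below assumes the standard class file layout (4 magic bytes, 2 bytes minor version, 2 bytes major version); both function names are hypothetical helpers, not JDK tools.

```shell
#!/bin/sh
# Sketch: report which Java release a compiled class file targets.
# Function names are hypothetical, not part of the JDK.

# Print the class-file major version, stored big-endian in bytes 7-8 of the file.
class_major_version() {
    # od prints the two bytes as unsigned decimals, e.g. "  0  51";
    # the unquoted substitution is split on purpose into $1 and $2.
    set -- $(od -An -j6 -N2 -tu1 "$1")
    echo $(( $1 * 256 + $2 ))
}

# Map a major version to a Java release.
# Valid for Java 5 and later: 49 -> 5, 50 -> 6, 51 -> 7.
java_release_for_major() {
    echo $(( $1 - 44 ))
}
```

For example, `java_release_for_major "$(class_major_version SomeClass.class)"` prints the release number for SomeClass.class (a placeholder file name).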
1.2 Cassandra
Cassandra version 1.2.13
>sudo mkdir -p /var/log/cassandra
>sudo chown -R carl /var/log/cassandra
carl is my username.
>sudo mkdir -p /var/lib/cassandra
>sudo chown -R carl /var/lib/cassandra
Change the config if needed, then start Cassandra in single-node mode.
>cassandra -f conf/cassandra.yaml
Test it from a client.
>cassandra-cli -host ubuntu-dev1 -port 9160
To set up multiple nodes, change the following in conf/cassandra.yaml:
listen_address: ubuntu-dev1
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "ubuntu-dev1,ubuntu-dev2"
Make this change on both nodes, ubuntu-dev1 and ubuntu-dev2, with listen_address set to each node's own hostname.
Start the two nodes in the background.
>nohup cassandra -f conf/cassandra.yaml &
Verify that the cluster is working.
>nodetool -h ubuntu-dev1 ring
Datacenter: datacenter1
==========
Address Rack Status State Load Owns Token
7068820527558753619
10.190.191.195 rack1 Up Normal 132.34 KB 36.12% -4714763636920163240
10.190.190.190 rack1 Up Normal 65.18 KB 63.88% 7068820527558753619
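The ring check can also be scripted. This is a minimal sketch, assuming the column order shown in the output above (Address, Rack, Status, State, ...); count_up_nodes is a hypothetical helper that reads the ring output on stdin.

```shell
#!/bin/sh
# Sketch: count the nodes that nodetool reports as "Up" and "Normal".
# count_up_nodes is a hypothetical helper; it assumes Status is the third
# column and State is the fourth, as in the ring output above.
count_up_nodes() {
    awk '$3 == "Up" && $4 == "Normal" { n++ } END { print n + 0 }'
}
```

Usage: `nodetool -h ubuntu-dev1 ring | count_up_nodes` should print 2 once both nodes have joined.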
1.3 Spark
I am choosing this old version.
spark-0.9.0-incubating-bin-hadoop1.tgz
Unpack it into the install directory; the config below uses /opt/spark.
Set up key-based SSH access from the master to the slaves.
On the master
>ssh-keygen -t rsa
>cat ~/.ssh/id_rsa.pub
On each slave
>mkdir ~/.ssh
>vi ~/.ssh/authorized_keys
Paste in the public key from the master's id_rsa.pub.
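Pasting the key by hand works, but the same step can be scripted. A minimal sketch follows; install_pubkey is a hypothetical helper, and the 700/600 permissions are the conventional restrictive settings for ~/.ssh and authorized_keys. Where it is available, `ssh-copy-id user@host` from OpenSSH does the same job from the master side.

```shell
#!/bin/sh
# Sketch: append a public key to authorized_keys with restrictive permissions.
# install_pubkey is a hypothetical helper. Arguments:
#   $1  path to the public key file copied from the master
#   $2  target .ssh directory (defaults to ~/.ssh)
install_pubkey() {
    key_file=$1
    ssh_dir=${2:-"$HOME/.ssh"}
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # Append the key only if that exact line is not already present.
    grep -qxF -f "$key_file" "$ssh_dir/authorized_keys" ||
        cat "$key_file" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Running it twice with the same key leaves a single entry, so it is safe to repeat on every slave.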
Configure Spark in /opt/spark/conf/spark-env.sh
SCALA_HOME=/opt/scala/scala-2.10.3
SPARK_WORKER_MEMORY=512m
#SPARK_CLASSPATH='/opt/localpoint-profiles-spark/*jar'
#SPARK_JAVA_OPTS="-Dbuild.env=lmm.sdprod"
USER=carl
List the worker hosts in /opt/spark/conf/slaves
ubuntu-dev1
ubuntu-dev2
Command to start the Spark master and workers
>sbin/start-all.sh
Command to run a job in Spark single mode
>java -Dbuild.env=sillycat.dev -cp "/opt/YOUR_PROJECT/lib/*" com.sillycat.YOUR_CLASS
Command to run the same job against the Spark master
>java -Dbuild.env=sillycat.dev -Dsparkcontext.Master="spark://YOURSERVER:7070" -cp "/opt/YOUR_PROJECT/lib/*" com.sillycat.YOUR_CLASS
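A note on the classpath: an unquoted lib/*.jar is expanded by the shell into several separate arguments, so java takes only the first jar as the classpath and treats the next one as the main class. Java 6 and later accept a quoted directory wildcard such as "lib/*"; the other option is to join the jars explicitly. A minimal sketch of the second approach, where build_classpath is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: build an explicit, colon-separated classpath from every jar in a directory.
# build_classpath is a hypothetical helper, not part of Spark or the JDK.
build_classpath() {
    cp=""
    for jar in "$1"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no jars.
        [ -e "$jar" ] || continue
        cp="${cp:+$cp:}$jar"
    done
    echo "$cp"
}
```

Usage: `java -cp "$(build_classpath /opt/YOUR_PROJECT/lib)" com.sillycat.YOUR_CLASS`.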
Visit the homepage for Spark Master
3. Prepare MySQL
>sudo apt-get install software-properties-common
>sudo add-apt-repository ppa:ondrej/mysql-5.6
>sudo apt-get update
>sudo apt-get install mysql-server
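Commands to grant access to the database and set the password
>use mysql;
>grant all privileges on test.* to root@"%" identified by 'kaishi';
>flush privileges;
On the client machine, installing only the MySQL client is enough.
>sudo apt-get install mysql-client-core-5.6
Change the bind address in /etc/mysql/my.cnf. The default below accepts local connections only; set it to the server's own address (or 0.0.0.0) so that remote clients can connect.
>sudo vi /etc/mysql/my.cnf
bind-address = 127.0.0.1
Restart MySQL to apply the change.
>sudo service mysql stop
>sudo service mysql start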
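If the same change has to be made on several VMs, the edit can be scripted instead of done in vi. This is only a sketch: set_bind_address is a hypothetical helper, and 0.0.0.0 (listen on all interfaces) is an assumed value that may be too open for anything beyond a test VM.

```shell
#!/bin/sh
# Sketch: rewrite the bind-address line of a MySQL config file.
# set_bind_address is a hypothetical helper. Arguments:
#   $1  config file (for example /etc/mysql/my.cnf)
#   $2  new address (for example 0.0.0.0 to listen on all interfaces)
set_bind_address() {
    # Write to a temporary copy first so a failed sed never truncates the config.
    tmp=$(mktemp) || return 1
    sed "s/^bind-address[[:space:]]*=.*/bind-address = $2/" "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

It needs write access to the config file, so on a real server it would be run as root, followed by the MySQL restart.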
4. Install Grails
Download from here, I am using an old version.
>wget
5. Install tomcat on Master
>wget
Configure the database in TOMCAT_HOME/conf/context.xml
<Resource name="jdbc/lmm" auth="Container" type="javax.sql.DataSource"
maxIdle="30" maxWait="-1" maxActive="100"
factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
testOnBorrow="true"
validationQuery="select 1"
logAbandoned="true"
username="root"
password="kaishi"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/lmm?autoReconnect=true&amp;useServerPrepStmts=false&amp;rewriteBatchedStatements=true"/>
Download the right MySQL driver and place it in the lib directory.
>ls -l lib | grep mysql
-rw-r--r-- 1 carl carl 786484 Dec 10 09:30 mysql-connector-java-5.1.16.jar
Change the config to avoid OutOfMemoryError.
>vi bin/catalina.sh
JAVA_OPTS="$JAVA_OPTS -Xms2048m -Xmx2048m -XX:PermSize=256m -XX:MaxPermSize=512m"
6. Running Assembly Jar File
Build the assembly jar and place it in the lib directory, then create a shell script in the bin directory.
>cat bin/startup.sh
#!/bin/bash
java -Xms512m -Xmx1024m -Dbuild.env=lmm.sparkvm -Dspray.can.server.request-timeout=300s -Dspray.can.server.idle-timeout=360s -cp "/opt/YOUR_MODULE/lib/*" com.sillycat.YOURPACKAGE.YOURMAINCLASS
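Set up the Bouncy Castle jar
Place the Bouncy Castle provider jar in the JRE extension directory.
>cd /usr/lib/jvm/java-6-oracle/jre/lib/ext
Then register the provider in java.security.
>cd /usr/lib/jvm/java-6-oracle/jre/lib/security
>sudo vi java.security
security.provider.9=org.bouncycastle.jce.provider.BouncyCastleProvider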
7. JCE Problem
Download jce_policy-6.zip from
http://www.oracle.com/technetwork/java/javase/downloads/jce-6-download-429243.html
Unzip the file and place the policy jars into /usr/lib/jvm/java-6-oracle/jre/lib/security.
8. Command to Check data in cqlsh
Connect to Cassandra
>cqlsh localhost 9160
Check the keyspaces
cqlsh> select * from system.schema_keyspaces;
Check the version
cqlsh> show version
[cqlsh 3.1.8 | Cassandra 1.2.13 | CQL spec 3.0.0 | Thrift protocol 19.36.2]
Switch to a keyspace, which is similar to a database
cqlsh> use device_lookup;
Check the table
cqlsh:device_lookup> select count(*) from profile_devices limit 300000;
During testing, clear the data if needed
cqlsh:device_lookup> delete from profile_devices where deviceid = 'ios1009528' and brandcode = 'spark' and profileid = 5;
cqlsh:device_lookup> delete from profile_devices where brandcode = 'spark' and profileid = 5;
Deployment Option One
1. Add a Kryo registrator class.
package com.sillycat.easyspark.profile

import com.esotericsoftware.kryo.Kryo
import com.sillycat.easyspark.model.Attributes
import com.sillycat.easyspark.model.Profile
import org.apache.spark.serializer.KryoRegistrator

// Registers the model classes with Kryo so Spark can serialize them.
class ProfileKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[Attributes])
    kryo.register(classOf[Profile])
  }
}
Change the configuration and create the SparkContext as follows:
// Assumed imports: ConfigFactory comes from the Typesafe Config library.
import com.typesafe.config.ConfigFactory
import org.apache.spark.{SparkConf, SparkContext}

val config = ConfigFactory.load()
val conf = new SparkConf()
conf.setMaster(config.getString("sparkcontext.Master"))
conf.setAppName("Profile Device Update")
conf.setSparkHome(config.getString("sparkcontext.Home"))
if (config.hasPath("jobJar")) {
  // Ship the configured job jar to the workers.
  conf.setJars(List(config.getString("jobJar")))
} else {
  // Fall back to the jar that contains this class.
  conf.setJars(SparkContext.jarOfClass(this.getClass).toSeq)
}
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", "com.sillycat.easyspark.profile.ProfileKryoRegistrator")
val sc = new SparkContext(conf)
It works.
Tips
1. Command to Unzip the jar file
>jar xf jar-file