一,hadoop搭建(简单)
到官网下载hadoop地址:https://hadoop.apache.org/releases.html 我使用的是hadoop-2.7.6 将下载的文件拷贝到虚拟机指定文件夹里解压 解压命令 tar-zxvf XXXXXXX 配置环境变量 在这就不累述了
主要配置的有以下几个文件:
core-site.xml:
hdfs-site.xml
hadoop-env.sh: export JAVA_HOME=/java/jdk1.8.0_191 // 修改javahome
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Set Hadoop-specific environment variables here.
# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.
# The java implementation to use.
export JAVA_HOME=/java/jdk1.8.0_191 // 修改javahome
# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done
# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""
# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
mapred-site.xml:
yarn-site.xml:
至此伪分布式hadoop搭建成功
初始化namenode
hadoop namenode -format
分别启动namenode和yarn
start-dfs.sh;和start-yarn.sh.也可以直接使用start-all.sh(已过时)
通过jps 查看
7443 SecondaryNameNode
7955 Jps
7270 DataNode
7720 NodeManager
7133 NameNode
7598 ResourceManager
说明启动成功(页面看到小象)
二 hbase搭建:
hbase官网下载地址http://hbase.apache.org/ 我使用的是 hbase-1.2.9-bin
需要修改的文件:
hbase-site.xml :
regionservers: (单机版不需要修改)
localhost
最后将hadoop 的core-site.xml hdfs-site.xml 拷贝过来就行
zk的搭建在这就不累述了很简单(跳过)
建议所有都配置环境变量
启动zk(zkServer.sh start),启动hadoop(start-dfs.sh,start-yarn.sh),启动hbase
命令:start-hbase.sh
进程:
7443 SecondaryNameNode
8243 HMaster
8101 QuorumPeerMain
7270 DataNode
8759 Jps
7720 NodeManager
8378 HRegionServer
7133 NameNode
7598 ResourceManage
至此hbase搭建成功
三,elasticsearch搭建:官网地址https://www.elastic.co/downloads 使用版本为: elasticsearch-6.5.4
配置环境变量直接启动:bin/elasticsearch
下载chrome 扩展程序 elasticSearch_head
es搭建成功
四 janusgraph 搭建: 下载地址:https://github.com/JanusGraph/janusgraph/releases 使用版本:janusgraph-0.2.2-hadoop2
配置文件修改 拷贝两份 janusgraph-hbase-es.properties 到gremlin-server 重命名为 janusgraph-hbase-es-socketserver.properties和anusgraph-hbase-es-httpserver.properties
分别以两种方式启动 socket和http
janusgraph-hbase-es-socketserver.properties:
# JanusGraph configuration sample: HBase and Elasticsearch
#
# This file connects to HBase using a Zookeeper quorum
# (storage.hostname) consisting solely of localhost. It also connects
# to Elasticsearch running on localhost over Elasticsearch's native "Transport"
# protocol. Zookeeper, the HBase services, and Elasticsearch must already
# be running and available before starting JanusGraph with this file.
# The primary persistence provider used by JanusGraph. This is required.
# It should be set one of JanusGraph's built-in shorthand names for its
# standard storage backends (shorthands: berkeleyje, cassandrathrift,
# cassandra, astyanax, embeddedcassandra, cql, hbase, inmemory) or to the
# full package and classname of a custom/third-party StoreManager
# implementation.
#
# Default: (no default value)
# Data Type: String
# Mutability: LOCAL
gremlin.graph=org.janusgraph.core.JanusGraphFactory
//table名
storage.hbase.table=janusgraph_odms31
后端储存hbase
storage.backend=hbase
# The hostname or comma-separated list of hostnames of storage backend
# servers. This is only applicable to some storage backends, such as
# cassandra and hbase.
#
# Default: 127.0.0.1
# Data Type: class java.lang.String[]
# Mutability: LOCAL
//主机名
storage.hostname=hadoop
# Whether to enable JanusGraph's database-level cache, which is shared
# across all transactions. Enabling this option speeds up traversals by
# holding hot graph elements in memory, but also increases the likelihood
# of reading stale data. Disabling it forces each transaction to
# independently fetch graph elements from storage before reading/writing
# them.
#
# Default: false
# Data Type: Boolean
# Mutability: MASKABLE
cache.db-cache = true
# How long, in milliseconds, database-level cache will keep entries after
# flushing them. This option is only useful on distributed storage
# backends that are capable of acknowledging writes without necessarily
# making them immediately visible.
#
# Default: 50
# Data Type: Integer
# Mutability: GLOBAL_OFFLINE
#
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend. After starting the database for the first
# time, this file's copy of this setting is ignored. Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend. After starting the database for the first
# time, this file's copy of this setting is ignored. Use JanusGraph's
# Management System to read or modify this value after bootstrapping.
es
index.search.backend=elasticsearch
# The hostname or comma-separated list of hostnames of index backend
# servers. This is only applicable to some index backends, such as
# elasticsearch and solr.
#
# Default: 127.0.0.1
# Data Type: class java.lang.String[]
# Mutability: MASKABLEr
es
index.search.hostname=hadoop
拷贝 gremlin-server.ymal 重命名为gremlin-server-socket.ymal:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
//socket模式
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
graph: conf/gremlin-server/janusgraph-hbase-es-server.properties
}
plugins:
- janusgraph.imports
scriptEngines: {
gremlin-groovy: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI],
scripts: [scripts/empty-sample.groovy]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
- { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000},
gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
拷贝 gremlin-server.ymal 重命名为gremlin-server-http.ymal:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphs: {
graph: conf/gremlin-server/janusgraph-hbase-es-httpserver.properties
}
authentication: {
authenticator: org.janusgraph.graphdb.tinkerpop.gremlin.server.auth.SaslAndHMACAuthenticator,
authenticationHandler: org.janusgraph.graphdb.tinkerpop.gremlin.server.handler.SaslAndHMACAuthenticationHandler,
config: {
defaultUsername: user,
defaultPassword: password,
hmacSecret: secret,
credentialsDb: conf/janusgraph-credentials-server.properties
}
}
plugins:
- janusgraph.imports
scriptEngines: {
gremlin-groovy: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI],
scripts: [scripts/empty-sample.groovy]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
- { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000},
gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
拷贝 gremlin-server.ymal 重命名为gremlin-server-socketAndHttp.ymal:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphs: {
graph: conf/gremlin-server/janusgraph-hbase-es-httpserver.properties
}
plugins:
- janusgraph.imports
scriptEngines: {
gremlin-groovy: {
imports: [java.lang.Math],
staticImports: [java.lang.Math.PI],
scripts: [scripts/empty-sample.groovy]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
- { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
consoleReporter: {enabled: true, interval: 180000},
csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
jmxReporter: {enabled: true},
slf4jReporter: {enabled: true, interval: 180000},
gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
三种模式启动大同小异:
,命令:gremlin-server.sh ./conf/gremlin-server/gremlin-server-httpsocket.yaml(其他只需修改配置名即可)
查看 hbase web是否创建表
如期望一样创建成功,至此服务端启动成功
启动客户端进入janusgrap链接服务端:命令如下
./gremlin.sh
:remote connect tinkerpop.server conf/remote.yaml
Configured hadoop/xxxxxxxxx:8182
:remote console
==>All scripts will now be sent to Gremlin Server - [hadoop/192.168.8.88:8182] - type ':remote console' to return to local mode
g=graph.traversal()
graphtraversalsource[standardjanusgraph[hbase:[hadoop]], standard]
g.V()查询所有顶点(参考gremlin语言https://www.gremlin.com/)
链接成功
java 代码链接:
需要的依赖:
org.janusgraph
janusgraph-core
0.3.1
org.janusgraph
janusgraph-es
0.3.1
org.apache.httpcomponents
httpclient
4.5.2
org.apache.tinkerpop
tinkergraph-gremlin
3.3.3
org.apache.commons
commons-math3
3.1
将gremlin-server-httpsocket.yaml 拷贝到工程文件下:
import org.apache.commons.configuration.Configuration;
import org.apache.commons.configuration.ConfigurationException;
import org.apache.commons.configuration.PropertiesConfiguration;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;
import org.janusgraph.core.schema.JanusGraphManagement;
public class testjanusgraph{
private String FileName;
private Configuration conf;
private Graph graph;
private GraphTraversalSource g;
/**
* Constructs a graph app using the given properties.
* @param fileName location of the properties file
*/
public GraphApp(final String file) {
FileName = file;
}
/**
* Opens the graph instance. If the graph instance does not exist, a new
* graph instance is initialized.
*/
public GraphTraversalSource openGraph() throws ConfigurationException {
conf = new PropertiesConfiguration(FileName);
graph = GraphFactory.open(conf);
g = graph.traversal();
return g;
}
/**
* Closes the graph instance.
*/
public void closeGraph() throws Exception {
LOGGER.info("closing graph");
try {
if (g != null) {
g.close();
}
if (graph != null) {
graph.close();
}
} finally {
g = null;
graph = null;
}
}
}
import org.apache.tinkerpop.gremlin.process.traversal.P;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Direction;
import org.apache.tinkerpop.gremlin.structure.Edge;
import org.apache.tinkerpop.gremlin.structure.Property;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.Cardinality;
import org.janusgraph.core.EdgeLabel;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.Multiplicity;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.VertexLabel;
import org.janusgraph.core.attribute.Geoshape;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.JanusGraphManagement.IndexBuilder;
import org.janusgraph.core.schema.Mapping;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;
public class App extends testjanusgraph{
protected static final String MIXED_INDEX_CONFIG_NAME = "search";
protected boolean useMixedIndex;
protected String mixedIndexConfigName;
/**
* Constructs a graph app using the given properties.
*
* @param fileName
* location of the properties file
*/
public App(final String fileName) {
super(fileName);
this.mixedIndexConfigName = MIXED_INDEX_CONFIG_NAME;
}
@Override
public GraphTraversalSource openGraph() throws ConfigurationException {
super.openGraph();
useMixedIndex = useMixedIndex && conf.containsKey("index." + mixedIndexConfigName + ".backend");
return g;
}
@Override
public void dropGraph() throws Exception {
if (graph != null) {
}
}
/**
* 新增顶点.
*
* @param mgmt
* @param vname
* 顶点名称
*/
@Override
public boolean createSchemaVertex(final JanusGraphManagement mgmt, String vname) {
boolean ok = false;
try {
LOGGER.info("creating schema vertex");
mgmt.makeVertexLabel(vname).make();
mgmt.commit();
ok = true;
} catch (Exception e) {
ok = false;
mgmt.rollback();
}
return ok;
}
未完待续