Pseudo-distributed Hadoop + HBase + JanusGraph + ES setup (for study and exchange only)

1. Hadoop setup (easy)

Download Hadoop from the official site: https://hadoop.apache.org/releases.html (I used hadoop-2.7.6). Copy the downloaded archive to the target directory on the VM and extract it with tar -zxvf <archive> (note the space after tar). Configuring the environment variables is not covered in detail here.
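As a concrete sketch of the extract-and-configure step (the /hadoop target directory and the JDK path follow this guide's layout and are assumptions; adjust them to your machine):

```shell
# Extract the archive -- note the space: `tar -zxvf`, not `tar-zxvf`.
# The guard lets the snippet no-op when the tarball is absent.
if [ -f hadoop-2.7.6.tar.gz ]; then
  tar -zxvf hadoop-2.7.6.tar.gz -C /hadoop
fi

# Environment variables (append these to ~/.bashrc and `source` it):
export JAVA_HOME=/java/jdk1.8.0_191
export HADOOP_HOME=/hadoop/hadoop-2.7.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```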

The main files to configure are the following:

core-site.xml:

    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/hadoop/hadoop-2.7.6/temp</value>
        </property>
    </configuration>

hdfs-site.xml:

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
hadoop-env.sh: set JAVA_HOME to your JDK path, e.g. export JAVA_HOME=/java/jdk1.8.0_191


# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/java/jdk1.8.0_191   # set to your JDK path

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

mapred-site.xml:

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    </configuration>

yarn-site.xml:

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    </configuration>

That completes the configuration of pseudo-distributed Hadoop.

Initialize (format) the NameNode:

hadoop namenode -format (on Hadoop 2.x, hdfs namenode -format is the preferred equivalent)

Start HDFS and YARN separately:

start-dfs.sh and start-yarn.sh; start-all.sh also works but is deprecated.

Check with jps:

7443 SecondaryNameNode
7955 Jps
7270 DataNode
7720 NodeManager
7133 NameNode
7598 ResourceManager

If these processes appear, startup succeeded (the web UI shows the Hadoop elephant).
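Instead of eyeballing the jps listing, the check can be scripted. A small sketch (the daemon names match the five processes listed above):

```shell
# Fails with a message naming the first expected Hadoop daemon that is
# absent from the given jps output; prints a success line otherwise.
check_daemons() {
  local out="$1"
  local d
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w avoids "NameNode" falsely matching the "SecondaryNameNode" line
    if ! printf '%s\n' "$out" | grep -qw "$d"; then
      echo "missing: $d"
      return 1
    fi
  done
  echo "all daemons running"
}

# on a live system: check_daemons "$(jps)"
```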

(Screenshots: Hadoop web UI)

2. HBase setup

Download HBase from the official site http://hbase.apache.org/; I used hbase-1.2.9-bin.

Files that need changes:

hbase-site.xml:

    <configuration>
        <property>
            <name>hbase.rootdir</name>
            <value>hdfs://hadoop:9000/hbase</value>
        </property>
        <property>
            <name>hbase.cluster.distributed</name>
            <value>true</value>
        </property>
        <property>
            <name>hbase.zookeeper.quorum</name>
            <value>hadoop</value>
        </property>
        <property>
            <name>zookeeper.znode.parent</name>
            <value>/hbase</value>
        </property>
        <property>
            <name>hbase.zookeeper.property.dataDir</name>
            <value>/tmp/zookeeper</value>
        </property>
    </configuration>

regionservers (no change needed for a single-node setup):

localhost

Finally, copy Hadoop's core-site.xml and hdfs-site.xml into HBase's conf directory.
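The copy step can be sketched as a small helper (directory names are passed in, so it works for any layout; on a real install the source is $HADOOP_HOME/etc/hadoop and the destination is $HBASE_HOME/conf):

```shell
# copy_hadoop_confs SRC_CONF_DIR DST_CONF_DIR
# Copies the two Hadoop client configs HBase needs to resolve the same HDFS.
copy_hadoop_confs() {
  local f
  for f in core-site.xml hdfs-site.xml; do
    cp "$1/$f" "$2/" || return 1
  done
}

# on a real install: copy_hadoop_confs "$HADOOP_HOME/etc/hadoop" "$HBASE_HOME/conf"
```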

Setting up ZooKeeper is simple and not repeated here (skipped).

Configuring environment variables for every component is recommended.

Start ZooKeeper (zkServer.sh start), then Hadoop (start-dfs.sh, start-yarn.sh), then HBase.

Command: start-hbase.sh
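For reference, shutdown should happen in the reverse order. A sketch (these scripts are on PATH if environment variables were configured as suggested above):

```shell
# Stop in reverse order: HBase first, then YARN/HDFS, then ZooKeeper last,
# so HBase never loses its ZooKeeper quorum or HDFS while shutting down.
stop-hbase.sh
stop-yarn.sh
stop-dfs.sh
zkServer.sh stop
```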

Processes:

7443 SecondaryNameNode
8243 HMaster
8101 QuorumPeerMain
7270 DataNode
8759 Jps
7720 NodeManager
8378 HRegionServer
7133 NameNode
7598 ResourceManager

(Screenshot: HBase web UI)

HBase setup is complete.

3. Elasticsearch setup: download from the official site https://www.elastic.co/downloads; version used: elasticsearch-6.5.4

Configure the environment variables and start it directly: bin/elasticsearch
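Elasticsearch can also be verified without a browser by querying its HTTP root (port 9200 by default). The helper below is a sketch that just extracts the version field from that JSON response:

```shell
# es_version: reads Elasticsearch's root JSON on stdin and prints the
# value of the "number" (version) field.
es_version() {
  sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p' | head -n1
}

# live usage: curl -s http://localhost:9200/ | es_version
```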

Install the elasticsearch-head Chrome extension to browse the cluster.

(Screenshot: elasticsearch-head)

Elasticsearch setup is complete.

4. JanusGraph setup: download from https://github.com/JanusGraph/janusgraph/releases; version used: janusgraph-0.2.2-hadoop2

Configuration changes: copy janusgraph-hbase-es.properties twice into conf/gremlin-server and rename the copies janusgraph-hbase-es-socketserver.properties and janusgraph-hbase-es-httpserver.properties.

These allow starting the server in two modes: WebSocket (socket) and HTTP.

janusgraph-hbase-es-socketserver.properties:

# JanusGraph configuration sample: HBase and Elasticsearch
#
# This file connects to HBase using a Zookeeper quorum
# (storage.hostname) consisting solely of localhost.  It also connects
# to Elasticsearch running on localhost over Elasticsearch's native "Transport"
# protocol.  Zookeeper, the HBase services, and Elasticsearch must already
# be running and available before starting JanusGraph with this file.

# The primary persistence provider used by JanusGraph.  This is required.
# It should be set one of JanusGraph's built-in shorthand names for its
# standard storage backends (shorthands: berkeleyje, cassandrathrift,
# cassandra, astyanax, embeddedcassandra, cql, hbase, inmemory) or to the
# full package and classname of a custom/third-party StoreManager
# implementation.
#
# Default:    (no default value)
# Data Type:  String
# Mutability: LOCAL
gremlin.graph=org.janusgraph.core.JanusGraphFactory

# HBase table to store the graph in
storage.hbase.table=janusgraph_odms31

# Storage backend: HBase
storage.backend=hbase
# The hostname or comma-separated list of hostnames of storage backend
# servers.  This is only applicable to some storage backends, such as
# cassandra and hbase.
#
# Default:    127.0.0.1
# Data Type:  class java.lang.String[]
# Mutability: LOCAL

# Hostname (the ZooKeeper quorum host for HBase)
storage.hostname=hadoop

# Whether to enable JanusGraph's database-level cache, which is shared
# across all transactions. Enabling this option speeds up traversals by
# holding hot graph elements in memory, but also increases the likelihood
# of reading stale data.  Disabling it forces each transaction to
# independently fetch graph elements from storage before reading/writing
# them.
#         
# Default:    false
# Data Type:  Boolean
# Mutability: MASKABLE
cache.db-cache = true

# How long, in milliseconds, database-level cache will keep entries after
# flushing them.  This option is only useful on distributed storage
# backends that are capable of acknowledging writes without necessarily
# making them immediately visible.
#   
# Default:    50
# Data Type:  Integer
# Mutability: GLOBAL_OFFLINE
#   
# Settings with mutability GLOBAL_OFFLINE are centrally managed in
# JanusGraph's storage backend.  After starting the database for the first
# time, this file's copy of this setting is ignored.  Use JanusGraph's
# Management System to read or modify this value after bootstrapping.


# Index backend: Elasticsearch
index.search.backend=elasticsearch

# The hostname or comma-separated list of hostnames of index backend
# servers.  This is only applicable to some index backends, such as
# elasticsearch and solr.
#
# Default:    127.0.0.1
# Data Type:  class java.lang.String[]
# Mutability: MASKABLE

# Elasticsearch hostname
index.search.hostname=hadoop
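Stripped of the sample file's comments, the settings changed in this guide reduce to a handful of lines. The heredoc below recreates a minimal properties file with exactly those values (path relative to the JanusGraph install directory):

```shell
# Recreate a minimal janusgraph-hbase-es-socketserver.properties with only
# the settings this guide actually changes.
mkdir -p conf/gremlin-server
cat > conf/gremlin-server/janusgraph-hbase-es-socketserver.properties <<'EOF'
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=hbase
storage.hostname=hadoop
storage.hbase.table=janusgraph_odms31
cache.db-cache=true
index.search.backend=elasticsearch
index.search.hostname=hadoop
EOF
```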

Copy gremlin-server.yaml and rename the copy gremlin-server-socket.yaml:

host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000

# WebSocket (socket) mode
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/janusgraph-hbase-es-socketserver.properties
}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

Copy gremlin-server.yaml and rename the copy gremlin-server-http.yaml:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
# HTTP mode
channelizer: org.apache.tinkerpop.gremlin.server.channel.HttpChannelizer
graphs: {
  graph: conf/gremlin-server/janusgraph-hbase-es-httpserver.properties
}
authentication: {
  authenticator: org.janusgraph.graphdb.tinkerpop.gremlin.server.auth.SaslAndHMACAuthenticator,
  authenticationHandler: org.janusgraph.graphdb.tinkerpop.gremlin.server.handler.SaslAndHMACAuthenticationHandler,
  config: {
    defaultUsername: user,
    defaultPassword: password,
    hmacSecret: secret,
    credentialsDb: conf/janusgraph-credentials-server.properties
  }
}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

Copy gremlin-server.yaml and rename the copy gremlin-server-socketAndHttp.yaml:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
graphs: {
  graph: conf/gremlin-server/janusgraph-hbase-es-httpserver.properties
}
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536

Starting the server is much the same for all three modes.

Command: gremlin-server.sh ./conf/gremlin-server/gremlin-server-socketAndHttp.yaml (for the other modes, just substitute the corresponding YAML file)
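A start-sequence sketch (the install path is a placeholder, not from this guide; the YAML names are the copies created above):

```shell
cd /path/to/janusgraph-0.2.2-hadoop2   # placeholder: your unpack directory

# Pick one of the three configs created earlier:
#   gremlin-server-socket.yaml, gremlin-server-http.yaml,
#   or gremlin-server-socketAndHttp.yaml
./bin/gremlin-server.sh ./conf/gremlin-server/gremlin-server-socketAndHttp.yaml
```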

(Screenshot: Gremlin Server started)

Check in the HBase web UI that the table was created:

(Screenshot: HBase web UI showing the new table)

The table was created as expected; the server side is now up.

Start the client (Gremlin console) and connect to the server:

./gremlin.sh

:remote connect tinkerpop.server conf/remote.yaml

Configured hadoop/xxxxxxxxx:8182

:remote console

==>All scripts will now be sent to Gremlin Server - [hadoop/192.168.8.88:8182] - type ':remote console' to return to local mode

g=graph.traversal()

graphtraversalsource[standardjanusgraph[hbase:[hadoop]], standard]
g.V() queries all vertices (see the Gremlin language reference at https://tinkerpop.apache.org/gremlin.html)

Connection successful.

Connecting from Java code:

Required dependencies:


   org.janusgraph
   janusgraph-core
   0.3.1


   org.janusgraph
   janusgraph-es
   0.3.1




   org.apache.httpcomponents
   httpclient
   4.5.2

   org.apache.tinkerpop
   tinkergraph-gremlin
   3.3.3

   org.apache.commons
   commons-math3
   3.1

Copy the gremlin-server-httpsocket.yaml configuration into the project directory:

import org.apache.commons.configuration.Configuration;
import org.apache.commons.configuration.ConfigurationException;
import org.apache.commons.configuration.PropertiesConfiguration;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class testjanusgraph {

    // protected so subclasses such as App below can use the same logger and fields
    protected static final Logger LOGGER = LoggerFactory.getLogger(testjanusgraph.class);

    protected String fileName;
    protected Configuration conf;
    protected Graph graph;
    protected GraphTraversalSource g;

    /**
     * Constructs a graph app using the given properties.
     * @param file location of the properties file
     */
    public testjanusgraph(final String file) {
        fileName = file;
    }

    /**
     * Opens the graph instance. If the graph instance does not exist, a new
     * graph instance is initialized.
     */
    public GraphTraversalSource openGraph() throws ConfigurationException {
        conf = new PropertiesConfiguration(fileName);
        graph = GraphFactory.open(conf);
        g = graph.traversal();
        return g;
    }

    /**
     * Closes the graph instance.
     */
    public void closeGraph() throws Exception {
        LOGGER.info("closing graph");
        try {
            if (g != null) {
                g.close();
            }
            if (graph != null) {
                graph.close();
            }
        } finally {
            g = null;
            graph = null;
        }
    }

}

import org.apache.tinkerpop.gremlin.process.traversal.P;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Direction;
import org.apache.tinkerpop.gremlin.structure.Edge;
import org.apache.tinkerpop.gremlin.structure.Property;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.Cardinality;
import org.janusgraph.core.EdgeLabel;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.Multiplicity;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.VertexLabel;
import org.janusgraph.core.attribute.Geoshape;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.JanusGraphManagement.IndexBuilder;
import org.janusgraph.core.schema.Mapping;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;


public class App extends testjanusgraph{


   protected static final String MIXED_INDEX_CONFIG_NAME = "search";
   protected boolean useMixedIndex;
   protected String mixedIndexConfigName;

   /**
    * Constructs a graph app using the given properties.
    * 
    * @param fileName
    *            location of the properties file
    */
   public App(final String fileName) {
      super(fileName);
      this.mixedIndexConfigName = MIXED_INDEX_CONFIG_NAME;
   }

   @Override
   public GraphTraversalSource openGraph() throws ConfigurationException {
      super.openGraph();
      // enable mixed-index features only when the config defines that backend
      useMixedIndex = conf.containsKey("index." + mixedIndexConfigName + ".backend");
      return g;
   }

   public void dropGraph() throws Exception {
      if (graph != null) {
         // JanusGraphFactory.drop removes the graph's stored data (JanusGraph 0.3.x)
         org.janusgraph.core.JanusGraphFactory.drop((JanusGraph) graph);
      }
   }

   /**
    * Creates a vertex label in the schema.
    *
    * @param mgmt  the schema management handle
    * @param vname the vertex label name
    */
   public boolean createSchemaVertex(final JanusGraphManagement mgmt, String vname) {
      boolean ok = false;
      try {
         LOGGER.info("creating schema vertex");
         mgmt.makeVertexLabel(vname).make();
         mgmt.commit();
         ok = true;
      } catch (Exception e) {
         ok = false;
         mgmt.rollback();
      }
      return ok;
   }

To be continued.
