Source: https://ccp.cloudera.com/display/CDHDOC/HBase+Installation
Apache HBase provides large-scale tabular storage for Hadoop using the Hadoop Distributed File System (HDFS). Cloudera recommends installing HBase in a standalone mode before you try to run it on a whole cluster.
Upgrading HBase to the Latest CDH3 Release
The instructions that follow assume that you are upgrading HBase as part of an upgrade to the latest CDH3 release, and have already performed the steps under Upgrading CDH3.
To upgrade HBase to the latest CDH3 release, proceed as follows.
Step 1: Perform a Graceful Cluster Shutdown
To shut HBase down gracefully, stop the Thrift server and clients, then stop the cluster.
- Stop the Thrift server and clients
sudo service hadoop-hbase-thrift stop
- Stop the cluster.
- Use the following command on the master node:
sudo service hadoop-hbase-master stop
- Use the following command on each node hosting a region server:
sudo service hadoop-hbase-regionserver stop
This shuts down the master and the region servers gracefully.
Step 2. Stop the ZooKeeper Server
$ sudo service hadoop-zookeeper-server stop
Step 3: Install the new version of HBase
Follow the directions in the next section, Installing HBase.
Installing HBase
To install HBase on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-hbase
To install HBase On Red Hat-compatible systems:
$ sudo yum install hadoop-hbase
To install HBase on SUSE systems:
$ sudo zypper install hadoop-hbase
To list the installed files on Ubuntu and other Debian systems:
$ dpkg -L hadoop-hbase
To list the installed files on Red Hat and SUSE systems:
$ rpm -ql hadoop-hbase
You can see that the HBase package has been configured to conform to the Linux Filesystem Hierarchy Standard. (To learn more, run man hier.)
HBase wrapper script: /usr/bin/hbase
HBase configuration files: /etc/hbase/conf
HBase JAR and library files: /usr/lib/hbase
HBase log files: /var/log/hbase
HBase service scripts: /etc/init.d/hadoop-hbase-*
You are now ready to enable the server daemons you want to use with Hadoop. Java-based client access is also available by adding the JARs in /usr/lib/hbase/ and /usr/lib/hbase/lib/ to your Java class path.
Host Configuration Settings for HBase
Configuring the REST Port
You can use an init.d script, /etc/init.d/hadoop-hbase-rest, to start the REST server; for example:
/etc/init.d/hadoop-hbase-rest start
The script starts the server by default on port 8080. This is a commonly used port and so may conflict with other applications running on the same host.
If you need to change the port for the REST server, configure it in hbase-site.xml, for example:
<property>
<name>hbase.rest.port</name>
<value>60050</value>
</property>
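As a quick sanity check, the configured value can be pulled back out of the XML with standard shell tools. This is a minimal sketch that parses the property fragment above (inlined here as a here-document; on a real host, point sed at /etc/hbase/conf/hbase-site.xml instead):

```shell
# Minimal sketch: extract the configured REST port from an hbase-site.xml
# fragment. The fragment is inlined for illustration; replace the here-doc
# with /etc/hbase/conf/hbase-site.xml on a real host.
port=$(sed -n 's:.*<value>\([0-9][0-9]*\)</value>.*:\1:p' <<'EOF'
<property>
<name>hbase.rest.port</name>
<value>60050</value>
</property>
EOF
)
echo "REST port: ${port}"
```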
Using DNS with HBase
HBase uses the local hostname to report its IP address. Both forward and reverse DNS resolution should work. If your machine has multiple interfaces, HBase uses the interface that the primary hostname resolves to. If this is insufficient, you can set hbase.regionserver.dns.interface in the hbase-site.xml file to indicate the primary interface. For this setting to work properly, your cluster configuration must be consistent and every host must have the same network interface configuration. As an alternative, you can set hbase.regionserver.dns.nameserver in the hbase-site.xml file to choose a name server other than the system-wide default.
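A quick way to check the forward-resolution requirement is to compare the local hostname against what the resolver returns for it. A minimal sketch (getent is assumed to be available, as it is on most Linux systems):

```shell
# Sketch: verify that the local hostname resolves. On a correctly
# configured host, getent prints the address HBase will report.
host_name=$(hostname)
echo "hostname: ${host_name}"
getent hosts "${host_name}" || echo "warning: forward lookup failed for ${host_name}"
```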
Using the Network Time Protocol (NTP) with HBase
The clocks on cluster members should be in basic alignment. Some skew is tolerable, but excessive skew can cause odd behavior. Run NTP, or an equivalent, on your cluster. If you are having problems querying data or see unusual cluster operations, verify the system time.
Setting User Limits for HBase
Because HBase is a database, it uses a large number of files at the same time. The default ulimit setting of 1024 for the maximum number of open files on Unix systems is insufficient. Any significant amount of loading will result in failures in strange ways and cause the error message java.io.IOException...(Too many open files) to be logged in the HBase or HDFS log files. For more information about this issue, see the Apache HBase Book. You may also notice errors such as:
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901
Configuring ulimit for HBase
Cloudera recommends increasing the maximum number of file handles to more than 10,000. Note that increasing the file handles for the user who is running the HBase process is an operating system configuration, not an HBase configuration. A common mistake is to increase the file handles for a particular user while HBase is actually running as a different user. HBase prints the ulimit it is using on the first line of its logs; make sure that it is correct.
If you are using ulimit, you must make the following configuration changes:
- In the/etc/security/limits.conffile, add the following lines:
hdfs - nofile 32768
hbase - nofile 32768
- To apply the changes in /etc/security/limits.conf on Ubuntu and other Debian systems, add the following line to the /etc/pam.d/common-session file:
session required pam_limits.so
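After logging in again, you can confirm that the new limit took effect; it should match the value HBase prints on the first line of its logs. A minimal sketch:

```shell
# Check the open-file limit for the current shell. After the
# limits.conf change above (and a fresh login), the hdfs and hbase
# users should see 32768 here.
current_limit=$(ulimit -n)
echo "open files limit: ${current_limit}"
```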
Using dfs.datanode.max.xcievers with HBase
A Hadoop HDFS DataNode has an upper bound on the number of files that it can serve at any one time. The upper bound property is called dfs.datanode.max.xcievers (the property is spelled in the code exactly as shown here). Before loading, make sure you have configured the value for dfs.datanode.max.xcievers in the conf/hdfs-site.xml file to at least 4096, as shown below:
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
Be sure to restart HDFS after changing the value for dfs.datanode.max.xcievers. If you don't change the value as described, strange failures can occur and an error message about exceeding the number of xcievers will be added to the DataNode logs. Other error messages about missing blocks are also logged, such as:
10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node:
java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
Starting HBase in Standalone Mode
By default, HBase ships configured for standalone mode. In this mode of operation, a single JVM hosts the HBase Master, an HBase Region Server, and a ZooKeeper quorum peer. To run HBase in standalone mode, you must install the HBase Master package:
Installing the HBase Master for Standalone Operation
To install the HBase Master on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-hbase-master
To install the HBase Master On Red Hat-compatible systems:
$ sudo yum install hadoop-hbase-master
To install the HBase Master on SUSE systems:
$ sudo zypper install hadoop-hbase-master
Starting the HBase Master
On Red Hat and SUSE systems (using .rpm packages), you can now start the HBase Master by using the included service script:
$ sudo /etc/init.d/hadoop-hbase-master start
On Ubuntu systems (using Debian packages) the HBase Master starts when the HBase package is installed.
To verify that the standalone installation is operational, visit http://localhost:60010. The list of Region Servers at the bottom of the page should include one entry for your local machine.
If you see this message when you start the HBase standalone master:
Starting Hadoop HBase master daemon: starting master, logging to /usr/lib/hbase/logs/hbase-hbase-master/cloudera-vm.out
Couldnt start ZK at requested address of 2181, instead got: 2182. Aborting. Why? Because clients (eg shell) wont be able to find this ZK quorum
hbase-master.
you will need to stop the hadoop-zookeeper-server service or uninstall the hadoop-zookeeper-server package.
Accessing HBase by using the HBase Shell
After you have started the standalone installation, you can access the database by using the HBase Shell:
$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version: 0.89.20100621+17, r, Mon Jun 28 10:13:32 PDT 2010
hbase(main):001:0> status 'detailed'
version 0.89.20100621+17
0 regionsInTransition
1 live servers
my-machine:59719 1277750189913
requests=0, regions=2, usedHeap=24, maxHeap=995
.META.,,1
stores=2, storefiles=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
-ROOT-,,0
stores=1, storefiles=1, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0
0 dead servers
Using MapReduce with HBase
To run MapReduce jobs that use HBase, you need to add the HBase and Zookeeper JAR files to the Hadoop Java classpath. You can do this by adding the following statement to each job:
TableMapReduceUtil.addDependencyJars(job);
This distributes the JAR files to the cluster along with your job and adds them to the job's classpath, so that you do not need to edit the MapReduce configuration.
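If you prefer not to call addDependencyJars from the job, the manual alternative is to put the JARs on the Hadoop classpath before submitting. A minimal sketch, assuming the default CDH install location from this guide (the job JAR name in the comment is hypothetical):

```shell
# Sketch: export the HBase JARs (and bundled dependencies such as the
# ZooKeeper JAR) on the Hadoop classpath. /usr/lib/hbase is the CDH
# default location; adjust for your layout.
HBASE_HOME=/usr/lib/hbase
export HADOOP_CLASSPATH="${HBASE_HOME}/*:${HBASE_HOME}/lib/*:${HADOOP_CLASSPATH}"
echo "HADOOP_CLASSPATH=${HADOOP_CLASSPATH}"
# hadoop jar my-hbase-job.jar MyJobClass   # hypothetical job invocation
```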
You can find more information about addDependencyJars in the HBase API documentation.
When getting a Configuration object for an HBase MapReduce job, instantiate it using the HBaseConfiguration.create() method.
Configuring HBase in Pseudo-distributed Mode
Pseudo-distributed mode differs from standalone mode in that each of the component processes runs in a separate JVM.
Modifying the HBase Configuration
To enable pseudo-distributed mode, you must first make some configuration changes. Open /etc/hbase/conf/hbase-site.xml in your editor of choice, and insert the following XML properties between the <configuration> and </configuration> tags. Be sure to replace localhost with the hostname of your HDFS NameNode if it is not running locally.
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost/hbase</value>
</property>
Creating the /hbase Directory in HDFS
Before starting the HBase Master, you need to create the /hbase directory in HDFS. The HBase Master runs as hbase:hbase, so it does not have the required permissions to create a top-level directory.
To create the /hbase directory in HDFS:
$ sudo -u hdfs hadoop fs -mkdir /hbase
$ sudo -u hdfs hadoop fs -chown hbase /hbase
Enabling Servers for Pseudo-distributed Operation
After you have configured HBase, you must enable the various servers that make up a distributed HBase cluster. HBase uses three required types of servers: the ZooKeeper Server, the HBase Master, and HBase Region Servers.
Installing and Starting ZooKeeper Server
HBase uses ZooKeeper Server as a highly available, central location for cluster management. For example, it allows clients to locate the servers, and ensures that only one master is active at a time. For a small cluster, running a ZooKeeper node colocated with the NameNode is recommended. For larger clusters, contact Cloudera Support for configuration help.
Install and start the ZooKeeper Server in standalone mode by running the commands shown in the "Installing the ZooKeeper Server Package on a Single Server" section of ZooKeeper Installation.
Starting the HBase Master
After ZooKeeper is running, you can start the HBase master in standalone mode.
$ sudo /etc/init.d/hadoop-hbase-master start
Starting an HBase Region Server
The Region Server is the part of HBase that actually hosts data and processes requests. The Region Server typically runs on all of the slave nodes in a cluster, but not on the master node.
To enable the HBase Region Server on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-hbase-regionserver
To enable the HBase Region Server On Red Hat-compatible systems:
$ sudo yum install hadoop-hbase-regionserver
To enable the HBase Region Server on SUSE systems:
$ sudo zypper install hadoop-hbase-regionserver
To start the Region Server:
$ sudo /etc/init.d/hadoop-hbase-regionserver start
Verifying the Pseudo-Distributed Operation
After you have started ZooKeeper, the Master, and a Region Server, the pseudo-distributed cluster should be up and running. You can verify that each of the daemons is running using the jps tool from the Oracle JDK. If you are running a pseudo-distributed HDFS installation and a pseudo-distributed HBase installation on one machine, jps will show the following output:
$ sudo jps
32694 Jps
30674 HRegionServer
29496 HMaster
28781 DataNode
28422 NameNode
30348 QuorumPeerMain
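The check above can also be scripted. This sketch greps captured jps output for the expected daemon names; the sample output is inlined for illustration, and in practice you would capture it with jps_output=$(sudo jps):

```shell
# Sketch: confirm the expected daemons appear in jps output.
# Sample output inlined; on a real host, capture it from `sudo jps`.
jps_output='30674 HRegionServer
29496 HMaster
28781 DataNode
28422 NameNode
30348 QuorumPeerMain'
for daemon in HMaster HRegionServer QuorumPeerMain NameNode DataNode; do
  if echo "${jps_output}" | grep -q "${daemon}\$"; then
    echo "${daemon}: running"
  else
    echo "${daemon}: MISSING"
  fi
done
```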
You should also be able to navigate to http://localhost:60010 and verify that the local region server has registered with the master.
Installing the HBase Thrift Server
The HBase Thrift Server is an alternative gateway for accessing the HBase server. Thrift mirrors most of the HBase client APIs while enabling popular programming languages to interact with HBase. The Thrift Server is multi-platform and performs better than REST in many situations. Thrift can be collocated with the region servers, but should not be collocated with the NameNode or the JobTracker. For more information about Thrift, visit http://incubator.apache.org/thrift/.
To enable the HBase Thrift Server on Ubuntu and other Debian systems:
$ sudo apt-get install hadoop-hbase-thrift
To enable the HBase Thrift Server On Red Hat-compatible systems:
$ sudo yum install hadoop-hbase-thrift
To enable the HBase Thrift Server on SUSE systems:
$ sudo zypper install hadoop-hbase-thrift
Deploying HBase in a Distributed Cluster
After you have HBase running in pseudo-distributed mode, you can extend the same configuration to run on a distributed cluster.
Choosing where to Deploy the Processes
For small clusters, Cloudera recommends designating one node in your cluster as the master node. On this node, you will typically run the HBase Master and a ZooKeeper quorum peer. These master processes may be collocated with the Hadoop NameNode and JobTracker for small clusters.
Designate the remaining nodes as slave nodes. On each node, Cloudera recommends running a Region Server, which may be collocated with a Hadoop TaskTracker and a DataNode. When collocating with TaskTrackers, be sure that the resources of the machine are not oversubscribed; it's safest to start with a small number of MapReduce slots and work up slowly.
Configuring for Distributed Operation
After you have decided which machines will run each process, you can edit the configuration so that the nodes can locate each other. To do so, make sure that the configuration files are synchronized across the cluster. Cloudera strongly recommends using a configuration management system to synchronize the configuration files, though you can use a simpler solution such as rsync to get started quickly.
The only configuration change necessary to move from pseudo-distributed operation to fully-distributed operation is the addition of the ZooKeeper Quorum address inhbase-site.xml. Insert the following XML property to configure the nodes with the address of the node where the ZooKeeper quorum peer is running:
<property>
<name>hbase.zookeeper.quorum</name>
<value>mymasternode</value>
</property>
To start the cluster, start the services in the following order:
- The ZooKeeper Quorum Peer
- The HBase Master
- Each of the HBase Region Servers
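Scripted, the start order looks like the following dry-run sketch. The hostname mymasternode comes from the configuration example above; the slave names are hypothetical, and the echoed lines show the commands you would run over ssh or through your configuration management tool:

```shell
# Dry-run sketch of the cluster start order. Slave hostnames are
# hypothetical; remove the echo wrappers to actually execute over ssh.
MASTER=mymasternode
SLAVES="slave1 slave2 slave3"
echo "ssh ${MASTER} sudo service hadoop-zookeeper-server start"
echo "ssh ${MASTER} sudo service hadoop-hbase-master start"
for slave in ${SLAVES}; do
  echo "ssh ${slave} sudo service hadoop-hbase-regionserver start"
done
```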
After the cluster is fully started, you can view the HBase Master web interface on port 60010 and verify that each of the slave nodes has registered properly with the master.
Troubleshooting
The Cloudera packages of HBase have been configured to place logs in /var/log/hbase. While getting started, Cloudera recommends tailing these logs to note any error messages or failures.
Viewing the HBase Documentation
For additional HBase documentation, see http://archive.cloudera.com/cdh/3/hbase/.