[Big Data Technology and Application Provincial Competition Study Notes 7] - Module 1 (Remaining Software Installation and Configuration)

Since none of the other software gets a dedicated configuration task in the competition, this post covers everything the client machines need in one pass.

All of the installation packages are in the first post of this series; click here if you need them.

I. Hive

While studying I also picked up a little HQL; the link is here for anyone interested.
1. Download and extract Hive
2. Configuration files

  1. Global configuration (/etc/profile)
export HIVE_HOME=/software/hive
export PATH="$HIVE_HOME/bin:$PATH"
  2. hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
      <!-- Set the properties below -->
      <property>
            <name>hive.server2.thrift.port</name>
            <value>10000</value>
      </property>
      <!-- Client-facing bind address: a plain IP (e.g. 192.168.9.105) also works here,
           which makes it easier to connect from Windows -->
      <property>
            <name>hive.server2.thrift.bind.host</name>
            <value>master</value>
      </property>
      <property>
            <name>hive.exec.scratchdir</name>
            <value>/tmp/hive</value>
      </property>
      <property>
            <name>hive.exec.local.scratchdir</name>
            <value>/software/hive/tmp/hive</value>
      </property>
      <property>
            <name>hive.downloaded.resources.dir</name>
            <value>/software/hive/tmp/</value>
      </property>
      <property>
            <name>hive.querylog.location</name>
            <value>/software/hive/tmp/hive</value>
      </property>
      <property>
            <name>hive.aux.jars.path</name>
            <value>/software/hive/lib,/software/hive/jdbc</value>
      </property>
      <property>
            <name>hive.metastore.warehouse.dir</name>
            <value>hdfs://master:9000/user/hive/warehouse</value>
      </property>
      <!-- Hive metastore (MySQL) -->
      <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:mysql://master:3306/hivedb?createDatabaseIfNotExist=true&amp;useSSL=false</value>
      </property>
      <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>com.mysql.cj.jdbc.Driver</value>
      </property>
      <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>hivedb</value>
      </property>
      <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>hivedb</value>  <!-- your MySQL password -->
      </property>
      <!-- Username and password used by beeline remote client connections.
           The username must also be configured in Hadoop's core-site.xml -->
      <property>
            <name>hive.server2.thrift.client.user</name>
            <value>master</value>
      </property>
      <property>
            <name>hive.server2.thrift.client.password</name>
            <value>123456</value>  <!-- your host user's password -->
      </property>
      <!-- The two properties below enable the Hive 2.x web UI -->
      <property>
            <name>hive.server2.webui.host</name>
            <value>master</value>
      </property>
      <property>
            <name>hive.server2.webui.port</name>
            <value>10002</value>
      </property>
      <!-- Restart Hive, then visit http://master:10002/ -->
</configuration>
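Before the first start it is worth creating the directories this file references up front. A minimal sketch, assuming HDFS is already running and the paths above are unchanged:

[hadoop@master ~]$ hdfs dfs -mkdir -p /tmp/hive /user/hive/warehouse
[hadoop@master ~]$ hdfs dfs -chmod g+w /tmp/hive /user/hive/warehouse
# local scratch directory from hive.exec.local.scratchdir
[hadoop@master ~]$ mkdir -p /software/hive/tmp/hive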

If writing hive-site.xml by hand feels like too much, you can start from the template instead:

[hadoop@master hive]$ mkdir tmp
[hadoop@master hive]$ cd conf
[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml
Then, throughout hive-site.xml, replace:
${system:java.io.tmpdir} -> hive/tmp (the tmp folder under the Hive install directory)
${system:user.name} -> your user name (e.g. hadoop)
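The two substitutions can also be scripted with sed instead of edited by hand; a sketch, assuming Hive lives at /software/hive and you run as the hadoop user:

# replace every ${system:java.io.tmpdir} and ${system:user.name} in place
[hadoop@master conf]$ sed -i 's#${system:java.io.tmpdir}#/software/hive/tmp#g' hive-site.xml
[hadoop@master conf]$ sed -i 's#${system:user.name}#hadoop#g' hive-site.xml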

3. hive-env.sh

export JAVA_HOME=/software/hadoop/jdk1.8.0_221
export HADOOP_HOME=/software/hadoop/hadoop-2.7.7
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/software/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib

4. Put Hive's logs under the Hive directory

[hadoop@master conf]$ cp hive-log4j2.properties.template hive-log4j2.properties
[hadoop@master conf]$ vi hive-log4j2.properties
# change the following parameter
property.hive.log.dir = /software/hive/logs

5. Start the metastore and the Hive CLI

[hadoop@master ~]$ hive --service metastore &
# the metastore keeps running in the background; press Enter (or Ctrl+C) to get the prompt back
[hadoop@master ~]$ hive
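To exercise the beeline settings from hive-site.xml as well, HiveServer2 has to be up; a sketch, assuming the port, user, and password configured above:

[hadoop@master ~]$ hive --service hiveserver2 &
[hadoop@master ~]$ beeline -u jdbc:hive2://master:10000 -n master -p 123456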

6. Configure the metastore

By default, Hive keeps its metadata in the embedded Derby database, but the requirements say a production environment must store Hive metadata in MySQL.
Put mysql-connector-java-x.x.x.jar (the MySQL JDBC driver) into $HIVE_HOME/lib; this must be in place before the metastore is first started.
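With the driver jar in place, the metastore tables can be created explicitly rather than on first use; a sketch using Hive's bundled schematool and the MySQL settings from hive-site.xml:

# create the metastore schema in MySQL (run once)
[hadoop@master ~]$ schematool -dbType mysql -initSchema
# confirm the schema version afterwards
[hadoop@master ~]$ schematool -dbType mysql -info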

II. MySQL

1. Download and extract MySQL
2. Configure the relevant files

  • Create a mysql user
[root@master ~]$ groupadd mysql   # create a mysql group
[root@master ~]$ useradd -r -s /sbin/nologin -g mysql mysql -d /software/mysql 
  • Configure environment variables (/etc/profile: MYSQL_HOME, PATH)
  • Configure my.cnf

The socket, basedir, datadir, log, and pid files and directories must be created in advance and given the right permissions:

[root@master ~]$ chown -R mysql:mysql <directory or file path>
[root@master ~]$ chmod -R 777 <directory or file path>
# (mysql and 777 can be other values; see any post on Linux permissions)
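For the paths used in the my.cnf below, that preparation might look like this (a sketch; adjust if your paths differ):

[root@master ~]$ mkdir -p /software/mysql/sock /software/mysql/data /var/log/mysql /var/run/mysql
[root@master ~]$ touch /var/log/mysql/mysql.log
[root@master ~]$ chown -R mysql:mysql /software/mysql /var/log/mysql /var/run/mysql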
[mysql]
default-character-set=utf8

[mysqld]
port=3306
socket=/software/mysql/sock/mysql.sock
basedir=/software/mysql
datadir=/software/mysql/data
log-error=/var/log/mysql/mysql.log
pid-file=/var/run/mysql/mysql.pid
max_connections=1000
character-set-server=utf8
wait_timeout=31536000
interactive_timeout=31536000
default-storage-engine=INNODB
max_allowed_packet=1024M

[mysqld_safe]
socket=/software/mysql/sock/mysql.sock

[client]
socket=/software/mysql/sock/mysql.sock

[mysql.server]
socket=/software/mysql/sock/mysql.sock                                
  • mysql.server (set these two variables inside the script)
datadir=/software/mysql/data
basedir=/software/mysql/

  • Install the startup script
[root@master ~]$ cp /software/mysql/support-files/mysql.server /etc/init.d/mysqld
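A quick sanity check that the options file will actually be picked up is my_print_defaults, a standard MySQL utility that prints the settings a given group resolves to:

[root@master ~]$ my_print_defaults mysqld
# should echo the port, socket, basedir, datadir, etc. from my.cnf above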

3. Initialize MySQL

[root@master ~]$ mysqld --initialize --user=mysql

4. Start MySQL

[root@master ~]$ service mysqld start
[root@master ~]$ mysql -uroot -p
# the first login needs the temporary password, which ends up in the log file defined in my.cnf
# you can also grep for it: grep 'temporary password' /var/log/mysql/mysql.log

5. Log in and change the password

mysql> set password = password('your-password');
mysql> alter user 'hive'@'%' password expire never;   # the 'hive' user is created in section V below
mysql> flush privileges;

6. Enable remote connections

mysql> use mysql;
mysql> update user set host='%' where user='hive';
mysql> flush privileges;
# if you can't connect, check whether the firewall is turned off
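To verify remote access from another machine (or from Windows), a sketch, assuming the 'hive' user from section V below already exists:

# from any host that can reach master
mysql -h master -P 3306 -uhive -p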

III. Redis

1. Download and extract Redis, then build it

[hadoop@master redis-4.0.1]$ make
[hadoop@master redis-4.0.1]$ cd ..
[hadoop@master software]$ mv redis-4.0.1/ redis
[hadoop@master software]$ cd redis/
[hadoop@master redis]$ ./src/redis-server   # quick smoke test with default settings; stop it with Ctrl+C

2. Start Redis with its configuration file

[hadoop@master redis]$ ./src/redis-server ./redis.conf

Open another terminal:

[hadoop@master redis]$ ./src/redis-cli
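A couple of standard commands inside redis-cli confirm the server is answering:

127.0.0.1:6379> ping
PONG
127.0.0.1:6379> set greeting "hello"
OK
127.0.0.1:6379> get greeting
"hello"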

IV. Flink (cluster setup)

1. Download and install Flink
2. Configure the relevant files

  • zoo.cfg
    Reference content:
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
server.4=master:2888:3888
  • masters
master
  • slaves
hadoop1
hadoop2
hadoop3
  • flink-conf.yaml
jobmanager.rpc.address: master
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1728m
high-availability: zookeeper
high-availability.storageDir: hdfs://master/user/flink/
high-availability.zookeeper.quorum: hadoop1:2181,hadoop2:2181,hadoop3:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: flinkCluster
state.backend: filesystem
state.checkpoints.dir: hdfs://master/user/flink/flink-checkpoints
state.savepoints.dir: hdfs://master/user/flink/flink-savepoints
rest.port: 8081


  • /flink/bin/config.sh
DEFAULT_YARN_CONF_DIR="/software/hadoop/etc/hadoop"                            # YARN Configuration Directory, if necessary
DEFAULT_HADOOP_CONF_DIR="/software/hadoop/etc/hadoop"                          # Hadoop Configuration Directory, if necessary
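With those files in place, the cluster starts from the master node using Flink's stock scripts; a sketch, assuming ZooKeeper and HDFS are already up and Flink lives at /software/flink (an assumed path):

# starts the JobManager(s) from masters and the TaskManagers from slaves
[hadoop@master ~]$ /software/flink/bin/start-cluster.sh
# the web UI should then answer on the rest.port configured above
[hadoop@master ~]$ curl -s -o /dev/null -w "%{http_code}\n" http://master:8081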

V. Problems encountered while running Hive and MySQL

1. Permission problem

FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeExcepti...

The fix is as follows:

mysql> create user 'hive' identified by 'hive';   # the quotes hold the username and password you create in MySQL
mysql> grant all privileges on *.* to 'hive'@'%' identified by 'hivedb' with grant option;
# note: the grant resets the password to 'hivedb', which is what hive-site.xml above expects
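A quick check that the grant took effect and matches what hive-site.xml expects:

# should list databases without an access-denied error
[hadoop@master ~]$ mysql -h master -uhive -phivedb -e "show databases;"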

2. Character set problem

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:An exception was thrown while adding/validating class(es) : Column length too big for column 'PARAM_VALUE' (max = 21845); use BLOB or TEXT instead
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Column length too big for column 'PARAM_VALUE' (max = 21845); use BLOB or TEXT instead

The fix is as follows:

mysql> alter database hive character set latin1;   # "hive" is your metastore database name; with the config above it would be hivedb
