snipercai

Centos7 + Apache Ranger 2.4.0 部署

一、Ranger简介

Apache Ranger提供一个集中式安全管理框架, 并解决授权和审计。它可以对Hadoop生态的组件如HDFS、Yarn、Hive、Hbase等进行细粒度的数据访问控制。通过操作Ranger控制台,管理员可以轻松的通过配置策略来控制用户访问权限。

1、组件列表

#	Service Name	Listen Port	Core Ranger Service
1	ranger	6080/tcp	Y (ranger engine - 3.0.0-SNAPSHOT version)
2	ranger-postgres	5432/tcp	Y (ranger datastore)
3	ranger-solr	8983/tcp	Y (audit store)
4	ranger-zk	2181/tcp	Y (used by solr)
5	ranger-usersync	-	Y (user/group synchronization from Local Linux/Mac)
6	ranger-kms	9292/tcp	N (needed only for Encrypted Storage / TDE)
7	ranger-tagsync	-	N (needed only for Tag Based Policies to be sync from ATLAS)

2、支持的数据引擎服务

#	Service Name	Listen Port	Service Description
1	Hadoop	8088/tcp 9000/tcp	Apache Hadoop 3.3.0 Protected by Apache Ranger's Hadoop Plugin
2	HBase	16000/tcp 16010/tcp 16020/tcp 16030/tcp	Apache HBase 2.4.6 Protected by Apache Ranger's HBase Plugin
3	Hive	10000/tcp	Apache Hive 3.1.2 Protected by Apache Ranger's Hive Plugin
4	Kafka	6667/tcp	Apache Kafka 2.8.1 Protected by Apache Ranger's Kafka Plugin
5	Knox	8443/tcp	Apache Knox 1.4.0 Protected by Apache Ranger's Knox Plugin

3、技术架构

说明：
Ranger Admin：该模块是Ranger的核心，它内置了一个Web管理界面，用户可以通过这个Web管理界面或者REST接口来制定安全策略；
Agent Plugin：该模块是嵌入到Hadoop生态圈组件的插件，它定期从Ranger Admin拉取策略并执行，同时记录操作以供审计使用；
User Sync：该模块是将操作系统用户/组的权限数据同步到Ranger数据库中。

4、工作流程

Ranger Admin是Apache Ranger和用户交互的主要界面，用户登录Ranger Admin后可以针对不同的Hadoop组件定制不同的安全策略，当策略制定并保存后，Agent Plugin会定期从Ranger Admin拉取该组件配置的所有策略，并缓存到本地。当有用户请求Hadoop组件时，Agent Plugin提供鉴权服务，并将鉴权结果反馈给相应的组件，从而实现了数据服务的权限控制功能。当用户在Ranger Admin中修改了配置策略后，Agent Plugin会拉取新策略并更新，如果用户在Ranger Admin中删除了配置策略，那么Agent Plugin的鉴权服务也无法继续使用。

以Hive为例子，具体流程如下所示：

二、集群规划

本次测试采用2台虚拟机，操作系统版本为centos7.6
Ranger版本为2.4.0，官网没有安装包，必须通过源码进行编译，过程可参考：
Centos7.6 + Apache Ranger 2.4.0编译（docker方式）_snipercai的博客-CSDN博客
Mysql版本5.7.43，KDC版本1.15.1，Hadoop版本为3.3.4

IP地址	主机名	Hadoop	Ranger
192.168.121.101	node101.cc.local	NN1 DN	Ranger Admin
192.168.121.102	node102.cc.local	NN2 DN
192.168.121.103	node103.cc.local	DN	Ranger Admin MYSQL

三、安装Mysql

1、下载安装mysql

wget http://dev.mysql.com/get/mysql57-community-release-el7-8.noarch.rpm
rpm -ivh mysql57-community-release-el7-8.noarch.rpm
rpm --import https://repo.mysql.com/RPM-GPG-KEY-mysql-2022
yum install -y mysql-server

2、配置/etc/my.cnf

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/opt/mysql/data
socket=/opt/mysql/data/mysql.sock

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[mysql]

socket=/opt/mysql/data/mysql.sock

3、启动服务

systemctl start mysqld
systemctl status mysqld

● mysqld.service - MySQL Server
Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2023-08-11 15:16:50 CST; 5s ago
Docs: man:mysqld(8)
http://dev.mysql.com/doc/refman/en/using-systemd.html
Process: 8875 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
Process: 8822 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
Main PID: 8878 (mysqld)
CGroup: /system.slice/mysqld.service
└─8878 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

Aug 11 15:16:46 node103.cc.local systemd[1]: Starting MySQL Server...
Aug 11 15:16:50 node103.cc.local systemd[1]: Started MySQL Server.

4、获得临时密码

grep password /var/log/mysqld.log

2023-08-11T07:16:48.014862Z 1 [Note] A temporary password is generated for root@localhost: x5/dgr.?sasI

5、修改root密码及刷新权限

mysql_secure_installation

Securing the MySQL server deployment.

Enter password for user root: x5/dgr.?sasI

The existing password for the user account root has expired. Please set a new password.

New password: Mysql@103!

Re-enter new password: Mysql@103!
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration of the plugin.
Using existing password for root.

Estimated strength of the password: 100
Change the password for root ? ((Press y|Y for Yes, any other key for No) : y

New password: Mysql@103!

Re-enter new password: Mysql@103!

Estimated strength of the password: 100
Do you wish to continue with the password provided?(Press y|Y for Yes, any other key for No) : y

By default, a MySQL installation has an anonymous user,allowing anyone to log into MySQL without having to havea user account created for them. This is intended only for testing, and to make the installation go a bit smoother.You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.

Normally, root should only be allowed to connect from'localhost'. This ensures that someone cannot guess at the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : n
Success.

By default, MySQL comes with a database named 'test' that anyone can access. This is also intended only for testing, and should be removed before moving into a production environment.

Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
- Dropping test database...
Success.

- Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.

All done!

6、测试登录

mysql -uroot -p

Enter password: Mysql@103!
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.43 MySQL Community Server (GPL)

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

四、安装Ranger Admin

1、解压编译后的程序包

tar -zxvf ranger-2.4.0-admin.tar.gz

2、创建ranger用户

groupadd -g 1025 ranger
useradd -g ranger -u 1025 -d /home/ranger ranger
echo ranger:rangerpwd | chpasswd
mkdir -p /opt/ranger
su - ranger

配置install.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# This file provides a list of the deployment variables for the Policy Manager Web Application
#

#------------------------- DB CONFIG - BEGIN ----------------------------------
# Uncomment the below if the DBA steps need to be run separately
#setup_mode=SeparateDBA

PYTHON_COMMAND_INVOKER=python3

#DB_FLAVOR=MYSQL|ORACLE|POSTGRES|MSSQL|SQLA
DB_FLAVOR=MYSQL
#

#
# Location of DB client library (please check the location of the jar file)
#
#SQL_CONNECTOR_JAR=/usr/share/java/ojdbc6.jar
#SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar
#SQL_CONNECTOR_JAR=/usr/share/java/postgresql.jar
#SQL_CONNECTOR_JAR=/usr/share/java/sqljdbc4.jar
#SQL_CONNECTOR_JAR=/opt/sqlanywhere17/java/sajdbc4.jar
SQL_CONNECTOR_JAR=/usr/share/java/mysql-connector-java.jar

#
# DB password for the DB admin user-id
# **************************************************************************
# ** If the password is left empty or not-defined here,
# ** it will try with blank password during installation process
# **************************************************************************
#
#db_root_user=root|SYS|postgres|sa|dba
#db_host=host:port # for DB_FLAVOR=MYSQL|POSTGRES|SQLA|MSSQL #for example: db_host=localhost:3306
#db_host=host:port:SID # for DB_FLAVOR=ORACLE #for SID example: db_host=localhost:1521:ORCL
#db_host=host:port/ServiceName # for DB_FLAVOR=ORACLE #for Service example: db_host=localhost:1521/XE
db_root_user=root
db_root_password=Mysql@103!
db_host=192.168.121.103
#SSL config
db_ssl_enabled=false
db_ssl_required=false
db_ssl_verifyServerCertificate=false
#db_ssl_auth_type=1-way|2-way, where 1-way represents standard one way ssl authentication and 2-way represents mutual ssl authentication
db_ssl_auth_type=2-way
javax_net_ssl_keyStore=
javax_net_ssl_keyStorePassword=
javax_net_ssl_trustStore=
javax_net_ssl_trustStorePassword=
javax_net_ssl_trustStore_type=jks
javax_net_ssl_keyStore_type=jks

# For postgresql db
db_ssl_certificate_file=

#
# DB UserId used for the Ranger schema
#
db_name=ranger
db_user=rangeradmin
db_password=Ranger@103!

#For over-riding the jdbc url.
is_override_db_connection_string=false
db_override_connection_string=

# change password. Password for below mentioned users can be changed only once using this property.
#PLEASE NOTE :: Password should be minimum 8 characters with min one alphabet and one numeric.
rangerAdmin_password=Admin@103!
rangerTagsync_password=Tag@103!
rangerUsersync_password=User@103!
keyadmin_password=Key@103!

#Source for Audit Store. Currently solr, elasticsearch and cloudwatch logs are supported.
# * audit_store is solr
#audit_store=solr

# * audit_solr_url Elasticsearch Host(s). E.g. 127.0.0.1
audit_elasticsearch_urls=
audit_elasticsearch_port=
audit_elasticsearch_protocol=
audit_elasticsearch_user=
audit_elasticsearch_password=
audit_elasticsearch_index=
audit_elasticsearch_bootstrap_enabled=true

# * audit_solr_url URL to Solr. E.g. http://:6083/solr/ranger_audits
audit_solr_urls=
audit_solr_user=
audit_solr_password=
audit_solr_zookeepers=

audit_solr_collection_name=ranger_audits
#solr Properties for cloud mode
audit_solr_config_name=ranger_audits
audit_solr_configset_location=
audit_solr_no_shards=1
audit_solr_no_replica=1
audit_solr_max_shards_per_node=1
audit_solr_acl_user_list_sasl=solr,infra-solr
audit_solr_bootstrap_enabled=true

# * audit to amazon cloudwatch properties
audit_cloudwatch_region=
audit_cloudwatch_log_group=
audit_cloudwatch_log_stream_prefix=

#------------------------- DB CONFIG - END ----------------------------------

#
# ------- PolicyManager CONFIG ----------------
#

policymgr_external_url=http://192.168.121.103:6080
policymgr_http_enabled=true
policymgr_https_keystore_file=
policymgr_https_keystore_keyalias=rangeradmin
policymgr_https_keystore_password=

#Add Supported Components list below separated by semi-colon, default value is empty string to support all components
#Example : policymgr_supportedcomponents=hive,hbase,hdfs
policymgr_supportedcomponents=

#
# ------- PolicyManager CONFIG - END ---------------
#

#
# ------- UNIX User CONFIG ----------------
#
unix_user=ranger
unix_user_pwd=557TJy
unix_group=ranger

#
# ------- UNIX User CONFIG - END ----------------
#
#

#
# UNIX authentication service for Policy Manager
#
# PolicyManager can authenticate using UNIX username/password
# The UNIX server specified here as authServiceHostName needs to be installed with ranger-unix-ugsync package.
# Once the service is installed on authServiceHostName, the UNIX username/password from the host can be used to login into policy manager
#
# ** The installation of ranger-unix-ugsync package can be installed after the policymanager installation is finished.
#
#LDAP|ACTIVE_DIRECTORY|UNIX|NONE
authentication_method=NONE
remoteLoginEnabled=true
authServiceHostName=localhost
authServicePort=5151
ranger_unixauth_keystore=keystore.jks
ranger_unixauth_keystore_password=password
ranger_unixauth_truststore=cacerts
ranger_unixauth_truststore_password=changeit

####LDAP settings - Required only if have selected LDAP authentication ####
#
# Sample Settings
#
#xa_ldap_url=ldap://127.0.0.1:389
#xa_ldap_userDNpattern=uid={0},ou=users,dc=xasecure,dc=net
#xa_ldap_groupSearchBase=ou=groups,dc=xasecure,dc=net
#xa_ldap_groupSearchFilter=(member=uid={0},ou=users,dc=xasecure,dc=net)
#xa_ldap_groupRoleAttribute=cn
#xa_ldap_base_dn=dc=xasecure,dc=net
#xa_ldap_bind_dn=cn=admin,ou=users,dc=xasecure,dc=net
#xa_ldap_bind_password=
#xa_ldap_referral=follow|ignore
#xa_ldap_userSearchFilter=(uid={0})

xa_ldap_url=
xa_ldap_userDNpattern=
xa_ldap_groupSearchBase=
xa_ldap_groupSearchFilter=
xa_ldap_groupRoleAttribute=
xa_ldap_base_dn=
xa_ldap_bind_dn=
xa_ldap_bind_password=
xa_ldap_referral=
xa_ldap_userSearchFilter=
####ACTIVE_DIRECTORY settings - Required only if have selected AD authentication ####
#
# Sample Settings
#
#xa_ldap_ad_domain=xasecure.net
#xa_ldap_ad_url=ldap://127.0.0.1:389
#xa_ldap_ad_base_dn=dc=xasecure,dc=net
#xa_ldap_ad_bind_dn=cn=administrator,ou=users,dc=xasecure,dc=net
#xa_ldap_ad_bind_password=
#xa_ldap_ad_referral=follow|ignore
#xa_ldap_ad_userSearchFilter=(sAMAccountName={0})

xa_ldap_ad_domain=
xa_ldap_ad_url=
xa_ldap_ad_base_dn=
xa_ldap_ad_bind_dn=
xa_ldap_ad_bind_password=
xa_ldap_ad_referral=
xa_ldap_ad_userSearchFilter=

#------------ Kerberos Config -----------------
spnego_principal=
spnego_keytab=
token_valid=30
cookie_domain=
cookie_path=/
admin_principal=
admin_keytab=
lookup_principal=
lookup_keytab=
hadoop_conf=/opt/hadoop/hadoop-3.3.4/etc/hadoop/
#
#-------- SSO CONFIG - Start ------------------
#
sso_enabled=false
sso_providerurl=https://127.0.0.1:8443/gateway/knoxsso/api/v1/websso
sso_publickey=

#
#-------- SSO CONFIG - END ------------------

# Custom log directory path
RANGER_ADMIN_LOG_DIR=$PWD
RANGER_ADMIN_LOGBACK_CONF_FILE=

# PID file path
RANGER_PID_DIR_PATH=/var/run/ranger

# ################# DO NOT MODIFY ANY VARIABLES BELOW #########################
#
# --- These deployment variables are not to be modified unless you understand the full impact of the changes
#
################################################################################
XAPOLICYMGR_DIR=$PWD
app_home=$PWD/ews/webapp
TMPFILE=$PWD/.fi_tmp
LOGFILE=$PWD/logfile
LOGFILES="$LOGFILE"

JAVA_BIN='java'
JAVA_VERSION_REQUIRED='1.8'
JAVA_ORACLE='Java(TM) SE Runtime Environment'

ranger_admin_max_heap_size=1g
#retry DB and Java patches after the given time in seconds.
PATCH_RETRY_INTERVAL=120
STALE_PATCH_ENTRY_HOLD_TIME=10

#mysql_create_user_file=${PWD}/db/mysql/create_dev_user.sql
mysql_core_file=db/mysql/optimized/current/ranger_core_db_mysql.sql
mysql_audit_file=db/mysql/xa_audit_db.sql
#mysql_asset_file=${PWD}/db/mysql/reset_asset.sql

#oracle_create_user_file=${PWD}/db/oracle/create_dev_user_oracle.sql
oracle_core_file=db/oracle/optimized/current/ranger_core_db_oracle.sql
oracle_audit_file=db/oracle/xa_audit_db_oracle.sql
#oracle_asset_file=${PWD}/db/oracle/reset_asset_oracle.sql
#
postgres_core_file=db/postgres/optimized/current/ranger_core_db_postgres.sql
postgres_audit_file=db/postgres/xa_audit_db_postgres.sql
#
sqlserver_core_file=db/sqlserver/optimized/current/ranger_core_db_sqlserver.sql
sqlserver_audit_file=db/sqlserver/xa_audit_db_sqlserver.sql
#
sqlanywhere_core_file=db/sqlanywhere/optimized/current/ranger_core_db_sqlanywhere.sql
sqlanywhere_audit_file=db/sqlanywhere/xa_audit_db_sqlanywhere.sql
cred_keystore_filename=$app_home/WEB-INF/classes/conf/.jceks/rangeradmin.jceks

3、创建Ranger数据库

mysql> create database ranger;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'192.168.121.103' identified by 'Ranger@103!';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost' identified by 'Ranger@103!';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'ranger'@'%' identified by 'Ranger@103!';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'node103.cc.local' identified by 'Ranger@103!';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

4、初始化Ranger admin

yum install -y mysql-connector-java
yum install -y python3

./setup.sh

Installation of Ranger PolicyManager Web Application is completed.

说明：
1、如果如错如下
2023-08-17 18:19:34,671 [I] Env filename : /etc/ranger/admin/conf/ranger-admin-env-hadoopconfdir.sh
Traceback (most recent call last):
File "db_setup.py", line 1451, in
main(sys.argv)
File "db_setup.py", line 1418, in main
run_env_file(env_file_path)
File "db_setup.py", line 163, in run_env_file
set_env_val(command)
File "db_setup.py", line 152, in set_env_val
(key, _, value) = line.partition("=")
TypeError: a bytes-like object is required, not 'str'
是因为python3中对于与line是bytes类型的字符串对象，并对该bytes类型的字符串对象进行按照str类型进行了操作，所以需要将line使用decode转化成str类型。在db_setup.py的152行，增加line = line.decode('ascii')即可

2、如果数据库日志报错Access denied for user 'rangeradmin'@'node103.cc.local' (using password: YES)，如果数据库授权和密码都正常，则检查一下conf/ranger-admin-site.xml文件，可能是配置文件里没有自动填写数据库密码。

./set_globals.sh

usermod: no changes
[2023/08/11 18:03:29]: [I] Soft linking /etc/ranger/admin/conf to ews/webapp/WEB-INF/classes/conf

5、启动Ranger admin

ranger-admin start

Starting Apache Ranger Admin Service
Apache Ranger Admin Service with pid 18756 has started.

6、登录网页

输入网址：http://192.168.121.103:6080/login.jsp
默认用户：admin
如没有配置rangerAdmin_password，则默认密码为admin

五、安装usersync

1、解压编译后的程序包

tar -zxvf ranger-2.4.0-usersync.tar.gz

2、配置install.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The base path for the usersync process
ranger_base_dir = /etc/ranger

#
# The following URL should be the base URL for connecting to the policy manager web application
# For example:
#
# POLICY_MGR_URL = http://policymanager.xasecure.net:6080
#
POLICY_MGR_URL = http://192.168.121.103:6080

# sync source, only unix and ldap are supported at present
# defaults to unix
SYNC_SOURCE = unix

#
# Minimum Unix User-id to start SYNC.
# This should avoid creating UNIX system-level users in the Policy Manager
#
MIN_UNIX_USER_ID_TO_SYNC = 500

# Minimum Unix Group-id to start SYNC.
# This should avoid creating UNIX system-level users in the Policy Manager
#
MIN_UNIX_GROUP_ID_TO_SYNC = 500

# sync interval in minutes
# user, groups would be synced again at the end of each sync interval
# defaults to 5 if SYNC_SOURCE is unix
# defaults to 360 if SYNC_SOURCE is ldap
SYNC_INTERVAL =

#User and group for the usersync process
unix_user=ranger
unix_group=ranger

#change password of rangerusersync user. Please note that this password should be as per rangerusersync user in ranger
rangerUsersync_password=User@103!

#Set to run in kerberos environment
usersync_principal=
usersync_keytab=
hadoop_conf=/opt/hadoop/hadoop-3.3.4/etc/hadoop/
#
# The file where all credential is kept in cryptic format
#
CRED_KEYSTORE_FILENAME=/etc/ranger/usersync/conf/rangerusersync.jceks

# SSL Authentication
AUTH_SSL_ENABLED=false
AUTH_SSL_KEYSTORE_FILE=/etc/ranger/usersync/conf/cert/unixauthservice.jks
AUTH_SSL_KEYSTORE_PASSWORD=UnIx529p
AUTH_SSL_TRUSTSTORE_FILE=
AUTH_SSL_TRUSTSTORE_PASSWORD=

# ---------------------------------------------------------------
# The following properties are relevant only if SYNC_SOURCE = ldap
# ---------------------------------------------------------------

# The below properties ROLE_ASSIGNMENT_LIST_DELIMITER, USERS_GROUPS_ASSIGNMENT_LIST_DELIMITER, USERNAME_GROUPNAME_ASSIGNMENT_LIST_DELIMITER,
#and GROUP_BASED_ROLE_ASSIGNMENT_RULES can be used to assign role to LDAP synced users and groups
#NOTE all the delimiters should have different values and the delimiters should not contain characters that are allowed in userName or GroupName

# default value ROLE_ASSIGNMENT_LIST_DELIMITER = &
ROLE_ASSIGNMENT_LIST_DELIMITER = &

#default value USERS_GROUPS_ASSIGNMENT_LIST_DELIMITER = :
USERS_GROUPS_ASSIGNMENT_LIST_DELIMITER = :

#default value USERNAME_GROUPNAME_ASSIGNMENT_LIST_DELIMITER = ,
USERNAME_GROUPNAME_ASSIGNMENT_LIST_DELIMITER = ,

# with above mentioned delimiters a sample value would be ROLE_SYS_ADMIN:u:userName1,userName2&ROLE_SYS_ADMIN:g:groupName1,groupName2&ROLE_KEY_ADMIN:u:userName&ROLE_KEY_ADMIN:g:groupName&ROLE_USER:u:userName3,userName4&ROLE_USER:g:groupName3
#&ROLE_ADMIN_AUDITOR:u:userName&ROLE_KEY_ADMIN_AUDITOR:u:userName&ROLE_KEY_ADMIN_AUDITOR:g:groupName&ROLE_ADMIN_AUDITOR:g:groupName
GROUP_BASED_ROLE_ASSIGNMENT_RULES =

# URL of source ldap
# a sample value would be: ldap://ldap.example.com:389
# Must specify a value if SYNC_SOURCE is ldap
SYNC_LDAP_URL =

# ldap bind dn used to connect to ldap and query for users and groups
# a sample value would be cn=admin,ou=users,dc=hadoop,dc=apache,dc=org
# Must specify a value if SYNC_SOURCE is ldap
SYNC_LDAP_BIND_DN =

# ldap bind password for the bind dn specified above
# please ensure read access to this file is limited to root, to protect the password
# Must specify a value if SYNC_SOURCE is ldap
# unless anonymous search is allowed by the directory on users and group
SYNC_LDAP_BIND_PASSWORD =

# ldap delta sync flag used to periodically sync users and groups based on the updates in the server
# please customize the value to suit your deployment
# default value is set to true when is SYNC_SOURCE is ldap
SYNC_LDAP_DELTASYNC =

# search base for users and groups
# sample value would be dc=hadoop,dc=apache,dc=org
SYNC_LDAP_SEARCH_BASE =

# search base for users
# sample value would be ou=users,dc=hadoop,dc=apache,dc=org
# overrides value specified in SYNC_LDAP_SEARCH_BASE
SYNC_LDAP_USER_SEARCH_BASE =

# search scope for the users, only base, one and sub are supported values
# please customize the value to suit your deployment
# default value: sub
SYNC_LDAP_USER_SEARCH_SCOPE = sub

# objectclass to identify user entries
# please customize the value to suit your deployment
# default value: person
SYNC_LDAP_USER_OBJECT_CLASS = person

# optional additional filter constraining the users selected for syncing
# a sample value would be (dept=eng)
# please customize the value to suit your deployment
# default value is empty
SYNC_LDAP_USER_SEARCH_FILTER =

# attribute from user entry that would be treated as user name
# please customize the value to suit your deployment
# default value: cn
SYNC_LDAP_USER_NAME_ATTRIBUTE = cn

# attribute from user entry whose values would be treated as
# group values to be pushed into Policy Manager database
# You could provide multiple attribute names separated by comma
# default value: memberof, ismemberof
SYNC_LDAP_USER_GROUP_NAME_ATTRIBUTE = memberof,ismemberof
#
# UserSync - Case Conversion Flags
# possible values: none, lower, upper
SYNC_LDAP_USERNAME_CASE_CONVERSION=lower
SYNC_LDAP_GROUPNAME_CASE_CONVERSION=lower

#user sync log path
logdir=logs
#/var/log/ranger/usersync

# PID DIR PATH
USERSYNC_PID_DIR_PATH=/var/run/ranger

# do we want to do ldapsearch to find groups instead of relying on user entry attributes
# valid values: true, false
# any value other than true would be treated as false
# default value: false
SYNC_GROUP_SEARCH_ENABLED=

# do we want to do ldapsearch to find groups instead of relying on user entry attributes and
# sync memberships of those groups
# valid values: true, false
# any value other than true would be treated as false
# default value: false
SYNC_GROUP_USER_MAP_SYNC_ENABLED=

# search base for groups
# sample value would be ou=groups,dc=hadoop,dc=apache,dc=org
# overrides value specified in SYNC_LDAP_SEARCH_BASE, SYNC_LDAP_USER_SEARCH_BASE
# if a value is not specified, takes the value of SYNC_LDAP_SEARCH_BASE
# if SYNC_LDAP_SEARCH_BASE is also not specified, takes the value of SYNC_LDAP_USER_SEARCH_BASE
SYNC_GROUP_SEARCH_BASE=

# search scope for the groups, only base, one and sub are supported values
# please customize the value to suit your deployment
# default value: sub
SYNC_GROUP_SEARCH_SCOPE=

# objectclass to identify group entries
# please customize the value to suit your deployment
# default value: groupofnames
SYNC_GROUP_OBJECT_CLASS=

# optional additional filter constraining the groups selected for syncing
# a sample value would be (dept=eng)
# please customize the value to suit your deployment
# default value is empty
SYNC_LDAP_GROUP_SEARCH_FILTER=

# attribute from group entry that would be treated as group name
# please customize the value to suit your deployment
# default value: cn
SYNC_GROUP_NAME_ATTRIBUTE=

# attribute from group entry that is list of members
# please customize the value to suit your deployment
# default value: member
SYNC_GROUP_MEMBER_ATTRIBUTE_NAME=

# do we want to use paged results control during ldapsearch for user entries
# valid values: true, false
# any value other than true would be treated as false
# default value: true
# if the value is false, typical AD would not return more than 1000 entries
SYNC_PAGED_RESULTS_ENABLED=

# page size for paged results control
# search results would be returned page by page with the specified number of entries per page
# default value: 500
SYNC_PAGED_RESULTS_SIZE=
#LDAP context referral could be ignore or follow
SYNC_LDAP_REFERRAL =ignore

# if you want to enable or disable jvm metrics for usersync process
# valid values: true, false
# any value other than true would be treated as false
# default value: false
# if the value is false, jvm metrics is not created
JVM_METRICS_ENABLED=

# filename of jvm metrics created for usersync process
# default value: ranger_usersync_metric.json
JVM_METRICS_FILENAME=

#file directory for jvm metrics
# default value : logdir
JVM_METRICS_FILEPATH=

#frequency for jvm metrics to be updated
# default value : 10000 milliseconds
JVM_METRICS_FREQUENCY_TIME_IN_MILLIS=

3、安装usersync模块

./setup.sh

Creating ranger-usersync-env-logdir.sh file
Creating ranger-usersync-env-hadoopconfdir.sh file
Creating ranger-usersync-env-piddir.sh file
Creating ranger-usersync-env-confdir.sh file

4、开启自动同步

配置/opt/ranger/ranger-2.4.0-usersync/conf/ranger-ugsync-site.xml

ranger.usersync.credstore.filename
/etc/ranger/usersync/conf/rangerusersync.jceks

ranger.usersync.enabled
true

ranger.usersync.group.memberattributename

ranger.usersync.group.nameattribute

ranger.usersync.group.objectclass

ranger.usersync.group.searchbase

ranger.usersync.group.searchenabled

ranger.usersync.group.searchfilter

ranger.usersync.group.searchscope

ranger.usersync.ldap.binddn

ranger.usersync.ldap.groupname.caseconversion
lower

ranger.usersync.ldap.ldapbindpassword
_

ranger.usersync.ldap.searchBase

ranger.usersync.ldap.url

ranger.usersync.ldap.deltasync

ranger.usersync.ldap.user.groupnameattribute
memberof,ismemberof

ranger.usersync.ldap.user.nameattribute
cn

ranger.usersync.ldap.user.objectclass
person

ranger.usersync.ldap.user.searchbase

ranger.usersync.ldap.user.searchfilter

ranger.usersync.ldap.user.searchscope
sub

ranger.usersync.ldap.username.caseconversion
lower

ranger.usersync.logdir
logs

ranger.usersync.pagedresultsenabled

ranger.usersync.pagedresultssize

ranger.usersync.passwordvalidator.path
./native/credValidator.uexe

ranger.usersync.policymanager.baseURL
http://192.168.121.103:6080

ranger.usersync.policymanager.maxrecordsperapicall
1000

ranger.usersync.policymanager.mockrun
false

ranger.usersync.port
5151

ranger.usersync.sink.impl.class
org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder

ranger.usersync.sleeptimeinmillisbetweensynccycle
300000

ranger.usersync.source.impl.class
org.apache.ranger.unixusersync.process.UnixUserGroupBuilder

ranger.usersync.ssl
true

ranger.usersync.unix.minUserId
500

ranger.usersync.unix.minGroupId
500

ranger.usersync.keystore.file
/etc/ranger/usersync/conf/cert/unixauthservice.jks

ranger.usersync.truststore.file

ranger.usersync.policymgr.username
rangerusersync

ranger.usersync.policymgr.alias
ranger.usersync.policymgr.password

ranger.usersync.policymgr.keystore
/etc/ranger/usersync/conf/rangerusersync.jceks

ranger.usersync.sync.source
unix

ranger.usersync.ldap.referral
ignore

ranger.usersync.kerberos.principal

ranger.usersync.kerberos.keytab

ranger.usersync.keystore.password
_

ranger.usersync.truststore.password

5、启动usersync模块

ranger-usersync start

Starting Apache Ranger Usersync Service
Apache Ranger Usersync Service with pid 10537 has started.

6、查看用户同步

当启动usersync模块之后，会自动同步当前Linux系统中的用户

注意：这里只会同步除了root和虚拟用户外的用户，即UID和GID号较小的不同步

六、安装hdfs-plugin

1、解压编译后的程序包

tar -zxvf ranger-2.4.0-hdfs-plugin.tar.gz

2、配置install.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# Location of Policy Manager URL
#
# Example:
# POLICY_MGR_URL=http://policymanager.xasecure.net:6080
#
POLICY_MGR_URL=http://192.168.121.103:6080

#
# This is the repository name created within policy manager
#
# Example:
# REPOSITORY_NAME=hadoopdev
#
REPOSITORY_NAME=hdfs_repo

#
# Set hadoop home when hadoop program and Ranger HDFS Plugin are not in the
# same path.
#
COMPONENT_INSTALL_DIR_NAME=/opt/hadoop/hadoop-3.3.4/

# AUDIT configuration with V3 properties
# Enable audit logs to Solr
#Example
#XAAUDIT.SOLR.ENABLE=true
#XAAUDIT.SOLR.URL=http://localhost:6083/solr/ranger_audits
#XAAUDIT.SOLR.ZOOKEEPER=
#XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/hadoop/hdfs/audit/solr/spool

XAAUDIT.SOLR.ENABLE=false
XAAUDIT.SOLR.URL=NONE
XAAUDIT.SOLR.USER=NONE
XAAUDIT.SOLR.PASSWORD=NONE
XAAUDIT.SOLR.ZOOKEEPER=NONE
XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/hadoop/hdfs/audit/solr/spool

# Enable audit logs to ElasticSearch
#Example
#XAAUDIT.ELASTICSEARCH.ENABLE=true
#XAAUDIT.ELASTICSEARCH.URL=localhost
#XAAUDIT.ELASTICSEARCH.INDEX=audit

XAAUDIT.ELASTICSEARCH.ENABLE=false
XAAUDIT.ELASTICSEARCH.URL=NONE
XAAUDIT.ELASTICSEARCH.USER=NONE
XAAUDIT.ELASTICSEARCH.PASSWORD=NONE
XAAUDIT.ELASTICSEARCH.INDEX=NONE
XAAUDIT.ELASTICSEARCH.PORT=NONE
XAAUDIT.ELASTICSEARCH.PROTOCOL=NONE

# Enable audit logs to HDFS
#Example
#XAAUDIT.HDFS.ENABLE=true
#XAAUDIT.HDFS.HDFS_DIR=hdfs://node-1.example.com:8020/ranger/audit
#XAAUDIT.HDFS.FILE_SPOOL_DIR=/var/log/hadoop/hdfs/audit/hdfs/spool
# If using Azure Blob Storage
#XAAUDIT.HDFS.HDFS_DIR=wasb[s]://@.blob.core.windows.net/
#XAAUDIT.HDFS.HDFS_DIR=wasb://[email protected]/ranger/audit

XAAUDIT.HDFS.ENABLE=false
XAAUDIT.HDFS.HDFS_DIR=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit
XAAUDIT.HDFS.FILE_SPOOL_DIR=/var/log/hadoop/hdfs/audit/hdfs/spool

# Following additional propertis are needed When auditing to Azure Blob Storage via HDFS
# Get these values from your /etc/hadoop/conf/core-site.xml
#XAAUDIT.HDFS.HDFS_DIR=wasb[s]://@.blob.core.windows.net/
XAAUDIT.HDFS.AZURE_ACCOUNTNAME=__REPLACE_AZURE_ACCOUNT_NAME
XAAUDIT.HDFS.AZURE_ACCOUNTKEY=__REPLACE_AZURE_ACCOUNT_KEY
XAAUDIT.HDFS.AZURE_SHELL_KEY_PROVIDER=__REPLACE_AZURE_SHELL_KEY_PROVIDER
XAAUDIT.HDFS.AZURE_ACCOUNTKEY_PROVIDER=__REPLACE_AZURE_ACCOUNT_KEY_PROVIDER

#Log4j Audit Provider
XAAUDIT.LOG4J.ENABLE=false
XAAUDIT.LOG4J.IS_ASYNC=false
XAAUDIT.LOG4J.ASYNC.MAX.QUEUE.SIZE=10240
XAAUDIT.LOG4J.ASYNC.MAX.FLUSH.INTERVAL.MS=30000
XAAUDIT.LOG4J.DESTINATION.LOG4J=true
XAAUDIT.LOG4J.DESTINATION.LOG4J.LOGGER=xaaudit

# Enable audit logs to Amazon CloudWatch Logs
#Example
#XAAUDIT.AMAZON_CLOUDWATCH.ENABLE=true
#XAAUDIT.AMAZON_CLOUDWATCH.LOG_GROUP=ranger_audits
#XAAUDIT.AMAZON_CLOUDWATCH.LOG_STREAM={instance_id}
#XAAUDIT.AMAZON_CLOUDWATCH.FILE_SPOOL_DIR=/var/log/hive/audit/amazon_cloudwatch/spool

XAAUDIT.AMAZON_CLOUDWATCH.ENABLE=false
XAAUDIT.AMAZON_CLOUDWATCH.LOG_GROUP=NONE
XAAUDIT.AMAZON_CLOUDWATCH.LOG_STREAM_PREFIX=NONE
XAAUDIT.AMAZON_CLOUDWATCH.FILE_SPOOL_DIR=NONE
XAAUDIT.AMAZON_CLOUDWATCH.REGION=NONE

# End of V3 properties

#
# Audit to HDFS Configuration
#
# If XAAUDIT.HDFS.IS_ENABLED is set to true, please replace tokens
# that start with __REPLACE__ with appropriate values
# XAAUDIT.HDFS.IS_ENABLED=true
# XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
# XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=__REPLACE__LOG_DIR/hadoop/%app-type%/audit
# XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=__REPLACE__LOG_DIR/hadoop/%app-type%/audit/archive
#
# Example:
# XAAUDIT.HDFS.IS_ENABLED=true
# XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://namenode.example.com:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
# XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=/var/log/hadoop/%app-type%/audit
# XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=/var/log/hadoop/%app-type%/audit/archive
#
XAAUDIT.HDFS.IS_ENABLED=false
XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=__REPLACE__LOG_DIR/hadoop/%app-type%/audit
XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=__REPLACE__LOG_DIR/hadoop/%app-type%/audit/archive

XAAUDIT.HDFS.DESTINTATION_FILE=%hostname%-audit.log
XAAUDIT.HDFS.DESTINTATION_FLUSH_INTERVAL_SECONDS=900
XAAUDIT.HDFS.DESTINTATION_ROLLOVER_INTERVAL_SECONDS=86400
XAAUDIT.HDFS.DESTINTATION_OPEN_RETRY_INTERVAL_SECONDS=60
XAAUDIT.HDFS.LOCAL_BUFFER_FILE=%time:yyyyMMdd-HHmm.ss%.log
XAAUDIT.HDFS.LOCAL_BUFFER_FLUSH_INTERVAL_SECONDS=60
XAAUDIT.HDFS.LOCAL_BUFFER_ROLLOVER_INTERVAL_SECONDS=600
XAAUDIT.HDFS.LOCAL_ARCHIVE_MAX_FILE_COUNT=10

#Solr Audit Provider
XAAUDIT.SOLR.IS_ENABLED=false
XAAUDIT.SOLR.MAX_QUEUE_SIZE=1
XAAUDIT.SOLR.MAX_FLUSH_INTERVAL_MS=1000
XAAUDIT.SOLR.SOLR_URL=http://localhost:6083/solr/ranger_audits

# End of V2 properties

#
# SSL Client Certificate Information
#
# Example:
# SSL_KEYSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-keystore.jks
# SSL_KEYSTORE_PASSWORD=none
# SSL_TRUSTSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-truststore.jks
# SSL_TRUSTSTORE_PASSWORD=none
#
# You do not need use SSL between agent and security admin tool, please leave these sample value as it is.
#
SSL_KEYSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-keystore.jks
SSL_KEYSTORE_PASSWORD=myKeyFilePassword
SSL_TRUSTSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-truststore.jks
SSL_TRUSTSTORE_PASSWORD=changeit

#
# Custom component user
# CUSTOM_COMPONENT_USER=
# keep blank if component user is default
CUSTOM_USER=hadoop

#
# Custom component group
# CUSTOM_COMPONENT_GROUP=
# keep blank if component group is default
CUSTOM_GROUP=hadoop

3、开启HDFS-Plugin

//需要在每个hdfs节点上配置
./enable-hdfs-plugin.sh

Custom user and group is available, using custom user and group.
+ Fri Aug 18 10:22:34 CST 2023 : hadoop: lib folder=/opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib conf folder=/opt/hadoop/hadoop-3.3.4/etc/hadoop
+ Fri Aug 18 10:22:34 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/hdfs-site.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.hdfs-site.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-hdfs-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-hdfs-audit.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-hdfs-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-hdfs-security.xml.20230818-102234 ...
+ Fri Aug 18 10:22:35 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230818-102234 ...
+ Fri Aug 18 10:22:36 CST 2023 : Saving current JCE file: /etc/ranger/hadoopdev/cred.jceks to /etc/ranger/hadoopdev/.cred.jceks.20230818102236 ...
Ranger Plugin for hadoop has been enabled. Please restart hadoop to ensure that changes are effective.

4、重启HDFS

su - hadoop
stop_dfs.sh
start-dfs.sh

5、页面配置HDFS服务

Service Name：要与REPOSITORY_NAME保持一致
Username：hadoop用户
Password：hadoop用户密码
Namenode URL：HDFS的namenode的rpc地址，非web地址，如果是HA架构，多个nn节点地址用逗号分割

填写完成后，先点Test Connection测试一下连接。

如果连接测试成功，可以添加完成

6、验证权限控制

su - hadoop
hdfs dfs -mkdir /rangertest
echo 'hello!!!' > test.txt
hdfs dfs -put /test.txt /rangertest/

使用test用户读取rangertest数据和上传文件，但无法上传文件到rangertest目录下

su - test
hdfs dfs -cat /rangertest/test.txt

hdfs dfs -put test11.txt /rangertest/

使用ranger配置策略，允许test用户可以操作rangertest目录

配置后test用户可以操作上传文件到rangertest目录

七、安装Yarn-plugin

1、解压编译后的程序包

tar -zxvf ranger-2.4.0-yarn-plugin.tar.gz

2、配置install.properties

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#
# Location of Policy Manager URL
#
# Example:
# POLICY_MGR_URL=http://policymanager.xasecure.net:6080
#
POLICY_MGR_URL=http://192.168.121.103:6080

#
# This is the repository name created within policy manager
#
# Example:
# REPOSITORY_NAME=yarndev
#
REPOSITORY_NAME=yarn_repo

#
# Name of the directory where the component's lib and conf directory exist.
# This location should be relative to the parent of the directory containing
# the plugin installation files.
#
COMPONENT_INSTALL_DIR_NAME=/opt/hadoop/hadoop-3.3.4/

# Enable audit logs to Solr
#Example
#XAAUDIT.SOLR.ENABLE=true
#XAAUDIT.SOLR.URL=http://localhost:6083/solr/ranger_audits
#XAAUDIT.SOLR.ZOOKEEPER=
#XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/hadoop/yarn/audit/solr/spool

XAAUDIT.SOLR.ENABLE=false
XAAUDIT.SOLR.URL=NONE
XAAUDIT.SOLR.USER=NONE
XAAUDIT.SOLR.PASSWORD=NONE
XAAUDIT.SOLR.ZOOKEEPER=NONE
XAAUDIT.SOLR.FILE_SPOOL_DIR=/var/log/hadoop/yarn/audit/solr/spool

# Enable audit logs to ElasticSearch
#Example
#XAAUDIT.ELASTICSEARCH.ENABLE=true
#XAAUDIT.ELASTICSEARCH.URL=localhost
#XAAUDIT.ELASTICSEARCH.INDEX=audit

XAAUDIT.ELASTICSEARCH.ENABLE=false
XAAUDIT.ELASTICSEARCH.URL=NONE
XAAUDIT.ELASTICSEARCH.USER=NONE
XAAUDIT.ELASTICSEARCH.PASSWORD=NONE
XAAUDIT.ELASTICSEARCH.INDEX=NONE
XAAUDIT.ELASTICSEARCH.PORT=NONE
XAAUDIT.ELASTICSEARCH.PROTOCOL=NONE

# Enable audit logs to HDFS
#Example
#XAAUDIT.HDFS.ENABLE=true
#XAAUDIT.HDFS.HDFS_DIR=hdfs://node-1.example.com:8020/ranger/audit
# If using Azure Blob Storage
#XAAUDIT.HDFS.HDFS_DIR=wasb[s]://@.blob.core.windows.net/
#XAAUDIT.HDFS.HDFS_DIR=wasb://[email protected]/ranger/audit
#XAAUDIT.HDFS.FILE_SPOOL_DIR=/var/log/hadoop/yarn/audit/hdfs/spool

XAAUDIT.HDFS.ENABLE=false
XAAUDIT.HDFS.HDFS_DIR=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit
XAAUDIT.HDFS.FILE_SPOOL_DIR=/var/log/hadoop/yarn/audit/hdfs/spool

# Following additional propertis are needed When auditing to Azure Blob Storage via HDFS
# Get these values from your /etc/hadoop/conf/core-site.xml
#XAAUDIT.HDFS.HDFS_DIR=wasb[s]://@.blob.core.windows.net/
XAAUDIT.HDFS.AZURE_ACCOUNTNAME=__REPLACE_AZURE_ACCOUNT_NAME
XAAUDIT.HDFS.AZURE_ACCOUNTKEY=__REPLACE_AZURE_ACCOUNT_KEY
XAAUDIT.HDFS.AZURE_SHELL_KEY_PROVIDER=__REPLACE_AZURE_SHELL_KEY_PROVIDER
XAAUDIT.HDFS.AZURE_ACCOUNTKEY_PROVIDER=__REPLACE_AZURE_ACCOUNT_KEY_PROVIDER

#Log4j Audit Provider
XAAUDIT.LOG4J.ENABLE=false
XAAUDIT.LOG4J.IS_ASYNC=false
XAAUDIT.LOG4J.ASYNC.MAX.QUEUE.SIZE=10240
XAAUDIT.LOG4J.ASYNC.MAX.FLUSH.INTERVAL.MS=30000
XAAUDIT.LOG4J.DESTINATION.LOG4J=true
XAAUDIT.LOG4J.DESTINATION.LOG4J.LOGGER=xaaudit

# Enable audit logs to Amazon CloudWatch Logs
#Example
#XAAUDIT.AMAZON_CLOUDWATCH.ENABLE=true
#XAAUDIT.AMAZON_CLOUDWATCH.LOG_GROUP=ranger_audits
#XAAUDIT.AMAZON_CLOUDWATCH.LOG_STREAM={instance_id}
#XAAUDIT.AMAZON_CLOUDWATCH.FILE_SPOOL_DIR=/var/log/hive/audit/amazon_cloudwatch/spool

XAAUDIT.AMAZON_CLOUDWATCH.ENABLE=false
XAAUDIT.AMAZON_CLOUDWATCH.LOG_GROUP=NONE
XAAUDIT.AMAZON_CLOUDWATCH.LOG_STREAM_PREFIX=NONE
XAAUDIT.AMAZON_CLOUDWATCH.FILE_SPOOL_DIR=NONE
XAAUDIT.AMAZON_CLOUDWATCH.REGION=NONE

# End of V3 properties

#
# Audit to HDFS Configuration
#
# If XAAUDIT.HDFS.IS_ENABLED is set to true, please replace tokens
# that start with __REPLACE__ with appropriate values
# XAAUDIT.HDFS.IS_ENABLED=true
# XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
# XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=__REPLACE__LOG_DIR/yarn/audit
# XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=__REPLACE__LOG_DIR/yarn/audit/archive
#
# Example:
# XAAUDIT.HDFS.IS_ENABLED=true
# XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://namenode.example.com:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
# XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=/var/log/yarn/audit
# XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=/var/log/yarn/audit/archive
#
XAAUDIT.HDFS.IS_ENABLED=false
XAAUDIT.HDFS.DESTINATION_DIRECTORY=hdfs://__REPLACE__NAME_NODE_HOST:8020/ranger/audit/%app-type%/%time:yyyyMMdd%
XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=__REPLACE__LOG_DIR/yarn/audit
XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=__REPLACE__LOG_DIR/yarn/audit/archive

XAAUDIT.HDFS.DESTINTATION_FILE=%hostname%-audit.log
XAAUDIT.HDFS.DESTINTATION_FLUSH_INTERVAL_SECONDS=900
XAAUDIT.HDFS.DESTINTATION_ROLLOVER_INTERVAL_SECONDS=86400
XAAUDIT.HDFS.DESTINTATION_OPEN_RETRY_INTERVAL_SECONDS=60
XAAUDIT.HDFS.LOCAL_BUFFER_FILE=%time:yyyyMMdd-HHmm.ss%.log
XAAUDIT.HDFS.LOCAL_BUFFER_FLUSH_INTERVAL_SECONDS=60
XAAUDIT.HDFS.LOCAL_BUFFER_ROLLOVER_INTERVAL_SECONDS=600
XAAUDIT.HDFS.LOCAL_ARCHIVE_MAX_FILE_COUNT=10

#Solr Audit Provider
XAAUDIT.SOLR.IS_ENABLED=false
XAAUDIT.SOLR.MAX_QUEUE_SIZE=1
XAAUDIT.SOLR.MAX_FLUSH_INTERVAL_MS=1000
XAAUDIT.SOLR.SOLR_URL=http://localhost:6083/solr/ranger_audits

# End of V2 properties

#
# SSL Client Certificate Information
#
# Example:
# SSL_KEYSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-keystore.jks
# SSL_KEYSTORE_PASSWORD=none
# SSL_TRUSTSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-truststore.jks
# SSL_TRUSTSTORE_PASSWORD=none
#
# You do not need use SSL between agent and security admin tool, please leave these sample value as it is.
#
SSL_KEYSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-keystore.jks
SSL_KEYSTORE_PASSWORD=myKeyFilePassword
SSL_TRUSTSTORE_FILE_PATH=/etc/hadoop/conf/ranger-plugin-truststore.jks
SSL_TRUSTSTORE_PASSWORD=changeit

#
# Custom component user
# CUSTOM_COMPONENT_USER=
# keep blank if component user is default
CUSTOM_USER=hadoop

#
# Custom component group
# CUSTOM_COMPONENT_GROUP=
# keep blank if component group is default
CUSTOM_GROUP=hadoop

3、开启HDFS-Plugin

//需要在每个yarn节点上配置
./enable-hdfs-plugin.sh

Custom user and group is available, using custom user and group.
+ Wed Aug 30 17:29:26 CST 2023 : yarn: lib folder=/opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib conf folder=/opt/hadoop/hadoop-3.3.4/etc/hadoop
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-audit.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-security.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-policymgr-ssl.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-policymgr-ssl.xml.20230830-172926 ...
+ Wed Aug 30 17:29:26 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-audit.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-audit.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/ranger-yarn-security.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.ranger-yarn-security.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current config file: /opt/hadoop/hadoop-3.3.4/etc/hadoop/yarn-site.xml to /opt/hadoop/hadoop-3.3.4/etc/hadoop/.yarn-site.xml.20230830-172926 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-plugin-classloader-2.4.0.jar to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-plugin-classloader-2.4.0.jar.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-yarn-plugin-impl to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-yarn-plugin-impl.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving lib file: /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/ranger-yarn-plugin-shim-2.4.0.jar to /opt/hadoop/hadoop-3.3.4/share/hadoop/hdfs/lib/.ranger-yarn-plugin-shim-2.4.0.jar.20230830172927 ...
+ Wed Aug 30 17:29:27 CST 2023 : Saving current JCE file: /etc/ranger/yarn_repo/cred.jceks to /etc/ranger/yarn_repo/.cred.jceks.20230830172927 ...
+ Wed Aug 30 17:29:29 CST 2023 : Saving current JCE file: /etc/ranger/yarn_repo/cred.jceks to /etc/ranger/yarn_repo/.cred.jceks.20230830172929 ...
Ranger Plugin for yarn has been enabled. Please restart yarn to ensure that changes are effective.

说明：
如果报错如下：
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory，原因是yarn-plugin缺少jar，可以从/opt/ranger/ranger-2.4.0-hdfs-plugin/install/lib中拷贝commons-logging-XXX.jar到/opt/ranger/ranger-2.4.0-yarn-plugin/install/lib/下。
同理，如果是Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang3/StringUtils，则需要拷贝commons-lang3-XXX.jar
同理，如果Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/htrace/core/Tracer$Builder，则需要拷贝htrace-core4-4.1.0-incubating.jar
同理，如果Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/compress/archivers/tar/TarArchiveInputStream，则需要拷贝commons-compress-XXX.jar

4、重启HDFS

su - hadoop
stop_yarn.sh
start-yarn.sh

5、页面配置YARN服务

Service Name：要与REPOSITORY_NAME保持一致
Username：hadoop用户
Password：hadoop用户密码
YARN REST URL *：yarn的resourcemanager的地址，如果是HA架构，多个rm节点地址用逗号或分号分割

填写完成后，先点Test Connection测试一下连接。

如果连接测试成功，可以添加完成

6、验证权限控制

配置新的策略，如下，yarn上的队列Queue默认为root.default，以此为例，允许hadoop用户在队列root.default上提交，拒绝test用户在队列root.default上提交

说明：策略配置后，生效时间为30秒

使用hadoop用户，运行计算任务：
hadoop jar hadoop-mapreduce-examples-3.3.4.jar pi 3 3

Number of Maps = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
2023-08-30 18:22:27,786 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2023-08-30 18:22:27,911 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/hadoop/.staging/job_1693388094495_0006
2023-08-30 18:22:28,118 INFO input.FileInputFormat: Total input files to process : 3
2023-08-30 18:22:28,243 INFO mapreduce.JobSubmitter: number of splits:3
2023-08-30 18:22:28,579 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1693388094495_0006
2023-08-30 18:22:28,581 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-08-30 18:22:28,856 INFO conf.Configuration: resource-types.xml not found
2023-08-30 18:22:28,862 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-08-30 18:22:29,083 INFO impl.YarnClientImpl: Submitted application application_1693388094495_0006
2023-08-30 18:22:29,208 INFO mapreduce.Job: The url to track the job: http://node102.cc.local:8088/proxy/application_1693388094495_0006/
2023-08-30 18:22:29,209 INFO mapreduce.Job: Running job: job_1693388094495_0006
2023-08-30 18:22:39,549 INFO mapreduce.Job: Job job_1693388094495_0006 running in uber mode : false
2023-08-30 18:22:39,552 INFO mapreduce.Job: map 0% reduce 0%
2023-08-30 18:22:59,558 INFO mapreduce.Job: map 100% reduce 0%
2023-08-30 18:23:07,782 INFO mapreduce.Job: map 100% reduce 100%
2023-08-30 18:23:08,807 INFO mapreduce.Job: Job job_1693388094495_0006 completed successfully
2023-08-30 18:23:08,991 INFO mapreduce.Job: Counters: 54
File System Counters
FILE: Number of bytes read=72
FILE: Number of bytes written=1121844
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=768
HDFS: Number of bytes written=215
HDFS: Number of read operations=17
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
HDFS: Number of bytes read erasure-coded=0
Job Counters
Launched map tasks=3
Launched reduce tasks=1
Data-local map tasks=3
Total time spent by all maps in occupied slots (ms)=54819
Total time spent by all reduces in occupied slots (ms)=5732
Total time spent by all map tasks (ms)=54819
Total time spent by all reduce tasks (ms)=5732
Total vcore-milliseconds taken by all map tasks=54819
Total vcore-milliseconds taken by all reduce tasks=5732
Total megabyte-milliseconds taken by all map tasks=56134656
Total megabyte-milliseconds taken by all reduce tasks=5869568
Map-Reduce Framework
Map input records=3
Map output records=6
Map output bytes=54
Map output materialized bytes=84
Input split bytes=414
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=84
Reduce input records=6
Reduce output records=0
Spilled Records=12
Shuffled Maps =3
Failed Shuffles=0
Merged Map outputs=3
GC time elapsed (ms)=366
CPU time spent (ms)=3510
Physical memory (bytes) snapshot=693649408
Virtual memory (bytes) snapshot=10972426240
Total committed heap usage (bytes)=436482048
Peak Map Physical memory (bytes)=194879488
Peak Map Virtual memory (bytes)=2741276672
Peak Reduce Physical memory (bytes)=110477312
Peak Reduce Virtual memory (bytes)=2748796928
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=354
File Output Format Counters
Bytes Written=97
Job Finished in 41.735 seconds
Estimated value of Pi is 3.55555555555555555556

使用test用户，运行计算任务：
hadoop jar /opt/hadoop/hadoop-3.3.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 3 3

Number of Maps = 3
Samples per Map = 3
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
2023-08-30 18:48:58,861 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
2023-08-30 18:48:59,045 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/test/.staging/job_1693392487740_0001
2023-08-30 18:48:59,260 INFO input.FileInputFormat: Total input files to process : 3
2023-08-30 18:48:59,444 INFO mapreduce.JobSubmitter: number of splits:3
2023-08-30 18:48:59,834 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1693392487740_0001
2023-08-30 18:48:59,835 INFO mapreduce.JobSubmitter: Executing with tokens: []
2023-08-30 18:49:00,093 INFO conf.Configuration: resource-types.xml not found
2023-08-30 18:49:00,094 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2023-08-30 18:49:00,295 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/test/.staging/job_1693392487740_0001
java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User test does not have permission to submit application_1693392487740_0001 to queue default

你可能感兴趣的:(apache)

数据权限访问控制（Apache Sentry） deepdata_cn 权限管理 apache sentry
ApacheSentry最初由Cloudera公司内部开发，针对Hadoop系统中的数据（主要是HDFS、Hive的数据）进行细粒度控制，对HDFS、Hive以及Impala有着良好的支持性。2013年Sentry成为Apache的孵化项目，为Hadoop集群元数据和数据存储提供集中、细粒度的访问控制。其架构包括DataEngine、Plugin、Policymetadata等部分，Plugin负
Nginx多台服务器负载均衡 PS测服务器 nginx 负载均衡
一操作步骤:1.服务器IP45.114.124.215//主服务器(安装Nginx)45.114.124.99//从服务器(安装Nginx或Apache都可以)2.保证2台服务器网络互通3.在2台服务器上设置不同页面方便验证3.1在主服务器添加一个可以访问的站点3.2在次服务器添加一个站点,端口必须是主服务器在nginx指定给次服务器的端口4.在主服务器45.114.124.215安装Nginx，
【Python系列】高效Parquet数据处理策略：合并与分析实践小团团0 python 开发语言
在大数据时代，数据的存储、处理和分析变得尤为重要。Parquet作为一种高效的列存储格式，被广泛应用于大数据处理框架中，如ApacheSpark、ApacheHive等。Parquet是一个开源的列存储格式，它被设计用于支持复杂的嵌套数据结构，同时提供高效的压缩和编码方案，以优化存储空间和查询性能。以下将详细介绍如何使用Python对Parquet文件进行数据处理与合并，并提供相应的源码示例。一、
hbase表无法删除，命令行卡住问题处理 spring208208 大数据组件线上问题分析 hbase 数据库大数据
问题现象hbase表无法删除，命令行卡住1.activemaster日志出现超时WARNorg.apache.hadoop.hbase.master.procedure.TruncateTableProcedure:Retriableerrortryingtotruncatetable=xxxstate=TRUNCATE_TABLE_PRE_OPERATIONorg.apache.hadoop.h
rocketmq-client 4.3.0 在springboot中的使用 Myueye JAVA java
rocketmq-client4.3.0在springboot中的使用1、导入依赖2、配置文件属性3、编写配置类4、使用测试5、结果5.1RocketMQ后台显示5.2前端页面5.3后端后台1、导入依赖org.apache.rocketmqrocketmq-client4.3.02、配置文件属性mq.nameserverAdd=ip地址:9876mq.topic=top1(topic名称)mq.p
langchain4j+Tika小试牛刀 llm
序本文主要研究一下langchain4j结合ApacheTika进行文档解析步骤pom.xmldev.langchain4jlangchain4j-document-parser-apache-tika1.0.0-beta1examplepublicclassTikaTest{publicstaticvoidmain(String[]args){Stringpath=System.getPrope
数据湖Iceberg、Hudi和Paimon比较_数据湖框架对比(1) 2301_79098963 程序员知识图谱人工智能
4.Schema变更支持对比项ApacheIcebergApacheHudiApachePaimonSchemaEvolutionALLback-compatibleback-compatibleSelf-definedschemaobjectYESNO(spark-schema)NO（我理解，不准确）SchemaEvolution：指schema变更的支持情况，我的理解是hudi仅支持添加可选列
SpringBoot集成Flink-CDC，实现对数据库数据的监听 rkmhr_sef 面试学习路线阿里巴巴 spring boot flink 数据库
一、什么是CDC？CDC是ChangeDataCapture（变更数据获取）的简称。核心思想是，监测并捕获数据库的变动（包括数据或数据表的插入、更新以及删除等），将这些变更按发生的顺序完整记录下来，写入到消息中间件中以供其他服务进行订阅及消费。二、Flink-CDC是什么？CDCConnectorsforApacheFlink是一组用于ApacheFlink的源连接器，使用变更数据捕获(CDC)从
Apache大数据旭哥优选大数据选题 Apache大数据旭大数据定制选题 java hadoop spark 开发语言 idea hive 数据库架构
定制旭哥服务，一对一，无中介包安装+答疑+售后态度和技术都很重要定制按需求做要求不高就实惠一点定制需提前沟通好怎么做，这样才能避免不必要的麻烦python、flask、Django、mapreduce、mysqljava、springboot、vue、echarts、hadoop、spark、hive、hbase、flink、SparkStreaming、kafka、flume、sqoop分析+推
【Hive】-- hive 3.1.3 伪分布式部署（单节点） oo寻梦in记 Apache Paimon 大数据服务部署 hive 分布式 hadoop
1、环境准备1.1、版本选择apachehive3.1.3apachehadoop3.1.0oraclejdk1.8mysql8.0.15操作系统：Macos10.151.2、软件下载https://archive.apache.org/dist/hive/https://archive.apache.org/dist/hadoop/1.3、解压tar-zxvfapache-hive-4.0.0-
Tenacity（Python的坚韧重试库） ftpeak Python python 开发语言网络爬虫
概述Tenacity是一个基于Apache2.0协议的通用重试库，用Python编写，旨在简化向任何代码添加重试逻辑的过程。它起源于已停止维护的retrying库的分叉版本。Tenacity不兼容retrying的API，但新增了大量功能并修复了长期存在的错误。文档：Tenacity—Tenacitydocumentation主页：https://github.com/jd/tenacity核心功
自动化配置管理工具 SaltStack-03 Mr.Ron linux 自动化服务器运维
一、Jinja模板应用案例1、需求描述给之前通过saltstack安装好的lamp环境的apache修改配置文件，要求每个主机监听自己ip的80端口。2、实现思路如果通过单纯的修改配置文件根本无法实现，所以我们需要用到模板，将配置文件作为模板，通过定义模板中的变量来实现，并且需要引用grians参数。#编辑state配置文件[root@server~]#vim/srv/salt/prod/apac
jmeter安装和jmeter历史版本下载 weixin_30432007 java
一、jmete下载：1、最新版本下载地址：http://jmeter.apache.org/download_jmeter.cgi2、历史版本下载地址：https://archive.apache.org/dist/jmeter/binaries/二、软件安装及设置环境变量1、JDK安装目录在D:\ProgramFiles\Java，其环境变量设置为：JAVA_HOME值为：D:\ProgramF
找不到Jmeter历史版本下载的同学看这里（内附使用阿里镜像和腾讯镜像下载开源软件的地址）测试开发Kevin jmeter 测试工具 jmeter
最近需要在jmeter4上验证一个问题，于是就在网上各种找jmeter不同版本的下载地址，比较麻烦。为了让大家不踩坑，在这里汇总一下下载地址：下载jmeter地址汇总jmeter最新版本官网下载地址：ApacheJMeter-DownloadApacheJMeterhttps://jmeter.apache.org/download_jmeter.cgijmeter历史版本下载地址（建议收藏）In
Hadoop 集群规划与部署最佳实践 AI天才研究院 Python实战 DeepSeek R1 &大数据AI人工智能大模型自然语言处理人工智能语言模型编程实践开发语言架构设计
作者：禅与计算机程序设计艺术1.简介2009年2月2日，ApacheHadoop项目诞生。它是一个开源的分布式系统基础架构，用于存储、处理和分析海量的数据。Hadoop具有高容错性、可靠性、可扩展性、适应性等特征，因而广泛应用于数据仓库、日志分析、网络流量监测、推荐引擎、搜索引擎等领域。由于Hadoop采用“分而治之”的架构设计理念，因此可以轻松应对数据量、计算能力和存储成本的增长。2013年底，
轻松入门Apache SeaTunnel：数据集成利器窝窝和牛牛 SeaTunnel ETL 数据集成
文章目录轻松入门ApacheSeaTunnel：数据集成利器什么是SeaTunnel基本原理运行流程SeaTunnelvsDataX：两大数据集成工具对比实战场景：MySQL数据同步至ElasticsearchSeaTunnel实现方案DataX实现方案实现原理对比底层依赖环境方案优缺点分析快速上手环境准备简单示例总结轻松入门ApacheSeaTunnel：数据集成利器什么是SeaTunnelAp
HBase的架构介绍，安装及简单操作 pk_xz123456 大数据 hbase 架构数据库
一、HBase安装1.环境准备Java环境：确保系统中已经安装了Java8或更高版本。可以通过在命令行中输入java-version来检查Java版本。Hadoop环境：HBase依赖于Hadoop，需要先安装并配置好Hadoop集群。确保Hadoop的相关服务（如HDFS、YARN等）已经正常启动。2.下载HBase从HBase官方网站（https://hbase.apache.org/）下载适
springboot使用kafka自定义JSON序列化器和反序列化器 zhou_zhao_xu Kafka spring
1.序列化器packagecom.springboot.kafkademo.serialization;importcom.alibaba.fastjson.JSON;importcom.alibaba.fastjson.JSONObject;importorg.apache.kafka.common.serialization.Serializer;importjava.util.Map;/**
通过启用Ranger插件的Hive审计日志同步到Doris做分析 fzip Doris Hive doris 审计 hive
以下是基于ApacheDoris的RangerHive审计日志同步方案详细步骤，结合审计日志插件与数据导入策略实现：一、Doris环境准备1.创建审计日志库表参考搜索结果的表结构设计，根据Ranger日志字段调整建表语句：CREATEDATABASEIFNOTEXISTSranger_audit;CREATETABLEIFNOTEXISTSranger_audit_hive_log(repoTyp
kafka生产消息失败 ...has passed since batch creation plus linger time Lichenpar #记录BUG解决 kafka 网络安全 java
背景：公司要使用华为云的kafka服务，我负责进行技术预研，后期要封装kafka组件。从华为云下载了demo，完全按照开发者文档来进行配置文件配置，但是会报以下错误。org.apache.kafka.common.errors.TimeoutException:Expiring10record(s)fortopic-0:30015mshaspassedsincebatchcreationplusl
探索数据安全新境界：Apache Spark SQL Ranger Security插件深度揭秘乌昱有Melanie
探索数据安全新境界：ApacheSparkSQLRangerSecurity插件深度揭秘项目地址:https://gitcode.com/gh_mirrors/sp/spark-ranger随着大数据的爆炸性增长，数据安全性成为了企业不可忽视的核心议题。在这一背景下，【ApacheSparkSQLRangerSecurityPlugin】以其强大的数据访问控制能力脱颖而出，成为数据处理领域的明星级
云原生周刊丨CIO 洞察：Kubernetes 解锁 AI 新纪元 KubeSphere 云原生云原生 kubernetes 人工智能
开源项目推荐DRANETDRANET是由谷歌开发的K8s网络驱动程序，利用K8s的动态资源分配（DRA）功能，为高吞吐量和低延迟应用提供高性能网络支持。它旨在优化资源管理，确保K8s集群中的网络资源能够按需高效分配。DRANET采用Apache-2.0开源许可，鼓励社区贡献与扩展，是云原生环境下提升网络性能的创新解决方案。LazyjournalLazyjournal是一个用Go语言编写的终端用户界
Maven简介 z迦在线 maven java
Maven简介Maven是Apache软件基金会的一个开源项目,是一个优秀的项目构建工具,它用来帮助开发者管理项目中的jar,以及jar之间的依赖关系、完成项目的编译（.java--->.class）、测试、打包（源代码--->.jar文件）和发布等工作。Maven是如何管理项目中的jar文件的？Maven简化了Java项目中的JAR文件管理，主要通过以下几个关键点：POM文件：Maven使用po
Flink相关面试题努力的搬砖人. 面试 java 后端 flink
以下是150道ApacheFlink面试题及其详细回答，涵盖了Flink的基础知识、核心架构、API使用、性能调优等多个方面，每道题目都尽量详细且简单易懂：Flink基础概念类1.什么是ApacheFlink？ApacheFlink是一个开源的流处理和批处理框架，能够实现快速、可靠、可扩展的大数据处理。它既可以处理无界的数据流，也可以处理有界的数据批，提供了低延迟和高吞吐量的实时数据处理能力。Fl
shell 脚本搭建apache 好多知识都想学 apache
#!/bin/bash#SetApacheversiontoinstall##author:yuan#检查外网连接echo"检查外网连接..."pingwww.baidu.com-c3>/dev/null2>&1if[$?-eq0];then echo"外网通讯良好！"else echo"网络连接失败，请检查你的网络设置！" exit1fisleep5#检查并安装APR库echo"检查并安装
Spring系列学习之Spring Messaging消息支持 m0_74825488 面试学习路线阿里巴巴 spring linq java
英文原文：https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-messaging.html目录JMSActiveMQ支持Artemis支持使用JNDIConnectionFactory发送消息接收消息AMQPRabbitMQ支持发送消息接收消息ApacheKafka支持发送消息接收消息Kafka流
[每周一更]-(第137期)：Go + Gin 实战：Docker Compose + Apache 反向代理全流程 ifanatic 每周一更容器 Go golang gin docker
文章目录**1.Go代码示例（`main.go`）****2.`Dockerfile`多段构建**3.构建Docker镜像**4.`docker-compose.yml`直接拉取镜像****5.运行容器****6.测试API**7、配置域名访问**DNS解析：将域名转换为IP地址****DNS寻址示例**8.错误记录访问路径ip+端口：端口可以了，但是小程序中不支持该格式，还需要配置nginx代理
一、MyBatis简介：MyBatis历史、MyBatis特性、和其它持久化层技术对比、Mybatis下载依赖包流程智能硬件控制器信息分析传感器
@[toc]一、MyBatis简介1.1MyBatis历史MyBatis最初是Apache的一个开源项目iBatis,2010年6月这个项目由ApacheSoftwareFoundation迁移到了GoogleCode。随着开发团队转投GoogleCode旗下，iBatis3.x正式更名为MyBatis。代码于2013年11月迁移到Github。iBatis一词来源于“internet”和“aba
dubbo服务META-INF.dubbo文件夹作用 zhglhy dubbo java apache
META-INF.dubbo文件夹是ApacheDubbo框架中的一个重要目录，通常用于存放Dubbo的SPI（ServiceProviderInterface）扩展配置文件。Dubbo是一个高性能的JavaRPC框架，支持分布式服务治理，而SPI机制是Dubbo实现可扩展性的核心设计之一。1.SPI机制简介SPI是Java提供的一种服务发现机制，允许框架在运行时动态加载实现类。Dubbo对其进行
Tomcat从入门到精通：全方位深度解析与实战教程墨瑾轩一起学学Java【一】运维 tomcat java
一、Tomcat入门1.Tomcat简介ApacheTomcat，简称Tomcat，是一个开源的轻量级应用服务器，专为运行JavaServlet和JavaServerPages(JSP)技术设计。它是JavaWeb开发中最常用的Servlet容器之一，遵循JavaServlet和JavaServerPages规范，为开发者提供了一个稳定的、易于使用的部署环境。2.安装与启动安装下载最新版Tomca
解线性方程组 qiuwanchi
package gaodai.matrix; import java.util.ArrayList; import java.util.List; import java.util.Scanner; public class Test { public static void main(String[] args) { Scanner scanner = new Sc
在mysql内部存储代码 annan211 性能 mysql 存储过程触发器
在mysql内部存储代码在mysql内部存储代码，既有优点也有缺点，而且有人倡导有人反对。先看优点： 1 她在服务器内部执行，离数据最近，另外在服务器上执行还可以节省带宽和网络延迟。 2 这是一种代码重用。可以方便的统一业务规则，保证某些行为的一致性，所以也可以提供一定的安全性。 3 可以简化代码的维护和版本更新。 4 可以帮助提升安全，比如提供更细
Android使用Asynchronous Http Client完成登录保存cookie的问题 hotsunshine android
Asynchronous Http Client是android中非常好的异步请求工具除了异步之外还有很多封装比如json的处理，cookie的处理引用 Persistent Cookie Storage with PersistentCookieStore This library also includes a PersistentCookieStore whi
java面试题 Array_06 java 面试
java面试题第一，谈谈final, finally, finalize的区别。 final-修饰符（关键字）如果一个类被声明为final，意味着它不能再派生出新的子类，不能作为父类被继承。因此一个类不能既被声明为 abstract的，又被声明为final的。将变量或方法声明为final，可以保证它们在使用中不被改变。被声明为final的变量必须在声明时给定初值，而在以后的引用中只能
网站加速 oloz 网站加速
前序:本人菜鸟，此文研究总结来源于互联网上的资料，大牛请勿喷！本人虚心学习，多指教. 1、减小网页体积的大小，尽量采用div+css模式，尽量避免复杂的页面结构，能简约就简约。 2、采用Gzip对网页进行压缩； GZIP最早由Jean-loup Gailly和Mark Adler创建，用于UNⅨ系统的文件压缩。我们在Linux中经常会用到后缀为.gz
正确书写单例模式随意而生 java 设计模式单例
　　单例模式算是设计模式中最容易理解，也是最容易手写代码的模式了吧。但是其中的坑却不少，所以也常作为面试题来考。本文主要对几种单例写法的整理，并分析其优缺点。很多都是一些老生常谈的问题，但如果你不知道如何创建一个线程安全的单例，不知道什么是双检锁，那这篇文章可能会帮助到你。　　懒汉式，线程不安全　　当被问到要实现一个单例模式时，很多人的第一反应是写出如下的代码，包括教科书上也是这样
单例模式香水浓 java
懒汉调用getInstance方法时实例化 public class Singleton { private static Singleton instance; private Singleton() {} public static synchronized Singleton getInstance() { if(null == ins
安装Apache问题：系统找不到指定的文件 No installed service named "Apache2" AdyZhang apache http server
安装Apache问题：系统找不到指定的文件 No installed service named "Apache2" 每次到这一步都很小心防它的端口冲突问题，结果，特意留出来的80端口就是不能用，烦。解决方法确保几处： 1、停止IIS启动 2、把端口80改成其它（譬如90，800，，，什么数字都好） 3、防火墙(关掉试试) 在运行处输入 cmd 回车，转到apa
如何在android 文件选择器中选择多个图片或者视频？ aijuans android
我的android app有这样的需求，在进行照片和视频上传的时候，需要一次性的从照片/视频库选择多条进行上传但是android原生态的sdk中，只能一个一个的进行选择和上传。我想知道是否有其他的android上传库可以解决这个问题，提供一个多选的功能，可以使checkbox之类的，一次选择多个处理方法官方的图片选择器(但是不支持所有版本的androi，只支持API Level
mysql中查询生日提醒的日期相关的sql baalwolf mysql
SELECT sysid,user_name,birthday,listid,userhead_50,CONCAT(YEAR(CURDATE()),DATE_FORMAT(birthday,'-%m-%d')),CURDATE(), dayofyear( CONCAT(YEAR(CURDATE()),DATE_FORMAT(birthday,'-%m-%d')))-dayofyear(
MongoDB索引文件破坏后导致查询错误的问题 BigBird2012 mongodb
问题描述： MongoDB在非正常情况下关闭时，可能会导致索引文件破坏，造成数据在更新时没有反映到索引上。解决方案：使用脚本，重建MongoDB所有表的索引。 var names = db.getCollectionNames(); for( var i in names ){ var name = names[i]; print(name);
Javascript Promise bijian1013 JavaScript Promise
Parse JavaScript SDK现在提供了支持大多数异步方法的兼容jquery的Promises模式，那么这意味着什么呢，读完下文你就了解了。一.认识Promises “Promises”代表着在javascript程序里下一个伟大的范式，但是理解他们为什么如此伟大不是件简
[Zookeeper学习笔记九]Zookeeper源代码分析之Zookeeper构造过程 bit1129 zookeeper
Zookeeper重载了几个构造函数，其中构造者可以提供参数最多，可定制性最多的构造函数是 public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolea
【Java命令三】jstack bit1129 jstack
jstack是用于获得当前运行的Java程序所有的线程的运行情况(thread dump），不同于jmap用于获得memory dump [hadoop@hadoop sbin]$ jstack Usage: jstack [-l] <pid> (to connect to running process) jstack -F
jboss 5.1启停脚本　动静分离部署 ronin47
以前启动jboss，往各种xml配置文件，现只要运行一句脚本即可。start nohup sh /**/run.sh -c servicename -b ip -g clustername -u broatcast jboss.messaging.ServerPeerID=int -Djboss.service.binding.set=p
UI之如何打磨设计能力? brotherlamp UI ui教程 ui自学 ui资料 ui视频
在越来越拥挤的初创企业世界里，视觉设计的重要性往往可以与杀手级用户体验比肩。在许多情况下，尤其对于 Web 初创企业而言，这两者都是不可或缺的。前不久我们在《右脑革命：别学编程了，学艺术吧》中也曾发出过重视设计的呼吁。如何才能提高初创企业的设计能力呢?以下是 9 位创始人的体会。 1.找到自己的方式如果你是设计师，要想提高技能可以去设计博客和展示好设计的网站如D-lists或
三色旗算法 bylijinnan java 算法
import java.util.Arrays; /** 问题：假设有一条绳子，上面有红、白、蓝三种颜色的旗子，起初绳子上的旗子颜色并没有顺序，您希望将之分类，并排列为蓝、白、红的顺序，要如何移动次数才会最少，注意您只能在绳子上进行这个动作，而且一次只能调换两个旗子。网上的解法大多类似：在一条绳子上移动，在程式中也就意味只能使用一个阵列，而不使用其它的阵列来
警告:No configuration found for the specified action: \'s chiangfai configuration
1.index.jsp页面form标签未指定namespace属性。  <%@taglib prefix="s" uri="/struts-tags"%> ... <s:form action="submit" method="post"&g
redis -- hash_max_zipmap_entries设置过大有问题 chenchao051 redis hash
使用redis时为了使用hash追求更高的内存使用率，我们一般都用hash结构，并且有时候会把hash_max_zipmap_entries这个值设置的很大，很多资料也推荐设置到1000，默认设置为了512，但是这里有个坑 #define ZIPMAP_BIGLEN 254 #define ZIPMAP_END 255 /* Return th
select into outfile access deny问题 daizj mysql txt 导出数据到文件
本文转自：http://hatemysql.com/2010/06/29/select-into-outfile-access-deny%E9%97%AE%E9%A2%98/ 为应用建立了rnd的帐号，专门为他们查询线上数据库用的，当然，只有他们上了生产网络以后才能连上数据库，安全方面我们还是很注意的，呵呵。授权的语句如下： grant select on armory.* to rn
phpexcel导出excel表简单入门示例 dcj3sjt126com PHP Excel phpexcel
<?php error_reporting(E_ALL); ini_set('display_errors', TRUE); ini_set('display_startup_errors', TRUE); if (PHP_SAPI == 'cli') die('This example should only be run from a Web Brows
美国电影超短200句 dcj3sjt126com 电影
1. I see．我明白了。2. I quit! 我不干了!3. Let go! 放手!4. Me too．我也是。5. My god! 天哪!6. No way! 不行!7. Come on．来吧(赶快)8. Hold on．等一等。9. I agree。我同意。10. Not bad．还不错。11. Not yet．还没。12. See you．再见。13. Shut up!
Java访问远程服务 dyy_gusi httpclient webservice get post
随着webService的崛起，我们开始中会越来越多的使用到访问远程webService服务。当然对于不同的webService框架一般都有自己的client包供使用，但是如果使用webService框架自己的client包，那么必然需要在自己的代码中引入它的包，如果同时调运了多个不同框架的webService，那么就需要同时引入多个不同的clien
Maven的settings.xml配置 geeksun settings.xml
settings.xml是Maven的配置文件，下面解释一下其中的配置含义： settings.xml存在于两个地方： 1.安装的地方：$M2_HOME/conf/settings.xml 2.用户的目录：${user.home}/.m2/settings.xml 前者又被叫做全局配置，后者被称为用户配置。如果两者都存在，它们的内容将被合并，并且用户范围的settings.xml优先。
ubuntu的init与系统服务设置 hongtoushizi ubuntu
转载自： http://iysm.net/?p=178 init Init是位于/sbin/init的一个程序，它是在linux下，在系统启动过程中，初始化所有的设备驱动程序和数据结构等之后，由内核启动的一个用户级程序，并由此init程序进而完成系统的启动过程。 ubuntu与传统的linux略有不同，使用upstart完成系统的启动，但表面上仍维持init程序的形式。运行
跟我学Nginx+Lua开发目录贴 jinnianshilongnian nginx lua
使用Nginx+Lua开发近一年的时间，学习和实践了一些Nginx+Lua开发的架构，为了让更多人使用Nginx+Lua架构开发，利用春节期间总结了一份基本的学习教程，希望对大家有用。也欢迎谈探讨学习一些经验。目录第一章安装Nginx+Lua开发环境第二章 Nginx+Lua开发入门第三章 Redis/SSDB+Twemproxy安装与使用第四章 L
php位运算符注意事项 home198979 位运算 PHP &
$a = $b = $c = 0; $a & $b = 1; $b | $c = 1 问a,b,c最终为多少? 当看到这题时，我犯了一个低级错误，误以为位运算符会改变变量的值。所以得出结果是1 1 0 但是位运算符是不会改变变量的值的，例如： $a=1;$b=2; $a&$b; 这样a,b的值不会有任何改变
Linux shell数组建立和使用技巧 pda158 linux
1.数组定义　　[chengmo@centos5 ~]$ a=(1 2 3 4 5) 　　[chengmo@centos5 ~]$ echo $a 　　1 　　一对括号表示是数组，数组元素用“空格”符号分割开。　　 2.数组读取与赋值　　得到长度：　　[chengmo@centos5 ~]$ echo ${#a[@]} 　　5 　　用${#数组名[@或
hotspot源码(JDK7) ol_beta java HotSpot jvm
源码结构图，方便理解： ├─agent Serviceab
Oracle基本事务和ForAll执行批量DML练习 vipbooks oracle sql
基本事务的使用：从账户一的余额中转100到账户二的余额中去，如果账户二不存在或账户一中的余额不足100则整笔交易回滚 select * from account; -- 创建一张账户表 create table account( -- 账户ID id number(3) not null, -- 账户名称 nam