mongo-hadoop Integration

When a Hadoop cluster needs MongoDB data for supplementary analysis, the quickest way to get there is to integrate Hive with MongoDB.

1. Download the jars and put them in /etc/hive/auxlib, the third-party jar directory on the Hive node (this directory is configured through Hive's hive.aux.jars.path property)

wget https://repo1.maven.org/maven2/org/mongodb/mongo-hadoop/mongo-hadoop-core/2.0.2/mongo-hadoop-core-2.0.2.jar

wget https://repo1.maven.org/maven2/org/mongodb/mongo-hadoop/mongo-hadoop-hive/2.0.2/mongo-hadoop-hive-2.0.2.jar

wget https://repo1.maven.org/maven2/org/mongodb/mongo-java-driver/3.2.1/mongo-java-driver-3.2.1.jar
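
The three downloads above can also be scripted. The following is a minimal sketch, not a required step: it rebuilds the same Maven Central URLs from group, artifact, and version, and by default only prints them. The names AUXLIB_DIR, DO_DOWNLOAD, and maven_jar_url are made up for this example; setting DO_DOWNLOAD=1 makes it call wget.

```shell
#!/bin/sh
# Sketch: build the Maven Central URLs for the three jars and optionally download them.
# AUXLIB_DIR, DO_DOWNLOAD and maven_jar_url are hypothetical names used for illustration.

AUXLIB_DIR="${AUXLIB_DIR:-/etc/hive/auxlib}"
REPO="https://repo1.maven.org/maven2"

# maven_jar_url GROUP ARTIFACT VERSION
# Prints the jar URL using the standard Maven layout, where the dots
# in the group id become slashes.
maven_jar_url() {
    group_path=$(printf '%s' "$1" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s.jar\n' "$REPO" "$group_path" "$2" "$3" "$2" "$3"
}

# One "group artifact version" triple per line, matching the wget commands above.
JARS="org.mongodb.mongo-hadoop mongo-hadoop-core 2.0.2
org.mongodb.mongo-hadoop mongo-hadoop-hive 2.0.2
org.mongodb mongo-java-driver 3.2.1"

echo "$JARS" | while read -r group artifact version; do
    url=$(maven_jar_url "$group" "$artifact" "$version")
    if [ "${DO_DOWNLOAD:-0}" = "1" ]; then
        # -P tells wget which directory to save the file into.
        wget -P "$AUXLIB_DIR" "$url"
    else
        # Dry run: only show what would be fetched.
        echo "$url"
    fi
done
```

Run it once without DO_DOWNLOAD to check the URLs, then again with DO_DOWNLOAD=1 on the Hive node.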

2. Restart Hive

3. Create the Hive-to-MongoDB mapping table

CREATE EXTERNAL TABLE ods_t.o_tenant_config_external_mongo(
    id string,
    tenant_credit_config string,
    tenant_common_config string,
    tenant_freight_config string,
    sku_category string,
    last_updated_at TIMESTAMP
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","tenant_credit_config":"tenantCreditConfig","tenant_common_config":"tenantCommonConfig","tenant_freight_config":"tenantFreightConfig","sku_category":"skuCategory","last_updated_at":"lastUpdatedAt"}')
TBLPROPERTIES('mongo.uri'='mongodb://account_name:[email protected]:27017/account_db.t_tenant_config');
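
Once the table exists, a quick read through it helps confirm that the jars, the column mapping, and the URI all work together. A minimal sketch, assuming the Hive CLI is on the PATH of the machine where it runs; ROW_LIMIT is a made-up variable for this example, and the script only prints the statement when no hive command is found.

```shell
#!/bin/sh
# Sketch: read a few rows through the mapping table to check the setup.
# ROW_LIMIT is a hypothetical knob introduced for this example.

TABLE="ods_t.o_tenant_config_external_mongo"
ROW_LIMIT="${ROW_LIMIT:-5}"

# Select two of the mapped columns; LIMIT keeps the output short.
QUERY="SELECT id, last_updated_at FROM ${TABLE} LIMIT ${ROW_LIMIT};"

if command -v hive >/dev/null 2>&1; then
    # hive -e runs the quoted HiveQL string and then exits.
    hive -e "$QUERY"
else
    # No Hive CLI here: print the statement so it can be pasted into another client.
    echo "$QUERY"
fi
```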

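Notes:

    ①. The mongo-java-driver jar version must not be lower than the version of the MongoDB component

    ②. If the jars cannot be accessed because of insufficient permissions:  chmod 777 /etc/hive/auxlib/mongo-*

    ③. In a WHERE filter, = does not work; use IN or LIKE instead. In a JOIN condition, = works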
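
For the last note, a single-value equality filter can be written as a one-element IN list. The sketch below shows one way to build such a query string. The helper in_filter and the value some_id are made up for illustration and are not part of Hive or mongo-hadoop; the value is inserted as-is, so it must not contain single quotes.

```shell
#!/bin/sh
# Sketch: express "column = value" as "column IN ('value')" for the mapping table.
# in_filter is a hypothetical helper; some_id is a placeholder value.

# in_filter COLUMN VALUE
# Prints: COLUMN IN ('VALUE')
in_filter() {
    printf "%s IN ('%s')" "$1" "$2"
}

TABLE="ods_t.o_tenant_config_external_mongo"

# Instead of: WHERE id = 'some_id'
QUERY="SELECT id, sku_category FROM ${TABLE} WHERE $(in_filter id some_id);"

echo "$QUERY"
```

The printed statement can then be passed to the Hive CLI with hive -e "$QUERY".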
