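Online Consultation: Data Warehouse Development (Part 3)

Series Index

Online Consultation: Business Data Collection
Online Consultation: Data Warehouse Data Synchronization
Online Consultation: Data Warehouse Development (Part 1)
Online Consultation: Data Warehouse Development (Part 2)
Online Consultation: Data Warehouse Development (Part 3)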


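Table of Contents

  • Series Index
  • Preface
  • I. ADS
    • 1. Trade Topic
      • 1. Overall Trade Statistics
      • 2. Trade Statistics by Hospital
      • 3. Trade Statistics by Patient Gender
      • 4. Trade Statistics by Patient Age Group
    • 2. Doctor Topic
      • 1. Doctor Change Statistics
    • 3. User Topic
      • 1. User Change Statistics
    • 4. Review Topic
      • 1. Overall Review Statistics
      • 2. Review Statistics by Hospital
    • 5. Data Loading Script
  • II. Report Data Export
    • 1. Creating the MySQL Database and Tables
      • 1. Create the Database
      • 2. Create the Tables
    • 2. Data Export
      • 1. DataX Configuration File Generator Script
      • 2. Run the Configuration File Generator
      • 3. Write the Daily Export Script
  • Summary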


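Preface

This part continues the data warehouse development and should finish it, covering the ADS layer and the export of report data to MySQL.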


I. ADS

1. Trade Topic

1. Overall Trade Statistics

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_trade_stats
(
    `dt`                          STRING COMMENT 'Statistics date',
    `recent_days`                 BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `consultation_amount`         DECIMAL(16, 2) COMMENT 'Consultation amount',
    `consultation_count`          BIGINT COMMENT 'Consultation count',
    `consultation_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid consultation amount',
    `consultation_pay_suc_count`  BIGINT COMMENT 'Successfully paid consultation count',
    `prescription_amount`         DECIMAL(16, 2) COMMENT 'Prescription amount',
    `prescription_count`          BIGINT COMMENT 'Prescription count',
    `prescription_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid prescription amount',
    `prescription_pay_suc_count`  BIGINT COMMENT 'Successfully paid prescription count'
) COMMENT 'Overall trade statistics'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_trade_stats';

2. Trade Statistics by Hospital

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_hospital_trade_stats
(
    `dt`                          STRING COMMENT 'Statistics date',
    `recent_days`                 BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `hospital_id`                 STRING COMMENT 'Hospital ID',
    `hospital_name`               STRING COMMENT 'Hospital name',
    `consultation_amount`         DECIMAL(16, 2) COMMENT 'Consultation amount',
    `consultation_count`          BIGINT COMMENT 'Consultation count',
    `consultation_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid consultation amount',
    `consultation_pay_suc_count`  BIGINT COMMENT 'Successfully paid consultation count',
    `prescription_amount`         DECIMAL(16, 2) COMMENT 'Prescription amount',
    `prescription_count`          BIGINT COMMENT 'Prescription count',
    `prescription_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid prescription amount',
    `prescription_pay_suc_count`  BIGINT COMMENT 'Successfully paid prescription count'
) COMMENT 'Trade statistics by hospital'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_hospital_trade_stats';

3. Trade Statistics by Patient Gender

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_gender_trade_stats
(
    `dt`                          STRING COMMENT 'Statistics date',
    `recent_days`                 BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `gender_code`                 STRING COMMENT 'Patient gender code',
    `gender`                      STRING COMMENT 'Patient gender',
    `consultation_amount`         DECIMAL(16, 2) COMMENT 'Consultation amount',
    `consultation_count`          BIGINT COMMENT 'Consultation count',
    `consultation_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid consultation amount',
    `consultation_pay_suc_count`  BIGINT COMMENT 'Successfully paid consultation count',
    `prescription_amount`         DECIMAL(16, 2) COMMENT 'Prescription amount',
    `prescription_count`          BIGINT COMMENT 'Prescription count',
    `prescription_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid prescription amount',
    `prescription_pay_suc_count`  BIGINT COMMENT 'Successfully paid prescription count'
) COMMENT 'Trade statistics by patient gender'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_gender_trade_stats';

4. Trade Statistics by Patient Age Group

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_age_group_trade_stats
(
    `dt`                          STRING COMMENT 'Statistics date',
    `recent_days`                 BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `age_group`                   STRING COMMENT 'Patient age group',
    `consultation_amount`         DECIMAL(16, 2) COMMENT 'Consultation amount',
    `consultation_count`          BIGINT COMMENT 'Consultation count',
    `consultation_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid consultation amount',
    `consultation_pay_suc_count`  BIGINT COMMENT 'Successfully paid consultation count',
    `prescription_amount`         DECIMAL(16, 2) COMMENT 'Prescription amount',
    `prescription_count`          BIGINT COMMENT 'Prescription count',
    `prescription_pay_suc_amount` DECIMAL(16, 2) COMMENT 'Successfully paid prescription amount',
    `prescription_pay_suc_count`  BIGINT COMMENT 'Successfully paid prescription count'
) COMMENT 'Trade statistics by patient age group'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_age_group_trade_stats';

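Data loading: the load statements for these tables are in medical_dws_to_ads.sh, shown in the data loading script section below.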

2. Doctor Topic

1. Doctor Change Statistics

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_doctor_change_stats(
    `dt` STRING COMMENT 'Statistics date',
    `recent_days` BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `new_doctor_count` BIGINT COMMENT 'Number of new doctors',
    `activated_doctor_count` BIGINT COMMENT 'Number of activated doctors',
    `active_doctor_count` BIGINT COMMENT 'Number of active doctors'
) COMMENT 'Doctor change statistics'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_doctor_change_stats';

3. User Topic

1. User Change Statistics

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_user_change_stats(
    `dt` STRING COMMENT 'Statistics date',
    `recent_days` BIGINT COMMENT 'Statistics period: last 1, 7, or 30 days',
    `new_user_count` BIGINT COMMENT 'Number of new users',
    `new_patient_count` BIGINT COMMENT 'Number of new patients'
) COMMENT 'User change statistics'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_user_change_stats';

4. Review Topic

1. Overall Review Statistics

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_review_stats(
    `dt` STRING COMMENT 'Statistics date',
    `review_user_count` BIGINT COMMENT 'Number of reviewing users',
    `review_count` BIGINT COMMENT 'Number of reviews',
    `good_review_rate` DECIMAL(16,2) COMMENT 'Positive review rate'
) COMMENT 'Overall review statistics'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_review_stats';

2. Review Statistics by Hospital

Table creation statement

CREATE EXTERNAL TABLE IF NOT EXISTS ads_hospital_review_stats(
    `dt` STRING COMMENT 'Statistics date',
    `hospital_id` STRING COMMENT 'Hospital ID',
    `hospital_name` STRING COMMENT 'Hospital name',
    `review_user_count` BIGINT COMMENT 'Number of reviewing users',
    `review_count` BIGINT COMMENT 'Number of reviews',
    `good_review_rate` DECIMAL(16,2) COMMENT 'Positive review rate'
) COMMENT 'Review statistics by hospital'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/warehouse/medical/ads/ads_hospital_review_stats';

5. Data Loading Script

vim ~/bin/medical_dws_to_ads.sh

#!/bin/bash

APP=medical

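# The second argument is the statistics date. It must be quoted in the test:
# with an empty value, an unquoted [ -n $2 ] is always true.
if [ -n "$2" ]
then
    do_date=$2
else
    echo "Please pass in a date argument!!!"
    exit 1
fi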

ads_trade_stats="
insert overwrite table ${APP}.ads_trade_stats
select dt,
       recent_days,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from ${APP}.ads_trade_stats
union
select '$do_date' dt,
       consul.recent_days,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from (select 1                        recent_days,
             sum(consultation_amount) consultation_amount,
             sum(consultation_count)  consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_1d
      where dt = '$do_date'
      union
      select recent_days,
             sum(if(recent_days = 7, consultation_amount_7d, consultation_amount_30d)) consultation_amount,
             sum(if(recent_days = 7, consultation_count_7d, consultation_count_30d))   consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days) consul
         left join
     (select 1                                recent_days,
             sum(consultation_pay_suc_amount) consultation_pay_suc_amount,
             sum(consultation_pay_suc_count)  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_1d
      where dt = '$do_date'
      union
      select recent_days,
             sum(if(recent_days = 7, consultation_pay_suc_amount_7d,
                    consultation_pay_suc_amount_30d)) consultation_pay_suc_amount,
             sum(if(recent_days = 7, consultation_pay_suc_count_7d,
                    consultation_pay_suc_count_30d))  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days) consul_pay_suc
     on consul.recent_days = consul_pay_suc.recent_days
         left join
     (select 1                        recent_days,
             sum(prescription_amount) prescription_amount,
             sum(prescription_count)  prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_1d
      where dt = '$do_date'
      union
      select recent_days,
             sum(if(recent_days = 7, prescription_amount_7d, prescription_amount_30d)) prescription_amount,
             sum(if(recent_days = 7, prescription_count_7d, prescription_count_30d))   prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days) prescription
     on consul.recent_days = prescription.recent_days
         left join
     (select 1                                recent_days,
             sum(prescription_pay_suc_amount) prescription_pay_suc_amount,
             sum(prescription_pay_suc_count)  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_1d
      where dt = '$do_date'
      union
      select recent_days,
             sum(if(recent_days = 7, prescription_pay_suc_amount_7d,
                    prescription_pay_suc_amount_30d)) prescription_pay_suc_amount,
             sum(if(recent_days = 7, prescription_pay_suc_count_7d,
                    prescription_pay_suc_count_30d))  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days) prescription_pay_suc
     on consul.recent_days = prescription_pay_suc.recent_days;
"
ads_hospital_trade_stats="
insert overwrite table ${APP}.ads_hospital_trade_stats
select dt,
       recent_days,
       hospital_id,
       hospital_name,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from ${APP}.ads_hospital_trade_stats
union
select '$do_date' dt,
       consul.recent_days,
       consul.hospital_id,
       consul.hospital_name,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from (select 1                        recent_days,
             hospital_id,
             hospital_name,
             sum(consultation_amount) consultation_amount,
             sum(consultation_count)  consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_1d
      where dt = '$do_date'
      group by hospital_id,
               hospital_name
      union
      select recent_days,
             hospital_id,
             hospital_name,
             sum(if(recent_days = 7, consultation_amount_7d, consultation_amount_30d)) consultation_amount,
             sum(if(recent_days = 7, consultation_count_7d, consultation_count_30d))   consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               hospital_id,
               hospital_name) consul
         left join
     (select 1                                recent_days,
             hospital_id,
             hospital_name,
             sum(consultation_pay_suc_amount) consultation_pay_suc_amount,
             sum(consultation_pay_suc_count)  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_1d
      where dt = '$do_date'
      group by hospital_id,
               hospital_name
      union
      select recent_days,
             hospital_id,
             hospital_name,
             sum(if(recent_days = 7, consultation_pay_suc_amount_7d,
                    consultation_pay_suc_amount_30d)) consultation_pay_suc_amount,
             sum(if(recent_days = 7, consultation_pay_suc_count_7d,
                    consultation_pay_suc_count_30d))  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               hospital_id,
               hospital_name) consul_pay_suc
     on consul.recent_days = consul_pay_suc.recent_days
         and consul.hospital_id = consul_pay_suc.hospital_id
         and consul.hospital_name = consul_pay_suc.hospital_name
         left join
     (select 1                        recent_days,
             hospital_id,
             hospital_name,
             sum(prescription_amount) prescription_amount,
             sum(prescription_count)  prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_1d
      where dt = '$do_date'
      group by hospital_id,
               hospital_name
      union
      select recent_days,
             hospital_id,
             hospital_name,
             sum(if(recent_days = 7, prescription_amount_7d, prescription_amount_30d)) prescription_amount,
             sum(if(recent_days = 7, prescription_count_7d, prescription_count_30d))   prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               hospital_id,
               hospital_name) prescription
     on consul.recent_days = prescription.recent_days
         and consul.hospital_id = prescription.hospital_id
         and consul.hospital_name = prescription.hospital_name
         left join
     (select 1                                recent_days,
             hospital_id,
             hospital_name,
             sum(prescription_pay_suc_amount) prescription_pay_suc_amount,
             sum(prescription_pay_suc_count)  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_1d
      where dt = '$do_date'
      group by hospital_id,
               hospital_name
      union
      select recent_days,
             hospital_id,
             hospital_name,
             sum(if(recent_days = 7, prescription_pay_suc_amount_7d,
                    prescription_pay_suc_amount_30d)) prescription_pay_suc_amount,
             sum(if(recent_days = 7, prescription_pay_suc_count_7d,
                    prescription_pay_suc_count_30d))  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               hospital_id,
               hospital_name) prescription_pay_suc
     on consul.recent_days = prescription_pay_suc.recent_days
         and consul.hospital_id = prescription_pay_suc.hospital_id
         and consul.hospital_name = prescription_pay_suc.hospital_name;
"
ads_gender_trade_stats="
insert overwrite table ${APP}.ads_gender_trade_stats
select dt,
       recent_days,
       gender_code,
       gender,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from ${APP}.ads_gender_trade_stats
union
select '$do_date' dt,
       consul.recent_days,
       consul.gender_code,
       consul.gender,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from (select 1                        recent_days,
             gender_code,
             gender,
             sum(consultation_amount) consultation_amount,
             sum(consultation_count)  consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_1d
      where dt = '$do_date'
      group by gender_code,
               gender
      union
      select recent_days,
             gender_code,
             gender,
             sum(if(recent_days = 7, consultation_amount_7d, consultation_amount_30d)) consultation_amount,
             sum(if(recent_days = 7, consultation_count_7d, consultation_count_30d))   consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               gender_code,
               gender) consul
         left join
     (select 1                                recent_days,
             gender_code,
             gender,
             sum(consultation_pay_suc_amount) consultation_pay_suc_amount,
             sum(consultation_pay_suc_count)  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_1d
      where dt = '$do_date'
      group by gender_code,
               gender
      union
      select recent_days,
             gender_code,
             gender,
             sum(if(recent_days = 7, consultation_pay_suc_amount_7d,
                    consultation_pay_suc_amount_30d)) consultation_pay_suc_amount,
             sum(if(recent_days = 7, consultation_pay_suc_count_7d,
                    consultation_pay_suc_count_30d))  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               gender_code,
               gender) consul_pay_suc
     on consul.recent_days = consul_pay_suc.recent_days
         and consul.gender_code = consul_pay_suc.gender_code
         and consul.gender = consul_pay_suc.gender
         left join
     (select 1                        recent_days,
             gender_code,
             gender,
             sum(prescription_amount) prescription_amount,
             sum(prescription_count)  prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_1d
      where dt = '$do_date'
      group by gender_code,
               gender
      union
      select recent_days,
             gender_code,
             gender,
             sum(if(recent_days = 7, prescription_amount_7d, prescription_amount_30d)) prescription_amount,
             sum(if(recent_days = 7, prescription_count_7d, prescription_count_30d))   prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               gender_code,
               gender) prescription
     on consul.recent_days = prescription.recent_days
         and consul.gender_code = prescription.gender_code
         and consul.gender = prescription.gender
         left join
     (select 1                                recent_days,
             gender_code,
             gender,
             sum(prescription_pay_suc_amount) prescription_pay_suc_amount,
             sum(prescription_pay_suc_count)  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_1d
      where dt = '$do_date'
      group by gender_code,
               gender
      union
      select recent_days,
             gender_code,
             gender,
             sum(if(recent_days = 7, prescription_pay_suc_amount_7d,
                    prescription_pay_suc_amount_30d)) prescription_pay_suc_amount,
             sum(if(recent_days = 7, prescription_pay_suc_count_7d,
                    prescription_pay_suc_count_30d))  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               gender_code,
               gender) prescription_pay_suc
     on consul.recent_days = prescription_pay_suc.recent_days
         and consul.gender_code = prescription_pay_suc.gender_code
         and consul.gender = prescription_pay_suc.gender;
"
ads_age_group_trade_stats="
insert overwrite table ${APP}.ads_age_group_trade_stats
select dt,
       recent_days,
       age_group,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from ${APP}.ads_age_group_trade_stats
union
select '$do_date' dt,
       consul.recent_days,
       consul.age_group,
       consultation_amount,
       consultation_count,
       consultation_pay_suc_amount,
       consultation_pay_suc_count,
       prescription_amount,
       prescription_count,
       prescription_pay_suc_amount,
       prescription_pay_suc_count
from (select 1                        recent_days,
             age_group,
             sum(consultation_amount) consultation_amount,
             sum(consultation_count)  consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_1d
      where dt = '$do_date'
      group by age_group
      union
      select recent_days,
             age_group,
             sum(if(recent_days = 7, consultation_amount_7d, consultation_amount_30d)) consultation_amount,
             sum(if(recent_days = 7, consultation_count_7d, consultation_count_30d))   consultation_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               age_group) consul
         left join
     (select 1                                recent_days,
             age_group,
             sum(consultation_pay_suc_amount) consultation_pay_suc_amount,
             sum(consultation_pay_suc_count)  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_1d
      where dt = '$do_date'
      group by age_group
      union
      select recent_days,
             age_group,
             sum(if(recent_days = 7, consultation_pay_suc_amount_7d,
                    consultation_pay_suc_amount_30d)) consultation_pay_suc_amount,
             sum(if(recent_days = 7, consultation_pay_suc_count_7d,
                    consultation_pay_suc_count_30d))  consultation_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_consultation_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               age_group) consul_pay_suc
     on consul.recent_days = consul_pay_suc.recent_days
         and consul.age_group = consul_pay_suc.age_group
         left join
     (select 1                        recent_days,
             age_group,
             sum(prescription_amount) prescription_amount,
             sum(prescription_count)  prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_1d
      where dt = '$do_date'
      group by age_group
      union
      select recent_days,
             age_group,
             sum(if(recent_days = 7, prescription_amount_7d, prescription_amount_30d)) prescription_amount,
             sum(if(recent_days = 7, prescription_count_7d, prescription_count_30d))   prescription_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               age_group) prescription
     on consul.recent_days = prescription.recent_days
         and consul.age_group = prescription.age_group
         left join
     (select 1                                recent_days,
             age_group,
             sum(prescription_pay_suc_amount) prescription_pay_suc_amount,
             sum(prescription_pay_suc_count)  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_1d
      where dt = '$do_date'
      group by age_group
      union
      select recent_days,
             age_group,
             sum(if(recent_days = 7, prescription_pay_suc_amount_7d,
                    prescription_pay_suc_amount_30d)) prescription_pay_suc_amount,
             sum(if(recent_days = 7, prescription_pay_suc_count_7d,
                    prescription_pay_suc_count_30d))  prescription_pay_suc_count
      from ${APP}.dws_trade_hospital_gender_age_group_prescription_pay_suc_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
      group by recent_days,
               age_group) prescription_pay_suc
     on consul.recent_days = prescription_pay_suc.recent_days
         and consul.age_group = prescription_pay_suc.age_group;
"
ads_doctor_change_stats="
insert overwrite table ${APP}.ads_doctor_change_stats
select dt,
       recent_days,
       new_doctor_count,
       activated_doctor_count,
       active_doctor_count
from ${APP}.ads_doctor_change_stats
union
select '$do_date' dt,
       new.recent_days,
       new_doctor_count,
       activated_doctor_count,
       active_doctor_count
from (select recent_days,
             count(*) new_doctor_count
      from ${APP}.dwd_doctor_register_inc lateral view explode(array(1, 7, 30)) tmp as recent_days
      where dt >= date_add('$do_date', -recent_days + 1)
      group by recent_days) new
         left join
     (select recent_days,
             count(*) activated_doctor_count
      from ${APP}.dws_trade_doctor_consultation_td lateral view explode(array(1, 7, 30)) tmp as recent_days
      where dt = '$do_date'
        and first_consultation_dt >= date_add('$do_date', -recent_days + 1)
      group by recent_days) activated
     on new.recent_days = activated.recent_days
         left join
     (select 1        recent_days,
             count(*) active_doctor_count
      from ${APP}.dws_trade_doctor_consultation_1d
      where dt = '$do_date'
        and consultation_count >= 2
      union
      select recent_days,
             count(*) active_doctor_count
      from ${APP}.dws_trade_doctor_consultation_nd lateral view explode(array(7, 30)) tmp as recent_days
      where dt = '$do_date'
        and ((recent_days = 7 and consultation_count_7d >= 2)
          or (recent_days = 30 and consultation_count_30d >= 2))
      group by recent_days) active
     on new.recent_days = active.recent_days;
"
ads_user_change_stats="
insert overwrite table ${APP}.ads_user_change_stats
select dt,
       recent_days,
       new_user_count,
       new_patient_count
from ${APP}.ads_user_change_stats
union
select '$do_date' dt,
       new_user.recent_days,
       new_user_count,
       new_patient_count
from (select recent_days,
             count(*) new_user_count
      from ${APP}.dwd_user_register_inc lateral view explode(array(1, 7, 30)) tmp as recent_days
      where dt >= date_add('$do_date', -recent_days + 1)
      group by recent_days) new_user
         left join
     (select recent_days,
             count(*) new_patient_count
      from ${APP}.dwd_user_patient_add_inc lateral view explode(array(1, 7, 30)) tmp as recent_days
      where dt >= date_add('$do_date', -recent_days + 1)
      group by recent_days) new_patient
     on new_user.recent_days = new_patient.recent_days;
"
ads_review_stats="
insert overwrite table ${APP}.ads_review_stats
select dt,
       review_user_count,
       review_count,
       good_review_rate
from ${APP}.ads_review_stats
union
select '$do_date' dt,
       review_user_count,
       review_count,
       good_review_rate
from (select count(distinct user_id) review_user_count
      from ${APP}.dws_interaction_hospital_user_review_td
      where dt = '$do_date') user_count
         left join
     (select sum(review_count)                          review_count,
             sum(good_review_count) / sum(review_count) good_review_rate
      from ${APP}.dws_interaction_hospital_review_td
      where dt = '$do_date') review_stats;
"
ads_hospital_review_stats="
insert overwrite table ${APP}.ads_hospital_review_stats
select dt,
       hospital_id,
       hospital_name,
       review_user_count,
       review_count,
       good_review_rate
from ${APP}.ads_hospital_review_stats
union
select '$do_date' dt,
       user_count.hospital_id,
       user_count.hospital_name,
       review_user_count,
       review_count,
       good_review_rate
from (select hospital_id,
             hospital_name,
             count(user_id) review_user_count
      from ${APP}.dws_interaction_hospital_user_review_td
      where dt = '$do_date'
      group by hospital_id,
               hospital_name) user_count
         left join
     (select hospital_id,
             hospital_name,
             review_count,
             good_review_count / review_count good_review_rate
      from ${APP}.dws_interaction_hospital_review_td
      where dt = '$do_date') review_stats
     on user_count.hospital_id = review_stats.hospital_id
         and user_count.hospital_name = review_stats.hospital_name;
"

case $1 in 
    ads_trade_stats | ads_hospital_trade_stats | ads_gender_trade_stats | ads_age_group_trade_stats | ads_doctor_change_stats | ads_user_change_stats | ads_review_stats | ads_hospital_review_stats)
    hive -e "${!1}"
    ;;
    "all")
    hive -e "$ads_trade_stats$ads_hospital_trade_stats$ads_gender_trade_stats$ads_age_group_trade_stats$ads_doctor_change_stats$ads_user_change_stats$ads_review_stats$ads_hospital_review_stats"
    ;;
    *)
    echo "Illegal argument!!!"
    ;;
esac
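
Three shell details in a script like this are easy to get wrong: quoting the date test, the `${!1}` indirect expansion, and the catch-all `case` pattern. The following is a minimal sketch of each, not the author's code. `require_date`, `lookup_sql`, `dispatch`, and `ads_demo_stats` are hypothetical names used only for illustration, and the sketch assumes bash.

```shell
#!/bin/bash
# Hypothetical helpers; only the three shell idioms are the point.

# 1. Quote the parameter. With an empty value, the unquoted test
#    [ -n $1 ] collapses to [ -n ], which is always true.
require_date() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        echo "missing date argument" >&2
        return 1
    fi
}

# 2. ${!name} is bash indirect expansion: it yields the value of the
#    variable whose name is stored in $name.
ads_demo_stats="select 1;"
lookup_sql() {
    local name=$1
    echo "${!name}"
}

# 3. An unquoted *) is the catch-all branch. A quoted "*" would only
#    match an argument that is literally an asterisk.
dispatch() {
    case $1 in
        ads_demo_stats) echo "run one" ;;
        all)            echo "run all" ;;
        *)              echo "illegal argument" ;;
    esac
}
```

In the load script, the indirect expansion is what lets the table name passed as the first argument select the SQL variable of the same name.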

Data loading: run the script for all tables, passing the statistics date as the second argument.
medical_dws_to_ads.sh all 2023-05-09
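
To load several days in a row, for example when backfilling, the same command can be repeated once per date. The sketch below assumes GNU `date` (for the `-d` option). `list_dates` is a hypothetical helper, and the loop only prints the commands so they can be checked before anything is run.

```shell
#!/bin/bash
# list_dates START END: print every date from START to END inclusive,
# in YYYY-MM-DD form. Assumes GNU date.
list_dates() {
    local cur=$1 end=$2
    # Strip the dashes so the dates compare as plain integers.
    while [ "$(echo "$cur" | tr -d '-')" -le "$(echo "$end" | tr -d '-')" ]; do
        echo "$cur"
        cur=$(date -u -d "$cur + 1 day" +%F)
    done
}

# Dry run: print one load command per date instead of executing it.
for d in $(list_dates 2023-05-07 2023-05-09); do
    echo "medical_dws_to_ads.sh all $d"
done
```

Removing the `echo` would run the loads for real. Each run reads the DWS data for its date, so those partitions need to exist first.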
Query any one of the ADS tables to confirm that the data has been loaded.
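
One way to do that check from the shell is sketched below. `check_sql` is a hypothetical helper that only builds the query text, and the database name `medical` is taken from the `APP` variable in the load script.

```shell
#!/bin/bash
APP=medical   # database name, as in the load script

# check_sql TABLE DATE: print a query that lists the rows loaded for DATE.
check_sql() {
    echo "select * from ${APP}.$1 where dt = '$2';"
}

# Show the query that would be sent to Hive.
check_sql ads_trade_stats 2023-05-09
```

The printed statement can be passed to Hive with `hive -e "$(check_sql ads_trade_stats 2023-05-09)"`. For the tables that carry `recent_days`, a complete load should normally show one row per period (1, 7, 30) for each dimension value.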

II. Report Data Export

1. Creating the MySQL Database and Tables

1. Create the Database

CREATE DATABASE IF NOT EXISTS medical_report DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

2. Create the Tables

1. Overall Trade Statistics

CREATE TABLE `ads_trade_stats` (
  `dt` date NOT NULL COMMENT 'Statistics date',
  `recent_days` bigint NOT NULL COMMENT 'Statistics period: last 1, 7, or 30 days',
  `consultation_amount` decimal(16,2) DEFAULT NULL COMMENT 'Consultation amount',
  `consultation_count` bigint DEFAULT NULL COMMENT 'Consultation count',
  `consultation_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT 'Successfully paid consultation amount',
  `consultation_pay_suc_count` bigint DEFAULT NULL COMMENT 'Successfully paid consultation count',
  `prescription_amount` decimal(16,2) DEFAULT NULL COMMENT 'Prescription amount',
  `prescription_count` bigint DEFAULT NULL COMMENT 'Prescription count',
  `prescription_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT 'Successfully paid prescription amount',
  `prescription_pay_suc_count` bigint DEFAULT NULL COMMENT 'Successfully paid prescription count',
  PRIMARY KEY (`dt`,`recent_days`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='Overall trade statistics';

2. Trade Statistics by Hospital

CREATE TABLE `ads_hospital_trade_stats` (
  `dt` date NOT NULL COMMENT 'Statistics date',
  `recent_days` bigint NOT NULL COMMENT 'Statistics period: last 1, 7, or 30 days',
  `hospital_id` varchar(255) NOT NULL COMMENT 'Hospital ID',
  `hospital_name` varchar(255) NOT NULL COMMENT 'Hospital name',
  `consultation_amount` decimal(16,2) DEFAULT NULL COMMENT 'Consultation amount',
  `consultation_count` bigint DEFAULT NULL COMMENT 'Consultation count',
  `consultation_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT 'Successfully paid consultation amount',
  `consultation_pay_suc_count` bigint DEFAULT NULL COMMENT 'Successfully paid consultation count',
  `prescription_amount` decimal(16,2) DEFAULT NULL COMMENT 'Prescription amount',
  `prescription_count` bigint DEFAULT NULL COMMENT 'Prescription count',
  `prescription_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT 'Successfully paid prescription amount',
  `prescription_pay_suc_count` bigint DEFAULT NULL COMMENT 'Successfully paid prescription count',
  PRIMARY KEY (`dt`,`recent_days`,`hospital_id`,`hospital_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='Trade statistics by hospital';

3. Trade statistics by patient gender

CREATE TABLE `ads_gender_trade_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `recent_days` bigint NOT NULL COMMENT '统计周期: 最近1,7,30日',
  `gender_code` varchar(255) NOT NULL COMMENT '患者性别编码',
  `gender` varchar(255) NOT NULL COMMENT '患者性别',
  `consultation_amount` decimal(16,2) DEFAULT NULL COMMENT '问诊金额',
  `consultation_count` bigint DEFAULT NULL COMMENT '问诊次数',
  `consultation_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT '问诊支付成功金额',
  `consultation_pay_suc_count` bigint DEFAULT NULL COMMENT '问诊支付成功次数',
  `prescription_amount` decimal(16,2) DEFAULT NULL COMMENT '处方金额',
  `prescription_count` bigint DEFAULT NULL COMMENT '处方次数',
  `prescription_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT '处方支付成功金额',
  `prescription_pay_suc_count` bigint DEFAULT NULL COMMENT '处方支付成功次数',
  PRIMARY KEY (`dt`,`recent_days`,`gender_code`,`gender`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='各性别患者交易统计';

4. Trade statistics by patient age group

CREATE TABLE `ads_age_group_trade_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `recent_days` bigint NOT NULL COMMENT '统计周期: 最近1,7,30日',
  `age_group` varchar(255) NOT NULL COMMENT '患者年龄段',
  `consultation_amount` decimal(16,2) DEFAULT NULL COMMENT '问诊金额',
  `consultation_count` bigint DEFAULT NULL COMMENT '问诊次数',
  `consultation_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT '问诊支付成功金额',
  `consultation_pay_suc_count` bigint DEFAULT NULL COMMENT '问诊支付成功次数',
  `prescription_amount` decimal(16,2) DEFAULT NULL COMMENT '处方金额',
  `prescription_count` bigint DEFAULT NULL COMMENT '处方次数',
  `prescription_pay_suc_amount` decimal(16,2) DEFAULT NULL COMMENT '处方支付成功金额',
  `prescription_pay_suc_count` bigint DEFAULT NULL COMMENT '处方支付成功次数',
  PRIMARY KEY (`dt`,`recent_days`,`age_group`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='各年龄段患者交易统计';

5. Doctor change statistics

CREATE TABLE `ads_doctor_change_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `recent_days` bigint NOT NULL COMMENT '统计周期: 最近1,7,30日',
  `new_doctor_count` bigint DEFAULT NULL COMMENT '新增医生数',
  `activated_doctor_count` bigint DEFAULT NULL COMMENT '激活医生数',
  `active_doctor_count` bigint DEFAULT NULL COMMENT '活跃医生数',
  PRIMARY KEY (`dt`,`recent_days`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='医生变动统计';

6. User change statistics

CREATE TABLE `ads_user_change_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `recent_days` bigint NOT NULL COMMENT '统计周期: 最近1,7,30日',
  `new_user_count` bigint DEFAULT NULL COMMENT '新增用户数',
  `new_patient_count` bigint DEFAULT NULL COMMENT '新增患者数',
  PRIMARY KEY (`dt`,`recent_days`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='用户变动统计';

7. Overall review statistics

CREATE TABLE `ads_review_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `review_user_count` bigint DEFAULT NULL COMMENT '评价人数',
  `review_count` bigint DEFAULT NULL COMMENT '评价次数',
  `good_review_rate` decimal(16,2) DEFAULT NULL COMMENT '好评率',
  PRIMARY KEY (`dt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='评价综合统计';

8. Review statistics by hospital

CREATE TABLE `ads_hospital_review_stats` (
  `dt` date NOT NULL COMMENT '统计日期',
  `hospital_id` varchar(255) NOT NULL COMMENT '医院ID',
  `hospital_name` varchar(255) NOT NULL COMMENT '医院名称',
  `review_user_count` bigint DEFAULT NULL COMMENT '评价人数',
  `review_count` bigint DEFAULT NULL COMMENT '评价次数',
  `good_review_rate` decimal(16,2) DEFAULT NULL COMMENT '好评率',
  PRIMARY KEY (`dt`,`hospital_id`,`hospital_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='各医院评价统计';

2. Data Export

1. Configure the DataX config file generator

vim /opt/module/gen_datax_config/configuration.properties

mysql.username=root
mysql.password=000000
mysql.host=hadoop102
mysql.port=3306
mysql.database.import=medical
# Name of the MySQL database that receives the data exported from HDFS
mysql.database.export=medical_report
mysql.tables.import=dict,doctor,hospital,medicine,patient,user
# Tables in the MySQL database to export into; an empty value means all tables in the database
mysql.tables.export=
is.seperated.tables=0
hdfs.uri=hdfs://hadoop102:8020
import_out_dir=/opt/module/datax/job/medical/import
# Output directory for the generated DataX export config files
export_out_dir=/opt/module/datax/job/medical/export

2. Run the config file generator

java -jar datax-config-generator-1.0-SNAPSHOT-jar-with-dependencies.jar
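
After the generator finishes, it may be worth confirming that one export config exists for every ADS table before scheduling anything. The sketch below assumes the file naming medical_report.<table>.json used by the export script later in this article; missing_export_configs is a hypothetical helper, not part of the generator.

```shell
#!/bin/bash
# Sketch only: prints the name of every ADS table that has no generated
# DataX export config in the given directory.
# missing_export_configs is a hypothetical helper introduced for illustration.

missing_export_configs() {
    local dir=$1 tab
    for tab in ads_trade_stats ads_hospital_trade_stats ads_gender_trade_stats \
               ads_age_group_trade_stats ads_doctor_change_stats ads_user_change_stats \
               ads_review_stats ads_hospital_review_stats
    do
        # Assumed naming convention: medical_report.<table>.json
        [ -f "$dir/medical_report.$tab.json" ] || echo "$tab"
    done
}

# export_out_dir from configuration.properties
missing_export_configs /opt/module/datax/job/medical/export
```

An empty result means all eight config files are present.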

3. Write the daily export script

vim ~/bin/medical_hdfs_to_mysql.sh

#!/bin/bash

DATAX_HOME=/opt/module/datax

# Delete zero-length files under the given HDFS path.
handle_path(){
	# Select regular files only (permission string starts with "-"); column 8 is the path.
	for file in $(hadoop fs -ls -R "$1" | awk '$1 ~ /^-/ {print $8}')
	do
		# -test -z returns 0 when the file is zero length.
		if hadoop fs -test -z "$file"
		then
			echo "File $file is empty, deleting ..."
			hadoop fs -rm -f -r "$file"
		fi
	done
}

# Export one HDFS directory to MySQL using the given DataX config file.
export_data(){
	export_dir=$1
	datax_config=$2

	echo "Checking directory $export_dir ..."
	handle_path "$export_dir"
	# The second column of `hadoop fs -count` is the number of files.
	count=$(hadoop fs -count "$export_dir" | awk '{print $2}')
	if [[ $count -eq 0 ]]
	then
		echo "Directory is empty, skipping"
	else
		echo "Exporting directory $export_dir ..."
		$DATAX_HOME/bin/datax.py -p"-Dexportdir=$export_dir" "$datax_config" >$DATAX_HOME/job/medical/export.log 2>&1
		if [[ $? -ne 0 ]]
		then
			echo "Export failed, log follows ..."
			cat $DATAX_HOME/job/medical/export.log
		fi
	fi
}

case $1 in
	ads_trade_stats | ads_hospital_trade_stats | ads_gender_trade_stats | ads_age_group_trade_stats | ads_doctor_change_stats | ads_user_change_stats | ads_review_stats | ads_hospital_review_stats)
	export_data /warehouse/medical/ads/$1 $DATAX_HOME/job/medical/export/medical_report.$1.json
	;;
	"all")
	for tab in ads_trade_stats ads_hospital_trade_stats ads_gender_trade_stats ads_age_group_trade_stats ads_doctor_change_stats ads_user_change_stats ads_review_stats ads_hospital_review_stats
	do
		export_data /warehouse/medical/ads/${tab} $DATAX_HOME/job/medical/export/medical_report.${tab}.json
	done
	;;
	*)
	echo "Illegal argument!"
	;;
esac
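
Before running the export for real, a dry run that only prints the DataX command for each table can help catch a wrong path or a missing config. The sketch below reuses the HDFS and config paths from the script above; print_export_commands is a hypothetical helper, and nothing in it calls Hadoop or DataX.

```shell
#!/bin/bash
# Sketch only: prints the DataX command for each ADS table instead of running it.
DATAX_HOME=${DATAX_HOME:-/opt/module/datax}

ADS_TABLES="ads_trade_stats ads_hospital_trade_stats ads_gender_trade_stats ads_age_group_trade_stats ads_doctor_change_stats ads_user_change_stats ads_review_stats ads_hospital_review_stats"

# print_export_commands is a hypothetical helper, not part of the original script.
print_export_commands() {
    local tab
    for tab in $ADS_TABLES
    do
        echo "$DATAX_HOME/bin/datax.py -p\"-Dexportdir=/warehouse/medical/ads/$tab\" $DATAX_HOME/job/medical/export/medical_report.$tab.json"
    done
}

print_export_commands
```

Each printed line can be copied and run by hand to export a single table.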

Make the script executable:
chmod +x ~/bin/medical_hdfs_to_mysql.sh
Run the export:
medical_hdfs_to_mysql.sh all
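
To confirm that the export reached MySQL, one option is to count the rows in each report table. The sketch below only generates the SQL; row_count_sql is a hypothetical helper, while the database name medical_report and the table names come from the DDL above.

```shell
#!/bin/bash
# Sketch only: emits one COUNT(*) query per report table in medical_report.
# row_count_sql is a hypothetical helper introduced for illustration.

row_count_sql() {
    local tab
    for tab in ads_trade_stats ads_hospital_trade_stats ads_gender_trade_stats \
               ads_age_group_trade_stats ads_doctor_change_stats ads_user_change_stats \
               ads_review_stats ads_hospital_review_stats
    do
        echo "SELECT '$tab' AS table_name, COUNT(*) AS row_count FROM medical_report.$tab;"
    done
}

row_count_sql
```

Assuming the mysql client is installed, the output can be piped into it, for example: row_count_sql | mysql -h hadoop102 -P 3306 -u root -p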


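Summary

This concludes the data warehouse development: the ADS tables are built, loaded, and exported to MySQL for reporting.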
