18:15分接到电话说要配合项目更新做定时的数据同步导入操作,在晚饭前赶着做完了等他们测试,结果饭还没扒几口,说数据有空格导致的问题,改咯(要么在导出端改,要么导入端改):
数据源库是Oracle,使用SPOOL导出为TXT格式:
set linesize 1000
set pagesize 0
set echo off
set heading off
set feedback off
set colsep "||"
SET trims ON
set term off
SET trimspool ON
SET trimout ON
spool '/mnt/upload/wxspool/fansstat.txt'
SELECT to_char(trunc(sysdate - 1), 'yyyy-MM-dd') as statDate, trim(sex), trim(country), trim(province), trim(city), count(1) as fansCnt
FROM tb_sdgame_fans
WHERE orignalid = 'gh_82be7859af49' AND subscribe = 1
GROUP BY sex, country, province, city;
注意:以上的导出脚本并不能解决导出每列数据的去空格问题。
trim()函数的去空格功能在SPOOL导出数据得不到体现,SQL*PLUS命令窗口执行SELECT时,如果不用COL命令去显示每列的宽度,即便trim()了空格,也会按照字段原始定义的宽度以空格形式补全列值,参详如下例子:
SQL> desc tb_sdgame_fans;
Name Null? Type
----------------------------------------- -------- ----------------------------
CITY VARCHAR2(256)
COUNTRY VARCHAR2(256)
SQL> SELECT country,city FROM tb_sdgame_fans WHERE orignalid = 'gh_82be7859af49' AND subscribe = 1 AND ROWNUM<5;
COUNTRY
--------------------------------------------------------------------------------
CITY
--------------------------------------------------------------------------------
中国
海口
安道尔
中国
大兴
COUNTRY
--------------------------------------------------------------------------------
CITY
--------------------------------------------------------------------------------
中国
西安
SQL> SELECT trim(country),trim(city) FROM tb_sdgame_fans WHERE orignalid = 'gh_82be7859af49' AND subscribe = 1 AND ROWNUM<5;
TRIM(COUNTRY)
--------------------------------------------------------------------------------
TRIM(CITY)
--------------------------------------------------------------------------------
中国
海口
安道尔
中国
大兴
TRIM(COUNTRY)
--------------------------------------------------------------------------------
TRIM(CITY)
--------------------------------------------------------------------------------
中国
西安
SQL> col country for a10;
SQL> col city for a10;
SQL> SELECT country,city FROM tb_sdgame_fans WHERE orignalid = 'gh_82be7859af49' AND subscribe = 1 AND ROWNUM<5;
COUNTRY CITY
---------- ----------
中国 海口
安道尔
中国 大兴
中国 西安
换一种思路,将**各列值连接**来解决问题:
set linesize 1000
set pagesize 0
set echo off
set heading off
set feedback off
--set colsep "||"
SET trims ON
set term off
SET trimspool ON
SET trimout ON
spool '/mnt/upload/wxspool/fansstat.txt'
--如下select以“,”分隔各列值,上面的set colsep "||"就无需定义了
SELECT to_char(trunc(sysdate - 1), 'yyyy-MM-dd')||','||sex||','||country||','||province||','||city||','||count(1) as fansCnt
FROM tb_wxyx_fans
WHERE orignalid = 'gh_82be7859af49' AND subscribe = 1
GROUP BY sex, country, province, city;
spool off;
目标库是MySQL,使用LOAD DATA INFILE导入,在已有空格的TXT数据文件基础上使用trim()函数来实现过滤空格:
load data local infile "/mnt/upload/wxspool/userstat.txt" into table tb_user_dimensions
fields terminated by '||' (statDate,sex,country, province,city,fansCnt)
set statDate=statDate,sex=trim(sex),country=trim(country), province=trim(province),city=trim(city),fansCnt=fansCnt;
完毕!
2020年08月04日
K4全身强化进行中,刚虐完两组波比然后接到工作电话,害怕电话那端脑补喘气音,急着解释说在做运动,然后,空气和信号都尴尬了,窒息啊!!!