Windows (Win10)
Open cmd.
First run sqlplus, then enter the user name and password when prompted.
C:\Users\hasee>sqlplus

SQL*Plus: Release 11.2.0.1.0 Production on 星期三 3月 13 16:55:46 2019

Copyright (c) 1982, 2010, Oracle. All rights reserved.

请输入用户名: scott
输入口令:

连接到:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Connecting as the database administrator
[oracle@CentOS7One ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.1.0 Production on 星期一 3月 18 14:28:40 2019

Copyright (c) 1982, 2009, Oracle. All rights reserved.

已连接到空闲例程。
1. Exporting a table with exp
Exit SQL*Plus, then export the buy_cnt_c1 table owned by user scott in the orcl database to e:/opt/oracle_output/daochu.dmp.
C:\Users\hasee>exp scott/tiger@orcl file=e:\opt\\oracle_output\daochu.dmp tables=(buy_cnt_c1)

Export: Release 11.2.0.1.0 - Production on 星期三 3月 13 16:58:30 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
已导出 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集

即将导出指定的表通过常规路径...
. . 正在导出表 BUY_CNT_C1导出了 8293 行
成功终止导出, 没有出现警告。
8293 rows: 22 ms
8492033 rows: 18 s 65 ms
2. Importing a table with imp
Drop the existing table before importing.
drop table buy_cnt_c1;
Exit SQL*Plus, then import e:/opt/oracle_output/daochu.dmp into the scott schema of the orcl database.
C:\Users\hasee>imp scott/tiger@orcl file=e:\opt\\oracle_output\daochu.dmp

Import: Release 11.2.0.1.0 - Production on 星期三 3月 13 17:03:29 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

经由常规路径由 EXPORT:V11.02.00 创建的导出文件
已经完成 ZHS16GBK 字符集和 AL16UTF16 NCHAR 字符集中的导入
. 正在将 SCOTT 的对象导入到 SCOTT
. 正在将 SCOTT 的对象导入到 SCOTT
. . 正在导入表 "BUY_CNT_C1"导入了 8293 行
成功终止导入, 没有出现警告。
8293 rows: 18 ms
8492033 rows: 1 min 19 s 8 ms
3. Exporting a table with sqluldr2
1. Exporting a table
E:\bigdata\sqluldr2_win_jb51>sqluldr264 scott/tiger@127.0.0.1/orcl query="select * from buy_cnt_c1" head=yes file=e:\opt\oracle_output\tmp001.txt
0 rows exported at 2019-03-13 17:40:46, size 0 MB.
8293 rows exported at 2019-03-13 17:40:46, size 0 MB.
output file e:\opt\oracle_output\tmp001.txt closed at 8293 rows, size 0 MB.
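8293 rows: 23 ms
8492033 rows: 17 s 55 ms
2. Specifying delimiters
field sets the separator between fields; record sets the separator between records.
ESCF ("escape from") specifies which characters need to be escaped;
ESCAPE is the escape prefix;
ESCT ("escape to") is the target string the escaped characters are converted to.
\r=0x0d, \n=0x0a, |=0x7c, ,=0x2c, \t=0x09, :=0x3a, #=0x23, "=0x22, '=0x27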
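The hexadecimal codes listed above are simply the ASCII code points of each character. As a quick way to look up a code for some other separator, here is a minimal Python sketch; the helper name to_hex_code is made up for illustration and is not part of sqluldr2.

```python
def to_hex_code(ch: str) -> str:
    """Return the 0xNN form of a single ASCII character, e.g. '|' -> '0x7c'."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"0x{ord(ch):02x}"


# The characters listed in the text, mapped to their hex codes.
SEPARATORS = ["\r", "\n", "|", ",", "\t", ":", "#", '"', "'"]
HEX_CODES = {ch: to_hex_code(ch) for ch in SEPARATORS}

if __name__ == "__main__":
    for ch, code in HEX_CODES.items():
        print(repr(ch), code)
```

Any other single-byte separator, such as `$`, can be looked up the same way with to_hex_code("$").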
sqluldr264 test/test@127.0.0.1/orcl query="select * from test" head=yes file=e:\opt\oracle_output\hex_test.txt log=e:\opt\oracle_output\hex.log field=0x7c record=0x0d ESCF=0x7c0x0d ESCAPE='\' ESCT=i
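The table contents are as follows.
In the first row, the field "|a" contains our field separator; the ESCAPE setting replaces it.
The fifth row contains 0x0d, the code for a carriage return; this does not affect the result, because sqluldr treats it as a plain string.
The export result is:
ID|NAME
1|'ia
2|sanq
3|jieba
4|wuren
5|0x0d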
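To make the roles of field, record, ESCF, ESCAPE and ESCT more concrete, the sketch below imitates the export in plain Python. It is not sqluldr2's implementation. The rule it applies (each character in ESCF is replaced by the ESCAPE prefix plus the character at the same position in ESCT) is an assumption inferred from the sample output above, where "|a" became "'ia". The function names escape_field and export_rows are hypothetical.

```python
def escape_field(value: str, escf: str, escape: str, esct: str) -> str:
    """Replace each character found in escf with escape + esct[same index].

    Characters of escf that have no counterpart in esct are left
    unchanged; that choice is an assumption of this sketch.
    """
    out = []
    for ch in value:
        idx = escf.find(ch)
        if idx != -1 and idx < len(esct):
            out.append(escape + esct[idx])
        else:
            out.append(ch)
    return "".join(out)


def export_rows(rows, field="|", record="\r", escf="|\r", escape="'", esct="i"):
    """Join the values of each row with `field` and end each row with `record`."""
    lines = [
        field.join(escape_field(str(v), escf, escape, esct) for v in row)
        for row in rows
    ]
    return "".join(line + record for line in lines)


if __name__ == "__main__":
    rows = [("ID", "NAME"), (1, "|a"), (2, "sanq"), (3, "jieba"),
            (4, "wuren"), (5, "0x0d")]
    for line in export_rows(rows).split("\r"):
        if line:
            print(line)
```

With these defaults the second record comes out as `1|'ia`, matching the exported file shown above. The literal text "0x0d" in row 5 passes through untouched because it is four ordinary characters, not a carriage return.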
4. Importing data with sqlldr
First create the table.
DROP TABLE "SCOTT"."BUY_CNT_C1";
CREATE TABLE "SCOTT"."BUY_CNT_C1" (
    "T_INDEX" NUMBER(19) NULL,
    "INNET_TIME" NUMBER(19) NULL,
    "UP_DATA_AMOUNT_TOTAL" FLOAT(126) NULL,
    "DOWN_DATA_AMOUNT_TOTAL" FLOAT(126) NULL,
    "PAY_CNT" NUMBER(19) NULL,
    "BUY_CNT" NUMBER(19) NULL,
    "MONTH_FEE" FLOAT(126) NULL,
    "CALL_DURATION" NUMBER(19) NULL,
    "AGE_RANGE_ID" FLOAT(126) NULL,
    "DAY_ACT_NUM_MEAN" FLOAT(126) NULL
)
LOGGING
NOCOMPRESS
NOCACHE
;
Write the ctl (control) file
tmp002.ctl
load data
infile *
into table BUY_CNT_C1
(
    "T_INDEX" char terminated by ',',
    "INNET_TIME" char terminated by ',',
    "UP_DATA_AMOUNT_TOTAL" char terminated by ',',
    "DOWN_DATA_AMOUNT_TOTAL" char terminated by ',',
    "PAY_CNT" char terminated by ',',
    "BUY_CNT" char terminated by ',',
    "MONTH_FEE" char terminated by ',',
    "CALL_DURATION" char terminated by ',',
    "AGE_RANGE_ID" char terminated by ',',
    "DAY_ACT_NUM_MEAN" char terminated by ','
)
Run in cmd:
E:\bigdata\sqluldr2_win_jb51>sqlldr userid=scott/tiger control=e:/opt/oracle_ctl/tmp002.ctl log=e:/opt/oracle_ctl/tmp001.log data=e:/opt/oracle_ctl/tmp001.csv rows=64

SQL*Loader: Release 11.2.0.1.0 - Production on 星期四 3月 14 10:50:46 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

达到提交点 - 逻辑记录计数 64
达到提交点 - 逻辑记录计数 128
达到提交点 - 逻辑记录计数 192
达到提交点 - 逻辑记录计数 256
...
达到提交点 - 逻辑记录计数 8192
达到提交点 - 逻辑记录计数 8256
达到提交点 - 逻辑记录计数 8294
8294 rows: 28 ms
8492033 rows: 2 min 6 s 12 ms
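The commit-point lines in the sqlldr output follow directly from rows=64: in this run a commit is reported after every 64 logical records, plus one final commit for the remainder. The short Python sketch below reproduces that sequence. It assumes a conventional-path load in which the bind array really holds the requested number of rows and no records are rejected; commit_points is a hypothetical helper, not an Oracle API.

```python
from __future__ import annotations


def commit_points(total_records: int, rows: int) -> list[int]:
    """Return the logical record counts at which commits would be reported."""
    if total_records <= 0 or rows <= 0:
        return []
    points = list(range(rows, total_records + 1, rows))
    # The last, partially filled batch is committed at the end of the load.
    if not points or points[-1] != total_records:
        points.append(total_records)
    return points


if __name__ == "__main__":
    points = commit_points(8294, 64)
    print(points[:4], "...", points[-3:])
    print("number of commits:", len(points))
```

For 8294 records this gives 130 commits, which suggests why such a small rows value becomes expensive on the large data set: the same setting would require well over a hundred thousand commits for 8492033 records.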
Linux (CentOS7)
1. Exporting a table with exp
[oracle@CentOS7One ~]$ exp scott/tiger file=output/daochu.dmp tables='(buy_cnt_c1)' log=output/log/daochudmp.log

Export: Release 11.2.0.1.0 - Production on 星期三 3月 20 09:12:08 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
已导出 AL32UTF8 字符集和 AL16UTF16 NCHAR 字符集
服务器使用 ZHS16GBK 字符集 (可能的字符集转换)

即将导出指定的表通过常规路径...
. . 正在导出表 BUY_CNT_C1导出了 8492032 行
成功终止导出, 没有出现警告。
8293 rows: 2 s
8492033 rows: 60 s
2. Importing a table with imp
Drop the existing table before importing.
drop table buy_cnt_c1;
[oracle@CentOS7One ~]$ imp scott/tiger@orcl11g file=input/daoru.cmp

Import: Release 11.2.0.1.0 - Production on 星期三 3月 20 10:42:51 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

连接到: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

经由常规路径由 EXPORT:V11.02.00 创建的导出文件
已经完成 AL32UTF8 字符集和 AL16UTF16 NCHAR 字符集中的导入
导入服务器使用 ZHS16GBK 字符集 (可能的字符集转换)
. 正在将 SCOTT 的对象导入到 SCOTT
. 正在将 SCOTT 的对象导入到 SCOTT
. . 正在导入表 "BUY_CNT_C1"导入了 8492032 行
成功终止导入, 没有出现警告。
8293 rows: 48 ms
8492032 rows: 61 s
3. Exporting data with sqluldr2
sqluldr2_linux64_10204.bin user=scott/tiger@orcl11g query="select * from buy_cnt_c1" head=yes file=output/tmp002.csv log=output/log/tmp002.log
8293 rows: under 1 s
8492032 rows: 5 min 17 s
4. Importing data with sqlldr
The ctl file is the same as above; create the table first.
[oracle@CentOS7One ~]$ sqlldr userid=scott/tiger control=input/tmp002.ctl log=input/log/tmp002.log data=input/tmp001.csv rows=64

SQL*Loader: Release 11.2.0.1.0 - Production on 星期二 3月 19 16:46:56 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

达到提交点 - 逻辑记录计数 64
达到提交点 - 逻辑记录计数 128
达到提交点 - 逻辑记录计数 192
达到提交点 - 逻辑记录计数 256
达到提交点 - 逻辑记录计数 320
...
达到提交点 - 逻辑记录计数 8192
达到提交点 - 逻辑记录计数 8256
达到提交点 - 逻辑记录计数 8294
8294 rows: 47 ms
8492033 rows: 5 min 27 s 26 ms
Testing
Write test scripts that measure how long each command takes to run.
The Windows script is as follows:
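@echo off
rem Record the start time. Each field is prefixed with 1 and reduced by 100
rem so that values such as 08 or 09 are not rejected as invalid octal numbers.
set startTime=%time%
set /a startCS=1%startTime:~9,2%-100
set /a startS=1%startTime:~6,2%-100
set /a startM=1%startTime:~3,2%-100
echo %startTime%
rem put your command here
set endTime=%time%
set /a endCS=1%endTime:~9,2%-100
set /a endS=1%endTime:~6,2%-100
set /a endM=1%endTime:~3,2%-100
echo %endTime%
rem The last two digits of the time string are hundredths of a second.
rem Working in hundredths avoids negative parts when a minute boundary is crossed.
set /a diffCS=(endM*6000+endS*100+endCS)-(startM*6000+startS*100+startCS)
rem The hour field is ignored, so add one hour if the clock passed the hour mark.
if %diffCS% lss 0 set /a diffCS+=360000
set /a diffM_=diffCS/6000
set /a diffS_=(diffCS%%6000)/100
set /a diffCS_=diffCS%%100
echo cost:%diffM_% %diffS_% %diffCS_%
pause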
The Linux script is as follows:
start_time=`date --date='0 days ago' "+%Y-%m-%d %H:%M:%S"`
# this is your shell script
# put your command here
##############
finish_time=`date --date='0 days ago' "+%Y-%m-%d %H:%M:%S"`
# Convert both timestamps to epoch seconds and subtract.
duration=$(( $(date +%s -d "$finish_time") - $(date +%s -d "$start_time") ))
echo "this shell script execution duration: $duration seconds"
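The two scripts above measure elapsed time with different tools and different resolution on each platform. As an alternative, a single cross-platform wrapper could time any of the commands in this article. The following is a minimal Python sketch; time_command is a hypothetical helper, and the command run under `__main__` is only a placeholder for an exp, imp, sqluldr2 or sqlldr invocation.

```python
import subprocess
import sys
import time


def time_command(args):
    """Run a command given as a list of arguments.

    Returns a tuple (exit_code, elapsed_seconds), measured with a
    monotonic high-resolution clock.
    """
    start = time.perf_counter()
    completed = subprocess.run(args)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


if __name__ == "__main__":
    # Placeholder command; replace it with the command you want to time.
    code, seconds = time_command([sys.executable, "-c", "print('hello')"])
    print(f"exit code {code}, elapsed {seconds:.3f} s")
```

Because time.perf_counter is monotonic, the result is not affected by minute or hour boundaries, which the string-slicing approach in a batch file has to handle by hand.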
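CentOS7 runs in a virtual machine, so its hardware configuration is much weaker than that of the Windows machine.
8294 rows (small data set)
8492033 rows (large data set)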
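To compare the tools on the large data set, the timings reported earlier can be converted to rows per second. The sketch below does this for the Linux runs, using the row counts and durations quoted in the Linux section; sub-second parts are dropped, and the dictionary name LINUX_LARGE is just an illustrative label.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput of a run."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


# (rows, seconds) as reported for the large data set on Linux (CentOS7).
LINUX_LARGE = {
    "exp": (8492033, 60),
    "imp": (8492032, 61),
    "sqluldr2": (8492032, 5 * 60 + 17),
    "sqlldr": (8492033, 5 * 60 + 27),
}

if __name__ == "__main__":
    for tool, (rows, seconds) in LINUX_LARGE.items():
        print(f"{tool:9s} {rows_per_second(rows, seconds):12.0f} rows/s")
```

On these figures exp and imp process roughly five times as many rows per second as sqluldr2 and sqlldr on the virtual machine.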
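Production case 1
Load a China Mobile CSV file with 33 million rows (6.03 GB) into an Oracle database using sqlldr.
The table has 61 columns. The CREATE TABLE statement is shown below; setting field lengths to 255 or more is recommended.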
create table T_CJYX_DZXX
(
    id VARCHAR2(50),
    version VARCHAR2(15),
    metacategory VARCHAR2(100),
    indicator VARCHAR2(20),
    iscity VARCHAR2(20),
    parentid VARCHAR2(24),
    name VARCHAR2(255),
    code VARCHAR2(255),
    nameabbrpy VARCHAR2(255),
    hierarchy VARCHAR2(20),
    alias VARCHAR2(255),
    alias2 VARCHAR2(255),
    alias3 VARCHAR2(255),
    alias4 VARCHAR2(255),
    postcode VARCHAR2(255),
    cover VARCHAR2(25),
    longitude VARCHAR2(50),
    latitude VARCHAR2(26),
    rgt VARCHAR2(20),
    lft VARCHAR2(20),
    status VARCHAR2(25),
    comments VARCHAR2(255),
    detailaddress VARCHAR2(255),
    logdate VARCHAR2(50),
    areaid VARCHAR2(28),
    isdefine VARCHAR2(22),
    faultstatus VARCHAR2(25),
    restoretime VARCHAR2(50),
    faultdescription VARCHAR2(2555),
    deviceusage VARCHAR2(25),
    faultdetailcode VARCHAR2(20),
    floortype VARCHAR2(25),
    faultcount VARCHAR2(20),
    createdate VARCHAR2(50),
    auditdate VARCHAR2(50),
    createuser VARCHAR2(255),
    audituser VARCHAR2(255),
    addresstype VARCHAR2(25),
    marketgrade VARCHAR2(25),
    iscoordinate VARCHAR2(20),
    ttmigrate VARCHAR2(25),
    gjgx VARCHAR2(25),
    accesstype VARCHAR2(25),
    shape VARCHAR2(128),
    fillcolor VARCHAR2(255),
    fillwidth VARCHAR2(22),
    fillalpha VARCHAR2(21),
    fillstyle VARCHAR2(255),
    fillbordercolor VARCHAR2(255),
    area VARCHAR2(23),
    areaunit VARCHAR2(22),
    createuserphone VARCHAR2(255),
    modifyuser VARCHAR2(255),
    modifydate VARCHAR2(50),
    gcstatus VARCHAR2(25),
    pmsprjcode VARCHAR2(255),
    pmsprjname VARCHAR2(255),
    deleteuser VARCHAR2(255),
    deletedate VARCHAR2(50),
    afterdeleteuser VARCHAR2(255),
    afterdeletedate VARCHAR2(50)
)
tablespace TBS_AHJZH_DATA
pctfree 10
initrans 1
maxtrans 255
storage
(
    initial 64
    next 8
    minextents 1
    maxextents unlimited
);
The ctl file is as follows:
LOAD DATA
INFILE *
INTO TABLE T_CJYX_DZXX
REPLACE
FIELDS TERMINATED BY '$'
TRAILING NULLCOLS
(
    id, version, metacategory, indicator, iscity, parentid,
    name, code, nameabbrpy, hierarchy,
    alias, alias2, alias3, alias4, postcode, cover,
    longitude, latitude, rgt, lft, status, comments,
    detailaddress, logdate, areaid, isdefine,
    faultstatus, restoretime, faultdescription, deviceusage,
    faultdetailcode, floortype, faultcount,
    createdate, auditdate, createuser, audituser,
    addresstype, marketgrade, iscoordinate, ttmigrate, gjgx, accesstype,
    shape, fillcolor, fillwidth, fillalpha, fillstyle, fillbordercolor,
    area, areaunit, createuserphone,
    modifyuser, modifydate, gcstatus, pmsprjcode, pmsprjname,
    deleteuser, deletedate, afterdeleteuser, afterdeletedate
)
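For a 61-column table, typing the control file by hand is error-prone. A control file of the shape shown above could instead be generated from a list of column names. The following Python sketch illustrates the idea; build_control_file is a hypothetical helper, and it only covers the simple layout used in this article (one table, one terminator, TRAILING NULLCOLS).

```python
def build_control_file(table, columns, terminator="$", mode="REPLACE"):
    """Return SQL*Loader control file text for a simple delimited load."""
    if mode not in {"INSERT", "APPEND", "REPLACE", "TRUNCATE"}:
        raise ValueError(f"unsupported load mode: {mode}")
    if not columns:
        raise ValueError("at least one column is required")
    column_lines = ",\n".join(f"    {name}" for name in columns)
    return (
        "LOAD DATA\n"
        "INFILE *\n"
        f"INTO TABLE {table}\n"
        f"{mode}\n"
        f"FIELDS TERMINATED BY '{terminator}'\n"
        "TRAILING NULLCOLS\n"
        "(\n"
        f"{column_lines}\n"
        ")\n"
    )


if __name__ == "__main__":
    # Only the first few columns are listed here to keep the example short.
    print(build_control_file("T_CJYX_DZXX", ["id", "version", "metacategory"]))
```

The column list could come from the CREATE TABLE statement or from the header line of the data file, which keeps the control file and the table definition in step.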
The run log is shown below; the load took about 14 hours.
SQL*Loader: Release 11.2.0.1.0 - Production on 星期三 4月 3 08:46:55 2019

Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.

控制文件: E:\opt\srcbigdata\di_70002_20190325.ctl
数据文件: E:\opt\srcbigdata\di_70002_20190325.csv
错误文件: E:\opt\srcbigdata\di_70002_20190325.bad
废弃文件: 未作指定
(可废弃所有记录)

要加载的数: ALL
要跳过的数: 0
允许的错误: 50
绑定数组: 64 行, 最大 256000 字节
继续: 未作指定
所用路径: 常规

表 T_CJYX_DZXX,已加载从每个逻辑记录
插入选项对此表 REPLACE 生效
TRAILING NULLCOLS 选项生效

列名 位置 长度 中止 包装 数据类型
------------------------------ ---------- ----- ---- ---- ---------------------
ID FIRST * $ CHARACTER
VERSION NEXT * $ CHARACTER
METACATEGORY NEXT * $ CHARACTER
INDICATOR NEXT * $ CHARACTER
ISCITY NEXT * $ CHARACTER
PARENTID NEXT * $ CHARACTER
NAME NEXT * $ CHARACTER
CODE NEXT * $ CHARACTER
NAMEABBRPY NEXT * $ CHARACTER
HIERARCHY NEXT * $ CHARACTER
ALIAS NEXT * $ CHARACTER
ALIAS2 NEXT * $ CHARACTER
ALIAS3 NEXT * $ CHARACTER
ALIAS4 NEXT * $ CHARACTER
POSTCODE NEXT * $ CHARACTER
COVER NEXT * $ CHARACTER
LONGITUDE NEXT * $ CHARACTER
LATITUDE NEXT * $ CHARACTER
RGT NEXT * $ CHARACTER
LFT NEXT * $ CHARACTER
STATUS NEXT * $ CHARACTER
COMMENTS NEXT * $ CHARACTER
DETAILADDRESS NEXT * $ CHARACTER
LOGDATE NEXT * $ CHARACTER
AREAID NEXT * $ CHARACTER
ISDEFINE NEXT * $ CHARACTER
FAULTSTATUS NEXT * $ CHARACTER
RESTORETIME NEXT * $ CHARACTER
FAULTDESCRIPTION NEXT * $ CHARACTER
DEVICEUSAGE NEXT * $ CHARACTER
FAULTDETAILCODE NEXT * $ CHARACTER
FLOORTYPE NEXT * $ CHARACTER
FAULTCOUNT NEXT * $ CHARACTER
CREATEDATE NEXT * $ CHARACTER
AUDITDATE NEXT * $ CHARACTER
CREATEUSER NEXT * $ CHARACTER
AUDITUSER NEXT * $ CHARACTER
ADDRESSTYPE NEXT * $ CHARACTER
MARKETGRADE NEXT * $ CHARACTER
ISCOORDINATE NEXT * $ CHARACTER
TTMIGRATE NEXT * $ CHARACTER
GJGX NEXT * $ CHARACTER
ACCESSTYPE NEXT * $ CHARACTER
SHAPE NEXT * $ CHARACTER
FILLCOLOR NEXT * $ CHARACTER
FILLWIDTH NEXT * $ CHARACTER
FILLALPHA NEXT * $ CHARACTER
FILLSTYLE NEXT * $ CHARACTER
FILLBORDERCOLOR NEXT * $ CHARACTER
AREA NEXT * $ CHARACTER
AREAUNIT NEXT * $ CHARACTER
CREATEUSERPHONE NEXT * $ CHARACTER
MODIFYUSER NEXT * $ CHARACTER
MODIFYDATE NEXT * $ CHARACTER
GCSTATUS NEXT * $ CHARACTER
PMSPRJCODE NEXT * $ CHARACTER
PMSPRJNAME NEXT * $ CHARACTER
DELETEUSER NEXT * $ CHARACTER
DELETEDATE NEXT * $ CHARACTER
AFTERDELETEUSER NEXT * $ CHARACTER
AFTERDELETEDATE NEXT * $ CHARACTER

ROWS 参数所用的值已从 64 更改为 16

表 T_CJYX_DZXX:
33543432 行 加载成功。
由于数据错误, 0 行 没有加载。
由于所有 WHEN 子句失败, 0 行 没有加载。
由于所有字段都为空的, 0 行 没有加载。

为绑定数组分配的空间: 251808 字节 (16 行)
读取 缓冲区字节数: 1048576

跳过的逻辑记录总数: 0
读取的逻辑记录总数: 33543432
拒绝的逻辑记录总数: 0
废弃的逻辑记录总数: 0

从 星期三 4月 03 08:46:55 2019 开始运行
在 星期三 4月 03 22:48:09 2019 处运行结束

经过时间为: 14: 01: 14.33
CPU 时间为: 00: 18: 19.89
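One detail in this log is worth working through: the ROWS value was changed from 64 to 16, and 251808 bytes were allocated for a 16-row bind array. The arithmetic below uses only numbers from the log. The rule it applies, that the number of rows is capped at the bind array limit divided by the bytes needed per row, is a simplification that happens to match this log, not a full description of how SQL*Loader sizes its buffers.

```python
def bind_array_rows(bindsize: int, bytes_per_row: int, requested_rows: int) -> int:
    """Rows that fit in the bind array, capped at the requested ROWS value."""
    return max(1, min(requested_rows, bindsize // bytes_per_row))


# Values taken from the log above.
ALLOCATED_BYTES = 251808   # space allocated for the bind array
ROWS_USED = 16             # rows actually used
BIND_LIMIT = 256000        # maximum bind array size in bytes
REQUESTED_ROWS = 64        # ROWS value before adjustment
COLUMNS = 61               # columns in T_CJYX_DZXX

bytes_per_row = ALLOCATED_BYTES // ROWS_USED     # 15738
bytes_per_column = bytes_per_row // COLUMNS      # 258

if __name__ == "__main__":
    print("bytes per row:", bytes_per_row)
    print("bytes per column:", bytes_per_column)
    print("rows that fit:", bind_array_rows(BIND_LIMIT, bytes_per_row, REQUESTED_ROWS))
```

Each of the 61 character fields reserves 258 bytes, so one row needs 15738 bytes and only 16 rows fit under the 256000-byte limit. With just 16 rows per batch, loading 33543432 rows takes more than two million round trips, which is one likely contributor to the 14-hour elapsed time against only about 18 minutes of CPU time.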
Production case 2
CREATE TABLE statement
create table T_CJYX_HOMECOUNT_BACKUP
(
    acyc_id VARCHAR2(50),
    address_id VARCHAR2(50),
    address_name VARCHAR2(200),
    address_level VARCHAR2(50),
    check_type VARCHAR2(50),
    check_target_num VARCHAR2(50),
    check_value VARCHAR2(50),
    target_phone VARCHAR2(4000),
    notarget_phone VARCHAR2(4000),
    parent_id VARCHAR2(50),
    bcyc_id VARCHAR2(50)
)
Load 16 GB of data from the server: the table has 11 columns and 180 million records.
The control file di_00121_20190427.ctl is as follows:
LOAD DATA
INFILE *
INTO TABLE T_CJYX_HOMECOUNT_BACKUP
REPLACE
FIELDS TERMINATED BY '$'
TRAILING NULLCOLS
(
    acyc_id CHAR(40000),
    address_id CHAR(40000),
    address_name CHAR(40000),
    address_level CHAR(40000),
    check_type CHAR(40000),
    check_target_num CHAR(40000),
    check_value CHAR(40000),
    target_phone CHAR(40000),
    notarget_phone CHAR(40000),
    parent_id CHAR(40000),
    bcyc_id CHAR(40000)
)
Run script
Run with default parameters, the import is very slow (only 30 million rows in 15 hours), but the approach is safe and stable.
sqlldr userid=username/password@//10.243.5.16:1521/itgrept control=E:\opt\srcbigdata2\di_00121_20190427.ctl log=E:\opt\srcbigdata2\di_00121_20190427.log data=E:\opt\srcbigdata2\di_00121_20190427.dat
With tuned parameters, the script below loads 100 million rows in 15 hours:
sqlldr userid=username/password@//10.243.5.16:1521/itgrept control=E:\opt\srcbigdata2\di_00121_20190427.ctl log=E:\opt\srcbigdata2\di_00121_20190427_gbk.log data=E:\opt\srcbigdata2\di_00121_20190427_gbk.dat rows=10000 readsize=20680000 bindsize=20680000
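With the best-performing settings, the load failed with ORA-03113: end-of-file on communication channel.
Cause: ORA-03113 indicates that the network connection to the database was interrupted.
1. The audit table aud$ in the SYSTEM tablespace records the compilation of SQL statements during database work, which consumes considerable resources.
2. The SYSAUX tablespace stores AWR snapshots, which also take up a lot of space.
Solution: clear the audit table as sysdba; the best approach is to set up a local tablespace for the import.
Error:
SQL*Loader-604: error occurred on an attempt to commit
ORA-03135: connection lost contact
In practice the script is interrupted midway for various reasons, so use APPEND mode to add the remaining rows to the table.
di_00121_20190427_append.ctl is as follows: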
LOAD DATA
INFILE *
INTO TABLE T_CJYX_HOMECOUNT_BACKUP
APPEND
FIELDS TERMINATED BY '$'
TRAILING NULLCOLS
(
    acyc_id CHAR(40000),
    address_id CHAR(40000),
    address_name CHAR(40000),
    address_level CHAR(40000),
    check_type CHAR(40000),
    check_target_num CHAR(40000),
    check_value CHAR(40000),
    target_phone CHAR(40000),
    notarget_phone CHAR(40000),
    parent_id CHAR(40000),
    bcyc_id CHAR(40000)
)
The load was interrupted at row 108238266.
Resume the load from row 108238266.
The run script is as follows:
sqlldr userid=username/password@//10.243.5.16:1521/itgrept control=E:\opt\srcbigdata2\di_00121_20190427_append.ctl log=E:\opt\srcbigdata2\di_00121_20190427_gbk_append.log data=E:\opt\srcbigdata2\di_00121_20190427_gbk.dat rows=100 readsize=20680000 bindsize=20680000 skip=108238266
The whole load took about 41 hours.
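Resuming with skip= requires knowing how many records had already been committed when the load broke off. If the console output of the interrupted run was captured to a file, the last commit-point line gives that number. The Python sketch below extracts it. It assumes the output contains lines in the format shown earlier in this article (either the Chinese text or its English equivalent "logical record count"); last_commit_point is a hypothetical helper.

```python
import re

# Matches the record count at the end of a commit-point line in either locale.
COMMIT_PATTERN = re.compile(r"(?:逻辑记录计数|logical record count)\s*(\d+)")


def last_commit_point(output_text: str) -> int:
    """Return the last committed logical record count, or 0 if none is found."""
    matches = COMMIT_PATTERN.findall(output_text)
    return int(matches[-1]) if matches else 0


if __name__ == "__main__":
    captured = (
        "达到提交点 - 逻辑记录计数 64\n"
        "达到提交点 - 逻辑记录计数 128\n"
        "SQL*Loader-604: ...\n"
    )
    print("suggested skip value:", last_commit_point(captured))
```

The value returned would be passed as skip=N together with the APPEND control file. It is worth checking it against a row count of the target table before restarting, since rejected records also count toward the number of records read.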
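Summary
Exporting with exp and importing with imp are more efficient than sqluldr2 and sqlldr;
With small data sets the advantage of exp and imp does not show, and they perform about the same as sqluldr2 and sqlldr;
With large data sets exp and imp are clearly faster than sqluldr2 and sqlldr;
Better hardware improves import and export efficiency;
For migration between different database systems, sqluldr2 and sqlldr are recommended, because they support importing and exporting data formats such as csv and txt.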