Loading the Linux user file passwd into a database with SQLLDR

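In this experiment I load the Linux system user file passwd into a database, recording the whole process of migrating data with SQL*Loader from start to finish, and then summarize the strengths and weaknesses of SQL*Loader.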

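1. First, a look at the contents of the passwd file on my system. Every record in this file is a set of fields separated by a colon ":" (everyone on the planet seems to know this already, moving on~~)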

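Taking the line for the oracle user as an example, here is what each field means (an "x" in the password field means the real password hash is kept in /etc/shadow):
Entry:   oracle   : x        : 500 : 500 :                  : /home/oracle   : /bin/bash
Meaning: username : password : uid : gid : user description : home directory : login shell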
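
To make the field layout concrete, here is a minimal shell sketch that splits one passwd-style line on ":" and labels the seven fields. The function name describe_passwd_line is made up for this illustration and is not part of the original experiment.

```shell
# Hypothetical helper: label the seven colon-separated fields of one passwd-style line.
describe_passwd_line() {
  # -F: tells awk to split each input line on ":".
  printf '%s\n' "$1" | awk -F: '{
    printf "user=%s password=%s uid=%s gid=%s description=%s home=%s shell=%s\n",
           $1, $2, $3, $4, $5, $6, $7
  }'
}

describe_passwd_line 'oracle:x:500:500::/home/oracle:/bin/bash'
```

For the oracle line the description field comes out empty, which is exactly the empty slot between the two adjacent colons.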

ora10g@testdb183 /home/oracle$ cat /etc/passwd
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
pcap:x:77:77::/var/arpwatch:/sbin/nologin
mailnull:x:47:47::/var/spool/mqueue:/sbin/nologin
smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
rpc:x:32:32:Portmapper RPC user:/:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nfsnobody:x:4294967294:4294967294:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
avahi-autoipd:x:100:101:avahi-autoipd:/var/lib/avahi-autoipd:/sbin/nologin
avahi:x:70:70:Avahi daemon:/:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
distcache:x:94:94:Distcache:/:/sbin/nologin
squid:x:23:23::/var/spool/squid:/sbin/nologin
webalizer:x:67:67:Webalizer:/var/www/usage:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
postgres:x:26:26:PostgreSQL Server:/var/lib/pgsql:/bin/bash
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash
named:x:25:25:Named:/var/named:/sbin/nologin
xfs:x:43:43:X Font Server:/etc/X11/fs:/sbin/nologin
gdm:x:42:42::/var/gdm:/sbin/nologin
sabayon:x:86:86:Sabayon user:/home/sabayon:/sbin/nologin
dovecot:x:97:97:dovecot:/usr/libexec/dovecot:/sbin/nologin
amanda:x:33:6:Amanda user:/var/lib/amanda:/bin/bash
exim:x:93:93::/var/spool/exim:/sbin/nologin
mailman:x:41:41:GNU Mailing List Manager:/usr/lib/mailman:/sbin/nologin
ident:x:98:98::/home/ident:/sbin/nologin
pvm:x:24:24::/usr/share/pvm3:/bin/bash
quagga:x:92:92:Quagga routing suite:/var/run/quagga:/sbin/nologin
privoxy:x:73:73::/etc/privoxy:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin
uuidd:x:101:104:UUID generator helper daemon:/var/lib/libuuid:/sbin/nologin
cyrus:x:76:12:Cyrus IMAP Server:/var/lib/imap:/bin/bash
ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
radiusd:x:95:95:radiusd user:/home/radiusd:/sbin/nologin
pegasus:x:66:65:tog-pegasus OpenPegasus WBEM/CIM services:/var/lib/Pegasus:/sbin/nologin
tomcat:x:91:91:Tomcat:/usr/share/tomcat5:/bin/sh
oracle:x:500:500::/home/oracle:/bin/bash

2. Create the table linux_passwd in the database to receive the load, with columns matching the passwd file format
sec@ORA10G> create table linux_passwd
  2  ( p_user_name   varchar2(20) constraint pk_linux_passwd primary key,
  3    p_password    varchar2(20),
  4    p_uid         number(20),
  5    p_gid         number(20),
  6    p_description varchar2(100),
  7    p_main_dir    varchar2(100),
  8    p_shell       varchar2(50)
  9  )
 10  /

Table created.
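
Before settling on column sizes such as varchar2(20) or varchar2(100), it can help to measure the data. The sketch below reports the longest value found in each of the seven fields. The function name max_field_widths is an assumption for this example, and it reads either a file argument or standard input.

```shell
# Hypothetical helper: print the longest value seen in each of the 7 passwd fields.
max_field_widths() {
  awk -F: '{
    # Track the maximum length per field position across all lines.
    for (i = 1; i <= 7; i++) if (length($i) > w[i]) w[i] = length($i)
  }
  END {
    for (i = 1; i <= 7; i++) printf "field %d: %d\n", i, w[i]
  }' "$@"
}

# Demo on two sample lines taken from the listing above.
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'nfsnobody:x:4294967294:4294967294:Anonymous NFS User:/var/lib/nfs:/sbin/nologin' \
  | max_field_widths
```

Running it as max_field_widths /etc/passwd gives the widths for the real file, which can then be compared against the column definitions.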

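3. Prepare the SQL*Loader control file load_passwd.ctl
Note: the comments were added only to make the explanation easier; remember to delete them when you run this experiment yourself.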
ora10g@testdb183 /home/oracle$ cat load_passwd.ctl
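LOAD DATA                 -- Note: tells SQL*Loader what to do; here, load data
INFILE *                  -- Note: the input data file; "*" means the data to load is contained in this control file
INTO TABLE linux_passwd   -- Note: load the data into table linux_passwd
REPLACE                   -- Note: may be INSERT (the default; requires an empty table), APPEND (adds rows to the table), REPLACE (removes existing rows with DELETE, then loads) or TRUNCATE (same effect as REPLACE, but truncates the table first, which is more efficient)
FIELDS TERMINATED BY ':'  -- Note: the field delimiter is ":"
( p_user_name   ,         -- Note: the following fields map to the columns of the target table; the input datatype defaults to CHAR(255)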
  p_password    ,
  p_uid         ,
  p_gid         ,
  p_description ,
  p_main_dir    ,
  p_shell
)
BEGINDATA                 -- Note: tells SQL*Loader that the description of the input is complete; what follows is the data to load
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
pcap:x:77:77::/var/arpwatch:/sbin/nologin
mailnull:x:47:47::/var/spool/mqueue:/sbin/nologin
smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
rpc:x:32:32:Portmapper RPC user:/:/sbin/nologin
rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
nfsnobody:x:4294967294:4294967294:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
avahi-autoipd:x:100:101:avahi-autoipd:/var/lib/avahi-autoipd:/sbin/nologin
avahi:x:70:70:Avahi daemon:/:/sbin/nologin
apache:x:48:48:Apache:/var/www:/sbin/nologin
distcache:x:94:94:Distcache:/:/sbin/nologin
squid:x:23:23::/var/spool/squid:/sbin/nologin
webalizer:x:67:67:Webalizer:/var/www/usage:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
postgres:x:26:26:PostgreSQL Server:/var/lib/pgsql:/bin/bash
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash
named:x:25:25:Named:/var/named:/sbin/nologin
xfs:x:43:43:X Font Server:/etc/X11/fs:/sbin/nologin
gdm:x:42:42::/var/gdm:/sbin/nologin
sabayon:x:86:86:Sabayon user:/home/sabayon:/sbin/nologin
dovecot:x:97:97:dovecot:/usr/libexec/dovecot:/sbin/nologin
amanda:x:33:6:Amanda user:/var/lib/amanda:/bin/bash
exim:x:93:93::/var/spool/exim:/sbin/nologin
mailman:x:41:41:GNU Mailing List Manager:/usr/lib/mailman:/sbin/nologin
ident:x:98:98::/home/ident:/sbin/nologin
pvm:x:24:24::/usr/share/pvm3:/bin/bash
quagga:x:92:92:Quagga routing suite:/var/run/quagga:/sbin/nologin
privoxy:x:73:73::/etc/privoxy:/sbin/nologin
radvd:x:75:75:radvd user:/:/sbin/nologin
uuidd:x:101:104:UUID generator helper daemon:/var/lib/libuuid:/sbin/nologin
cyrus:x:76:12:Cyrus IMAP Server:/var/lib/imap:/bin/bash
ldap:x:55:55:LDAP User:/var/lib/ldap:/bin/false
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
radiusd:x:95:95:radiusd user:/home/radiusd:/sbin/nologin
pegasus:x:66:65:tog-pegasus OpenPegasus WBEM/CIM services:/var/lib/Pegasus:/sbin/nologin
tomcat:x:91:91:Tomcat:/usr/share/tomcat5:/bin/sh
oracle:x:500:500::/home/oracle:/bin/bash

4. Everything so far was preparation; the actual load happens here
ora10g@testdb183 /home/oracle$ sqlldr userid=sec/sec control=load_passwd.ctl log=load_passwd.log bad=load_passwd.bad discard=load_passwd.dsc

SQL*Loader: Release 10.2.0.3.0 - Production on Sun Aug 30 17:08:59 2009

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Commit point reached - logical record count 56

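5. If the load succeeds, the directory will contain only load_passwd.log (no bad file or discard file is created). Take a look at this file: it records in detail the parameters used during the load and the result of the load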
ora10g@testdb183 /home/oracle$ cat load_passwd.log

SQL*Loader: Release 10.2.0.3.0 - Production on Sun Aug 30 17:08:59 2009

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Control File:   load_passwd.ctl
Data File:      load_passwd.ctl
  Bad File:     load_passwd.bad
  Discard File: load_passwd.dsc
 (Allow all discards)

Number to load: ALL
Number to skip: 0
Errors allowed: 50
Bind array:     64 rows, maximum of 256000 bytes
Continuation:    none specified
Path used:      Conventional

Table LINUX_PASSWD, loaded from every logical record.
Insert option in effect for this table: REPLACE

   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ----------
P_USER_NAME                         FIRST     *   :       CHARACTER
P_PASSWORD                           NEXT     *   :       CHARACTER
P_UID                                NEXT     *   :       CHARACTER
P_GID                                NEXT     *   :       CHARACTER
P_DESCRIPTION                        NEXT     *   :       CHARACTER
P_MAIN_DIR                           NEXT     *   :       CHARACTER
P_SHELL                              NEXT     *   :       CHARACTER


Table LINUX_PASSWD:
  56 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.


Space allocated for bind array:                 115584 bytes(64 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:            56
Total logical records rejected:         0
Total logical records discarded:        0

Run began on Sun Aug 30 17:08:59 2009
Run ended on Sun Aug 30 17:08:59 2009

Elapsed time was:     00:00:00.08
CPU time was:         00:00:00.02
ora10g@testdb183 /home/oracle$
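
The check described in this step can be scripted. The following is a rough sketch, assuming a single-table load whose log contains a "Rows successfully loaded" line like the one above. The function name check_load is hypothetical.

```shell
# Hypothetical helper: summarize a SQL*Loader run from its log file and bad file.
# Usage (not run here): check_load load_passwd.log load_passwd.bad
check_load() {
  log_file=$1
  bad_file=$2
  # The log line looks like "  56 Rows successfully loaded."; the count is field 1.
  loaded=$(awk '/Rows? successfully loaded/ { print $1 }' "$log_file")
  echo "rows loaded: ${loaded:-unknown}"
  # A non-empty bad file means some records were rejected.
  if [ -s "$bad_file" ]; then
    echo "rejected records: $(wc -l < "$bad_file" | tr -d ' ')"
    return 1
  fi
  echo "no rejected records"
}
```

The non-zero return status when a bad file exists lets a calling script stop before it goes on to query the table.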

6. After the successful load, check the data in the linux_passwd table in the database
ora10g@testdb183 /home/oracle$ sqlplus sec/sec

SQL*Plus: Release 10.2.0.3.0 - Production on Sun Aug 30 17:10:17 2009

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP and Data Mining Scoring Engine options

sec@ORA10G> col P_USER_NAME for a13
sec@ORA10G> col P_PASSWORD for a4
sec@ORA10G> col P_UID for 99999999999
sec@ORA10G> col P_GID for 99999999999
sec@ORA10G> col P_DESCRIPTION for a42
sec@ORA10G> col P_MAIN_DIR for a22
sec@ORA10G> col P_SHELL for a15
sec@ORA10G> select * from linux_passwd;

P_USER_NAME   P_PA P_UID P_GID P_DESCRIPTION  P_MAIN_DIR             P_SHELL       
------------- ---- ----- ----- -------------- ---------------------- ---------------
root          x        0     0 root           /root                  /bin/bash     
bin           x        1     1 bin            /bin                   /sbin/nologin 
daemon        x        2     2 daemon         /sbin                  /sbin/nologin 
adm           x        3     4 adm            /var/adm               /sbin/nologin 
lp            x        4     7 lp             /var/spool/lpd         /sbin/nologin 
sync          x        5     0 sync           /sbin                  /bin/sync     
shutdown      x        6     0 shutdown       /sbin                  /sbin/shutdown
halt          x        7     0 halt           /sbin                  /sbin/halt    
mail          x        8    12 mail           /var/spool/mail        /sbin/nologin 
uucp          x       10    14 uucp           /var/spool/uucp        /sbin/nologin 
operator      x       11     0 operator       /root                  /sbin/nologin 
games         x       12   100 games          /usr/games             /sbin/nologin 
gopher        x       13    30 gopher         /var/gopher            /sbin/nologin 
ftp           x       14    50 FTP User       /var/ftp               /sbin/nologin 
nobody        x       99    99 Nobody         /                      /sbin/nologin 
nscd          x       28    28 NSCD Daemon    /                      /sbin/nologin 
vcsa          x       69    69 virtual consol /dev                   /sbin/nologin 
pcap          x       77    77                /var/arpwatch          /sbin/nologin 
mailnull      x       47    47                /var/spool/mqueue      /sbin/nologin 
smmsp         x       51    51                /var/spool/mqueue      /sbin/nologin 
rpc           x       32    32 Portmapper RPC /                      /sbin/nologin 
rpcuser       x       29    29 RPC Service Us /var/lib/nfs           /sbin/nologin 
nfsnobody     x    67294 67294 Anonymous NFS  /var/lib/nfs           /sbin/nologin 
sshd          x       74    74 Privilege-sepa /var/empty/sshd        /sbin/nologin 
dbus          x       81    81 System message /                      /sbin/nologin 
haldaemon     x       68    68 HAL daemon     /                      /sbin/nologin 
avahi-autoipd x      100   101 avahi-autoipd  /var/lib/avahi-autoipd /sbin/nologin 
avahi         x       70    70 Avahi daemon   /                      /sbin/nologin 
apache        x       48    48 Apache         /var/www               /sbin/nologin 
distcache     x       94    94 Distcache      /                      /sbin/nologin 
squid         x       23    23                /var/spool/squid       /sbin/nologin 
webalizer     x       67    67 Webalizer      /var/www/usage         /sbin/nologin 
ntp           x       38    38                /etc/ntp               /sbin/nologin 
postgres      x       26    26 PostgreSQL Ser /var/lib/pgsql         /bin/bash     
mysql         x       27    27 MySQL Server   /var/lib/mysql         /bin/bash     
named         x       25    25 Named          /var/named             /sbin/nologin 
xfs           x       43    43 X Font Server  /etc/X11/fs            /sbin/nologin 
gdm           x       42    42                /var/gdm               /sbin/nologin 
sabayon       x       86    86 Sabayon user   /home/sabayon          /sbin/nologin 
dovecot       x       97    97 dovecot        /usr/libexec/dovecot   /sbin/nologin 
amanda        x       33     6 Amanda user    /var/lib/amanda        /bin/bash     
exim          x       93    93                /var/spool/exim        /sbin/nologin 
mailman       x       41    41 GNU Mailing Li /usr/lib/mailman       /sbin/nologin 
ident         x       98    98                /home/ident            /sbin/nologin 
pvm           x       24    24                /usr/share/pvm3        /bin/bash     
quagga        x       92    92 Quagga routing /var/run/quagga        /sbin/nologin 
privoxy       x       73    73                /etc/privoxy           /sbin/nologin 
radvd         x       75    75 radvd user     /                      /sbin/nologin 
uuidd         x      101   104 UUID generator /var/lib/libuuid       /sbin/nologin 
cyrus         x       76    12 Cyrus IMAP Ser /var/lib/imap          /bin/bash     
ldap          x       55    55 LDAP User      /var/lib/ldap          /bin/false    
postfix       x       89    89                /var/spool/postfix     /sbin/nologin 
radiusd       x       95    95 radiusd user   /home/radiusd          /sbin/nologin 
pegasus       x       66    65 tog-pegasus Op /var/lib/Pegasus       /sbin/nologin 
tomcat        x       91    91 Tomcat         /usr/share/tomcat5     /bin/sh       
oracle        x      500   500                /home/oracle           /bin/bash     
                                                                                   
56 rows selected.                                                                                                            


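7. OK, the goal of the experiment has been achieved.
What this demonstrates is a classic SQL*Loader workflow, and any SQL*Loader load can follow the same steps. Only the details differ, such as whether the data lives in a separate file (in this experiment it is embedded in the control file) and how various delimiters need special handling.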
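
As an illustration of the "separate data file" variation mentioned above, this sketch writes a control file that names /etc/passwd in INFILE instead of embedding the rows after BEGINDATA. The file name load_passwd_ext.ctl and the added TRAILING NULLCOLS clause (which treats missing trailing fields as NULL) are choices made for this example, not part of the original experiment.

```shell
# Write a control file that reads from an external data file rather than BEGINDATA.
# The name load_passwd_ext.ctl is an assumption made for this sketch.
cat > load_passwd_ext.ctl <<'EOF'
LOAD DATA
INFILE '/etc/passwd'
INTO TABLE linux_passwd
REPLACE
FIELDS TERMINATED BY ':'
TRAILING NULLCOLS
( p_user_name,
  p_password,
  p_uid,
  p_gid,
  p_description,
  p_main_dir,
  p_shell
)
EOF

# The load itself would then be run as in step 4 (not run here):
# sqlldr userid=sec/sec control=load_passwd_ext.ctl log=load_passwd_ext.log
```

With the data kept outside the control file, the same control file can be reused whenever the passwd file changes.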

8. Further reference you should not skip
[SQL*Loader reference 1] Running sqlldr with no arguments at the command line prints a brief help text
ora10g@testdb183 /home/oracle$ sqlldr

SQL*Loader: Release 10.2.0.3.0 - Production on Sun Aug 30 16:10:32 2009

Copyright (c) 1982, 2005, Oracle.  All rights reserved.


Usage: SQLLDR keyword=value [,keyword=value,...]

Valid Keywords:

    userid -- ORACLE username/password
   control -- control file name
       log -- log file name
       bad -- bad file name
      data -- data file name
   discard -- discard file name
discardmax -- number of discards to allow          (Default all)
      skip -- number of logical records to skip    (Default 0)
      load -- number of logical records to load    (Default all)
    errors -- number of errors to allow            (Default 50)
      rows -- number of rows in conventional path bind array or between direct path data saves
               (Default: Conventional path 64, Direct path all)
  bindsize -- size of conventional path bind array in bytes  (Default 256000)
    silent -- suppress messages during run (header,feedback,errors,discards,partitions)
    direct -- use direct path                      (Default FALSE)
   parfile -- parameter file: name of file that contains parameter specifications
  parallel -- do parallel load                     (Default FALSE)
      file -- file to allocate extents from
skip_unusable_indexes -- disallow/allow unusable indexes or index partitions  (Default FALSE)
skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable  (Default FALSE)
commit_discontinued -- commit loaded rows when load is discontinued  (Default FALSE)
  readsize -- size of read buffer                  (Default 1048576)
external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE  (Default NOT_USED)
columnarrayrows -- number of rows for direct path column array  (Default 5000)
streamsize -- size of direct path stream buffer in bytes  (Default 256000)
multithreading -- use multithreading in direct path
 resumable -- enable or disable resumable for current session  (Default FALSE)
resumable_name -- text string to help identify resumable statement
resumable_timeout -- wait time (in seconds) for RESUMABLE  (Default 7200)
date_cache -- size (in entries) of date conversion cache  (Default 1000)

PLEASE NOTE: Command-line parameters may be specified either by
position or by keywords.  An example of the former case is 'sqlldr
scott/tiger foo'; an example of the latter is 'sqlldr control=foo
userid=scott/tiger'.  One may specify parameters by position before
but not after parameters specified by keywords.  For example,
'sqlldr scott/tiger control=foo logfile=log' is allowed, but
'sqlldr scott/tiger control=foo log' is not, even though the
position of the parameter 'log' is correct.
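
The parfile keyword listed above lets the parameters from step 4 live in a file instead of on the command line. A small sketch, where the file name load_passwd.par is an assumption made for this example:

```shell
# Write the step 4 parameters into a parameter file, one keyword=value per line.
# The name load_passwd.par is an assumption made for this sketch.
cat > load_passwd.par <<'EOF'
userid=sec/sec
control=load_passwd.ctl
log=load_passwd.log
bad=load_passwd.bad
discard=load_passwd.dsc
EOF

# The load would then be started like this (not run here):
# sqlldr parfile=load_passwd.par
```

One practical benefit is that the username and password no longer appear among the command-line arguments shown in the process list.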

[SQL*Loader reference 2] "Oracle SQL*Loader Version 10.2", a very useful reference
http://www.psoug.org/reference/sqlloader.html

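[SQL*Loader reference 3] The "SQL*Loader" part of the official Oracle documentation: Chapter 6 through Chapter 11, six chapters in all, are devoted to SQL*Loader, which shows how much weight Oracle gives this tool.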
http://download.oracle.com/docs/cd/B19306_01/server.102/b14215/part_ldr.htm#i436326

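9. Summary
SQLLDR strengths:
1. One more effective way to migrate, back up and restore data;
2. A convenient, general-purpose tool for moving data between different databases, because it works from plain text files; it can sidestep the garbled-character problems that tools such as EXP (EXPDP)/IMP (IMPDP) sometimes cause;
3. A highly effective way to move data from text files into a database;
4. Fast, especially when combined with direct path loading (direct=true), which bypasses most of the SQL engine and generates very little undo; redo can also be largely avoided when the load is unrecoverable (for example with UNRECOVERABLE or a NOLOGGING table), so load efficiency improves considerably;
5. Closely integrated with external tables.

SQLLDR weaknesses:
There are essentially no serious weaknesses. However, loading more unusual datatypes, such as LOB data, requires attention to quite a few details; I will cover that separately when I get the chance.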

-- The End --
