LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.csv'
into table `jjdb_fkdb_all_dropdup_20w` character set utf8
fields terminated by ',' optionally enclosed by '"' escaped by '"'
lines terminated by '\r\n';
注:导入csv文件,对文件的结构和内容要求苛刻,失败概率很大。
1,将csv文件保存为txt文件输出
df.to_csv('jjdb_fkdb_all_dropdup.txt',sep='\t',index=False)
2,进入数据库
mysql -uroot -p
use mysql57
3,数据导入mysql
LOAD DATA INFILE '/var/lib/mysql-files/jjdb_fkdb_all_dropdup_20w.txt'
若txt中字段的顺序与table中字段的顺序不一致,则会报错,所以以下命令中,按照table中的顺序输入字段名称:
INTO TABLE jjdb_fkdb_all_dropdup_20w IGNORE 1 LINES (xzqhdm,jjdbh,jjdwdm,jjybh,jjyxm,jjtbh,jjtip,jjsj,bjdh,jjlyh,bjrxm,bjrxbdm,lxdh,lxdz,jqdz,gxdwdm,jqlxdm,zddwxzb,zddwyzb,jqztdm,gxsjc,gljqbh,tfhm,rksjc,bjnr,bjlxmc,labels,sjdbh,jjdbh_f,cjdbh,fkdbh,fksj,fkdwip,fktbh,fkybh,fkdwdm,fkyxm,jwqdm,jwqmc,cjrxm,sjcjsj,ddxcsj,jqlbdm,jqxldm,jqfssj,jqjssj,jqdjdm,hzdjdm,afcslxdm,qhjzlbdm,cjqk,sfphxsaj,sfcczaaj,sfjjjf,cdclqk,cdryqk,hzyydm,cljgdm,cljg,rksjc_f,gxsjc_f,jqztdm_f,id);
导入成功
Query OK, 6331041 rows affected, 65535 warnings (2 min 30.16 sec)
Records: 6331041 Deleted: 0 Skipped: 0 Warnings: 48877809
1,导入过程中出现1261报错:
ERROR 1261 (01000): Row 404 doesn't contain data for all columns
则设置sql_mode,操作如下:
show variables like "sql_mode";
set sql_mode='';
2,导入过程中出现1064报错:
ERROR 1062 (23000): Duplicate entry '20' for key 'PRIMARY'
则在Navicat中,将table中的主键去掉即可,但如此导入结束后会发现数据量增加。
从表jjdb_fkdb_all_dropdup_copy中导出csv文件(导出的csv文件用pandas读取可能会出错)
select * into outfile '/var/lib/mysql-files/all_vs_dup.csv' fields terminated by '\t' lines terminated by '\n' from jjdb_fkdb_all_dropdup_copy;
import pandas as pd
import pymysql
# sql 命令
sql_cmd = "SELECT * FROM table_name"
# 用DBAPI构建数据库链接engine
con = pymysql.connect(host='172.**.**.**', user='****', password='**这里填写数据库密码**', database='**填写数据库名称**', charset='utf8', use_unicode=True)
df = pd.read_sql(sql_cmd, con)