Inserting large amounts of test data into MySQL with stored procedures

Random string

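-- RAND() makes the result non-deterministic, so the function is declared NOT DETERMINISTIC;
-- NO SQL still allows it to be created when binary logging is enabled.
DELIMITER //
CREATE DEFINER=`root`@`%` FUNCTION `rand_string`(n INT) RETURNS varchar(255) CHARSET utf8mb4
    NOT DETERMINISTIC
    NO SQL
BEGIN
    -- 62 candidate characters: a-z, A-Z, 0-9
    DECLARE chars_str varchar(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    DECLARE return_str varchar(255) DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        -- FLOOR(1 + RAND() * 62) picks a position from 1 to 62
        SET return_str = concat(return_str, substring(chars_str, FLOOR(1 + RAND() * 62), 1));
        SET i = i + 1;
    END WHILE;
    RETURN return_str;
END//
DELIMITER ;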

Random datetime within a given time range

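-- RAND() makes the result non-deterministic, so the function is declared NOT DETERMINISTIC;
-- NO SQL still allows it to be created when binary logging is enabled.
DELIMITER //
CREATE DEFINER=`root`@`%` FUNCTION `rand_datetime`(sd DATETIME, ed DATETIME) RETURNS datetime
    NOT DETERMINISTIC
    NO SQL
BEGIN
    DECLARE sub INT DEFAULT 0;
    DECLARE ret DATETIME;
    -- length of the range in seconds
    SET sub = ABS(UNIX_TIMESTAMP(ed) - UNIX_TIMESTAMP(sd));
    -- add a random offset of 1 to sub-1 seconds to the start time
    SET ret = DATE_ADD(sd, INTERVAL FLOOR(1 + RAND() * (sub - 1)) SECOND);
    RETURN ret;
END//
DELIMITER ;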

Test:

SELECT rand_datetime('2017-01-01 00:00:00', '2017-12-31 23:59:59') AS t;

Random integers

SELECT TRUNCATE(RAND() * 100000, 0);  -- whole number from 0 to 99999
SELECT FLOOR(1 + RAND() * 5);         -- whole number from 1 to 5
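
Both expressions follow the same pattern. As a minimal sketch, a random integer in an arbitrary inclusive range can be written with FLOOR and RAND(); the user variables @lo and @hi are hypothetical names chosen for illustration and do not appear in the original.

```sql
-- @lo and @hi are illustrative names (assumptions, not from the original post)
SET @lo = 10, @hi = 20;

-- RAND() returns a value v with 0 <= v < 1, so the expression below
-- yields an integer n with @lo <= n <= @hi
SELECT FLOOR(@lo + RAND() * (@hi - @lo + 1)) AS n;
```

With @lo = 1 and @hi = 5 this reduces to FLOOR(1 + RAND() * 5).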

Stored procedures that insert into the tables

DELIMITER //
CREATE DEFINER=`root`@`%` PROCEDURE `add_article_ready`(IN n INT)
BEGIN
    DECLARE i INT DEFAULT 1;
    -- insert n rows of random data, one row per iteration;
    -- lang is 'zh-tw' or 'zh-cn' with equal probability
    WHILE (i <= n) DO
        INSERT INTO article_ready (article_id, url, title, content, summary, published_at, account_id, comment_count, channel_id, cover_id, lang, created_at, updated_at) VALUES (
            FLOOR(1 + RAND() * 70000), rand_string(50), rand_string(50), rand_string(1000), rand_string(500),
            rand_datetime('2017-01-01 00:00:00', '2019-01-24 23:59:59'),
            TRUNCATE(RAND() * 70000, 0), TRUNCATE(RAND() * 1000, 0),
            FLOOR(1 + RAND() * 5), FLOOR(1 + RAND() * 80000),
            ELT(FLOOR(1 + RAND() * 2), 'zh-tw', 'zh-cn'), NOW(), NOW());
        SET i = i + 1;
    END WHILE;
END//
DELIMITER ;
DELIMITER //
CREATE DEFINER=`root`@`%` PROCEDURE `add_article_file`(IN n INT)
BEGIN
    DECLARE i INT DEFAULT 1;
    -- insert n rows of random file records, one row per iteration
    WHILE (i <= n) DO
        INSERT INTO article_file (file_name, file_size, file_path, status, created_at, updated_at) VALUES (
            concat(rand_string(20), '.jpg'),
            FLOOR(1 + RAND() * 200000),
            -- path of the form 2018/<month 1-12>/<day 1-30>
            concat('2018', '/', FLOOR(1 + RAND() * 12), '/', FLOOR(1 + RAND() * 30)),
            1, NOW(), NOW());
        SET i = i + 1;
    END WHILE;
END//
DELIMITER ;

Run the inserts

CALL add_article_ready(10000);
CALL add_article_file(80000);

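This fails because return_str is too long: rand_string is declared with a length of 255, while I ask it to generate 1000 characters, which exceeds the limit. Changing it to varchar(2000) still produces the error Data too long for column 'rand_string(500)' at row 1. The fix is to change the declaration to

   DECLARE return_str TEXT DEFAULT '';

and also to change the return type of rand_string to TEXT in the phpMyAdmin routine editor.
After that, running

SET @p0='500'; SELECT `rand_string`(@p0) AS `rand_string`;

outputs long random strings.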
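
For readers not using phpMyAdmin, the same change can be sketched as plain SQL by dropping and recreating the function. This is a sketch under assumptions rather than the original author's exact statement: the DEFINER clause is omitted so the function is owned by the current user, and the function is declared NOT DETERMINISTIC NO SQL because it calls RAND().

```sql
-- Sketch: recreate rand_string with TEXT for both the local variable
-- and the return type, so results longer than 255 characters fit.
DROP FUNCTION IF EXISTS rand_string;

DELIMITER //
CREATE FUNCTION rand_string(n INT) RETURNS TEXT CHARSET utf8mb4
    NOT DETERMINISTIC
    NO SQL
BEGIN
    DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    DECLARE return_str TEXT DEFAULT '';
    DECLARE i INT DEFAULT 0;
    WHILE i < n DO
        -- append one character taken from a random position 1..62
        SET return_str = CONCAT(return_str, SUBSTRING(chars_str, FLOOR(1 + RAND() * 62), 1));
        SET i = i + 1;
    END WHILE;
    RETURN return_str;
END//
DELIMITER ;

-- Check the length of a generated string
SELECT CHAR_LENGTH(rand_string(1000)) AS len;
```

The target columns (content, summary) must of course also be wide enough for the generated values.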

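When inserting 1,000,000 rows, only 200,000 had been inserted after 30 minutes, which is poor. The data generated by rand_string is large, which slows things down, and very long SQL statements also hurt efficiency.
Ways to improve efficiency:

  • LOAD DATA is more efficient than INSERT.
  • Disabling indexes and uniqueness checks before the import and re-enabling them afterwards also helps (DISABLE KEYS only affects non-unique indexes on MyISAM tables):
-- before the import
ALTER TABLE table_name DISABLE KEYS;
SET UNIQUE_CHECKS=0;
-- after the import
ALTER TABLE table_name ENABLE KEYS;
SET UNIQUE_CHECKS=1;
  • Use a temporary table.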
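
As a rough sketch of the first two suggestions, the statements below wrap the procedure call in a single transaction with uniqueness checks off, and then show a LOAD DATA alternative. Several things here are assumptions not found in the original: that article_file is an InnoDB table, that committing once via autocommit = 0 is acceptable, that the server and client allow LOCAL INFILE, and the file path /tmp/article_file.csv, which is purely hypothetical.

```sql
-- Option 1: keep the stored procedure, but commit once and skip
-- uniqueness checks for the duration of the load.
SET autocommit = 0;      -- assumption: one commit at the end instead of one per row
SET unique_checks = 0;

CALL add_article_file(80000);

COMMIT;
SET unique_checks = 1;
SET autocommit = 1;

-- Option 2: load rows from a pre-generated CSV file.
-- The path is hypothetical; the file is assumed to hold one row per line
-- with comma-separated values in the column order listed below.
LOAD DATA LOCAL INFILE '/tmp/article_file.csv'
INTO TABLE article_file
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(file_name, file_size, file_path, status, created_at, updated_at);
```

Whether either option is actually faster depends on the table definition and server configuration, so it is worth timing both on a small sample first.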

References:
https://windmt.com/2018/05/04/mysql-easily-generate-millions-of-test-data/
https://blog.csdn.net/JQ_AK47/article/details/52087484
http://www.111cn.net/database/mysql/53274.htm
