Testing CSV export approaches for MongoDB data in Laravel

Data volume: 100k documents

Requirement: export order data joined with its related batch collection

The jenssegers/mongodb package makes reading and writing data convenient, but does that convenience cost us performance?

Export via the jenssegers/mongodb package

Tools/environment: homestead + "jenssegers/mongodb": "^3.2" + console

$t = microtime(true);
$order_id  = '';
$is_true   = true;
$limit     = 1000;
$file_name = storage_path('files/csv_test.csv');
$fp = fopen($file_name, 'w');
while ($is_true) {
    $query = \App\Models\Order_test::query()
        ->with('batch')
        ->where('payment.status', '>', \App\Models\Order::PAYMENT_UNPAID);
    if (!empty($order_id)) {
        // keyset pagination: resume after the last exported id
        $query->where('id', '>', $order_id);
    }
    $orders      = $query->orderBy('id', 'asc')->limit($limit)->get();
    $order_count = count($orders);
    if ($order_count > 0) {
        // first page: write the UTF-8 BOM and the header row
        if (empty($order_id)) {
            fwrite($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
            $title_list = [
                'Order No.',
                'Payment time',
                'Order status',
                'Product amount',
                'Order amount',
                'Item count',
                'Batch field 1',
                'Batch field 2',
                'Order created at',
            ];
            fputcsv($fp, $title_list);
        }
        // format each row
        foreach ($orders as $order) {
            $order_id = $order['id'];
            $data = [
                'order_id'      => $order_id . "\t", // real tab ("\t", double-quoted) keeps Excel from mangling long ids
                'pay_time'      => $order->pay_time ?? '',
                'order_status'  => \App\Models\Order::$_statusArr[$order->status] ?? '',
                'products_cost' => $order->products_cost,
                'cost_total'    => $order->total,
                'items_total'   => $order->items_total,
                'batch_field1'  => $order->batch['field1'] ?? '',
                'batch_field2'  => $order->batch['field2'] ?? '',
                'created_at'    => (string)$order->created_at,
            ];

            fputcsv($fp, $data);
        }
    }

    // stop when the last page is smaller than the page size
    if ($order_count < $limit) {
        $is_true = false;
        fclose($fp);
    }

}

$tl = microtime(true) - $t;
dd($tl);

Elapsed: 73s
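The `with('batch')` call above relies on a relation defined on the model. The original post does not show it; the sketch below is an assumption of what it would roughly look like (the `Batch` class name and the `hasOne` cardinality are guesses, only `batch_num` comes from the raw `$lookup` shown later):

```php
<?php
// Hypothetical sketch, not from the source: the Eloquent relation
// that with('batch') would need, joining on the shared batch_num field.
class Order_test extends \Jenssegers\Mongodb\Eloquent\Model
{
    protected $collection = 'order_test';

    public function batch()
    {
        // localField batch_num on order_test -> foreignField batch_num on the batch collection
        return $this->hasOne(Batch::class, 'batch_num', 'batch_num');
    }
}
```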

Export via the raw MongoDB driver

Tools/environment: homestead + console

Artisan::command('tiway:mongo_raw', function () {

    $t = microtime(true);
    // connect to MongoDB directly
    $con = new \MongoDB\Client("mongodb://username:password@host:port/");
    $db  = $con->selectDatabase('test');
    $collection = $db->selectCollection('order_test');

    $order_id  = '';
    $is_true   = true;
    // page size
    $limit_item = 1000;
    $file_name  = storage_path('files/mongo_raw.csv');
    $fp = fopen($file_name, 'w');
    while ($is_true) {
        // first page: write the UTF-8 BOM and the header row
        if (empty($order_id)) {
            fwrite($fp, chr(0xEF) . chr(0xBB) . chr(0xBF));
            $title_list = [
                'Order No.',
                'Payment time',
                'Order status',
                'Product amount',
                'Order amount',
                'Item count',
                'Batch field 1',
                'Batch field 2',
                'Order created at',
            ];
            fputcsv($fp, $title_list);
        }

        $pipeline = [];
        $match = ['payment.status' => ['$gt' => \App\Models\Order::PAYMENT_UNPAID]];
        if (!empty($order_id)) {
            // keyset pagination: resume after the last exported id
            $match += ['id' => ['$gt' => $order_id]];
        }
        $pipeline[] = ['$match' => $match];
        // $sort must come before $limit, otherwise we limit an unsorted set
        $pipeline[] = ['$sort' => ['id' => 1]];
        $pipeline[] = ['$limit' => $limit_item];
        // join the batch collection
        $pipeline[] = [
            '$lookup' => [
                "from"         => "easyeda_batch", // collection to join
                "localField"   => "batch_num",     // field on order_test
                "foreignField" => "batch_num",     // field on easyeda_batch
                "as"           => "batch",
            ],
        ];
        // allow the sort stage to spill to disk instead of failing at 100 MB
        $allow_disk = ['allowDiskUse' => true];
        $orders = $collection->aggregate($pipeline, $allow_disk);
        $row = 0;
        // format each row
        foreach ($orders as $order) {
            $order_id = $order->id;
            $data = [
                'order_id'      => $order_id . "\t", // real tab keeps Excel from mangling long ids
                'pay_time'      => $order->pay_time ?? '',
                'order_status'  => \App\Models\Order::$_statusArr[$order->status] ?? '',
                'products_cost' => $order->products_cost,
                'cost_total'    => $order->total,
                'items_total'   => $order->items_total,
                // $lookup returns an array, so take the first matched batch
                'batch_field1'  => $order->batch[0]['field1'] ?? '',
                'batch_field2'  => $order->batch[0]['field2'] ?? '',
                'created_at'    => (string)$order->created_at,
            ];
            $row++;
            fputcsv($fp, $data);
        }
        // stop when the last page is smaller than the page size
        if ($row < $limit_item) {
            $is_true = false;
            fclose($fp);
        }
    }

    $tl = microtime(true) - $t;
    dd($tl);

});

Surprisingly, it took 172s, even more than the ORM version!

My guess is that ['allowDiskUse' => true] is the culprit, but without it there isn't enough memory:
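One possible mitigation, my assumption rather than something verified in this post, is to make sure the sort field is indexed, so the `$sort` stage can walk the index instead of sorting 100k documents in memory and hitting the 100 MB limit at all:

```php
<?php
// Sketch, assuming the same connection details as the command above.
// An ascending index on `id` lets the pipeline's $sort => ['id' => 1]
// read documents in index order, avoiding the in-memory sort entirely.
$collection = (new \MongoDB\Client("mongodb://username:password@host:port/"))
    ->selectDatabase('test')
    ->selectCollection('order_test');

$collection->createIndex(['id' => 1]);
```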

In Aggregate.php line 219:

  Sort exceeded memory limit of 104857600 bytes, but did not opt in to external
  sorting. Aborting operation. Pass allowDiskUse:true to opt in.

Next I tried dropping `$sort` and paging with `$skip` instead:

And then…

By row 37,000 the export had already passed 172s, and it kept getting slower and slower… The next day I found it had taken 2213s!!! What?!

Don't casually use skip for paging: once the data volume grows, performance degrades sharply, because skip counts past documents one by one, so the deeper the offset, the slower it gets.
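The difference between the two paging styles can be sketched as follows (a minimal illustration reusing the `$collection` and `$match` from the command above, not the full export):

```php
<?php
// Slow: $skip must count past every skipped document on each page,
// so the cost of fetching page N grows linearly with the offset.
$page = $collection->aggregate([
    ['$match' => $match],
    ['$sort'  => ['id' => 1]],
    ['$skip'  => 37000],
    ['$limit' => 1000],
]);

// Fast: keyset (range) pagination seeks straight past the last seen id,
// so every page costs roughly the same regardless of how deep we are.
$page = $collection->aggregate([
    ['$match' => $match + ['id' => ['$gt' => $last_id]]],
    ['$sort'  => ['id' => 1]],
    ['$limit' => 1000],
]);
```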

Next step: the MongoDB aggregation documentation.
Still exploring…
