mongodump is very slow on large data sets

The fix that worked for me was to add the --forceTableScan option.

--forceTableScan
Forces mongodump to scan the data store directly: typically, mongodump saves entries as they appear in the index of the _id field. Use --forceTableScan to skip the index and scan the data directly. Typically there are two cases where this behavior is preferable to the default:

If you have key sizes over 800 bytes that would not be present in the _id index.
Your database uses a custom _id field.
When you run with --forceTableScan, mongodump does not use $snapshot. As a result, the dump produced by mongodump can reflect the state of the database at many different points in time.

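One thing to note first: in my case the collection uses custom _id values. Normally mongodump reads documents through MongoDB's _id index. When the order of the index does not match the order in which documents are stored on disk, which is likely with custom _id values, this produces a large number of random reads. With --forceTableScan, mongodump does not go through the _id index at all.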
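As a rough illustration of how the option is passed, here is a minimal Python sketch that builds and runs a mongodump command line. The database name, output directory, and the helper names build_dump_command and run_dump are placeholders made up for this example; --host, --port, --db, --collection, --out and --forceTableScan are mongodump options, but check them against the version you have installed.

```python
import subprocess


def build_dump_command(db, out_dir, host="localhost", port=27017, collection=None):
    """Build a mongodump argument list with --forceTableScan enabled.

    All values here are illustrative placeholders, not settings from the
    original post.
    """
    cmd = [
        "mongodump",
        "--host", host,
        "--port", str(port),
        "--db", db,
        "--out", out_dir,
        "--forceTableScan",  # scan the data directly instead of walking the _id index
    ]
    if collection is not None:
        # Optionally restrict the dump to a single collection.
        cmd += ["--collection", collection]
    return cmd


def run_dump(db, out_dir, **kwargs):
    """Run mongodump; needs the binary on PATH and a reachable server."""
    subprocess.run(build_dump_command(db, out_dir, **kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_dump_command("mydb", "/backup/mydb")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before pointing it at a production server.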

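By default mongodump uses snapshot: it walks the _id index and then fetches the actual document for each entry. A table scan instead reads documents in MongoDB's physical storage order, with no index lookup step. The potential problem with a table scan is that if a document is physically moved while the dump is running, it may end up in the output twice.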
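If duplicated output is a concern, one way to check for it is to look for repeated _id values among the dumped documents. The following is only a sketch: it assumes the documents have already been decoded into Python dicts by some other means (decoding the dump files is not shown), and find_duplicate_ids is a hypothetical helper name, not part of any MongoDB tool.

```python
def find_duplicate_ids(documents):
    """Return the repr() of every _id that occurs more than once.

    `documents` is any iterable of dicts with an "_id" key. Results are
    listed in the order in which the repetition was first detected.
    """
    seen = set()
    duplicates = []
    for doc in documents:
        # repr() lets us track ids that are not hashable, such as
        # embedded documents used as a custom _id.
        key = repr(doc["_id"])
        if key in seen:
            if key not in duplicates:
                duplicates.append(key)
        else:
            seen.add(key)
    return duplicates


if __name__ == "__main__":
    sample = [{"_id": "a"}, {"_id": "b"}, {"_id": "a"}]
    print(find_duplicate_ids(sample))
```

An empty result means no _id was seen twice in the documents that were checked.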
