Fixing lmdb.InvalidParameterError when reading LMDB files in PyTorch

I. Problem description:

When training with the PyTorch framework on a dataset stored in LMDB format, the following error is raised:

lmdb.InvalidParameterError: Caught InvalidParameterError in DataLoader worker process 0.

II. Full error output:

Traceback (most recent call last):
  File "/home/xxxx/PycharmProjects/xxxx/utils/data_v2_squence_lstm_lmdb.py", line 453, in <module>
    test_dataset()
  File "/home/xxxx/PycharmProjects/xxxx/utils/data_v2_squence_lstm_lmdb.py", line 390, in test_dataset
    for d, t in train_loader:
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
    return self._process_data(data)
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
lmdb.InvalidParameterError: Caught InvalidParameterError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xxxx/miniconda3/envs/pytorch-1.7/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xxxx/PycharmProjects/xxxx/utils/data_v2_squence_lstm_lmdb.py", line 293, in __getitem__
    img = self.query_by_key(query_key, lmdb_dir)
  File "/home/xxxx/PycharmProjects/xxxx/utils/data_v2_squence_lstm_lmdb.py", line 137, in query_by_key
    db = lmdb.open(lmdb_path, subdir=os.path.isdir(lmdb_path), readonly=True)
lmdb.InvalidParameterError: mdb_txn_begin: Invalid argument

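III. Cause analysis:

1. Reading the error

The traceback shows that the error is raised while opening the LMDB file, inside a DataLoader worker process: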

db = lmdb.open(lmdb_path, subdir=os.path.isdir(lmdb_path), readonly=True)

2. Related discussion

A Q&A thread on a closely related error (see the references) says:

LMDB will keep separate processes from overwriting over each other. However, it can't distinguish between requests from the same process from different environment objects. This will lead to data corruption.

Each process should open its own environment object (pointing at the same path).


In other words, each process should open its own environment on the shared path, and a single process should not reach the same path through more than one environment object.
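One way to follow that advice is to open the environment once per process and reuse it, instead of calling lmdb.open on every lookup as query_by_key does in the traceback. The following is a minimal sketch, not the original author's code: get_env, _ENV_CACHE and the open_env parameter are hypothetical names introduced for illustration, and whether this alone removes the error in a given setup has not been verified here.

```python
import os

# Hypothetical cache for this sketch: (process id, absolute path) -> environment.
_ENV_CACHE = {}


def get_env(lmdb_path, open_env=None):
    """Return a single read-only LMDB environment per process for lmdb_path.

    open_env defaults to lmdb.open; it is a parameter only so that the
    function can be exercised without py-lmdb installed (an assumption of
    this sketch, not part of the py-lmdb API).
    """
    if open_env is None:
        import lmdb  # imported lazily, only when the real opener is needed
        open_env = lmdb.open

    # The process id is part of the key, so a forked DataLoader worker does
    # not reuse an environment object inherited from its parent process.
    key = (os.getpid(), os.path.abspath(lmdb_path))
    env = _ENV_CACHE.get(key)
    if env is None:
        env = open_env(lmdb_path, subdir=os.path.isdir(lmdb_path), readonly=True)
        _ENV_CACHE[key] = env
    return env
```

Inside a Dataset, __getitem__ would call get_env(lmdb_dir) and start a read transaction on the returned environment, so each worker ends up with exactly one environment per path.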

3. A look at the lmdb.open parameters

lmdb.open takes a lock parameter.

Its documentation reads:

`lock`:
    If ``False``, don't do any locking. If concurrent access is
    anticipated, the caller must manage all concurrency itself. For
    proper operation the caller must enforce single-writer semantics,
    and must ensure that no readers are using old transactions while a
    writer is active. The simplest approach is to use an exclusive lock
    so that no readers may be active at all when a writer begins.

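In short: with lock=False, LMDB does no locking at all. The caller then has to guarantee single-writer semantics and make sure no reader is still using an old transaction while a writer is active.

lock defaults to True to keep the data safe. With locking on, every read transaction registers in LMDB's reader lock table. Locking does not by itself forbid multiple readers (LMDB is designed for concurrent readers), so the likely trigger here is the situation described in the quote above: the same path is opened repeatedly from DataLoader worker processes and the lock-table state ends up in conflict. Opening with lock=False skips the lock table, which makes the error go away. This is only safe when nothing writes to the database while it is being read, which is normally true for a training dataset.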

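4. A workaround that is not recommended

num_workers=0

One answer (see the references) suggests setting the DataLoader's num_workers to 0. That does make the error disappear, because all reads then happen in the main process, but data loading loses its parallelism and becomes slower, so it is not recommended.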

IV. Solution:

Pass lock=False when opening the LMDB file for reading.


# Set lock to False
db = lmdb.open(lmdb_path, subdir=os.path.isdir(lmdb_path), readonly=True, lock=False)
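For context, here is a minimal sketch of a complete lookup built around that call: open read-only with lock=False, fetch one value in a read transaction, then close the environment. read_value and its open_env parameter are hypothetical names for illustration (open_env defaults to lmdb.open), and encoding string keys as UTF-8 is an assumption about how the database was written.

```python
import os


def read_value(lmdb_path, key, open_env=None):
    """Read one value from an LMDB database opened read-only without locking.

    Returns the stored bytes, or None if the key is absent.
    open_env defaults to lmdb.open and exists only so the sketch can be
    tested without py-lmdb (an assumption of this example).
    """
    if open_env is None:
        import lmdb  # imported lazily, only when the real opener is needed
        open_env = lmdb.open

    # lock=False is only safe while no process writes to the database.
    env = open_env(
        lmdb_path,
        subdir=os.path.isdir(lmdb_path),
        readonly=True,
        lock=False,
    )
    try:
        # A read-only transaction; the with-block ends it on exit.
        with env.begin(write=False) as txn:
            value = txn.get(key.encode("utf-8"))
    finally:
        env.close()
    return value
```

Opening and closing the environment on every call mirrors the structure of the original query_by_key and keeps the sketch short; in a real Dataset the environment would normally be opened once per worker and reused across lookups.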

V. References:

lmdb.InvalidParameterError: mdb_txn_begin: Invalid argument · Issue #289 · jnwatson/py-lmdb · GitHub: https://github.com/jnwatson/py-lmdb/issues/289

TypeError: Caught TypeError in DataLoader worker process 0 when iterating a DataLoader in Python (CSDN blog by 小心丶): https://blog.csdn.net/weixin_45093926/article/details/103330105

python - lmdb.BadRslotError: mdb_txn_begin: MDB_BAD_RSLOT: Invalid reuse of reader locktable slot? (IT工具网): https://www.coder.work/article/7789480

python - lmdb.BadRslotError: mdb_txn_begin: MDB_BAD_RSLOT: Invalid reuse of reader locktable slot? (Stack Overflow): https://stackoverflow.com/questions/56905502/lmdb-badrsloterror-mdb-txn-begin-mdb-bad-rslot-invalid-reuse-of-reader-lockta
