Recurrent-VLN-BERT

Bugs encountered while reproducing Recurrent VLN-BERT

Create the Docker container:

sudo nvidia-docker run -it --mount type=bind,source=/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator,target=/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator --volume `pwd`:/root/mount/Matterport3DSimulator mattersim:9.2-devel-ubuntu18.04

BUG:

    return [get_point_angle_feature(baseViewId) for baseViewId in range(36)]
  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/utils.py", line 378, in <listcomp>
    return [get_point_angle_feature(baseViewId) for baseViewId in range(36)]
  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/utils.py", line 363, in get_point_angle_feature
    sim.newEpisode('ZMojNkEp431', '2f4d90acd4024c269fb0efe49a8ac540', 0, math.radians(-30))
TypeError: newEpisode(): incompatible function arguments. The following argument types are supported:
    1. (self: MatterSim.Simulator, arg0: List[str], arg1: List[str], arg2: List[float], arg3: List[float]) -> None

Invoked with: <MatterSim.Simulator object at 0x7fee5ec38180>, 'ZMojNkEp431', '2f4d90acd4024c269fb0efe49a8ac540', 0, -0.5235987755982988

Corrected code:

sim = new_simulator()

feature = np.empty((36, args.angle_feat_size), np.float32)
base_heading = (baseViewId % 12) * math.radians(30)
for ix in range(36):
    if ix == 0:
        sim.newEpisode(['ZMojNkEp431'], ['2f4d90acd4024c269fb0efe49a8ac540'], [0], [math.radians(-30)])
    elif ix % 12 == 0:
        sim.makeAction([0], [1.0], [1.0])
    else:
        sim.makeAction([0], [1.0], [0])

    state = sim.getState()[0]
    # print(state,ix)
    assert state.viewIndex == ix

Fix: follow the supported signature and pass every argument as a list, as in the code above.

AttributeError: 'list' object has no attribute 'viewIndex'

Fix: change state = sim.getState() to state = sim.getState()[0]. getState() returns a list with one state per simulator in the batch.

ModuleNotFoundError: No module named 'transformers.pytorch_transformers'

Fix: remove the pytorch_transformers component from the import path.
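
If the same script has to run against more than one installed version, the import can be tried under several module paths. The following is a minimal sketch, not the repository's code: the helper name import_first_available is hypothetical, and the second module path in the usage comment is an assumption about where the module lives in the installed version.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that can be imported."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))


# Hypothetical usage; the second path is an assumption, check your installed version:
# modeling_bert = import_first_available([
#     "transformers.pytorch_transformers.modeling_bert",
#     "transformers.modeling_bert",
# ])
```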

OSError: file Prevalent/pretrained_model/pytorch_model.bin not found

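Cause: the pretrained weights had not been placed where the code expects them.
Fix: change the path to point at the actual location of the weights, for example: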
/home/tianzeyue/EQA_attack/Recurrent-VLN-BERT/weight/pytorch_model.bin
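
A small check before loading gives a clearer error than the OSError above. This is only a sketch: resolve_weights is a hypothetical helper, and the two paths in the usage comment are the ones quoted in this post.

```python
from pathlib import Path


def resolve_weights(candidates):
    """Return the first existing file among candidates, or raise a clear error."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    tried = ", ".join(str(c) for c in candidates)
    raise FileNotFoundError(f"pretrained weights not found; tried: {tried}")


# Usage with the paths mentioned above:
# weights = resolve_weights([
#     "Prevalent/pretrained_model/pytorch_model.bin",
#     "/home/tianzeyue/EQA_attack/Recurrent-VLN-BERT/weight/pytorch_model.bin",
# ])
```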

TypeError: init_weights() takes 1 positional argument but 2 were given

Fix:
Change self.apply(self.init_weights) to self.init_weights()
Reference: https://github.com/lonePatient/Bert-Multi-Label-Text-Classification/issues/24
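
Why the old call fails can be shown without PyTorch. The sketch below uses TinyModule, a hypothetical stand-in for torch.nn.Module that only mimics how apply() calls its argument once per module. It assumes that in the installed library version init_weights() takes no module argument and dispatches to a per-module initializer itself.

```python
class TinyModule:
    """Stand-in for torch.nn.Module, only to illustrate how apply() calls its argument."""

    def __init__(self, children=()):
        self.children = list(children)
        self.initialized = False

    def apply(self, fn):
        # Like nn.Module.apply: call fn(module) on every submodule, then on self.
        for child in self.children:
            child.apply(fn)
        fn(self)
        return self

    def _init_weights(self, module):
        # Per-module initializer: this is the call shape apply() expects.
        module.initialized = True

    def init_weights(self):
        # Zero-argument entry point (assumed to match the newer library behavior).
        self.apply(self._init_weights)


model = TinyModule([TinyModule(), TinyModule()])

try:
    model.apply(model.init_weights)  # old call style: apply() passes a module argument
except TypeError as exc:
    print("old call fails:", exc)

model.init_weights()  # new call style
print(all(m.initialized for m in [model, *model.children]))
```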

  File "r2r_src/train.py", line 198, in train_val
    train_env = R2RBatch(feat_dict, batch_size=args.batchSize, splits=['train'], tokenizer=tok)
  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/env.py", line 96, in __init__
    self.env = EnvBatch(feature_store=feature_store, batch_size=batch_size)
  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/env.py", line 55, in __init__
    sim.init()
AttributeError: 'MatterSim.Simulator' object has no attribute 'init'

Fix:
Change sim.init() to sim.initialize()
The author notes that this is caused by a simulator version difference: https://github.com/YicongHong/Recurrent-VLN-BERT/issues/4
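
To keep one code path working with both simulator versions, the method can be looked up by name. This is a sketch under that assumption; initialize_simulator is a hypothetical helper, not part of MatterSim.

```python
def initialize_simulator(sim):
    """Call sim.initialize() if it exists, otherwise fall back to sim.init()."""
    for name in ("initialize", "init"):
        method = getattr(sim, name, None)
        if callable(method):
            method()
            return name  # report which method was used
    raise AttributeError("simulator has neither initialize() nor init()")


# Usage inside EnvBatch.__init__, replacing the direct sim.init() call:
# initialize_simulator(sim)
```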

    param.data = fn(param.data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 265, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "/usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py", line 163, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:51

This happened on a single-GPU machine.
Fix: set os.environ['CUDA_VISIBLE_DEVICES'] = '0' before CUDA is initialized.
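
The variable only takes effect if it is set before the first CUDA call, so the safest place is the top of the entry script, before torch is imported. A minimal sketch, with select_gpus as a hypothetical helper name:

```python
import os


def select_gpus(device_ids):
    """Restrict CUDA to the given device indices; call before CUDA is initialized."""
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


select_gpus([0])  # single-GPU machine: expose only device 0
# import torch    # import the framework only after the variable is set
```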

  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/agent.py", line 571, in load
    states = torch.load(path)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 560, in _load
    raise RuntimeError("{} is a zip archive (did you mean to use torch.jit.load()?)".format(f.name))
RuntimeError: snap/VLNBERT-PREVALENT-final/state_dict/best_val_unseen is a zip archive (did you mean to use torch.jit.load()?)

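Fix: update PyTorch. The checkpoint is stored in the zip-based format that torch.save has written by default since PyTorch 1.6, which older versions of torch.load cannot read. The error message hints at torch.jit.load(), but that function is meant for TorchScript archives and is unlikely to help with a checkpoint written by torch.save.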
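
Whether a checkpoint uses the zip-based format can be checked with the standard library alone, before deciding to upgrade. A sketch, where checkpoint_format is a hypothetical helper:

```python
import zipfile


def checkpoint_format(path):
    """Return 'zip' for zip-based checkpoints and 'legacy' for anything else."""
    return "zip" if zipfile.is_zipfile(path) else "legacy"


# Usage with the checkpoint from the traceback above:
# checkpoint_format("snap/VLNBERT-PREVALENT-final/state_dict/best_val_unseen")
```

A result of 'zip' together with the error above suggests the installed PyTorch is older than the one that saved the file.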

  File "/home/oem/Desktop/Zeyue/Attack/Matterport3DSimulator/VLN/Recurrent-VLN-BERT/r2r_src/vlnbert/vlnbert_PREVALENT.py", line 393, in forward
    token_type_ids = torch.zeros_like(input_ids)
RuntimeError: CUDA error: no kernel image is available for execution on the device

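Fix: change the CUDA version. This error means the installed PyTorch build contains no compiled kernels for the GPU's architecture, so install a PyTorch/CUDA combination that supports the card.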
https://blog.csdn.net/li4692625/article/details/123169438
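
The mismatch can be diagnosed by comparing the architectures a build was compiled for with the GPU's compute capability. The sketch below is a simplified check, not an official tool: both function names are hypothetical, and it assumes the usual CUDA compatibility rules (native sm_XY code runs on GPUs with the same major version and an equal or higher minor version; compute_XY PTX can be compiled at load time for equal or newer GPUs).

```python
import re


def parse_arch(tag):
    """Split a tag such as 'sm_75' or 'compute_80' into (kind, (major, minor))."""
    match = re.fullmatch(r"(sm|compute)_(\d+)", tag)
    if match is None:
        return None  # unrecognized tag, ignored by the caller
    major, minor = divmod(int(match.group(2)), 10)
    return match.group(1), (major, minor)


def build_supports_device(arch_list, capability):
    """Return True if a build targeting arch_list should run on this capability."""
    capability = tuple(capability)
    for tag in arch_list:
        parsed = parse_arch(tag)
        if parsed is None:
            continue
        kind, arch = parsed
        if kind == "sm" and arch[0] == capability[0] and arch[1] <= capability[1]:
            return True  # native code for the same major version
        if kind == "compute" and arch <= capability:
            return True  # PTX can be compiled for a newer GPU at load time
    return False


# On a recent PyTorch the inputs can be obtained with:
#   torch.cuda.get_arch_list() and torch.cuda.get_device_capability(0)
print(build_supports_device(["sm_37", "sm_50", "sm_60", "sm_70"], (8, 6)))
```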

Traceback (most recent call last):
  File "src/driver/driver.py", line 37, in <module>
    sim.makeAction(sim, [1], [1], [1])
TypeError: makeAction(): incompatible function arguments. The following argument types are supported:
    1. (self: MatterSim.Simulator, arg0: List[int], arg1: List[float], arg2: List[float]) -> None

Corrected action table:

    # For now, the agent can't pick which forward move to make - just the one in the middle
    env_actions = {
      'left': [[0], [-1], [0]], # left
      'right': [[0], [1], [0]], # right
      'up': [[0], [0], [1]], # up
      'down': [[0], [0], [-1]], # down
      'forward': [[1], [0], [0]], # forward
      '<end>': [[0], [0], [0]], # <end>
      '<start>': [[0], [0], [0]], # <start>
      '<ignore>': [[0], [0], [0]]  # <ignore>
    }

Fix: change every action from a tuple of scalars to a list of single-element lists, for example 'left': (0, -1, 0) becomes 'left': [[0], [-1], [0]].
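
Instead of editing every entry by hand, the table can be converted programmatically. This is a sketch: to_batched is a hypothetical helper, and the scalar tuples are inferred from the batched values shown in this post, read as (index, heading, elevation).

```python
def to_batched(actions):
    """Turn {'name': (a, b, c)} into {'name': [[a], [b], [c]]} for a batch of one."""
    return {name: [[value] for value in triple] for name, triple in actions.items()}


# Scalar form, inferred from the batched table (assumption).
scalar_actions = {
    'left': (0, -1, 0),
    'right': (0, 1, 0),
    'up': (0, 0, 1),
    'down': (0, 0, -1),
    'forward': (1, 0, 0),
}

env_actions = to_batched(scalar_actions)
# sim.makeAction(*env_actions['left'])  # one simulator in the batch
```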

Result:

Loaded the listener model at iter 114000 from snap/VLNBERT-PREVALENT-final/state_dict/best_val_unseen
result length 1501
Env name: val_train_seen, nav_error: 0.8354, oracle_error: 0.6634, steps: 5.1839, lengths: 10.0262, success_rate: 0.9394, oracle_rate: 0.9520, spl: 0.9125
result length 1021
Env name: val_seen, nav_error: 2.8973, oracle_error: 1.9402, steps: 5.5436, lengths: 11.1393, success_rate: 0.7228, oracle_rate: 0.7826, spl: 0.6774
result length 2349
Env name: val_unseen, nav_error: 3.9281, oracle_error: 2.5456, steps: 6.1235, lengths: 12.0003, success_rate: 0.6275, oracle_rate: 0.7020, spl: 0.5684
