MindSpore Profiler fails with a CUPTI permission error

Environment: the mindspore/mindspore-gpu-cuda10.1:1.7.0 Docker image

The code is as follows:

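# Imports the excerpt relies on (MindSpore 1.7 module paths). LinearNet and
# ds_train are defined elsewhere in the script and are not shown in the post.
from mindspore import nn, Model
from mindspore.profiler import Profiler
from mindspore.train.callback import LossMonitor, SummaryCollector

if __name__ == "__main__":
    # Start profiling; results are written under ./prof_result
    ms_profiler = Profiler(output_path="./prof_result")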
    # Init a SummaryCollector callback instance, and use it in model.train or model.eval
    specified = {"collect_metric": True, "histogram_regular": "^conv1.*|^conv2.*", "collect_graph": True,
                 "collect_dataset_graph": True}

    summary_collector = SummaryCollector(summary_dir="./summary_dir/summary_01", collect_specified_data=specified,
                                         collect_freq=1, keep_default_action=False, collect_tensor_freq=200)
    net = LinearNet()
    net_loss = nn.loss.MSELoss()
    optim = nn.Momentum(net.trainable_params(), learning_rate=0.01, momentum=0.6)
    model = Model(net, net_loss, optim)
    epoch = 1
    model.train(epoch, ds_train, callbacks=[LossMonitor(100), summary_collector], dataset_sink_mode=False)
    ms_profiler.analyse()

Running the script directly as a normal user produces a permission error:

[ERROR] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.980.770 [mindspore/ccsrc/profiler/device/gpu/gpu_profiling.cc:487] StopCUPTI] CUPTI Error:CUPTI_ERROR_INSUFFICIENT_PRIVILEGES function:CuptiUnsubscribe. You may not have access to the NVIDIA GPU performance counters on the target device. Please use the root account to run profiling or configure permissions. If there is still the problem, please refer to the GPU performance tuning document on the official website of mindinsight.
[ERROR] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.980.893 [mindspore/ccsrc/profiler/device/gpu/gpu_profiling.cc:488] StopCUPTI] CUPTI Error:CUPTI_ERROR_NOT_INITIALIZED function:CuptiActivityFlushAll. You may not have access to the NVIDIA GPU performance counters on the target device. Please use the root account to run profiling or configure permissions. If there is still the problem, please refer to the GPU performance tuning document on the official website of mindinsight.
[ERROR] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.980.946 [mindspore/ccsrc/profiler/device/gpu/gpu_profiling.cc:491] StopCUPTI] CUPTI Error:CUPTI_ERROR_NOT_INITIALIZED function:CuptiActivityDisable. You may not have access to the NVIDIA GPU performance counters on the target device. Please use the root account to run profiling or configure permissions. If there is still the problem, please refer to the GPU performance tuning document on the official website of mindinsight.
[ERROR] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.980.981 [mindspore/ccsrc/profiler/device/gpu/gpu_profiling.cc:491] StopCUPTI] CUPTI Error:CUPTI_ERROR_NOT_INITIALIZED function:CuptiActivityDisable. You may not have access to the NVIDIA GPU performance counters on the target device. Please use the root account to run profiling or configure permissions. If there is still the problem, please refer to the GPU performance tuning document on the official website of mindinsight.
[ERROR] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.981.012 [mindspore/ccsrc/profiler/device/gpu/gpu_profiling.cc:491] StopCUPTI] CUPTI Error:CUPTI_ERROR_NOT_INITIALIZED function:CuptiActivityDisable. You may not have access to the NVIDIA GPU performance counters on the target device. Please use the root account to run profiling or configure permissions. If there is still the problem, please refer to the GPU performance tuning document on the official website of mindinsight.
[WARNING] PROFILER(5125,7f104abf0740,python):2022-06-23-02:27:25.982.601 [mindspore/ccsrc/profiler/device/gpu/gpu_data_saver.cc:138] WriteFile] No operation detail infos to write.
Traceback (most recent call last):
  File "profile.py", line 69, in <module>
    ms_profiler.analyse()
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/profiler/profiling.py", line 334, in analyse
    self._gpu_analyse()
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/profiler/profiling.py", line 710, in _gpu_analyse
    reduce_op_type = self._get_step_reduce_op_type()
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/profiler/profiling.py", line 759, in _get_step_reduce_op_type
    with open(step_trace_file_path, 'r') as f_obj:
FileNotFoundError: [Errno 2] No such file or directory: '/home/prof_result/profiler/step_trace_profiling_0.txt'

However, running the script with sudo to get administrator privileges produces a different error: nvcc cannot be found.

Traceback (most recent call last):
  File "profile.py", line 2, in <module>
    from mindspore import context
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/__init__.py", line 17, in <module>
    from .run_check import run_check
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/run_check/__init__.py", line 17, in <module>
    from ._check_version import check_version_and_env_config
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/run_check/_check_version.py", line 454, in <module>
    check_version_and_env_config()
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/run_check/_check_version.py", line 434, in check_version_and_env_config
    env_checker.check_version()
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/run_check/_check_version.py", line 143, in check_version
    nvcc_version = self._get_nvcc_version(False)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/run_check/_check_version.py", line 85, in _get_nvcc_version
    timeout=3, text=True, capture_output=True, check=False)
  File "/usr/local/python-3.7.5/lib/python3.7/subprocess.py", line 488, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/local/python-3.7.5/lib/python3.7/subprocess.py", line 800, in __init__
    restore_signals, start_new_session)
  File "/usr/local/python-3.7.5/lib/python3.7/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'nvcc': 'nvcc'
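The post does not diagnose the nvcc error. One plausible cause, stated here as an assumption, is that sudo usually resets PATH, so the CUDA bin directory is no longer searched when MindSpore calls nvcc during its version check. A minimal sketch for checking this, to be run once as the normal user and once under sudo (find_nvcc is a hypothetical helper, not part of MindSpore):

```python
import os
import shutil


def find_nvcc(path=None):
    """Return the full path of the nvcc executable found on `path`, or None.

    `path` defaults to the PATH of the current process, which is the same
    lookup that fails in the traceback when 'nvcc' cannot be executed.
    """
    return shutil.which("nvcc", path=path)


if __name__ == "__main__":
    # Compare this output between a normal run and a run under sudo.
    print("PATH =", os.environ.get("PATH", ""))
    print("nvcc =", find_nvcc())
```

If nvcc is found for the normal user but not under sudo, the difference in PATH is the likely culprit. Either way, the original permission error is a separate problem.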

Finally, I switched to the root user with sudo su and ran the Python script again, but the original permission error still appeared.

Solution: start the Docker container with the --privileged=true option.
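As a rough sketch of what that launch command might look like, the snippet below assembles a docker run invocation that includes the flag. Only --privileged=true and the image name come from the post. The helper build_docker_run, the -it flag, the /bin/bash entry command, and the extra_args parameter are illustrative assumptions. Any GPU runtime flags your setup needs would go in extra_args.

```python
import shlex

# Image named in the post.
IMAGE = "mindspore/mindspore-gpu-cuda10.1:1.7.0"


def build_docker_run(image, privileged=True, extra_args=()):
    """Build the argument list for an interactive `docker run` command.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["docker", "run", "-it"]
    if privileged:
        # The flag the post reports as the fix for the CUPTI permission error.
        cmd.append("--privileged=true")
    cmd.extend(extra_args)  # e.g. GPU runtime or volume flags for your setup
    cmd.extend([image, "/bin/bash"])
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run in a shell.
    print(" ".join(shlex.quote(arg) for arg in build_docker_run(IMAGE)))
```

The flag must be given when the container is created, so an already running container has to be recreated for it to take effect.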
