MindSpore error: RuntimeError: Exceed function call depth limit 1000, (function call depth: 1001...

1 Error description

1.1 System environment

Hardware Environment(Ascend/GPU/CPU): GPU

Software Environment:

– MindSpore version (source or binary): 1.6.0

– Python version (e.g., Python 3.7.5): 3.7.6

– OS platform and distribution (e.g., Linux Ubuntu 16.04): Ubuntu 4.15.0-74-generic

– GCC/Compiler version (if compiled from source):

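1.2 Basic information

1.2.1 Script

The training script builds a single-operator network around Concat, which joins tensors along a given axis. The script is as follows: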

import numpy as np
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor, context

context.set_context(mode=context.GRAPH_MODE, device_target="Ascend")

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.concat = ops.Concat()

    def construct(self, x):
        n = 1000
        input_x = ()
        # Build a tuple holding n copies of x; the call stack in the error
        # message points at this loop.
        for i in range(n):
            input_x += (x,)
        output = self.concat(input_x)
        return output

net = Net()
input_x1 = Tensor(np.random.rand(1, 4, 16, 16), mindspore.float32)
output = net(input_x1)
print(f"Output shape: {output.shape}")

1.2.2 Error message

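The error message is as follows (line numbers in the traceback count from the context.set_context line of the script, excluding imports):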

Traceback (most recent call last):
 File "demo.py", line 17, in <module>
  output = net(input_x1)
 File " /lib/python3.7/site-packages/mindspore/nn/cell.py", line 542, in __call__
  out = self.compile_and_run(*args)
 File " lib/python3.7/site-packages/mindspore/nn/cell.py", line 872, in compile_and_run
  self.compile(*inputs)
 File "/lib/python3.7/site-packages/mindspore/nn/cell.py", line 857, in compile
  _cell_graph_executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
 File "/lib/python3.7/site-packages/mindspore/common/api.py", line 712, in compile
  result = self._graph_executor.compile(obj, args_list, phase, self._use_vm_mode())
RuntimeError: mindspore/ccsrc/pipeline/jit/static_analysis/evaluator.cc:100 EnterStackFrame] Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 999).
It's always happened with complex construction of code or infinite recursion or loop.
Please check the code if it's has the infinite recursion or call 'context.set_context(max_call_depth=value)' to adjust this value.
If max_call_depth is set larger, the system max stack depth should be set larger too to avoid stack overflow.
For more details, please refer to the FAQ at https://www.mindspore.cn.
The function call stack (See file 'demo/rank_0/om/analyze_fail.dat' for more details):
# 0 In file demo.py(10)
  for i in range(n):
# 1 In file /lib/python3.7/site-packages/mindspore/_extends/parse/standard_method.py(1441)
  return it.__ms_hasnext__()
# 2 In file /lib/python3.7/site-packages/mindspore/_extends/parse/standard_method.py(1860)
  return len(xs) > 0

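Cause analysis

The script runs under MindSpore 1.6 in graph mode, and the function call stack at the end of the error message points to the for i in range(n): loop in construct (reported as demo.py(10)), which runs n = 1000 times to build the input tuple for Concat.

The RuntimeError reads "Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 999)": during graph compilation the function call depth reached 1001, above the limit of 1000, which is the default maximum call depth. The message continues with "Please check the code if it's has the infinite recursion or call 'context.set_context(max_call_depth=value)' to adjust this value". That is, either check the code for infinite recursion, or call context.set_context(max_call_depth=value) to raise the default depth limit. The script contains no infinite recursion; the 1000-iteration loop alone is enough to exceed the default, so the limit has to be raised. The message also notes that when max_call_depth is set larger, the system maximum stack depth should be set larger too, to avoid a stack overflow.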
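As a rough analogy in plain Python (not MindSpore's actual implementation), the sketch below models a loop in which every iteration costs one level of call depth, and shows how a fixed depth limit rejects 1000 iterations while a larger limit accepts them. The helper names build_tuple and try_build are hypothetical, and Python's own recursion limit stands in for max_call_depth.

```python
import sys

def build_tuple(x, n):
    """Recursive stand-in for the loop: each iteration is one nested call."""
    if n == 0:
        return ()
    return build_tuple(x, n - 1) + (x,)

def try_build(n, limit):
    """Run build_tuple under the given recursion limit.

    Returns the tuple length on success, or None if the limit was exceeded.
    """
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return len(build_tuple("x", n))
    except RecursionError:
        return None
    finally:
        # Always restore the interpreter's previous limit.
        sys.setrecursionlimit(old_limit)

print(try_build(1000, 1000))   # limit too small for 1000 nested calls
print(try_build(1000, 20000))  # larger limit, analogous to max_call_depth=20000
```

With a limit of 1000 the first call returns None, because 1000 iterations need slightly more than 1000 frames; with a limit of 20000 the second call returns 1000.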

2 Solution

Given the cause identified above, the fix is straightforward: raise max_call_depth in context.set_context.

import numpy as np
import mindspore
import mindspore.nn as nn
import mindspore.ops as ops
from mindspore import Tensor, context

# Raise the call depth limit from its default of 1000 to 20000.
context.set_context(mode=context.GRAPH_MODE, max_call_depth=20000, device_target="Ascend")

class Net(nn.Cell):
    def __init__(self):
        super(Net, self).__init__()
        self.concat = ops.Concat()

    def construct(self, x):
        n = 1000
        input_x = ()
        # Build a tuple holding n copies of x.
        for i in range(n):
            input_x += (x,)
        output = self.concat(input_x)
        return output

net = Net()
input_x1 = Tensor(np.random.rand(1, 196, 80, 38), mindspore.float32)
output = net(input_x1)
print(f"Output shape: {output.shape}")

The script now runs successfully and prints:

Output shape: (1000, 196, 80, 38)
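The error message also warns that a larger max_call_depth may require a larger system stack. The original post does not report whether that was needed for max_call_depth=20000, so the following is only a minimal sketch for inspecting the current stack limit on Linux with Python's standard resource module (comparable to running ulimit -s in a shell). The helper name describe_stack_limit is hypothetical.

```python
import resource

def describe_stack_limit():
    """Return the soft and hard stack size limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        # RLIM_INFINITY means the limit is unbounded.
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return fmt(soft), fmt(hard)

soft, hard = describe_stack_limit()
print(f"stack soft limit: {soft}, hard limit: {hard}")
```

If compilation crashes with a stack overflow after raising max_call_depth, the soft limit reported here is one of the first things worth checking.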

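3 Summary

Steps for locating the problem:

1. Find the line of user code that raised the error: output = net(input_x1);

2. Use the keywords in the error log to narrow down the problem: Exceed function call depth limit 1000, (function call depth: 1001, simulate call depth: 999);

3. Follow the hint in the error message and change the default setting: call 'context.set_context(max_call_depth=value)' to adjust this value;

4. Pay particular attention to whether variables are defined and initialized correctly.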
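Step 2 above can be sketched as a small keyword search over the log. The example assumes the message wording shown in this post; other MindSpore versions may phrase it differently, and DEPTH_PATTERN and parse_depth_error are hypothetical names introduced for illustration.

```python
import re

# Pattern built from the message text quoted in this post.
DEPTH_PATTERN = re.compile(
    r"Exceed function call depth limit (\d+), "
    r"\(function call depth: (\d+), simulate call depth: (\d+)\)"
)

def parse_depth_error(message):
    """Return (limit, call_depth, simulate_depth), or None if the keywords are absent."""
    match = DEPTH_PATTERN.search(message)
    if match is None:
        return None
    limit, call_depth, simulate_depth = (int(g) for g in match.groups())
    return limit, call_depth, simulate_depth

message = (
    "RuntimeError: mindspore/ccsrc/pipeline/jit/static_analysis/evaluator.cc:100 "
    "EnterStackFrame] Exceed function call depth limit 1000, "
    "(function call depth: 1001, simulate call depth: 999)."
)
print(parse_depth_error(message))
```

Comparing the extracted limit with the reported call depth shows how far over the limit the script went, which helps when choosing a new max_call_depth value.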

4 References

4.1 API mapping
