Productionizing PyTorch with C++

Problems with a Python server

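Running multiple processes multiplies memory use. If a model takes 500 MB, 10 worker processes take about 5 GB. In a language with real multithreading such as C++, the workers can share one copy of the model, so memory stays at roughly 500 MB no matter how many workers run.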
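The arithmetic behind this comparison can be sketched as follows. The helper name `estimate_memory_mb` is hypothetical, and the sketch assumes each process holds a full private copy of the weights while threads share a single copy; per-worker overhead is ignored.

```python
def estimate_memory_mb(model_mb: float, workers: int, shared_weights: bool) -> float:
    """Rough memory estimate for serving one model with several workers.

    shared_weights=True models threads in one process (one copy of the weights);
    shared_weights=False models separate processes (one copy per worker).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    copies = 1 if shared_weights else workers
    return model_mb * copies


# Ten Python processes, each loading its own copy of a 500 MB model.
print(estimate_memory_mb(500, 10, shared_weights=False))
# Ten threads inside one C++ process sharing the same weights.
print(estimate_memory_mb(500, 10, shared_weights=True))
```

The first call gives the 5 GB figure from the text and the second gives 500 MB.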

In addition, serving the model from C++ gives a small performance improvement.

PyTorch C++ deployment options

The solution PyTorch originally offered was to convert the model to ONNX and run it with Caffe2.

ONNX

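  1. data[index] = new_data is not supported. Use data.scatter_ as a workaround.
  2. There is no concept of a Tensor List, so some functions cannot be exported, for example x.unbind(0).
  3. Dictionaries and strings are accepted, but using them is not recommended.
  4. Some operators behave differently in PyTorch and in ONNX backends (Caffe2, ONNX Runtime, etc.), so some networks can produce different results.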
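As a minimal sketch of the workaround in item 1, the snippet below writes the same values once with index assignment and once with `scatter_`. The tensor names and values are illustrative, and the 1-D case along dimension 0 is assumed; higher-dimensional cases need an index tensor shaped like the source.

```python
import torch

data = torch.zeros(5)
index = torch.tensor([1, 3])
new_data = torch.tensor([10.0, 20.0])

# Index assignment, the form listed above as unsupported by the exporter.
expected = data.clone()
expected[index] = new_data

# In-place scatter along dimension 0: result[index[i]] = new_data[i].
result = data.clone()
result.scatter_(0, index, new_data)

print(result)
```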

TorchScript

TorchScript is a way to create serializable and optimizable models from PyTorch code. Any TorchScript program can be saved from a Python process and loaded in a process where there is no Python dependency.

A model written in Python can be converted to TorchScript and then loaded from C++.
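The Python side of that workflow might look like the sketch below. `TinyModel` and the file name are hypothetical, and `torch.jit.script` is used here as one possible conversion route; the saved file is the kind of archive that `torch::jit::load` opens on the C++ side.

```python
import os
import tempfile

import torch


class TinyModel(torch.nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x))


model = TinyModel().eval()
scripted = torch.jit.script(model)

# Save the TorchScript archive to a temporary directory.
model_path = os.path.join(tempfile.mkdtemp(), "tiny_model.pt")
scripted.save(model_path)

# Reload in Python to check that the round trip preserves the outputs.
reloaded = torch.jit.load(model_path)
x = torch.rand(3, 4)
same_output = torch.allclose(model(x), reloaded(x))
print(same_output)
```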

There are two ways to convert a model to TorchScript: tracing and scripting.

PyTorch version >= 1.4

tracing

torch.jit.trace(func, example_inputs, optimize=None, check_trace=True, check_inputs=None, check_tolerance=1e-5)
  • func (callable or torch.nn.Module) – A Python function or torch.nn.Module that will be run with example_inputs. Arguments and return values of func must be tensors or (possibly nested) tuples that contain tensors. When a module is passed to torch.jit.trace, only the forward method is run and traced (see torch.jit.trace_module for details).

  • example_inputs (tuple) – A tuple of example inputs that will be passed to the function while tracing. The resulting trace can be run with inputs of different types and shapes assuming the traced operations support those types and shapes. example_inputs may also be a single Tensor in which case it is automatically wrapped in a tuple.
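A small sketch of the example_inputs behaviour described above: the function is traced with one shape and then run with another. `scale_and_shift` is a hypothetical function chosen because its elementwise operations accept any shape.

```python
import torch


def scale_and_shift(x):  # hypothetical function, for illustration only
    return x * 2 + 1


# A single example tensor is wrapped in a tuple automatically.
traced = torch.jit.trace(scale_and_shift, torch.rand(3, 4))

# The trace recorded only elementwise ops, so an input of another shape also works.
other = torch.ones(5, 2)
out = traced(other)
print(out.shape)
```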

import torch

class MyCell(torch.nn.Module):
    def __init__(self):
        super(MyCell, self).__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h

my_cell = MyCell()
x, h = torch.rand(3, 4), torch.rand(3, 4)
traced_cell = torch.jit.trace(my_cell, (x, h))
print(traced_cell)
traced_cell(x, h)

Output:

MyCell(
  original_name=MyCell
  (linear): Linear(original_name=Linear)
)

print(traced_cell.graph)

Output:

graph(%self.1 : __torch__.torch.nn.modules.module.___torch_mangle_1.Module,
      %input : Float(3, 4),
      %h : Float(3, 4)):
  %19 : __torch__.torch.nn.modules.module.Module = prim::GetAttr[name="linear"](%self.1)
  %21 : Tensor = prim::CallMethod[name="forward"](%19, %input)
  %12 : int = prim::Constant[value=1]() # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %13 : Float(3, 4) = aten::add(%21, %h, %12) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %14 : Float(3, 4) = aten::tanh(%13) # /var/lib/jenkins/workspace/beginner_source/Intro_to_TorchScript_tutorial.py:188:0
  %15 : (Float(3, 4), Float(3, 4)) = prim::TupleConstruct(%14, %14)
  return (%15)

print(traced_cell.code)

Output:

def forward(self,
    input: Tensor,
    h: Tensor) -> Tuple[Tensor, Tensor]:
  _0 = torch.add((self.linear).forward(input, ), h, alpha=1)
  _1 = torch.tanh(_0)
  return (_1, _1)

scripting

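Tracing works by recording the operations executed during one run of the model. When the model contains a branch, only the branch taken during that run is recorded, which is a problem.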

class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x

class MyCell(torch.nn.Module):
    def __init__(self, dg):
        super(MyCell, self).__init__()
        self.dg = dg
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.dg(self.linear(x)) + h)
        return new_h, new_h

my_cell = MyCell(MyDecisionGate())
traced_cell = torch.jit.trace(my_cell, (x, h))
print(traced_cell.code)

Output:

TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if x.sum() > 0:

def forward(self,
    input: Tensor,
    h: Tensor) -> Tuple[Tensor, Tensor]:
  _0 = (self.dg).forward((self.linear).forward(input, ), )
  _1 = torch.tanh(torch.add(_0, h, alpha=1))
  return (_1, _1)
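The effect of the lost branch can be checked directly. The sketch below reuses the gate from the example above inside a hypothetical wrapper function `gated_double`, traces it with a positive-sum input, and then feeds it a negative-sum input; the traced version replays the recorded branch instead of taking the other one.

```python
import torch


class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x


gate = MyDecisionGate()


def gated_double(x):  # hypothetical wrapper, for illustration only
    return gate(x * 2)


positive = torch.ones(2, 2)
negative = -torch.ones(2, 2)

# Tracing with a positive-sum input records only the `return x` branch
# (a TracerWarning is expected here).
traced = torch.jit.trace(gated_double, positive)

eager_out = gated_double(negative)  # takes the else branch and negates
traced_out = traced(negative)       # replays the recorded branch, no negation

print(torch.equal(eager_out, traced_out))
```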

For this case, use scripting, which essentially compiles the Python code into TorchScript.

scripted_gate = torch.jit.script(MyDecisionGate())

my_cell = MyCell(scripted_gate)
scripted_cell = torch.jit.script(my_cell)
print(scripted_cell.code)

Output:

def forward(self,
    x: Tensor,
    h: Tensor) -> Tuple[Tensor, Tensor]:
  _0 = (self.dg).forward((self.linear).forward(x, ), )
  new_h = torch.tanh(torch.add(_0, h, alpha=1))
  return (new_h, new_h)
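To see that scripting keeps both branches, the gate can be scripted on its own and called with inputs that exercise each path. This is a minimal check with illustrative input values, not part of the original example.

```python
import torch


class MyDecisionGate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:
            return x
        else:
            return -x


scripted_gate = torch.jit.script(MyDecisionGate())

positive = torch.ones(2, 2)
negative = -torch.ones(2, 2)

pos_out = scripted_gate(positive)  # sum > 0: returned unchanged
neg_out = scripted_gate(negative)  # sum <= 0: negated

# The generated code still contains the if/else.
print(scripted_gate.code)
```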

Loading TorchScript from C++

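#include <torch/script.h>

#include <iostream>
#include <string>
#include <vector>

void demo() {
    // Path to the TorchScript archive saved from Python.
    std::string model_fp = "/Users/panxu/MyProjects/icode/yq-offline/data/ts/script_model.zip";
    torch::jit::script::Module model = torch::jit::load(model_fp);

    // The model takes a list of string tokens as its single argument.
    std::vector<std::string> tokens;
    tokens.push_back("abc");
    tokens.push_back("def");

    // forward() expects its arguments packed as IValues.
    std::vector<torch::jit::IValue> param;
    param.push_back(tokens);

    auto result = model.forward(param);
    std::cout << result << std::endl;
}

int main() {
    demo();
    return 0;
}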

Dependencies:

The C++ program links against LibTorch, the C++ distribution of PyTorch.

Download: https://pytorch.org/

Demo Review

Caveats

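  1. Functions of the form f(*args, **kwargs) cannot be converted to TorchScript.
  2. Every Python function must carry strict type annotations, otherwise the conversion fails. Spell out the types everywhere in the converted code, including the dtype in torch.tensor(..., dtype=...).
  3. In this workflow, the object passed to scripting should be an instance of a torch.nn.Module subclass. At run time, only the standard Module interface can be called.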
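Item 2 can be illustrated with a small scripted function. `weighted_sum` is a hypothetical example; the point is that non-tensor arguments carry explicit annotations and that `torch.tensor` is given an explicit dtype.

```python
from typing import List

import torch


@torch.jit.script
def weighted_sum(x: torch.Tensor, weights: List[float], bias: float) -> torch.Tensor:
    # Non-tensor arguments need explicit annotations;
    # unannotated arguments are assumed to be Tensor.
    w = torch.tensor(weights, dtype=torch.float32)
    return (x * w).sum() + bias


result = weighted_sum(torch.ones(3), [1.0, 2.0, 3.0], 0.5)
print(result)
```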

Writing the model directly in C++

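TorchScript still runs into quite a few problems during debugging, especially because it supports only a subset of Python. The torch C++ API is concise, so rewriting the model directly in C++ is another option worth considering.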
