[TVM Series 5] Adding a Custom Relay Operator

I. Preface

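Using a custom operator named axis_abs as an example, this article shows how to add a new Relay operator to TVM. The operator takes a 3-D input tensor and applies the absolute value to one specified slice along a chosen dimension.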
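To make the intended semantics concrete, here is a minimal reference sketch in plain C++ that does not depend on TVM. The Tensor3D struct and the AxisAbsReference function are hypothetical names introduced only for illustration, and the sketch assumes that elements outside the selected slice are passed through unchanged.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense 3-D tensor stored in row-major order (not a TVM type).
struct Tensor3D {
  std::array<std::size_t, 3> shape;
  std::vector<float> data;

  float& at(std::size_t i, std::size_t j, std::size_t k) {
    return data[(i * shape[1] + j) * shape[2] + k];
  }
};

// Returns a copy of `input` in which every element whose coordinate along
// `axis` equals `indice` is replaced by its absolute value.
// Assumption: all other elements are copied through unchanged.
Tensor3D AxisAbsReference(Tensor3D input, int axis, int indice) {
  assert(0 <= axis && axis < 3);
  assert(0 <= indice && static_cast<std::size_t>(indice) < input.shape[axis]);
  for (std::size_t i = 0; i < input.shape[0]; ++i) {
    for (std::size_t j = 0; j < input.shape[1]; ++j) {
      for (std::size_t k = 0; k < input.shape[2]; ++k) {
        const std::size_t coord[3] = {i, j, k};
        if (coord[axis] == static_cast<std::size_t>(indice)) {
          input.at(i, j, k) = std::fabs(input.at(i, j, k));
        }
      }
    }
  }
  return input;
}
```

A reference like this is also handy later on as the expected-value oracle when writing tests for the real operator.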

II. Adding a Custom Operator

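Adding a Relay operator generally involves the following steps:

Define an attribute node (Attribute Node) for the new operator, declaring the fixed parameters that are known at compile time;
Write a type relation for the new operator so that it integrates with Relay's type system;
Use the C++ macro RELAY_REGISTER_OP to register the new operator and declare its number of arguments, their types, and descriptive hints;
Implement the operator's compute;
Register the operator's compute and schedule;
Define a C++ function that builds a call node for the new operator, and register a Python API hook for that function;
Wrap the Python API hook in a cleaner calling interface;
Write tests for the new Relay operator.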

1. Define the attribute node (Attribute Node) for the new operator

Add the operator's attribute data structure to include/tvm/relay/attrs/transform.h:

/*! \brief Attributes used in axisabs operator */
struct AxisAbsAttrs : public tvm::AttrsNode<AxisAbsAttrs> {
    int axis;
    int indice;

    TVM_DECLARE_ATTRS(AxisAbsAttrs, "relay.attrs.AxisAbsAttrs") {
        TVM_ATTR_FIELD(axis).set_default(0).describe("Axis to abs");
        TVM_ATTR_FIELD(indice).set_default(0).describe("Indice to abs");
    }
};

Q: What do the macros TVM_DECLARE_ATTRS and TVM_ATTR_FIELD do?
A: Both macros are defined in include/tvm/ir/attrs.h:

#define TVM_DECLARE_ATTRS(ClassName, TypeKey)                    \
  static constexpr const char* _type_key = TypeKey;              \
  TVM_DECLARE_FINAL_OBJECT_INFO(ClassName, ::tvm::BaseAttrsNode) \
  template <typename FVisit>                                     \
  void __VisitAttrs__(FVisit& __fvisit__)  // NOLINT(*)

#define TVM_ATTR_FIELD(FieldName) __fvisit__(#FieldName, &FieldName)

TVM_DECLARE_FINAL_OBJECT_INFO, used above, is defined in include/tvm/runtime/object.h:

#define TVM_DECLARE_FINAL_OBJECT_INFO(TypeName, ParentType) \
   static const constexpr bool _type_final = true;           \
   static const constexpr int _type_child_slots = 0;         \
   TVM_DECLARE_BASE_OBJECT_INFO(TypeName, ParentType)
  
#define TVM_DECLARE_BASE_OBJECT_INFO(TypeName, ParentType)                                     \
   static_assert(!ParentType::_type_final, "ParentObj marked as final");                        \
   static uint32_t RuntimeTypeIndex() {                                                         \
     static_assert(TypeName::_type_child_slots == 0 || ParentType::_type_child_slots == 0 ||    \
                      TypeName::_type_child_slots < ParentType::_type_child_slots,             \
                  "Need to set _type_child_slots when parent specifies it.");                  \
     if (TypeName::_type_index != ::tvm::runtime::TypeIndex::kDynamic) {                        \
       return TypeName::_type_index;                                                            \
     }                                                                                          \
     return _GetOrAllocRuntimeTypeIndex();                                                      \
   }                                                                                            \
  static uint32_t _GetOrAllocRuntimeTypeIndex() {                                              \
    static uint32_t tindex = Object::GetOrAllocRuntimeTypeIndex(                               \
        TypeName::_type_key, TypeName::_type_index, ParentType::_GetOrAllocRuntimeTypeIndex(), \
        TypeName::_type_child_slots, TypeName::_type_child_slots_can_overflow);                \
    return tindex;                                                                             \
  }

After macro expansion, the attribute node data structure therefore becomes:

struct AxisAbsAttrs : public tvm::AttrsNode<AxisAbsAttrs> {
    int axis;
    int indice;

    static constexpr const char* _type_key = "relay.attrs.AxisAbsAttrs";
    static const constexpr bool _type_final = true;
    static const constexpr int _type_child_slots = 0;

    static_assert(!::tvm::BaseAttrsNode::_type_final, "ParentObj marked as final");

    static uint32_t RuntimeTypeIndex() {
        static_assert(AxisAbsAttrs::_type_child_slots == 0 || ::tvm::BaseAttrsNode::_type_child_slots == 0 ||
                          AxisAbsAttrs::_type_child_slots < ::tvm::BaseAttrsNode::_type_child_slots,
                      "Need to set _type_child_slots when parent specifies it.");
        if (AxisAbsAttrs::_type_index != ::tvm::runtime::TypeIndex::kDynamic) {
            return AxisAbsAttrs::_type_index;
        }
        return _GetOrAllocRuntimeTypeIndex();
    }

    static uint32_t _GetOrAllocRuntimeTypeIndex() {
        static uint32_t tindex = Object::GetOrAllocRuntimeTypeIndex(
            AxisAbsAttrs::_type_key, AxisAbsAttrs::_type_index, ::tvm::BaseAttrsNode::_GetOrAllocRuntimeTypeIndex(),
            AxisAbsAttrs::_type_child_slots, AxisAbsAttrs::_type_child_slots_can_overflow);
        return tindex;
    }

    template <typename FVisit>
    void __VisitAttrs__(FVisit& __fvisit__) {
        // TVM_ATTR_FIELD stringifies the field name with #FieldName.
        __fvisit__("axis", &axis).set_default(0).describe("Axis to abs");
        __fvisit__("indice", &indice).set_default(0).describe("Indice to abs");
    }
};

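As shown, every attribute node defines RuntimeTypeIndex(), which returns its runtime type index, and the template function __VisitAttrs__(FVisit& __fvisit__), which visits the node's member fields.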
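The macro mechanics can be tried outside TVM. The following is a minimal sketch, not TVM's implementation: SKETCH_DECLARE_ATTRS, SKETCH_ATTR_FIELD, AxisAbsAttrsSketch and FieldRecorder are hypothetical names, and the runtime type index part is left out entirely. It only shows how the declaring macro opens a member function template and how #FieldName turns each field into a (name, pointer) pair handed to the visitor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for TVM_DECLARE_ATTRS / TVM_ATTR_FIELD (hypothetical).
// The first macro ends with an open function signature, so the braces that
// follow it in the struct become the function body.
#define SKETCH_DECLARE_ATTRS(TypeKey)               \
  static constexpr const char* type_key = TypeKey;  \
  template <typename FVisit>                        \
  void VisitFields(FVisit& fvisit)

// #FieldName turns the identifier into a string literal.
#define SKETCH_ATTR_FIELD(FieldName) fvisit(#FieldName, &FieldName)

struct AxisAbsAttrsSketch {
  int axis = 0;
  int indice = 0;

  SKETCH_DECLARE_ATTRS("sketch.attrs.AxisAbsAttrs") {
    SKETCH_ATTR_FIELD(axis);
    SKETCH_ATTR_FIELD(indice);
  }
};

// A visitor functor that records the name and current value of each field.
struct FieldRecorder {
  std::vector<std::string> names;
  std::vector<int> values;

  void operator()(const char* key, int* value) {
    assert(key != nullptr && value != nullptr);
    names.push_back(key);
    values.push_back(*value);
  }
};
```

Passing a FieldRecorder to VisitFields on an AxisAbsAttrsSketch object visits the fields in declaration order, which is the same idea TVM relies on to reflect over attribute fields.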

Q: How is the template function __VisitAttrs__(FVisit& __fvisit__) invoked?
A: Start with class AttrsNode, defined in include/tvm/ir/attrs.h:

template <typename DerivedType>
class AttrsNode : public BaseAttrsNode {
public:
  void VisitAttrs(AttrVisitor* v) {
    ::tvm::detail::AttrNormalVisitor vis(v);
    self()->__VisitAttrs__(vis);
  }
  void VisitNonDefaultAttrs(AttrVisitor* v) {...}
  void InitByPackedArgs(const runtime::TVMArgs& args, bool allow_unknown) final {...}
  bool SEqualReduce(const DerivedType* other, SEqualReducer equal) const {...}
  void SHashReduce(SHashReducer hash_reducer) const {...}
  Array<AttrFieldInfo> ListFieldInfo() const final {...}
private:
  DerivedType* self() const {
    return const_cast<DerivedType*>(static_cast<const DerivedType*>(this));
  }
};

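It is a class template whose template parameter is the type of the subclass that inherits from it (the curiously recurring template pattern, CRTP). Its member function VisitAttrs(AttrVisitor* v) receives an object of the attribute visitor class AttrVisitor: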
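The pattern of a base class parameterized by its own subclass can be sketched in a few lines. AttrsBaseSketch and AxisAbsAttrsCrtp below are hypothetical names, not TVM types; the point is only that self() lets the base class call a member that exists solely in the derived class, with the call resolved at compile time.

```cpp
#include <cassert>
#include <string>

// Base class parameterized by the type that derives from it (CRTP).
template <typename DerivedType>
class AttrsBaseSketch {
 public:
  // Calls Name(), which only the derived class defines. No virtual function
  // is involved; the target is known when the template is instantiated.
  std::string Describe() const { return "attrs:" + self()->Name(); }

 private:
  const DerivedType* self() const {
    return static_cast<const DerivedType*>(this);
  }
};

// The derived class passes itself as the template argument.
struct AxisAbsAttrsCrtp : public AttrsBaseSketch<AxisAbsAttrsCrtp> {
  std::string Name() const { return "AxisAbs"; }
};

// Usage check: the base-class method reaches the derived implementation.
inline void CheckCrtpDispatch() {
  AxisAbsAttrsCrtp attrs;
  assert(attrs.Describe() == "attrs:AxisAbs");
}
```

This is why AxisAbsAttrs has to name itself in its base class: without the template argument, the base class would have no way to reach the generated __VisitAttrs__ member.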

class AttrVisitor {
 public:
  //! \cond Doxygen_Suppress
  TVM_DLL virtual ~AttrVisitor() = default;
  TVM_DLL virtual void Visit(const char* key, double* value) = 0;
  TVM_DLL virtual void Visit(const char* key, int64_t* value) = 0;
  TVM_DLL virtual void Visit(const char* key, uint64_t* value) = 0;
  TVM_DLL virtual void Visit(const char* key, int* value) = 0;
  TVM_DLL virtual void Visit(const char* key, bool* value) = 0;
  TVM_DLL virtual void Visit(const char* key, std::string* value) = 0;
  TVM_DLL virtual void Visit(const char* key, void** value) = 0;
  TVM_DLL virtual void Visit(const char* key, DataType* value) = 0;
  TVM_DLL virtual void Visit(const char* key, runtime::NDArray* value) = 0;
  TVM_DLL virtual void Visit(const char* key, runtime::ObjectRef* value) = 0;
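  template <typename ENum, typename = typename std::enable_if<std::is_enum<ENum>::value>::type>
  void Visit(const char* key, ENum* ptr) {
    static_assert(std::is_same<int, typename std::underlying_type<ENum>::type>::value,
                  "declare enum to be enum int to use visitor");
    this->Visit(key, reinterpret_cast<int*>(ptr));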
  }
  //! \endcond
};

The visitor is then wrapped, via ::tvm::detail::AttrNormalVisitor vis(v);, in a functor for plain attribute visiting:

// Wrapper for normal visitor.
class AttrNormalVisitor {
public:
  explicit AttrNormalVisitor(AttrVisitor* visitor) : visitor_(visitor) {}
  template <typename T>
  AttrNopEntry operator()(const char* key, T* value) {
    visitor_->Visit(key, value);
    return AttrNopEntry();
  }

private:
  AttrVisitor* visitor_;
};

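It overloads operator(). When class AttrsNode obtains the subclass object through self() and calls self()->__VisitAttrs__(vis) on it, each TVM_ATTR_FIELD expands to a call such as __fvisit__("axis", &axis). That __fvisit__ call ends up in the overloaded operator() of class AttrNormalVisitor, which returns a struct that supports chained calls: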

// helper entry that does nothing in set_default/bound/describe calls.
struct AttrNopEntry {
  using TSelf = AttrNopEntry;
  TSelf& describe(DMLC_ATTRIBUTE_UNUSED const char* str) { return *this; }
  template <typename T>
  TSelf& set_default(DMLC_ATTRIBUTE_UNUSED const T& value) {return *this;}
  template <typename T>
  TSelf& set_lower_bound(DMLC_ATTRIBUTE_UNUSED const T& begin) {return *this;}
  template <typename T>
  TSelf& set_upper_bound(DMLC_ATTRIBUTE_UNUSED const T& end) {return *this;}
};

On this visiting path, the chained calls do nothing at all and simply return the entry itself.
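The whole call path can be imitated in a small standalone sketch. All names ending in Sketch, as well as RecordingVisitor and VisitAxisAbsFields, are hypothetical and only mirror the roles of AttrVisitor, AttrNormalVisitor, AttrNopEntry and the generated __VisitAttrs__ body; only the int overload is modeled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for AttrVisitor: one virtual overload per supported field type.
class VisitorSketch {
 public:
  virtual ~VisitorSketch() = default;
  virtual void Visit(const char* key, int* value) = 0;
};

// Entry returned for each field; every chained call is a no-op.
struct NopEntrySketch {
  NopEntrySketch& describe(const char*) { return *this; }
  template <typename T>
  NopEntrySketch& set_default(const T&) { return *this; }
};

// Functor wrapper: forwards (key, value) to the visitor, returns a nop entry.
class NormalVisitorSketch {
 public:
  explicit NormalVisitorSketch(VisitorSketch* visitor) : visitor_(visitor) {
    assert(visitor != nullptr);
  }
  template <typename T>
  NopEntrySketch operator()(const char* key, T* value) {
    visitor_->Visit(key, value);
    return NopEntrySketch();
  }

 private:
  VisitorSketch* visitor_;
};

// Concrete visitor that records "key=value" strings.
class RecordingVisitor : public VisitorSketch {
 public:
  std::vector<std::string> seen;
  void Visit(const char* key, int* value) override {
    seen.push_back(std::string(key) + "=" + std::to_string(*value));
  }
};

// Mimics the body that the field macros generate for axis and indice.
template <typename FVisit>
void VisitAxisAbsFields(FVisit& fvisit, int* axis, int* indice) {
  fvisit("axis", axis).set_default(0).describe("Axis to abs");
  fvisit("indice", indice).set_default(0).describe("Indice to abs");
}
```

Because set_default is a no-op here, visiting never overwrites the field values. In TVM the same field declarations are reused with other functor types, so the chained calls can carry meaning on other paths while costing nothing on this one.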

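2. Write the operator's type relation and integrate it with Relay's type system

To keep operator registration flexible and to let Relay operators generalize better, Relay operators are instantiated through type relations between their inputs and outputs. In essence, a type relation not only infers the output type but can also enforce typing rules (by checking the input types). Add the operator's type relation function to src/relay/op/tensor/transform.cc: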

bool AxisAbsRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
                const TypeReporter& reporter) {
    // types: [data, output]
    ICHECK_EQ(types.size(), 2);
    const auto* data = types[0].as<TensorTypeNode>();
    if (data == nullptr) {
      ICHECK(types[0].as<IncompleteTypeNode>())
          << "axis_abs: expect input type to be TensorType but get " << types[0];
      return false;
    }
    const auto* param = attrs.as<AxisAbsAttrs>();
    const int ndim = static_cast<int>(data->shape.size());
    const int axis = param->axis;

    // Validate axis before it is used to index data->shape.
    ICHECK(0 <= axis && axis < ndim)
      << "axis_abs only accepts `axis` in [0, data.ndim - 1]"
      << ", but got axis = " << axis << ", and data.ndim = " << ndim;

    // This check assumes the extent along `axis` is a compile-time constant.
    const auto* axis_dim = data->shape[axis].as<IntImmNode>();
    ICHECK(axis_dim != nullptr)
      << "axis_abs expects a constant extent along axis " << axis;
    const int axis_len = static_cast<int>(axis_dim->value);
    const int indice = param->indice;

    ICHECK(0 <= indice && indice < axis_len)
      << "axis_abs only accepts `indice` in [0, data[axis] - 1]"
      << ", but got indice = " << indice << ", and data[axis] = " << axis_len;

    // The output has the same shape and dtype as the input.
    reporter->Assign(types[1], TensorType(data->shape, data->dtype));
    return true;
}

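Q: When is the type relation function called?
A: It is attached to the operator through the chained add_type_rel() call when the Relay operator is registered; Relay's type inference then invokes it when it checks calls to that operator.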
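The register-now, call-later flow can be illustrated with a toy registry. This is only a sketch of the idea, not TVM's operator registry: OpEntrySketch, GetOrCreateOpSketch, IdentityRelSketch and InferOutputShape are hypothetical names, and shapes are reduced to plain integer vectors.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using ShapeSketch = std::vector<int>;
// Hypothetical relation signature: fills `out` from `in`, returns success.
using TypeRelSketch = std::function<bool(const ShapeSketch& in, ShapeSketch* out)>;

// One registry entry; each setter returns *this so calls can be chained.
struct OpEntrySketch {
  std::string name;
  int num_inputs = 0;
  std::string rel_name;
  TypeRelSketch rel;

  OpEntrySketch& set_num_inputs(int n) {
    num_inputs = n;
    return *this;
  }
  OpEntrySketch& add_type_rel(const std::string& rn, TypeRelSketch fn) {
    rel_name = rn;
    rel = std::move(fn);
    return *this;
  }
};

// Looks up an entry by operator name, creating it on first use.
inline OpEntrySketch& GetOrCreateOpSketch(const std::string& name) {
  static std::map<std::string, OpEntrySketch> registry;
  OpEntrySketch& entry = registry[name];
  entry.name = name;
  return entry;
}

// A relation whose output shape equals its input shape.
inline bool IdentityRelSketch(const ShapeSketch& in, ShapeSketch* out) {
  assert(out != nullptr);
  *out = in;
  return true;
}

// Stand-in for type inference: finds the operator and runs its relation.
inline bool InferOutputShape(const std::string& op, const ShapeSketch& in,
                             ShapeSketch* out) {
  OpEntrySketch& entry = GetOrCreateOpSketch(op);
  return entry.rel ? entry.rel(in, out) : false;
}
```

Registration stores the function and nothing else happens at that moment; the relation only runs when a later pass asks for the output type of a call to that operator.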

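Q: What does the input parameter types mean?
A: types is a reference to an array that normally holds the TensorType of each input followed by that of the output. First look at class TensorTypeNode: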

class TensorTypeNode : public BaseTensorTypeNode {
 public:

  Array<PrimExpr> shape;   // shape of the tensor
  DataType dtype;    // data type of the tensor's elements
  void VisitAttrs(tvm::AttrVisitor* v) {
    v->Visit("shape", &shape);
    v->Visit("dtype", &dtype);
    v->Visit("span", &span);
  }
  bool SEqualReduce(const TensorTypeNode* other, SEqualReducer equal) const {...}
  void SHashReduce(SHashReducer hash_reduce) const {...}
  TVM_DLL PrimExpr Size() const;
  static constexpr const char* _type_key = "relay.TensorType";
  TVM_DECLARE_FINAL_OBJECT_INFO(TensorTypeNode, BaseTensorTypeNode);
};

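It defines the basic information a tensor needs, such as its shape and data type, but holds no actual data, which is why the class is named TensorTypeNode. Through it, the type information of the input tensor can be read and used to validate the operator's parameters.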
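The validation that such type information enables can be sketched without TVM. TensorTypeSketch and AxisAbsRelSketch are hypothetical names; unlike the real relation, this version reports failure through its return value instead of aborting, and it assumes every dimension is a known constant.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for a tensor type: shape and dtype only, no data buffer.
struct TensorTypeSketch {
  std::vector<int64_t> shape;
  std::string dtype;
};

// Checks `axis` and `indice` against the input type. On success, writes the
// output type (same shape and dtype as the input) and returns true.
bool AxisAbsRelSketch(const TensorTypeSketch& data, int axis, int indice,
                      TensorTypeSketch* out) {
  assert(out != nullptr);
  const int ndim = static_cast<int>(data.shape.size());
  // axis must be validated before it is used to index the shape.
  if (axis < 0 || axis >= ndim) {
    return false;
  }
  const int64_t axis_len = data.shape[axis];
  if (indice < 0 || indice >= axis_len) {
    return false;
  }
  *out = data;
  return true;
}
```

Everything the check needs comes from the shape alone, so invalid axis or indice values can be rejected before any tensor data exists.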

Q: What does the input parameter reporter mean?
A: class TypeReporter is a container class for TypeReporterNode:

class TypeReporter : public ObjectRef {
 public:
  TypeReporter() {}
  explicit TypeReporter(ObjectPtr

