LLVM Study Notes (62)

4.4.3.3. The X86TargetLowering Subobject

At line 314 of the X86Subtarget constructor, the X86TargetLowering constructor is called next to build TLInfo, the subobject of that type inside X86Subtarget.

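This TargetLowering subclass is used by the SelectionDAG-based instruction selector to describe how LLVM code is lowered to SelectionDAG operations. Among other things, the class exposes:

  • an initial register class to use for each ValueType,
  • which operations the target machine supports natively,
  • the return type of setcc operations,
  • the type to use for shift amounts, and
  • various high-level characteristics, such as whether it is profitable to turn division by a constant into a multiplication sequence.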

4.4.3.3.1. TargetLowering

First, look at the constructor of the base class TargetLowering.

40       TargetLowering::TargetLowering(const TargetMachine &tm)

41         : TargetLoweringBase(tm) {}

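The TargetLoweringBase constructor is defined as follows. It provides baseline settings for all targets; each target can reset the relevant parameters in the constructor of its own TargetLowering subclass.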

532     TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) {

533       initActions();

534    

535       // Perform these initializations only once.

536       MaxStoresPerMemset = MaxStoresPerMemcpy = MaxStoresPerMemmove =

537       MaxLoadsPerMemcmp = 8;

538       MaxGluedStoresPerMemcpy = 0;

539       MaxStoresPerMemsetOptSize = MaxStoresPerMemcpyOptSize

540         = MaxStoresPerMemmoveOptSize = MaxLoadsPerMemcmpOptSize = 4;

541       UseUnderscoreSetJmp = false;

542       UseUnderscoreLongJmp = false;

543       HasMultipleConditionRegisters = false;

544       HasExtractBitsInsn = false;

545       JumpIsExpensive = JumpIsExpensiveOverride;

546       PredictableSelectIsExpensive = false;

547       EnableExtLdPromotion = false;

548       HasFloatingPointExceptions = true;

549       StackPointerRegisterToSaveRestore = 0;

550       BooleanContents = UndefinedBooleanContent;

551       BooleanFloatContents = UndefinedBooleanContent;

552       BooleanVectorContents = UndefinedBooleanContent;

553       SchedPreferenceInfo = Sched::ILP;

554       JumpBufSize = 0;

555       JumpBufAlignment = 0;

556       MinFunctionAlignment = 0;

557       PrefFunctionAlignment = 0;

558       PrefLoopAlignment = 0;

559       GatherAllAliasesMaxDepth = 18;

560       MinStackArgumentAlignment = 1;

561       // TODO: the default will be switched to 0 in the next commit, along

562      // with the Target-specific changes necessary.

563       MaxAtomicSizeInBitsSupported = 1024;

564    

565       MinCmpXchgSizeInBits = 0;

566       SupportsUnalignedAtomics = false;

567    

568       std::fill(std::begin(LibcallRoutineNames), std::end(LibcallRoutineNames), nullptr);

569    

570       InitLibcalls(TM.getTargetTriple());

571       InitCmpLibcallCCs(CmpLibcallCCs);

572     }

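Line 533 calls initActions() to initialize the various action tables. As the code below shows, these tables are OpActions, LoadExtActions, TruncStoreActions, IndexedModeActions, and CondCodeActions. They are all arrays whose elements encode LegalizeAction values. This enum states whether a given operation is legal for a target and, if it is not, what action should be taken to make it legal: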

43       namespace LegalizeActions {

44       enum LegalizeAction : std::uint8_t {

45         /// The operation is expected to be selectable directly by the target, and

46         /// no transformation is necessary.

47         Legal,

48      

49         /// The operation should be synthesized from multiple instructions acting on

50         /// a narrower scalar base-type. For example a 64-bit add might be

51         /// implemented in terms of 32-bit add-with-carry.

52         NarrowScalar,

53      

54         /// The operation should be implemented in terms of a wider scalar

55         /// base-type. For example a <2 x s8> add could be implemented as a <2

56         /// x s32> add (ignoring the high bits).

57         WidenScalar,

58      

59         /// The (vector) operation should be implemented by splitting it into

60         /// sub-vectors where the operation is legal. For example a <8 x s64> add

61         /// might be implemented as 4 separate <2 x s64> adds.

62         FewerElements,

63      

64         /// The (vector) operation should be implemented by widening the input

65         /// vector and ignoring the lanes added by doing so. For example <2 x i8> is

66         /// rarely legal, but you might perform an <8 x i8> and then only look at

67         /// the first two results.

68         MoreElements,

69      

70         /// The operation itself must be expressed in terms of simpler actions on

71         /// this target. E.g. a SREM replaced by an SDIV and subtraction.

72         Lower,

73      

74         /// The operation should be implemented as a call to some kind of runtime

75         /// support library. For example this usually happens on machines that don't

76         /// support floating-point operations natively.

77         Libcall,

78      

79         /// The target wants to do something special with this combination of

80         /// operand and type. A callback will be issued when it is needed.

81         Custom,

82      

83         /// This operation is completely unsupported on the target. A programming

84         /// error has occurred.

85         Unsupported,

86      

87         /// Sentinel value for when no action was found in the specified table.

88         NotFound,

89      

90         /// Fall back onto the old rules.

91         /// TODO: Remove this once we've migrated

92         UseLegacyRules,

93       };

94       } // end namespace LegalizeActions

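The arrays are therefore defined as follows. Note that the listing above appears to be the GlobalISel enum LegalizeActions::LegalizeAction; the SelectionDAG tables below and initActions() use TargetLoweringBase's own LegalizeAction enum (Legal, Promote, Expand, LibCall, Custom), which is why Expand shows up in the code that follows.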

  • LegalizeAction OpActions[MVT::LAST_VALUETYPE][ISD::BUILTIN_OP_END]

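For each operation and each value type, this keeps a LegalizeAction that tells instruction selection how to handle the operation. Most operations are Legal (that is, natively supported by the target), but operations that are not must be described. Note that operations on illegal value types are not considered here.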

  • uint16_t LoadExtActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]

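For each load-extension type and each value type, this keeps a LegalizeAction that tells instruction selection how to handle a load involving a given value type and its extended type. Each uint16_t element packs four 4-bit actions, one for each of the four load-extension types.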

  • LegalizeAction TruncStoreActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]

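For each pair of value types, this keeps a LegalizeAction that indicates whether a truncating store from the given value type to the given truncated type is legal.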

  • uint8_t IndexedModeActions[MVT::LAST_VALUETYPE][ISD::LAST_INDEXED_MODE]

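Here ISD::LAST_INDEXED_MODE is the number of memory-address indexed modes. For each indexed mode and each value type, this keeps a pair of LegalizeAction values that tell instruction selection how to handle the load and the store; the pair is packed into one uint8_t, with the load action in the high four bits and the store action in the low four bits. The first dimension is the value type being referenced. The second dimension is the indexed mode of the load or store.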

  • uint32_t CondCodeActions[ISD::SETCC_INVALID][(MVT::LAST_VALUETYPE + 7) / 8]

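Here ISD::SETCC_INVALID is the number of condition codes (it is the sentinel at the end of ISD::CondCode). For each condition code and each value type, this keeps a LegalizeAction that tells instruction selection how to handle that condition code. Each action takes 4 bits, so one uint32_t holds the actions for eight value types.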

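  • In addition, TargetDAGCombineArray is one more array. Its type is:

unsigned char TargetDAGCombineArray[(ISD::BUILTIN_OP_END+CHAR_BIT-1)/CHAR_BIT]

It is a bitmap with one bit per SelectionDAG opcode (ISD node type). A bit set to 1 means that the target wants its PerformDAGCombine() callback to be invoked for nodes with that opcode during DAG combining.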
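The packing used by CondCodeActions and the bitmap used by TargetDAGCombineArray can be sketched as follows. This is only an illustration under stated assumptions: the table sizes, the ToyAction enum, and the ToyTables struct are hypothetical stand-ins for MVT::LAST_VALUETYPE, ISD::BUILTIN_OP_END, and the real members, not LLVM's actual declarations.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-ins: the real tables are sized by ISD::SETCC_INVALID,
// MVT::LAST_VALUETYPE and ISD::BUILTIN_OP_END.
enum ToyAction : uint8_t { Legal = 0, Promote = 1, Expand = 2, LibCall = 3, Custom = 4 };
constexpr unsigned NumCondCodes = 4;
constexpr unsigned NumValueTypes = 20;
constexpr unsigned NumOpcodes = 40;

struct ToyTables {
  // Eight 4-bit actions fit into each 32-bit word.
  uint32_t CondCodeActions[NumCondCodes][(NumValueTypes + 7) / 8] = {};
  // One bit per opcode, rounded up to whole bytes.
  unsigned char CombineBits[(NumOpcodes + CHAR_BIT - 1) / CHAR_BIT] = {};

  void setCondCodeAction(unsigned CC, unsigned VT, ToyAction A) {
    assert(CC < NumCondCodes && VT < NumValueTypes && "index out of range");
    unsigned Shift = 4 * (VT % 8);            // position of the nibble in the word
    uint32_t &Word = CondCodeActions[CC][VT / 8];
    Word &= ~(uint32_t(0xF) << Shift);        // clear the old nibble
    Word |= uint32_t(A) << Shift;             // store the new action
  }

  ToyAction getCondCodeAction(unsigned CC, unsigned VT) const {
    assert(CC < NumCondCodes && VT < NumValueTypes && "index out of range");
    unsigned Shift = 4 * (VT % 8);
    return ToyAction((CondCodeActions[CC][VT / 8] >> Shift) & 0xF);
  }

  void setTargetDAGCombine(unsigned Opc) {
    assert(Opc < NumOpcodes && "opcode out of range");
    CombineBits[Opc / CHAR_BIT] |= 1u << (Opc % CHAR_BIT);
  }

  bool hasTargetDAGCombine(unsigned Opc) const {
    assert(Opc < NumOpcodes && "opcode out of range");
    return (CombineBits[Opc / CHAR_BIT] & (1u << (Opc % CHAR_BIT))) != 0;
  }
};
```

With this layout, setting the action for value type 9 touches only the second nibble of word 1, so neighbouring value types keep their zero (Legal) default.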

574     void TargetLoweringBase::initActions() {

575       // All operations default to being supported.

576       memset(OpActions, 0, sizeof(OpActions));

577       memset(LoadExtActions, 0, sizeof(LoadExtActions));

578       memset(TruncStoreActions, 0, sizeof(TruncStoreActions));

579       memset(IndexedModeActions, 0, sizeof(IndexedModeActions));

580       memset(CondCodeActions, 0, sizeof(CondCodeActions));

581       std::fill(std::begin(RegClassForVT), std::end(RegClassForVT), nullptr);

582       std::fill(std::begin(TargetDAGCombineArray),

583                 std::end(TargetDAGCombineArray), 0);

584    

585       // Set default actions for various operations.

586       for (MVT VT : MVT::all_valuetypes()) {

587         // Default all indexed load / store to expand.

588         for (unsigned IM = (unsigned)ISD::PRE_INC;

589              IM != (unsigned)ISD::LAST_INDEXED_MODE; ++IM) {

590           setIndexedLoadAction(IM, VT, Expand);

591           setIndexedStoreAction(IM, VT, Expand);

592         }

593    

594         // Most backends expect to see the node which just returns the value loaded.

595        setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, VT, Expand);

596    

597         // These operations default to expand.

598         setOperationAction(ISD::FGETSIGN, VT, Expand);

599         setOperationAction(ISD::CONCAT_VECTORS, VT, Expand);

600         setOperationAction(ISD::FMINNUM, VT, Expand);

601         setOperationAction(ISD::FMAXNUM, VT, Expand);

602         setOperationAction(ISD::FMINNAN, VT, Expand);

603         setOperationAction(ISD::FMAXNAN, VT, Expand);

604         setOperationAction(ISD::FMAD, VT, Expand);

605         setOperationAction(ISD::SMIN, VT, Expand);

606         setOperationAction(ISD::SMAX, VT, Expand);

607         setOperationAction(ISD::UMIN, VT, Expand);

608         setOperationAction(ISD::UMAX, VT, Expand);

609         setOperationAction(ISD::ABS, VT, Expand);

610    

611         // Overflow operations default to expand

612         setOperationAction(ISD::SADDO, VT, Expand);

613         setOperationAction(ISD::SSUBO, VT, Expand);

614         setOperationAction(ISD::UADDO, VT, Expand);

615         setOperationAction(ISD::USUBO, VT, Expand);

616         setOperationAction(ISD::SMULO, VT, Expand);

617         setOperationAction(ISD::UMULO, VT, Expand);

618    

619         // ADDCARRY operations default to expand

620         setOperationAction(ISD::ADDCARRY, VT, Expand);

621         setOperationAction(ISD::SUBCARRY, VT, Expand);

622         setOperationAction(ISD::SETCCCARRY, VT, Expand);

623    

624         // ADDC/ADDE/SUBC/SUBE default to expand.

625         setOperationAction(ISD::ADDC, VT, Expand);

626         setOperationAction(ISD::ADDE, VT, Expand);

627         setOperationAction(ISD::SUBC, VT, Expand);

628         setOperationAction(ISD::SUBE, VT, Expand);

629    

630         // These default to Expand so they will be expanded to CTLZ/CTTZ by default.

631         setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Expand);

632         setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Expand);

633    

634         setOperationAction(ISD::BITREVERSE, VT, Expand);

635    

636         // These library functions default to expand.

637         setOperationAction(ISD::FROUND, VT, Expand);

638         setOperationAction(ISD::FPOWI, VT, Expand);

639    

640         // These operations default to expand for vector types.

641         if (VT.isVector()) {

642           setOperationAction(ISD::FCOPYSIGN, VT, Expand);

643           setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);

644           setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);

645           setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);

646         }

647    

648         // For most targets @llvm.get.dynamic.area.offset just returns 0.

649         setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);

650       }

651    

652       // Most targets ignore the @llvm.prefetch intrinsic.

653       setOperationAction(ISD::PREFETCH, MVT::Other, Expand);

654    

655      // Most targets also ignore the @llvm.readcyclecounter intrinsic.

656       setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);

657    

658       // ConstantFP nodes default to expand.  Targets can either change this to

659       // Legal, in which case all fp constants are legal, or use isFPImmLegal()

660       // to optimize expansions for certain constants.

661       setOperationAction(ISD::ConstantFP, MVT::f16, Expand);

662       setOperationAction(ISD::ConstantFP, MVT::f32, Expand);

663       setOperationAction(ISD::ConstantFP, MVT::f64, Expand);

664       setOperationAction(ISD::ConstantFP, MVT::f80, Expand);

665       setOperationAction(ISD::ConstantFP, MVT::f128, Expand);

666    

667       // These library functions default to expand.

668       for (MVT VT : {MVT::f32, MVT::f64, MVT::f128}) {

669         setOperationAction(ISD::FLOG ,      VT, Expand);

670         setOperationAction(ISD::FLOG2,      VT, Expand);

671         setOperationAction(ISD::FLOG10,     VT, Expand);

672         setOperationAction(ISD::FEXP ,      VT, Expand);

673         setOperationAction(ISD::FEXP2,      VT, Expand);

674         setOperationAction(ISD::FFLOOR,     VT, Expand);

675         setOperationAction(ISD::FNEARBYINT, VT, Expand);

676         setOperationAction(ISD::FCEIL,      VT, Expand);

677         setOperationAction(ISD::FRINT,      VT, Expand);

678         setOperationAction(ISD::FTRUNC,     VT, Expand);

679         setOperationAction(ISD::FROUND,     VT, Expand);

680       }

681    

682       // Default ISD::TRAP to expand (which turns it into abort).

683       setOperationAction(ISD::TRAP, MVT::Other, Expand);

684    

685       // On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"

686       // here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.

687       setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);

688     }

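Lines 576~580 zero out all of these containers, which means that every action starts out as Legal and that no operation requires the PerformDAGCombine callback. The code that follows then sets individual operations to Expand. As shown later, the X86 subclass overrides these settings further.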
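The "zero means Legal, then override" pattern can be sketched in a few lines. This is a minimal illustration, assuming an action enum whose first enumerator, Legal, has the value 0; the opcodes, value types, and class names are hypothetical, and the Custom override in the derived class is invented purely to show the mechanism.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

enum ToyAction : uint8_t { Legal = 0, Promote, Expand, LibCall, Custom };
enum ToyOpcode { OP_ADD, OP_SMIN, OP_CTLZ_ZERO_UNDEF, NUM_OPCODES };
enum ToyVT { VT_i32, VT_i64, NUM_VTS };

class ToyLoweringBase {
public:
  ToyLoweringBase() {
    // Zero-fill: every (type, opcode) pair starts out as Legal (value 0).
    std::memset(OpActions, 0, sizeof(OpActions));
    // Baseline defaults shared by all targets.
    for (int VT = 0; VT < NUM_VTS; ++VT) {
      setOperationAction(OP_SMIN, ToyVT(VT), Expand);
      setOperationAction(OP_CTLZ_ZERO_UNDEF, ToyVT(VT), Expand);
    }
  }

  ToyAction getOperationAction(ToyOpcode Op, ToyVT VT) const {
    return OpActions[VT][Op];
  }

protected:
  void setOperationAction(ToyOpcode Op, ToyVT VT, ToyAction A) {
    assert(Op < NUM_OPCODES && VT < NUM_VTS && "index out of range");
    OpActions[VT][Op] = A;
  }

private:
  ToyAction OpActions[NUM_VTS][NUM_OPCODES];
};

// A target-specific subclass refines the baseline in its own constructor.
class ToyX86Lowering : public ToyLoweringBase {
public:
  ToyX86Lowering() {
    // Hypothetical override, chosen only for illustration.
    setOperationAction(OP_SMIN, VT_i32, Custom);
  }
};
```

Because the base constructor runs first, the derived constructor always sees the baseline already in place and only needs to state where the target differs.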

After initActions() returns, the TargetLoweringBase constructor goes on to initialize the following parameter members.

  • MaxStoresPerMemset
  • MaxLoadsPerMemcmp
  • MaxStoresPerMemcpy
  • MaxStoresPerMemmove

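When lowering @llvm.memset, @llvm.memcpy, or @llvm.memmove, these fields give the maximum number of store operations that may be used to replace a call to memset, memcpy, or memmove (MaxLoadsPerMemcmp is the analogous limit on loads when expanding memcmp). Targets must set the value according to a cost threshold. It should be assumed that the target will first use as many of the largest store operations as the alignment restrictions allow, and then smaller operations if needed. For example, storing 9 bytes on a 32-bit machine with 16-bit alignment results in four 2-byte stores and one 1-byte store. This applies only to setting a constant array of constant size.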
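The 9-byte example can be reproduced with a small greedy sketch. It is not LLVM's actual lowering algorithm, which considers more factors; planStores and shouldInlineMemset are hypothetical helpers, and MaxStoreBytes stands for the widest store that the alignment and the machine word permit.

```cpp
#include <cassert>
#include <vector>

// Greedy plan: use the widest permitted store as often as possible, then
// fall back to narrower stores for the tail. MaxStoreBytes must be a
// power of two (assumption of this sketch).
std::vector<unsigned> planStores(unsigned SizeBytes, unsigned MaxStoreBytes) {
  assert(MaxStoreBytes != 0 && (MaxStoreBytes & (MaxStoreBytes - 1)) == 0 &&
         "store width must be a power of two");
  std::vector<unsigned> Stores;
  unsigned Width = MaxStoreBytes;
  while (SizeBytes != 0) {
    while (Width > SizeBytes)
      Width /= 2;                 // shrink until the store fits the remainder
    Stores.push_back(Width);
    SizeBytes -= Width;
  }
  return Stores;
}

// Inline the memset only if the plan stays within the target's limit,
// which plays the role of MaxStoresPerMemset.
bool shouldInlineMemset(unsigned SizeBytes, unsigned MaxStoreBytes,
                        unsigned MaxStores) {
  return planStores(SizeBytes, MaxStoreBytes).size() <= MaxStores;
}
```

For 9 bytes with 2-byte stores, the plan is 2, 2, 2, 2, 1: five stores, which is within the default limit of 8 set in the constructor. A 40-byte memset under the same alignment would need 20 stores and would be left as a call.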

  • MaxStoresPerMemcpyOptSize
  • MaxLoadsPerMemcmpOptSize
  • MaxStoresPerMemmoveOptSize

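The corresponding limits on the number of stores (or loads, for memcmp) used to replace a memcpy, memmove, or memcmp call in functions that carry the OptSize attribute. MaxStoresPerMemsetOptSize, which is set in the same statement of the constructor, plays the same role for memset.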

  • UseUnderscoreSetJmp,UseUnderscoreLongJmp

Indicate whether _setjmp or _longjmp should be used to implement llvm.setjmp or llvm.longjmp.

  • MaxGluedStoresPerMemcpy

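When memcpy is inlined based on MaxStoresPerMemcpy, this gives the maximum number of store instructions to keep together. Doing so helps later pairing and vectorization.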

  • HasMultipleConditionRegisters

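Tells the code generator whether the target has multiple (allocatable) condition registers that can hold comparison results. If it does, the code generator will not aggressively sink comparisons into the basic blocks of their users.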

  • HasExtractBitsInsn

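Tells the code generator whether the target has BitExtract instructions. If it does, the code generator aggressively sinks a shift into the basic block of its user when that user would generate an and instruction that can be combined with the shift into a BitExtract instruction.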

  • JumpIsExpensive

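Tells the code generator not to generate extra flow-control instructions and to try instead to combine flow-control instructions via predication. As line 545 shows, the default is taken from JumpIsExpensiveOverride.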

  • PredictableSelectIsExpensive

Tells the code generator that a select is more expensive than a branch when the branch is usually predicted correctly.

  • EnableExtLdPromotion

Indicates whether the target wants the optimization that turns ext(promotableInst1(...(promotableInstN(load)))) into promotedInst1(...(promotedInstN(ext(load)))).

  • HasFloatingPointExceptions

Indicates whether the target supports, or cares about preserving, floating-point exception behavior.

  • StackPointerRegisterToSaveRestore

If set to a physical register, this specifies the register that llvm.stacksave and llvm.stackrestore should save and restore.

  • BooleanContents
  • BooleanFloatContents
  • BooleanVectorContents

All three are of the enum type BooleanContent, which is defined as follows:

140       enum BooleanContent {

141         UndefinedBooleanContent,    // Only bit 0 counts, the rest can hold garbage.

142         ZeroOrOneBooleanContent,        // All bits zero except for bit 0.

143         ZeroOrNegativeOneBooleanContent // All bits equal to bit 0.

144       };

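They describe the contents of the high bits of a boolean value held in a type wider than i1, for integer, floating-point, and vector comparisons respectively.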
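The three conventions can be made concrete with a short sketch that widens a boolean to 32 bits and reads it back. The enum and both helpers are hypothetical names, not LLVM code. For the undefined case the high bits may hold anything; this sketch happens to leave them zero when producing a value, but the reader must not rely on that.

```cpp
#include <cassert>
#include <cstdint>

enum ToyBooleanContent { Undefined, ZeroOrOne, ZeroOrNegativeOne };

// Produce the 32-bit representation of a boolean under a given convention.
uint32_t widenBool(bool B, ToyBooleanContent C) {
  switch (C) {
  case ZeroOrNegativeOne:
    return B ? 0xFFFFFFFFu : 0u;   // all bits equal to bit 0
  case ZeroOrOne:
  case Undefined:
    return B ? 1u : 0u;            // bit 0 carries the value
  }
  return 0u;
}

// Read a boolean back. Under Undefined only bit 0 may be trusted, so the
// high bits must be masked off; under the other two conventions a simple
// test against zero is enough.
bool readBool(uint32_t V, ToyBooleanContent C) {
  switch (C) {
  case Undefined:
    return (V & 1u) != 0;
  case ZeroOrOne:
  case ZeroOrNegativeOne:
    return V != 0;
  }
  return false;
}
```

For example, readBool(0xFFFFFFFEu, Undefined) is false because bit 0 is clear, even though the word as a whole is nonzero.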

  • SchedPreferenceInfo

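Indicates the target's scheduling preference, usually with the goal of either the shortest total cycle count or the lowest register pressure. Its type is Sched::Preference, an enum that lists the scheduling preferences LLVM currently supports.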

95           enum Preference {

96             None,             // No preference

97             Source,           // Follow source order.

98             RegPressure,      // Scheduling for lowest register pressure.

99             Hybrid,           // Scheduling for both latency and register pressure.

100           ILP,              // Scheduling for ILP in low register pressure mode.

101           VLIW              // Scheduling for VLIW targets.

102         };

103       }

  • JumpBufSize
  • JumpBufAlignment

The size in bytes of the target's jmp_buf buffer, and its alignment requirement.

  • MinFunctionAlignment
  • PrefFunctionAlignment
  • PrefLoopAlignment

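Respectively: the minimum function alignment (used when optimizing for size, and to keep an explicitly provided alignment from producing incorrect code), the preferred function alignment (used when no alignment is specified and the code is optimized for speed), and the preferred loop alignment.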

  • MinStackArgumentAlignment

The minimum alignment required of any argument on the stack.

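The container LibcallRoutineNames, used at line 568, is defined as const char *LibcallRoutineNames[RTLIB::UNKNOWN_LIBCALL], where RTLIB::UNKNOWN_LIBCALL is the number of runtime library calls that a backend can emit. These library functions are described by the enum RTLIB::Libcall. The table is filled in from a .def file by the following method: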

118     void TargetLoweringBase::InitLibcalls(const Triple &TT) {

119     #define HANDLE_LIBCALL(code, name) \

120       setLibcallName(RTLIB::code, name);

121     #include "llvm/IR/RuntimeLibcalls.def"

122     #undef HANDLE_LIBCALL

123       // Initialize calling conventions to their default.

124       for (int LC = 0; LC < RTLIB::UNKNOWN_LIBCALL; ++LC)

125         setLibcallCallingConv((RTLIB::Libcall)LC, CallingConv::C);

126    

127       // A few names are different on particular architectures or environments.

128       if (TT.isOSDarwin()) {

129         // For f16/f32 conversions, Darwin uses the standard naming scheme, instead

130         // of the gnueabi-style __gnu_*_ieee.

131         // FIXME: What about other targets?

132         setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");

133         setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");

134    

135         // Some darwins have an optimized __bzero/bzero function.

136         switch (TT.getArch()) {

137         case Triple::x86:

138         case Triple::x86_64:

139           if (TT.isMacOSX() && !TT.isMacOSXVersionLT(10, 6))

140             setLibcallName(RTLIB::BZERO, "__bzero");

141           break;

142         case Triple::aarch64:

143           setLibcallName(RTLIB::BZERO, "bzero");

144           break;

145         default:

146           break;

147         }

148    

149         if (darwinHasSinCos(TT)) {

150           setLibcallName(RTLIB::SINCOS_STRET_F32, "__sincosf_stret");

151           setLibcallName(RTLIB::SINCOS_STRET_F64, "__sincos_stret");

152           if (TT.isWatchABI()) {

153             setLibcallCallingConv(RTLIB::SINCOS_STRET_F32,

154                                   CallingConv::ARM_AAPCS_VFP);

155             setLibcallCallingConv(RTLIB::SINCOS_STRET_F64,

156                                   CallingConv::ARM_AAPCS_VFP);

157           }

158         }

159       } else {

160         setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");

161         setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");

162       }

163    

164       if (TT.isGNUEnvironment() || TT.isOSFuchsia()) {

165         setLibcallName(RTLIB::SINCOS_F32, "sincosf");

166         setLibcallName(RTLIB::SINCOS_F64, "sincos");

167         setLibcallName(RTLIB::SINCOS_F80, "sincosl");

168         setLibcallName(RTLIB::SINCOS_F128, "sincosl");

169         setLibcallName(RTLIB::SINCOS_PPCF128, "sincosl");

170       }

171    

172       if (TT.isOSOpenBSD()) {

173         setLibcallName(RTLIB::STACKPROTECTOR_CHECK_FAIL, nullptr);

174       }

175     }

The file RuntimeLibcalls.def defines the runtime library calls that a backend can generate. Its entries look like this:

HANDLE_LIBCALL(SHL_I16, "__ashlhi3")

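The macro defined at the start of InitLibcalls() turns the code in each entry into an enumerator such as RTLIB::SHL_I16. These enumerators are themselves generated from the contents of RuntimeLibcalls.def (in RuntimeLibcalls.h). setLibcallName() therefore records the function name of each libcall, using the enumerator as the index.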

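setLibcallCallingConv() likewise records, indexed by the same enumerators, the calling convention that each function uses. The default for all of them is CallingConv::C, the LLVM default calling convention, which is compatible with the C calling convention.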
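The way one list of entries produces both the enum and the name table is the X-macro technique. A minimal self-contained sketch follows. Because a snippet cannot include a separate .def file, the list is written as a macro TOY_LIBCALLS, which is a hypothetical stand-in for RuntimeLibcalls.def; ToyRTLIB and ToyLowering are likewise illustrative names.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for RuntimeLibcalls.def: one entry per libcall.
#define TOY_LIBCALLS(X)            \
  X(SHL_I16, "__ashlhi3")          \
  X(SHL_I32, "__ashlsi3")          \
  X(MEMCPY, "memcpy")

namespace ToyRTLIB {
// First expansion: each entry contributes one enumerator.
enum Libcall {
#define HANDLE_LIBCALL(code, name) code,
  TOY_LIBCALLS(HANDLE_LIBCALL)
#undef HANDLE_LIBCALL
  UNKNOWN_LIBCALL // number of entries, used to size the tables
};
} // namespace ToyRTLIB

struct ToyLowering {
  const char *LibcallRoutineNames[ToyRTLIB::UNKNOWN_LIBCALL];

  void setLibcallName(ToyRTLIB::Libcall LC, const char *Name) {
    LibcallRoutineNames[LC] = Name;
  }
  const char *getLibcallName(ToyRTLIB::Libcall LC) const {
    return LibcallRoutineNames[LC];
  }

  ToyLowering() {
    // Second expansion: each entry becomes one setLibcallName() call.
#define HANDLE_LIBCALL(code, name) setLibcallName(ToyRTLIB::code, name);
    TOY_LIBCALLS(HANDLE_LIBCALL)
#undef HANDLE_LIBCALL
  }
};
```

Keeping the list in one place means that adding an entry updates the enum and the table together, so the two cannot drift apart.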

CmpLibcallCCs, used at line 571, is defined as ISD::CondCode CmpLibcallCCs[RTLIB::UNKNOWN_LIBCALL]. InitCmpLibcallCCs() uses it to associate each comparison routine in RTLIB::Libcall with the ISD::CondCode value that reflects its boolean result.
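The role of such a table can be sketched as follows, assuming comparison routines that return an integer whose relation to zero encodes the answer. Everything here is hypothetical: the enums are cut-down stand-ins, a single routine toyCompare replaces the separate per-predicate library functions, and NaN handling is ignored.

```cpp
#include <cassert>

enum ToyCondCode { SETEQ, SETNE, SETLT, SETGT, SETCC_INVALID };
enum ToyLibcall { OEQ_F32, UNE_F32, OLT_F32, OGT_F32, MEMCPY, UNKNOWN_LIBCALL };

// Hypothetical soft-float style comparison: negative, zero, or positive.
int toyCompare(float A, float B) { return (A > B) - (A < B); }

struct ToyCmpTable {
  ToyCondCode CmpLibcallCCs[UNKNOWN_LIBCALL];

  ToyCmpTable() {
    // Libcalls that are not comparisons keep the invalid marker.
    for (ToyCondCode &CC : CmpLibcallCCs)
      CC = SETCC_INVALID;
    CmpLibcallCCs[OEQ_F32] = SETEQ;
    CmpLibcallCCs[UNE_F32] = SETNE;
    CmpLibcallCCs[OLT_F32] = SETLT;
    CmpLibcallCCs[OGT_F32] = SETGT;
  }

  bool isComparison(ToyLibcall LC) const {
    return CmpLibcallCCs[LC] != SETCC_INVALID;
  }

  // Convert the integer result of the call into a boolean by comparing it
  // against zero with the condition code recorded for that libcall.
  bool evaluate(ToyLibcall LC, float A, float B) const {
    assert(isComparison(LC) && "libcall is not a comparison");
    int R = toyCompare(A, B);
    switch (CmpLibcallCCs[LC]) {
    case SETEQ: return R == 0;
    case SETNE: return R != 0;
    case SETLT: return R < 0;
    case SETGT: return R > 0;
    case SETCC_INVALID: break;
    }
    return false;
  }
};
```

The table lets the code that emits the call stay generic: it calls the routine, then emits a single integer compare against zero using whatever condition code the table provides.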

 
