LLVM Study Notes (61)

4.4.3.2. The X86InstrInfo Subobject

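The InstrInfo initialized at line 315 of the X86Subtarget constructor is a member of type X86InstrInfo, so the constructor shown below is invoked. It is a long function, so we will read it in pieces. First, consider where the argument of the X86InstrInfo constructor comes from. X86Subtarget::initializeSubtargetDependencies() initializes the options that X86Subtarget depends on and returns the instance itself.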

294     X86Subtarget &X86Subtarget::initializeSubtargetDependencies(StringRef CPU,
295                                                                 StringRef FS) {
296       initSubtargetFeatures(CPU, FS);
297       return *this;
298     }

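The arguments CPU and FS passed to initSubtargetFeatures() at line 296 are strings describing the CPU and its features. They come from the command-line options -mcpu and -mattr, respectively. These descriptions should be as detailed as possible, because ParseSubtargetFeatures() at line 242 (generated by TableGen's -gen-subtarget option) uses them to determine the values of the parameters above. The default values of these parameters still produce correct code, but they rule out many optimizations.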

218     void X86Subtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
219       std::string CPUName = CPU;
220       if (CPUName.empty())
221         CPUName = "generic";
222
223       // Make sure 64-bit features are available in 64-bit mode. (But make sure
224       // SSE2 can be turned off explicitly.)
225       std::string FullFS = FS;
226       if (In64BitMode) {
227         if (!FullFS.empty())
228           FullFS = "+64bit,+sse2," + FullFS;
229         else
230           FullFS = "+64bit,+sse2";
231       }
232
233       // LAHF/SAHF are always supported in non-64-bit mode.
234       if (!In64BitMode) {
235         if (!FullFS.empty())
236           FullFS = "+sahf," + FullFS;
237         else
238           FullFS = "+sahf";
239       }
240
241       // If feature string is not empty, parse features string.
242       ParseSubtargetFeatures(CPUName, FullFS);
243
244       // All CPUs that implement SSE4.2 or SSE4A support unaligned accesses of
245       // 16-bytes and under that are reasonably fast. These features were
246       // introduced with Intel's Nehalem/Silvermont and AMD's Family10h
247       // micro-architectures respectively.
248       if (hasSSE42() || hasSSE4A())
249         IsUAMem16Slow = false;
250
251       // It's important to keep the MCSubtargetInfo feature bits in sync with
252       // target data structure which is shared with MC code emitter, etc.
253       if (In64BitMode)
254         ToggleFeature(X86::Mode64Bit);
255       else if (In32BitMode)
256         ToggleFeature(X86::Mode32Bit);
257       else if (In16BitMode)
258         ToggleFeature(X86::Mode16Bit);
259       else
260         llvm_unreachable("Not 16-bit, 32-bit or 64-bit mode!");
261
262       DEBUG(dbgs() << "Subtarget features: SSELevel " << X86SSELevel
263                    << ", 3DNowLevel " << X863DNowLevel
264                    << ", 64bit " << HasX86_64 << "\n");
265       assert((!In64BitMode || HasX86_64) &&
266              "64-bit code requested on a subtarget that doesn't support it!");
267
268       // Stack alignment is 16 bytes on Darwin, Linux and Solaris (both
269       // 32 and 64 bit) and for all 64-bit targets.
270       if (StackAlignOverride)
271         stackAlignment = StackAlignOverride;
272       else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||
273                isTargetKFreeBSD() || In64BitMode)
274         stackAlignment = 16;
275
276       // Some CPUs have more overhead for gather. The specified overhead is relative
277       // to the Load operation. "2" is the number provided by Intel architects. This
278       // parameter is used for cost estimation of Gather Op and comparison with
279       // other alternatives.
280       // TODO: Remove the explicit hasAVX512()?, That would mean we would only
281       // enable gather with a -march.
282       if (hasAVX512() || (hasAVX2() && hasFastGather()))
283         GatherOverhead = 2;
284       if (hasAVX512())
285         ScatterOverhead = 2;
286
287       // Consume the vector width attribute or apply any target specific limit.
288       if (PreferVectorWidthOverride)
289         PreferVectorWidth = PreferVectorWidthOverride;
290       else if (Prefer256Bit)
291         PreferVectorWidth = 256;
292     }
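The way lines 225 to 239 assemble the full feature string can be sketched in isolation. The helper name buildFullFeatureString below is hypothetical and not part of LLVM; it only assumes, as the comment at lines 223 and 224 suggests, that entries appearing later in the string can override the implied ones placed at the front.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an LLVM API): mirrors how lines 225-239 of
// initSubtargetFeatures() prepend the implied features to the string
// supplied by the user through -mattr.
std::string buildFullFeatureString(bool In64BitMode, const std::string &FS) {
  // 64-bit mode implies +64bit and +sse2; every other mode implies +sahf.
  const std::string Implied = In64BitMode ? "+64bit,+sse2" : "+sahf";
  // The implied features go first, so a later entry taken from FS
  // (for example "-sse2") still has the last word.
  return FS.empty() ? Implied : Implied + "," + FS;
}
```

For example, buildFullFeatureString(true, "-sse2") yields "+64bit,+sse2,-sse2", which is the string that would then be handed to the feature parser.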

ParseSubtargetFeatures() at line 242 calls InitMCProcessorInfo(), which builds an MCSchedModel object from the CPU description string.

30       void MCSubtargetInfo::InitMCProcessorInfo(StringRef CPU, StringRef FS) {
31         FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);
32         if (!CPU.empty())
33           CPUSchedModel = getSchedModelForCPU(CPU);
34         else
35           CPUSchedModel = MCSchedModel::GetDefaultSchedModel();
36       }

ProcSchedModels at line 89 below actually comes from X86ProcSchedKV in the file X86GenSubtargetInfo.inc. Each of its entries points to an XXXModel of type MCSchedModel in the same file (for example GenericModel or AtomModel).

86       const MCSchedModel MCSubtargetInfo::getSchedModelForCPU(StringRef CPU) const {
87         assert(ProcSchedModels && "Processor machine model not available!");
88
89         ArrayRef<SubtargetInfoKV> SchedModels(ProcSchedModels, ProcDesc.size());
90
91         assert(std::is_sorted(SchedModels.begin(), SchedModels.end(),
92                           [](const SubtargetInfoKV &LHS, const SubtargetInfoKV &RHS) {
93                               return strcmp(LHS.Key, RHS.Key) < 0;
94                           }) &&
95                    "Processor machine model table is not sorted");
96
97         // Find entry
98         auto Found =
99           std::lower_bound(SchedModels.begin(), SchedModels.end(), CPU);
100        if (Found == SchedModels.end() || StringRef(Found->Key) != CPU) {
101          if (CPU != "help") // Don't error if the user asked for help.
102            errs() << "'" << CPU
103                   << "' is not a recognized processor for this target"
104                   << " (ignoring processor)\n";
105          return MCSchedModel::GetDefaultSchedModel();
106        }
107        assert(Found->Value && "Missing processor SchedModel value");
108        return *(const MCSchedModel *)Found->Value;
109      }
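The lookup pattern used by getSchedModelForCPU() (binary search over a table sorted by key, then an exact-match check, then a fallback to the default) can be tried out on its own. The sketch below uses hypothetical stand-in types ToyKV and ToyModel instead of SubtargetInfoKV and MCSchedModel, and the issue widths in the table are invented values for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical stand-ins for MCSchedModel and SubtargetInfoKV.
struct ToyModel { int IssueWidth; };
struct ToyKV { const char *Key; const ToyModel *Value; };

// Invented values, used only to tell the models apart.
static const ToyModel DefaultToyModel{1};
static const ToyModel AtomToyModel{2};
static const ToyModel HaswellToyModel{4};

// The table must stay sorted by Key because the lookup is a binary search.
static const ToyKV ToyTable[] = {
    {"atom", &AtomToyModel},
    {"haswell", &HaswellToyModel},
};

const ToyModel &lookupModel(const char *CPU) {
  const ToyKV *Begin = std::begin(ToyTable);
  const ToyKV *End = std::end(ToyTable);
  // lower_bound returns the first entry whose key is not less than CPU.
  const ToyKV *Found = std::lower_bound(
      Begin, End, CPU, [](const ToyKV &KV, const char *Name) {
        return std::strcmp(KV.Key, Name) < 0;
      });
  // lower_bound alone does not prove a match, so compare the key exactly
  // and fall back to the default model for unknown names.
  if (Found == End || std::strcmp(Found->Key, CPU) != 0)
    return DefaultToyModel;
  assert(Found->Value && "Missing model value");
  return *Found->Value;
}
```

Calling lookupModel("haswell") returns the Haswell entry, while an unknown name such as "nosuchcpu" returns DefaultToyModel, which corresponds to the fallback at line 105.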

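If no scheduling model matches, the default model is used (line 105, and line 35 of InitMCProcessorInfo()). The default model is the following object (that is, the GenericModel definition in the TD files); it provides no scheduling optimization at all: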

377       static const MCSchedModel &GetDefaultSchedModel() { return Default; }
378       static const MCSchedModel Default;

v7.0 uses C++11 syntax to define the Default object:

25       const MCSchedModel MCSchedModel::Default = {DefaultIssueWidth,
26                                                   DefaultMicroOpBufferSize,
27                                                   DefaultLoopMicroOpBufferSize,
28                                                   DefaultLoadLatency,
29                                                   DefaultHighLatency,
30                                                   DefaultMispredictPenalty,
31                                                   false,
32                                                   true,
33                                                   0,
34                                                   nullptr,
35                                                   nullptr,
36                                                   0,
37                                                   0,
38                                                   nullptr,
39                                                   nullptr};
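The same pattern, a static const member of the class's own type that is declared inside the class and defined outside it with a brace-enclosed initializer list, can be reproduced in a few lines. ToySchedModel and its fields are hypothetical and far smaller than the real MCSchedModel; the sketch only shows the shape of the declaration and the definition.

```cpp
#include <cassert>

// Hypothetical, much reduced stand-in for MCSchedModel.
struct ToySchedModel {
  unsigned IssueWidth;
  unsigned LoadLatency;
  bool PostRAScheduler;
  const char *Name;

  // Static members do not stop the struct from being an aggregate.
  static const unsigned DefaultIssueWidth = 1;
  static const unsigned DefaultLoadLatency = 4;

  static const ToySchedModel &getDefault() { return Default; }
  // Declaration only; the object is defined after the struct.
  static const ToySchedModel Default;
};

// Out-of-class definition: the braces fill the non-static fields in
// declaration order.
const ToySchedModel ToySchedModel::Default = {DefaultIssueWidth,
                                              DefaultLoadLatency,
                                              false,
                                              nullptr};
```

Because the fields are filled positionally, adding or reordering a field in the struct requires updating the initializer list in step with it.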

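Now that we have a preliminarily initialized X86Subtarget instance, the X86InstrInfo constructor runs. v7.0 adds handling for MSVC exception handling (the X86::CATCHRET argument at line 85).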

80       X86InstrInfo::X86InstrInfo(X86Subtarget &STI)
81           : X86GenInstrInfo((STI.isTarget64BitLP64() ? X86::ADJCALLSTACKDOWN64
82                                                      : X86::ADJCALLSTACKDOWN32),
83                             (STI.isTarget64BitLP64() ? X86::ADJCALLSTACKUP64
84                                                      : X86::ADJCALLSTACKUP32),
85                             X86::CATCHRET,
86                             (STI.is64Bit() ? X86::RETQ : X86::RETL)),
87             Subtarget(STI), RI(STI.getTargetTriple()) {
88       }

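The MemoryFoldTable2Addr array of v3.6.1 and the other related arrays have been moved into the new file X86InstrFoldTables.cpp, where they remain static constant arrays. That file also contains the functions that look up a given entry in these arrays. The hope is that these tables will soon be generated automatically (see the chapter on generating the X86 folding tables (v7.0)).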

Next, line 87 of the X86InstrInfo constructor calls the X86RegisterInfo constructor to initialize the member RI of that type, which holds the register information.

45       X86RegisterInfo::X86RegisterInfo(const Triple &TT)
46           : X86GenRegisterInfo((TT.isArch64Bit() ? X86::RIP : X86::EIP),
47                                X86_MC::getDwarfRegFlavour(TT, false),
48                                X86_MC::getDwarfRegFlavour(TT, true),
49                                (TT.isArch64Bit() ? X86::RIP : X86::EIP)) {
50         X86_MC::initLLVMToSEHAndCVRegMapping(this);
51
52         // Cache some information.
53         Is64Bit = TT.isArch64Bit();
54         IsWin64 = Is64Bit && TT.isOSWindows();
55
56         // Use a callee-saved register as the base pointer.  These registers must
57         // not conflict with any ABI requirements.  For example, in 32-bit mode PIC
58         // requires GOT in the EBX register before function calls via PLT GOT pointer.
59         if (Is64Bit) {
60           SlotSize = 8;
61           // This matches the simplified 32-bit pointer code in the data layout
62           // computation.
63           // FIXME: Should use the data layout?
64           bool Use64BitReg = TT.getEnvironment() != Triple::GNUX32;
65           StackPtr = Use64BitReg ? X86::RSP : X86::ESP;
66           FramePtr = Use64BitReg ? X86::RBP : X86::EBP;
67           BasePtr = Use64BitReg ? X86::RBX : X86::EBX;
68         } else {
69           SlotSize = 4;
70           StackPtr = X86::ESP;
71           FramePtr = X86::EBP;
72           BasePtr = X86::ESI;
73         }
74       }
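The register choices made at lines 59 to 73 amount to a three-way decision, which the following sketch spells out. ToyFrameRegs and pickFrameRegs are hypothetical names, register names are plain strings rather than X86:: enumerators, and the IsX32 flag stands in for the check TT.getEnvironment() == Triple::GNUX32.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the values cached by the constructor.
struct ToyFrameRegs {
  unsigned SlotSize;
  std::string StackPtr, FramePtr, BasePtr;
};

// Is64Bit mirrors TT.isArch64Bit(); IsX32 mirrors the GNUX32 environment
// test at line 64 (an assumption of this sketch, not an LLVM API).
ToyFrameRegs pickFrameRegs(bool Is64Bit, bool IsX32) {
  if (!Is64Bit)
    return {4, "ESP", "EBP", "ESI"}; // 32-bit: ESI is the base pointer
  if (IsX32)
    return {8, "ESP", "EBP", "EBX"}; // x32: 8-byte slots, 32-bit registers
  return {8, "RSP", "RBP", "RBX"};   // ordinary 64-bit target
}
```

Note that the 32-bit case picks ESI rather than EBX as the base pointer, in line with the comment at lines 56 to 58 about EBX being needed for the GOT under PIC.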

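The X86GenRegisterInfo constructor at line 46 is generated by TableGen. It binds the X86-specific register information to the target-independent MCRegisterInfo layer and sets up the related register-number mappings, such as LLVM to DWARF.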

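initLLVMToSEHAndCVRegMapping(), called at line 50, maps each LLVM register number to the register number used in assembly instructions (the SEH register number), and matches CodeView register numbers to MC register numbers.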
