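4.4.3.2. The X86InstrInfo Subobject
InstrInfo at line 315 of the X86Subtarget constructor is a member of type X86InstrInfo, so the constructor below is invoked. It is a long function, so we go through it in pieces. First, let us see where the argument of the X86InstrInfo constructor comes from. X86Subtarget::initializeSubtargetDependencies() initializes the options that X86Subtarget depends on and returns the instance itself.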
294 X86Subtarget &X86Subtarget::initializeSubtargetDependencies(StringRef CPU,
295 StringRef FS) {
296 initSubtargetFeatures(CPU, FS);
297 return *this;
298 }
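Line 296 calls initSubtargetFeatures() with CPU and FS, two strings that describe the CPU and its features. They come from the command-line options -mcpu and -mattr respectively. These descriptions should be as detailed as possible, because ParseSubtargetFeatures() at line 242 (generated by TableGen's -gen-subtarget option) uses them to set the subtarget's feature fields. The default values of those fields still produce correct code, but they rule out many optimizations.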
218 void X86Subtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
219 std::string CPUName = CPU;
220 if (CPUName.empty())
221 CPUName = "generic";
222
223 // Make sure 64-bit features are available in 64-bit mode. (But make sure
224 // SSE2 can be turned off explicitly.)
225 std::string FullFS = FS;
226 if (In64BitMode) {
227 if (!FullFS.empty())
228 FullFS = "+64bit,+sse2," + FullFS;
229 else
230 FullFS = "+64bit,+sse2";
231 }
232
233 // LAHF/SAHF are always supported in non-64-bit mode.
234 if (!In64BitMode) {
235 if (!FullFS.empty())
236 FullFS = "+sahf," + FullFS;
237 else
238 FullFS = "+sahf";
239 }
240
241 // If feature string is not empty, parse features string.
242 ParseSubtargetFeatures(CPUName, FullFS);
243
244 // All CPUs that implement SSE4.2 or SSE4A support unaligned accesses of
245 // 16-bytes and under that are reasonably fast. These features were
246 // introduced with Intel's Nehalem/Silvermont and AMD's Family10h
247 // micro-architectures respectively.
248 if (hasSSE42() || hasSSE4A())
249 IsUAMem16Slow = false;
250
251 // It's important to keep the MCSubtargetInfo feature bits in sync with
252 // target data structure which is shared with MC code emitter, etc.
253 if (In64BitMode)
254 ToggleFeature(X86::Mode64Bit);
255 else if (In32BitMode)
256 ToggleFeature(X86::Mode32Bit);
257 else if (In16BitMode)
258 ToggleFeature(X86::Mode16Bit);
259 else
260 llvm_unreachable("Not 16-bit, 32-bit or 64-bit mode!");
261
262 DEBUG(dbgs() << "Subtarget features: SSELevel " << X86SSELevel
263 << ", 3DNowLevel " << X863DNowLevel
264 << ", 64bit " << HasX86_64 << "\n");
265 assert((!In64BitMode || HasX86_64) &&
266 "64-bit code requested on a subtarget that doesn't support it!");
267
268 // Stack alignment is 16 bytes on Darwin, Linux and Solaris (both
269 // 32 and 64 bit) and for all 64-bit targets.
270 if (StackAlignOverride)
271 stackAlignment = StackAlignOverride;
272 else if (isTargetDarwin() || isTargetLinux() || isTargetSolaris() ||
273 isTargetKFreeBSD() || In64BitMode)
274 stackAlignment = 16;
275
276 // Some CPUs have more overhead for gather. The specified overhead is relative
277 // to the Load operation. "2" is the number provided by Intel architects. This
278 // parameter is used for cost estimation of Gather Op and comparison with
279 // other alternatives.
280 // TODO: Remove the explicit hasAVX512()?, That would mean we would only
281 // enable gather with a -march.
282 if (hasAVX512() || (hasAVX2() && hasFastGather()))
283 GatherOverhead = 2;
284 if (hasAVX512())
285 ScatterOverhead = 2;
286
287 // Consume the vector width attribute or apply any target specific limit.
288 if (PreferVectorWidthOverride)
289 PreferVectorWidth = PreferVectorWidthOverride;
290 else if (Prefer256Bit)
291 PreferVectorWidth = 256;
292 }
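The feature-string handling at lines 225-239 can be sketched in isolation. The following is a minimal sketch, not LLVM code: buildFullFeatureString is a hypothetical helper, and the assert on a leading comma is the sketch's own sanity check. It shows why the implied features are prepended: the user-supplied features come later in the string, which is what lets an explicit "-sse2" still take effect, as the comment at lines 223-224 asks.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring lines 225-239 of initSubtargetFeatures():
// prepend the features implied by the selected mode to the user's string.
std::string buildFullFeatureString(bool In64BitMode, const std::string &FS) {
  // Sanity check of this sketch only: a leading comma would create an
  // empty feature token after concatenation.
  assert((FS.empty() || FS.front() != ',') && "unexpected leading comma");

  // 64-bit mode implies +64bit and +sse2; the other modes always have
  // LAHF/SAHF available.
  std::string Implied = In64BitMode ? "+64bit,+sse2" : "+sahf";
  if (FS.empty())
    return Implied;
  // User features go last so that they can override the implied ones.
  return Implied + "," + FS;
}
```

For example, calling the helper with In64BitMode set and FS equal to "-sse2,+avx" yields "+64bit,+sse2,-sse2,+avx".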
ParseSubtargetFeatures() at line 242 calls InitMCProcessorInfo(), which sets up an MCSchedModel object according to the CPU description string.
30 void MCSubtargetInfo::InitMCProcessorInfo(StringRef CPU, StringRef FS) {
31 FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);
32 if (!CPU.empty())
33 CPUSchedModel = getSchedModelForCPU(CPU);
34 else
35 CPUSchedModel = MCSchedModel::GetDefaultSchedModel();
36 }
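The ProcSchedModels used below (see line 87) actually comes from X86ProcSchedKV in the file X86GenSubtargetInfo.inc. Each of its entries points to an object of type MCSchedModel in the same file, named XXXModel (for example GenericModel or AtomModel).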
86 const MCSchedModel MCSubtargetInfo::getSchedModelForCPU(StringRef CPU) const {
87 assert(ProcSchedModels && "Processor machine model not available!");
88
89 ArrayRef<SubtargetInfoKV> SchedModels(ProcSchedModels, ProcDesc.size());
90
91 assert(std::is_sorted(SchedModels.begin(), SchedModels.end(),
92 [](const SubtargetInfoKV &LHS, const SubtargetInfoKV &RHS) {
93 return strcmp(LHS.Key, RHS.Key) < 0;
94 }) &&
95 "Processor machine model table is not sorted");
96
97 // Find entry
98 auto Found =
99 std::lower_bound(SchedModels.begin(), SchedModels.end(), CPU);
100 if (Found == SchedModels.end() || StringRef(Found->Key) != CPU) {
101 if (CPU != "help") // Don't error if the user asked for help.
102 errs() << "'" << CPU
103 << "' is not a recognized processor for this target"
104 << " (ignoring processor)\n";
105 return MCSchedModel::GetDefaultSchedModel();
106 }
107 assert(Found->Value && "Missing processor SchedModel value");
108 return *(const MCSchedModel *)Found->Value;
109 }
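The lookup pattern used by getSchedModelForCPU() (a table sorted by key, a binary search, then an exact-match check with a fallback to the default) can be tried outside LLVM. The following is a minimal sketch under stated assumptions: SchedModel, ModelEntry, ModelTable and lookupSchedModel are hypothetical stand-ins for MCSchedModel, SubtargetInfoKV, ProcSchedModels and getSchedModelForCPU(), and the IssueWidth values are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <string>

// Hypothetical stand-ins for MCSchedModel and SubtargetInfoKV.
struct SchedModel {
  unsigned IssueWidth; // illustrative value only
};

struct ModelEntry {
  const char *Key;         // CPU name
  const SchedModel *Value; // model used for that CPU
};

static const SchedModel DefaultModel = {1};
static const SchedModel AtomModel = {2};
static const SchedModel HaswellModel = {4};

// The table must be sorted by Key, as the assert at lines 91-95 requires.
static const ModelEntry ModelTable[] = {
    {"atom", &AtomModel},
    {"haswell", &HaswellModel},
};

const SchedModel &lookupSchedModel(const std::string &CPU) {
  const ModelEntry *Begin = std::begin(ModelTable);
  const ModelEntry *End = std::end(ModelTable);
  assert(std::is_sorted(Begin, End,
                        [](const ModelEntry &L, const ModelEntry &R) {
                          return std::strcmp(L.Key, R.Key) < 0;
                        }) &&
         "model table is not sorted");

  // Binary search for the first entry whose key is not less than CPU.
  const ModelEntry *Found = std::lower_bound(
      Begin, End, CPU, [](const ModelEntry &E, const std::string &Name) {
        return std::strcmp(E.Key, Name.c_str()) < 0;
      });

  // lower_bound only yields a candidate position, so an exact match has
  // to be confirmed before the entry is used.
  if (Found == End || CPU != Found->Key)
    return DefaultModel;
  return *Found->Value;
}
```

The exact-match check matters: a prefix such as "hasw" lands on the "haswell" entry in the binary search, and only the comparison afterwards sends it to the default model.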
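If no scheduling model matches, the default model is used (line 105, and line 35 of InitMCProcessorInfo()). The default model is the following object (that is, the GenericModel definition in the TD files); it provides no scheduling optimization: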
377 static const MCSchedModel &GetDefaultSchedModel() { return Default; }
378 static const MCSchedModel Default;
v7.0 builds the Default object with C++11-style brace initialization:
25 const MCSchedModel MCSchedModel::Default = {DefaultIssueWidth,
26 DefaultMicroOpBufferSize,
27 DefaultLoopMicroOpBufferSize,
28 DefaultLoadLatency,
29 DefaultHighLatency,
30 DefaultMispredictPenalty,
31 false,
32 true,
33 0,
34 nullptr,
35 nullptr,
36 0,
37 0,
38 nullptr,
39 nullptr};
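The same pattern, a static const member of the class's own type that is defined out of line with a brace-enclosed initializer list, can be sketched on a much smaller struct. This is only an illustration: MiniSchedModel and its five fields are hypothetical and do not reproduce the real MCSchedModel layout, and the two default constants are illustrative values.

```cpp
#include <cassert>

// Hypothetical, much reduced stand-in for MCSchedModel.
struct MiniSchedModel {
  unsigned IssueWidth;
  unsigned LoadLatency;
  bool PostRAScheduler;
  bool CompleteModel;
  const char *Name;

  // Illustrative defaults, named after the constants used at lines 25-28.
  static const unsigned DefaultIssueWidth = 1;
  static const unsigned DefaultLoadLatency = 4;

  // Same shape as lines 377-378: an accessor plus the static object.
  static const MiniSchedModel &getDefault() { return Default; }
  static const MiniSchedModel Default;
};

// Static members do not stop the struct from being an aggregate, so the
// fields are filled in declaration order from the initializer list.
const MiniSchedModel MiniSchedModel::Default = {
    DefaultIssueWidth, DefaultLoadLatency, false, true, nullptr};
```

Because the initializers are positional, the order of the values has to follow the order of the field declarations exactly, which is also why the real initializer at lines 25-39 is written one field per line.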
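With the X86Subtarget instance now partially initialized, the X86InstrInfo constructor starts to run. v7.0 adds support for MSVC exception handling (the X86::CATCHRET argument at line 85).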
80 X86InstrInfo::X86InstrInfo(X86Subtarget &STI)
81 : X86GenInstrInfo((STI.isTarget64BitLP64() ? X86::ADJCALLSTACKDOWN64
82 : X86::ADJCALLSTACKDOWN32),
83 (STI.isTarget64BitLP64() ? X86::ADJCALLSTACKUP64
84 : X86::ADJCALLSTACKUP32),
85 X86::CATCHRET,
86 (STI.is64Bit() ? X86::RETQ : X86::RETL)),
87 Subtarget(STI), RI(STI.getTargetTriple()) {
88 }
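The MemoryFoldTable2Addr array of v3.6.1 and the other related arrays have been moved to the new file X86InstrFoldTables.cpp, where they remain static constant arrays. That file also contains the functions that look up a given entry in these arrays. The hope is that this content will be generated by codegen in the near future (see the chapter "Generation of the X86 Fold Tables (v7.0)").
Next, line 87 of the X86InstrInfo constructor calls the X86RegisterInfo constructor to initialize the member RI of that type, which holds the register information.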
45 X86RegisterInfo::X86RegisterInfo(const Triple &TT)
46 : X86GenRegisterInfo((TT.isArch64Bit() ? X86::RIP : X86::EIP),
47 X86_MC::getDwarfRegFlavour(TT, false),
48 X86_MC::getDwarfRegFlavour(TT, true),
49 (TT.isArch64Bit() ? X86::RIP : X86::EIP)) {
50 X86_MC::initLLVMToSEHAndCVRegMapping(this);
51
52 // Cache some information.
53 Is64Bit = TT.isArch64Bit();
54 IsWin64 = Is64Bit && TT.isOSWindows();
55
56 // Use a callee-saved register as the base pointer. These registers must
57 // not conflict with any ABI requirements. For example, in 32-bit mode PIC
58 // requires GOT in the EBX register before function calls via PLT GOT pointer.
59 if (Is64Bit) {
60 SlotSize = 8;
61 // This matches the simplified 32-bit pointer code in the data layout
62 // computation.
63 // FIXME: Should use the data layout?
64 bool Use64BitReg = TT.getEnvironment() != Triple::GNUX32;
65 StackPtr = Use64BitReg ? X86::RSP : X86::ESP;
66 FramePtr = Use64BitReg ? X86::RBP : X86::EBP;
67 BasePtr = Use64BitReg ? X86::RBX : X86::EBX;
68 } else {
69 SlotSize = 4;
70 StackPtr = X86::ESP;
71 FramePtr = X86::EBP;
72 BasePtr = X86::ESI;
73 }
74 }
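The X86GenRegisterInfo constructor at line 46 is generated by TableGen. It binds the X86-specific register information to the target-independent MCRegisterInfo layer and sets up the related register number mappings, such as LLVM to DWARF.
initLLVMToSEHAndCVRegMapping(), called at line 50, maps LLVM register numbers to the register numbers used in assembly instructions, and also maps between CodeView register numbers and MC register numbers.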
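The register choices made at lines 59-73 of the X86RegisterInfo constructor can be summarized as a small decision function. The following is a minimal sketch, not LLVM code: Reg, FrameRegs and selectFrameRegs are hypothetical names, Is64Bit stands for TT.isArch64Bit() and IsX32 for a Triple::GNUX32 environment.

```cpp
#include <cassert>

// Hypothetical stand-in for the X86 register enumeration.
enum Reg { ESP, EBP, EBX, ESI, RSP, RBP, RBX };

// Hypothetical bundle of the fields cached by the constructor.
struct FrameRegs {
  unsigned SlotSize;
  Reg StackPtr, FramePtr, BasePtr;
};

FrameRegs selectFrameRegs(bool Is64Bit, bool IsX32) {
  // The x32 environment only exists on a 64-bit architecture.
  assert(!(IsX32 && !Is64Bit) && "x32 implies a 64-bit architecture");

  // 32-bit mode: 4-byte slots, and ESI serves as the base pointer.
  if (!Is64Bit)
    return {4, ESP, EBP, ESI};
  // x32 keeps 8-byte slots but uses the 32-bit pointer registers.
  if (IsX32)
    return {8, ESP, EBP, EBX};
  // Regular 64-bit mode.
  return {8, RSP, RBP, RBX};
}
```

Note that the slot size follows the architecture (8 bytes whenever Is64Bit holds), while the register width follows the pointer size of the environment.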