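4.4. The Target Machine Object
At line 350 of main(), TimeCompilations defaults to 1; the hidden option "-time-compilations" can set it to another value. It makes llc repeat the compilation the given number of times so that the compile-time measurements are more reliable. compileModule(), which is called inside that loop, is the entry point of the actual compilation.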
387 static int compileModule(char **argv, LLVMContext &Context) {
388   // Load the module to be compiled...
389   SMDiagnostic Err;
390   std::unique_ptr<Module> M;
391   std::unique_ptr<MIRParser> MIR;
392   Triple TheTriple;
393
394   bool SkipModule = MCPU == "help" ||
395                     (!MAttrs.empty() && MAttrs.front() == "help");
396
397   // If user just wants to list available options, skip module loading
398   if (!SkipModule) {
399     if (InputLanguage == "mir" ||
400         (InputLanguage == "" && StringRef(InputFilename).endswith_lower(".mir"))) {
401       MIR = createMIRParserFromFile(InputFilename, Err, Context);
402       if (MIR)
403         M = MIR->parseIRModule();
404     } else
405       M = parseIRFile(InputFilename, Err, Context);
406     if (!M) {
407       Err.print(argv[0], WithColor::error(errs(), argv[0]));
408       return 1;
409     }
410
411     // If we are supposed to override the target triple, do so now.
412     if (!TargetTriple.empty())
413       M->setTargetTriple(Triple::normalize(TargetTriple));
414     TheTriple = Triple(M->getTargetTriple());
415   } else {
416     TheTriple = Triple(Triple::normalize(TargetTriple));
417   }
418
419   if (TheTriple.getTriple().empty())
420     TheTriple.setTriple(sys::getDefaultTargetTriple());
421
422   // Get the target specific parser.
423   std::string Error;
424   const Target *TheTarget = TargetRegistry::lookupTarget(MArch, TheTriple,
425                                                          Error);
426   if (!TheTarget) {
427     errs() << argv[0] << ": " << Error;
428     return 1;
430   }
431
432   std::string CPUStr = getCPUStr(), FeaturesStr = getFeaturesStr();
433
434   CodeGenOpt::Level OLvl = CodeGenOpt::Default;
435   switch (OptLevel) {
436   default:
437     errs() << argv[0] << ": invalid optimization level.\n";
438     return 1;
439   case ' ': break;
440   case '0': OLvl = CodeGenOpt::None; break;
441   case '1': OLvl = CodeGenOpt::Less; break;
442   case '2': OLvl = CodeGenOpt::Default; break;
443   case '3': OLvl = CodeGenOpt::Aggressive; break;
444   }
445
446   TargetOptions Options = InitTargetOptionsFromCodeGenFlags();
447   Options.DisableIntegratedAS = NoIntegratedAssembler;
448   Options.MCOptions.ShowMCEncoding = ShowMCEncoding;
449   Options.MCOptions.MCUseDwarfDirectory = EnableDwarfDirectory;
450   Options.MCOptions.AsmVerbose = AsmVerbose;
451   Options.MCOptions.PreserveAsmComments = PreserveComments;
452   Options.MCOptions.IASSearchPaths = IncludeDirs;
453   Options.MCOptions.SplitDwarfFile = SplitDwarfFile;
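Real compilation happens only when SkipModule, tested at line 398, is false. llc accepts two kinds of input file. One is a MIR file with the .mir suffix. This format is meant for debugging and makes it possible to test a single code generation pass; it is described in detail in the Machine IR (MIR) Format Reference Manual. The other is an IR file such as the ones Clang produces. We do not go into how the input file is parsed into a Module object here. A Module instance is the top-level container for all other LLVM IR objects. Each Module directly contains the global variables, the functions, and the libraries the module depends on, a symbol table, and various data about the target's characteristics.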
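As a rough illustration of the two decisions described above (whether to skip the module, and whether the input is treated as MIR or as IR), the following standalone sketch mirrors the conditions at lines 394 to 400. It does not use LLVM: `shouldSkipModule`, `endsWithLower`, `classifyInput`, and `InputKind` are hypothetical names introduced here, and plain strings stand in for the MCPU, MAttrs, InputLanguage, and InputFilename options.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// True when the user only asked for the CPU or feature list ("help"),
// in which case no module needs to be loaded.
bool shouldSkipModule(const std::string &mcpu,
                      const std::vector<std::string> &mattrs) {
  return mcpu == "help" || (!mattrs.empty() && mattrs.front() == "help");
}

// Case-insensitive suffix test, playing the role of endswith_lower().
bool endsWithLower(const std::string &text, const std::string &suffix) {
  if (text.size() < suffix.size())
    return false;
  return std::equal(suffix.rbegin(), suffix.rend(), text.rbegin(),
                    [](char a, char b) {
                      return std::tolower(static_cast<unsigned char>(a)) ==
                             std::tolower(static_cast<unsigned char>(b));
                    });
}

enum class InputKind { MIR, IR };

// An explicit language of "mir" wins; with no language given, the file
// suffix decides. Everything else is parsed as IR.
InputKind classifyInput(const std::string &language,
                        const std::string &filename) {
  if (language == "mir" ||
      (language.empty() && endsWithLower(filename, ".mir")))
    return InputKind::MIR;
  return InputKind::IR;
}
```

Under this model, `classifyInput("", "test.MIR")` selects the MIR path, while an explicit language other than "mir" sends even a .mir file down the IR path, which matches the condition in the listing.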
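TargetTriple at line 412 is set with the option "-mtriple" and defaults to the empty string. When it is non-empty it overrides the triple stored in the module; otherwise the target triple is taken from the Module object built from the input file. If that is empty too, line 420 falls back to the value returned by sys::getDefaultTargetTriple(), which is fixed when LLVM is built (for example "i686-pc-linux-gnu"). At line 424, TargetRegistry::lookupTarget() searches the previously registered Target objects for the one that matches TheTriple. At line 446, InitTargetOptionsFromCodeGenFlags() fills in a TargetOptions object from the command-line options.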
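The fallback order for the triple can be summarized in a few lines. This is a simplified model rather than LLVM code: `resolveTriple` is a hypothetical helper, triples are plain strings, and the Triple::normalize() step is left out.

```cpp
#include <cassert>
#include <string>

// Picks the triple in the order described above:
//   1. the value given on the command line (-mtriple), if any;
//   2. the triple recorded in the input module, if any;
//   3. the default triple fixed when the compiler was built.
std::string resolveTriple(const std::string &fromCommandLine,
                          const std::string &fromModule,
                          const std::string &buildDefault) {
  // Assumption of this sketch: the build default is never empty.
  assert(!buildDefault.empty() && "the build default must be a real triple");
  if (!fromCommandLine.empty())
    return fromCommandLine;
  if (!fromModule.empty())
    return fromModule;
  return buildDefault;
}
```

When the module is skipped (the "help" case), there is no module triple, which corresponds to passing an empty string as the second argument.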
FloatABIForCalls at line 467 below comes from the option -float-abi, which selects the floating-point ABI; the choices are default, soft, and hard. Any value other than default is recorded in Options.
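The "only record it when it is not default" rule is small enough to show in isolation. The sketch below is a toy model: `ToyTargetOptions`, `parseFloatABI`, and `applyFloatABI` are hypothetical names, and only the three spellings named above are assumed.

```cpp
#include <cassert>
#include <string>

enum class FloatABI { Default, Soft, Hard };

// Toy stand-in for the options object; only the field of interest is kept.
struct ToyTargetOptions {
  FloatABI FloatABIType = FloatABI::Default;
};

// Maps the three spellings to the enum; any other text is rejected.
bool parseFloatABI(const std::string &text, FloatABI &out) {
  if (text == "default") { out = FloatABI::Default; return true; }
  if (text == "soft")    { out = FloatABI::Soft;    return true; }
  if (text == "hard")    { out = FloatABI::Hard;    return true; }
  return false;
}

// A request of Default leaves whatever the options already hold;
// Soft or Hard overwrite it.
void applyFloatABI(ToyTargetOptions &options, FloatABI requested) {
  if (requested != FloatABI::Default)
    options.FloatABIType = requested;
}
```

The point of the guard is that an unset option must not clobber a value that was already chosen elsewhere.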
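SplitDwarfOutputFile below comes from the option -split-dwarf-output, which names a file with the .dwo suffix. Splitting the debug information in two at compile time, with one part kept in the .o file and the other written to a parallel .dwo (DWARF object) file, reduces the total size of the object files the linker has to process. Out and DwoOut are these two files.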
DisableSimplifyLibCalls likewise comes from an option, -disable-simplify-libcalls; it disables all optimizations that treat library functions as builtins.
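LLVM IR itself keeps evolving. For backward compatibility, v7.0 provides an upgrade facility that brings outdated IR intrinsics or features up to the current version. UpgradeDebugInfo() below checks the version of the debug info and discards debug info that is out of date.
compileModule (continued)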
454   std::unique_ptr<TargetMachine> Target(TheTarget->createTargetMachine(
455       TheTriple.getTriple(), CPUStr, FeaturesStr, Options, getRelocModel(),
456       getCodeModel(), OLvl));
457
458   assert(Target && "Could not allocate target machine!");
459
460   // If we don't have a module then just exit now. We do this down
461   // here since the CPU/Feature help is underneath the target machine
462   // creation.
463   if (SkipModule)
464     return 0;
465
466   assert(M && "Should have exited if we didn't have a module!");
467   if (FloatABIForCalls != FloatABI::Default)
468     Options.FloatABIType = FloatABIForCalls;
469
470   // Figure out where we are going to send the output.
471   std::unique_ptr<ToolOutputFile> Out =
472       GetOutputStream(TheTarget->getName(), TheTriple.getOS(), argv[0]);
473   if (!Out) return 1;
474
475   std::unique_ptr<ToolOutputFile> DwoOut;
476   if (!SplitDwarfOutputFile.empty()) {
477     std::error_code EC;
478     DwoOut = llvm::make_unique<ToolOutputFile>(SplitDwarfOutputFile, EC,
479                                                sys::fs::F_None);
480     if (EC) {
481       WithColor::error(errs(), argv[0]) << EC.message() << '\n';
482       return 1;
483     }
484   }
485
486   // Build up all of the passes that we want to do to the module.
487   legacy::PassManager PM;
488
489   // Add an appropriate TargetLibraryInfo pass for the module's triple.
490   TargetLibraryInfoImpl TLII(Triple(M->getTargetTriple()));
491
492   // The -disable-simplify-libcalls flag actually disables all builtin optzns.
493   if (DisableSimplifyLibCalls)
494     TLII.disableAllFunctions();
495   PM.add(new TargetLibraryInfoWrapperPass(TLII));
496
497   // Add the target data from the target machine, if it exists, or the module.
498
499   M->setDataLayout(Target->createDataLayout());
500
501   // This needs to be done after setting datalayout since it calls verifier
502   // to check debug info whereas verifier relies on correct datalayout.
503   UpgradeDebugInfo(*M);
504
505   // Verify module immediately to catch problems before doInitialization() is
506   // called on any passes.
507   if (!NoVerify && verifyModule(*M, &errs())) {
508     std::string Prefix =
509         (Twine(argv[0]) + Twine(": ") + Twine(InputFilename)).str();
510     WithColor::error(errs(), Prefix) << "input module is broken!\n";
511     return 1;
512   }
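NoVerify above comes from the option -disable-verify, which turns off verification of the input module. Because LLVM accepts modules in IR form, some sanity checking is needed to make sure a module is well formed. The verifier does not provide full "Java style" security and verification; it only tries to guarantee that the code is well formed. The checks it performs are listed in the comments in Verifier.cpp.
Line 454 calls the method below to create a TargetMachine object, which is a complete description of the target machine.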
388   TargetMachine *createTargetMachine(StringRef TT, StringRef CPU,
389                                      StringRef Features,
390                                      const TargetOptions &Options,
391                                      Optional<Reloc::Model> RM,
392                                      Optional<CodeModel::Model> CM = None,
393                                      CodeGenOpt::Level OL = CodeGenOpt::Default,
394                                      bool JIT = false) const {
395     if (!TargetMachineCtorFn)
396       return nullptr;
397     return TargetMachineCtorFn(*this, Triple(TT), CPU, Features, Options, RM,
398                                CM, OL, JIT);
399   }
TargetMachineCtorFn at line 397 is the Allocator() of the RegisterTargetMachine that was registered earlier; calling it creates an X86TargetMachine instance.
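The relationship between the stored function pointer and the concrete target machine class can be pictured with a small standalone model. The names Target, TargetMachineCtorFn, RegisterTargetMachine, and Allocator are borrowed from the text, but the signatures here are drastically simplified assumptions (a single string argument instead of the real parameter list), and `ToyX86TargetMachine` is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct TargetMachine {
  virtual ~TargetMachine() = default;
  virtual std::string describe() const = 0;
};

// A Target only stores a pointer to a factory function; it knows nothing
// about the concrete machine class behind it.
struct Target {
  using CtorFn = TargetMachine *(*)(const std::string &triple);
  CtorFn TargetMachineCtorFn = nullptr;

  std::unique_ptr<TargetMachine>
  createTargetMachine(const std::string &triple) const {
    if (!TargetMachineCtorFn)
      return nullptr; // nothing registered for this target
    return std::unique_ptr<TargetMachine>(TargetMachineCtorFn(triple));
  }
};

// Constructing a RegisterTargetMachine<TM> object installs its static
// Allocator() as the factory of the given Target.
template <class TM> struct RegisterTargetMachine {
  explicit RegisterTargetMachine(Target &T) {
    assert(!T.TargetMachineCtorFn && "factory registered twice in this toy");
    T.TargetMachineCtorFn = &Allocator;
  }
  static TargetMachine *Allocator(const std::string &triple) {
    return new TM(triple);
  }
};

struct ToyX86TargetMachine : TargetMachine {
  explicit ToyX86TargetMachine(std::string triple)
      : Triple(std::move(triple)) {}
  std::string describe() const override { return "toy-x86 for " + Triple; }
  std::string Triple;
};
```

With a `Target T;`, declaring `RegisterTargetMachine<ToyX86TargetMachine> Reg(T);` is all it takes for a later `T.createTargetMachine("x86_64-unknown-linux-gnu")` to return a ToyX86TargetMachine; before registration the same call returns a null pointer, which is the case the check at line 395 guards against.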
215 X86TargetMachine::X86TargetMachine(const Target &T, const Triple &TT,
216                                    StringRef CPU, StringRef FS,
217                                    const TargetOptions &Options,
218                                    Optional<Reloc::Model> RM,
219                                    Optional<CodeModel::Model> CM,
220                                    CodeGenOpt::Level OL, bool JIT)
221     : LLVMTargetMachine(
222           T, computeDataLayout(TT), TT, CPU, FS, Options,
223           getEffectiveRelocModel(TT, JIT, RM),
224           getEffectiveCodeModel(CM, JIT, TT.getArch() == Triple::x86_64), OL),
225       TLOF(createTLOF(getTargetTriple())) {
226   // Windows stack unwinder gets confused when execution flow "falls through"
227   // after a call to 'noreturn' function.
228   // To prevent that, we emit a trap for 'unreachable' IR instructions.
229   // (which on X86, happens to be the 'ud2' instruction)
230   // On PS4, the "return address" of a 'noreturn' call must still be within
231   // the calling function, and TrapUnreachable is an easy way to get that.
232   // The check here for 64-bit windows is a bit icky, but as we're unlikely
233   // to ever want to mix 32 and 64-bit windows code in a single module
234   // this should be fine.
235   if ((TT.isOSWindows() && TT.getArch() == Triple::x86_64) || TT.isPS4() ||
236       TT.isOSBinFormatMachO()) {
237     this->Options.TrapUnreachable = true;
238     this->Options.NoTrapAfterNoreturn = TT.isOSBinFormatMachO();
239   }
240
241   // Outlining is available for x86-64.
242   if (TT.getArch() == Triple::x86_64)
243     setMachineOutliner(true);
244
245   initAsmInfo();
246 }
X86TargetMachine derives from LLVMTargetMachine. The constructor of the base class LLVMTargetMachine is defined as follows:
77 LLVMTargetMachine::LLVMTargetMachine(const Target &T,
78                                      StringRef DataLayoutString,
79                                      const Triple &TT, StringRef CPU,
80                                      StringRef FS, TargetOptions Options,
81                                      Reloc::Model RM, CodeModel::Model CM,
82                                      CodeGenOpt::Level OL)
83     : TargetMachine(T, DataLayoutString, TT, CPU, FS, Options) {
84   this->RM = RM;
85   this->CMModel = CM;
86   this->OptLevel = OL;
87
88   if (EnableTrapUnreachable)
89     this->Options.TrapUnreachable = true;
90 }
The constructor of its own base class, TargetMachine, looks like this:
35 TargetMachine::TargetMachine(const Target &T, StringRef DataLayoutString,
36                              const Triple &TT, StringRef CPU, StringRef FS,
37                              const TargetOptions &Options)
38     : TheTarget(T), DL(DataLayoutString), TargetTriple(TT), TargetCPU(CPU),
39       TargetFS(FS), AsmInfo(nullptr), MRI(nullptr), MII(nullptr), STI(nullptr),
40       RequireStructuredCFG(false), DefaultOptions(Options), Options(Options) {
41 }
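DefaultOptions and Options are initialized with the same contents. However, because different functions may carry different attributes, DefaultOptions keeps the settings derived from the command line (the defaults, as far as the current compilation unit is concerned), whereas Options is updated according to the attributes declared on the function currently being compiled.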
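The split between a fixed set of defaults and a working copy can be illustrated with a toy model. Everything below is an assumption made for illustration: `ToyTargetOptions`, `ToyFunctionAttrs`, `ToyTargetMachine`, and `resetForFunction` are hypothetical, a single boolean stands in for the whole option set, and the attribute is encoded as the strings "true" or "false" (or empty when the function does not mention it).

```cpp
#include <cassert>
#include <string>

// Toy subset of an option set; a single flag is enough to show the idea.
struct ToyTargetOptions {
  bool UnsafeFPMath = false;
};

// Hypothetical per-function attributes. An empty string means the
// function does not mention the attribute at all.
struct ToyFunctionAttrs {
  std::string UnsafeFPMath; // "", "true" or "false"
};

class ToyTargetMachine {
public:
  explicit ToyTargetMachine(const ToyTargetOptions &fromCommandLine)
      : DefaultOptions(fromCommandLine), Options(fromCommandLine) {}

  // Recomputes the working copy for one function: an explicit attribute
  // wins, otherwise the command-line default is restored.
  void resetForFunction(const ToyFunctionAttrs &attrs) {
    assert(attrs.UnsafeFPMath.empty() || attrs.UnsafeFPMath == "true" ||
           attrs.UnsafeFPMath == "false");
    Options.UnsafeFPMath = attrs.UnsafeFPMath.empty()
                               ? DefaultOptions.UnsafeFPMath
                               : attrs.UnsafeFPMath == "true";
  }

  const ToyTargetOptions DefaultOptions; // fixed after construction
  ToyTargetOptions Options;              // per-function working copy
};
```

Keeping the defaults immutable means that a function without the attribute always falls back to the command-line setting, no matter which function was processed before it.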