dyld(the dynamic link editor),也就是动态链接器,是内核在完成进程工作后,需要将需要的库和符号链接到Mach-O
镜像文件中,而这个填充工作便是由动态链接器dyld完成的。
而我们知道程序的入口是main
函数,dyld是如何一步步走到main函数,类又是何时加载到内存中的呢?这便是本文探究的内容。
关于Mach-O
Mach-O是Mach Object文件格式的缩写,是iOS或OS X系统下可执行文件类型的统称,它可以是可执行文件、目标代码、动态库、内核转储的文件格式。事实上我们开发运行程序生成的.app文件,打开内部便有Mach-O文件,这是系统内核运行程序是真正加载运行的东西。
Mach-O的文件类型
【Mach-O】有3种文件类型:Executable、Dylib、Bundle
- Executable:app的二进制文件
- Dylib:动态库的二进制文件
- Bundle:一种特殊类型的动态库,无法对其进行链接,只能在Runtime运行时通过dlopen加载,可以在macOS上用于插件
Mach-O的文件结构
Mach-O文件分3个主要区域:Header、Load Commands、Data。
- Header:保存了Mach-O的一些基本信息,包括运行平台、文件类型、LoadCommands指令的个数、指令总大小等等。
- Load Command:Mach-O中最重要的元信息,紧跟在头文件信息之后,它清晰地告诉加载器如何处理二进制数据,有些命令是内核处理的,有些事动态链接器处理的,加载Mach-O文件会使用这部分数据确定内存分布以及相关加载命令,对系统内核加载器和动态链接器起指导作用,比如main()函数的加载地址,程序所需dyld的文件路径,以及相关依赖库的文件路径。
- Data:每个segment的具体数据保存在这里,包含具体的代码、数据等等。
App加载流程
我们知道程序的入口是mian()函数,我们可以在类的load
方法打上断点,查看调用的堆栈信息。
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
* frame #0: 0x0000000100000f2c 消息转发`+[Animal load](self=Animal, _cmd="load") at Animal.m:16:1
frame #1: 0x00007fff6df5e560 libobjc.A.dylib`load_images + 1529
frame #2: 0x000000010000b26c dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 418
frame #3: 0x000000010001efe9 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 475
frame #4: 0x000000010001d0b4 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
frame #5: 0x000000010001d154 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
frame #6: 0x000000010000b6a8 dyld`dyld::initializeMainExecutable() + 199
frame #7: 0x0000000100010bba dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 6667
frame #8: 0x000000010000a227 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 453
frame #9: 0x000000010000a025 dyld`_dyld_start + 37
可以看到,调用流程是从dyld的_dyld_start
开始的。我们从官网上下载dyld
的源码进行探究。(dyld下载地址)
_dyld_start
_dyld_start
是由汇编代码编写的,入口在dyldStartup.s
文件,尽管不同架构下有不同的实现,但通过注释我们可以知道_dyld_start
调用了dyldbootstrap::start
#if __arm64__
.text
.align 2
.globl __dyld_start
__dyld_start:
...
// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
bl __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
mov x16,x0 // save entry point address in x16
#if __LP64__
ldr x1, [sp]
#else
ldr w1, [sp]
#endif
dyldbootstrap::start
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{
// Emit kdebug tracepoint to indicate dyld bootstrap has started
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
// if kernel had to slide dyld, we need to fix up load sensitive locations
// we have to do this before using any global variables
rebaseDyld(dyldsMachHeader);
// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;
// set up random value for stack canary
__guard_setup(apple);
#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(argc, argv, envp, apple);
#endif
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
dyldbootstrap
命名空间在dyldInitialization.cpp
中,在start
函数中主要做了这3件事:
- 对
dyld
进行rebase操作,以修复为real pointer来运行 - 设置参数和环境变量
- 对Mach-O文件的header得到偏移量appsSlide,然后调用了
dyld
命名空间下的_main
方法
dyld::_main
//
// Entry point for dyld. The kernel loads dyld and jumps to __dyld_start which
// sets up some registers and call this function.
//
// Returns address of main() in target program which __dyld_start jumps to
//
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
int argc, const char* argv[], const char* envp[], const char* apple[],
uintptr_t* startGlue)
{
if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
launchTraceID = dyld3::kdebug_trace_dyld_duration_start(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, (uint64_t)mainExecutableMH, 0, 0);
}
//Check and see if there are any kernel flags
dyld3::BootArgs::setFlags(hexToUInt64(_simple_getenv(apple, "dyld_flags"), nullptr));
// Grab the cdHash of the main executable from the environment
uint8_t mainExecutableCDHashBuffer[20];
const uint8_t* mainExecutableCDHash = nullptr;
if ( hexToBytes(_simple_getenv(apple, "executable_cdhash"), 40, mainExecutableCDHashBuffer) )
mainExecutableCDHash = mainExecutableCDHashBuffer;
#if !TARGET_OS_SIMULATOR
// Trace dyld's load
notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
// Trace the main executable's load
notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif
uintptr_t result = 0;
sMainExecutableMachHeader = mainExecutableMH;
sMainExecutableSlide = mainExecutableSlide;
// Set the platform ID in the all image infos so debuggers can tell the process type
// FIXME: This can all be removed once we make the kernel handle it in rdar://43369446
if (gProcessInfo->version >= 16) {
__block bool platformFound = false;
((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
if (platformFound) {
halt("MH_EXECUTE binaries may only specify one platform");
}
gProcessInfo->platform = (uint32_t)platform;
platformFound = true;
});
if (gProcessInfo->platform == (uint32_t)dyld3::Platform::unknown) {
// There were no platforms found in the binary. This may occur on macOS for alternate toolchains and old binaries.
// It should never occur on any of our embedded platforms.
#if __MAC_OS_X_VERSION_MIN_REQUIRED
gProcessInfo->platform = (uint32_t)dyld3::Platform::macOS;
#else
halt("MH_EXECUTE binaries must specify a minimum supported OS version");
#endif
}
}
#if __MAC_OS_X_VERSION_MIN_REQUIRED
// Check to see if we need to override the platform
const char* forcedPlatform = _simple_getenv(envp, "DYLD_FORCE_PLATFORM");
if (forcedPlatform) {
if (strncmp(forcedPlatform, "6", 1) != 0) {
halt("DYLD_FORCE_PLATFORM is only supported for platform 6");
}
const dyld3::MachOFile* mf = (dyld3::MachOFile*)sMainExecutableMachHeader;
if (mf->allowsAlternatePlatform()) {
gProcessInfo->platform = PLATFORM_IOSMAC;
}
}
// if this is host dyld, check to see if iOS simulator is being run
const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
if ( (rootPath != NULL) ) {
// look to see if simulator has its own dyld
char simDyldPath[PATH_MAX];
strlcpy(simDyldPath, rootPath, PATH_MAX);
strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
int fd = my_open(simDyldPath, O_RDONLY, 0);
if ( fd != -1 ) {
const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
if ( errMessage != NULL )
halt(errMessage);
return result;
}
}
else {
((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
if ( dyld3::MachOFile::isSimulatorPlatform(platform) )
halt("attempt to run simulator program outside simulator (DYLD_ROOT_PATH not set)");
});
}
#endif
CRSetCrashLogMessage("dyld: launch started");
setContext(mainExecutableMH, argc, argv, envp, apple);
// Pickup the pointer to the exec path.
sExecPath = _simple_getenv(apple, "executable_path");
// Remove interim apple[0] transition code from dyld
if (!sExecPath) sExecPath = apple[0];
#if __IPHONE_OS_VERSION_MIN_REQUIRED && !TARGET_OS_SIMULATOR
// kernel is not passing a real path for main executable
if ( strncmp(sExecPath, "/var/containers/Bundle/Application/", 35) == 0 ) {
if ( char* newPath = (char*)malloc(strlen(sExecPath)+10) ) {
strcpy(newPath, "/private");
strcat(newPath, sExecPath);
sExecPath = newPath;
}
}
#endif
if ( sExecPath[0] != '/' ) {
// have relative path, use cwd to make absolute
char cwdbuff[MAXPATHLEN];
if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
// maybe use static buffer to avoid calling malloc so early...
char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
strcpy(s, cwdbuff);
strcat(s, "/");
strcat(s, sExecPath);
sExecPath = s;
}
}
// Remember short name of process for later logging
sExecShortName = ::strrchr(sExecPath, '/');
if ( sExecShortName != NULL )
++sExecShortName;
else
sExecShortName = sExecPath;
configureProcessRestrictions(mainExecutableMH, envp);
// Check if we should force dyld3. Note we have to do this outside of the regular env parsing due to AMFI
if ( dyld3::internalInstall() ) {
if (const char* useClosures = _simple_getenv(envp, "DYLD_USE_CLOSURES")) {
if ( strcmp(useClosures, "0") == 0 ) {
sClosureMode = ClosureMode::Off;
} else if ( strcmp(useClosures, "1") == 0 ) {
#if __MAC_OS_X_VERSION_MIN_REQUIRED
#if __i386__
// don't support dyld3 for 32-bit macOS
#else
// Also don't support dyld3 for iOSMac right now
if ( gProcessInfo->platform != PLATFORM_IOSMAC ) {
sClosureMode = ClosureMode::On;
}
#endif // __i386__
#else
sClosureMode = ClosureMode::On;
#endif // __MAC_OS_X_VERSION_MIN_REQUIRED
} else {
dyld::warn("unknown option to DYLD_USE_CLOSURES. Valid options are: 0 and 1\n");
}
}
}
#if __MAC_OS_X_VERSION_MIN_REQUIRED
if ( !gLinkContext.allowEnvVarsPrint && !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsSharedCache ) {
pruneEnvironmentVariables(envp, &apple);
// set again because envp and apple may have changed or moved
setContext(mainExecutableMH, argc, argv, envp, apple);
}
else
#endif
{
checkEnvironmentVariables(envp);
defaultUninitializedFallbackPaths(envp);
}
#if __MAC_OS_X_VERSION_MIN_REQUIRED
if ( gProcessInfo->platform == PLATFORM_IOSMAC ) {
gLinkContext.rootPaths = parseColonList("/System/iOSSupport", NULL);
gLinkContext.iOSonMac = true;
if ( sEnv.DYLD_FALLBACK_LIBRARY_PATH == sLibraryFallbackPaths )
sEnv.DYLD_FALLBACK_LIBRARY_PATH = sRestrictedLibraryFallbackPaths;
if ( sEnv.DYLD_FALLBACK_FRAMEWORK_PATH == sFrameworkFallbackPaths )
sEnv.DYLD_FALLBACK_FRAMEWORK_PATH = sRestrictedFrameworkFallbackPaths;
}
else if ( ((dyld3::MachOFile*)mainExecutableMH)->supportsPlatform(dyld3::Platform::driverKit) ) {
gLinkContext.driverKit = true;
gLinkContext.sharedRegionMode = ImageLoader::kDontUseSharedRegion;
}
#endif
if ( sEnv.DYLD_PRINT_OPTS )
printOptions(argv);
if ( sEnv.DYLD_PRINT_ENV )
printEnvironmentVariables(envp);
// Parse this envirionment variable outside of the regular logic as we want to accept
// this on binaries without an entitelment
#if !TARGET_OS_SIMULATOR
if ( _simple_getenv(envp, "DYLD_JUST_BUILD_CLOSURE") != nullptr ) {
#if TARGET_OS_IPHONE
const char* tempDir = getTempDir(envp);
if ( (tempDir != nullptr) && (geteuid() != 0) ) {
// Use realpath to prevent something like TMPRIR=/tmp/../usr/bin
char realPath[PATH_MAX];
if ( realpath(tempDir, realPath) != NULL )
tempDir = realPath;
if (strncmp(tempDir, "/private/var/mobile/Containers/", strlen("/private/var/mobile/Containers/")) == 0) {
sJustBuildClosure = true;
}
}
#endif
// If we didn't like the format of TMPDIR, just exit. We don't want to launch the app as that would bring up the UI
if (!sJustBuildClosure) {
_exit(EXIT_SUCCESS);
}
}
#endif
if ( sJustBuildClosure )
sClosureMode = ClosureMode::On;
getHostInfo(mainExecutableMH, mainExecutableSlide);
// load shared cache
checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
if ( sSharedCacheOverrideDir)
mapSharedCache();
#else
mapSharedCache();
#endif
}
// If we haven't got a closure mode yet, then check the environment and cache type
if ( sClosureMode == ClosureMode::Unset ) {
// First test to see if we forced in dyld2 via a kernel boot-arg
if ( dyld3::BootArgs::forceDyld2() ) {
sClosureMode = ClosureMode::Off;
} else if ( inDenyList(sExecPath) ) {
sClosureMode = ClosureMode::Off;
} else if ( sEnv.hasOverride ) {
sClosureMode = ClosureMode::Off;
} else if ( dyld3::BootArgs::forceDyld3() ) {
sClosureMode = ClosureMode::On;
} else {
sClosureMode = getPlatformDefaultClosureMode();
}
}
#if !TARGET_OS_SIMULATOR
if ( sClosureMode == ClosureMode::Off ) {
if ( gLinkContext.verboseWarnings )
dyld::log("dyld: not using closure because of DYLD_USE_CLOSURES or -force_dyld2=1 override\n");
} else {
const dyld3::closure::LaunchClosure* mainClosure = nullptr;
dyld3::closure::LoadedFileInfo mainFileInfo;
mainFileInfo.fileContent = mainExecutableMH;
mainFileInfo.path = sExecPath;
// FIXME: If we are saving this closure, this slice offset/length is probably wrong in the case of FAT files.
mainFileInfo.sliceOffset = 0;
mainFileInfo.sliceLen = -1;
struct stat mainExeStatBuf;
if ( ::stat(sExecPath, &mainExeStatBuf) == 0 ) {
mainFileInfo.inode = mainExeStatBuf.st_ino;
mainFileInfo.mtime = mainExeStatBuf.st_mtime;
}
// check for closure in cache first
if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
if ( gLinkContext.verboseWarnings && (mainClosure != nullptr) )
dyld::log("dyld: found closure %p (size=%lu) in dyld shared cache\n", mainClosure, mainClosure->size());
}
// We only want to try build a closure at runtime if its an iOS third party binary, or a macOS binary from the shared cache
bool allowClosureRebuilds = false;
if ( sClosureMode == ClosureMode::On ) {
allowClosureRebuilds = true;
} else if ( (sClosureMode == ClosureMode::PreBuiltOnly) && (mainClosure != nullptr) ) {
allowClosureRebuilds = true;
}
if ( (mainClosure != nullptr) && !closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) )
mainClosure = nullptr;
// If we didn't find a valid cache closure then try build a new one
if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
// if forcing closures, and no closure in cache, or it is invalid, check for cached closure
if ( !sForceInvalidSharedCacheClosureFormat )
mainClosure = findCachedLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
if ( mainClosure == nullptr ) {
// if no cached closure found, build new one
mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
}
}
// exit dyld after closure is built, without running program
if ( sJustBuildClosure )
_exit(EXIT_SUCCESS);
// try using launch closure
if ( mainClosure != nullptr ) {
CRSetCrashLogMessage("dyld3: launch started");
bool launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
mainExecutableSlide, argc, argv, envp, apple, &result, startGlue);
if ( !launched && allowClosureRebuilds ) {
// closure is out of date, build new one
mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp);
if ( mainClosure != nullptr ) {
launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
mainExecutableSlide, argc, argv, envp, apple, &result, startGlue);
}
}
if ( launched ) {
gLinkContext.startedInitializingMainExecutable = true;
#if __has_feature(ptrauth_calls)
// start() calls the result pointer as a function pointer so we need to sign it.
result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
if (sSkipMain)
result = (uintptr_t)&fake_main;
return result;
}
else {
if ( gLinkContext.verboseWarnings ) {
dyld::log("dyld: unable to use closure %p\n", mainClosure);
}
}
}
}
#endif // TARGET_OS_SIMULATOR
// could not use closure info, launch old way
// install gdb notifier
stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
// make initial allocations large enough that it is unlikely to need to be re-alloced
sImageRoots.reserve(16);
sAddImageCallbacks.reserve(4);
sRemoveImageCallbacks.reserve(4);
sAddLoadImageCallbacks.reserve(4);
sImageFilesNeedingTermination.reserve(16);
sImageFilesNeedingDOFUnregistration.reserve(8);
#if !TARGET_OS_SIMULATOR
#ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
// Add gating mechanism to dyld support system order file generation process
WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
#endif
#endif
try {
// add dyld itself to UUID list
addDyldImageToUUIDList();
#if SUPPORT_ACCELERATE_TABLES
#if __arm64e__
// Disable accelerator tables when we have threaded rebase/bind, which is arm64e executables only for now.
if (sMainExecutableMachHeader->cpusubtype == CPU_SUBTYPE_ARM64E)
sDisableAcceleratorTables = true;
#endif
bool mainExcutableAlreadyRebased = false;
if ( (sSharedCacheLoadInfo.loadAddress != nullptr) && !dylibsCanOverrideCache() && !sDisableAcceleratorTables && (sSharedCacheLoadInfo.loadAddress->header.accelerateInfoAddr != 0) ) {
struct stat statBuf;
if ( ::stat(IPHONE_DYLD_SHARED_CACHE_DIR "no-dyld2-accelerator-tables", &statBuf) != 0 )
sAllCacheImagesProxy = ImageLoaderMegaDylib::makeImageLoaderMegaDylib(&sSharedCacheLoadInfo.loadAddress->header, sSharedCacheLoadInfo.slide, mainExecutableMH, gLinkContext);
}
reloadAllImages:
#endif
#if __MAC_OS_X_VERSION_MIN_REQUIRED
gLinkContext.strictMachORequired = false;
// be less strict about old macOS mach-o binaries
((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
if ( (platform == dyld3::Platform::macOS) && (sdk >= DYLD_PACKED_VERSION(10,15,0)) ) {
gLinkContext.strictMachORequired = true;
}
});
if ( gLinkContext.iOSonMac )
gLinkContext.strictMachORequired = true;
#else
// simulators, iOS, tvOS, watchOS, are always strict
gLinkContext.strictMachORequired = true;
#endif
CRSetCrashLogMessage(sLoadingCrashMessage);
// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
#if TARGET_OS_SIMULATOR
// check main executable is not too new for this OS
{
if ( ! isSimulatorBinary((uint8_t*)mainExecutableMH, sExecPath) ) {
throwf("program was built for a platform that is not supported by this runtime");
}
uint32_t mainMinOS = sMainExecutable->minOSVersion();
// dyld is always built for the current OS, so we can get the current OS version
// from the load command in dyld itself.
uint32_t dyldMinOS = ImageLoaderMachO::minOSVersion((const mach_header*)&__dso_handle);
if ( mainMinOS > dyldMinOS ) {
#if TARGET_OS_WATCH
throwf("app was built for watchOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#elif TARGET_OS_TV
throwf("app was built for tvOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#else
throwf("app was built for iOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#endif
}
}
#endif
#if SUPPORT_ACCELERATE_TABLES
sAllImages.reserve((sAllCacheImagesProxy != NULL) ? 16 : INITIAL_IMAGE_COUNT);
#else
sAllImages.reserve(INITIAL_IMAGE_COUNT);
#endif
// Now that shared cache is loaded, setup an versioned dylib overrides
#if SUPPORT_VERSIONED_PATHS
checkVersionedPaths();
#endif
// dyld_all_image_infos image list does not contain dyld
// add it as dyldPath field in dyld_all_image_infos
// for simulator, dyld_sim is in image list, need host dyld added
#if TARGET_OS_SIMULATOR
// get path of host dyld from table of syscall vectors in host dyld
void* addressInDyld = gSyscallHelpers;
#else
// get path of dyld itself
void* addressInDyld = (void*)&__dso_handle;
#endif
char dyldPathBuffer[MAXPATHLEN+1];
int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
if ( len > 0 ) {
dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
gProcessInfo->dyldPath = strdup(dyldPathBuffer);
}
// load any inserted libraries
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at
// inserted libraries, then main, then others.
sInsertedDylibCount = sAllImages.size()-1;
// link main executable
gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
if ( mainExcutableAlreadyRebased ) {
// previous link() on main executable has already adjusted its internal pointers for ASLR
// work around that by rebasing by inverse amount
sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
}
#endif
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
if ( sMainExecutable->forceFlat() ) {
gLinkContext.bindFlat = true;
gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
}
// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
image->setNeverUnloadRecursive();
}
if ( gLinkContext.allowInterposing ) {
// only INSERTED libraries can interpose
// register interposing info after all inserted libraries are bound so chaining works
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->registerInterposing(gLinkContext);
}
}
}
if ( gLinkContext.allowInterposing ) {
// dyld should support interposition even without DYLD_INSERT_LIBRARIES
for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
ImageLoader* image = sAllImages[i];
if ( image->inSharedCache() )
continue;
image->registerInterposing(gLinkContext);
}
}
#if SUPPORT_ACCELERATE_TABLES
if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
// Accelerator tables cannot be used with implicit interposing, so relaunch with accelerator tables disabled
ImageLoader::clearInterposingTuples();
// unmap all loaded dylibs (but not main executable)
for (long i=1; i < sAllImages.size(); ++i) {
ImageLoader* image = sAllImages[i];
if ( image == sMainExecutable )
continue;
if ( image == sAllCacheImagesProxy )
continue;
image->setCanUnload();
ImageLoader::deleteImage(image);
}
// note: we don't need to worry about inserted images because if DYLD_INSERT_LIBRARIES was set we would not be using the accelerator table
sAllImages.clear();
sImageRoots.clear();
sImageFilesNeedingTermination.clear();
sImageFilesNeedingDOFUnregistration.clear();
sAddImageCallbacks.clear();
sRemoveImageCallbacks.clear();
sAddLoadImageCallbacks.clear();
sAddBulkLoadImageCallbacks.clear();
sDisableAcceleratorTables = true;
sAllCacheImagesProxy = NULL;
sMappedRangesStart = NULL;
mainExcutableAlreadyRebased = true;
gLinkContext.linkingMainExecutable = false;
resetAllImages();
goto reloadAllImages;
}
#endif
// apply interposing to initial set of images
for(int i=0; i < sImageRoots.size(); ++i) {
sImageRoots[i]->applyInterposing(gLinkContext);
}
ImageLoader::applyInterposingToDyldCache(gLinkContext);
// Bind and notify for the main executable now that interposing has been registered
uint64_t bindMainExecutableStartTime = mach_absolute_time();
sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
uint64_t bindMainExecutableEndTime = mach_absolute_time();
ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
gLinkContext.notifyBatch(dyld_image_state_bound, false);
// Bind and notify for the inserted images now interposing has been registered
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
}
}
// do weak binding only after all inserted images linked
sMainExecutable->weakBind(gLinkContext);
gLinkContext.linkingMainExecutable = false;
sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
CRSetCrashLogMessage("dyld: launch, running initializers");
#if SUPPORT_OLD_CRT_INITIALIZATION
// Old way is to run initializers via a callback from crt1.o
if ( ! gRunInitializersOldWay )
initializeMainExecutable();
#else
// run all initializers
initializeMainExecutable();
#endif
// notify any montoring proccesses that this process is about to enter main()
notifyMonitoringDyldMain();
if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
}
ARIADNEDBG_CODE(220, 1);
#if __MAC_OS_X_VERSION_MIN_REQUIRED
if ( gLinkContext.driverKit ) {
result = (uintptr_t)sEntryOveride;
if ( result == 0 )
halt("no entry point registered");
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
}
else
#endif
{
// find entry point for main executable
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
else
halt("libdyld.dylib support not present for LC_MAIN");
}
else {
// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
*startGlue = 0;
}
}
#if __has_feature(ptrauth_calls)
// start() calls the result pointer as a function pointer so we need to sign it.
result = (uintptr_t)__builtin_ptrauth_sign_unauthenticated((void*)result, 0, 0);
#endif
}
catch(const char* message) {
syncAllImages();
halt(message);
}
catch(...) {
dyld::log("dyld: launch failed\n");
}
CRSetCrashLogMessage("dyld2 mode");
#if !TARGET_OS_SIMULATOR
if (sLogClosureFailure) {
// We failed to launch in dyld3, but dyld2 can handle it. synthesize a crash report for analytics
dyld3::syntheticBacktrace("Could not generate launchClosure, falling back to dyld2", true);
}
#endif
if (sSkipMain) {
notifyMonitoringDyldMain();
if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
}
ARIADNEDBG_CODE(220, 1);
result = (uintptr_t)&fake_main;
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
}
return result;
}
这段代码较多,主要做了以下几件事情:
- 准备运行环境,准备Mach-O文件加载
- 使用
mapSharedCache
加载共享缓存 - 执行
instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath)
加载主程序 - 执行
loadInsertedDylib
加载插入的动态库 -
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1)
链接主程序 -
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1)
链接动态库 -
sMainExecutable->weakBind(gLinkContext)
绑定弱符号 -
initializeMainExecutable()
执行初始化方法 - 查找程序入口并返回
main()
initializeMainExecutable
void initializeMainExecutable()
{
// record that we've reached this step
gLinkContext.startedInitializingMainExecutable = true;
// run initialzers for any inserted dylibs
ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
initializerTimes[0].count = 0;
const size_t rootCount = sImageRoots.size();
if ( rootCount > 1 ) {
for(size_t i=1; i < rootCount; ++i) {
sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
}
}
// run initializers for main executable and everything it brings up
sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
if ( gLibSystemHelpers != NULL )
(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);
// dump info if requested
if ( sEnv.DYLD_PRINT_STATISTICS )
ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
可以看到,这个方法里先执行了动态库sImageRoots
的初始化方法runInitializers
,再执行主程序sMainExecutable
的runInitializers
。
我们继续跟进runInitializers
方法。
runInitializers
void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
uint64_t t1 = mach_absolute_time();
mach_port_t thisThread = mach_thread_self();
ImageLoader::UninitedUpwards up;
up.count = 1;
up.imagesAndPaths[0] = { this, this->getPath() };
processInitializers(context, thisThread, timingInfo, up);
context.notifyBatch(dyld_image_state_initialized, false);
mach_port_deallocate(mach_task_self(), thisThread);
uint64_t t2 = mach_absolute_time();
fgTotalInitTime += (t2 - t1);
}
这里可以看到内部调用了processInitializers
processInitializers
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
uint32_t maxImageCount = context.imageCount()+2;
ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
ImageLoader::UninitedUpwards& ups = upsBuffer[0];
ups.count = 0;
// Calling recursive init on all images in images list, building a new list of
// uninitialized upward dependencies.
for (uintptr_t i=0; i < images.count; ++i) {
images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
}
// If any upward dependencies remain, init them.
if ( ups.count > 0 )
processInitializers(context, thisThread, timingInfo, ups);
}
这里为所有的镜像执行recursiveInitialization
方法
recursiveInitialization
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
recursive_lock lock_info(this_thread);
recursiveSpinLock(lock_info);
if ( fState < dyld_image_state_dependents_initialized-1 ) {
uint8_t oldState = fState;
// break cycles
fState = dyld_image_state_dependents_initialized-1;
try {
// initialize lower level libraries first
for(unsigned int i=0; i < libraryCount(); ++i) {
ImageLoader* dependentImage = libImage(i);
if ( dependentImage != NULL ) {
// don't try to initialize stuff "above" me yet
if ( libIsUpward(i) ) {
uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
uninitUps.count++;
}
else if ( dependentImage->fDepth >= fDepth ) {
dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
}
}
}
// record termination order
if ( this->needsTermination() )
context.terminationRecorder(this);
// let objc know we are about to initialize this image
uint64_t t1 = mach_absolute_time();
fState = dyld_image_state_dependents_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
// initialize this image
bool hasInitializers = this->doInitialization(context);
// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
if ( hasInitializers ) {
uint64_t t2 = mach_absolute_time();
timingInfo.addTime(this->getShortName(), t2-t1);
}
}
catch (const char* msg) {
// this image is not initialized
fState = oldState;
recursiveSpinUnLock();
throw;
}
}
recursiveSpinUnLock();
}
这里有两个关键方法:
context.notifySingle
this->doInitialization(context)
context.notifySingle
全局搜索notifySingle
,它的实现如下
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
std::vector* handlers = stateToHandlers(state, sSingleHandlers);
if ( handlers != NULL ) {
dyld_image_info info;
info.imageLoadAddress = image->machHeader();
info.imageFilePath = image->getRealPath();
info.imageFileModDate = image->lastModified();
for (std::vector::iterator it = handlers->begin(); it != handlers->end(); ++it) {
const char* result = (*it)(state, 1, &info);
if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
//fprintf(stderr, " image rejected by handler=%p\n", *it);
// make copy of thrown string so that later catch clauses can free it
const char* str = strdup(result);
throw str;
}
}
}
if ( state == dyld_image_state_mapped ) {
// Save load addr + UUID for images from outside the shared cache
if ( !image->inSharedCache() ) {
dyld_uuid_info info;
if ( image->getUUID(info.imageUUID) ) {
info.imageLoadAddress = image->machHeader();
addNonSharedCacheImageUUID(info);
}
}
}
if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
uint64_t t0 = mach_absolute_time();
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
uint64_t t1 = mach_absolute_time();
uint64_t t2 = mach_absolute_time();
uint64_t timeInObjC = t1-t0;
uint64_t emptyTime = (t2-t1)*100;
if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
timingInfo->addTime(image->getShortName(), timeInObjC);
}
}
// mach message csdlc about dynamically unloaded images
if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
notifyKernel(*image, false);
const struct mach_header* loadAddress[] = { image->machHeader() };
const char* loadPath[] = { image->getPath() };
notifyMonitoringDyld(true, 1, loadAddress, loadPath);
}
}
它的内部调用了sNotifyObjCInit
,全局搜索sNotifyObjCInit
,并没找到他的相关定义,实际他是一个函数指针,在registerObjCNotifiers
方法中有关于它的赋值操作
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
// record functions to call
sNotifyObjCMapped = mapped;
sNotifyObjCInit = init;
sNotifyObjCUnmapped = unmapped;
// call 'mapped' function with all images mapped so far
try {
notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
}
catch (const char* msg) {
// ignore request to abort during registration
}
// call 'init' function on all images already init'ed (below libSystem)
for (std::vector::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
ImageLoader* image = *it;
if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
}
}
}
接着搜索registerObjCNotifiers
,发现它在_dyld_objc_notify_register
中被调用,然而全局搜索,我们发现并没有在dyld有调用该方法,当我们可以在dyld_priv.h
的在方法苹果写的这样一段关于registerObjCNotifiers
的注释。
//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded. During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images. During any later dlopen() call,
// dyld will also call the "mapped" function. Dyld will call the "init" function when dyld would be called
// initializers in that image. This is when objc calls any +load methods in that image.
//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped mapped,
_dyld_objc_notify_init init,
_dyld_objc_notify_unmapped unmapped);
可以看到,这是一个给objc运行时调用的方法。至此,我们得到一个结论objc通过_dyld_objc_notify_register
在dyld
中注册了一个3个回调方法:mapped、init、unmapped,而dyld
会在执行初始化方法时通过notifySingle
通知调用objc注册在这里的回调方法。
_objc_init
我们来到objc4-781
源码中搜索_dyld_objc_notify_register
,在objc-os.mm
中找到它被调用的地方
// runtime + 类的信息
void _objc_init(void)
{
static bool initialized = false;
if (initialized) return;
initialized = true;
// fixme defer initialization until an objc-using image is found?
environ_init();
tls_init();
static_init();
runtime_init();
exception_init();
cache_init();
_imp_implementationWithBlock_init();
// 什么时候调用? images 镜像文件
// map_images()
// load_images()
_dyld_objc_notify_register(&map_images, load_images, unmap_image);
#if __OBJC2__
didCallDyldNotifyRegister = true;
#endif
}
可以看到sNotifyObjCInit
赋值的就是objc
的load_images
,接着我们在objc-runtime-new.mm
中查看load_images
的实现。
void
load_images(const char *path __unused, const struct mach_header *mh)
{
if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
didInitialAttachCategories = true;
loadAllCategories();
}
// Return without taking locks if there are no +load methods here.
if (!hasLoadMethods((const headerType *)mh)) return;
recursive_mutex_locker_t lock(loadMethodLock);
// Discover load methods
{
mutex_locker_t lock2(runtimeLock);
prepare_load_methods((const headerType *)mh);
}
// Call +load methods (without runtimeLock - re-entrant)
call_load_methods();
}
继续查看call_load_methods
,
void call_load_methods(void)
{
static bool loading = NO;
bool more_categories;
loadMethodLock.assertLocked();
// Re-entrant calls do nothing; the outermost call will finish the job.
if (loading) return;
loading = YES;
void *pool = objc_autoreleasePoolPush();
do {
// 1. Repeatedly call class +loads until there aren't any more
while (loadable_classes_used > 0) {
call_class_loads();
}
// 2. Call category +loads ONCE
more_categories = call_category_loads();
// 3. Run more +loads if there are classes OR more untried categories
} while (loadable_classes_used > 0 || more_categories);
objc_autoreleasePoolPop(pool);
loading = NO;
}
接着查看call_class_loads()
static void call_class_loads(void)
{
int i;
// Detach current loadable list.
struct loadable_class *classes = loadable_classes;
int used = loadable_classes_used;
loadable_classes = nil;
loadable_classes_allocated = 0;
loadable_classes_used = 0;
// Call all +loads for the detached list.
for (i = 0; i < used; i++) {
Class cls = classes[i].cls;
load_method_t load_method = (load_method_t)classes[i].method;
if (!cls) continue;
if (PrintLoading) {
_objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
}
(*load_method)(cls, @selector(load));
}
// Destroy the detached list.
if (classes) free(classes);
}
看到(*load_method)(cls, @selector(load));
我们也就知道,为什么类方法的load会在main()之前。
那么_objc_init
是何时调用的用的呢?我们在dyld
和objc
源码库中都没找到相关调用,可以猜测这是其他库调用的,我们在一个可执行工程中打上_objc_init
的符号断点探究他的调用堆栈。
可以看到,当初始化方法执行到doInitialization
时,会调用libSystem
的libSystem_initializerr
方法,而这个方法又会调用libdispatch
的libdispatch_init
方法,libdispatch_init
又调用_os_object_init
进而调用_objc_init
,至此,关于dyld和objc的关系已经破案。
总结
本文主要通过程序的启动到main函数执行的过程,探究dyld
与objc
之间的联系,它们之间可以用下图表示: