iOS Internals: How dyld Loads an Application

Preface

Everyone knows that main is the program's entry point, but what happens before it runs? Let's take a look.

I. Preparation

1. Code

__attribute__((constructor)) void Func(){
    printf("来了 : %s \n",__func__);
}

int main(int argc, char * argv[]) {
    NSString * appDelegateClassName;
    
    NSLog(@"1223333");
    
    @autoreleasepool {
        // Setup code that might create autoreleased objects goes here.
        appDelegateClassName = NSStringFromClass([AppDelegate class]);
    }
    return UIApplicationMain(argc, argv, nil, appDelegateClassName);
}

// In ViewController.m:
+ (void)load{
    NSLog(@"%s",__func__);
}

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
}

Run it and check the output:

2021-07-09 14:34:17.783102+0800 002-应用程加载分析[5229:357059] +[ViewController load]
来了 : Func 
2021-07-09 14:34:17.783813+0800 002-应用程加载分析[5229:357059] 1223333

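The output shows that the execution order is +load -> the constructor function (the one marked __attribute__((constructor)), often loosely called a "C++ function") -> main. Why is that?
I looked up some references.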
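The "runs before main" behaviour can also be reproduced outside Xcode. The following is a minimal sketch in C++, assuming a GCC or Clang toolchain (the constructor attribute is a compiler extension). The names callLog and earlyHook are made up for illustration; the hook only records that it ran, so that code in main can check the log afterwards.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: a log shared by code that runs before and after main.
// A function-local static avoids depending on global initialization order.
std::vector<std::string>& callLog() {
    static std::vector<std::string> log;
    return log;
}

// GCC/Clang extension: the startup code calls this before main is entered.
__attribute__((constructor)) static void earlyHook() {
    callLog().push_back("constructor");
}
```

By the time main starts, callLog() should already contain the "constructor" entry, which mirrors the order seen in the log above.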

2. References

Introduction to dyld

Source code

dyld
objc
libdispatch
libsystem

How an application is built

From the packaging point of view

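  • Write build information to auxiliary files and create the .app bundle structure
  • Process the packaging information
  • Run the CocoaPods pre-build script (Check Pods Manifest.lock)
  • Compile the .m files with CompileC and clang
  • Link the required frameworks
  • Compile xib files
  • Copy xib files and resource files
  • Compile the image assets
  • Process Info.plist
  • Run the CocoaPods scripts
  • Copy the standard libraries
  • Create the .app file and sign it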

Reference: http://www.cocoachina.com/articles/21888

xcrun -sdk iphoneos clang -arch armv7 -F Foundation -fobjc-arc -c main.m -o main.o
xcrun -sdk iphoneos clang main.o -arch armv7 -fobjc-arc -framework Foundation -o main
# This still does not show everything clang does; use -E to see what happens in the preprocessing step.
clang -E main.m
# After it runs, the output looks like this:
# 1 "/System/Library/Frameworks/Foundation.framework/Headers/FoundationLegacySwiftCompatibility.h" 1 3
# 185 "/System/Library/Frameworks/Foundation.framework/Headers/Foundation.h" 2 3
# 2 "main.m" 2
int main(){
    @autoreleasepool {
        int eight = 8;
        int six = 6;
        NSString* site = [[NSString alloc] initWithUTF8String:"starming"];
        int rank = eight + six;
        NSLog(@"%@ rank %d", site, rank);
    }
    return 0;
}
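# This step handles macro substitution, header imports, and directives such as #if. After preprocessing comes lexical analysis, which splits the code into tokens such as brackets, braces, equals signs, and strings.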
clang -fmodules -fsyntax-only -Xclang -dump-tokens main.m
# Next is syntax analysis, which checks that the syntax is valid and assembles all nodes into an abstract syntax tree (AST).
clang -fmodules -fsyntax-only -Xclang -ast-dump main.m
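# After these steps, IR generation can begin. CodeGen walks the syntax tree top-down and translates it step by step into LLVM IR, which is the output of the compiler front end and the input of the back end.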
clang -S -fobjc-arc -emit-llvm main.m -o main.ll
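# LLVM performs optimizations here. The optimization level (-O1, -O3, -Os) can be set in Xcode's build settings, and you can also write your own passes.
# A pass is one unit of LLVM's optimization work; each pass does a little, and together they make up LLVM's complete optimization and transformation pipeline.
# If bitcode is enabled, Apple can optimize further, and when a new back-end architecture appears the optimized bitcode can still be used to generate code for it.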
clang -emit-llvm -c main.m -o main.bc
# Generate assembly
clang -S -fobjc-arc main.m -o main.s
# Generate the object file
clang -fmodules -c main.m -o main.o
# Generate the executable so it can be run to see the output
clang main.o -o main
# Run it
./main
# Output
starming rank 14
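
Compilation process
Source files (.h, .m, .cpp, ...) -> preprocessing -> compilation -> assembly -> linking (with static and dynamic libraries) -> executable.
Source files: load the .h, .m, .cpp and other files
Preprocessing: expand macros, strip comments, expand header includes; produces .i files
Compilation: translate the .i files into assembly; produces .s files
Assembly: translate the assembly into machine code; produces .o files
Linking: resolve the references that the .o files make to other libraries and produce the final executable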
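
Dynamic libraries

A dynamic library is not copied into the target program at build time. The program only stores a reference to the library, and the library is loaded when the program runs. Examples: .so, .dylib, .framework, .dll

  • Pros: a smaller app package, memory shared between processes, and the program can be updated by updating the library
  • Cons: loading at run time costs some performance, and the program depends on its environment; if a library is missing or has the wrong version, the program cannot run

Static libraries

At link time, the object files produced by the assembler are linked together with the referenced library into the executable. The static library no longer changes after that, because a copy of it was placed directly into the target program at build time. Examples: .a, .lib

  • Pros: once the build finishes, the library file itself is no longer needed; the target program has no external dependency and can run directly
  • Cons: the same static library may end up copied into more than one binary, which increases the size of the target program and costs memory and load time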

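dyld

dyld (the dynamic link editor) is Apple's dynamic linker and a core part of Apple's operating systems. Once an app has been compiled and packaged into a Mach-O executable, dyld is responsible for linking and loading it at launch.

  • What dyld does: it loads each library (each one is called an image) and records it in its image table, instantiates the main executable, links the main executable against its dynamic libraries, and then runs the initializers.

II. Exploring dyld

Set a breakpoint in the +load method and run the project: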

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000105d2ae57 002-应用程加载分析`+[ViewController load](self=ViewController, _cmd="load") at ViewController.m:17:5
    frame #1: 0x0000000106542ff2 libobjc.A.dylib`load_images + 1439
    frame #2: 0x0000000105d3ee2c dyld_sim`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 425
    frame #3: 0x0000000105d4dba5 dyld_sim`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 437
    frame #4: 0x0000000105d4bec7 dyld_sim`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 191
    frame #5: 0x0000000105d4bf68 dyld_sim`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
    frame #6: 0x0000000105d3f26b dyld_sim`dyld::initializeMainExecutable() + 199
    frame #7: 0x0000000105d43f56 dyld_sim`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 4789
    frame #8: 0x0000000105d3e1c2 dyld_sim`start_sim + 122
    frame #9: 0x0000000112b84a88 dyld`dyld::useSimulatorDyld(int, macho_header const*, char const*, int, char const**, char const**, char const**, unsigned long*, unsigned long*) + 2093
    frame #10: 0x0000000112b82162 dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 1198
    frame #11: 0x0000000112b7c224 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 450
    frame #12: 0x0000000112b7c025 dyld`_dyld_start + 37

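A backtrace lists the most recent frame first, so reading it from the bottom up gives the call order.
Call flow: _dyld_start -> dyldbootstrap::start -> dyld::_main -> dyld::initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> dyld::notifySingle -> load_images -> +[ViewController load]
(On the simulator, as in the backtrace above, dyld::_main first hands off to dyld_sim through useSimulatorDyld and start_sim.)
Turn on assembly debugging and step into _dyld_start. It calls:

dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*)

This confirms the beginning of the call flow above. Next, verify it further, starting from:

_dyld_start

Open the dyld source and search for this function:
[Image 1 of the original post]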

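dyldbootstrap::start:

[Image 2 of the original post]
What does this code do?
dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

  • Emits a kdebug trace marker recording that dyld bootstrap has started (the source comment cites rdar://46878536, an Apple-internal bug-tracker ID)
  • Rebases dyld itself, fixing up the pointers in dyld's own data segment so that it can run at the address where it was loaded
  • Finishes the remaining setup and calls dyld::_main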

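A typical Mach-O file has three regions:

  • Header: basic information about the Mach-O file, including the platform, the file type, the number of load commands, the total size of the load commands, the flags used by dyld, and so on.
  • Load Commands: these follow the header directly. They describe the memory layout to use when the Mach-O file is loaded and guide both the kernel loader and the dynamic linker.
  • Data: the actual contents of each segment, that is, the code, data, and so on.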
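To make the Header / Load Commands layout concrete, here is a small C++ sketch that walks the load commands of a 64-bit Mach-O image held in memory. The two structs are simplified mirrors of mach_header_64 and load_command from <mach-o/loader.h>, so the sketch compiles without Apple headers; the function name listLoadCommands is hypothetical, and byte-swapped and 32-bit images are deliberately ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Simplified mirrors of mach_header_64 and load_command from <mach-o/loader.h>.
struct MachHeader64 {
    uint32_t magic;       // 0xFEEDFACF for a 64-bit Mach-O
    int32_t  cputype;
    int32_t  cpusubtype;
    uint32_t filetype;
    uint32_t ncmds;       // number of load commands
    uint32_t sizeofcmds;  // total size of all load commands in bytes
    uint32_t flags;
    uint32_t reserved;
};

struct LoadCommand {
    uint32_t cmd;      // command type, e.g. LC_SEGMENT_64
    uint32_t cmdsize;  // size of this command including its payload
};

constexpr uint32_t kMagic64 = 0xFEEDFACF;

// Hypothetical helper: returns the type of each load command after the header.
std::vector<uint32_t> listLoadCommands(const uint8_t* image, size_t size) {
    MachHeader64 header;
    if (size < sizeof(header)) throw std::runtime_error("truncated header");
    std::memcpy(&header, image, sizeof(header));
    if (header.magic != kMagic64) throw std::runtime_error("not a 64-bit Mach-O");

    std::vector<uint32_t> types;
    size_t offset = sizeof(header);  // load commands start right after the header
    for (uint32_t i = 0; i < header.ncmds; ++i) {
        LoadCommand lc;
        if (size - offset < sizeof(lc)) throw std::runtime_error("truncated load command");
        std::memcpy(&lc, image + offset, sizeof(lc));
        if (lc.cmdsize < sizeof(lc) || lc.cmdsize > size - offset)
            throw std::runtime_error("bad cmdsize");
        types.push_back(lc.cmd);
        offset += lc.cmdsize;  // each command records its own length
    }
    return types;
}
```

The walk relies only on ncmds and each command's cmdsize, which is the same stepping pattern dyld uses later in doImageInit and doModInitFunctions.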

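dyld::_main

Part of the _main code (the function is 600+ lines, too long to show in full):
[Image 3 of the original post]

Let's walk through the flow directly.

Read the dyld flags that the kernel passed in:

dyld3::BootArgs::setFlags(hexToUInt64(_simple_getenv(apple, "dyld_flags"), nullptr));
  • Next, read the main executable's cdHash from the kernel-supplied apple parameters, then call getHostInfo with the main executable's Mach-O header and slide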
	uint8_t mainExecutableCDHashBuffer[20];
	const uint8_t* mainExecutableCDHash = nullptr;
	if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
		unsigned bufferLenUsed;
		if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
			mainExecutableCDHash = mainExecutableCDHashBuffer;
	}
	getHostInfo(mainExecutableMH, mainExecutableSlide);
  • Outside the simulator, notify the kernel (for tracing) that dyld and the main executable have been loaded
#if !TARGET_OS_SIMULATOR
	// Trace dyld's load
	notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
	// Trace the main executable's load
	notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif

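So far dyld has only been preparing according to the current runtime environment. Reading on:

  • Check whether the shared region is disabled, and map the shared cache (which holds system libraries such as UIKit and CoreFoundation) into the process.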
if ( sJustBuildClosure )
		sClosureMode = ClosureMode::On;

	// load shared cache
	checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
	if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
		if ( sSharedCacheOverrideDir)
			// map the shared cache
			mapSharedCache(mainExecutableSlide);
#else
		mapSharedCache(mainExecutableSlide);
#endif

mapSharedCache (maps the shared cache)
// iOS always requires the shared cache
static void mapSharedCache(uintptr_t mainExecutableSlide)
{
	dyld3::SharedCacheOptions opts;
	opts.cacheDirOverride	= sSharedCacheOverrideDir;
	opts.forcePrivate		= (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion);
#if __x86_64__ && !TARGET_OS_SIMULATOR
	opts.useHaswell			= sHaswell;
#else
	opts.useHaswell			= false;
#endif
	opts.verbose			= gLinkContext.verboseMapping;
    // <rdar://problem/32031197> respect -disable_aslr boot-arg
    // <rdar://problem/56299169> kern.bootargs is now blocked
	opts.disableASLR		= (mainExecutableSlide == 0) && dyld3::internalInstall(); // infer ASLR is off if main executable is not slid
	loadDyldCache(opts, &sSharedCacheLoadInfo);
}

loadDyldCache (loads the shared cache)
bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
{
    results->loadAddress        = 0;
    results->slide              = 0;
    results->errorMessage       = nullptr;

#if TARGET_OS_SIMULATOR
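    // the simulator only supports mmap()ing the cache privately into the process
    return mapCachePrivate(options, results);
#else
    if ( options.forcePrivate ) {
        // mmap the cache into this process only
        return mapCachePrivate(options, results);
    }
    else {
        // fast path: the cache is already mapped into the shared region, nothing more to do
        bool hasError = false;
        if ( reuseExistingCache(options, results) ) {
            hasError = (results->errorMessage != nullptr);
        } else {
            // slow path: this is the first process to load the cache, so call mapCacheSystemWide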
            hasError = mapCacheSystemWide(options, results);
        }
        return hasError;
    }
#endif
}
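
The three branches of loadDyldCache can be summarized as a small decision function. This is only a model written for this article, not dyld code: CacheQuery and chooseCachePath are hypothetical names, and the three booleans stand in for TARGET_OS_SIMULATOR, options.forcePrivate, and the result of reuseExistingCache.

```cpp
#include <string>

// Hypothetical inputs standing in for TARGET_OS_SIMULATOR, options.forcePrivate
// and the return value of reuseExistingCache().
struct CacheQuery {
    bool simulator;
    bool forcePrivate;
    bool alreadyMapped;
};

// Returns the name of the dyld function that ends up providing the cache.
std::string chooseCachePath(const CacheQuery& q) {
    if (q.simulator || q.forcePrivate) return "mapCachePrivate";  // private mmap only
    if (q.alreadyMapped) return "reuseExistingCache";             // fast path
    return "mapCacheSystemWide";                                  // first process to load the cache
}
```

For example, chooseCachePath({false, false, true}) returns "reuseExistingCache", the fast path taken by almost every process on a device.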

instantiateFromLoadedImage (instantiates the main executable)
// Call instantiateFromLoadedImage to create an ImageLoader object for the main executable
	CRSetCrashLogMessage(sLoadingCrashMessage);
	// instantiate ImageLoader for main executable
	sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
	gLinkContext.mainExecutable = sMainExecutable;
	gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
	
// The kernel maps in the main executable before dyld gets control.
// We need to make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
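        // build an ImageLoader from the main executable's Mach-O header
	ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        // add the image to sAllImages, so the first entry in sAllImages is the main executable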
	addImage(image);
	return (ImageLoaderMachO*)image;
}

Walk the DYLD_INSERT_LIBRARIES environment variable and call loadInsertedDylib for each entry.

		char dyldPathBuffer[MAXPATHLEN+1];
		int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
		if ( len > 0 ) {
			dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
			if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
				gProcessInfo->dyldPath = strdup(dyldPathBuffer);
		}

		// load any inserted libraries
		if	( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
			for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
				loadInsertedDylib(*lib);
		}
		// record count of inserted libraries so that a flat search will look at 
		// inserted libraries, then main, then others.
		sInsertedDylibCount = sAllImages.size()-1;

		// link main executable
		gLinkContext.linkingMainExecutable = true;

Link the main executable, then the inserted dylibs

link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
		sMainExecutable->setNeverUnloadRecursive();
		if ( sMainExecutable->forceFlat() ) {
			gLinkContext.bindFlat = true;
			gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
		}

		// link any inserted libraries
		// do this after linking main executable so that any dylibs pulled in by inserted 
		// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
		if ( sInsertedDylibCount > 0 ) {
			for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
				ImageLoader* image = sAllImages[i+1];
				link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
				image->setNeverUnloadRecursive();
			}
			if ( gLinkContext.allowInterposing ) {
				// only INSERTED libraries can interpose
				// register interposing info after all inserted libraries are bound so chaining works
				for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
					ImageLoader* image = sAllImages[i+1];
					image->registerInterposing(gLinkContext);
				}
			}
		}

		if ( gLinkContext.allowInterposing ) {
			// <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
			for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
				ImageLoader* image = sAllImages[i];
				if ( image->inSharedCache() )
					continue;
				image->registerInterposing(gLinkContext);
			}
		}

Apply interposing to the initial set of images, then bind the main executable and the inserted images

// apply interposing to initial set of images
	for(int i=0; i < sImageRoots.size(); ++i) {
		sImageRoots[i]->applyInterposing(gLinkContext);
	}
	ImageLoader::applyInterposingToDyldCache(gLinkContext);

	// Bind and notify for the main executable now that interposing has been registered
	uint64_t bindMainExecutableStartTime = mach_absolute_time();
	sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
	uint64_t bindMainExecutableEndTime = mach_absolute_time();
	ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
	gLinkContext.notifyBatch(dyld_image_state_bound, false);

	// Bind and notify for the inserted images now interposing has been registered
	if ( sInsertedDylibCount > 0 ) {
		for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
			ImageLoader* image = sAllImages[i+1];
			image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
		}
	}
	
	// <rdar://problem/12186933> do weak binding only after all inserted images linked
	sMainExecutable->weakBind(gLinkContext);
	gLinkContext.linkingMainExecutable = false;

	sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);

  • Run the initializers
	initializeMainExecutable(); 
  • Read the LC_MAIN entry point from the load commands; if it is absent, fall back to LC_UNIXTHREAD. This is how execution reaches the familiar main function.
// find entry point for main executable
	result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
	if ( result != 0 ) {
		// main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
		if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
			*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
		else
			halt("libdyld.dylib support not present for LC_MAIN");
	}
	else {
		// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
		result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
		*startGlue = 0;
	}
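
The fallback between the two load commands can be modelled in a few lines. This is a sketch with made-up names (EntryChoice, pickEntry), not dyld's API; as in the excerpt above, 0 is taken to mean "this load command is not present".

```cpp
#include <cstdint>

struct EntryChoice {
    uint64_t address;    // where execution should begin
    bool     needsGlue;  // true when a libdyld helper has to call main() and then exit()
};

// Hypothetical model: prefer the LC_MAIN entry, otherwise fall back to LC_UNIXTHREAD.
EntryChoice pickEntry(uint64_t lcMainEntry, uint64_t unixThreadEntry) {
    if (lcMainEntry != 0) return {lcMainEntry, true};
    return {unixThreadEntry, false};
}
```

The needsGlue flag corresponds to whether startGlue is set: with LC_MAIN dyld returns the address of main itself, so something else has to arrange the call to exit afterwards.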

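Flow: configure the environment -> map the shared cache -> instantiate the main executable -> load inserted dylibs -> link the main executable -> link inserted dylibs -> run initializers -> main

Instantiating the main executable

  • The main executable is held in sMainExecutable, which is created by instantiateFromLoadedImage. The source: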
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
	// try mach-o loader
//	if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
		ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
		addImage(image);
		return (ImageLoaderMachO*)image;
//	}
	
//	throw "main executable not a known format";
}
  • This function creates an ImageLoader instance by calling instantiateMainExecutable.
  • instantiateMainExecutable creates the image for the main executable and returns it as an ImageLoader object. Inside it, sniffLoadCommands reads the Mach-O load commands and validates them.
// determine if this mach-o file has classic or compressed LINKEDIT and number of segments it has
void ImageLoaderMachO::sniffLoadCommands(const macho_header* mh, const char* path, bool inCache, bool* compressed,
											unsigned int* segCount, unsigned int* libCount, const LinkContext& context,
											const linkedit_data_command** codeSigCmd,
											const encryption_info_command** encryptCmd)
{
	*compressed = false;
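        // number of segments
	*segCount = 0;
        // number of dependent libraries
	*libCount = 0;
        // code signature load command
	*codeSigCmd = NULL;
        // encryption info load command
	*encryptCmd = NULL;
        
        // (excerpt: the loop that walks the load commands is omitted here)
        
        // reject images with more than 255 segments
	if ( *segCount > 255 )
		dyld::throwf("malformed mach-o image: more than 255 segments in %s", path);

	// reject images with more than 4095 dependent libraries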
	if ( *libCount > 4095 )
		dyld::throwf("malformed mach-o image: more than 4095 dependent libraries in %s", path);
}
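
The counting that the excerpt skips can be sketched as follows. This is a simplified illustration, not the dyld implementation: sniffCounts and SniffResult are hypothetical names, the input is just a list of load-command types, and only LC_SEGMENT_64 and LC_LOAD_DYLIB are counted (the real function also handles other dylib-loading commands and many more checks).

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr uint32_t kLC_LOAD_DYLIB = 0xC;   // LC_LOAD_DYLIB in <mach-o/loader.h>
constexpr uint32_t kLC_SEGMENT_64 = 0x19;  // LC_SEGMENT_64 in <mach-o/loader.h>

struct SniffResult {
    unsigned segCount = 0;
    unsigned libCount = 0;
};

// Counts segments and dependent dylibs, then applies the same limits as the excerpt.
SniffResult sniffCounts(const std::vector<uint32_t>& commandTypes) {
    SniffResult r;
    for (uint32_t cmd : commandTypes) {
        if (cmd == kLC_SEGMENT_64) ++r.segCount;
        else if (cmd == kLC_LOAD_DYLIB) ++r.libCount;
    }
    if (r.segCount > 255) throw std::runtime_error("more than 255 segments");
    if (r.libCount > 4095) throw std::runtime_error("more than 4095 dependent libraries");
    return r;
}
```

The two limits differ (255 segments, 4095 libraries), which is easy to miss when skimming the source.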
  • The source of instantiateMainExecutable:
// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
	//dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
	//	sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
	bool compressed;
	unsigned int segCount;
	unsigned int libCount;
	const linkedit_data_command* codeSigCmd;
	const encryption_info_command* encryptCmd;
	sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
	// instantiate concrete class based on content of load commands
	if ( compressed ) 
		return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
	else
#if SUPPORT_CLASSIC_MACHO
		return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
		throw "missing LC_DYLD_INFO load command";
#endif
}

Running the main executable's initializers

  • The analysis so far has traced this flow: _dyld_start -> dyldbootstrap::start -> dyld::_main. To keep following it, step into initializeMainExecutable:
void initializeMainExecutable()
{
	// record that we've reached this step
	gLinkContext.startedInitializingMainExecutable = true;

	// run initialzers for any inserted dylibs
	ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
	initializerTimes[0].count = 0;
	const size_t rootCount = sImageRoots.size();
	if ( rootCount > 1 ) {
		for(size_t i=1; i < rootCount; ++i) {
			sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
		}
	}
	
	// run initializers for main executable and everything it brings up 
	sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
	
	// register cxa_atexit() handler to run static terminators in all loaded images when this process exits
	if ( gLibSystemHelpers != NULL ) 
		(*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

	// dump info if requested
	if ( sEnv.DYLD_PRINT_STATISTICS )
		ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
	if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
		ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}
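  • initializeMainExecutable first runs the initializers of any inserted dylibs and then those of the main executable. Both go through runInitializers, which calls processInitializers: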
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
									 InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
	uint32_t maxImageCount = context.imageCount()+2;
	ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
	ImageLoader::UninitedUpwards& ups = upsBuffer[0];
	ups.count = 0;
	// Calling recursive init on all images in images list, building a new list of
	// uninitialized upward dependencies.
	for (uintptr_t i=0; i < images.count; ++i) {
		images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
	}
	// If any upward dependencies remain, init them.
	if ( ups.count > 0 )
		processInitializers(context, thisThread, timingInfo, ups);
}


void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
	uint64_t t1 = mach_absolute_time();
	mach_port_t thisThread = mach_thread_self();
	ImageLoader::UninitedUpwards up;
	up.count = 1;
	up.imagesAndPaths[0] = { this, this->getPath() };
	processInitializers(context, thisThread, timingInfo, up);
	context.notifyBatch(dyld_image_state_initialized, false);
	mach_port_deallocate(mach_task_self(), thisThread);
	uint64_t t2 = mach_absolute_time();
	fgTotalInitTime += (t2 - t1);
}
  • The core of runInitializers is processInitializers, which calls recursiveInitialization on every image in the list to initialize it recursively. The source of recursiveInitialization:
void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
										  InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
	recursive_lock lock_info(this_thread);
	recursiveSpinLock(lock_info);

	if ( fState < dyld_image_state_dependents_initialized-1 ) {
		uint8_t oldState = fState;
		// break cycles
		fState = dyld_image_state_dependents_initialized-1;
		try {
			// initialize lower level libraries first
			for(unsigned int i=0; i < libraryCount(); ++i) {
				ImageLoader* dependentImage = libImage(i);
				if ( dependentImage != NULL ) {
					// don't try to initialize stuff "above" me yet
					if ( libIsUpward(i) ) {
						uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
						uninitUps.count++;
					}
					else if ( dependentImage->fDepth >= fDepth ) {
						dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
					}
                }
			}
			
			// record termination order
			if ( this->needsTermination() )
				context.terminationRecorder(this);

			// let objc know we are about to initialize this image
			uint64_t t1 = mach_absolute_time();
			fState = dyld_image_state_dependents_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
			
			// initialize this image
			bool hasInitializers = this->doInitialization(context);

			// let anyone know we finished initializing this image
			fState = dyld_image_state_initialized;
			oldState = fState;
			context.notifySingle(dyld_image_state_initialized, this, NULL);
			
			if ( hasInitializers ) {
				uint64_t t2 = mach_absolute_time();
				timingInfo.addTime(this->getShortName(), t2-t1);
			}
		}
		catch (const char* msg) {
			// this image is not initialized
			fState = oldState;
			recursiveSpinUnLock();
			throw;
		}
	}
	
	recursiveSpinUnLock();
}
  • The load flow so far:
    _dyld_start -> dyldbootstrap::start -> dyld::_main -> initializeMainExecutable -> runInitializers -> processInitializers -> recursiveInitialization (recursive).
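
The key property of recursiveInitialization is that an image's dependencies are initialized before the image itself, and that a state check stops dependency cycles from recursing forever. The sketch below models just that idea; Image, ImageTable and recursiveInit are hypothetical names, and upward dependencies, depth checks, locking and error handling are left out.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an ImageLoader: the images it depends on plus a state flag.
struct Image {
    std::vector<std::string> deps;
    bool started = false;  // plays the role of the fState check that breaks cycles
};

using ImageTable = std::map<std::string, Image>;

// Depth-first: initialize the dependencies first, then the image itself.
void recursiveInit(ImageTable& images, const std::string& name,
                   std::vector<std::string>& order) {
    Image& img = images[name];
    if (img.started) return;  // already initialized, or currently being initialized
    img.started = true;       // mark before recursing so a dependency cycle terminates
    for (const std::string& dep : img.deps)
        recursiveInit(images, dep, order);
    order.push_back(name);    // this image's own initializers would run here
}
```

With an app that depends on UIKit and libobjc, both of which depend on libSystem, the recorded order starts with libSystem and ends with the app, which is why libSystem's initializer always runs first.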

notifySingle
	// let objc know we are about to initialize this image
	uint64_t t1 = mach_absolute_time();
	fState = dyld_image_state_dependents_initialized;
	oldState = fState;
	context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
	
	// initialize this image
	bool hasInitializers = this->doInitialization(context);

	// let anyone know we finished initializing this image
	fState = dyld_image_state_initialized;
	oldState = fState;
	context.notifySingle(dyld_image_state_initialized, this, NULL);

In short, this part of recursiveInitialization notifies observers that the image is about to be initialized, runs the image's initializers, and then notifies them that initialization has finished.

  • Now look at notifySingle:
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
	//dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
	std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
	if ( handlers != NULL ) {
		dyld_image_info info;
		info.imageLoadAddress	= image->machHeader();
		info.imageFilePath		= image->getRealPath();
		info.imageFileModDate	= image->lastModified();
		for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
			const char* result = (*it)(state, 1, &info);
			if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
				//fprintf(stderr, "  image rejected by handler=%p\n", *it);
				// make copy of thrown string so that later catch clauses can free it
				const char* str = strdup(result);
				throw str;
			}
		}
	}
	if ( state == dyld_image_state_mapped ) {
		// <rdar://problem/7008875> Save load addr + UUID for images from outside the shared cache
		// <rdar://problem/50432671> Include UUIDs for shared cache dylibs in all image info when using private mapped shared caches
		if (!image->inSharedCache()
			|| (gLinkContext.sharedRegionMode == ImageLoader::kUsePrivateSharedRegion)) {
			dyld_uuid_info info;
			if ( image->getUUID(info.imageUUID) ) {
				info.imageLoadAddress = image->machHeader();
				addNonSharedCacheImageUUID(info);
			}
		}
	}
	if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
		uint64_t t0 = mach_absolute_time();
		dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
		(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		uint64_t t1 = mach_absolute_time();
		uint64_t t2 = mach_absolute_time();
		uint64_t timeInObjC = t1-t0;
		uint64_t emptyTime = (t2-t1)*100;
		if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
			timingInfo->addTime(image->getShortName(), timeInObjC);
		}
	}
    // mach message csdlc about dynamically unloaded images
	if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
		notifyKernel(*image, false);
		const struct mach_header* loadAddress[] = { image->machHeader() };
		const char* loadPath[] = { image->getPath() };
		notifyMonitoringDyld(true, 1, loadAddress, loadPath);
	}
}

Reading the code patiently, we find this declaration:

static _dyld_objc_notify_init		sNotifyObjCInit;
registerObjCNotifiers
// _dyld_objc_notify_init
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
	// record functions to call
	sNotifyObjCMapped	= mapped;
	sNotifyObjCInit		= init;
	sNotifyObjCUnmapped = unmapped;

	// call 'mapped' function with all images mapped so far
	try {
		notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
	}
	catch (const char* msg) {
		// ignore request to abort during registration
	}

	// <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
	for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
		ImageLoader* image = *it;
		if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
			dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
			(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
		}
	}
}
_dyld_objc_notify_register
// _dyld_objc_notify_register
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
	dyld::registerObjCNotifiers(mapped, init, unmapped);
}

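Searching the dyld source for callers of _dyld_objc_notify_register turns up many matches, but none of them actually calls it.
Then this comment stands out:

// Also note, this function must be called after _dyld_objc_notify_register.

So I created a new project, set a breakpoint on _dyld_objc_notify_register, and found the caller:
[Images 4 to 8 of the original post: screenshots from the breakpoint session]
This connects with the flow traced earlier.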

doInitialization
bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
	CRSetCrashLogMessage2(this->getPath());

	// mach-o has -init and static initializers
	doImageInit(context);
	doModInitFunctions(context);
	
	CRSetCrashLogMessage2(NULL);
	
	return (fHasDashInit || fHasInitializers);
}
As the dynamic linker, dyld is responsible for loading dynamic libraries, and libobjc.A.dylib is one of them. Inside doInitialization, first look at doImageInit:
void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
	if ( fHasDashInit ) {
		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
		const struct load_command* cmd = cmds;
		for (uint32_t i = 0; i < cmd_count; ++i) {
			switch (cmd->cmd) {
				case LC_ROUTINES_COMMAND:
					Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
#if __has_feature(ptrauth_calls)
					func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
					// <rdar://problem/8543820&9228031> verify initializers are in image
					if ( ! this->containsAddress(stripPointer((void*)func)) ) {
						dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
					}
					if ( ! dyld::gProcessInfo->libSystemInitialized ) {
						// <rdar://problem/17973316> libSystem initializer must run first
						dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
					}
					if ( context.verboseInit )
						dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
					{
						dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
						func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
					}
					break;
			}
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}
doModInitFunctions runs the image's C++ static initializers and constructor functions:
void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
	if ( fHasInitializers ) {
		const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
		const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
		const struct load_command* cmd = cmds;
		for (uint32_t i = 0; i < cmd_count; ++i) {
			if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
				const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
				const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
				const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
				for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
					const uint8_t type = sect->flags & SECTION_TYPE;
					if ( type == S_MOD_INIT_FUNC_POINTERS ) {
						Initializer* inits = (Initializer*)(sect->addr + fSlide);
						const size_t count = sect->size / sizeof(uintptr_t);
						// <rdar://problem/23929217> Ensure __mod_init_func section is within segment
						if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
							dyld::throwf("__mod_init_funcs section has malformed address range for %s\n", this->getPath());
						for (size_t j=0; j < count; ++j) {
							Initializer func = inits[j];
							// <rdar://problem/8543820&9228031> verify initializers are in image
							if ( ! this->containsAddress(stripPointer((void*)func)) ) {
								dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
							}
							if ( ! dyld::gProcessInfo->libSystemInitialized ) {
								// <rdar://problem/17973316> libSystem initializer must run first
								const char* installPath = getInstallPath();
								if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
									dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
							}
							if ( context.verboseInit )
								dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
							bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
							{
								dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
								func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
							}
							bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
							if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
								// now safe to use malloc() and other calls in libSystem.dylib
								dyld::gProcessInfo->libSystemInitialized = true;
							}
						}
					}
					else if ( type == S_INIT_FUNC_OFFSETS ) {
						const uint32_t* inits = (uint32_t*)(sect->addr + fSlide);
						const size_t count = sect->size / sizeof(uint32_t);
						// Ensure section is within segment
						if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
							dyld::throwf("__init_offsets section has malformed address range for %s\n", this->getPath());
						if ( seg->initprot & VM_PROT_WRITE )
							dyld::throwf("__init_offsets section is not in read-only segment %s\n", this->getPath());
						for (size_t j=0; j < count; ++j) {
							uint32_t funcOffset = inits[j];
							// verify initializers are in image
							if ( ! this->containsAddress((uint8_t*)this->machHeader() + funcOffset) ) {
								dyld::throwf("initializer function offset 0x%08X not in mapped image for %s\n", funcOffset, this->getPath());
							}
							if ( ! dyld::gProcessInfo->libSystemInitialized ) {
								// <rdar://problem/17973316> libSystem initializer must run first
								const char* installPath = getInstallPath();
								if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
									dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
							}
                            Initializer func = (Initializer)((uint8_t*)this->machHeader() + funcOffset);
							if ( context.verboseInit )
								dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
#if __has_feature(ptrauth_calls)
							func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
							bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
							{
								dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
							bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
							if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
								// now safe to use malloc() and other calls in libSystem.dylib
								dyld::gProcessInfo->libSystemInitialized = true;
							}
						}
					}
				}
			}
			cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
		}
	}
}
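
Putting notifySingle and doInitialization together explains the order seen at the start of the article: for the main executable, the ObjC callback (which reaches load_images and therefore +load) fires before the image's own initializer functions run, and main only starts after dyld returns. The following C++ sketch models that ordering with made-up names (ImageModel, notifyObjCInit, initializeImage); it illustrates the sequence and is not dyld's code.

```cpp
#include <functional>
#include <string>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical model of one image: its constructor-style functions,
// comparable to the entries of the __mod_init_func section.
struct ImageModel {
    std::vector<std::function<void(Log&)>> modInitFuncs;
};

// Plays the role of sNotifyObjCInit; empty until something registers a callback.
std::function<void(Log&)> notifyObjCInit;

// Mirrors the order inside recursiveInitialization: notify first, then initialize.
void initializeImage(const ImageModel& image, Log& log) {
    if (notifyObjCInit) notifyObjCInit(log);                // notifySingle -> load_images -> +load
    for (const auto& init : image.modInitFuncs) init(log);  // doModInitFunctions
}
```

Registering a callback that logs "+load" and one initializer that logs "constructor" yields exactly the order load -> constructor, with main following once initialization has finished.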

Summary

Simulator

dyld`_dyld_start ->
dyld`dyldbootstrap::start ->
dyld`dyld::_main ->
dyld`dyld::useSimulatorDyld ->
dyld_sim`start_sim ->
dyld_sim`dyld::_main ->
dyld_sim`dyld::initializeMainExecutable() ->
dyld_sim`ImageLoader::runInitializers ->
dyld_sim`ImageLoader::processInitializers ->
dyld_sim`ImageLoader::recursiveInitialization ->
dyld_sim`dyld::notifySingle ->
libobjc.A.dylib`load_images

Device (iPhone)

dyld`_dyld_start ->
dyld`dyldbootstrap::start ->
dyld`dyld::_main ->
dyld`dyld::initializeMainExecutable() ->
dyld`ImageLoader::runInitializers ->
dyld`ImageLoader::processInitializers ->
dyld`ImageLoader::recursiveInitialization ->
dyld`dyld::notifySingle ->
libobjc.A.dylib`load_images

Flow chart

[The original diagram is not available; only its labels survive. Lanes: dyld, libsystem, libdispatch, libobjc. Edge label: "next step". Node labels, in the order they appear:]

_dyld_start -> dyldbootstrap::start -> dyld::_main -> initializeMainExecutable -> ImageLoader::runInitializers -> ImageLoader::processInitializers -> ImageLoader::recursiveInitialization -> doInitialization -> doModInitFunctions -> libSystem_Initializa -> libdispatch_init -> _os_object_init -> _objc_init -> notifySingle -> load_images -> class +load -> main function
