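iOS Internals 12: Application Loading

"iOS Internals: Article Index"

Recap of earlier topics

  • 1. What an object is underneath: a struct, and alloc
  • 2. The essence of an object: isa
  • 3. Class structure: isa, superclass, cache, bits, data(), methodList, rw
  • 4. cache: methods, bucket, mask, insert
  • 5. objc_msgSend: message sending and the fast lookup path
  • 6. Slow lookup with binary search
  • 7. Dynamic method resolution
  • 8. Message forwarding: fast and slow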

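How an application is loaded

First, a look at how an app is compiled. The later LLVM post covers in more detail how the executable binary is produced.

.h/.m files -> preprocessing -> compilation -> assembly -> linking -> executable file (found inside the .app bundle via Show Package Contents)

[Figure: the compilation process]

As the LLVM post explains, compilation has seven steps. In the sixth, the linker takes the .o object files produced by the compiler and links them with the libraries they reference: dynamic libraries (.dylib) are bound by reference, while static libraries (.a) are copied and merged in. The result is a Mach-O file, which is the executable.

[Figure: Clang compilation flow]

[Figure: the seven Clang compilation steps]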

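Static libraries: at link time the library code is copied into the executable, so a library used by several programs exists as several copies. A static library is therefore dealt with at build time, not loaded at run time. Static libraries use the .a suffix, or are packaged as a .framework.

[Figure: static library]

Dynamic libraries: nothing is copied at link time. The system loads the library into memory when the program runs, loads it only once, and shares that one instance among processes (UIKit.framework and other system libraries work this way), which saves memory. Because the library's code is not merged into the executable, the app's own binary is also smaller. Dynamic libraries use the .dylib suffix or are packaged as a .framework; a .tbd file is a text-based stub that describes a dynamic library to the linker. All system libraries are dynamic libraries, and on iOS a dynamic library is normally packaged as a framework.

[Figure: dynamic library]

When an app runs, the code of both static and dynamic libraries ends up in memory. When several apps use the same library and it is a shared system dynamic library, the processes can share it, so only one copy is in memory. If it is a static library, every app's Mach-O file contains its own copy, so there are several. Compared with static libraries, shared dynamic libraries reduce the memory that apps occupy.

Dynamic libraries can also shorten launch time: the app's Mach-O file is smaller, and a shared library the app depends on may already be in memory because another running app uses it, so it does not have to be loaded again. This applies to shared system libraries; dynamic libraries embedded in the app itself add to launch time, as the section on load-time optimization later in this post explains.

As for size, a compiled static library ends up smaller: a dynamic library has to keep its exported symbols so that external code can call them, whereas a static library is merged into a single file and those symbols are no longer needed.

A common misconception is that iOS apps cannot use dynamic libraries at all, or Apple will reject them from the App Store. Strictly speaking, that is wrong. What is not accepted is loading a dynamic library at run time with dlopen.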

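Using dynamic libraries

There are two ways to use a dynamic library. One is to add it as a dependency, so that it is loaded when the app launches. The other is to load it at run time with dlopen. The difference between the two is when the library is loaded.

iOS apps normally use the first way. The second is mostly used in macOS development; an iOS app that relies on it cannot be published to the App Store.

Creating a dynamic library

1. Create a new project and choose iOS -> Cocoa Touch Framework.

[Figure: creating a new dynamic library project]

2. Implement the framework and mark its public headers. Here DynamicDemo.h and BCPerson.h are the public headers.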

// DynamicDemo.h (umbrella header)
#import <Foundation/Foundation.h>

//! Project version number for DynamicDemo.
FOUNDATION_EXPORT double DynamicDemoVersionNumber;

//! Project version string for DynamicDemo.
FOUNDATION_EXPORT const unsigned char DynamicDemoVersionString[];

// In this header, you should import all the public headers of your framework using statements like #import <DynamicDemo/PublicHeader.h>

// BCPerson.h
#import <Foundation/Foundation.h>

@interface BCPerson : NSObject
@property (nonatomic, copy) NSString *name;
- (void)watch;
- (void)eat;
@end

[Figure: public headers]

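After the DynamicDemo library is built, DynamicDemo.framework appears under Products.

Choose the architectures for the framework. Here the Generic iOS Device destination is selected; building then produces a universal Mach-O file containing the arm64 and armv7 architectures. If a simulator is selected instead, the result is an x86_64 Mach-O file.

Note that an app and the frameworks it depends on must have compatible architectures: when the executable is built, both must target the device or both must target the simulator. Alternatively, build the framework once for the device and once for the simulator, then use the lipo command to merge the two same-named Mach-O files inside the frameworks into one universal Mach-O file. The framework then works whichever architecture the app is built for.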
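To make "universal Mach-O file" concrete, here is a minimal C++ sketch that lists the architectures named in a fat header, which is the table lipo writes at the front of a merged binary. The function names are hypothetical, the layout (an 8-byte header followed by 20-byte entries, all big-endian) and the CPU type constants are taken from the Mach-O fat header format as I understand it, and only three CPU types are recognised.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read a 32-bit big-endian value; fat headers are stored big-endian.
static std::uint32_t read_be32(const unsigned char* p) {
    return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |  std::uint32_t(p[3]);
}

// List the architectures named in a universal (fat) header.
// Returns an empty list if the buffer does not start with a fat header.
// fat_architectures is a hypothetical helper, not part of any Apple API.
std::vector<std::string> fat_architectures(const unsigned char* bytes, std::size_t size) {
    std::vector<std::string> archs;
    if (size < 8 || read_be32(bytes) != 0xCAFEBABEu) return archs;   // FAT_MAGIC
    std::uint32_t count = read_be32(bytes + 4);                      // nfat_arch
    for (std::uint32_t i = 0; i < count; ++i) {
        std::size_t entry = 8 + std::size_t(i) * 20;   // each entry is 5 x 4 bytes
        if (entry + 20 > size) break;                  // truncated table
        std::uint32_t cputype = read_be32(bytes + entry);
        if (cputype == 0x0100000Cu)      archs.push_back("arm64");
        else if (cputype == 0x01000007u) archs.push_back("x86_64");
        else if (cputype == 12u)         archs.push_back("arm (32-bit)");
        else                             archs.push_back("other");
    }
    return archs;
}
```

Feeding it the first few dozen bytes of a framework binary built for Generic iOS Device should report one entry per slice; in practice `lipo -info` gives the same answer.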

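How to tell whether a .framework is a dynamic or a static library

(1) cd xx.framework
(2) file xx    (xx is the binary inside the .framework)
(3) Check the output: a static library contains "current ar archive random library"; a dynamic library contains "dynamically linked shared library".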
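The `file` check above works from the first bytes of the binary. As a rough illustration, the C++ sketch below makes the same distinction by hand. It is a simplification under stated assumptions: classify_binary is a hypothetical name, only 64-bit little-endian Mach-O slices are recognised, and a universal binary is reported as such without looking inside its slices.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Classify a binary from its leading bytes, roughly the way `file` does.
// classify_binary is a hypothetical helper written for this post.
std::string classify_binary(const unsigned char* bytes, std::size_t size) {
    // Static libraries are `ar` archives and begin with this 8-byte signature.
    static const char kArMagic[] = "!<arch>\n";
    if (size >= 8 && std::memcmp(bytes, kArMagic, 8) == 0)
        return "static library (ar archive)";

    // Universal binaries begin with FAT_MAGIC, stored big-endian: CA FE BA BE.
    if (size >= 4 && bytes[0] == 0xCA && bytes[1] == 0xFE &&
        bytes[2] == 0xBA && bytes[3] == 0xBE)
        return "universal (fat) binary";

    // MH_MAGIC_64 (0xfeedfacf) as it appears on disk, little-endian.
    if (size >= 16 && bytes[0] == 0xCF && bytes[1] == 0xFA &&
        bytes[2] == 0xED && bytes[3] == 0xFE) {
        // The filetype field sits at byte offset 12 of mach_header_64.
        std::uint32_t filetype = std::uint32_t(bytes[12]) |
                                 (std::uint32_t(bytes[13]) << 8) |
                                 (std::uint32_t(bytes[14]) << 16) |
                                 (std::uint32_t(bytes[15]) << 24);
        if (filetype == 0x6) return "dynamic library (MH_DYLIB)";
        if (filetype == 0x2) return "executable (MH_EXECUTE)";
        return "other Mach-O file";
    }
    return "unknown";
}
```

For a framework built for several architectures, the binary starts with the fat header, so each slice would have to be examined in turn to decide between static and dynamic.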

[Figure: file output for a dynamic framework]

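Using the dynamic library

1. Create a new project, DynamicFrameworkDemo, add DynamicDemo.framework to it, and call a method from the framework in main.m.

[Figure: adding the framework as a dependency]

[Figure: calling a method in the dynamic library]

[Figure: linking the .o files and binding the dynamic library to produce the Mach-O file]

2. Build DynamicFrameworkDemo with Cmd + B. Under Products, right-click the generated DynamicFrameworkDemo.app and choose Show Package Contents: there is now a Frameworks folder, which is where the app's own (non-system) dynamic libraries are stored.

[Figure: the dynamic library under Frameworks]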

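Optimizing dynamic library load time

Every app loads dynamic libraries at launch. The great majority are system libraries, which the system has already heavily optimized, so their cost is not a concern. The ones to watch are the dynamic libraries bundled into the app itself, since these cost the most load time. Apple's advice is to embed fewer dynamic libraries, or where possible to merge several into one, and to keep the number of non-system dynamic libraries to six or fewer.

As the later LLVM post shows, dynamic libraries are loaded during the pre-main phase of launch (dylib loading), so having too many of them lengthens launch time. The figures below show WeChat's pre-main time and its embedded dynamic libraries: its Frameworks folder contains exactly six, the limit Apple recommends.

[Figure: the six dynamic libraries in the WeChat app]

[Figure: WeChat pre-main time]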

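Back to application loading

Consider the order of execution: +load -> C++ functions -> main(). Why this order?

When an app runs, its dynamic libraries, the static library code already merged into the binary, and the executable itself all have to be in place. How are these pieces linked together? How are they loaded, and which is initialized first?
The answer is dyld, the dynamic linker, which does the linking.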
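The "C++ functions before main()" half of that order can be observed in portable code. The sketch below is an illustration only: startup_log, cxx_initializer and GlobalProbe are made-up names, `__attribute__((constructor))` is a GCC/Clang extension, and +load itself cannot be shown in plain C++ because it belongs to the Objective-C runtime.

```cpp
#include <string>
#include <vector>

// Log of start-up events. A function-local static is used so the vector is
// guaranteed to exist before any initializer appends to it.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// GCC/Clang extension: the loader runs this function while initializing
// the image, before main() is entered.
__attribute__((constructor))
static void cxx_initializer() {
    startup_log().push_back("constructor function");
}

// A global object: its constructor also runs before main().
struct GlobalProbe {
    GlobalProbe() { startup_log().push_back("global object constructor"); }
};
static GlobalProbe g_probe;
```

By the time main() starts, startup_log() already holds both entries. The relative order of the two initializers is deliberately not relied on here.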


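[Figure: the app loading process]

Summary: when an app launches, the many static and dynamic libraries underneath it are treated as images and handed to dyld. dyld reads them from where they are mapped in memory into its tables, loads them one by one for the main program, links them, and initializes them. The libraries have their own initialization too, for example the runtime's _objc_init, libSystem's initializer, and libdispatch_init for GCD.

To find dyld's entry point, set a breakpoint in a +load method (+load runs before C++ functions and main) and inspect the call stack: the bottom frame is dyld`_dyld_start. The disassembly shows the same thing. Application loading therefore starts at _dyld_start, so download the dyld source and look for that entry.

[Figure: call stack at the +load breakpoint]

[Figures: before the +load method (two screenshots)]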

In _dyld_start there is a call to dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue):


//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started 
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

    // if kernel had to slide dyld, we need to fix up load sensitive locations
    // we have to do this before using any global variables
    rebaseDyld(dyldsMachHeader);

    // kernel sets up env pointer to be just past end of agv array
    const char** envp = &argv[argc+1];
    
    // kernel sets up apple pointer to be just past end of envp array
    const char** apple = envp;
    while(*apple != NULL) { ++apple; }
    ++apple;

    // set up random value for stack canary
    __guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(argc, argv, envp, apple);
#endif

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = appsMachHeader->getSlide();
    return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
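
The pointer arithmetic on argv, envp and apple in start() relies on the kernel laying out three NULL-terminated arrays back to back. A small C++ sketch of the same walk follows, assuming that layout; LaunchVectors and locate_vectors are hypothetical names used only for illustration.

```cpp
#include <cstddef>

// Start addresses of the two arrays that follow argv in memory.
struct LaunchVectors {
    const char** envp;   // first environment string
    const char** apple;  // first Apple-specific string
};

// Mirrors the arithmetic in dyldbootstrap::start: envp begins just past
// argv's NULL terminator, and apple begins just past envp's NULL terminator.
LaunchVectors locate_vectors(int argc, const char* argv[]) {
    LaunchVectors v;
    v.envp = &argv[argc + 1];            // skip argc entries plus the NULL
    const char** p = v.envp;
    while (*p != nullptr) { ++p; }       // walk to envp's NULL terminator
    v.apple = p + 1;                     // apple[] starts right after it
    return v;
}
```

Calling it on a hand-built block such as {"app", "-flag", NULL, "HOME=/root", NULL, "executable_path=/app", NULL} with argc = 2 yields envp pointing at the "HOME=..." entry and apple pointing at the "executable_path=..." entry.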

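The macho_header here is the Mach-O header of the executable. After building, xxx.app contains an executable named xxx; drag it into MachOView to inspect the header.


dyld::_main spans lines 6191-6828 of the source. Condensed, it does the following: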

uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
        int argc, const char* argv[], const char* envp[], const char* apple[], 
        uintptr_t* startGlue)
{
    ......

    // set up the runtime environment and prepare for the executable
    ......

    // load shared cache
    mapSharedCache();
    ......

reloadAllImages:

    ......
    // instantiate ImageLoader for main executable (wraps the already-mapped main program in an ImageLoader object)
    sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);

    ......

    // load any inserted libraries
    if  ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
        for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
            loadInsertedDylib(*lib);
    }
        
    // link main executable
    link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);

    ......
    // link any inserted libraries
    if ( sInsertedDylibCount > 0 ) {
        for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
            ImageLoader* image = sAllImages[i+1];
            link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
            image->setNeverUnloadRecursive();
        }
        if ( gLinkContext.allowInterposing ) {
            // only INSERTED libraries can interpose
            // register interposing info after all inserted libraries are bound so chaining works
            for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                ImageLoader* image = sAllImages[i+1];
                // register symbol interposing
                image->registerInterposing(gLinkContext);
            }
        }
    }

    ......
    // weak symbol binding
    sMainExecutable->weakBind(gLinkContext);
        
    sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);

    ......
    // run all initializers
    initializeMainExecutable(); 

    // notify any montoring proccesses that this process is about to enter main()
    notifyMonitoringDyldMain();

    return result;
}

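The function returns result. Tracing where result is assigned leads to the main program, sMainExecutable, so the next step is to find where sMainExecutable is assigned: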

// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
                result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
                *startGlue = 0;
                
// instantiate ImageLoader for main executable
        sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
        gLinkContext.mainExecutable = sMainExecutable;
        gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

The instantiateFromLoadedImage function

// The kernel maps in main executable before dyld gets control.  We need to 
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // try mach-o loader
    if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
        ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        addImage(image);
        return (ImageLoaderMachO*)image;
    }
    throw "main executable not a known format";
}

The ImageLoaderMachO::instantiateMainExecutable function

// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    //dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
    //  sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    if ( compressed ) 
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}

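sniffLoadCommands reads the load commands and works out the layout of the main program (segment count, library count, and so on), which can also be inspected in MachOView. After that, addImage(image) records the image and the main program is returned.

With the main program in hand, the main steps of dyld are:

1. Set up the runtime environment and environment variables, in preparation for loading the executable.
2. Map the shared cache into the current process's address space. The shared cache holds system libraries such as UIKit.framework, Foundation.framework and CoreFoundation.framework.
3. Instantiate the main program: create an ImageLoader object for it.
4. Load inserted dynamic libraries.
5. Link the main program.
6. Link the inserted dynamic libraries.
7. Perform weak symbol binding (weakBind).
8. Run all initializers for the main program.
9. Find the program entry point and return it, which leads to main().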

What lines 6191-6828 of dyld::_main do, in more detail

1. Check the version information.

[Figure: version information]

2. Read the platform information (platforms).

    // Set the platform ID in the all image infos so debuggers can tell the process type
    // FIXME: This can all be removed once we make the kernel handle it in rdar://43369446
    if (gProcessInfo->version >= 16) {
        __block bool platformFound = false;
        ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
            if (platformFound) {
                halt("MH_EXECUTE binaries may only specify one platform");
            }
            gProcessInfo->platform = (uint32_t)platform;
            platformFound = true;
        });
        if (gProcessInfo->platform == (uint32_t)dyld3::Platform::unknown) {
            // There were no platforms found in the binary. This may occur on macOS for alternate toolchains and old binaries.
            // It should never occur on any of our embedded platforms.
#if __MAC_OS_X_VERSION_MIN_REQUIRED
            gProcessInfo->platform = (uint32_t)dyld3::Platform::macOS;
#else
            halt("MH_EXECUTE binaries must specify a minimum supported OS version");
#endif
        }
    }

3. Get the address of the MachOFile.
4. Set up the context and path information (setContext).
5. Bind the crash message (CRSetCrashLogMessage).
6. Process the environment variable array envp.
7. Handle the shared cache: load shared cache, mapSharedCache().

[Figure: shared cache]

8. Instantiate the main program.
9. Load inserted dynamic libraries: loadInsertedDylib(*lib).

[Figure: inserting dynamic libraries]

10. Link.

11. Run the initializers of all images: initializeMainExecutable().
12. Enter main().

[Figure: notifyMonitoringDyldMain()]

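Exploring initializeMainExecutable()

A for loop first runs the initializers of the inserted dynamic libraries, then those of the main program. The call chain is initializeMainExecutable() -> runInitializers() -> processInitializers() -> images.imagesAndPaths[i].first->recursiveInitialization(). A dynamic library may itself depend on other dynamic libraries, so initialization recurses through the dependency tree. recursiveInitialization() then calls context.notifySingle, which calls sNotifyObjCInit. Here the trail seems to break: no implementation of sNotifyObjCInit is visible, so how does the flow continue?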

void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initialzers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    
    // run initializers for main executable and everything it brings up 
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    
    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL ) 
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

sNotifyObjCInit has no implementation of its own because it is a function pointer. registerObjCNotifiers assigns its init parameter to sNotifyObjCInit, so the next question is who calls registerObjCNotifiers.

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init;
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    //  call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}

Only one place calls registerObjCNotifiers: _dyld_objc_notify_register(mapped, init, unmapped). Its second argument, init, is what ends up in sNotifyObjCInit. So where is _dyld_objc_notify_register called?

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

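Searching the objc source for _dyld_objc_notify_register shows that _objc_init calls it, and that the second argument it passes is load_images.
The functions passed to _dyld_objc_notify_register are callbacks: for dyld to be able to invoke them while running initializers, _objc_init(void) must have run first. So when, and from where, is _objc_init called?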
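
The relationship between the two sides can be reduced to a small C++ sketch: one side stores a function pointer, the other invokes it later. This is a simplified model and not dyld's code. All names are hypothetical stand-ins, the real notifySingle fires the callback at a specific image state, and the real registration also handles the mapped and unmapped callbacks.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for dyld's state; not the real dyld types.
using InitCallback = void (*)(const char* path);

static InitCallback sNotifyInit = nullptr;           // plays the role of sNotifyObjCInit
static std::vector<std::string> sInitializedImages;  // images handled so far

// Runtime side: what _objc_init achieves through _dyld_objc_notify_register.
void register_init_callback(InitCallback init) {
    sNotifyInit = init;
    // Replay the callback for images that were handled before registration,
    // in the spirit of the loop at the end of registerObjCNotifiers.
    for (const std::string& path : sInitializedImages)
        (*sNotifyInit)(path.c_str());
}

// Loader side: what notifySingle does for each image.
void notify_single(const char* path) {
    sInitializedImages.push_back(path);
    if (sNotifyInit != nullptr)   // nothing happens until someone has registered
        (*sNotifyInit)(path);
}
```

The null check is the important detail: until the runtime registers, the loader's notification is a no-op, which is why the registration in _objc_init has to happen early in the initialization sequence.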

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
    cache_init();
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

During initialization, ImageLoader::recursiveInitialization calls bool hasInitializers = this->doInitialization(context);

[Figure: initializeMainExecutable]

void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info);

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }
            
            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            
            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            
            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }
    recursiveSpinUnLock();
}
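
Stripped of locking, timing and upward links, recursiveInitialization is a depth-first walk: dependencies first, then the image itself, with a state flag set before recursing so that cycles terminate. A minimal C++ sketch of that idea follows, using a hypothetical Image type rather than dyld's ImageLoader.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature of an image and its dependencies.
struct Image {
    std::string name;
    std::vector<Image*> dependencies;
    bool visited;   // set before recursing, which breaks dependency cycles

    explicit Image(const std::string& n) : name(n), visited(false) {}
};

// Depth-first: initialize everything an image depends on, then the image.
// Appending to `order` stands in for the call to doInitialization().
void recursive_initialize(Image& image, std::vector<std::string>& order) {
    if (image.visited) return;
    image.visited = true;               // "break cycles", as in the dyld source
    for (Image* dep : image.dependencies)
        recursive_initialize(*dep, order);
    order.push_back(image.name);
}
```

With an app that depends on a framework, which depends on a runtime library, which depends on a base library, the base library comes out first and the app last. This is the same reason libSystem's initializer runs before anything that links against it.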

This leads into bool ImageLoaderMachO::doInitialization(const LinkContext& context), which calls doImageInit(context). The code there shows that libSystem must be initialized before any other library.

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());
    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);
    CRSetCrashLogMessage2(NULL);
    return (fHasDashInit || fHasInitializers);
}

[Figure: the libSystem dynamic library must be initialized first]

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
    if ( fHasDashInit ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            switch (cmd->cmd) {
                case LC_ROUTINES_COMMAND:
                    Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
#if __has_feature(ptrauth_calls)
                    func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                    //  verify initializers are in image
                    if ( ! this->containsAddress(stripPointer((void*)func)) ) {
                        dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                    }
                    if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                        //  libSystem initializer must run first
                        dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                    }
                    if ( context.verboseInit )
                        dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
                    {
                        dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                        func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                    }
                    break;
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}
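
The loop in doImageInit advances with cmd = (char*)cmd + cmd->cmdsize because load commands are variable-sized records that each begin with a type and a total size. A self-contained C++ sketch of that traversal follows. LoadCommand mirrors only the two leading fields of Mach-O's load_command, list_commands is a hypothetical helper, and the bounds checks are additions for safety that the dyld excerpt does not show.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// The two fields every load command starts with: its type and the size of
// the whole record, which lets a reader skip commands it does not understand.
struct LoadCommand {
    std::uint32_t cmd;
    std::uint32_t cmdsize;
};

// Collect the type of each of `count` variable-sized records in `buffer`.
std::vector<std::uint32_t> list_commands(const unsigned char* buffer,
                                         std::size_t size, std::uint32_t count) {
    std::vector<std::uint32_t> types;
    std::size_t offset = 0;   // invariant: offset <= size
    for (std::uint32_t i = 0; i < count; ++i) {
        if (size - offset < sizeof(LoadCommand)) break;       // header would not fit
        LoadCommand lc;
        std::memcpy(&lc, buffer + offset, sizeof lc);         // avoids unaligned reads
        if (lc.cmdsize < sizeof(LoadCommand) || lc.cmdsize > size - offset) break;
        types.push_back(lc.cmd);
        offset += lc.cmdsize;                                 // jump to the next record
    }
    return types;
}
```

In a real Mach-O file the buffer would start right after the macho_header and count would be its ncmds field, as in the excerpt above.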

Now look at the call stack again.

The order in which dyld appears in the stack matches the flow just analyzed exactly: _dyld_start --> dyldbootstrap::start --> dyld::_main --> dyld::useSimulatorDyld --> start_sim --> dyld::_main --> dyld::initializeMainExecutable() --> ImageLoader::runInitializers() --> ImageLoader::processInitializers() --> ImageLoader::recursiveInitialization() --> dyld::notifySingle() --> load_images --> +[ViewController load] (the useSimulatorDyld and start_sim frames appear because the app is running in the simulator)

[Figure: call stack flow]

[Figure: dyld call stack]

[Figure: dyld call stack flow]

From the analysis above: _objc_init calls _dyld_objc_notify_register(&map_images, load_images, unmap_image), passing load_images; that calls dyld::registerObjCNotifiers(mapped, init, unmapped), which assigns sNotifyObjCInit = init = load_images. notifySingle then invokes sNotifyObjCInit, which calls load_images.

Combined with the call stack, the flow looks like this:

[Figure: call stack flow]

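Setting a breakpoint in _objc_init in the objc source and inspecting the stack confirms the guess above: libSystem_initializer is on the path. The frames are dyld`ImageLoaderMachO::doInitialization --> dyld`ImageLoaderMachO::doModInitFunctions --> libSystem.B.dylib`libSystem_initializer, so it is ImageLoaderMachO::doModInitFunctions(const LinkContext& context) that triggers libSystem_initializer.

[Figure: _objc_init call stack]

Looking at the official libSystem source

libSystem_initializer performs a series of initializations, including _dyld_initializer(), which initializes dyld. One might ask why dyld still needs initializing when _dyld_start has already run. Starting dyld does not mean it has been initialized: dyld is a library too, and it needs initialization as well.
After that comes libdispatch_init(); the stack shows that this code comes from libdispatch.dylib`libdispatch_init.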

static void
libSystem_initializer(int argc,
              const char* argv[],
              const char* envp[],
              const char* apple[],
              const struct ProgramVars* vars)
{
    static const struct _libkernel_functions libkernel_funcs = {
        .version = 3,
        // V1 functions
        .dlsym = dlsym,
        .malloc = malloc,
        .free = free,
        .realloc = realloc,
        ._pthread_exit_if_canceled = _pthread_exit_if_canceled,
        // V2 functions (removed)
        // V3 functions
        .pthread_clear_qos_tsd = _pthread_clear_qos_tsd,
    };

    static const struct _libpthread_functions libpthread_funcs = {
        .version = 2,
        .exit = exit,
        .malloc = malloc,
        .free = free,
    };
    
    static const struct _libc_functions libc_funcs = {
        .version = 1,
        .atfork_prepare = libSystem_atfork_prepare,
        .atfork_parent = libSystem_atfork_parent,
        .atfork_child = libSystem_atfork_child,
#if defined(HAVE_SYSTEM_CORESERVICES)
        .dirhelper = _dirhelper,
#endif
    };

    __libkernel_init(&libkernel_funcs, envp, apple, vars);

    __libplatform_init(NULL, envp, apple, vars);

    __pthread_init(&libpthread_funcs, envp, apple, vars);

    _libc_initializer(&libc_funcs, envp, apple, vars);

    // TODO: Move __malloc_init before __libc_init after breaking malloc's upward link to Libc
    __malloc_init(apple);

#if TARGET_OS_OSX
    /*  */
    __keymgr_initializer();
#endif

    // No ASan interceptors are invoked before this point. ASan is normally initialized via the malloc interceptor:
    // _dyld_initializer() -> tlv_load_notification -> wrap_malloc -> ASanInitInternal

    _dyld_initializer();

    libdispatch_init();
    _libxpc_initializer();

#if CURRENT_VARIANT_asan
    setenv("DT_BYPASS_LEAKS_CHECK", "1", 1);
#endif

    // must be initialized after dispatch
    _libtrace_init();

#if !(TARGET_OS_EMBEDDED || TARGET_OS_SIMULATOR)
    _libsecinit_initializer();
#endif

#if defined(HAVE_SYSTEM_CONTAINERMANAGER)
    _container_init(apple);
#endif

    __libdarwin_init();

    __stack_logging_early_finished();

#if !TARGET_OS_IPHONE
    /*  - Preserve the old behavior of apple[] for
     * programs that haven't linked against newer SDK.
     */
#define APPLE0_PREFIX "executable_path="
    if (dyld_get_program_sdk_version() < DYLD_MACOSX_VERSION_10_11){
        if (strncmp(apple[0], APPLE0_PREFIX, strlen(APPLE0_PREFIX)) == 0){
            apple[0] = apple[0] + strlen(APPLE0_PREFIX);
        }
    }
#endif

    /* 
     * C99 standard has the following in section 7.5(3):
     * "The value of errno is zero at program startup, but is never set
     * to zero by any library function."
     */
    errno = 0;
}

In the official libdispatch source, the implementation of libdispatch_init calls _os_object_init():

void
libdispatch_init(void)
{
    dispatch_assert(sizeof(struct dispatch_apply_s) <=
            DISPATCH_CONTINUATION_SIZE);

    if (_dispatch_getenv_bool("LIBDISPATCH_STRICT", false)) {
        _dispatch_mode |= DISPATCH_MODE_STRICT;
    }
#if HAVE_OS_FAULT_WITH_PAYLOAD && TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    if (_dispatch_getenv_bool("LIBDISPATCH_NO_FAULTS", false)) {
        _dispatch_mode |= DISPATCH_MODE_NO_FAULTS;
    } else if (getpid() == 1 ||
            !os_variant_has_internal_diagnostics("com.apple.libdispatch")) {
        _dispatch_mode |= DISPATCH_MODE_NO_FAULTS;
    }
#endif // HAVE_OS_FAULT_WITH_PAYLOAD && TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR


#if DISPATCH_DEBUG || DISPATCH_PROFILE
#if DISPATCH_USE_KEVENT_WORKQUEUE
    if (getenv("LIBDISPATCH_DISABLE_KEVENT_WQ")) {
        _dispatch_kevent_workqueue_enabled = false;
    }
#endif
#endif

#if HAVE_PTHREAD_WORKQUEUE_QOS
    dispatch_qos_t qos = _dispatch_qos_from_qos_class(qos_class_main());
    _dispatch_main_q.dq_priority = _dispatch_priority_make(qos, 0);
#if DISPATCH_DEBUG
    if (!getenv("LIBDISPATCH_DISABLE_SET_QOS")) {
        _dispatch_set_qos_class_enabled = 1;
    }
#endif
#endif

#if DISPATCH_USE_THREAD_LOCAL_STORAGE
    _dispatch_thread_key_create(&__dispatch_tsd_key, _libdispatch_tsd_cleanup);
#else
    _dispatch_thread_key_create(&dispatch_priority_key, NULL);
    _dispatch_thread_key_create(&dispatch_r2k_key, NULL);
    _dispatch_thread_key_create(&dispatch_queue_key, _dispatch_queue_cleanup);
    _dispatch_thread_key_create(&dispatch_frame_key, _dispatch_frame_cleanup);
    _dispatch_thread_key_create(&dispatch_cache_key, _dispatch_cache_cleanup);
    _dispatch_thread_key_create(&dispatch_context_key, _dispatch_context_cleanup);
    _dispatch_thread_key_create(&dispatch_pthread_root_queue_observer_hooks_key,
            NULL);
    _dispatch_thread_key_create(&dispatch_basepri_key, NULL);
#if DISPATCH_INTROSPECTION
    _dispatch_thread_key_create(&dispatch_introspection_key , NULL);
#elif DISPATCH_PERF_MON
    _dispatch_thread_key_create(&dispatch_bcounter_key, NULL);
#endif
    _dispatch_thread_key_create(&dispatch_wlh_key, _dispatch_wlh_cleanup);
    _dispatch_thread_key_create(&dispatch_voucher_key, _voucher_thread_cleanup);
    _dispatch_thread_key_create(&dispatch_deferred_items_key,
            _dispatch_deferred_items_cleanup);
#endif

#if DISPATCH_USE_RESOLVERS // rdar://problem/8541707
    _dispatch_main_q.do_targetq = _dispatch_get_default_queue(true);
#endif

    _dispatch_queue_set_current(&_dispatch_main_q);
    _dispatch_queue_set_bound_thread(&_dispatch_main_q);

#if DISPATCH_USE_PTHREAD_ATFORK
    (void)dispatch_assume_zero(pthread_atfork(dispatch_atfork_prepare,
            dispatch_atfork_parent, dispatch_atfork_child));
#endif
    _dispatch_hw_config_init();
    _dispatch_time_init();
    _dispatch_vtable_init();
    _os_object_init();
    _voucher_init();
    _dispatch_introspection_init();
}

[Figure: calls made by libdispatch_init]

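_os_object_init() in turn calls _objc_init(), declared as extern void _objc_init(void);, exactly as in the call stack seen earlier. Control is now back in the objc source: void _objc_init(void) passes load_images to _dyld_objc_notify_register(&map_images, load_images, unmap_image), which assigns the callback inside dyld.

Called from _objc_init() in the objc source, which passes in load_images: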
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

load_images is passed on to registerObjCNotifiers, which assigns the callback:
void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init; // assign the callback
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    //  call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}
---------------------libdispatch source-------------------------
void
_os_object_init(void)
{
    _objc_init();
    Block_callbacks_RR callbacks = {
        sizeof(Block_callbacks_RR),
        (void (*)(const void *))&objc_retain,
        (void (*)(const void *))&objc_release,
        (void (*)(const void *))&_os_objc_destructInstance
    };
    _Block_use_RR2(&callbacks);
#if DISPATCH_COCOA_COMPAT
    const char *v = getenv("OBJC_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("DISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
    v = getenv("LIBDISPATCH_DEBUG_MISSING_POOLS");
    if (v) _os_object_debug_missing_pools = _dispatch_parse_bool(v);
#endif
}

----------------objc source------------------

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
    cache_init();
    _imp_implementationWithBlock_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);

#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

The callback sNotifyObjCInit = init is therefore load_images. When dyld's notifySingle invokes sNotifyObjCInit, the call lands in libobjc.A.dylib`load_images:

--------------------dyld source--------------------
static void notifySingle(dyld_image_states state, const ImageLoader* image, ImageLoader::InitializerTimingList* timingInfo)
{
    //dyld::log("notifySingle(state=%d, image=%s)\n", state, image->getPath());
    std::vector<dyld_image_state_change_handler>* handlers = stateToHandlers(state, sSingleHandlers);
    if ( handlers != NULL ) {
        dyld_image_info info;
        info.imageLoadAddress   = image->machHeader();
        info.imageFilePath      = image->getRealPath();
        info.imageFileModDate   = image->lastModified();
        for (std::vector<dyld_image_state_change_handler>::iterator it = handlers->begin(); it != handlers->end(); ++it) {
            const char* result = (*it)(state, 1, &info);
            if ( (result != NULL) && (state == dyld_image_state_mapped) ) {
                //fprintf(stderr, "  image rejected by handler=%p\n", *it);
                // make copy of thrown string so that later catch clauses can free it
                const char* str = strdup(result);
                throw str;
            }
        }
    }
    if ( state == dyld_image_state_mapped ) {
        //  Save load addr + UUID for images from outside the shared cache
        if ( !image->inSharedCache() ) {
            dyld_uuid_info info;
            if ( image->getUUID(info.imageUUID) ) {
                info.imageLoadAddress = image->machHeader();
                addNonSharedCacheImageUUID(info);
            }
        }
    }
    if ( (state == dyld_image_state_dependents_initialized) && (sNotifyObjCInit != NULL) && image->notifyObjC() ) {
        uint64_t t0 = mach_absolute_time();
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
        (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        uint64_t t1 = mach_absolute_time();
        uint64_t t2 = mach_absolute_time();
        uint64_t timeInObjC = t1-t0;
        uint64_t emptyTime = (t2-t1)*100;
        if ( (timeInObjC > emptyTime) && (timingInfo != NULL) ) {
            timingInfo->addTime(image->getShortName(), timeInObjC);
        }
    }
    // mach message csdlc about dynamically unloaded images
    if ( image->addFuncNotified() && (state == dyld_image_state_terminated) ) {
        notifyKernel(*image, false);
        const struct mach_header* loadAddress[] = { image->machHeader() };
        const char* loadPath[] = { image->getPath() };
        notifyMonitoringDyld(true, 1, loadAddress, loadPath);
    }
}

Summary: closing the loop

dyld runInitializers --> doInitialization --> libSystem --> init --> libdispatch --> _os_object_init --> libobjc.A.dylib --> _objc_init()

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

The second argument is load_images; it is received as init and stored in sNotifyObjCInit.
notifySingle --> sNotifyObjCInit (= argument 2), so calling sNotifyObjCInit() calls load_images.
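
To make the register-then-call-back loop concrete, here is a minimal sketch of the same pattern. The types MachHeader and NotifyInit and the counter gLoadImagesCalls are hypothetical stand-ins for illustration, not dyld's real declarations; only the shape (store a function pointer, invoke it later if it is non-NULL) mirrors the source quoted above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical, simplified stand-ins for dyld's types (assumption, not the real API).
struct MachHeader { const char* name; };
using NotifyInit = void (*)(const char* path, const MachHeader* mh);

// "dyld side": the stored callback, NULL until libobjc registers one.
static NotifyInit sNotifyObjCInit = nullptr;

// Models registerObjCNotifiers keeping the `init` argument for later.
void registerObjCNotifiers(NotifyInit init) { sNotifyObjCInit = init; }

// Models the relevant branch of notifySingle: call back only if registered.
bool notifySingle(const char* path, const MachHeader* mh) {
    if (sNotifyObjCInit == nullptr) return false;
    (*sNotifyObjCInit)(path, mh);
    return true;
}

// "libobjc side": counts calls so the callback's effect is observable.
static int gLoadImagesCalls = 0;
void load_images(const char* path, const MachHeader*) {
    ++gLoadImagesCalls;
    std::printf("load_images(%s)\n", path);
}
```

Calling registerObjCNotifiers(&load_images) and then notifySingle(...) runs load_images once; before registration, notifySingle does nothing, which matches the (sNotifyObjCInit != NULL) check in the dyld code.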

[Figure: the dyld flow as a closed loop]

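What is the call order of +load, C++ initializers, and main()? Everything so far has shown +load --> C++ --> main(). Why?

From the previous section we know the callback enters load_images, which calls call_load_methods() to loop over and invoke every +load method.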

void
load_images(const char *path __unused, const struct mach_header *mh)
{
    if (!didInitialAttachCategories && didCallDyldNotifyRegister) {
        didInitialAttachCategories = true;
        loadAllCategories();
    }
    // Return without taking locks if there are no +load methods here.
    if (!hasLoadMethods((const headerType *)mh)) return;
    recursive_mutex_locker_t lock(loadMethodLock);
    // Discover load methods
    {
        mutex_locker_t lock2(runtimeLock);
        prepare_load_methods((const headerType *)mh);
    }
    // Call +load methods (without runtimeLock - re-entrant)
    call_load_methods();
}
void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;
    loadMethodLock.assertLocked();
    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;
    void *pool = objc_autoreleasePoolPush();
    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();
        }
        // 2. Call category +loads ONCE
        more_categories = call_category_loads();
        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);
    objc_autoreleasePoolPop(pool);
    loading = NO;
}
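
The do/while loop in call_load_methods is easier to follow with a small model. The sketch below is an assumption-laden simplification: PendingLoad, LoadQueues, drain, and callLoadMethods are hypothetical names, and the queues hold plain callbacks instead of classes and categories. It only reproduces the ordering rule: drain class +loads until none remain, run the pending category +loads once, and repeat while either step produced new work.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: each pending +load is a name plus a body to run.
struct PendingLoad { std::string name; std::function<void()> body; };

struct LoadQueues {
    std::vector<PendingLoad> classes;     // stands in for loadable_classes
    std::vector<PendingLoad> categories;  // stands in for loadable_categories
    std::vector<std::string> log;         // order in which +load bodies ran
};

// Runs the current batch; returns true if running it enqueued more entries.
static bool drain(std::vector<PendingLoad>& queue, std::vector<std::string>& log) {
    std::vector<PendingLoad> batch;
    batch.swap(queue);                    // bodies may push onto `queue` safely
    for (auto& item : batch) {
        log.push_back(item.name);
        if (item.body) item.body();
    }
    return !queue.empty();
}

void callLoadMethods(LoadQueues& q) {
    bool moreCategories;
    do {
        // 1. Class +loads, repeatedly, until none are pending.
        while (!q.classes.empty()) drain(q.classes, q.log);
        // 2. Category +loads, once.
        moreCategories = drain(q.categories, q.log);
        // 3. Go again if either queue gained entries.
    } while (!q.classes.empty() || moreCategories);
}
```

If a category's +load causes another class to become loadable, that class is picked up on the next pass, which is why the outer loop exists.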

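Once the +load methods have run, control returns to doInitialization in the dyld source, which calls doModInitFunctions to run all of the C++ initializer functions recorded in the image's Mach-O file. How can this be verified? Break inside a C++ initializer and check whether the stack shows doModInitFunctions(context); it matches exactly.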

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());
    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);
    CRSetCrashLogMessage2(NULL);
    return (fHasDashInit || fHasInitializers);
}

[Figure: call stack of a C++ initializer]

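What runs after the C++ initializers? Set a breakpoint and look at the assembly and registers.

main is the entry point, a fixed function name compiled into the program; renaming it produces an error. At the end of dyld's startup the address of main is held in register rax (on x86_64). Can this be seen in the dyld source?

[Figure: error after renaming main]

After _dyld_start --> dyldbootstrap::start finishes, the flow makes the call to main().

Swapping the order in which the C++ function and main() appear in the source gives the same execution order. Overall: dyld initializes the images and registers the callback notification, then _dyld_start -> dyld::_main() -> main()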
[Figure: swapping the order of the C++ function and main]
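
The claim that C++ initializers run before main() regardless of source order can be checked with a few lines. This is a sketch, assuming a Clang or GCC toolchain (`__attribute__((constructor))` is a compiler extension); startupLog, cxxStyleInit, and Tracer are hypothetical names. Both hooks below are registered as image initializers, which is the kind of function doModInitFunctions walks.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records which startup hooks have run. A function-local static is used so the
// log itself is constructed on first use, whichever hook happens to run first.
static std::vector<std::string>& startupLog() {
    static std::vector<std::string> log;
    return log;
}

// A C-style initializer function (Clang/GCC extension); runs before main().
__attribute__((constructor)) static void cxxStyleInit() {
    startupLog().push_back("constructor function");
}

// A global object: its constructor is also run before main().
struct Tracer {
    Tracer() { startupLog().push_back("global object constructor"); }
};
static Tracer gTracer;
```

By the time main() starts, startupLog() already holds two entries, wherever main() is placed relative to these definitions. The relative order of the two hooks is deliberately not relied upon here.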

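Supplementary note: dyld's _main handles loading the shared cache in three cases. The shared cache holds the system dynamic libraries such as UIKit and Foundation. However many apps are launched, only one copy is kept in memory and every process takes the libraries from it. The shared cache is the first thing loaded during app launch.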


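1. If a private cache is forced, the shared cache is mapped into the current process only.
2. If the shared cache is already mapped, nothing is done.
3. If it is not yet mapped, the current process maps the shared cache for the first time.

After the shared cache is loaded, dyld loads the frameworks the app depends on, the third-party dynamic libraries, and any inserted dynamic libraries.
Since iOS 11.0 the loader is dyld3. The flow is the same as in dyld2, but its closure mode (closureMode) is more efficient.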
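
The three cases amount to a small decision function. The sketch below uses hypothetical names (CacheAction, chooseSharedCacheAction) and two booleans in place of dyld's real state; it models only the order in which the cases are checked.

```cpp
#include <cassert>

// Hypothetical names; this models the three-way decision only, not dyld's code.
enum class CacheAction { MapPrivate, ReuseExisting, MapShared };

CacheAction chooseSharedCacheAction(bool forcePrivate, bool alreadyMapped) {
    if (forcePrivate)  return CacheAction::MapPrivate;     // case 1: this process only
    if (alreadyMapped) return CacheAction::ReuseExisting;  // case 2: nothing to do
    return CacheAction::MapShared;                         // case 3: first mapping
}
```

Note that the forced-private check comes first, so a private mapping is made even when a shared mapping already exists.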

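Does every app get its own copy of dyld?

No. Only one copy of dyld exists in physical memory. Each process has its own virtual address space and its own ASLR slide: every Mach-O executable is loaded at an address shifted by that slide, and libraries from outside the executable get the same rebase fix-up, so every pointer ends up as slide + address recorded in the file = virtual address. A virtual address is only meaningful inside the process that owns it. dyld is therefore mapped at its own virtual address in each process while the physical copy behind it is shared; when the CPU is handed such an address, it translates it to the real physical address in order to reach dyld. When dyld starts, it first rebases itself using the current process's slide, so the application only ever sees virtual addresses, never dyld's physical address, and cannot modify dyld. From each process's point of view, its addresses are independent and private to it.

In dyldbootstrap::start, dyld first relocates itself. Because a process works with slid virtual addresses from the moment it starts, this relocation is always required.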
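
The arithmetic behind "slide + address in the file = virtual address" can be written out directly. This is a minimal sketch with hypothetical helper names (slideAddress, rebaseAll) and made-up example addresses; real rebasing walks the image's rebase information rather than a plain vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Virtual address seen by this process = link-time address + this process's slide.
constexpr std::uint64_t slideAddress(std::uint64_t linkAddress, std::uint64_t slide) {
    return linkAddress + slide;
}

// Rebase: add the slide to every stored pointer.
// (Assumption: the pointers are given as a flat list for illustration.)
void rebaseAll(std::vector<std::uint64_t>& pointers, std::uint64_t slide) {
    for (auto& p : pointers) p = slideAddress(p, slide);
}
```

For example, with a slide of 0x4000, a pointer recorded as 0x100003f50 in the file becomes 0x100007f50 in the running process. A second process with a different slide gets a different virtual address for the same file offset.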



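After dyld has rebased itself, dyldbootstrap::start finishes by directly returning the result of dyld's main function. The main program must be rebased as well. dyld then starts loading the main executable (MainExecutable): once a Mach-O file is mapped into memory it is called an image. dyld reads the Mach-O header and the load commands, handles code signing and code encryption, instantiates the main program, and adds it to AllImages. The first entry added to AllImages is the main program's Mach-O file.

Code signing
Check the main program
Configuration: set the version used for loading dynamic libraries
Check conditions related to dynamic library loading
Read the environment variable sEnv.DYLD_INSERT_LIBRARIES (used constantly in jailbreak work; ordinary developers cannot change it, but it can be changed in a root environment)
If this variable is not NULL, run loadInsertedDylib
Load the inserted dynamic libraries
link: link the main program
Record the start time; the end time is recorded later, and the difference is the elapsed time. It can be printed through an environment variable and is used for launch optimization
Recursively load the libraries the main program depends on, and send a notification when done
Rebase to apply the ASLR slide: every dynamic library needs its addresses corrected for ASLR, and each has its own offset, which can be inspected in MachOView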
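
The DYLD_INSERT_LIBRARIES step in the list above boils down to "if the variable is set, load each listed library". The sketch below covers only the parsing half, assuming the conventional colon-separated format of the variable; parseInsertedLibraries is a hypothetical helper, and the actual loading (loadInsertedDylib in dyld) is not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a colon-separated DYLD_INSERT_LIBRARIES-style value into paths,
// skipping empty entries. A null value means the variable is not set.
std::vector<std::string> parseInsertedLibraries(const char* value) {
    std::vector<std::string> paths;
    if (value == nullptr) return paths;  // nothing to insert
    std::string current;
    for (const char* p = value; ; ++p) {
        if (*p == ':' || *p == '\0') {
            if (!current.empty()) paths.push_back(current);
            current.clear();
            if (*p == '\0') break;
        } else {
            current.push_back(*p);
        }
    }
    return paths;
}
```

A caller would pass std::getenv("DYLD_INSERT_LIBRARIES") and then attempt to load each returned path in order, before linking the main program.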



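Bind the non-lazy symbols
Bind the weak symbols
Recursively apply the same steps to the inserted dynamic libraries (system libraries from the shared cache such as UIKit and Foundation, plus the third-party dynamic libraries in use, such as Flutter, andromeda, Lottie, and OpenSSL), as shown by breaking in a +load method of WeChat and running image list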



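Register
Record the end time, so that dyld's load duration can be seen once the environment variable is configured
With the environment variable configured, dyld can print the dynamic library load time, rebase time, bind time, weak-bind time, and so on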

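After the main program has been added to AllImages, the inserted dynamic libraries (UIKit, Foundation, third-party libraries, and so on) are linked and loaded. They start at index 2, because index 0 already holds the main program and index 1 is dyld itself.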


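Link the inserted dynamic libraries, in the same way as the main program; the main program is linked first
Check whether all images have finished loading; if not, keep recursing through reloadAllImages
Loop over and load the inserted dynamic libraries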

Bind the inserted dynamic libraries
Bind weak symbols



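Initialize the main program: initializeMainExecutable()
runInitializers()
processInitializers()
recursiveInitialization
context.notifySingle()
(*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
call_load_methods() invokes all of the +load methods, then execution continues
doInitialization()
doModInitFunctions(context) runs the C++ constructor functions
notifyMonitoringDyldMain() notifies monitoring processes that the main program's main function is about to be entered
sMainExecutable->getEntryFromLC_MAIN() finds the entry point of the main program's main function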
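
The chain above explains the +load --> C++ --> main() order: within each image the ObjC notification happens before doInitialization, and images an image depends on are initialized first. The following is a simplified model, not dyld's implementation; Image and its fields are hypothetical, and the two log lines stand in for notifySingle and doInitialization.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of an image and the images it depends on.
struct Image {
    std::string name;
    std::vector<Image*> dependents;
    bool initialized;
};

void recursiveInitialization(Image& image, std::vector<std::string>& log) {
    if (image.initialized) return;
    image.initialized = true;  // mark first so dependency cycles terminate
    // Dependencies are initialized before the image that needs them.
    for (Image* dep : image.dependents) recursiveInitialization(*dep, log);
    log.push_back(image.name + ": +load");             // notifySingle -> load_images
    log.push_back(image.name + ": C++ initializers");  // doInitialization -> doModInitFunctions
}
```

Running this for an app that depends on one library logs the library's +load and C++ initializers, then the app's +load and C++ initializers; only after that would the entry point found through LC_MAIN be called.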

Flow chart

[Figure: dyld flow analysis diagram]
