znone-orz

How ART Executes Class Methods in Android 6.0 (Part 1)

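Reference: http://blog.csdn.net/luoshengyang/article/details/40289405. This post follows Luo Shengyang's article on how the Android runtime ART executes class methods; the main change is that the code listings have been updated to the Android 6.0 implementation.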

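The function OatFile::Open loads an OAT file. The compiler backends used to be Portable and Quick; they are now Quick and Optimizing. For the two current backends the implementation of Open makes no distinction: in both cases the loader chosen is Android's own dynamic library loader, android_dlopen_ext. By contrast, the essential difference between OAT files produced by the older Portable and Quick backends was that the former were loaded with the standard dynamic linker and the latter with a custom loader.
The choice between Optimizing and Quick is made in the CompilerDriver of dex2oat: depending on the arguments passed to dex2oat, the concrete compiler kind is either kQuick or kOptimizing.
The earlier Dalvik article on how Dalvik VM processes and threads are created noted that every Dalvik thread is described internally by a Thread object. This Thread object holds VM-related information, for example the JNI function table. Threads created in the ART runtime are, like Dalvik threads, also described internally by a Thread object. Besides the JNI function table, this new Thread object also defines the jump tables used for calls to external functions, which are the subject of this article.
The earlier article on how the Android runtime ART loads OAT files described the startup and initialization of the ART runtime. One of the initialization steps attaches the main thread to the ART runtime, as shown below: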

bool Runtime::Init(const RuntimeOptions& raw_options, bool ignore_unrecognized) {
  ...
  java_vm_ = new JavaVMExt(this, runtime_options);
  ...
  // ClassLinker needs an attached thread, but we can't fully attach a thread without creating
  // objects. We can't supply a thread group yet; it will be fixed later. Since we are the main
  // thread, we do not get a java peer.
  Thread* self = Thread::Attach("main", false, nullptr, false);
  ...
  return true;
}

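This function is defined in art/runtime/runtime.cc.
In Runtime::Init, the static member function Thread::Attach is called to attach the current thread, that is, the main thread, to the ART runtime. During the attach, the jump tables for external function calls are initialized.
The static member function Thread::Attach is implemented as follows: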

Thread* Thread::Attach(const char* thread_name, bool as_daemon, jobject thread_group,
                       bool create_peer) {
  Runtime* runtime = Runtime::Current();
  ...
  Thread* self;
  {
    MutexLock mu(nullptr, *Locks::runtime_shutdown_lock_);
    if (runtime->IsShuttingDownLocked()) {
      LOG(ERROR) << "Thread attaching while runtime is shutting down: " << thread_name;
      return nullptr;
    } else {
      Runtime::Current()->StartThreadBirth();
      self = new Thread(as_daemon);
      bool init_success = self->Init(runtime->GetThreadList(), runtime->GetJavaVM());
      Runtime::Current()->EndThreadBirth();
      if (!init_success) {
        delete self;
        return nullptr;
      }
    }
  }
  ...
  return self;
}

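This function is defined in art/runtime/thread.cc.
The most important thing Thread::Attach does is create a Thread object that describes the thread being attached to the ART runtime. Right after creating the object it calls the member function Init to initialize it.
The member function Thread::Init is implemented as follows: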

bool Thread::Init(ThreadList* thread_list, JavaVMExt* java_vm, JNIEnvExt* jni_env_ext) {
  ...
  InitTlsEntryPoints();
  RemoveSuspendTrigger();
  InitCardTable();  // Card table mechanism; see http://blog.csdn.net/lihuifeng/article/details/51681089
  ...
  if (jni_env_ext != nullptr) {
    DCHECK_EQ(jni_env_ext->vm, java_vm);
    DCHECK_EQ(jni_env_ext->self, this);
    tlsPtr_.jni_env = jni_env_ext;
  } else {
    tlsPtr_.jni_env = JNIEnvExt::Create(this, java_vm);
    if (tlsPtr_.jni_env == nullptr) {
      return false;
    }
  }

  thread_list->Register(this);
  return true;
}

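This function is defined in art/runtime/thread.cc.
Besides creating a JNIEnvExt object that describes the JNI interface of the current thread, Thread::Init also calls another member function, InitTlsEntryPoints, to initialize the jump tables for external function calls.
The member function Thread::InitTlsEntryPoints is implemented as follows: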

void Thread::InitTlsEntryPoints() {
  ...
  InitEntryPoints(&tlsPtr_.interpreter_entrypoints, &tlsPtr_.jni_entrypoints,
                  &tlsPtr_.quick_entrypoints);
}

This function is defined in art/runtime/thread.cc.
The InitEntryPoints function it calls is architecture specific. The version examined here is the one implemented in ~/android-6.0.1_r62/art/runtime/arch/arm64/entrypoints_init_arm64.cc:

void InitEntryPoints(InterpreterEntryPoints* ipoints, JniEntryPoints* jpoints,
                     QuickEntryPoints* qpoints) {...}

The Thread class defines **three member variables**, interpreter_entrypoints, jni_entrypoints and quick_entrypoints (reached through tlsPtr_ in the code above), as shown below:

class Thread {
  ...
    // Entrypoint function pointers.
    // TODO: move this to more of a global offset table model to avoid per-thread duplication.
    InterpreterEntryPoints interpreter_entrypoints;
    JniEntryPoints jni_entrypoints;
    QuickEntryPoints quick_entrypoints;
  ...
};

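The Thread class is declared in art/runtime/thread.h.
Thread splits the external function jump table into three parts: interpreter_entrypoints is the table used by the interpreter, jni_entrypoints is the table for JNI calls, and quick_entrypoints is the table used by native machine code generated by the Quick backend.
Back in Thread::InitTlsEntryPoints, the global function InitEntryPoints is called to initialize these three tables. Its implementation depends on the CPU architecture, because the entry points stored in the tables are written in assembly.
Taking the ARM64 architecture as an example: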

void InitEntryPoints(InterpreterEntryPoints* ipoints, JniEntryPoints* jpoints,
                     QuickEntryPoints* qpoints) {
  // Interpreter
  ipoints->pInterpreterToInterpreterBridge = artInterpreterToInterpreterBridge;
  ipoints->pInterpreterToCompiledCodeBridge = artInterpreterToCompiledCodeBridge;

  // JNI
  jpoints->pDlsymLookup = art_jni_dlsym_lookup_stub;

  // Alloc
  ResetQuickAllocEntryPoints(qpoints);

  // Cast
  qpoints->pInstanceofNonTrivial = art_quick_assignable_from_code;
  qpoints->pCheckCast = art_quick_check_cast;

  // DexCache
  qpoints->pInitializeStaticStorage = art_quick_initialize_static_storage;
  qpoints->pInitializeTypeAndVerifyAccess = art_quick_initialize_type_and_verify_access;
  qpoints->pInitializeType = art_quick_initialize_type;
  qpoints->pResolveString = art_quick_resolve_string;

  // Field
  qpoints->pSet8Instance = art_quick_set8_instance;
  qpoints->pSet8Static = art_quick_set8_static;
  qpoints->pSet16Instance = art_quick_set16_instance;
  qpoints->pSet16Static = art_quick_set16_static;
  qpoints->pSet32Instance = art_quick_set32_instance;
  qpoints->pSet32Static = art_quick_set32_static;
  qpoints->pSet64Instance = art_quick_set64_instance;
  qpoints->pSet64Static = art_quick_set64_static;
  qpoints->pSetObjInstance = art_quick_set_obj_instance;
  qpoints->pSetObjStatic = art_quick_set_obj_static;
  qpoints->pGetBooleanInstance = art_quick_get_boolean_instance;
  qpoints->pGetByteInstance = art_quick_get_byte_instance;
  qpoints->pGetCharInstance = art_quick_get_char_instance;
  qpoints->pGetShortInstance = art_quick_get_short_instance;
  qpoints->pGet32Instance = art_quick_get32_instance;
  qpoints->pGet64Instance = art_quick_get64_instance;
  qpoints->pGetObjInstance = art_quick_get_obj_instance;
  qpoints->pGetBooleanStatic = art_quick_get_boolean_static;
  qpoints->pGetByteStatic = art_quick_get_byte_static;
  qpoints->pGetCharStatic = art_quick_get_char_static;
  qpoints->pGetShortStatic = art_quick_get_short_static;
  qpoints->pGet32Static = art_quick_get32_static;
  qpoints->pGet64Static = art_quick_get64_static;
  qpoints->pGetObjStatic = art_quick_get_obj_static;

  // Array
  qpoints->pAputObjectWithNullAndBoundCheck = art_quick_aput_obj_with_null_and_bound_check;
  qpoints->pAputObjectWithBoundCheck = art_quick_aput_obj_with_bound_check;
  qpoints->pAputObject = art_quick_aput_obj;
  qpoints->pHandleFillArrayData = art_quick_handle_fill_data;

  // JNI
  qpoints->pJniMethodStart = JniMethodStart;
  qpoints->pJniMethodStartSynchronized = JniMethodStartSynchronized;
  qpoints->pJniMethodEnd = JniMethodEnd;
  qpoints->pJniMethodEndSynchronized = JniMethodEndSynchronized;
  qpoints->pJniMethodEndWithReference = JniMethodEndWithReference;
  qpoints->pJniMethodEndWithReferenceSynchronized = JniMethodEndWithReferenceSynchronized;
  qpoints->pQuickGenericJniTrampoline = art_quick_generic_jni_trampoline;

  // Locks
  qpoints->pLockObject = art_quick_lock_object;
  qpoints->pUnlockObject = art_quick_unlock_object;

  // Math
  // TODO null entrypoints not needed for ARM64 - generate inline.
  qpoints->pCmpgDouble = nullptr;
  qpoints->pCmpgFloat = nullptr;
  qpoints->pCmplDouble = nullptr;
  qpoints->pCmplFloat = nullptr;
  qpoints->pFmod = art_quick_fmod;
  qpoints->pL2d = nullptr;
  qpoints->pFmodf = art_quick_fmodf;
  qpoints->pL2f = nullptr;
  qpoints->pD2iz = nullptr;
  qpoints->pF2iz = nullptr;
  qpoints->pIdivmod = nullptr;
  qpoints->pD2l = nullptr;
  qpoints->pF2l = nullptr;
  qpoints->pLdiv = nullptr;
  qpoints->pLmod = nullptr;
  qpoints->pLmul = nullptr;
  qpoints->pShlLong = nullptr;
  qpoints->pShrLong = nullptr;
  qpoints->pUshrLong = nullptr;

  // Intrinsics
  qpoints->pIndexOf = art_quick_indexof;
  qpoints->pStringCompareTo = art_quick_string_compareto;
  qpoints->pMemcpy = art_quick_memcpy;

  // Invocation
  qpoints->pQuickImtConflictTrampoline = art_quick_imt_conflict_trampoline;
  qpoints->pQuickResolutionTrampoline = art_quick_resolution_trampoline;
  qpoints->pQuickToInterpreterBridge = art_quick_to_interpreter_bridge;
  qpoints->pInvokeDirectTrampolineWithAccessCheck =
      art_quick_invoke_direct_trampoline_with_access_check;
  qpoints->pInvokeInterfaceTrampolineWithAccessCheck =
      art_quick_invoke_interface_trampoline_with_access_check;
  qpoints->pInvokeStaticTrampolineWithAccessCheck =
      art_quick_invoke_static_trampoline_with_access_check;
  qpoints->pInvokeSuperTrampolineWithAccessCheck =
      art_quick_invoke_super_trampoline_with_access_check;
  qpoints->pInvokeVirtualTrampolineWithAccessCheck =
      art_quick_invoke_virtual_trampoline_with_access_check;

  // Thread
  qpoints->pTestSuspend = art_quick_test_suspend;

  // Throws
  qpoints->pDeliverException = art_quick_deliver_exception;
  qpoints->pThrowArrayBounds = art_quick_throw_array_bounds;
  qpoints->pThrowDivZero = art_quick_throw_div_zero;
  qpoints->pThrowNoSuchMethod = art_quick_throw_no_such_method;
  qpoints->pThrowNullPointer = art_quick_throw_null_pointer_exception;
  qpoints->pThrowStackOverflow = art_quick_throw_stack_overflow;

  // Deoptimize
  qpoints->pDeoptimize = art_quick_deoptimize;

  // Read barrier
  qpoints->pReadBarrierJni = ReadBarrierJni;
};

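This function is defined in art/runtime/arch/arm64/entrypoints_init_arm64.cc.
The implementation of InitEntryPoints shows how the jump tables for external function calls are initialized. For example, when generated native code needs to call a JNI function, it goes indirectly through art_jni_dlsym_lookup_stub so that the correct JNI function can be found and called.
We can also see that the table used by the interpreter has only two entries, artInterpreterToInterpreterBridge and artInterpreterToCompiledCodeBridge. The former is used to go from one interpreted method to another interpreted method; the latter is used to go from an interpreted method to a method that runs as native machine code.
All the remaining code initializes the table used by native code generated by the Quick backend. It has many entries, which fall into 14 groups: Alloc (object allocation), Cast (type casts), DexCache (Dex cache access), Field (field access), Array (array stores and array filling), JNI (JNI calls), Locks, Math, Intrinsics (built-in functions), Invocation (method invocation), Thread (thread operations), Throws (exception handling), Deoptimize and Read barrier.
With these tables in place, when generated native code needs to call a function provided by an external library, it only has to locate the Thread object that describes the current thread and then use the offset of the relevant one of the three tables inside that Thread object to find the entry it needs.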
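The offset-based lookup described above can be sketched in a few lines. This is only an illustration under stated assumptions: ThreadSketch, QuickEntryPointsSketch, pAdd, pNegate and the helper functions are hypothetical stand-ins, not ART types, and real generated code performs the load and the call in machine instructions rather than in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, heavily reduced stand-in for ART's QuickEntryPoints table.
struct QuickEntryPointsSketch {
  int (*pAdd)(int, int);
  int (*pNegate)(int);
};

// Hypothetical stand-in for Thread: the table is embedded in the thread object.
struct ThreadSketch {
  std::uint32_t state;                       // unrelated per-thread data
  QuickEntryPointsSketch quick_entrypoints;  // per-thread jump table
};

inline int AddImpl(int a, int b) { return a + b; }
inline int NegateImpl(int a) { return -a; }

// Plays the role of InitEntryPoints: fill every slot of the table once.
inline void InitEntryPointsSketch(QuickEntryPointsSketch* qpoints) {
  qpoints->pAdd = AddImpl;
  qpoints->pNegate = NegateImpl;
}

// Byte offset of one slot, measured from the start of the thread object.
constexpr std::size_t kNegateOffset =
    offsetof(ThreadSketch, quick_entrypoints) + offsetof(QuickEntryPointsSketch, pNegate);

// What generated code does conceptually: load the function pointer stored at
// [thread + offset], then call through it.
inline int CallNegateViaOffset(const ThreadSketch* self, int value) {
  assert(self != nullptr);
  const unsigned char* base = reinterpret_cast<const unsigned char*>(self);
  int (*fn)(int);
  std::memcpy(&fn, base + kNegateOffset, sizeof(fn));
  return fn(value);
}
```

Because the offset is a compile-time constant, the caller needs nothing but the pointer to the current thread object to reach any entry.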

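The earlier article on how the Android runtime ART loads classes and methods explained that, while a class is being loaded, each of its methods has to be linked. Linking really means deciding whether the method is executed by the interpreter or directly as native machine code, as shown below: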

void ClassLinker::LinkCode(ArtMethod* method, const OatFile::OatClass* oat_class,
                           uint32_t class_def_method_index) {
  Runtime* const runtime = Runtime::Current();
  if (runtime->IsAotCompiler()) {
    // The following code only applies to a non-compiler runtime.
    return;
  }
  // Method shouldn't have already been linked.
  DCHECK(method->GetEntryPointFromQuickCompiledCode() == nullptr);
  if (oat_class != nullptr) {
    // Every kind of method should at least get an invoke stub from the oat_method.
    // Non-abstract methods also get their code pointers.
    const OatFile::OatMethod oat_method = oat_class->GetOatMethod(class_def_method_index);
    oat_method.LinkMethod(method);
  }

  // Install entry point from interpreter.
  bool enter_interpreter = NeedsInterpreter(method, method->GetEntryPointFromQuickCompiledCode());
  if (enter_interpreter && !method->IsNative()) {
    method->SetEntryPointFromInterpreter(artInterpreterToInterpreterBridge);
  } else {
    method->SetEntryPointFromInterpreter(artInterpreterToCompiledCodeBridge);
  }

  if (method->IsAbstract()) {
    method->SetEntryPointFromQuickCompiledCode(GetQuickToInterpreterBridge());
    return;
  }

  if (method->IsStatic() && !method->IsConstructor()) {
    // For static methods excluding the class initializer, install the trampoline.
    // It will be replaced by the proper entry point by ClassLinker::FixupStaticTrampolines
    // after the class has been initialized.
    method->SetEntryPointFromQuickCompiledCode(GetQuickResolutionStub());
  } else if (enter_interpreter) {
    if (!method->IsNative()) {
      // Set entry point from compiled code if there's no code or in interpreter only mode.
      method->SetEntryPointFromQuickCompiledCode(GetQuickToInterpreterBridge());
    } else {
      method->SetEntryPointFromQuickCompiledCode(GetQuickGenericJniStub());
    }
  }

  if (method->IsNative()) {
    // Unregistering restores the dlsym lookup stub.
    method->UnregisterNative();

    if (enter_interpreter) {
      // We have a native method here without code. Then it should have either the generic JNI
      // trampoline as entrypoint (non-static), or the resolution trampoline (static).
      // TODO: this doesn't handle all the cases where trampolines may be installed.
      const void* entry_point = method->GetEntryPointFromQuickCompiledCode();
      DCHECK(IsQuickGenericJniStub(entry_point) || IsQuickResolutionStub(entry_point));
    }
  }
}

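This function is defined in art/runtime/class_linker.cc.
A detailed explanation of LinkCode can be found in the earlier article on how ART loads classes and methods. Here we only summarize the conclusions and then analyze them further:
1. The ART runtime has two execution modes: interpreted execution and native machine code execution. The default is native machine code, but the -Xint option can be passed when starting the ART runtime to select interpreted execution.
2. Even in native machine code mode some methods may need to be interpreted, and vice versa. An interpreted method calls a method that runs as native machine code through the function artInterpreterToCompiledCodeBridge; a method that runs as native machine code calls an interpreted method through the function returned by GetQuickToInterpreterBridge; an interpreted method calls another interpreted method through the function artInterpreterToInterpreterBridge.
3. In interpreted mode, every method except JNI methods and dynamic proxy methods is executed by the interpreter, and its entry point is set to the return value of GetQuickToInterpreterBridge.
4. An abstract method cannot be executed; it has to be implemented by a subclass. Its entry point is therefore set to the return value of GetQuickToInterpreterBridge, so that a call to an abstract method from native machine code can be detected. If such a call happens, that entry point throws an exception.
5. The execution mode of a static method is decided lazily, when its class is initialized. Until then its entry point is stood in for by the return value of GetQuickResolutionStub.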
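The five conclusions can be condensed into a small decision function. The sketch below is a simplified model, not ART code: MethodSketch, QuickEntry, InterpEntry and LinkCodeSketch are hypothetical names, and the enter_interpreter test assumes a reduced form of NeedsInterpreter (no compiled code, or interpreter-only mode for a non-native method) that ignores proxy methods.

```cpp
#include <cassert>

// Hypothetical labels for the entry points that LinkCode may install.
enum class QuickEntry { kCompiledCode, kToInterpreterBridge, kResolutionStub, kGenericJniStub };
enum class InterpEntry { kToInterpreterBridge, kToCompiledCodeBridge };

// Hypothetical, reduced description of a method.
struct MethodSketch {
  bool is_abstract = false;
  bool is_static = false;
  bool is_constructor = false;  // static + constructor means the class initializer
  bool is_native = false;
  bool has_compiled_code = false;
};

struct LinkResult {
  InterpEntry from_interpreter;
  QuickEntry from_quick;
};

inline LinkResult LinkCodeSketch(const MethodSketch& m, bool interpret_only) {
  // An abstract method is never expected to carry compiled code.
  assert(!(m.is_abstract && m.has_compiled_code));

  // Assumed simplification of NeedsInterpreter.
  const bool enter_interpreter = !m.has_compiled_code || (interpret_only && !m.is_native);

  LinkResult r;
  r.from_interpreter = (enter_interpreter && !m.is_native)
                           ? InterpEntry::kToInterpreterBridge
                           : InterpEntry::kToCompiledCodeBridge;
  r.from_quick = QuickEntry::kCompiledCode;

  if (m.is_abstract) {
    r.from_quick = QuickEntry::kToInterpreterBridge;  // conclusion 4
  } else if (m.is_static && !m.is_constructor) {
    r.from_quick = QuickEntry::kResolutionStub;       // conclusion 5
  } else if (enter_interpreter) {
    r.from_quick = m.is_native ? QuickEntry::kGenericJniStub
                               : QuickEntry::kToInterpreterBridge;  // conclusion 3
  }
  return r;
}
```

The order of the checks matters in the same way as in LinkCode: the abstract case returns first, and the static case takes precedence over the interpreter case.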

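Next we look closely at **artInterpreterToCompiledCodeBridge, GetQuickToInterpreterBridge, artInterpreterToInterpreterBridge and GetQuickResolutionStub**, together with the functions that the two Get* helpers return, in order to better understand the five conclusions above.
The function artInterpreterToCompiledCodeBridge is used by the interpreter to call a method that runs as native machine code. It is implemented as follows: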

extern "C" void artInterpreterToCompiledCodeBridge(Thread* self, const DexFile::CodeItem* code_item,
                                                   ShadowFrame* shadow_frame, JValue* result) {
  ArtMethod* method = shadow_frame->GetMethod();
  // Ensure static methods are initialized.
  if (method->IsStatic()) {
    mirror::Class* declaringClass = method->GetDeclaringClass();
    if (UNLIKELY(!declaringClass->IsInitialized())) {
      self->PushShadowFrame(shadow_frame);
      StackHandleScope<1> hs(self);
      Handle<mirror::Class> h_class(hs.NewHandle(declaringClass));
      if (UNLIKELY(!Runtime::Current()->GetClassLinker()->EnsureInitialized(self, h_class, true,
                                                                            true))) {
        self->PopShadowFrame();
        DCHECK(self->IsExceptionPending());
        return;
      }
      self->PopShadowFrame();
      CHECK(h_class->IsInitializing());
      // Reload from shadow frame in case the method moved, this is faster than adding a handle.
      method = shadow_frame->GetMethod();
    }
  }
  uint16_t arg_offset = (code_item == nullptr) ? 0 : code_item->registers_size_ - code_item->ins_size_;
  method->Invoke(self, shadow_frame->GetVRegArgs(arg_offset),
                 (shadow_frame->NumberOfVRegs() - arg_offset) * sizeof(uint32_t),
                 result, method->GetInterfaceMethodIfProxy(sizeof(void*))->GetShorty());
}

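This function is defined in art/runtime/entrypoints/interpreter/interpreter_entrypoints.cc.
The called method is described by an ArtMethod object, which can be obtained from the call frame shadow_frame. Once we have the ArtMethod object that describes the called method, we can call its member function Invoke to execute it. As we will see later, ArtMethod::Invoke finds the native machine code of the method and runs it.
In the original article, the way the incoming arguments were taken from the interpreter stack depended on whether the native code had been generated by the Quick or the Portable backend. In the Android 6.0 code above there is a single path: the arguments start at register arg_offset of the shadow frame (registers_size_ minus ins_size_, or 0 when there is no code item), and ArtMethod::Invoke is then called to execute the native machine code of the called method.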
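The arg_offset computation relies on the DEX convention that a method's incoming arguments occupy the last ins_size registers of its frame. A minimal sketch of that slicing follows; CodeItemSketch, ArgOffset and ArgsOf are hypothetical names, and a std::vector stands in for the shadow frame's register array.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two DexFile::CodeItem fields used here.
struct CodeItemSketch {
  std::uint16_t registers_size;  // total virtual registers of the method
  std::uint16_t ins_size;        // how many of them carry incoming arguments
};

// Index of the first argument register; 0 when there is no code item.
inline std::uint16_t ArgOffset(const CodeItemSketch* code_item) {
  if (code_item == nullptr) {
    return 0;
  }
  assert(code_item->ins_size <= code_item->registers_size);
  return static_cast<std::uint16_t>(code_item->registers_size - code_item->ins_size);
}

// Returns a copy of the registers that hold the incoming arguments.
inline std::vector<std::uint32_t> ArgsOf(const std::vector<std::uint32_t>& vregs,
                                         const CodeItemSketch* code_item) {
  const std::uint16_t offset = ArgOffset(code_item);
  assert(offset <= vregs.size());
  return std::vector<std::uint32_t>(vregs.begin() + offset, vregs.end());
}
```

The byte count passed to Invoke in the listing is the length of this slice multiplied by sizeof(uint32_t).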
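The function GetQuickToInterpreterBridge returns a function pointer. The function it points to is used to call an interpreted method from a method that runs as native machine code. It is implemented as follows: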

// Return the address of quick stub code for bridging from quick code to the interpreter.
extern "C" void art_quick_to_interpreter_bridge(ArtMethod*);
static inline const void* GetQuickToInterpreterBridge() {
  return reinterpret_cast<const void*>(art_quick_to_interpreter_bridge);
}

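This function is defined in art/runtime/entrypoints/runtime_asm_entrypoints.h.
The function art_quick_to_interpreter_bridge used inside it is architecture specific and written in assembly. It is the way from native machine code into the interpreter.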

/*
 * Called to bridge from the quick to interpreter ABI. On entry the arguments match those
 * of a quick call:
 * x0 = method being called/to bridge to.
 * x1..x7, d0..d7 = arguments to that method.
 */
ENTRY art_quick_to_interpreter_bridge
    SETUP_REFS_AND_ARGS_CALLEE_SAVE_FRAME   // Set up frame and save arguments.

    //  x0 will contain mirror::ArtMethod* method.
    mov x1, xSELF                          // How to get Thread::Current() ???
    mov x2, sp

    // uint64_t artQuickToInterpreterBridge(mirror::ArtMethod* method, Thread* self,
    //                                      mirror::ArtMethod** sp)
    bl   artQuickToInterpreterBridge

    RESTORE_REFS_AND_ARGS_CALLEE_SAVE_FRAME  // TODO: no need to restore arguments in this case.

    fmov d0, x0

    RETURN_OR_DELIVER_PENDING_EXCEPTION
END art_quick_to_interpreter_bridge

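This implementation is in art/runtime/arch/arm64/quick_entrypoints_arm64.S.
Clearly, art_quick_to_interpreter_bridge enters the interpreter from native machine code by calling another function, artQuickToInterpreterBridge.
The function artQuickToInterpreterBridge is implemented as follows: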

extern "C" uint64_t artQuickToInterpreterBridge(ArtMethod* method, Thread* self, ArtMethod** sp)
    SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
  // Ensure we don't get thread suspension until the object arguments are safely in the shadow
  // frame.
  ScopedQuickEntrypointChecks sqec(self);

  if (method->IsAbstract()) {
    ThrowAbstractMethodError(method);
    return 0;
  } else {
    DCHECK(!method->IsNative()) << PrettyMethod(method);
    const char* old_cause = self->StartAssertNoThreadSuspension(
        "Building interpreter shadow frame");
    const DexFile::CodeItem* code_item = method->GetCodeItem();
    DCHECK(code_item != nullptr) << PrettyMethod(method);
    uint16_t num_regs = code_item->registers_size_;
    void* memory = alloca(ShadowFrame::ComputeSize(num_regs));
    // No last shadow coming from quick.
    ShadowFrame* shadow_frame(ShadowFrame::Create(num_regs, nullptr, method, 0, memory));
    size_t first_arg_reg = code_item->registers_size_ - code_item->ins_size_;
    uint32_t shorty_len = 0;
    auto* non_proxy_method = method->GetInterfaceMethodIfProxy(sizeof(void*));
    const char* shorty = non_proxy_method->GetShorty(&shorty_len);
    BuildQuickShadowFrameVisitor shadow_frame_builder(sp, method->IsStatic(), shorty, shorty_len,
                                                      shadow_frame, first_arg_reg);
    shadow_frame_builder.VisitArguments();
    const bool needs_initialization =
        method->IsStatic() && !method->GetDeclaringClass()->IsInitialized();
    // Push a transition back into managed code onto the linked list in thread.
    ManagedStack fragment;
    self->PushManagedStackFragment(&fragment);
    self->PushShadowFrame(shadow_frame);
    self->EndAssertNoThreadSuspension(old_cause);

    if (needs_initialization) {
      // Ensure static method's class is initialized.
      StackHandleScope<1> hs(self);
      Handle<mirror::Class> h_class(hs.NewHandle(shadow_frame->GetMethod()->GetDeclaringClass()));
      if (!Runtime::Current()->GetClassLinker()->EnsureInitialized(self, h_class, true, true)) {
        DCHECK(Thread::Current()->IsExceptionPending()) << PrettyMethod(shadow_frame->GetMethod());
        self->PopManagedStackFragment(fragment);
        return 0;
      }
    }
    JValue result = interpreter::EnterInterpreterFromEntryPoint(self, code_item, shadow_frame);
    // Pop transition.
    self->PopManagedStackFragment(fragment);

    // Request a stack deoptimization if needed
    ArtMethod* caller = QuickArgumentVisitor::GetCallingMethod(sp);
    if (UNLIKELY(Dbg::IsForcedInterpreterNeededForUpcall(self, caller))) {
      self->SetException(Thread::GetDeoptimizationException());
      self->SetDeoptimizationReturnValue(result, shorty[0] == 'L');
    }

    // No need to restore the args since the method has already been run by the interpreter.
    return result.GetJ();
  }
}

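This function is defined in art/runtime/entrypoints/quick/quick_trampoline_entrypoints.cc.
What artQuickToInterpreterBridge does is find the DEX bytecode code_item of the called method, build an interpreter call frame shadow_frame from the arguments passed in, and then enter the interpreter through interpreter::EnterInterpreterFromEntryPoint.
Since the DEX bytecode of the method is known and its call frame has been built, it is not hard to see how the interpreter executes the method. For details, see the description of the Dalvik interpreter in the earlier article on how the Dalvik VM runs.
If the method to execute is a static method, we have to make sure that its declaring class has been initialized. If it has not, ClassLinker::EnsureInitialized is called to initialize it.
The function artInterpreterToInterpreterBridge is used to call from one interpreted method into another interpreted method. It is implemented as follows: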

extern "C" void artInterpreterToInterpreterBridge(Thread* self, const DexFile::CodeItem* code_item,
                                                  ShadowFrame* shadow_frame, JValue* result) {
  bool implicit_check = !Runtime::Current()->ExplicitStackOverflowChecks();
  if (UNLIKELY(__builtin_frame_address(0) < self->GetStackEndForInterpreter(implicit_check))) {
    ThrowStackOverflowError(self);
    return;
  }

  self->PushShadowFrame(shadow_frame);
  // Ensure static methods are initialized.
  const bool is_static = shadow_frame->GetMethod()->IsStatic();
  if (is_static) {
    mirror::Class* declaring_class = shadow_frame->GetMethod()->GetDeclaringClass();
    if (UNLIKELY(!declaring_class->IsInitialized())) {
      StackHandleScope<1> hs(self);
      HandleWrapper<Class> h_declaring_class(hs.NewHandleWrapper(&declaring_class));
      if (UNLIKELY(!Runtime::Current()->GetClassLinker()->EnsureInitialized(
          self, h_declaring_class, true, true))) {
        DCHECK(self->IsExceptionPending());
        self->PopShadowFrame();
        return;
      }
      CHECK(h_declaring_class->IsInitializing());
    }
  }

  if (LIKELY(!shadow_frame->GetMethod()->IsNative())) {
    result->SetJ(Execute(self, code_item, *shadow_frame, JValue()).GetJ());
  } else {
    // We don't expect to be asked to interpret native code (which is entered via a JNI compiler
    // generated stub) except during testing and image writing.
    CHECK(!Runtime::Current()->IsStarted());
    Object* receiver = is_static ? nullptr : shadow_frame->GetVRegReference(0);
    uint32_t* args = shadow_frame->GetVRegArgs(is_static ? 0 : 1);
    UnstartedRuntime::Jni(self, shadow_frame->GetMethod(), receiver, args, result);
  }

  self->PopShadowFrame();
}

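This function is defined in art/runtime/interpreter/interpreter.cc.
Comparing artInterpreterToInterpreterBridge with artQuickToInterpreterBridge shows that, although both jump into the interpreter to execute a called method, they are implemented differently. In the former, the calling method is itself running in the interpreter, so the interpreter frame needed for the called method is already prepared and the DEX bytecode of the called method is already known. The function Execute can therefore be called directly to continue in the interpreter.
As before, if the called method is a static method and its declaring class has not been initialized, ClassLinker::EnsureInitialized is called to make sure the class is initialized.
If the called method is a JNI method, this is not allowed once the ART runtime has started, which is what the CHECK on IsStarted() enforces. It is allowed before the runtime has started, which according to the code comment happens only during testing and image writing. JNI methods have their own calling path, and artInterpreterToInterpreterBridge is meant only for non-JNI methods, so in that pre-start case the call is handed to another function, UnstartedRuntime::Jni, which deals with native calls made before the runtime has started.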

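The function GetQuickResolutionStub returns a function that links a method lazily. That function serves as the entry point of methods that were not fully linked when their class was loaded, that is, methods whose real entry point has not been decided yet. For a method that has been linked, whether it is interpreted or runs as native machine code, the entry points have already been set through ArtMethod::SetEntryPointFromQuickCompiledCode and ArtMethod::SetEntryPointFromInterpreter. As noted above, the typical methods of this lazily linked kind are static methods, which are only linked when their class is initialized.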

// Return the address of quick stub code for resolving a method at first call.
extern "C" void art_quick_resolution_trampoline(ArtMethod*);
static inline const void* GetQuickResolutionStub() {
  return reinterpret_cast<const void*>(art_quick_resolution_trampoline);
}

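This function is defined in art/runtime/entrypoints/runtime_asm_entrypoints.h.
GetQuickResolutionStub returns the address of art_quick_resolution_trampoline, the quick stub code that resolves a method at its first call. Like the bridge above, art_quick_resolution_trampoline is architecture specific and written in assembly.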

ENTRY art_quick_resolution_trampoline
    SETUP_REFS_AND_ARGS_CALLEE_SAVE_FRAME
    mov x2, xSELF
    mov x3, sp
    bl artQuickResolutionTrampoline  // (called, receiver, Thread*, SP)
    cbz x0, 1f
    mov xIP0, x0            // Remember returned code pointer in xIP0.
    ldr x0, [sp, #0]        // artQuickResolutionTrampoline puts called method in *SP.
    RESTORE_REFS_AND_ARGS_CALLEE_SAVE_FRAME
    br xIP0
1:
    RESTORE_REFS_AND_ARGS_CALLEE_SAVE_FRAME
    DELIVER_PENDING_EXCEPTION
END art_quick_resolution_trampoline

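This function is implemented in art/runtime/arch/arm64/quick_entrypoints_arm64.S. art_quick_resolution_trampoline first calls (with bl) another function, artQuickResolutionTrampoline, to obtain the address of the function that really has to be called, and then jumps to that address with a br instruction. If the returned address is null, it delivers the pending exception instead. The purpose of artQuickResolutionTrampoline is to link a method lazily: the method is resolved and linked only when it is actually called, and only then is the real target determined.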
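The lazy-linking idea can be modelled with plain function pointers. The sketch below is an illustration only: MethodSketch, ResolveSketch and ResolutionStubSketch are hypothetical names. It also simplifies one point: here the resolver itself patches the entry point, whereas the comment in LinkCode says that ART replaces the trampoline in ClassLinker::FixupStaticTrampolines after the class has been initialized.

```cpp
#include <cassert>

struct MethodSketch;
using EntryPoint = int (*)(MethodSketch*, int);

// Hypothetical, reduced method descriptor.
struct MethodSketch {
  EntryPoint entry;      // where callers currently jump to
  EntryPoint real_code;  // the target that resolution will discover
  int resolve_count;     // how often the slow path ran
};

// Stands in for the method's real code.
inline int RealCode(MethodSketch*, int arg) { return arg * 2; }

// Plays the role of artQuickResolutionTrampoline: work out the real target.
inline EntryPoint ResolveSketch(MethodSketch* m) {
  assert(m != nullptr && m->real_code != nullptr);
  ++m->resolve_count;
  m->entry = m->real_code;  // simplification: patch the entry point right here
  return m->real_code;
}

// Plays the role of art_quick_resolution_trampoline: resolve, then transfer
// control to the returned code with the original arguments. The assembly uses
// a jump (br); a plain call is the closest portable equivalent.
inline int ResolutionStubSketch(MethodSketch* m, int arg) {
  EntryPoint code = ResolveSketch(m);
  return code(m, arg);
}
```

A method starts out with entry set to ResolutionStubSketch; the first call pays for resolution and later calls go straight to the real code.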
The function artQuickResolutionTrampoline is implemented as follows:

// Lazily resolve a method for quick. Called by stub code.
extern "C" const void* artQuickResolutionTrampoline(
    ArtMethod* called, mirror::Object* receiver, Thread* self, ArtMethod** sp)
    SHARED_LOCKS_REQUIRED(Locks::mutator_lock_) {
  ScopedQuickEntrypointChecks sqec(self);
  // Start new JNI local reference state
  JNIEnvExt* env = self->GetJniEnv();
  ScopedObjectAccessUnchecked soa(env);
  ScopedJniEnvLocalRefState env_state(env);
  ...
  // Compute details about the called method (avoid GCs)
  ClassLinker* linker = Runtime::Current()->GetClassLinker();
  ArtMethod* caller = QuickArgumentVisitor::GetCallingMethod(sp);
  InvokeType invoke_type;
  MethodReference called_method(nullptr, 0);
  const bool called_method_known_on_entry = !called->IsRuntimeMethod();
  if (!called_method_known_on_entry) {
    uint32_t dex_pc = caller->ToDexPc(QuickArgumentVisitor::GetCallingPc(sp));
    const DexFile::CodeItem* code;
    called_method.dex_file = caller->GetDexFile();
    code = caller->GetCodeItem();
    CHECK_LT(dex_pc, code->insns_size_in_code_units_);
    const Instruction* instr = Instruction::At(&code->insns_[dex_pc]);
    Instruction::Code instr_code = instr->Opcode();
    bool is_range;
    switch (instr_code) {
      case Instruction::INVOKE_DIRECT:
        invoke_type = kDirect;
        is_range = false;
        break;
      case Instruction::INVOKE_DIRECT_RANGE:
        invoke_type = kDirect;
        is_range = true;
        break;
      case Instruction::INVOKE_STATIC:
        invoke_type = kStatic;
        is_range = false;
        break;
      case Instruction::INVOKE_STATIC_RANGE:
        invoke_type = kStatic;
        is_range = true;
        break;
      case Instruction::INVOKE_SUPER:
        invoke_type = kSuper;
        is_range = false;
        break;
      case Instruction::INVOKE_SUPER_RANGE:
        invoke_type = kSuper;
        is_range = true;
        break;
      case Instruction::INVOKE_VIRTUAL:
        invoke_type = kVirtual;
        is_range = false;
        break;
      case Instruction::INVOKE_VIRTUAL_RANGE:
        invoke_type = kVirtual;
        is_range = true;
        break;
      case Instruction::INVOKE_INTERFACE:
        invoke_type = kInterface;
        is_range = false;
        break;
      case Instruction::INVOKE_INTERFACE_RANGE:
        invoke_type = kInterface;
        is_range = true;
        break;
      default:
        LOG(FATAL) << "Unexpected call into trampoline: " << instr->DumpString(nullptr);
        UNREACHABLE();
    }
    called_method.dex_method_index = (is_range) ? instr->VRegB_3rc() : instr->VRegB_35c();
  } else {
    invoke_type = kStatic;
    called_method.dex_file = called->GetDexFile();
    called_method.dex_method_index = called->GetDexMethodIndex();
  }
  ...
  // Resolve method filling in dex cache.
  if (!called_method_known_on_entry) {
    ...
    called = linker->ResolveMethod(self, called_method.dex_method_index, caller, invoke_type);
  }
  const void* code = nullptr;
  if (LIKELY(!self->IsExceptionPending())) {
    ...
    if (virtual_or_interface) {
      // Refine called method based on receiver.
      CHECK(receiver != nullptr) << invoke_type;

      ArtMethod* orig_called = called;
      if (invoke_type == kVirtual) {
        called = receiver->GetClass()->FindVirtualMethodForVirtual(called, sizeof(void*));
      } else {
        called = receiver->GetClass()->FindVirtualMethodForInterface(called, sizeof(void*));
      }
    ...
    // Ensure that the called method's class is initialized.
    StackHandleScope<1> hs(soa.Self());
    Handle<mirror::Class> called_class(hs.NewHandle(called->GetDeclaringClass()));
    linker->EnsureInitialized(soa.Self(), called_class, true, true);
    if (LIKELY(called_class->IsInitialized())) {
      if (UNLIKELY(Dbg::IsForcedInterpreterNeededForResolution(self, called))) {
        // If we are single-stepping or the called method is deoptimized (by a
        // breakpoint, for example), then we have to execute the called method
        // with the interpreter.
        code = GetQuickToInterpreterBridge();
      } else if (UNLIKELY(Dbg::IsForcedInstrumentationNeededForResolution(self, caller))) {
        // If the caller is deoptimized (by a breakpoint, for example), we have to
        // continue its execution with interpreter when returning from the called
        // method. Because we do not want to execute the called method with the
        // interpreter, we wrap its execution into the instrumentation stubs.
        // When the called method returns, it will execute the instrumentation
        // exit hook that will determine the need of the interpreter with a call
        // to Dbg::IsForcedInterpreterNeededForUpcall and deoptimize the stack if
        // it is needed.
        code = GetQuickInstrumentationEntryPoint();
      } else {
        code = called->GetEntryPointFromQuickCompiledCode();
      }
    } else if (called_class->IsInitializing()) {
      if (UNLIKELY(Dbg::IsForcedInterpreterNeededForResolution(self, called))) {
        // If we are single-stepping or the called method is deoptimized (by a
        // breakpoint, for example), then we have to execute the called method
        // with the interpreter.
        code = GetQuickToInterpreterBridge();
      } else if (invoke_type == kStatic) {
        // Class is still initializing, go to oat and grab code (trampoline must be left in place
        // until class is initialized to stop races between threads).
        code = linker->GetQuickOatCodeFor(called);
      } else {
        // No trampoline for non-static methods.
        code = called->GetEntryPointFromQuickCompiledCode();
      }
    } else {
      DCHECK(called_class->IsErroneous());
    }
  }
  ...
  // Place called method in callee-save frame to be placed as first argument to quick method.
  *sp = called;

  return code;
}

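This function is defined in art/runtime/entrypoints/quick/quick_trampoline_entrypoints.cc.
The first parameter, called, is the method being called. The second parameter, receiver, is the object the method is invoked on, that is, the object that receives the message. The third parameter, self, is the current thread, and the fourth parameter, sp, points to the top of the call stack. QuickArgumentVisitor::GetCallingMethod finds the caller of called on the call stack and stores it in the variable caller.
The called method may be a runtime method. A runtime method is a stand-in whose job is to lead us to the method it replaces. When called is a runtime method, its member function IsRuntimeMethod returns true, and we have to find the method it stands for. How? The runtime method is only an empty shell and offers no clue of its own, but the invoke instruction in the DEX bytecode does. In DEX bytecode, one method calls another through instructions such as invoke-static, invoke-direct, invoke-interface, invoke-super and invoke-virtual, or their range variants. One operand of these instructions (read with VRegB_35c() or VRegB_3rc() in the code above) holds the index of the called method in the DEX file, dex_method_index. With this index we can look up the replaced method in the corresponding DEX file. That raises a second question: which DEX file? The index in the invoke instruction is relative to the DEX file of the calling method, so the DEX file dex_file is obtained from caller. With these two pieces of information, artQuickResolutionTrampoline calls ClassLinker::ResolveMethod to find the replaced method and stores the result back in called. If, on the other hand, called is not a runtime method, things are much simpler, because called already describes the method to invoke.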
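To make the decoding step concrete, the sketch below extracts the invoke kind and the method index from raw DEX code units, assuming the standard Dalvik instruction formats 35c and 3rc, in which the opcode is the low byte of the first 16-bit code unit and the method index is the second code unit. DecodedInvoke, InvokeKind and DecodeInvoke are hypothetical names, not ART's Instruction API.

```cpp
#include <cassert>
#include <cstdint>

enum class InvokeKind { kVirtual, kSuper, kDirect, kStatic, kInterface };

struct DecodedInvoke {
  InvokeKind kind;
  bool is_range;
  std::uint16_t method_index;
};

// Decodes one invoke instruction starting at insns[0]. Returns false if the
// opcode is not one of the invoke-kind or invoke-kind/range opcodes.
inline bool DecodeInvoke(const std::uint16_t* insns, DecodedInvoke* out) {
  assert(insns != nullptr && out != nullptr);
  const std::uint8_t opcode = static_cast<std::uint8_t>(insns[0] & 0xff);
  std::uint8_t base;
  bool is_range;
  if (opcode >= 0x6e && opcode <= 0x72) {         // invoke-virtual .. invoke-interface
    base = 0x6e;
    is_range = false;
  } else if (opcode >= 0x74 && opcode <= 0x78) {  // the /range variants
    base = 0x74;
    is_range = true;
  } else {
    return false;
  }
  // Both opcode groups list the kinds in the same order.
  static constexpr InvokeKind kKinds[] = {InvokeKind::kVirtual, InvokeKind::kSuper,
                                          InvokeKind::kDirect, InvokeKind::kStatic,
                                          InvokeKind::kInterface};
  out->kind = kKinds[opcode - base];
  out->is_range = is_range;
  out->method_index = insns[1];  // the BBBB operand in both 35c and 3rc
  return true;
}
```

The switch statement in artQuickResolutionTrampoline performs the same classification with named opcodes instead of numeric ranges.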
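After this processing, the ArtMethod object that called points to is still not necessarily the method that will finally be invoked, because the call may be a virtual call or an interface call. In both cases the real target has to be determined from the receiving object, receiver. To do so, we first call Object::GetClass to obtain the class object of receiver, and then call Class::FindVirtualMethodForVirtual or Class::FindVirtualMethodForInterface to obtain the method that will really be called. The former handles virtual calls, the latter interface calls.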
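The refinement by receiver can be pictured as a vtable lookup. The following sketch assumes a simple model in which every class carries a vtable and an overriding method occupies the same slot as the method it overrides; MethodSketch, ClassSketch, ObjectSketch and FindVirtualMethodForVirtualSketch are hypothetical names, and ART's real data structures are more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, reduced model of methods, classes and objects.
struct MethodSketch {
  std::string name;
  std::size_t vtable_index;  // slot shared by a method and its overrides
};

struct ClassSketch {
  std::vector<const MethodSketch*> vtable;
};

struct ObjectSketch {
  const ClassSketch* klass;
};

// Given the statically resolved method, pick the implementation that the
// receiver's class actually stores in the same vtable slot.
inline const MethodSketch* FindVirtualMethodForVirtualSketch(const ObjectSketch& receiver,
                                                             const MethodSketch& resolved) {
  const ClassSketch* klass = receiver.klass;
  assert(klass != nullptr);
  assert(resolved.vtable_index < klass->vtable.size());
  return klass->vtable[resolved.vtable_index];
}
```

A call that was resolved against a base class therefore ends up in the override of the receiver's dynamic class.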
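The method that will really be called is still stored in called. The work is not finished, though, because the class that declares this method may not be initialized yet. Before going on, we therefore call ClassLinker::EnsureInitialized to make sure that the class of called is initialized. If the class has not been initialized, EnsureInitialized initializes it, but it may return before the initialization has completed. Two cases can therefore arise; a class whose initialization failed is in the erroneous state, and no code pointer is selected for it.
In the first case the class of called is already initialized. We can then call GetEntryPointFromQuickCompiledCode on the method directly to obtain its entry point, which is either the method's native machine code or the bridge into the interpreter, depending on how the method is executed.
In the second case the class of called is still being initialized. Here static and non-static calls have to be distinguished. Before going further, note that loading a class and initializing a class are two different operations, and loading does not necessarily involve initialization. The only thing we know for certain at this point is that the class of called has been loaded (otherwise its methods could not be called). From the earlier article on how ART loads classes and methods we also know that, when a class is loaded, all of its methods except the static ones are already linked. For a non-static method we can therefore call ArtMethod::GetEntryPointFromQuickCompiledCode directly to obtain its entry point. For a static method, the only option is to look up the native machine code of called in the OAT file, which is done by calling ClassLinker::GetQuickOatCodeFor. If the static method has no native machine code, GetQuickOatCodeFor returns the address of the entry function into the interpreter, so that the method can be executed by the interpreter.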
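The case analysis above, together with the two debugger checks visible in the listing, can be summarized as a small decision function. This is a sketch with hypothetical names (ClassStatusSketch, CodeChoice, ResolutionInputs, SelectCodeSketch); the booleans stand in for the results of the Dbg:: queries and for invoke_type == kStatic.

```cpp
#include <cassert>

enum class ClassStatusSketch { kInitialized, kInitializing, kErroneous };

// Hypothetical labels for where the returned code pointer comes from.
enum class CodeChoice {
  kToInterpreterBridge,   // GetQuickToInterpreterBridge()
  kInstrumentationEntry,  // GetQuickInstrumentationEntryPoint()
  kMethodEntryPoint,      // called->GetEntryPointFromQuickCompiledCode()
  kOatCode,               // linker->GetQuickOatCodeFor(called)
  kNone                   // code stays nullptr
};

struct ResolutionInputs {
  ClassStatusSketch status;
  bool is_static_invoke;
  bool force_interpreter;      // debugger needs the callee interpreted
  bool force_instrumentation;  // debugger needs instrumentation around the callee
};

inline CodeChoice SelectCodeSketch(const ResolutionInputs& in) {
  switch (in.status) {
    case ClassStatusSketch::kInitialized:
      if (in.force_interpreter) return CodeChoice::kToInterpreterBridge;
      if (in.force_instrumentation) return CodeChoice::kInstrumentationEntry;
      return CodeChoice::kMethodEntryPoint;
    case ClassStatusSketch::kInitializing:
      if (in.force_interpreter) return CodeChoice::kToInterpreterBridge;
      // Static methods still carry the trampoline, so go to the OAT file.
      return in.is_static_invoke ? CodeChoice::kOatCode : CodeChoice::kMethodEntryPoint;
    case ClassStatusSketch::kErroneous:
      return CodeChoice::kNone;
  }
  assert(false && "unreachable");
  return CodeChoice::kNone;
}
```

The only branch that bypasses the method's stored entry point in normal execution is the static call into a class that is still initializing, because that entry point is still the resolution trampoline.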
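Finally, artQuickResolutionTrampoline returns the entry address code of the method that is really being called to the previous function, art_quick_resolution_trampoline, so that the latter can jump to it with br. Before returning, artQuickResolutionTrampoline also stores the method object that is really being called at the top of the stack, so that this method sees a correct call frame when it runs.
This completes the analysis of artQuickResolutionTrampoline. The runtime methods mentioned above still need further explanation, however. Only once the role of runtime methods is understood can the role of artQuickResolutionTrampoline be fully understood.
Runtime methods are tied to another mechanism called the Dex Cache. In the ART runtime, every DEX file has an associated Dex Cache that caches the information already resolved from that DEX file, such as methods and fields. The Dex Cache is described by the class DexCache, which is defined as follows: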

// C++ mirror of java.lang.DexCache.
class MANAGED DexCache FINAL : public Object {
 public:
  ...
 private:
  HeapReference
