How the Android VM Starts a Thread, and How to Get the Real Thread ID of a Java Thread

Background

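Recently, in a project, I needed to monitor the CPU usage of a specific Java thread. That requires reading the file /proc/{pid}/task/{tid}/stat, where tid is the system-level (kernel) thread ID. The Thread object in the Java layer offers no API for obtaining that system-level thread ID, so I went back and re-read the thread-related source code of the ART virtual machine.
This article summarizes what I learned from the ART source about how a Java thread is created, and ends with a hands-on way to obtain the tid of a Java thread.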
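
To make the goal concrete, here is a minimal Java sketch of the parsing step. It assumes the field layout documented in proc(5), where utime is field 14 and stime is field 15, both in clock ticks. The class name ThreadStatParser and the sample line are made up for illustration.

```java
/** Parses the CPU-time fields out of one line of /proc/{pid}/task/{tid}/stat. */
final class ThreadStatParser {
    // Field numbers as documented in proc(5); numbering starts at 1.
    private static final int FIELD_UTIME = 14;
    private static final int FIELD_STIME = 15;

    /** Returns {utime, stime}, both measured in clock ticks. */
    static long[] parseCpuTicks(String statLine) {
        // Field 2 (comm) is wrapped in parentheses and may itself contain spaces
        // or ')', so only the text after the LAST ')' is split on whitespace.
        int end = statLine.lastIndexOf(')');
        if (end < 0) {
            throw new IllegalArgumentException("not a stat line: " + statLine);
        }
        String[] rest = statLine.substring(end + 1).trim().split("\\s+");
        // rest[0] is field 3 (state), so field N is found at rest[N - 3].
        long utime = Long.parseLong(rest[FIELD_UTIME - 3]);
        long stime = Long.parseLong(rest[FIELD_STIME - 3]);
        return new long[] {utime, stime};
    }

    public static void main(String[] args) {
        // Hypothetical sample line; a real one has more trailing fields.
        String sample = "1234 (my thread) R 1 1234 1234 0 -1 4194560 10 0 0 0 250 50 0 0 20 0 1 0 100 0 0";
        long[] ticks = parseCpuTicks(sample);
        System.out.println("utime=" + ticks[0] + " stime=" + ticks[1]);
    }
}
```

Splitting after the last closing parenthesis matters because a thread name such as "my thread" would otherwise shift every later field by one.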

Creating and running a Java Thread

new Thread

public
class Thread implements Runnable {
    /**
     * Reference to the native thread object.
     *
     * Is 0 if the native thread has not yet been created/started, or has been destroyed.
     */
    private volatile long nativePeer;
    // END Android-added: Android specific fields lock, nativePeer.

    private volatile String name;
    private int priority;

    /* For autonumbering anonymous threads. */
    private static int threadInitNumber;
    private static synchronized int nextThreadNum() {
        return threadInitNumber++;
    }

    /*
     * Thread ID
     */
    private long tid;

    /* For generating thread ID */
    private static long threadSeqNumber;

    public Thread() {
        init(null, null, "Thread-" + nextThreadNum(), 0);
    }

    public Thread(Runnable target) {
        init(null, target, "Thread-" + nextThreadNum(), 0);
    }

    public Thread(String name) {
        init(null, null, name, 0);
    }

    //...

    /**
     * Initializes a Thread.
     */
    private void init(ThreadGroup g, Runnable target, String name,
                      long stackSize, AccessControlContext acc) {
        if (name == null) {
            throw new NullPointerException("name cannot be null");
        }
        this.name = name;

        Thread parent = currentThread();
        if (g == null) {
            g = parent.getThreadGroup();
        }
        g.addUnstarted();
        this.group = g;

        this.daemon = parent.isDaemon();
        this.priority = parent.getPriority();
        this.target = target;
        init2(parent);

        /* Stash the specified stack size in case the VM cares */
        this.stackSize = stackSize;

        /* Set thread ID */
        tid = nextThreadID();
    }
}

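Let's get straight to the point and first revisit how the Java-layer Thread class is constructed and started. Creating a Thread object runs the following logic in its constructor.
First, the Java-layer thread name is set. Thread has several overloaded constructors; if the one you call does not take an explicit name, the default name is "Thread-" followed by an incrementing number. That number increases each time such a constructor runs, and its only purpose is to tell apart threads that were not given an explicit name.
Whichever constructor is called, the work is the same: it configures the thread's group, daemon, priority, stackSize and other attributes.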

 /* Set thread ID */
tid = nextThreadID();

At the end of the constructor a tid is assigned. Note that this tid has nothing to do with the Linux system-level tid; it is only a Java-level thread ID.
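
The naming and ID behaviour described above can be observed with a few lines of plain Java. This is only a sketch with a made-up class name (JavaThreadIdDemo); it shows the Java-level values and makes no claim about the kernel tid.

```java
/** Shows the default name and the Java-level ID assigned in the Thread constructor. */
final class JavaThreadIdDemo {
    /** True if the thread still carries the auto-generated "Thread-N" name. */
    static boolean hasDefaultName(Thread t) {
        return t.getName().startsWith("Thread-");
    }

    public static void main(String[] args) {
        Thread first = new Thread(() -> { });
        Thread second = new Thread(() -> { });
        Thread named = new Thread(() -> { }, "worker");

        // Both values exist before start(): they are assigned by the constructor,
        // long before any operating-system thread is created.
        System.out.println(first.getName() + " javaId=" + first.getId());
        System.out.println(second.getName() + " javaId=" + second.getId());
        System.out.println(named.getName() + " javaId=" + named.getId());
    }
}
```

Because none of these threads is ever started, no kernel thread exists for them, which is one more way to see that the Java-level ID cannot be the system tid.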

start Thread

    public synchronized void start() {
        // Throw if the thread has already been started
        if (started)
            throw new IllegalThreadStateException();
        // Add the thread to its group
        group.add(this);

        started = false;
        try {
            // Call the native function that does the real thread creation, passing the Java Thread object, the stack size and the daemon flag
            nativeCreate(this, stackSize, daemon);
            started = true;
        } finally {
            try {
                // Handle a failed start
                if (!started) {
                    group.threadStartFailed(this);
                }
            } catch (Throwable ignore) {
                /* do nothing. If start0 threw a Throwable then
                  it will be passed up the call stack */
            }
        }
    }

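After a Thread object is created, the real thread creation only begins once start() is called on it (new Thread consumes almost no resources; it can be seen as the construction of an ordinary object). Inside start(), the native nativeCreate function is called to carry out the actual creation of the thread at the system-resource level.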
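
The started check at the top of start() is easy to observe from application code. A small sketch, with a made-up class name, that starts a thread, waits for it to finish, and then tries to start it again:

```java
/** Demonstrates that a Thread object can be started only once. */
final class StartTwiceDemo {
    /** Returns true if the second start() call was rejected. */
    static boolean secondStartRejected() {
        Thread t = new Thread(() -> { });
        // Before start() the object is just a plain Java object in state NEW.
        if (t.getState() != Thread.State.NEW) {
            throw new IllegalStateException("expected NEW, got " + t.getState());
        }
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            t.start(); // the thread has already run once
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second start rejected: " + secondStartRejected());
    }
}
```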

Thread_nativeCreate

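Next, let's look at how the native layer creates the thread. nativeCreate is implemented by Thread::CreateNativeThread in thread.cc. Because the flow is fairly long, its important internal functions are analyzed in separate subsections.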

void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
  CHECK(java_peer != nullptr);
  Thread* self = static_cast<JNIEnvExt*>(env)->GetSelf();
  // Precondition: if the runtime is shutting down, throw InternalError and return
  Runtime* runtime = Runtime::Current();
  // Atomically start the birth of the thread ensuring the runtime isn't shutting down.
  bool thread_start_during_shutdown = false;
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    if (runtime->IsShuttingDownLocked()) {
      thread_start_during_shutdown = true;
    } else {
      runtime->StartThreadBirth();
    }
  }
  if (thread_start_during_shutdown) {
    ScopedLocalRef<jclass> error_class(env, env->FindClass("java/lang/InternalError"));
    env->ThrowNew(error_class.get(), "Thread starting during runtime shutdown");
    return;
  }
  // Step 1: create the ART VM's own Thread object
  Thread* child_thread = new Thread(is_daemon);
  // Use global JNI ref to hold peer live while child thread starts.
  child_thread->tlsPtr_.jpeer = env->NewGlobalRef(java_peer);
  stack_size = FixStackSize(stack_size);

  // Thread.start is synchronized, so we know that nativePeer is 0, and know that we're not racing
  // to assign it.
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer,
                    reinterpret_cast<jlong>(child_thread));

  // Try to allocate a JNIEnvExt for the thread. We do this here as we might be out of memory and
  // do not have a good way to report this on the child's side.
  std::string error_msg;
  std::unique_ptr<JNIEnvExt> child_jni_env_ext(
      JNIEnvExt::Create(child_thread, Runtime::Current()->GetJavaVM(), &error_msg));

  int pthread_create_result = 0;
  if (child_jni_env_ext.get() != nullptr) {
    pthread_t new_pthread;
    pthread_attr_t attr;
    child_thread->tlsPtr_.tmp_jni_env = child_jni_env_ext.get();
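    // Create the underlying Linux thread (the pthread_attr_t setup is elided in this listing; it appears in the detailed excerpt later)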
    pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);
    CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");

    if (pthread_create_result == 0) {
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      child_jni_env_ext.release();  // NOLINT pthreads API.
      return;
    }
  }

  // Either JNIEnvExt::Create or pthread_create(3) failed, so clean up.
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    runtime->EndThreadBirth();
  }
  // Manually delete the global reference since Thread::Init will not have been run. Make sure
  // nothing can observe both opeer and jpeer set at the same time.
  child_thread->DeleteJPeer(env);
  delete child_thread;
  child_thread = nullptr;
  // Reset nativePeer on the Java Thread object to 0
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer, 0);
  {
    std::string msg(child_jni_env_ext.get() == nullptr ?
        StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
        StringPrintf("pthread_create (%s stack) failed: %s",
                                 PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
    ScopedObjectAccess soa(env);
    soa.Self()->ThrowOutOfMemoryError(msg.c_str());
  }
}

new Thread

  // Beginning of Thread_nativeCreate ↓
  Thread* child_thread = new Thread(is_daemon);
  // Use global JNI ref to hold peer live while child thread starts.
  child_thread->tlsPtr_.jpeer = env->NewGlobalRef(java_peer);
  // Adjust the requested stack size
  stack_size = FixStackSize(stack_size);

  // Thread.start is synchronized, so we know that nativePeer is 0, and know that we're not racing
  // to assign it.
  // This line is explained further below
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer,
                    reinterpret_cast<jlong>(child_thread));

  // Try to allocate a JNIEnvExt for the thread. We do this here as we might be out of memory and
  // do not have a good way to report this on the child's side.
  std::string error_msg;
  std::unique_ptr<JNIEnvExt> child_jni_env_ext(
      JNIEnvExt::Create(child_thread, Runtime::Current()->GetJavaVM(), &error_msg));

  Thread* child_thread = new Thread(is_daemon);

First, new Thread() constructs the Thread object that the ART VM uses in its native layer. Let's take a quick look at the constructor.

Thread::Thread(bool daemon)
    : tls32_(daemon),
      wait_monitor_(nullptr),
      is_runtime_thread_(false) {
  wait_mutex_ = new Mutex("a thread wait mutex", LockLevel::kThreadWaitLock);
  wait_cond_ = new ConditionVariable("a thread wait condition variable", *wait_mutex_);
  tlsPtr_.mutator_lock = Locks::mutator_lock_;
  tlsPtr_.instrumentation_stack =
      new std::map<uintptr_t, instrumentation::InstrumentationStackFrame>;
  tlsPtr_.name.store(kThreadNameDuringStartup, std::memory_order_relaxed);

  static_assert((sizeof(Thread) % 4) == 0U,
                "art::Thread has a size which is not a multiple of 4.");
  StateAndFlags state_and_flags = StateAndFlags(0u).WithState(ThreadState::kNative);
  tls32_.state_and_flags.store(state_and_flags.GetValue(), std::memory_order_relaxed);
  tls32_.interrupted.store(false, std::memory_order_relaxed);
  // Initialize with no permit; if the java Thread was unparked before being
  // started, it will unpark itself before calling into java code.
  tls32_.park_state_.store(kNoPermit, std::memory_order_relaxed);
  memset(&tlsPtr_.held_mutexes[0], 0, sizeof(tlsPtr_.held_mutexes));
  std::fill(tlsPtr_.rosalloc_runs,
            tlsPtr_.rosalloc_runs + kNumRosAllocThreadLocalSizeBracketsInThread,
            gc::allocator::RosAlloc::GetDedicatedFullRun());
  tlsPtr_.checkpoint_function = nullptr;
  for (uint32_t i = 0; i < kMaxSuspendBarriers; ++i) {
    tlsPtr_.active_suspend_barriers[i] = nullptr;
  }
  tlsPtr_.flip_function = nullptr;
  tlsPtr_.thread_local_mark_stack = nullptr;
  tls32_.is_transitioning_to_runnable = false;
  ResetTlab();
}

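The constructor is fairly simple; it mainly initializes member variables. The Thread class has a great many fields, and they are grouped by type (pointer-sized values, 32-bit values, 64-bit values) into the tlsPtr_, tls32_ and tls64_ structs. Taking the struct behind tls32_ as an example, it holds the thread state, the suspend count, the tid, the daemon flag, whether an OutOfMemoryError is being thrown, and so on.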

  struct PACKED(4) tls_32bit_sized_values {
    explicit tls_32bit_sized_values(bool is_daemon)
        : state_and_flags(0u),
          suspend_count(0),
          thin_lock_thread_id(0),
          tid(0),
          daemon(is_daemon),
          throwing_OutOfMemoryError(false),
          no_thread_suspension(0),
          thread_exit_check_count(0),
          //...
        {}

    // The state and flags field must be changed atomically so that flag values aren't lost.
    // See `StateAndFlags` for bit assignments of `ThreadFlag` and `ThreadState` values.
    // Keeping the state and flags together allows an atomic CAS to change from being
    // Suspended to Runnable without a suspend request occurring.
    Atomic<uint32_t> state_and_flags;
    static_assert(sizeof(state_and_flags) == sizeof(uint32_t),
                  "Size of state_and_flags and uint32 are different");

    // A non-zero value is used to tell the current thread to enter a safe point
    // at the next poll.
    int suspend_count GUARDED_BY(Locks::thread_suspend_count_lock_);

    // Thin lock thread id. This is a small integer used by the thin lock implementation.
    // This is not to be confused with the native thread's tid, nor is it the value returned
    // by java.lang.Thread.getId --- this is a distinct value, used only for locking. One
    // important difference between this id and the ids visible to managed code is that these
    // ones get reused (to ensure that they fit in the number of bits available).
    uint32_t thin_lock_thread_id;

    // System thread id.
    uint32_t tid;

    // Is the thread a daemon?
    const bool32_t daemon;

    //...
  } tls32_;

Back in CreateNativeThread, let's continue with the rest of the flow.

  // Store the address of the newly created Thread object in the nativePeer field of the Java Thread object
  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer,
                    reinterpret_cast<jlong>(child_thread));

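After the Thread object has been created, its pointer address is written back to the nativePeer member of the Java-layer Thread object.
The meaning of this value is also described in the comment on the Java Thread class: the field refers to the native thread object, so a value of 0 means that the native thread has not yet been created or started, or that it has already been destroyed.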

public
class Thread implements Runnable {
    /**
     * Reference to the native thread object.
     *
     * Is 0 if the native thread has not yet been created/started, or has been destroyed.
     */
    private volatile long nativePeer;
}

With that in mind, in the Java layer we can read nativePeer through reflection to obtain the pointer to the VM's native thread object. The code is as follows:

    public static final long getNativePeer(Thread t)throws IllegalAccessException{
        try {
            Field nativePeerField = Thread.class.getDeclaredField("nativePeer");
            nativePeerField.setAccessible(true);
            Long nativePeer = (Long) nativePeerField.get(t);
            return nativePeer;
        } catch (NoSuchFieldException e) {
            throw new IllegalAccessException("failed to get nativePeer value");
        } catch (IllegalAccessException e) {
            throw e;
        }
    }

For now, let's continue with the rest of the native thread-creation flow.


  // Try to allocate a JNIEnvExt for the thread. We do this here as we might be out of memory and
  // do not have a good way to report this on the child's side.
  // Create the JNIEnvExt the child thread needs, on the current thread
  std::string error_msg;
  std::unique_ptr<JNIEnvExt> child_jni_env_ext(
      JNIEnvExt::Create(child_thread, Runtime::Current()->GetJavaVM(), &error_msg));

  int pthread_create_result = 0;
  if (child_jni_env_ext.get() != nullptr) {
    pthread_t new_pthread;
    pthread_attr_t attr;
    child_thread->tlsPtr_.tmp_jni_env = child_jni_env_ext.get();
    // Initialize the thread attribute object
    CHECK_PTHREAD_CALL(pthread_attr_init, (&attr), "new thread");
    // Create the thread in the detached state
    CHECK_PTHREAD_CALL(pthread_attr_setdetachstate, (&attr, PTHREAD_CREATE_DETACHED),
                       "PTHREAD_CREATE_DETACHED");
    // Set the thread stack size
    CHECK_PTHREAD_CALL(pthread_attr_setstacksize, (&attr, stack_size), stack_size);
    // Call pthread_create to create the thread
    pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);
    // Destroy the attribute object
    CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");
    
    if (pthread_create_result == 0) {
      // The thread was created successfully
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      child_jni_env_ext.release();  // NOLINT pthreads API.
      return;
    }
  }
  // Everything below handles a failed thread creation
 // Either JNIEnvExt::Create or pthread_create(3) failed, so clean up.
  {
    MutexLock mu(self, *Locks::runtime_shutdown_lock_);
    runtime->EndThreadBirth();
  }
  // Manually delete the global reference since Thread::Init will not have been run. Make sure
  // nothing can observe both opeer and jpeer set at the same time.
  // Clean up resources
  child_thread->DeleteJPeer(env);
  delete child_thread;
  child_thread = nullptr;

  env->SetLongField(java_peer, WellKnownClasses::java_lang_Thread_nativePeer, 0);
  {
    std::string msg(child_jni_env_ext.get() == nullptr ?
        StringPrintf("Could not allocate JNI Env: %s", error_msg.c_str()) :
        StringPrintf("pthread_create (%s stack) failed: %s",
                                 PrettySize(stack_size).c_str(), strerror(pthread_create_result)));
    ScopedObjectAccess soa(env);
    // Throw OutOfMemoryError
    soa.Self()->ThrowOutOfMemoryError(msg.c_str());
  }

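After nativePeer has been assigned, a JNIEnvExt object is created for the child thread on the current thread. JNIEnvExt represents the JNI environment; every system thread that needs to interact with the Java world must have its own JNIEnv object. Because of the error handling needed when memory is too low to create it, the JNIEnvExt is created on the current thread rather than on the child thread: if creation fails, the OOM exception can be thrown directly on the current thread, which makes the failure much easier to handle.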

pthread_create

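Once the JNIEnvExt object has been initialized, pthread_create is called to create the operating-system-level thread. As with the JNIEnvExt creation, if pthread_create fails, resources are cleaned up and an OOM exception is thrown.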
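
Seen from the Java side, the stack_size handed to pthread_attr_setstacksize starts out as the stackSize argument of the four-argument Thread constructor. The sketch below uses made-up names (StackSizeHintDemo, runWithStackSize); keep in mind that the Thread Javadoc treats stackSize as a hint that a VM may adjust or ignore.

```java
/** Runs a small recursive task on a thread created with an explicit stack size hint. */
final class StackSizeHintDemo {
    /** Starts a thread with the given stack size hint and returns the recursion depth reached. */
    static int runWithStackSize(long stackSizeBytes, int depth) {
        int[] result = new int[1];
        // Thread(ThreadGroup, Runnable, String, long) is the constructor that carries stackSize.
        Thread t = new Thread(null, () -> result[0] = recurse(depth), "stack-demo", stackSizeBytes);
        t.start();
        try {
            t.join(); // join() also makes the write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    private static int recurse(int n) {
        return n == 0 ? 0 : 1 + recurse(n - 1);
    }

    public static void main(String[] args) {
        System.out.println("depth reached: " + runWithStackSize(512 * 1024, 1000));
    }
}
```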

Thread::Init

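Note the arguments passed to pthread_create in the previous section: the third argument makes Thread::CreateCallback the function the new thread runs, and the argument passed to it is the art::Thread object that was just created. Let's continue with the logic of CreateCallback.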

void* Thread::CreateCallback(void* arg) {
  Thread* self = reinterpret_cast<Thread*>(arg);
  Runtime* runtime = Runtime::Current();
  {
    //..
    // Call Thread::Init() to initialize the Thread object
    CHECK(self->Init(runtime->GetThreadList(), runtime->GetJavaVM(), self->tlsPtr_.tmp_jni_env));
    self->tlsPtr_.tmp_jni_env = nullptr;
    Runtime::Current()->EndThreadBirth();
  }
  {
    ScopedObjectAccess soa(self);
    self->InitStringEntryPoints();

    // Copy peer into self, deleting global reference when done.
    CHECK(self->tlsPtr_.jpeer != nullptr);
    self->tlsPtr_.opeer = soa.Decode<mirror::Object>(self->tlsPtr_.jpeer).Ptr();
    // Make sure nothing can observe both opeer and jpeer set at the same time.
    self->DeleteJPeer(self->GetJniEnv());

    // Set the thread name
    self->SetThreadName(self->GetThreadName()->ToModifiedUtf8().c_str());

    ArtField* priorityField = jni::DecodeArtField(WellKnownClasses::java_lang_Thread_priority);
    // Set the thread priority
    self->SetNativePriority(priorityField->GetInt(self->tlsPtr_.opeer));
    
    runtime->GetRuntimeCallbacks()->ThreadStart(self);

    ArtField* unparkedField = jni::DecodeArtField(
        WellKnownClasses::java_lang_Thread_unparkedBeforeStart);
    bool should_unpark = false;
    {
      art::MutexLock mu(soa.Self(), *art::Locks::thread_list_lock_);
      should_unpark = unparkedField->GetBoolean(self->tlsPtr_.opeer) == JNI_TRUE;
    }
    if (should_unpark) {
      self->Unpark();
    }
    // Invoke the 'run' method of our java.lang.Thread.
    ObjPtr<mirror::Object> receiver = self->tlsPtr_.opeer;
    jmethodID mid = WellKnownClasses::java_lang_Thread_run;
    ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
    InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);
  }
  // Detach and delete self.
  Runtime::Current()->GetThreadList()->Unregister(self);

  return nullptr;
}

CreateCallback first calls Thread::Init() to initialize the Thread object, so let's follow Init first.

bool Thread::Init(ThreadList* thread_list, JavaVMExt* java_vm, JNIEnvExt* jni_env_ext) {
  //..
  // Set pthread_self_ ahead of pthread_setspecific, that makes Thread::Current function, this
  // avoids pthread_self_ ever being invalid when discovered from Thread::Current().
  tlsPtr_.pthread_self = pthread_self();

  ScopedTrace trace("Thread::Init");

  SetUpAlternateSignalStack();
  if (!InitStackHwm()) {
    return false;
  }
  InitCpu();
  InitTlsEntryPoints();
  RemoveSuspendTrigger();
  InitCardTable();
  InitTid();

#ifdef __BIONIC__
  __get_tls()[TLS_SLOT_ART_THREAD_SELF] = this;
#else
  CHECK_PTHREAD_CALL(pthread_setspecific, (Thread::pthread_key_self_, this), "attach self");
  Thread::self_tls_ = this;
#endif

  tls32_.thin_lock_thread_id = thread_list->AllocThreadId(this);

  if (jni_env_ext != nullptr) {
    DCHECK_EQ(jni_env_ext->GetVm(), java_vm);
    DCHECK_EQ(jni_env_ext->GetSelf(), this);
    tlsPtr_.jni_env = jni_env_ext;
  } else {
    std::string error_msg;
    tlsPtr_.jni_env = JNIEnvExt::Create(this, java_vm, &error_msg);
    if (tlsPtr_.jni_env == nullptr) {
      LOG(ERROR) << "Failed to create JNIEnvExt: " << error_msg;
      return false;
    }
  }
  ScopedTrace trace3("ThreadList::Register");
  thread_list->Register(this);
  return true;
}

Inside Init, the functions InitCpu(), InitTlsEntryPoints(), RemoveSuspendTrigger(), InitCardTable() and InitTid() are called in turn.

Following InitCpu:

void Thread::InitCpu() {
  CHECK_EQ(THREAD_FLAGS_OFFSET, ThreadFlagsOffset<PointerSize::k32>().Int32Value());
  CHECK_EQ(THREAD_CARD_TABLE_OFFSET, CardTableOffset<PointerSize::k32>().Int32Value());
  CHECK_EQ(THREAD_EXCEPTION_OFFSET, ExceptionOffset<PointerSize::k32>().Int32Value());
  CHECK_EQ(THREAD_ID_OFFSET, ThinLockIdOffset<PointerSize::k32>().Int32Value());
}

InitCpu() has a different implementation for each CPU architecture. On ARM it does almost nothing beyond a few simple checks.

Following InitTlsEntryPoints:

void Thread::InitTlsEntryPoints() {
  ScopedTrace trace("InitTlsEntryPoints");
  // Insert a placeholder so we can easily tell if we call an unimplemented entry point.
  uintptr_t* begin = reinterpret_cast<uintptr_t*>(&tlsPtr_.jni_entrypoints);
  uintptr_t* end = reinterpret_cast<uintptr_t*>(
      reinterpret_cast<uint8_t*>(&tlsPtr_.quick_entrypoints) + sizeof(tlsPtr_.quick_entrypoints));
  for (uintptr_t* it = begin; it != end; ++it) {
    *it = reinterpret_cast<uintptr_t>(UnimplementedEntryPoint);
  }
  bool monitor_jni_entry_exit = false;
  PaletteShouldReportJniInvocations(&monitor_jni_entry_exit);
  if (monitor_jni_entry_exit) {
    AtomicSetFlag(ThreadFlag::kMonitorJniEntryExit);
  }
  InitEntryPoints(&tlsPtr_.jni_entrypoints, &tlsPtr_.quick_entrypoints, monitor_jni_entry_exit);
}

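InitTlsEntryPoints initializes the jump tables that an ART thread needs while it runs. It first sets every function entry pointer in tlsPtr_.jni_entrypoints and tlsPtr_.quick_entrypoints to UnimplementedEntryPoint, which makes it easy to detect a call to an entry point that was never implemented. It then calls InitEntryPoints() to fill in the real jump table.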
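
The "fill with a placeholder, then overwrite what is implemented" pattern is not specific to ART. As a loose analogy only (this is not how ART represents its entry points, and every name here is invented), the same idea in Java looks like this:

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

/** A tiny dispatch table whose unset slots fail loudly instead of being null. */
final class EntryPointTableDemo {
    static final int SLOT_ADD = 0;
    static final int SLOT_MUL = 1;
    static final int SLOT_DIV = 2; // deliberately left unimplemented
    static final int SLOT_COUNT = 3;

    /** Placeholder stored in every slot first, like UnimplementedEntryPoint. */
    static final IntBinaryOperator UNIMPLEMENTED = (a, b) -> {
        throw new UnsupportedOperationException("unimplemented entry point");
    };

    static IntBinaryOperator[] initTable() {
        IntBinaryOperator[] table = new IntBinaryOperator[SLOT_COUNT];
        Arrays.fill(table, UNIMPLEMENTED);   // step 1: placeholder everywhere
        table[SLOT_ADD] = (a, b) -> a + b;   // step 2: install the real entries
        table[SLOT_MUL] = (a, b) -> a * b;
        return table;
    }

    /** True if calling the slot does not hit the placeholder. */
    static boolean isImplemented(IntBinaryOperator[] table, int slot) {
        try {
            table[slot].applyAsInt(1, 1);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        IntBinaryOperator[] table = initTable();
        System.out.println("add(2, 3) = " + table[SLOT_ADD].applyAsInt(2, 3));
        System.out.println("div implemented: " + isImplemented(table, SLOT_DIV));
    }
}
```

A forgotten slot then produces a descriptive failure at the call site rather than a jump through a null pointer.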

void InitEntryPoints(JniEntryPoints* jpoints,
                     QuickEntryPoints* qpoints,
                     bool monitor_jni_entry_exit) {
  DefaultInitEntryPoints(jpoints, qpoints, monitor_jni_entry_exit);

  // Cast
  qpoints->SetInstanceofNonTrivial(artInstanceOfFromCode);
  qpoints->SetCheckInstanceOf(art_quick_check_instance_of);

  // Math
  qpoints->SetIdivmod(__aeabi_idivmod);
  qpoints->SetLdiv(__aeabi_ldivmod);
  qpoints->SetLmod(__aeabi_ldivmod);  // result returned in r2:r3
  //..
  qpoints->SetF2l(art_quick_f2l);
  qpoints->SetL2f(art_quick_l2f);

  // More math.
  qpoints->SetCos(cos);
  qpoints->SetSin(sin);
  qpoints->SetAcos(acos);
  //..
  qpoints->SetSinh(sinh);
  qpoints->SetTan(tan);
  qpoints->SetTanh(tanh);

  // Intrinsics
  qpoints->SetIndexOf(art_quick_indexof);
  // The ARM StringCompareTo intrinsic does not call the runtime.
  qpoints->SetStringCompareTo(nullptr);
  qpoints->SetMemcpy(memcpy);

  // Read barrier.
  UpdateReadBarrierEntrypoints(qpoints, /*is_active=*/ false);
  qpoints->SetReadBarrierMarkReg12(nullptr);  // Cannot use register 12 (IP) to pass arguments.
  qpoints->SetReadBarrierMarkReg13(nullptr);  // Cannot use register 13 (SP) to pass arguments.
  qpoints->SetReadBarrierMarkReg14(nullptr);  // Cannot use register 14 (LR) to pass arguments.
  qpoints->SetReadBarrierMarkReg15(nullptr);  // Cannot use register 15 (PC) to pass arguments.
  // ARM has only 16 core registers.
  qpoints->SetReadBarrierMarkReg16(nullptr);
  //..
  qpoints->SetReadBarrierMarkReg29(nullptr);
  qpoints->SetReadBarrierSlow(artReadBarrierSlow);
  qpoints->SetReadBarrierForRootSlow(artReadBarrierForRootSlow);
}

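Like InitCpu(), InitEntryPoints differs from one CPU architecture to another. The code above is the ARM implementation (entrypoints_init_arm.cc in the ART source tree); the part that is common to all architectures is factored out into DefaultInitEntryPoints.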

static void DefaultInitEntryPoints(JniEntryPoints* jpoints,
                                   QuickEntryPoints* qpoints,
                                   bool monitor_jni_entry_exit) {
  // JNI
  jpoints->pDlsymLookup = reinterpret_cast<void*>(art_jni_dlsym_lookup_stub);
  jpoints->pDlsymLookupCritical = reinterpret_cast<void*>(art_jni_dlsym_lookup_critical_stub);

  // Alloc
  ResetQuickAllocEntryPoints(qpoints);

  // Resolution and initialization
  qpoints->SetInitializeStaticStorage(art_quick_initialize_static_storage);
  qpoints->SetResolveTypeAndVerifyAccess(art_quick_resolve_type_and_verify_access);
  qpoints->SetResolveType(art_quick_resolve_type);
  qpoints->SetResolveMethodHandle(art_quick_resolve_method_handle);
  qpoints->SetResolveMethodType(art_quick_resolve_method_type);
  qpoints->SetResolveString(art_quick_resolve_string);

  // Field
  qpoints->SetSet8Instance(art_quick_set8_instance);
  qpoints->SetSet8Static(art_quick_set8_static);
  //...
  qpoints->SetGet64Static(art_quick_get64_static);
  qpoints->SetGetObjStatic(art_quick_get_obj_static);

  // Array
  qpoints->SetAputObject(art_quick_aput_obj);

  // JNI
  qpoints->SetJniMethodStart(art_jni_method_start);
  qpoints->SetJniMethodEnd(art_jni_method_end);
  //..
  qpoints->SetJniMethodEntryHook(art_jni_method_entry_hook);

  // Locks
  if (UNLIKELY(VLOG_IS_ON(systrace_lock_logging))) {
    qpoints->SetJniLockObject(art_jni_lock_object_no_inline);
    qpoints->SetJniUnlockObject(art_jni_unlock_object_no_inline);
    qpoints->SetLockObject(art_quick_lock_object_no_inline);
    qpoints->SetUnlockObject(art_quick_unlock_object_no_inline);
  } else {
    qpoints->SetJniLockObject(art_jni_lock_object);
    qpoints->SetJniUnlockObject(art_jni_unlock_object);
    qpoints->SetLockObject(art_quick_lock_object);
    qpoints->SetUnlockObject(art_quick_unlock_object);
  }

  // Invocation
  qpoints->SetQuickImtConflictTrampoline(art_quick_imt_conflict_trampoline);
  qpoints->SetQuickResolutionTrampoline(art_quick_resolution_trampoline);
  qpoints->SetQuickToInterpreterBridge(art_quick_to_interpreter_bridge);
  //...
  qpoints->SetInvokePolymorphic(art_quick_invoke_polymorphic);
  qpoints->SetInvokeCustom(art_quick_invoke_custom);

  // Thread
  qpoints->SetTestSuspend(art_quick_test_suspend);

  // Throws
  qpoints->SetDeliverException(art_quick_deliver_exception);
  qpoints->SetThrowArrayBounds(art_quick_throw_array_bounds);
  qpoints->SetThrowDivZero(art_quick_throw_div_zero);
  qpoints->SetThrowNullPointer(art_quick_throw_null_pointer_exception);
  qpoints->SetThrowStackOverflow(art_quick_throw_stack_overflow);
  qpoints->SetThrowStringBounds(art_quick_throw_string_bounds);

  // Deoptimize
  qpoints->SetDeoptimize(art_quick_deoptimize_from_compiled_code);

  // StringBuilder append
  qpoints->SetStringBuilderAppend(art_quick_string_builder_append);

  // Tiered JIT support
  qpoints->SetUpdateInlineCache(art_quick_update_inline_cache);
  qpoints->SetCompileOptimized(art_quick_compile_optimized);

  // Tracing hooks
  qpoints->SetMethodEntryHook(art_quick_method_entry_hook);
  qpoints->SetMethodExitHook(art_quick_method_exit_hook);

  if (monitor_jni_entry_exit) {
    qpoints->SetJniMethodStart(art_jni_monitored_method_start);
    qpoints->SetJniMethodEnd(art_jni_monitored_method_end);
  }
}

}  // namespace art

Next, RemoveSuspendTrigger:

  // Remove the suspend trigger for this thread by making the suspend_trigger_ TLS value
  // equal to a valid pointer.
  // TODO: does this need to atomic?  I don't think so.
  void RemoveSuspendTrigger() {
    tlsPtr_.suspend_trigger = reinterpret_cast<uintptr_t*>(&tlsPtr_.suspend_trigger);
  }

RemoveSuspendTrigger is simple: it points tlsPtr_.suspend_trigger at its own address, a valid pointer, which means that no suspend check needs to be triggered.

Following InitCardTable:

void Thread::InitCardTable() {
  tlsPtr_.card_table = Runtime::Current()->GetHeap()->GetCardTable()->GetBiasedBegin();
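}

InitCardTable() is a single line: it takes biased_begin_ from the CardTable of the runtime heap and stores it in tlsPtr_.card_table. While a thread is running, the address of its Thread object is kept in a dedicated thread register, so the fields of tlsPtr_ can be reached at fixed offsets from it; that is why important pointers the thread needs at run time are stored in tlsPtr_.

Following InitTid():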

void Thread::InitTid() {
  tls32_.tid = ::art::GetTid();
}

It assigns the tls32_.tid field. GetTid() calls the platform-specific system function to obtain the operating-system-level thread ID.

uint32_t GetTid() {
#if defined(__APPLE__)
  uint64_t owner;
  CHECK_PTHREAD_CALL(pthread_threadid_np, (nullptr, &owner), __FUNCTION__);  // Requires Mac OS 10.6
  return owner;
#elif defined(__BIONIC__)
  return gettid();
#elif defined(_WIN32)
  return static_cast<pid_t>(::GetCurrentThreadId());
#else
  return syscall(__NR_gettid);
#endif
}
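
Independently of ART internals, the kernel-level IDs that GetTid() returns can be cross-checked from Java by listing /proc/self/task, where each subdirectory is named after one tid of the current process. The sketch below is an assumption-laden helper (ProcTaskLister is a made-up name) that returns an empty map on systems without procfs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

/** Lists the kernel thread IDs of the current process via procfs. */
final class ProcTaskLister {
    /** Maps tid to the kernel-side thread name (comm); empty if /proc is unavailable. */
    static Map<Integer, String> listThreads() {
        Map<Integer, String> threads = new TreeMap<>();
        Path taskDir = Paths.get("/proc/self/task");
        if (!Files.isDirectory(taskDir)) {
            return threads;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(taskDir)) {
            for (Path entry : entries) {
                try {
                    int tid = Integer.parseInt(entry.getFileName().toString());
                    byte[] comm = Files.readAllBytes(entry.resolve("comm"));
                    threads.put(tid, new String(comm, StandardCharsets.UTF_8).trim());
                } catch (IOException | NumberFormatException e) {
                    // The thread may have exited between listing and reading; skip this entry.
                }
            }
        } catch (IOException e) {
            // The directory could not be opened; return whatever was collected (nothing).
        }
        return threads;
    }

    public static void main(String[] args) {
        listThreads().forEach((tid, name) -> System.out.println(tid + "\t" + name));
    }
}
```

This gives the full set of tids but not the mapping from a particular java.lang.Thread to its tid, which is the gap the rest of this article closes.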

After the steps above, the last thing Thread::Init() does is call ThreadList::Register to register the thread in the ThreadList, which manages it from then on.

{
  InitCpu();
  InitTlsEntryPoints();
  RemoveSuspendTrigger();
  InitCardTable();
  InitTid();
  //...
  
  thread_list->Register(this);
  return true;
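}

Running Thread.run in the Java layer

After Init returns, the main remaining work is to run the run() method of the Java Thread object. First the jmethodID of Thread.run() is obtained. Because run() is a virtual method that subclasses can override, it is invoked through InvokeVirtualOrInterfaceWithJValues, which looks up the final implementation for the receiver object and calls it.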
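
Why a virtual invocation is needed becomes clear from the two usual ways of supplying a thread body in Java. In this sketch (RunDispatchDemo is a made-up name) the same start() call ends up in two different run() implementations:

```java
/** Shows that start() reaches whichever run() implementation the receiver object has. */
final class RunDispatchDemo {
    /** The body comes from a subclass that overrides run(). */
    static String viaSubclass() {
        StringBuilder log = new StringBuilder();
        Thread t = new Thread() {
            @Override
            public void run() {
                log.append("subclass");
            }
        };
        startAndJoin(t);
        return log.toString();
    }

    /** The body comes from a Runnable target; Thread.run() itself delegates to it. */
    static String viaRunnable() {
        StringBuilder log = new StringBuilder();
        Thread t = new Thread(() -> log.append("runnable"));
        startAndJoin(t);
        return log.toString();
    }

    private static void startAndJoin(Thread t) {
        t.start();
        try {
            t.join(); // also makes the appended text visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(viaSubclass());
        System.out.println(viaRunnable());
    }
}
```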

void* Thread::CreateCallback(void* arg) {
  Thread* self = reinterpret_cast<Thread*>(arg);
  Runtime* runtime = Runtime::Current();
  {
    MutexLock mu(nullptr, *Locks::runtime_shutdown_lock_);
    CHECK(self->Init(runtime->GetThreadList(), runtime->GetJavaVM(), self->tlsPtr_.tmp_jni_env));
    self->tlsPtr_.tmp_jni_env = nullptr;
    Runtime::Current()->EndThreadBirth();
  }
  {
   //...
    runtime->GetRuntimeCallbacks()->ThreadStart(self);
    //...
    jmethodID mid = WellKnownClasses::java_lang_Thread_run;
    ScopedLocalRef<jobject> ref(soa.Env(), soa.AddLocalReference<jobject>(receiver));
    InvokeVirtualOrInterfaceWithJValues(soa, ref.get(), mid, nullptr);
  }
  //...
}

How to get the tid

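That is a brief walk through how a Java thread runs. Back to the original question: we now know that tid is stored in the tls32_ struct, which sits at the very start of the Thread object. In the source analyzed here, tid is preceded by three 4-byte fields (state_and_flags, suspend_count and thin_lock_thread_id), so it starts at byte offset 12. Therefore, as long as we have the pointer to the native Thread object, we can read the tid at that memory offset.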
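
The offset arithmetic can be checked with a small simulation. The sketch below does not touch a real art::Thread; it lays out a fake 16-byte prefix in a ByteBuffer using the field order from the tls32_ excerpt above, and the class and method names are invented for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Simulates the first four 32-bit fields of tls32_ to verify the tid offset. */
final class Tls32LayoutDemo {
    static final int STATE_AND_FLAGS_OFFSET = 0;
    static final int SUSPEND_COUNT_OFFSET = STATE_AND_FLAGS_OFFSET + Integer.BYTES; // 4
    static final int THIN_LOCK_ID_OFFSET = SUSPEND_COUNT_OFFSET + Integer.BYTES;    // 8
    static final int TID_OFFSET = THIN_LOCK_ID_OFFSET + Integer.BYTES;              // 12

    /** Builds a fake object prefix with the given tid stored at TID_OFFSET. */
    static ByteBuffer fakeThreadPrefix(int tid) {
        ByteBuffer buf = ByteBuffer.allocate(16).order(ByteOrder.nativeOrder());
        buf.putInt(STATE_AND_FLAGS_OFFSET, 0);
        buf.putInt(SUSPEND_COUNT_OFFSET, 0);
        buf.putInt(THIN_LOCK_ID_OFFSET, 7); // arbitrary thin lock id
        buf.putInt(TID_OFFSET, tid);
        return buf;
    }

    /** Reads the tid the same way the native code would: base address plus 12 bytes. */
    static int readTid(ByteBuffer threadPrefix) {
        return threadPrefix.getInt(TID_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println("tid offset = " + TID_OFFSET);
        System.out.println("tid read   = " + readTid(fakeThreadPrefix(4321)));
    }
}
```

The offset is only as stable as the field order in tls32_, so it is worth re-checking against the ART source of the Android version you target.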

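From the Thread_nativeCreate flow we already know that the VM writes the address of the native Thread object it created into the nativePeer field of the Java layer. So all we need to do is read nativePeer through reflection, and from there we can get the tid.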

    public static final long getNativePeer(Thread t)throws IllegalAccessException{
        try {
            Field nativePeerField = Thread.class.getDeclaredField("nativePeer");
            nativePeerField.setAccessible(true);
            Long nativePeer = (Long) nativePeerField.get(t);
            return nativePeer;
        } catch (NoSuchFieldException e) {
            throw new IllegalAccessException("failed to get nativePeer value");
        } catch (IllegalAccessException e) {
            throw e;
        }
    }

Getting the tid of a Java Thread in the native layer

Since this requires raw memory access, we can write a JNI function that takes the nativePeer value and returns the tid.

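public static native int getTid(long nativePeer);

#include <jni.h>
#include <cstdint>
extern "C" JNIEXPORT jint JNICALL
Java_com_demo_getTid(JNIEnv *env, jclass clazz, jlong native_peer) {
    if (native_peer == 0) return -1;  // nativePeer is 0 when the thread is not started or already destroyed
    // tid follows three 4-byte fields: state_and_flags, suspend_count, thin_lock_thread_id
    auto address = static_cast<uintptr_t>(native_peer) + sizeof(uint32_t) + sizeof(int) + sizeof(uint32_t);
    return static_cast<jint>(*reinterpret_cast<uint32_t *>(address));
}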

Getting the tid of a Java Thread in the Java layer

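Alternatively, the memory can be read directly with the Unsafe class to obtain the int value. Android's public SDK does not expose Unsafe, but you can compile a wrapper class that calls Unsafe ahead of time in a plain Java environment and then pull it in as a binary dependency. See the open-source library iamironz/unsafe for reference.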

public static final int getTid(Thread t) throws IllegalAccessException {
    UnsafeAndroid unsafeAndroid = new UnsafeAndroid();
    long nativePeer = ThreadUtil.getNativePeer(t);
    int tid = unsafeAndroid.getInt(nativePeer+12);
    return tid;
}
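
With a tid in hand, the monitoring described in the Background section comes down to sampling /proc/{pid}/task/{tid}/stat twice and doing a little arithmetic. The sketch below shows only that arithmetic; ThreadCpuUsage and its methods are invented names, and the clock-tick rate is passed in as a parameter because it has to be queried from the system (sysconf(_SC_CLK_TCK); 100 is a common value on Linux, which is an assumption here rather than a guarantee).

```java
/** Turns two samples of utime + stime into a CPU usage percentage for one thread. */
final class ThreadCpuUsage {
    /** Builds the path of the per-thread stat file. */
    static String statPath(int pid, int tid) {
        return "/proc/" + pid + "/task/" + tid + "/stat";
    }

    /**
     * @param ticksBefore    utime + stime at the first sample, in clock ticks
     * @param ticksAfter     utime + stime at the second sample, in clock ticks
     * @param wallMillis     wall-clock time between the two samples, in milliseconds
     * @param ticksPerSecond clock ticks per second reported by the system
     * @return percentage of one CPU core used by the thread during the interval
     */
    static double cpuPercent(long ticksBefore, long ticksAfter, long wallMillis, long ticksPerSecond) {
        if (wallMillis <= 0 || ticksPerSecond <= 0) {
            throw new IllegalArgumentException("interval and tick rate must be positive");
        }
        double cpuMillis = (ticksAfter - ticksBefore) * 1000.0 / ticksPerSecond;
        return 100.0 * cpuMillis / wallMillis;
    }

    public static void main(String[] args) {
        // 50 ticks at 100 ticks/s is 500 ms of CPU time within a 1000 ms window.
        System.out.println(statPath(100, 200));
        System.out.println(cpuPercent(300, 350, 1000, 100) + " %");
    }
}
```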

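Summary

To wrap up, here is the creation flow of a Java thread:

  • Java thread creation flow
    • Java-layer initialization: set the thread name, Java-level thread ID, thread group, priority, stack size and other attributes
    • Call nativeCreate to enter the native-layer creation flow
      • Create the art::Thread object
        • Call new Thread(is_daemon) to create the art::Thread object
        • Write the address of the art::Thread object back to the nativePeer field of the corresponding Java Thread object
        • Create a JNIEnvExt for the new thread on the current thread; if that fails, throw an OOM exception
      • Create the system thread
        • Call pthread_create to create the real operating-system thread, with Thread::CreateCallback as its start routine
        • If pthread_create fails, release resources and throw an OOM exception
      • Once the system thread is created, the flow on the current thread ends and Thread::CreateCallback runs on the new thread
      • The thread body, Thread::CreateCallback
        • Thread::Init() initializes the Thread object
        • Set the thread name
        • Set the thread priority
        • Look up run() on the Java Thread object and invoke it
    • Finally, the run function of the Java Thread object executes on the new thread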
