2020-10-09 SurfaceFlinger hang analysis:

1: When Amap (高德地图) is projected to the screen over CarPlay, the whole system hangs.

2: Log
"Binder:880_C" prio=5 tid=103 Native

  | group="main" sCount=1 dsCount=0 flags=1 obj=0x139c2cc0 self=0x7ce0ec5800

  | sysTid=2996 nice=-8 cgrp=default sched=0/0 handle=0x7cbc06d4f0

  | state=S schedstat=( 10540078349 11945271082 40545 ) utm=773 stm=280 core=0 HZ=100

  | stack=0x7cbbf73000-0x7cbbf75000 stackSize=1005KB

  | held mutexes=

  kernel: (couldn't read /proc/self/task/2996/stack)

  native: #00 pc 000000000006a824  /system/lib64/libc.so (__ioctl+4)

  native: #01 pc 00000000000240a0  /system/lib64/libc.so (ioctl+136)

  native: #02 pc 0000000000054b50  /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+260)

  native: #03 pc 00000000000558dc  /system/lib64/libbinder.so (android::IPCThreadState::waitForResponse(android::Parcel*, int*)+360)

  native: #04 pc 0000000000055614  /system/lib64/libbinder.so (android::IPCThreadState::transact(int, unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+224)

  native: #05 pc 000000000004c318  /system/lib64/libbinder.so (android::BpBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+72)

  native: #06 pc 000000000007d428  /system/lib64/libgui.so (???)

  native: #07 pc 000000000008cd90  /system/lib64/libgui.so (android::SurfaceComposerClient::createSurface(android::String8 const&, unsigned int, unsigned int, int, unsigned int, android::SurfaceControl*, unsigned int, unsigned int)+232)

  native: #08 pc 00000000000f89b8  /system/lib64/libandroid_runtime.so (???)

  native: #09 pc 0000000000d2ee38  /system/framework/arm64/boot-framework.oat (Java_android_view_SurfaceControl_nativeCreate__Landroid_view_SurfaceSession_2Ljava_lang_String_2IIIIJII+264)

  at android.view.SurfaceControl.nativeCreate(Native method)

  at android.view.SurfaceControl.<init>(SurfaceControl.java:341)

  at android.view.SurfaceControl.<init>(SurfaceControl.java:312)

  at com.android.server.wm.SurfaceControlWithBackground.<init>(SurfaceControlWithBackground.java:72)

  at com.android.server.wm.WindowSurfaceController.<init>(WindowSurfaceController.java:101)

  at com.android.server.wm.WindowStateAnimator.createSurfaceLocked(WindowStateAnimator.java:698)

  at com.android.server.wm.WindowManagerService.createSurfaceControl(WindowManagerService.java:2355)

  at com.android.server.wm.WindowManagerService.relayoutWindow(WindowManagerService.java:2100)

  - locked <0x027b8624> (a com.android.server.wm.WindowHashMap)

  at com.android.server.wm.Session.relayout(Session.java:239)

  at android.view.IWindowSession$Stub.onTransact(IWindowSession.java:286)

  at com.android.server.wm.Session.onTransact(Session.java:163)

  at android.os.Binder.execTransact(Binder.java:697)


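The trace shows that system_server, after acquiring the WindowHashMap lock, makes a remote Binder call to surfaceflinger to create a new Layer, and that call never returns. Because the lock is still held, every other window operation that needs it is blocked as well, which is why the whole system appears frozen.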
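The effect of holding a lock across a blocking remote call can be reproduced in a small model. The sketch below is plain Python, not Android code: `window_map_lock`, `create_surface_remote`, `relayout_window` and `other_window_operation` are hypothetical stand-ins for the WindowHashMap lock, the createSurface Binder call, relayoutWindow, and any other caller that needs the same lock.

```python
import threading

# Hypothetical stand-ins for illustration; none of these are Android APIs.
window_map_lock = threading.Lock()   # models the WindowHashMap monitor
remote_reply = threading.Event()     # set when the "remote" side answers


def create_surface_remote(reply_timeout=None):
    """Model a synchronous remote call: block until the remote side replies."""
    return remote_reply.wait(reply_timeout)


def relayout_window(reply_timeout=None):
    """Hold the lock for the whole duration of the remote call."""
    with window_map_lock:
        return create_surface_remote(reply_timeout)


def other_window_operation(lock_timeout):
    """Any other operation that needs the same lock."""
    if not window_map_lock.acquire(timeout=lock_timeout):
        return "blocked"
    window_map_lock.release()
    return "ok"
```

While a thread sits inside `relayout_window` waiting for a reply that never comes, `other_window_operation` keeps returning `"blocked"`. It only succeeds once `remote_reply` is set, which corresponds to surfaceflinger finally answering.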

3: Now look at the threads in surfaceflinger:

"Binder:399_4" sysTid=1067

  #00 pc 000000000001d52c  /system/lib64/libc.so (syscall+28)

  #01 pc 00000000000676b8  /system/lib64/libc.so (pthread_cond_wait+96)

  #02 pc 000000000005ee48  /system/lib64/libgui.so (android::BufferQueueProducer::waitForFreeSlotThenRelock(android::BufferQueueProducer::FreeSlotCaller, int*) const+708)

  #03 pc 000000000005f160  /system/lib64/libgui.so (android::BufferQueueProducer::dequeueBuffer(int*, android::sp*, unsigned int, unsigned int, int, unsigned long, unsigned long*, android::FrameEventHistoryDelta*)+424)

  #04 pc 0000000000074208  /system/lib64/libgui.so (android::BnGraphicBufferProducer::onTransact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+448)

  #05 pc 000000000004af20  /system/lib64/libbinder.so (android::BBinder::transact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int)+136)

  #06 pc 0000000000055004  /system/lib64/libbinder.so (android::IPCThreadState::executeCommand(int)+528)

  #07 pc 0000000000054d44  /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+156)

  #08 pc 0000000000055394  /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+60)

  #09 pc 000000000007728c  /system/lib64/libbinder.so

  #10 pc 0000000000011478  /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+280)

  #11 pc 000000000006803c  /system/lib64/libc.so (__pthread_start(void*)+36)

  #12 pc 000000000001edfc  /system/lib64/libc.so (__start_thread+68)

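This thread shows that the BufferQueue has no free buffer slot left (the search in waitForFreeSlotThenRelock keeps ending with INVALID_BUFFER_SLOT), so dequeueBuffer keeps waiting for HWC to finish composing the frame and release a buffer. (The zhijian CarPlay app does not set mDequeueTimeout, and mMaxDequeuedBufferCount=16.)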
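Why a missing dequeue timeout turns "no free slot" into an indefinite wait can be illustrated with a toy queue. This is a minimal Python sketch, not the Android BufferQueueProducer: the class `TinyBufferQueue` and its methods are hypothetical, and returning `INVALID_BUFFER_SLOT` on timeout is a simplification chosen for the model.

```python
import threading

INVALID_BUFFER_SLOT = -1  # reuses the constant name from the text above


class TinyBufferQueue:
    """Toy model of a producer-side buffer queue (hypothetical, not Android code)."""

    def __init__(self, max_dequeued_buffer_count):
        self._free_slots = list(range(max_dequeued_buffer_count))
        self._cond = threading.Condition()

    def dequeue_buffer(self, dequeue_timeout=None):
        """Return a free slot, waiting until one is released.

        dequeue_timeout=None waits forever, modelling an unset timeout.
        With a timeout in seconds, INVALID_BUFFER_SLOT is returned when
        no slot becomes free in time.
        """
        with self._cond:
            got_slot = self._cond.wait_for(
                lambda: bool(self._free_slots), dequeue_timeout
            )
            if not got_slot:
                return INVALID_BUFFER_SLOT
            return self._free_slots.pop(0)

    def release_buffer(self, slot):
        """Called by the consumer once it is done with the slot."""
        with self._cond:
            self._free_slots.append(slot)
            self._cond.notify()
```

With 16 slots, the 17th `dequeue_buffer()` call without a timeout blocks until someone calls `release_buffer`. In the situation described above nobody does, because the consumer side is itself waiting on HWC. Passing a timeout lets the caller get control back and report an error instead.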


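4: Other SurfaceFlinger threads are also waiting for an HWC binder call to complete. The main thread is blocked in a hwbinder executeCommands transaction: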

Cmd line: /system/bin/surfaceflinger

"surfaceflinger" sysTid=399

  #00 pc 0000000000015208  /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::talkWithDriver(bool)+260)

  #01 pc 00000000000160ec  /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::waitForResponse(android::hardware::Parcel*, int*)+360)

  #02 pc 0000000000015e18  /system/lib64/libhwbinder.so (android::hardware::IPCThreadState::transact(int, unsigned int, android::hardware::Parcel const&, android::hardware::Parcel*, unsigned int)+224)

  #03 pc 0000000000012858  /system/lib64/libhwbinder.so (android::hardware::BpHwBinder::transact(unsigned int, android::hardware::Parcel const&, android::hardware::Parcel*, unsigned int, std::__1::function)+72)

  #04 pc 0000000000038da4  /system/lib64/android.hardware.graphics.composer@2.1.so (android::hardware::graphics::composer::V2_1::BpHwComposerClient::_hidl_executeCommands(android::hardware::IInterface*, android::hardware::details::HidlInstrumentor*, unsigned int, android::hardware::hidl_vec const&, std::__1::function const&)>)+372)

  #05 pc 0000000000039c98  /system/lib64/android.hardware.graphics.composer@2.1.so (android::hardware::graphics::composer::V2_1::BpHwComposerClient::executeCommands(unsigned int, android::hardware::hidl_vec const&, std::__1::function const&)>)+148)

  #06 pc 00000000000895ac  /system/lib64/libsurfaceflinger.so

  #07 pc 000000000008a448  /system/lib64/libsurfaceflinger.so

  #08 pc 000000000009367c  /system/lib64/libsurfaceflinger.so

  #09 pc 00000000000bc854  /system/lib64/libsurfaceflinger.so

  #10 pc 000000000006a55c  /system/lib64/libsurfaceflinger.so

  #11 pc 00000000000a5f90  /system/lib64/libsurfaceflinger.so

  #12 pc 00000000000a4e8c  /system/lib64/libsurfaceflinger.so

  #13 pc 00000000000a4b30  /system/lib64/libsurfaceflinger.so

  #14 pc 0000000000015e04  /system/lib64/libutils.so (android::Looper::pollInner(int)+332)

  #15 pc 0000000000015c28  /system/lib64/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+108)

  #16 pc 00000000000818ec  /system/lib64/libsurfaceflinger.so

  #17 pc 00000000000a34e0  /system/lib64/libsurfaceflinger.so (android::SurfaceFlinger::run()+20)

  #18 pc 0000000000002e80  /system/bin/surfaceflinger

  #19 pc 00000000000a1fb8  /system/lib64/libc.so (__libc_init+88)

  #20 pc 0000000000002978  /system/bin/surfaceflinger


5: No thread dump was captured for HWC, so the analysis has to rely on the GED log:

09-10 13:01:41.703514 387 6848 D SkipV(3: do skip vali

09-10 13:01:41.703621  387  6848 D [HWC] (0) VAL list=2/max=12/fbt=0[-1,-1:-1,-1](OVL)/ui=1/mm=1/ovlp=4259840/fi=1/mir=-1 

09-10 13:01:41.703807  387  6848 D (0) SET job=0/max=12/fbt=0(OVL)/ui=1/mm=1/mir=-1/ult=0/flush=0/black=0/PF(fd=136, idx=72830, curr_pf_fd=136,132) 

09-10 13:01:41.703976  387  6848 D [TOL] [mm_ionImport] ion_fd(110) -> share_fd(142) 

09-10 13:01:41.703993  387  6848 D [BLT_ASYNC]SET(0,0)/rel=139/acq=-1/handle=0x7f84e9cee0/job=61360 

09-10 13:01:41.704117  387  6848 D [HWC] (0) (L3353) set s:PD=>displayPresent s:PD=>(L3268) set s:CSV=>(L3283) set s:VD=>(L1436) set s:P=> 

09-10 13:01:41.737992  387  6848 D [DEV] (0) DispSessionMode (DL) id:10000 

09-10 13:01:41.738001  387  6848 D [HWC] setBufFromSf 0:134,9d0a 0:12,9c70 

09-10 13:01:41.738066  387  6848 D SkipV(3: do skip vali 

09-10 13:01:41.738116  387  6848 D [HWC] (0) VAL list=2/max=12/fbt=0[-1,-1:-1,-1](OVL)/ui=1/mm=1/ovlp=4259840/fi=1/mir=-1 

09-10 13:01:41.738195  387  6848 D (0) SET job=0/max=12/fbt=0(OVL)/ui=1/mm=1/mir=-1/ult=0/flush=0/black=0/PF(fd=140, idx=72831, curr_pf_fd=140,136) 

09-10 13:01:41.738273  387  6848 D [TOL] [mm_ionImport] ion_fd(119) -> share_fd(146) 

09-10 13:01:41.738279  387  6848 D [BLT_ASYNC]SET(0,0)/rel=143/acq=-1/handle=0x7f84e9d0a0/job=61361 

09-10 13:01:41.738498  387  6848 W [JOB] (0) Jobs have piled up, wait for clearing!! 

09-10 13:01:41.738517  387  6848 D [WKR] Waiting for Dispatcher_0... 

09-10 13:01:41.838904  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=1/val=6) 

09-10 13:01:41.939377  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=3/val=6) 

09-10 13:01:42.039855  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=5/val=6) 

09-10 13:01:42.140239  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=7/val=6) 

09-10 13:01:42.240897  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=9/val=6) 

09-10 13:01:42.341493  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=11/val=6) 

09-10 13:01:42.441838  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=13/val=6) 

09-10 13:01:42.542413  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=15/val=6) 

09-10 13:01:42.642909  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=17/val=6) 

09-10 13:01:42.743223  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=19/val=6) 

09-10 13:01:42.793564  387  6848 E [WKR] Timed out waiting for Dispatcher_0 (cnt=20/val=6) 

09-10 13:01:42.843810  387  6848 W [WKR] Timed out waiting for Dispatcher_0 (cnt=21/val=6)

What is the Dispatcher thread?

HWC posts the actions it needs to perform to the Dispatcher thread in the form of jobs, and the Dispatcher thread is what finally executes them.
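The job-posting pattern described here, and what a stuck job does to it, can be sketched as follows. This is a generic Python model under stated assumptions, not the vendor HWC implementation: the class `Dispatcher` and its methods `post` and `wait_until_pending_at_most` are hypothetical names.

```python
import collections
import threading


class Dispatcher:
    """Toy dispatcher: runs posted jobs one at a time on its own thread."""

    def __init__(self):
        self._jobs = collections.deque()
        self._cond = threading.Condition()
        self._pending = 0  # queued jobs plus the job currently running
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, job):
        """Queue a callable for execution on the dispatcher thread."""
        with self._cond:
            self._jobs.append(job)
            self._pending += 1
            self._cond.notify_all()

    def wait_until_pending_at_most(self, limit, timeout):
        """Return True if the backlog drained to `limit` within `timeout` seconds."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pending <= limit, timeout)

    def _run(self):
        while True:
            with self._cond:
                self._cond.wait_for(lambda: bool(self._jobs))
                job = self._jobs.popleft()
            job()  # runs outside the lock; a job that never returns stalls this thread
            with self._cond:
                self._pending -= 1
                self._cond.notify_all()
```

If one job blocks forever, every job posted after it stays queued, and a caller polling `wait_until_pending_at_most` with a short timeout sees one timeout after another. That is the same shape as the repeated "Timed out waiting for Dispatcher_0" warnings in the log.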

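Judging from the log, the Dispatcher thread is stuck: jobs have piled up, and the worker keeps timing out while waiting for Dispatcher_0.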
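To check how long the stall lasted in a longer capture, the timeout lines can be summarised with a small script. This is a sketch that assumes the line layout seen in the excerpt above (timestamp, pid, tid, level, `[WKR] Timed out waiting for <name> (cnt=N/val=M)`) and that the capture does not cross midnight. The function name `summarize_dispatcher_timeouts` is hypothetical, and the meaning of `cnt` and `val` is not documented in this article, so they are only reported, not interpreted.

```python
import re

# Pattern derived from the log excerpt above; other builds may format differently.
TIMEOUT_RE = re.compile(
    r"(\d\d):(\d\d):(\d\d\.\d+)\s+\d+\s+\d+\s+[WE]\s+\[WKR\] "
    r"Timed out waiting for (\S+) \(cnt=(\d+)/val=(\d+)\)"
)


def summarize_dispatcher_timeouts(lines):
    """Summarise the timeout lines, or return None if there are none.

    The result is a dict with the waited-on thread name, the first and
    last cnt values, and the seconds between the first and last timeout.
    """
    hits = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            seconds = int(m.group(1)) * 3600 + int(m.group(2)) * 60 + float(m.group(3))
            hits.append((seconds, m.group(4), int(m.group(5))))
    if not hits:
        return None
    first, last = hits[0], hits[-1]
    return {
        "thread": first[1],
        "first_cnt": first[2],
        "last_cnt": last[2],
        "stalled_seconds": last[0] - first[0],
    }
```

Feeding it the lines of a logcat file, for example `summarize_dispatcher_timeouts(open(path, encoding="utf-8"))` where `path` is wherever the log was saved, gives a quick measure of how long the worker was waiting on the Dispatcher.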
