Raw data captured at the moment the system hangs, reboots, or hits a similar failure
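The Mobilelog service only starts during the Android system init stage. From kernel startup up to that point the kernel has already been writing into the log ring buffer, so when the log volume is high the ring buffer gets overwritten.
As a result, the kernel_log.boot captured by default does not start at 0 s. During development debugging, the only way to get logs from 0 s is to capture the UART log, which seriously hurts debugging efficiency.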
alps/kernel-3.18/init/Kconfig
config LOG_BUF_SHIFT
default 17 ---> 21 (2^21 bytes = 2 MB buffer)
alps/system/core/liblog/include/private/android_logger.h
#define LOG_BUFFER_SIZE (2048 * 1024) /* keep log and logd consistent */
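The arithmetic behind the two settings can be checked with a short sketch. The helper name `log_buf_bytes` is made up for illustration; the only facts used are that the kernel buffer is 2^LOG_BUF_SHIFT bytes and that LOG_BUFFER_SIZE is written as 2048 * 1024.

```python
def log_buf_bytes(shift: int) -> int:
    """Size in bytes of a kernel log buffer configured with LOG_BUF_SHIFT=shift."""
    return 1 << shift


old_size = log_buf_bytes(17)   # original default
new_size = log_buf_bytes(21)   # value after the change
liblog_size = 2048 * 1024      # LOG_BUFFER_SIZE from android_logger.h

print(old_size // 1024, "KiB ->", new_size // 1024, "KiB")
print("kernel buffer equals LOG_BUFFER_SIZE:", new_size == liblog_size)
```

Going from shift 17 to 21 multiplies the buffer by 16, from 128 KiB to 2 MiB, which is the same size as the LOG_BUFFER_SIZE value above.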
logcat is a command-line tool in Android that can be used to retrieve a program's log output
Common logging methods include:
Method | Description |
---|---|
v(String,String) (verbose) | Show all messages |
d(String,String) (debug) | Show debug messages |
i(String,String) (information) | Show general information |
w(String,String) (warning) | Show warnings |
e(String,String) (error) | Show errors |
For example:
// Write a log entry during development
Log.i("MyActivity", "MyClass.getView() - get item number " + position);
// Fetch logs over adb
adb logcat
The log lines printed by adb logcat have the following format:
I/ActivityManager( 1754): Waited long enough for: ServiceRecord{2b24178c u0 com.google.android.gms/.checkin.CheckinService}
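A minimal sketch of splitting such a line into its fields, assuming the `level/tag( pid): message` layout of the sample above. The names `BRIEF_RE`, `LogLine` and `parse_brief` are illustrative and not part of any Android tool; other logcat output formats would need a different pattern.

```python
import re
from typing import NamedTuple, Optional

# level / tag ( pid ): message, as in the sample line above
BRIEF_RE = re.compile(r"^([VDIWEF])/(.+?)\(\s*(\d+)\): (.*)$")


class LogLine(NamedTuple):
    level: str
    tag: str
    pid: int
    message: str


def parse_brief(line: str) -> Optional[LogLine]:
    """Parse one log line; return None if it does not match the layout."""
    m = BRIEF_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    level, tag, pid, message = m.groups()
    return LogLine(level, tag.strip(), int(pid), message)


sample = ("I/ActivityManager( 1754): Waited long enough for: "
          "ServiceRecord{2b24178c u0 com.google.android.gms/.checkin.CheckinService}")
parsed = parse_brief(sample)
print(parsed.level, parsed.tag, parsed.pid)
```

The leading letter is the log level (I for information here), followed by the tag and the process ID in parentheses.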
adb logcat -b radio
adb logcat -b system
adb logcat -b events
adb logcat -b main
/data/aee_exp
/data/vendor/mtklog/aee_exp
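alps/bootable/bootloader/lk/app/mt_boot/mt_boot.c
In this file, change every printk.disable_uart=1
to printk.disable_uart=0
, then rebuild lk and download lk. Once the adbd process is up, you can use GAT to capture the boot log (power the device off before starting the recording).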
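The substitution itself can be sketched as a small text transform. This is only an illustration: `enable_uart_log` is a hypothetical helper, and the sample string is invented rather than copied from mt_boot.c.

```python
import re
from typing import Tuple


def enable_uart_log(source: str) -> Tuple[str, int]:
    """Replace every printk.disable_uart=1 with =0.

    Returns the patched text and the number of replacements made.
    """
    return re.subn(r"printk\.disable_uart=1\b", "printk.disable_uart=0", source)


# Invented sample text, standing in for the contents of the source file.
sample = 'append(" printk.disable_uart=1");\nappend(" printk.disable_uart=1");\n'
patched, count = enable_uart_log(sample)
print(count, "occurrence(s) changed")
```

Checking the returned count against the number of occurrences you expected is a cheap way to confirm nothing was missed before rebuilding lk.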
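If mtklogger is usable, you can record the boot log by setting mobile log to start automatically at boot.
With recording stopped, go to mtklogger -> settings -> mobile log -> start automatically
If the touch panel (TP) cannot be used, refer to FAQ06939 to control mtklogger recording with adb commands.
Build an eng version of the corresponding software with the following changes: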
alps/system/core/rootdir/init.rc
on property:ro.debuggable=1
# Give writes to anyone for the trace folder on debug builds.
# The folder is used to store method traces.
chmod 0773 /data/misc/trace
start console
//add begin
on property:ro.debuggable=0
# Give writes to anyone for the trace folder on debug builds.
# The folder is used to store method traces.
chmod 0773 /data/misc/trace
start console
setprop persist.sys.usb.config mass_storage,adb
//add end
alps/kernel-3.18/drivers/misc/mediatek/mtprof/bootprof.c
#ifdef CONFIG_MT_PRINTK_UART_CONSOLE
//mt_disable_uart();
#endif
alps/build/make/core/main.mk
ifeq (true,$(strip $(enable_target_debugging)))
# Target is more debuggable and adbd is on by default
ADDITIONAL_DEFAULT_PROPERTIES += ro.debuggable=1
# Enable Dalvik lock contention logging.
ADDITIONAL_BUILD_PROPERTIES += dalvik.vm.lockprof.threshold=500
# Include the debugging/testing OTA keys in this build.
INCLUDE_TEST_OTA_KEYS := true
else # !enable_target_debugging
# Target is less debuggable and adbd is off by default
ADDITIONAL_DEFAULT_PROPERTIES += ro.debuggable=1
endif # !enable_target_debugging
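After the build, flash the eng version's lk + boot onto the user version, then capture the UART log or the upper-layer logs.
To capture logs from before the setup wizard, when the system is not yet fully up, solder on UART wires and enter adb logcat & in the UART console
to route the upper-layer log into the UART log.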
AEE (Android Exception Engine)
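is an exception-capture and debug-information generation mechanism on Android.
When the phone hits an error (abnormal reboot or hang), it generates a db file (an encrypted binary file)
that stores all memory information from the moment of the exception. By debugging and simulating with this information, the cause of the exception can be traced.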
File | Description |
---|---|
__exp_main.txt | Key information such as the exception type and call stack |
_exp_detail.txt | Detailed exception information |
SYS_ANDROID_LOG | android buffer log(logcat -d -v time *:v) |
SYS_KERNEL_LOG | kernel log |
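SYS_LAST_KMSG | kernel log from before the last reboot |
SYS_MINI_RDUMP | Similar to a core dump; can be debugged with gdb/trace32 |
SYS_WDT_LOG | Watchdog reset information |
SYS_REBOOT_REASON | Information recorded by hardware at reboot |
SYS_VERSION_INFO | Kernel version, used to match against vmlinux; only a matching vmlinux can be used to analyze this exception |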
SYS_ANDROID_EVENT_LOG | android event log(logcat -b events -v time -d *:v) |
SYS_ANDROID_RADIO_LOG | android buffer log(logcat -b radio -v time -d *:v) |
PROCESS_COREDUMP | native program core dump |
SYS_PROPERTIES | system properties |
SWT_JBT_TRACES | /data/anr/. |
ZZ_INTERNAL | Basic exception information |
SYS_CPU_INFO | CPU information (top -n 1 -d 1 -m 30 -t) |
SYS_MEMORY_INFO | memory information (/proc/meminfo) |
struct last_reboot_reason
{
uint32_t fiq_step;
uint32_t exp_type; /* 0xaeedeadX: X=1 (HWT), X=2 (KE), X=3 (nested panic) */
uint32_t reboot_mode;
uint32_t last_irq_enter[NR_CPUS];
uint64_t jiffies_last_irq_enter[NR_CPUS];
uint32_t last_irq_exit[NR_CPUS];
uint64_t jiffies_last_irq_exit[NR_CPUS];
uint64_t jiffies_last_sched[NR_CPUS];
char last_sched_comm[NR_CPUS][TASK_COMM_LEN];
uint8_t hotplug_data1[NR_CPUS], hotplug_data2;
uint64_t hotplug_data3;
uint32_t mcdi_wfi, mcdi_r15, deepidle_data, sodi_data, spm_suspend_data;
uint64_t cpu_dormant[NR_CPUS];
uint32_t clk_data[8], suspend_debug_flag;
uint8_t cpu_dvfs_vproc_big, cpu_dvfs_vproc_little, cpu_dvfs_oppidx, cpu_dvfs_status;
uint8_t gpu_dvfs_vgpu, gpu_dvfs_oppidx, gpu_dvfs_status;
uint64_t ptp_cpu_big_volt, ptp_cpu_little_volt, ptp_gpu_volt, ptp_temp;
uint8_t ptp_status;
uint8_t thermal_temp1, thermal_temp2, thermal_temp3, thermal_temp4, thermal_temp5;
uint8_t thermal_status;
void *kparams;
};
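The exp_type encoding noted in the struct comment (0xaeedeadX with X = 1, 2, 3) can be decoded as in the sketch below. The function name, the constant names and the "no AEE exception marker" string are assumptions made for illustration; only the magic prefix and the three X values come from the comment above.

```python
AEE_MAGIC = 0xAEEDEAD0                      # upper 28 bits of exp_type
EXP_TYPES = {1: "HWT", 2: "KE", 3: "nested panic"}


def decode_exp_type(value: int) -> str:
    """Map a raw exp_type word to the label given in the struct comment."""
    if (value & 0xFFFFFFF0) != AEE_MAGIC:
        return "no AEE exception marker"
    x = value & 0xF
    return EXP_TYPES.get(x, "unknown (X=%d)" % x)


for raw in (0xAEEDEAD1, 0xAEEDEAD2, 0xAEEDEAD3, 0x00000000):
    print(hex(raw), "->", decode_exp_type(raw))
```

Values whose low nibble is not 1, 2 or 3 are reported as unknown rather than guessed, since the comment defines no other codes.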
A: MTKlogger is generally sufficient
A:
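A: Connect adb and send CTP touch points and gestures over adb to operate the phone
A: Use the GAT tool to capture the phone's internal frame buffer in real time, mirror it on the PC, and operate the phone with adb commands
A: Use adb logcat to filter logs in real time by keyword and by level (including kernel log)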
adb shell cat /proc/bootprof, or check the bootprof file in mtklog
A system mount failure prevents services from starting. Read back the system partition and compare it to see whether any files are corrupted.
[138:kworker/u16:2]device-mapper: verity: 179:30: metadata block 716579 is corrupted
[246:init]JBD2: IO error reading journal superblock
[246:init]EXT4-fs (dm-0): error loading journal
[246:init]fs_mgr: __mount(source=/dev/block/dm-0,target=/system,type=ext4)=-1 <<=== filesystem mount failed
[246:init]EXT4-fs (mmcblk0p31): VFS: Can't find ext4 filesystem
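Boot failures come up regularly. Each one is low-probability and hard to reproduce, and the causes span both software and hardware, which makes them difficult to pin down and track.
Post-mortem analysis:
The causes include actual hardware damage, a corrupted system image, and users pulling the battery and corrupting core system files, among others. Some of the boot failures are caused by file corruption resulting from improper operation.
PS: The production line also reports a small rate of boot failures.
Download integrity check and customized boot check
Check whether the kernel log contains eMMC I/O error messages
If the problem is limited to a single unit, check the eMMC power supply or run a cross-test with replacement parts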
[ 5.030802] <0>.(0)[165:mmcqd/0]mmcblk0: error -110 transferring data, sector 5448262, nr 442, cmd response 0x900, card status 0x0
[ 5.032358] <0>.(0)[165:mmcqd/0]blk_update_request: I/O error, dev mmcblk0, sector 5448262
[ 5.130190] <0>.(0)[179:init]EXT4-fs (dm-0): unable to read superblock
[ 5.131325] <0>.(0)[179:init]fs_mgr: __mount(source=/dev/block/dm-0,target=/system,type=ext4)=-1 <<=== filesystem mount failed
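Checking a kernel log for these messages can be automated with a small scanner. The sketch below is illustrative only: the signature list is derived from the sample lines above and is not exhaustive, and the names `SIGNATURES` and `scan` are made up.

```python
import re
from typing import Iterable, List, Tuple

# Patterns taken from the sample log above; extend as needed.
SIGNATURES = [
    ("emmc_io_error", re.compile(
        r"mmcblk\d+: error -?\d+ transferring data|blk_update_request: I/O error")),
    ("superblock_unreadable", re.compile(
        r"EXT4-fs \([^)]+\): unable to read superblock")),
    ("mount_failed", re.compile(r"fs_mgr: __mount\(.*\)=-1")),
]


def scan(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (signature_name, line) for every line matching a known signature."""
    hits = []
    for line in lines:
        for name, pattern in SIGNATURES:
            if pattern.search(line):
                hits.append((name, line.strip()))
                break
    return hits


sample_log = [
    "[ 5.030802] <0>.(0)[165:mmcqd/0]mmcblk0: error -110 transferring data, sector 5448262, nr 442, cmd response 0x900, card status 0x0",
    "[ 5.032358] <0>.(0)[165:mmcqd/0]blk_update_request: I/O error, dev mmcblk0, sector 5448262",
    "[ 5.130190] <0>.(0)[179:init]EXT4-fs (dm-0): unable to read superblock",
    "[ 5.131325] <0>.(0)[179:init]fs_mgr: __mount(source=/dev/block/dm-0,target=/system,type=ext4)=-1",
]
for name, line in scan(sample_log):
    print(name, "|", line)
```

Reading the hits in time order shows the chain in the sample: the eMMC transfer error comes first, then the superblock read fails, and finally the /system mount returns -1.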
[50:31:154] [MEM] complex R/W mem test fail :FFFFFFFF
[50:31:155] memory.c:line 105 0
[50:31:155] PL fatal error
[50:31:155] PL delay for Long Press Reboot
[50:31:159] power key is pressed
[50:36:117] [PLF]Emergency Dwld mode(timeout: 5s)
[50:36:119] mtk_arch_reset at pre-loader!