ACPI_BIOS_USING_OS_MEMORY


 

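Recently our BIOS ran into a strange bug. At first, with 4 GB of memory installed, BIOS Setup showed only 3 GB. After the BIOS code was modified, the Setup menu finally showed 4 GB. The display was now correct, but the machine could no longer boot into the OS: every attempt ended in a white-on-blue blue screen, as shown in Figure 1.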

    

 Figure 1

 

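Perhaps because the SW PM knew I had once written an article on ACPI debugging, she asked me for support in analyzing this problem. To be honest, I had no experience with blue screen analysis and had never analyzed one before :( But I was not very busy at the time, so I treated it as a practice exercise. I set up WinDbg and plugged a 1394 PCIe card into the board (the development board has no onboard 1394 port). Then I let the debuggee run; as soon as it blue-screens, WinDbg breaks in. Let's look at the blue screen information.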

Waiting to reconnect...

Connected to Windows 6001 x86 compatible target, ptr64 FALSE

Kernel Debugger connection established.

Symbol search path is: D:\Vista32-sp1symbol;D:\websymbols-sp1;C:\WINNT\Symbols

Executable search path is:

Windows Kernel Version 6001 MP (1 procs) Free x86 compatible

Built by: 6001.18000.x86fre.longhorn_rtm.080118-1840

Kernel base = 0x8203d000 PsLoadedModuleList = 0x82154c70

System Uptime: not available

 

*** Fatal System Error: 0x000000a5

                       (0x00001000,0x00000000,0xFFFFFF00,0x00000105)

 

Break instruction exception - code 80000003 (first chance)

 

A fatal system error has occurred.

Debugger entered on first try; Bugcheck callbacks have not been invoked.

 

A fatal system error has occurred.

 

Connected to Windows 6001 x86 compatible target, ptr64 FALSE

Loading Kernel Symbols

........................................

Loading User Symbols

 

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

Use !analyze -v to get detailed debugging information.

 

BugCheck A5, {1000, 0, ffffff00, 105}

 

Probably caused by : acpi.sys ( acpi!MapPhysMem+39 )

 

Followup: MachineOwner

---------

 

nt!RtlpBreakWithStatusInstruction:

820f5514 cc              int     3

 

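From the output above we know that the system hit a fatal error with error code 0x000000a5 (Fatal System Error: 0x000000a5), and that the error was probably caused by acpi.sys (acpi!MapPhysMem+39). That is all the information available so far. For more detail, enter: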

0: kd> !analyze -v

*******************************************************************************

*                                                                             *

*                        Bugcheck Analysis                                    *

*                                                                             *

*******************************************************************************

 

ACPI_BIOS_ERROR (a5)

The ACPI Bios in the system is not fully compliant with the ACPI specification.

The first value indicates where the incompatibility lies:

This bug check covers a great variety of ACPI problems.  If a kernel debugger

is attached, use "!analyze -v".  This command will analyze the precise problem,

and display whatever information is most useful for debugging the specific

error.

Arguments:

Arg1: 00001000, ACPI_BIOS_USING_OS_MEMORY

   ACPI had a fatal error when processing a memory

operation region.The memory operation region tried to

map memory that has been allocated for OS usage.

Arg2: 00000000, The high portion of the physical address

 of the memory region.

Arg3: ffffff00, The low portion of the physical address

 of the memory region.

Arg4: 00000105, The length of memory being mapped.

 

Debugging Details:

------------------

 

 

DEFAULT_BUCKET_ID:  VISTA_DRIVER_FAULT

 

BUGCHECK_STR:  0xA5

 

PROCESS_NAME:  System

 

CURRENT_IRQL:  2

 

LOCK_ADDRESS:  821715e0 -- (!locks 821715e0)

 

Resource @ nt!PiEngineLock (0x821715e0)    Exclusively owned

     Threads: 85d28668-01<*>

1 total locks, 1 locks currently held

 

PNP_TRIAGE:

   Lock address  : 0x821715e0

   Thread Count  : 1

   Thread address: 0x85d28668

   Thread wait   : 0x28

 

LAST_CONTROL_TRANSFER:  from 8210a2d7 to 820f5514

 

STACK_TEXT: 

8039920c 8210a2d7 00000003 bac8867b 00000000 nt!RtlpBreakWithStatusInstruction

8039925c 8210adbd 00000003 00000105 ffffff00 nt!KiBugCheckDebugBreak+0x1c

80399628 8210a163 000000a5 00001000 00000000 nt!KeBugCheck2+0x66d

8039964c 80627376 000000a5 00001000 00000000 nt!KeBugCheckEx+0x1e

80399674 80627f9b ffffff00 00000105 85d60e58 acpi!MapPhysMem+0x39

80399690 8062aba1 85d5f000 ffffff00 00000105 acpi!MapUnmapPhysMem+0x2f

803996b4 8062f4cb 85d5f000 00000000 00008000 acpi!OpRegion+0xcb

803996d0 806287dc 85d5f000 85d60e58 00000000 acpi!ParseTerm+0x14d

803996f8 80629c75 00000000 00000000 806374b8 acpi!RunContext+0x65

80399710 80629d40 85d5f000 00000000 806374b8 acpi!InsertReadyQueue+0xa7

8039972c 8062991b 85d5f000 00000000 0000205b acpi!RestartContext+0x27

80399768 80623af9 85d5f000 86121380 85d5f000 acpi!SyncLoadDDB+0xde

8039977c 8063c4b9 ffd19010 8039979c 80636fc4 acpi!AMLILoadDDB+0x66

80399794 8063c521 86a972b0 00000000 8200df60 acpi!ACPIInitializeDDB+0x37

803997b0 8063c645 8200dec0 80637b60 80637d00 acpi!ACPIInitializeDDBs+0x47

803997c4 8061b43c 86a97bf8 86a97a38 00000000 acpi!ACPIInitialize+0xe9

803997f4 80640f7c 86a97bf8 86a7e3b0 80640ec8 acpi!ACPIInitStartACPI+0x6a

80399820 80616e4b 86a97bf8 86a7e3b0 86a7e3b0 acpi!ACPIRootIrpStartDevice+0xb4

80399850 820f9053 86a97bf8 86a97a38 803998cc acpi!ACPIDispatchIrp+0xff

80399868 821a1605 00000000 85d104c0 86a97878 nt!IofCallDriver+0x63

80399884 8204912a 803998a8 82048f47 86a97878 nt!PnpAsynchronousCall+0x96

803998d0 821a24f6 82048f47 86a97878 86a7edc8 nt!PnpStartDevice+0xb7

8039992c 821a23b1 86a97878 00000012 00000000 nt!PnpStartDeviceNode+0x13a

80399948 8219f4db 00000000 00000000 8216f530 nt!PipProcessStartPhase1+0x65

80399b44 820489e8 85d58260 00000000 80399b88 nt!PipProcessDevNodeTree+0x187

80399b9c 820488f8 00000000 86a7e5d0 833fa3b8 nt!PnpDeviceActionWorker+0xde

80399bb8 82396078 00000000 00000007 00000000 nt!PnpRequestDeviceAction+0x127

80399c34 82398ff1 808108c4 8080e430 00000000 nt!IopInitializeBootDrivers+0x3b0

80399c94 8239ccb3 808108c4 85d28990 85d28668 nt!IoInitSystem+0x5af

80399d74 82195af1 80399dc0 82212a1c 808108c4 nt!Phase1InitializationDiscard+0xb86

80399d7c 82212a1c 808108c4 bac889e7 00000000 nt!Phase1Initialization+0xd

80399dc0 8206ba3e 82195ae4 808108c4 00000000 nt!PspSystemThreadStartup+0x9d

00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

 

 

STACK_COMMAND:  kb

 

FOLLOWUP_IP:

acpi!MapPhysMem+39

80627376 6a00            push    0

 

SYMBOL_STACK_INDEX:  4

 

FOLLOWUP_NAME:  MachineOwner

 

MODULE_NAME: acpi

 

IMAGE_NAME:  acpi.sys

 

DEBUG_FLR_IMAGE_TIMESTAMP:  47918b80

 

SYMBOL_NAME:  acpi!MapPhysMem+39

 

FAILURE_BUCKET_ID:  0xA5_acpi!MapPhysMem+39

 

BUCKET_ID:  0xA5_acpi!MapPhysMem+39

 

Followup: MachineOwner

---------

 

0: kd> lmvm acpi

start    end        module name

8060e000 80654000   acpi       (pdb symbols)          C:\WINNT\Symbols\sys\acpi.pdb

    Loaded symbol image file: acpi.sys

    Image path: \SystemRoot\system32\drivers\acpi.sys

    Image name: acpi.sys

    Timestamp:        Fri Jan 18 21:32:48 2008 (47918B80)

    CheckSum:         00041A1F

    ImageSize:        00046000

    File version:     6.0.6001.18000

    Product version:  6.0.6001.18000

    File flags:       0 (Mask 3F)

    File OS:          40004 NT Win32

    File type:        3.7 Driver

    File date:        00000000.00000000

    Translations:     0409.04b0

    CompanyName:      Microsoft Corporation

    ProductName:      Microsoft® Windows® Operating System

    InternalName:     ACPI.sys

    OriginalFilename: ACPI.sys

    ProductVersion:   6.0.6001.18000

    FileVersion:      6.0.6001.18000 (longhorn_rtm.080118-1840)

    FileDescription:  ACPI Driver for NT

    LegalCopyright:   © Microsoft Corporation. All rights reserved.

 

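This time the information is more detailed: it gives the call stack and the details of the acpi.pdb symbol file. After analysis, I concluded that the main cause of the blue screen is as follows. The most direct cause is the code at acpi!MapPhysMem+39, and that code fails because the ACPI memory map in the BIOS is wrong: memory used by the OS is being claimed by the BIOS. The cause was now roughly clear, but which piece of BIOS code was at fault? I calmed down and searched my memory. Since the error is in ACPI, the problem should be in the asl code. I remembered from asl code I had read before that when system resources are needed, an OperationRegion is usually declared. For example, to use some IO resources I would write: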

OperationRegion( IO_, SystemIO, 0x62, 5 ) 

To use SystemMemory, it is written like this:

OperationRegion(XPEX, SystemMemory, 0xE0020100, 0x100)

 

The bug analysis info reports the following error:

Arguments:

Arg1: 00001000, ACPI_BIOS_USING_OS_MEMORY

   ACPI had a fatal error when processing a memory

operation region.The memory operation region tried to

map memory that has been allocated for OS usage.

Arg2: 00000000, The high portion of the physical address

 of the memory region.

Arg3: ffffff00, The low portion of the physical address

 of the memory region.

Arg4: 00000105, The length of memory being mapped.

 

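Taken literally, I think the asl code declares a region of System Memory, and the acpi.sys driver failed while parsing that region. So which piece of code caused it? Arg3 and Arg4 give the secret away. It should be a piece of code written like this: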

OperationRegion(???, SystemMemory, 0xffffff00, 0x105)
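As a rough illustration of that reasoning, the Python sketch below turns the BugCheck summary line from the log into the OperationRegion arguments to search for. The helper name decode_a5_1000 is made up for this example, and the argument meanings are taken only from the !analyze -v text quoted above. It also prints where the region ends, which is simple arithmetic and not a statement about what acpi.sys checks internally.

```python
import re

def decode_a5_1000(line):
    """Decode a 'BugCheck A5, {1000, hi, lo, len}' summary line (hypothetical helper)."""
    m = re.search(r"BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", line)
    if m is None:
        raise ValueError("not a BugCheck summary line")
    code = int(m.group(1), 16)
    args = [int(a, 16) for a in m.group(2).split(",")]
    if code != 0xA5 or args[0] != 0x1000:
        raise ValueError("not ACPI_BIOS_USING_OS_MEMORY")
    base = (args[1] << 32) | args[2]  # Arg2 = high part, Arg3 = low part of the address
    length = args[3]                  # Arg4 = length of memory being mapped
    return base, length

base, length = decode_a5_1000("BugCheck A5, {1000, 0, ffffff00, 105}")
last = base + length - 1              # last byte covered by the region

print(f"search for: OperationRegion(????, SystemMemory, 0x{base:X}, 0x{length:X})")
print(f"covers 0x{base:X}..0x{last:X}")
print("ends above 4 GiB:", last >= 1 << 32)
```

With the values from this crash, the region starts 0x100 bytes below the 4 GiB boundary and is 0x105 bytes long, so its last five bytes lie above 4 GiB.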

So I started searching the asl code and finally found the culprit's tracks. The following piece of asl code is the prime suspect:

Scope(\)
{
    OperationRegion(ATFB, SystemMemory, 0xFFFFFF00, 0x105) // Relocatable OperationRegion.
    Field(ATFB, AnyAcc, NoLock, Preserve) // Field
    {
        BCMD, 8,
        DID,  32,
        INFO, 2048,
    }
}
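A quick way to confirm that this declaration matches the crash is to check that its Field units add up to the length reported in Arg4. The Python sketch below does only that arithmetic; the names and bit widths are copied from the Field declaration above, and everything else is illustrative.

```python
# Field units copied from the ATFB Field declaration: (name, width in bits).
fields = [("BCMD", 8), ("DID", 32), ("INFO", 2048)]

bit_offset = 0
layout = []
for name, width in fields:
    layout.append((name, bit_offset // 8, width // 8))  # (name, byte offset, byte size)
    bit_offset += width

total_bytes = (bit_offset + 7) // 8  # round up to whole bytes

for name, offset, size in layout:
    print(f"{name}: offset 0x{offset:X}, {size} bytes")
print(f"total: 0x{total_bytes:X} bytes")
```

The three fields occupy 1 + 4 + 256 = 261 bytes, which is 0x105, the same value as the OperationRegion length and as Arg4 of the bugcheck.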

 

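   After the BIOS team removed this code, the board worked normally and happily booted into the OS. The bug is solved, but why exactly does declaring a region like this cause an error? The explanation from the BIOS team is that this code is never used, but its address conflicts with the resources the BIOS declares to the OS, hence the blue screen :)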
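To make the idea of an address conflict concrete, here is a minimal Python sketch of an overlap test between an OperationRegion and the memory ranges handed to the OS. The memory map in it is purely hypothetical (3 GiB of RAM low, 1 GiB remapped above 4 GiB); the real map of the failing board is not shown in the log, and this is not how acpi.sys is known to implement its check.

```python
def conflicting_ranges(base, length, os_ranges):
    """Return the OS-owned [start, end) ranges that intersect [base, base + length)."""
    end = base + length
    return [(s, e) for (s, e) in os_ranges if base < e and s < end]

# HYPOTHETICAL memory map, not taken from the failing board.
os_ranges = [
    (0x0010_0000, 0xC000_0000),      # RAM below the PCI hole (assumed)
    (0x1_0000_0000, 0x1_4000_0000),  # RAM remapped above 4 GiB (assumed)
]

atfb = (0xFFFFFF00, 0x105)  # region from the suspect asl code
xpex = (0xE0020100, 0x100)  # SystemMemory example quoted earlier

print([(hex(s), hex(e)) for s, e in conflicting_ranges(*atfb, os_ranges)])
print(conflicting_ranges(*xpex, os_ranges))
```

Under this assumed map the ATFB region touches the range that starts at 4 GiB, while the XPEX example touches nothing. That would fit the history of the bug (the crash appeared only after the BIOS started reporting the full 4 GB), but it remains a guess until the board's actual memory map is compared against the region.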

 

Peter
