x86 SMI链路错误

This article was firstly published from http://oliveryang.net. The content reuse need include the original link.

SMI link is the Scalable Memory Interconnect communication channels between the Intel processor cores and main memory.

The topology of SMI link is,

Core----->QPI bus----->iMC(Integrated Memory Controller)----->SMI link----->Scalable Memory Buffer----->DDR2/3/4 bus-----> Memory DIMMs

On Intel platform, it could be detected by BIOS SMI(System Management Interrupt) handler.
A SEL log could be found like below,

9 | 06/28/2015 | 07:32:16 | SMI Link CRC Correctable Errors #0x0a | Persistent Parity Status | Asserted |  Memory_Slot=3 SMI_Link=1

3. What kind of actions should we take?

Replace the faulty hardware if we could see massive SMI link errors, even if the errors are correctable type.
If BIOS error handling could cannot locate exact error location, the fault components (FRU) could be,

  • Memory DIMMs?

    Not sure, some high end platforms could know whether error is caused by link or ECC errors

  • Memory risers

    Some high end platforms may have

  • Mother board

    Due to faulty DIMM slots, Scalable Memory Buffer chipset

  • CPU

    Faulty iMC

The hardware replacement work could be done per cost considerations.

你可能感兴趣的:(server,hardware,X86)