1. Fault Location
1.1. Fault Information
Log Summary
Error/Event Logs
Platform Event Log - 5025732D
Created at : 10/27/2013 12:17:24
Driver Name : fips340/b1112a_0842.340
Subsystem : Memory DIMM
Event Severity : Unrecoverable Error
Action Flags : Report to Operating System
               Service Action Required
               HMC Call Home
               Service Processor Call Home Required
Action Status : Processed
                HMC-Acknowledged

Primary System Reference Code
Reference Code : B123E504
Hex Words 2 - 5 : 030000F0 28A30110 C13920FF C10000FF
Hex Words 6 - 9 : 008126C1 00000103 0E630030 00000000

Normal Hardware FRU
Priority : Mandatory, replace all with this type as a unit
Location Code : U78A0.001.DNWH1H2-P1-C13-C2
Part Number : 77P6500
CCIN : 31A6
Serial Number :
MFG Replacement Unit Id : 0x00810404
Priority : Mandatory, replace all with this type as a unit

Normal Hardware FRU
Priority : Lowest priority replacement
Location Code : U78A0.001.DNWH1H2-P1-C13
Part Number : 10N9725
CCIN : 53E1
Serial Number : YL1078002154
MFG Replacement Unit Id : 0x00811161
Priority : Lowest priority replacement

Log Hex Dump
00000000 50480030 0100F000 20131027 12172434 PH.0.... ..'..$4
00000010 20131027 12172439 45000106 00000000 ..'..$9E.......
00000020 00000000 00000000 5025732D 5025732D ........P%s-P%s-
00000030 55480018 01008300 23034000 00000000 UH......#.@.....
00000040 0000A902 01015000 505300EC 0101F000 ......P.PS......
00000050 02010009 000000E4 030000F0 28A30110 ............(...
00000060 C13920FF C10000FF 008126C1 00000103 .9 .......&.....
00000070 0E630030 00000000 42313233 45353034 .c.0....B123E504
00000080 20202020 20202020 20202020 20202020
00000090 20202020 20202020 C0000027 4C2C481C ...'L,H.
000000A0 55373841 302E3030 312E444E 57483148 U78A0.001.DNWH1H
000000B0 322D5031 2D433133 2D433200 49441C1D 2-P1-C13-C2.ID..
000000C0 37375036 35303000 33314136 00000000 77P6500.31A6....
000000D0 00000000 00000000 4D521001 00000000 ........MR......
000000E0 00000048 00810404 4C2C4C1C 55373841 ...H....L,L.U78A
000000F0 302E3030 312E444E 57483148 322D5031 0.001.DNWH1H2-P1
00000100 2D433133 00000000 49441C1D 31304E39 -C13....ID..10N9
00000110 37323500 35334531 594C3130 37383030 725.53E1YL107800
00000120 32313534 4D521001 00000000 0000004C 2154MR.........L
00000130 00811161 55440094 02043100 00007D0A ...aUD....1...}.
00000140 2F6F7074 2F666970 732F6269 6E2F6475 /opt/fips/bin/du
00000150 6D707379 7374656D 00000000 00000000 mpsystem........
00000160 00000000 00000000 00000000 00000000 ................
00000170 00000000 00000000 00000000 00000000 ................
00000180 00000000 00000000 00000000 00000000 ................
00000190 00000000 66697073 3334302F 62313131 ....fips340/b111
000001A0 32615F30 3834322E 33343000 00000000 2a_0842.340.....
000001B0 00000000 00000001 00000002 00000801 ................
000001C0 00000001 0000000F 4D54001C 01003100 ........MT....1.
000001D0 38323034 2D453841 30363145 45333400 8204-E8A061EE34.
000001E0 00000000 53570014 0201F000 00000B00 ....SW..........
000001F0 00030008 00000001 ........
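The readable fields in the hex dump can be cross-checked against the log summary. The following is a minimal Python sketch, not part of the original service record: it decodes the hex words starting at dump offsets 000000A0 and 000000C0 into ASCII and reads the packed-decimal timestamp in the first dump line. The helper names `words_to_ascii` and `bcd_timestamp` are hypothetical, and the layout is inferred only from this one dump.

```python
from datetime import datetime


def words_to_ascii(words: str) -> str:
    """Decode space-separated hex words to ASCII, stopping at the first NUL byte."""
    raw = bytes.fromhex(words.replace(" ", ""))
    return raw.split(b"\x00", 1)[0].decode("ascii")


def bcd_timestamp(words: str) -> datetime:
    """Read a timestamp stored as decimal digits in the form YYYYMMDD HHMMSSxx."""
    digits = words.replace(" ", "")
    # The last two digits (hundredths, by assumption) are ignored.
    return datetime.strptime(digits[:14], "%Y%m%d%H%M%S")


# Words copied from the dump lines at 000000A0 and 000000B0 (first FRU location code).
location = words_to_ascii(
    "55373841 302E3030 312E444E 57483148 322D5031 2D433133 2D433200"
)
# Words copied from the dump line at 000000C0 (part number, then CCIN).
part_number = words_to_ascii("37375036 35303000")
ccin = words_to_ascii("33314136")
# Third and fourth words of the dump line at 00000000 (creation time).
created = bcd_timestamp("20131027 12172434")

print(location)
print(part_number)
print(ccin)
print(created)
```

The decoded values should agree with the Location Code, Part Number, CCIN, and Created at fields of the summary; a mismatch would suggest the dump was copied incorrectly.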
1.2. Fault Location
System memory statistics |
Memory Size: | 31616 MB |

Memory location statistics |
No. | Location | Size | Quad |
1 | U78A0.001.DNWH1H2-P1-C13-C2 | 4096 MB | Pair A |
2 | U78A0.001.DNWH1H2-P1-C13-C4 | 4096 MB | Pair B |
3 | U78A0.001.DNWH1H2-P1-C13-C7 | 4096 MB | Pair A |
4 | U78A0.001.DNWH1H2-P1-C13-C9 | 4096 MB | Pair B |
5 | U78A0.001.DNWH1H2-P1-C14-C2 | 4096 MB | Pair C |
6 | U78A0.001.DNWH1H2-P1-C14-C4 | 4096 MB | Pair D |
7 | U78A0.001.DNWH1H2-P1-C14-C7 | 4096 MB | Pair C |
8 | U78A0.001.DNWH1H2-P1-C14-C9 | 4096 MB | Pair D |
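A pre-failure memory alert was raised for slot U78A0.001.DNWH1H2-P1-C13-C2.
If the memory is replaced, replace it as a pair (two modules per pair). The slots to replace are P1-Cn-C2 and P1-Cn-C7; on this system that is P1-C13-C2 and P1-C13-C7.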
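The pair rule can be sketched in Python as a quick check of which module must be replaced together with the faulty one. This is an illustration only: the function name `pair_partner` and the `PAIR_SLOTS` mapping are hypothetical, and the mapping covers just the two pairs named in this document's reference information (C2 with C7, C4 with C9).

```python
# Slot pairs as described in this document: C2 pairs with C7, C4 pairs with C9.
PAIR_SLOTS = {"C2": "C7", "C7": "C2", "C4": "C9", "C9": "C4"}


def pair_partner(location_code: str) -> str:
    """Return the location code of the DIMM that shares a pair with the given one."""
    # Split off the last segment, e.g. "...-P1-C13" and "C2".
    prefix, slot = location_code.rsplit("-", 1)
    if slot not in PAIR_SLOTS:
        raise ValueError(f"no documented pair for slot {slot}")
    return f"{prefix}-{PAIR_SLOTS[slot]}"


faulty = "U78A0.001.DNWH1H2-P1-C13-C2"
partner = pair_partner(faulty)
print(partner)
```

For the faulty location reported in the log, the partner is the C7 slot on the same processor card, which matches the Pair A rows of the memory location table.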
2. Fault Handling
2.1. Prerequisites
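Caution | Ensure the system is shut down and the power is disconnected. Wear an anti-static wrist strap while working. Back up data before adding or replacing hardware components; if a part is not installed correctly, data loss may result. |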
2.2. Preparation Items
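Preparation checklist |
Type | Item | Status |
Hardware | One laptop | Ready |
Hardware | One network cable | Ready |
Hardware | One flat-head and one Phillips screwdriver | Ready |
Hardware | One anti-static wrist strap | Ready |
Hardware | Four new memory modules | Ready |
Software | HMC environment | Ready |
Other | | |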
2.3. Procedure
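Procedure steps |
No. | Operation | Details | Status |
1 | Confirm the system is shut down | Advise the customer to back up application and business data | |
2 | Put on the anti-static wrist strap | Confirm the wrist strap is worn and connected to an unpainted part of the rack | |
3 | Disconnect power | Disconnect the primary and secondary power supplies | |
4 | Remove the service access cover | | |
5 | Remove the processor card | | |
6 | Place the removed processor card on an anti-static surface | | |
7 | Open and remove the processor front cover | | |
8 | Confirm the location of the memory to be replaced | | |
9 | Take the memory out of its anti-static packaging | | |
10 | Install the memory | | |
11 | Reinstall the processor card | | |
12 | Confirm the fault impact has cleared | Confirm the newly replaced hardware raises no alerts; confirm the new hardware is ready in the system; have the user confirm that application and business data are unaffected | |
13 | Wrap-up | Clean up the site and finish the work | |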
3. Reference Information
Processors and memory | Where to install memory modules |
Memory plugged in pairs | Plug the first pair of memory modules into memory module slots P1-Cn-C2 and P1-Cn-C7. Plug the second pair of memory modules into memory module slots P1-Cn-C4 and P1-Cn-C9. Important: Memory modules installed on the same processor assembly must be identical in size, speed, and feature code. After the second pair of memory modules, the memory modules become a quad. |
One processor card, memory plugged in quads | ・Plug the first quad of memory modules into memory module slots P1-Cn-C2, P1-Cn-C4, P1-Cn-C7, and P1-Cn-C9. ・Plug the second quad of memory modules into memory module slots P1-Cn-C1, P1-Cn-C3, P1-Cn-C6, and P1-Cn-C8. Important: Memory modules installed on the same processor assembly must be identical in size, speed, and feature code. After the second pair of memory modules, the memory modules become a quad. |
Multiple processor cards, memory plugged in quads | Memory should be balanced on each processor card, unless you have an odd number of memory modules. ・Plug the first quad of memory modules into memory module slots P1-Cn-C2, P1-Cn-C4, P1-Cn-C7, and P1-Cn-C9 on each processor card. ・Plug the second quad of memory modules in slots P1-Cn-C1, P1-Cn-C3, P1-Cn-C6, and P1-Cn-C8 on each processor card. Important: Ensure that the memory modules installed on the same processor assembly are identical in size, speed, and feature code. However, memory module feature codes can be different between processor assemblies. |
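The plugging order in the table above can be expressed as a small lookup, which may help when checking a planned layout before opening the chassis. This is a hedged Python sketch: `PLUG_ORDER` and `slots_for` are hypothetical names, and restricting the module count to 2, 4, or 8 per processor card is an assumption drawn from the pair and quad cases listed above, not a stated platform limit.

```python
from typing import List

# Plug order per processor card, following the table above:
# first pair, second pair (completing the first quad), then the second quad.
PLUG_ORDER = ["C2", "C7", "C4", "C9", "C1", "C3", "C6", "C8"]


def slots_for(card: int, module_count: int) -> List[str]:
    """List the slots to populate on processor card P1-C<card> for a module count."""
    # Assumption: only one pair, one quad, or two quads per card are considered.
    if module_count not in (2, 4, 8):
        raise ValueError("expected 2, 4, or 8 modules per processor card")
    return [f"P1-C{card}-{slot}" for slot in PLUG_ORDER[:module_count]]


first_pair = slots_for(13, 2)
first_quad = slots_for(13, 4)
print(first_pair)
print(first_quad)
```

On this system each of the two processor cards (C13 and C14) holds four modules in slots C2, C4, C7, and C9, which corresponds to the first quad in the order above.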