A Test Framework of BMC for platform team
Table of Contents
A Test Framework of BMC for platform team
Introduction
Requirement Analysis
General Design
Requirement
Non-requirements
Model Design
Pre-run model
Functional model
IPMI device “Global” Commands
BMC Watchdog Timer Commands
Chassis Commands
Event commands
PEF and Alerting command
SEL commands
SDR Repository command
FRU Inventory Device Commands
Event Filter Table
Sensor Device commands
Platform-specific OEM command
Post-run model
Implementation
Requirement
User interface
Schedule
Introduction
As one of the most important components of a server, the BMC monitors the system environment and hardware errors and records the related logs, so its high availability and stability are critical to us. Currently the platform team tests the BMC with manual scripts and commands. These scripts and commands can be put together and reused for other platforms, and we can leverage other teams' test frameworks, such as the CTH test platform, to run the tests automatically. A test framework for BMC testing is therefore needed, and this document describes the design of that framework.
Requirement Analysis
Typical use cases include:
1. OEM SEL decoding test for customer ipmitool/unit test
2. BMC stress test
3. BMC function test
4. BMC regression test
5. System monitor
6. Platform monitor
7. Firmware upgrade
8. Netmon
General Design
Requirement
・ Support both local KCS tests and LAN tests;
・ Can be integrated with the EVT test framework;
・ Can be integrated with the CTH test tools:
o Graphical user interface;
o Support automatic logging;
o Support automatic reporting;
o Test mode/collection is configurable;
・ Easy to extend to new platforms;
・ Easy to extend to other components
This requires that other software components owned by the platform team, such as platmon/error handling, can easily be tested by this framework.
・ Support auto-training;
This means the framework records the test cases that have once detected a FW/software problem and automatically adds them to the test cases for later tests. In our experience, problems usually appear when running BMC reset, firmware upgrade, sensor reading and power cycle tests, so these types of commands should be collected by the auto-training mechanism.
・ Support multiple test modes: function test; random test; stress test
o Function test: test every component/unit one by one;
o Random test: the random test generator produces random test cases, such as random raw data or random SELs sent to the BMC;
o Stress test:
a. Component level stress test:
Many commands run at the same time on the same component, such as SEL / SDR / FRU / Watchdog / CPLD, to verify its availability.
b. System level stress test:
Many components run the stress test at the same time.
・ Support an interface to add platform-specific tests;
・ The framework makes full use of the code in the current DDOS
The customer ipmitool is implemented in our DDOS according to the coming BMC specification, so we may be able to reuse the code and test scripts.
・ The framework should define a test level and a log level
The test level defines the test granularity (sensor level, component level, bus level, function level, chip level), while the log level helps debugging by logging more details about the execution and output of the test cases.
・ Based on small generic test sets
To start this work easily, we will choose a basic collection of test cases that probably applies to all platforms, such as BMC SEL, user, SDR, reset, LAN, fan, FRU and power control commands.
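The auto-training requirement above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework's design: the `TrainingStore` class, the JSON file used for persistence and the case identifiers are all hypothetical names introduced here.

```python
import json
import os
import tempfile


class TrainingStore:
    """Hypothetical store of test cases that once exposed a FW/software problem."""

    def __init__(self, path):
        self.path = path
        self.cases = set()
        if os.path.exists(path):
            with open(path) as f:
                self.cases = set(json.load(f))

    def record_failure(self, case_id):
        """Remember a failing case so that later runs always include it."""
        self.cases.add(case_id)
        with open(self.path, "w") as f:
            json.dump(sorted(self.cases), f)

    def extend(self, planned_cases):
        """Return the planned cases plus every recorded case, without duplicates."""
        extra = sorted(self.cases - set(planned_cases))
        return list(planned_cases) + extra


# Usage: a failure seen in one run is picked up by the next run.
store_path = os.path.join(tempfile.mkdtemp(), "trained_cases.json")
TrainingStore(store_path).record_failure("bmc_cold_reset")
next_run = TrainingStore(store_path).extend(["get_device_id", "sel_clear"])
print(next_run)
```

Keeping the store in a plain file is only one option; the point is that the recorded cases survive between runs and are merged into whatever test collection is configured.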
Non-requirements:
・ Tests for OEM commands that require manual operation of the hardware;
・ Tests whose results may need manual investigation.
Model Design
Pre-run model
Check BMC version requirement;
Check BIOS version requirement;
Check extra parameters;
Check that the BMC is in a normal state;
Save the default BMC IPMI user/SOL/serial/channel/MAC/DHCP/IP configuration;
Stop the system and platform monitors;
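As an illustration of the first pre-run check, the sketch below parses a firmware revision and compares it with a minimum requirement. It assumes the text comes from something shaped like `ipmitool mc info` output with a `Firmware Revision` line; the function names and the sample text are hypothetical.

```python
import re


def parse_firmware_revision(mc_info_text):
    """Extract (major, minor) from text shaped like `ipmitool mc info` output."""
    match = re.search(
        r"^Firmware Revision\s*:\s*(\d+)\.(\d+)", mc_info_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'Firmware Revision' line found")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(actual, minimum):
    """Tuples compare element by element, so (1, 10) >= (1, 9) holds."""
    return actual >= minimum


# Sample text made up for this sketch.
sample = (
    "Device ID                 : 32\n"
    "Firmware Revision         : 1.10\n"
    "IPMI Version              : 2.0\n"
)
revision = parse_firmware_revision(sample)
print(revision, meets_requirement(revision, (1, 9)))
```

Comparing integer tuples rather than strings avoids ranking "1.10" below "1.9".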
Functional Test
IPMI device “Global” Commands
・ Get Device ID command
・ Warm Reset command
・ Cold Reset command
・ Get Self Tests Results command
・ Manufacturing Test on command
・ Set ACPI power state command
・ Get ACPI power state command
・ Get Device GUID command
・ Broadcast “Get Device ID” command
・ Firmware Firewall & Command Discovery commands
・ IPMI Messaging Support Commands
o Set BMC Global Enables
o Get BMC Global Enables
o Clear Message Flags
o Get Message Flags
o Get Message
o Send message
§ Get BT interface Capabilities
§ Master write-read
§ IPMI serial/Modem Commands
§ Set Serial/Modem Configuration
§ Get Serial/Modem configuration
§ Serial/Modem Connection Active
§ SOL commands (optional)
§ SOL Activating
§ Get SOL configuration Parameters
§ Set SOL configuration Parameters
PEF and Alerting command
Get PEF Capabilities
Arm PEF Postpone Timer
Set PEF Configuration Parameters
Get PEF Configuration Parameters
Set Last Processor Event ID
Get Last Processor Event ID
BMC Watchdog Timer Commands
Reset Watchdog Timer
Set Watchdog Timer
Get Watchdog Timer
On expiration of the watchdog timeout:
o System reset
o System power off
o System power cycle
o Pre-timeout interrupt (optional)
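To exercise each expiration action listed above, a test could build one Set Watchdog Timer request per action. The sketch below follows the IPMI request layout (NetFn App 0x06, command 0x24, countdown in 100 ms units); the helper names and the choice of the SMS/OS timer use are assumptions made for illustration.

```python
NETFN_APP = 0x06
CMD_SET_WATCHDOG = 0x24
TIMER_USE_SMS_OS = 0x04  # timer use field, bits [2:0] of data byte 1

# Timeout action, bits [2:0] of data byte 2.
ACTIONS = {"none": 0x0, "reset": 0x1, "power_off": 0x2, "power_cycle": 0x3}
# Pre-timeout interrupt, bits [6:4] of data byte 2.
PRE_TIMEOUT = {"none": 0x0, "smi": 0x1, "nmi": 0x2, "messaging": 0x3}


def set_watchdog_request(action, timeout_s, pre_timeout="none", pre_interval_s=0):
    """Return NetFn, command and the six data bytes of Set Watchdog Timer."""
    countdown = int(timeout_s * 10)  # the countdown is expressed in 100 ms counts
    if not 0 <= countdown <= 0xFFFF:
        raise ValueError("timeout does not fit into the 16-bit countdown")
    actions_byte = (PRE_TIMEOUT[pre_timeout] << 4) | ACTIONS[action]
    return [
        NETFN_APP,
        CMD_SET_WATCHDOG,
        TIMER_USE_SMS_OS,
        actions_byte,
        pre_interval_s,       # pre-timeout interval in seconds
        0x00,                 # expiration flags to clear (none)
        countdown & 0xFF,     # initial countdown, LSB
        countdown >> 8,       # initial countdown, MSB
    ]


def as_ipmitool_raw(request):
    """Render the request as an `ipmitool raw` command line."""
    return "ipmitool raw " + " ".join(f"0x{byte:02x}" for byte in request)


commands = [
    as_ipmitool_raw(set_watchdog_request(action, timeout_s=30))
    for action in ("reset", "power_off", "power_cycle")
]
print("\n".join(commands))
```

Setting the timer does not start it; the countdown begins only after a Reset Watchdog Timer command (command 0x22), so a test case would send that next and then wait for the expected action.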
Chassis Commands
Get Chassis capabilities
Get Chassis Status
Chassis control
Chassis Reset
Event commands
Set event receiver
Get event receiver
Platform event message command
SEL commands
Trigger all possible SELs in scripts;
Delete SELs;
Full SEL;
Empty SEL;
Get SEL Info
Get SEL Entry
Add SEL Entry
Partial Add SEL Entry
Clear SEL
Get SEL Time
Set SEL Time
OEM SEL decoding test (How to do the test with our OEM SEL decoding code?)
SDR Repository command
Get SDR Repository Info
Reserve SDR Repository
Get SDR
Add SDR
Partial Add SDR
Clear SDR Repository
Get SDR Repository Time
Set SDR Repository Time
Get Device ID
Get Self Test Results
Broadcast Get Device ID
Get Sensor Reading
Set Event Receiver
Get Event Receiver
Platform Event
Event Filter Table
Get EFT list
Check whether EFT works
(How should the PEF auto test be done?)
Sensor Device commands
Static and Dynamic Sensor Devices
Get Device SDR Info command
Get Device SDR command
Reserve Device SDR Repository command
Get Sensor Reading Factors command
Set Sensor Hysteresis command
Get Sensor Hysteresis command
Set Sensor Threshold command
Get Sensor Event Enable command
Set Sensor Event disabled command
Re-arm Sensor Events command
Platform-specific OEM command
OEM SELs;
OEM SLIC command;
OEM NVRAM/BBU command
(How should the different OEM commands be classified per platform?)
Use the “ipmitool event” command to cover all OEM SEL decoding; this is easy to do with a script:
Step 1: get all sensors;
Step 2: for every sensor, list all possible states;
Step 3: assert/deassert every state;
Step 4: check whether the SEL decodes the event correctly.
Another method is to use the “raw 0x40 0x41” command to emulate all possible SELs, and then use “ipmitool sel list” to check whether they are decoded successfully.
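Steps 3 and 4 of the script described above might look like the sketch below. It assumes the sensors and their states (steps 1 and 2) have already been collected into a dictionary. It also assumes that `ipmitool event <sensor> <state> <assert|deassert>` triggers the event, and that a correctly decoded entry shows the sensor and state names on the newest `ipmitool sel list` line. That pass criterion and all function names are illustrative guesses, not the real decoding check.

```python
import subprocess


def run_ipmitool(args):
    """Default runner; needs ipmitool installed and a reachable BMC."""
    result = subprocess.run(
        ["ipmitool", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def check_sel_decoding(sensor_states, run=run_ipmitool):
    """Trigger every state in both directions and inspect the newest SEL line.

    sensor_states maps a sensor name to its list of state names.
    Returns the (sensor, state, direction) tuples that were not decoded.
    """
    failures = []
    for sensor, states in sensor_states.items():
        for state in states:
            for direction in ("assert", "deassert"):
                run(["event", sensor, state, direction])
                lines = run(["sel", "list"]).strip().splitlines()
                newest = lines[-1] if lines else ""
                # Assumed criterion: a decoded entry names the sensor and state.
                if sensor not in newest or state not in newest:
                    failures.append((sensor, state, direction))
    return failures


def make_fake_runner(undecodable):
    """Stand-in for a BMC so the loop can be tried without hardware."""
    sel = []

    def run(args):
        if args[0] == "event":
            _, sensor, state, direction = args
            if state in undecodable:
                sel.append("OEM record")
            else:
                sel.append(f"{sensor} | {state} | {direction}")
            return ""
        return "\n".join(sel)

    return run


failures = check_sel_decoding(
    {"Fan 1": ["Lower Critical", "Mystery"]}, make_fake_runner({"Mystery"})
)
print(failures)
```

Passing the runner as a parameter lets the same loop run against a local KCS interface, a LAN session, or a fake like the one above.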
Short press pre-boot
Long press pre-boot
Short press post-boot
Long press post-boot
One fan enters auto-manual control mode
Multiple fan control method switch
Sequential fan control commands
Chassis LED Test
SP LED Test
FAN LED Test
SLIC LED Test
PSU LED Test
Disk LED Test
Unpowered PSU inserted
Powered PSU inserted
Unpowered PSU removed
Powered PSU removed
Firmware version get
Firmware upgrade
Firmware downgrade
Firmware checksum
Firmware other options
Measure key performance for the most frequently used commands, and compare it with other platforms
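The performance measurement above could be approached as in the sketch below, which times a command repeatedly and compares the median latency with a baseline from another platform. The function names, the choice of the median and the 20% tolerance are assumptions for illustration; `time.sleep` stands in for a real IPMI command.

```python
import statistics
import time


def measure(command, repeat=20):
    """Call `command` repeatedly and return latency statistics in milliseconds."""
    samples = []
    for _ in range(repeat):
        start = time.perf_counter()
        command()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "count": len(samples),
    }


def slower_than(current, baseline, tolerance=0.2):
    """True if the current median exceeds the baseline median by more than tolerance."""
    return current["median"] > baseline["median"] * (1.0 + tolerance)


# time.sleep stands in for a real command such as Get Device ID.
stats = measure(lambda: time.sleep(0.001), repeat=5)
print(stats["count"], slower_than({"median": 13.0}, {"median": 10.0}))
```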
Post-run model
Check that the BMC is in a normal state;
Restore the default BMC IPMI user/SOL/serial/fan/LED/channel/MAC/DHCP/IP configuration;
Restore the system and platform monitors;
Record logs at three different layers:
・ Command level log: OK/ERROR
・ Case level log
・ Component level log
Then analyze the above logs: PASS rate / FAIL rate;
Log upload;
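One possible way to derive the case level and component level results from the command level log is sketched below. The record layout `(component, case, command, status)`, the rule that a case passes only when all of its commands return OK, and the sample entries are assumptions made for illustration.

```python
from collections import defaultdict


def summarize(command_log):
    """Roll a command level log up to case results and component pass rates.

    command_log is a list of (component, case, command, status) tuples,
    where status is "OK" or "ERROR".
    """
    statuses_by_case = defaultdict(list)
    for component, case, _command, status in command_log:
        statuses_by_case[(component, case)].append(status)

    # Case level: a case passes only if every command in it returned OK.
    case_results = {
        key: all(status == "OK" for status in statuses)
        for key, statuses in statuses_by_case.items()
    }

    # Component level: PASS rate over the cases of each component.
    results_by_component = defaultdict(list)
    for (component, _case), passed in case_results.items():
        results_by_component[component].append(passed)
    pass_rates = {
        component: sum(results) / len(results)
        for component, results in results_by_component.items()
    }
    return case_results, pass_rates


# Sample entries made up for this sketch.
log = [
    ("SEL", "clear", "Clear SEL", "OK"),
    ("SEL", "clear", "Get SEL Info", "OK"),
    ("SEL", "add", "Add SEL Entry", "OK"),
    ("SEL", "add", "Get SEL Entry", "ERROR"),
    ("SDR", "read", "Get SDR", "OK"),
]
case_results, pass_rates = summarize(log)
print(pass_rates)
```

The FAIL rate of a component is simply one minus its PASS rate.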
Implementation
Requirement
・ Unified final result format: $component test: PASS / FAIL / Unsupported / Not Run
・ Easy to extend to new platforms for all OEM commands
(Try to reuse all command functions in sub-scripts)
・ Cover all IPMI commands that may be used by system daemon processes
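The unified result format in the first requirement above could be produced by a small helper like the one sketched below. The `Result` enum, the function names and the precedence used to combine several case results into one component result are assumptions; the document only fixes the four result values.

```python
from enum import Enum


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNSUPPORTED = "Unsupported"
    NOT_RUN = "Not Run"


def combine(results):
    """Reduce case results to one component result.

    Assumed precedence: any FAIL wins, then any PASS, then Unsupported;
    an empty or all Not Run list yields Not Run.
    """
    for candidate in (Result.FAIL, Result.PASS, Result.UNSUPPORTED):
        if candidate in results:
            return candidate
    return Result.NOT_RUN


def format_result(component, result):
    """Render the final line as '<component> test: <result>'."""
    return f"{component} test: {result.value}"


lines = [
    format_result("SEL", combine([Result.PASS, Result.FAIL])),
    format_result("SLIC", combine([Result.UNSUPPORTED])),
    format_result("FRU", combine([])),
]
print("\n".join(lines))
```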