Some questions about ASPM
1. What is ASPM
ASPM stands for Active State Power Management. It is a feature that saves power when a PCIE link is idle.
2. Related Bits about ASPM
ASPM support and compliance Bit
The first related bits are in the PCI Express Capability Structure, in the Link Capabilities Register (offset 0CH):
ASPM Optionality Compliance: indicates whether the device conforms to the current specification.
ASPM Support: shows whether ASPM is supported.
The second relevant bit is in the PCIE Root Complex Internal Link Control capability, in the Root Complex Link Control Register:
It is used to disable/enable ASPM.
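As a rough illustration of how the two Link Capabilities fields could be decoded, here is a minimal C sketch. The bit positions (ASPM Support in bits 11:10, ASPM Optionality Compliance in bit 22) follow the PCI Express Base Specification; the macro and function names are made up for this example, and the register value is assumed to have been read already from offset 0CH of the PCI Express Capability Structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link Capabilities Register fields (offset 0CH in the PCI Express
 * Capability Structure). The names below are illustrative only. */
#define LNKCAP_ASPM_SUPPORT_SHIFT  10
#define LNKCAP_ASPM_SUPPORT_MASK   (0x3u << LNKCAP_ASPM_SUPPORT_SHIFT)
#define LNKCAP_ASPM_OPT_COMPLIANCE (1u << 22)

/* Returns the 2-bit ASPM Support field: bit 0 = L0s, bit 1 = L1. */
static unsigned int lnkcap_aspm_support(uint32_t lnkcap)
{
    return (lnkcap & LNKCAP_ASPM_SUPPORT_MASK) >> LNKCAP_ASPM_SUPPORT_SHIFT;
}

/* Returns true if the ASPM Optionality Compliance bit is set. */
static bool lnkcap_aspm_optionality_compliant(uint32_t lnkcap)
{
    return (lnkcap & LNKCAP_ASPM_OPT_COMPLIANCE) != 0;
}
```

On a running Linux system, the same fields can be cross-checked against the LnkCap line printed by lspci -vv.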
Link Bandwidth Management Bit
This bit indicates that the PCIE link width/speed has changed or that a re-training has occurred.
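A small C sketch of how this bit could be tested and cleared, under the following assumptions: the bit is bit 14 of the Link Status Register (offset 12H), and bits 14 and 15 of that register are write-1-to-clear while the rest are read-only, as defined by the PCI Express Base Specification. The helper names are hypothetical, and lnksta_after_write() only models what the hardware would do with a write; it does not touch real configuration space.

```c
#include <stdbool.h>
#include <stdint.h>

#define LNKSTA_LBMS      (1u << 14)  /* Link Bandwidth Management Status */
#define LNKSTA_LABS      (1u << 15)  /* Link Autonomous Bandwidth Status */
#define LNKSTA_RW1C_MASK (LNKSTA_LBMS | LNKSTA_LABS)

/* Returns true if the Link Bandwidth Management Status bit is set. */
static bool lnksta_bandwidth_changed(uint16_t lnksta)
{
    return (lnksta & LNKSTA_LBMS) != 0;
}

/* Models a write to the register: a 1 written to a write-1-to-clear bit
 * clears it, and writes to the read-only bits are ignored. */
static uint16_t lnksta_after_write(uint16_t current, uint16_t written)
{
    return (uint16_t)(current & ~(written & LNKSTA_RW1C_MASK));
}
```

With this model, writing LNKSTA_LBMS back to a register that has the bit set clears only that bit and leaves the link speed and width fields untouched.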
3. Software behavior about ASPM
After scanning the PCI buses and devices, the kernel checks whether ASPM is supported and enabled, and then begins ASPM initialization. During ASPM initialization, the Re-train control bit is set to trigger a PCIE link re-training. This causes the Link Bandwidth Management Status bit of the Link Status Register (offset 12H) in the PCIE Capability Structure to be set. On the Eos platform, this bit in turn causes pci_link_bandwidth_changed_status (Vendor Specific Information Capabilities: offset 30H) to be set according to the following rule:
Once an SMI is triggered, the SMI handler polls the pci_link_bandwidth_changed_status bit and, if the bit has been set, posts a warning SEL as follows:
1 | 03/04/2015 |01:06:38 | PCI-e Device Errors CPU Integrated I/O 0 | Non-Fatal Error Detected| Asserted | bus:0x00 dev:0x01 func:0x00 // Root port of SLOT 3
ELOG(65) PCI link bandwidth changed status. Bus:00H Dev:01H Fn:00H PS:C0H
2 | 03/04/2015 |01:06:38 | PCI-e Device Errors CPU Integrated I/O 0 | Non-Fatal Error Detected| Asserted | bus:0x00 dev:0x02 func:0x00 // Root port of SLOT 0
ELOG(65) PCI link bandwidth changed status. Bus:00H Dev:02H Fn:00H PS:C0H
3 | 03/04/2015 |01:06:38 | PCI-e Device Errors CPU Integrated I/O 0 | Non-Fatal Error Detected| Asserted | bus:0x00 dev:0x02 func:0x02 // Root port of on-board PMC SAS
ELOG(65) PCI link bandwidth changed status. Bus:00H Dev:02H Fn:02H PS:C0H
The bit is then cleared by the SMI handler. On the older platform, although the Link Bandwidth Management Status bit is also set, we never see any SEL/warning/alert for this bit.
If ASPM is not supported or is disabled, the ASPM initialization should be skipped after PCI scanning during the kernel boot phase.
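The expected behavior described in this section can be summarized with a toy model in C. This is only a sketch of the flow, not kernel code: struct toy_port and both functions are hypothetical, and the bit positions (Retrain Link as bit 5 of the Link Control Register at offset 10H, Link Bandwidth Management Status as bit 14 of the Link Status Register) are taken from the PCI Express Base Specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL_RETRAIN_LINK (1u << 5)   /* Link Control Register, offset 10H */
#define LNKSTA_LBMS         (1u << 14)  /* Link Status Register, offset 12H */

/* Hypothetical, simplified view of one port. */
struct toy_port {
    bool aspm_supported;
    bool aspm_enabled;
    uint16_t lnkctl;
    uint16_t lnksta;
};

/* Models a Link Control write: setting Retrain Link retrains the link,
 * which latches the Link Bandwidth Management Status bit. */
static void toy_write_lnkctl(struct toy_port *p, uint16_t val)
{
    if (val & LNKCTL_RETRAIN_LINK)
        p->lnksta |= LNKSTA_LBMS;
    /* Retrain Link always reads back as 0. */
    p->lnkctl = (uint16_t)(val & ~LNKCTL_RETRAIN_LINK);
}

/* Returns true if ASPM initialization ran (and therefore retrained the
 * link); returns false if it was skipped. */
static bool toy_aspm_init(struct toy_port *p)
{
    if (!p->aspm_supported || !p->aspm_enabled)
        return false;
    toy_write_lnkctl(p, (uint16_t)(p->lnkctl | LNKCTL_RETRAIN_LINK));
    return true;
}
```

In this model the status bit only ever becomes set on the path where ASPM is both supported and enabled, which is the behavior the text above expects from the kernel.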
4. Concerns about ASPM
Do we need to enable the ASPM feature?
Currently, ASPM is enabled and running, which is why all root ports with a SLIC inserted have been re-trained. However, neither the older nor the new platform has slots on which the ASPM feature is supported, although I did see some PCIE/Intel devices with ASPM support.
Why does disabling ASPM in the kernel fail?
Per the code in drivers/pci/pcie/aspm.c, ASPM can be forced off by appending the “pcie_aspm=off” option to the kernel command line, after which there should not be any PCIE link re-training. However, I still find that the Link Bandwidth Management bit is set with this option on the kernel command line. The “pcie_aspm=off” option did not work until I changed the code in pcie_aspm_sanity_check() as follows:
/*
 * If ASPM is disabled then we're not going to change
 * the BIOS state. It's safe to continue even if it's a
 * pre-1.1 device
 */
if (aspm_disabled)
	return -EINVAL;
//continue; ……………………………….
This seems to be a Linux kernel bug; we have filed a bug report for it.
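To make the effect of that one-line change easier to see, here is a toy model in C. It is a simplified stand-in and not the kernel's pcie_aspm_sanity_check(): struct toy_dev, the patched flag, and the reduction of the pre-1.1 test to a single boolean are all assumptions made for illustration. A return value of 0 means the caller would go on to set up the link state; -EINVAL means ASPM setup for that link is abandoned.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record: only the "pre-1.1 device" property is modeled. */
struct toy_dev {
    bool pre_1_1;
};

/* Returns 0 if ASPM link setup may proceed, -EINVAL otherwise. */
static int toy_sanity_check(const struct toy_dev *devs, size_t n,
                            bool aspm_disabled, bool patched)
{
    for (size_t i = 0; i < n; i++) {
        if (aspm_disabled) {
            if (patched)
                return -EINVAL;  /* modified code: give up immediately */
            continue;            /* original code: skip the pre-1.1 check */
        }
        if (devs[i].pre_1_1)
            return -EINVAL;      /* pre-1.1 device: do not touch ASPM */
    }
    return 0;
}
```

In this model, the original code with ASPM disabled still returns 0 and lets link setup continue, which would be consistent with the link still being re-trained; the modified code returns -EINVAL as soon as a device is seen.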
If ASPM needs to be enabled, the SEL on the new platform is not expected, correct? If ASPM is not required, do we need/have another daemon to monitor the related Link Bandwidth bit?
Take the Link Bandwidth Management bit for example: the bit will be set if the link width/speed has changed (this is already monitored by sms on the older and new platforms) or if link re-training occurs. Should system management software take care of the re-training case?