In this system, the CPU issues Read packets.
Field Name | Header Byte/Bit | Function
Length [9:0] | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In units of DW
Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI
EP (Poisoned Data) | Byte 2 Bit 6 | The driver must always send 0 here; any other value is meaningless
TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest (the driver must not send a digest)
TC (Traffic Class) | Byte 1 Bit 6:4 | 000
Type[4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | 00b = Memory Read (3DW w/o data)
1st DW BE 3:0 (First DW Byte Enables) | Byte 7 Bit 3:0 | The driver must send DW-aligned packets
Last BE 3:0 (Last DW Byte Enables) | Byte 7 Bit 7:4 |
Tag 7:0 | Byte 6 Bit 7:0 | Varies
Requester ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | All zeros
Address 31:2 | Byte 15 Bit 7:2, Byte 14 Bit 7:0, Byte 13 Bit 7:0, Byte 12 Bit 7:0 | Address
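
As a rough illustration of how the fields in this table fit together, the following sketch packs a 3DW Memory Read header. The function name `build_mem_read_header` and its parameters are hypothetical. It assumes the standard 3DW layout, in which Address[31:2] occupies bytes 8 to 11 (bytes 12 to 15, as listed in the table, are where the low 32 address bits sit in a 4DW header). It also assumes First BE = 1111 and Last BE = 0000 for a 1DW read or 1111 otherwise, following the byte-enable rules described later in these notes.

```python
def build_mem_read_header(addr: int, length_dw: int, tag: int,
                          requester_id: int = 0x0000) -> bytes:
    """Pack a 12-byte (3DW) Memory Read request header.

    Assumption: TC, TD, EP and Attr are all 0, as the table requires.
    """
    if addr & 0x3:
        raise ValueError("address must be DW aligned")
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 DW")
    length = length_dw & 0x3FF          # 1024 DW is encoded as 0
    first_be = 0b1111
    last_be = 0b1111 if length_dw > 1 else 0b0000
    byte0 = (0b00 << 5) | 0b00000       # Fmt = 00b (3DW, no data), Type = 00000b
    byte1 = 0x00                        # TC = 000
    byte2 = (length >> 8) & 0x3         # TD = 0, EP = 0, Attr = 00, Length[9:8]
    byte3 = length & 0xFF               # Length[7:0]
    header = bytes([
        byte0, byte1, byte2, byte3,
        (requester_id >> 8) & 0xFF,     # Byte 4: bus number
        requester_id & 0xFF,            # Byte 5: device/function
        tag & 0xFF,                     # Byte 6: Tag
        (last_be << 4) | first_be,      # Byte 7: Last BE | First BE
    ])
    return header + (addr & 0xFFFFFFFC).to_bytes(4, "big")


if __name__ == "__main__":
    print(build_mem_read_header(0x1000, 4, tag=7).hex())
```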
In this system, both the CPU and the board issue Write packets.
Field Name | Header Byte/Bit | Function
Length [9:0] | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In units of DW
Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI
EP (Poisoned Data) | Byte 2 Bit 6 | The driver must always send 0 here; any other value is meaningless
TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest (the driver must not send a digest)
TC (Traffic Class) | Byte 1 Bit 6:4 | 000
Type[4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | 10b = Memory Write (3DW w/ data)
1st DW BE 3:0 (First DW Byte Enables) | Byte 7 Bit 3:0 | The driver must send DW-aligned packets
Last BE 3:0 (Last DW Byte Enables) | Byte 7 Bit 7:4 |
Tag 7:0 | Byte 6 Bit 7:0 | Varies
Requester ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | All zeros
Address 31:2 | Byte 15 Bit 7:2, Byte 14 Bit 7:0, Byte 13 Bit 7:0, Byte 12 Bit 7:0 | Address

Field Name | Header Byte/Bit | Function
Length [9:0] | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In units of DW
Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI
EP (Poisoned Data) | Byte 2 Bit 6 | Must always be 0; any other value is meaningless
TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest
TC (Traffic Class) | Byte 1 Bit 6:4 | 000
Type[4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | 10b = Memory Write (3DW w/ data)
1st DW BE 3:0 (First DW Byte Enables) | Byte 7 Bit 3:0 | 1111 (4'b1111). This is the most critical setting: getting it wrong hangs the machine.
Last BE 3:0 (Last DW Byte Enables) | Byte 7 Bit 7:4 | Length <= 1DW: 0000 (4'b0); Length > 1DW: 1111 (4'b1111)
Tag 7:0 | Byte 6 Bit 7:0 | Varies
Requester ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | The sender's own ID
Address 31:2 | Byte 15 Bit 7:2, Byte 14 Bit 7:0, Byte 13 Bit 7:0, Byte 12 Bit 7:0 | Address
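
The byte-enable setting in this table is the one flagged as able to hang the machine, so a small helper may be useful for checking it in a test bench. This is only a sketch: `write_byte_enables` is a hypothetical name, and it simply encodes the two rows above (First BE always 1111; Last BE 0000 for a 1DW write, 1111 otherwise) into Byte 7 of the header.

```python
def write_byte_enables(length_dw: int) -> int:
    """Return header Byte 7: Last BE in bits 7:4, First BE in bits 3:0."""
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 DW")
    first_be = 0b1111                                  # always fully enabled
    last_be = 0b0000 if length_dw <= 1 else 0b1111     # 0000 only for 1DW writes
    return (last_be << 4) | first_be


if __name__ == "__main__":
    for n in (1, 2, 32):
        print(n, format(write_byte_enables(n), "08b"))
```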
In this system, the board must reply to the CPU with Memory Read Completion packets.
3DW Completion Header Format
Field Name | Header Byte/Bit | Function
Length 9:0 | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In DW; same as the Length of the Read packet
Attr 1:0 (Attributes) | Byte 2 Bit 5:4 | Same as the Attr of the Read packet
EP | Byte 2 Bit 6 | 0; any other value is meaningless
TD | Byte 2 Bit 7 | 0, no digest
TC 2:0 (Traffic Class) | Byte 1 Bit 6:4 | Same as the TC of the Read packet
Type 4:0 | Byte 0 Bit 4:0 | TLP packet type field. Always set to 01010b for a completion.
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | 10b = Completion with data (CplD)
Byte Count | Byte 7 Bit 7:0, Byte 6 Bit 3:0 | All zeros for now
BCM (Byte Count Modified) | Byte 6 Bit 4 | 0
CS 2:0 (Completion Status Code) | Byte 6 Bit 7:5 | 000b = Successful Completion (SC)
Completer ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | The board's own ID
Lower Address 6:0 | Byte 11 Bit 6:0 | The low 5 bits of the DW address in the Read packet's Address field, followed by two 0 bits, plus the byte offset. Because CPU reads are required here to have both the First and Last Byte Enables fully set, the offset is 0.
Tag 7:0 | Byte 10 Bit 7:0 | Same as the Tag of the Read packet
Requester ID 15:0 | Byte 9 Bit 7:0, Byte 8 Bit 7:0 | All zeros when the requester is the CPU
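
To make the "copy from the Read packet" rows concrete, here is a minimal sketch that derives a CplD header from a 12-byte 3DW read request header. The name `build_cpld_header` is hypothetical. It assumes the request carries Address[7:2] in bits 7:2 of byte 11 (the standard 3DW layout), and it leaves Byte Count at zero as the table above does; the field's defined meaning is the remaining byte count.

```python
def build_cpld_header(req: bytes, completer_id: int) -> bytes:
    """Build a 12-byte CplD header answering a 3DW Memory Read request."""
    if len(req) != 12:
        raise ValueError("expected a 12-byte (3DW) request header")
    byte0 = (0b10 << 5) | 0b01010       # Fmt = 10b (with data), Type = 01010b
    byte1 = req[1] & 0x70               # TC copied from the request
    byte2 = req[2] & 0x33               # Attr and Length[9:8] copied; TD = EP = 0
    byte3 = req[3]                      # Length[7:0] copied
    lower_addr = req[11] & 0x7C         # address bits 6:2 followed by two zeros
    return bytes([
        byte0, byte1, byte2, byte3,
        (completer_id >> 8) & 0xFF,     # Byte 4: completer bus number
        completer_id & 0xFF,            # Byte 5: completer device/function
        0x00,                           # Byte 6: CS = 000b (SC), BCM = 0, Byte Count[11:8] = 0
        0x00,                           # Byte 7: Byte Count[7:0], zero as in the table
        req[4], req[5],                 # Bytes 8-9: Requester ID copied
        req[6],                         # Byte 10: Tag copied
        lower_addr,                     # Byte 11: Lower Address[6:0]
    ])


if __name__ == "__main__":
    request = bytes([0x00, 0, 0, 4, 0, 0, 7, 0xFF, 0, 0, 0x10, 0x44])
    print(build_cpld_header(request, completer_id=0x0100).hex())
```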
As in the PCI protocol, PCI Express requires a mechanism for reconciling its DW addressing and data transfers with the need, at times, for byte resolution in transfer sizes and transaction start/end addresses. To achieve byte resolution, PCI Express makes use of the two Byte Enable fields introduced earlier in Figure 4-3 on page 162 and in Table 4-4 on page 163.
Byte Enables are used to achieve byte resolution.
The First DW Byte Enable field and the Last DW Byte Enable fields allow the requester to qualify the bytes of interest within the first and last double words transferred; this has the effect of allowing smaller transfers than a full double word and offsetting the start and end addresses from DW boundaries.
Byte Enables make transfers smaller than a full DW possible.
Byte enable bits are high true. A value of "0" indicates the corresponding byte in the data payload should not be written by the completer. A value of "1" indicates it should.
The bits are active high.
If the valid data transferred is all within a single aligned double word, the Last DW Byte enable field must be = 0000b.
If the length is 1DW, the Last DW Byte Enable must be 0000b.
If the header Length field indicates a transfer is more than 1DW, the First DW Byte Enable must have at least one bit enabled.
If the length is greater than 1DW, at least one bit of the First DW Byte Enable must be 1b.
If the Length field indicates a transfer of 3DW or more, then neither the First DW Byte Enable field nor the Last DW Byte Enable field may have discontinuous byte enable bits set. In these cases, the Byte Enable fields are only being used to offset the effective start address of a burst transaction.
Discontinuous byte enable bit patterns in the First DW Byte enable field are allowed if the transfer is 1DW.
Discontinuous byte enable bit patterns in both the First and Second DW Byte enable fields are allowed only if the transfer is Quadword aligned (2DWs).
Discontinuous byte enables are very complicated, so we do not use them.
A write request with a transfer length of 1DW and no byte enables set is legal, but has no effect on the completer.
If a read request of 1 DW is done with no byte enable bits set, the completer returns a 1DW data payload of undefined data. This may be used as a Flush mechanism. Because of ordering rules, a flush may be used to force all previously posted writes to memory before the completion is returned.
A 1DW write with all Byte Enable bits 0b is legal but has no effect. For a 1DW read with all Byte Enable bits 0b, the completer replies with a completion carrying 1DW of undefined data. This can be used as a Flush mechanism.
An example of byte enable use in this case is illustrated in Figure 4-4 on page 168. Note that the transfer length must extend from the first DW with any valid byte enabled to the last DW with any valid bytes enabled. Because the transfer is more than 2DW, the byte enables may only be used to specify the start address location (2d) and end address location (34d) of the transfer.
Figure 4-4. Using First DW and Last DW Byte Enable Fields
At first glance the figure suggests that the First Byte Enable covers the last bytes and the Last Byte Enable covers the first bytes. That is a misreading: the data immediately following the header has the lowest address, so the First DW Byte Enable necessarily governs it. The figure is drawn from the address point of view, not from the packet-structure point of view.
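
The byte-enable rules quoted above can be collected into a small checker. This is only a sketch of the rules as stated in these notes, not a complete compliance check; the names `is_contiguous` and `check_byte_enables` are hypothetical, and `qw_aligned` is an assumed flag that the caller sets when a 2DW transfer starts on a quadword boundary.

```python
def is_contiguous(be: int) -> bool:
    """True if the set bits of a 4-bit byte-enable value form a single run."""
    if be == 0:
        return True
    while be & 1 == 0:          # drop trailing zeros
        be >>= 1
    return (be & (be + 1)) == 0  # remaining bits must be 0b1, 0b11, 0b111 or 0b1111


def check_byte_enables(length_dw: int, first_be: int, last_be: int,
                       qw_aligned: bool = False) -> list:
    """Return a list of rule violations (an empty list means no violation found)."""
    errors = []
    if length_dw == 1:
        if last_be != 0:
            errors.append("1DW transfer: Last DW BE must be 0000b")
        # Discontinuous First DW BE is allowed for 1DW transfers.
        return errors
    if first_be == 0:
        errors.append("transfer > 1DW: First DW BE needs at least one bit set")
    discontinuous_ok = (length_dw == 2 and qw_aligned)
    if not discontinuous_ok:
        if not is_contiguous(first_be) or not is_contiguous(last_be):
            errors.append("discontinuous byte enables are not allowed here")
    return errors


if __name__ == "__main__":
    print(check_byte_enables(1, 0b1010, 0b0000))   # allowed
    print(check_byte_enables(3, 0b1010, 0b1111))   # not allowed
```

Since these notes avoid discontinuous byte enables altogether, in practice only the first two checks matter for this design.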
The following rules apply when a TLP includes a data payload.
The DW immediately following the header has the lowest address.
3DW Memory Request Header Formats
Field Name | Header Byte/Bit | Function
Length [9:0] | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In units of DW
Attr (Attributes) | Byte 2 Bit 5:4 | All zeros is fine; non-zero only for PCI
EP (Poisoned Data) | Byte 2 Bit 6 | 0
TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest
TC (Traffic Class) | Byte 1 Bit 6:4 | 000 is sufficient
Type[4:0] | Byte 0 Bit 4:0 | TLP packet Type field: 00000b = Memory Read or Write (sufficient here); 00001b = Memory Read Locked. The Type field is used with the Fmt [1:0] field to specify transaction type, header size, and whether a data payload is present.
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | Packet Format: 00b = Memory Read (3DW w/o data) (sufficient here); 10b = Memory Write (3DW w/ data) (sufficient here); 01b = Memory Read (4DW w/o data); 11b = Memory Write (4DW w/ data)
1st DW BE 3:0 (First DW Byte Enables) | Byte 7 Bit 3:0 | See the description above
Last BE 3:0 (Last DW Byte Enables) | Byte 7 Bit 7:4 | See the description above
Tag 7:0 | Byte 6 Bit 7:0 | These bits are used to identify each outstanding request issued by the requester. As non-posted requests are sent, the next sequential tag is assigned. Default: only bits 4:0 are used (32 outstanding transactions at a time). If the Extended Tag bit in the PCI Express Control Register is set = 1, then all 8 bits may be used (256 tags). Note: this still needs to be confirmed. The text calls for sequential tags, but non-sequential tags also seem to work; the main purpose is to guard against packet-ordering errors.
Requester ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | Identifies the requester so a completion may be returned, etc. Byte 4, 7:0 = Bus Number; Byte 5, 7:3 = Device Number; Byte 5, 2:0 = Function Number
Address 31:2 | Byte 15 Bit 7:2, Byte 14 Bit 7:0, Byte 13 Bit 7:0, Byte 12 Bit 7:0 | The lower 32 bits of the 64 bit start address for the memory transfer. Note that the lower two bits of the 32 bit address are reserved (00b), forcing the start address to be DW aligned.
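
Going the other way, the first two DWs of a request header can be unpacked into the fields listed in this table. The sketch below is illustrative only; `decode_request_header` and the dictionary keys are hypothetical names. It reads only bytes 0 to 7, which are laid out the same way in 3DW and 4DW headers.

```python
def decode_request_header(hdr: bytes) -> dict:
    """Decode bytes 0..7 of a memory request header into named fields."""
    if len(hdr) < 8:
        raise ValueError("need at least the first 8 header bytes")
    fmt = (hdr[0] >> 5) & 0x3
    return {
        "fmt": fmt,
        "type": hdr[0] & 0x1F,
        "has_data": bool(fmt & 0b10),       # 10b / 11b carry a payload
        "is_4dw": bool(fmt & 0b01),         # 01b / 11b use a 4DW header
        "tc": (hdr[1] >> 4) & 0x7,
        "td": (hdr[2] >> 7) & 0x1,
        "ep": (hdr[2] >> 6) & 0x1,
        "attr": (hdr[2] >> 4) & 0x3,
        "length_dw": ((hdr[2] & 0x3) << 8) | hdr[3],   # raw field value
        "bus": hdr[4],                      # Requester ID: bus number
        "device": hdr[5] >> 3,              # Requester ID: device number
        "function": hdr[5] & 0x7,           # Requester ID: function number
        "tag": hdr[6],
        "last_be": hdr[7] >> 4,
        "first_be": hdr[7] & 0xF,
    }


if __name__ == "__main__":
    fields = decode_request_header(bytes([0x40, 0x00, 0x00, 0x01, 0x02, 0x19, 0x05, 0x0F]))
    for name, value in fields.items():
        print(name, value)
```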
1. TC is not used unless you ask for it; traffic classes were designed for QoS.
2. The TC is carried in the packet, whereas VCs are maintained by the system; a table maps TCs to VCs.
Software is permitted a great deal of flexibility in assigning VC IDs and mapping the associated TCs. However, the specification states several rules associated with the TC/VC mapping:
TC/VC mapping must be identical for the two ports attached to the same link.
One TC must not be mapped to multiple VCs in any PCI Express Port.
One or multiple TCs can be mapped to a single VC.
In other words, TCs map many-to-one onto VCs.
3. TC = 000 is sufficient. Nothing else needs attention for now.
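
The three mapping rules can be expressed as a short check. This is a sketch under assumed data structures: a port's mapping is given as a list of hypothetical `(tc, vc)` pairs, and `validate_tc_vc` and `ports_match` are illustrative names rather than any real configuration API.

```python
def validate_tc_vc(pairs) -> dict:
    """Check that no TC maps to more than one VC; return the TC -> VC table."""
    mapping = {}
    for tc, vc in pairs:
        if not 0 <= tc <= 7:
            raise ValueError(f"TC {tc} is outside 0..7")
        if mapping.setdefault(tc, vc) != vc:
            raise ValueError(f"TC {tc} is mapped to more than one VC")
    return mapping


def ports_match(port_a, port_b) -> bool:
    """The two ports attached to the same link must use identical mappings."""
    return validate_tc_vc(port_a) == validate_tc_vc(port_b)


if __name__ == "__main__":
    upstream = [(0, 0), (1, 0), (7, 1)]     # several TCs may share one VC
    downstream = [(0, 0), (1, 0), (7, 1)]
    print(ports_match(upstream, downstream))
```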
3DW Completion Header Format
Field Name | Header Byte/Bit | Function
Length 9:0 | Byte 3 Bit 7:0, Byte 2 Bit 1:0 | In DW; same as the Length of the Read packet
Attr 1:0 (Attributes) | Byte 2 Bit 5:4 | Same as the Attr of the Read packet
EP | Byte 2 Bit 6 | 0; any other value is meaningless
TD | Byte 2 Bit 7 | 0, no digest
TC 2:0 (Traffic Class) | Byte 1 Bit 6:4 | Same as the TC of the Read packet
Type 4:0 | Byte 0 Bit 4:0 | TLP packet type field. Always set to 01010b for a completion.
Fmt 1:0 (Format) | Byte 0 Bit 6:5 | Packet Format. Always a 3DW header. 00b = Completion without data (Cpl); 10b = Completion with data (CplD)
Byte Count | Byte 7 Bit 7:0, Byte 6 Bit 3:0 | This is the remaining byte count until a read request is satisfied. Generally, it is derived from the original request Length field. See "Data Returned For Read Requests:" on page 188 for special cases caused by multiple completions.
BCM (Byte Count Modified) | Byte 6 Bit 4 | Set = 1 only by PCI-X completers. Indicates that the byte count field (see previous field) reflects the first transfer payload rather than total payload remaining. See "Using The Byte Count Modified Bit" on page 188.
CS 2:0 (Completion Status Code) | Byte 6 Bit 7:5 | These bits are encoded by the completer to indicate success in fulfilling the request. 000b = Successful Completion (SC); 001b = Unsupported Request (UR); 010b = Config Req Retry Status (CRS); 100b = Completer Abort (CA); others: reserved. See "Summary of Completion Status Codes:" on page 187.
Completer ID 15:0 | Byte 5 Bit 7:0, Byte 4 Bit 7:0 | Identifies the completer. While not needed for routing a completion, this information may be useful if debugging bus traffic. Byte 4, 7:0 = Completer Bus #; Byte 5, 7:3 = Completer Dev #; Byte 5, 2:0 = Completer Function #
Lower Address 6:0 | Byte 11 Bit 6:0 | The lower 7 bits of address for the first enabled byte of data returned with a read. Calculated from request Length and Byte enables, it is used to determine next legal Read Completion Boundary. See "Calculating Lower Address Field" on page 187.
Tag 7:0 | Byte 10 Bit 7:0 | These bits are set to reflect the Tag received with the request. The requester uses them to associate inbound completion with an outstanding request.
Requester ID 15:0 | Byte 9 Bit 7:0, Byte 8 Bit 7:0 | Copied from the request into this field to be used in routing the completion back to the original requester. Byte 8, 7:0 = Requester Bus #; Byte 9, 7:3 = Requester Device #; Byte 9, 2:0 = Requester Function #
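One Read may be answered by multiple Completions, but multiple Reads can never share a single Completion.
I/O and Configuration reads are always 1DW, so they always receive exactly one Completion.
Any Status Code other than SC terminates the transaction.
Read Completion Boundary (RCB):
This part is somewhat complicated, so here is what an engineer wrote on the Xilinx forum (see: http://forums.xilinx.com/xlnx/board/message?board.id=PCIe&thread.id=306):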
If you are returning a completion in response to a memory read request, you are not required to break up the completion. It's optional, but if you do need to return the completion in multiple packets, you must break the completion on the RCB boundaries. The maximum size of any completion with data packet is bounded by the MPS setting in your device control register.
If you are generating memory read request to the host, then the maximum size of your read is bounded by the Max Read Request size value found in the device control register. But remember that you must account for the fact that even though you send one read request, it may result in multiple completions coming back from the host based on the RCB.
The Read Completion Boundary is defined in Section 7.8.7 - Table 7-15 of the PCIe Base Specification v1.1. Table 7-15 defines the Link Control Register for both Endpoint and Root ports and Bit 3 within this register dictates whether the RCB is 64 bytes or 128 bytes. For root ports, this bit is RO and indicates the actual RCB value. For endpoints, this bit is RW and will be set by configuration software to indicate the RCB value of the Root Port. This bit would have to be changed from the SW side assuming that the Chipset supports 128B RCB. Section 2.3.1.1 also provides more information on RCB.
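In other words: for a read packet coming from the CPU, we may reply with a single large packet. For a read packet we send to the Memory Controller, the other side may reply with many packets whose lengths are based on the RCB. But if a Memory Read we send to another board is answered with one large packet, can it be handled correctly? This needs an experiment.
Addendum: the experiment shows that Cpl packets larger than the RCB cannot be sent.
Split completion packets must return their data in ascending address order.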
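
One way to picture the splitting is the sketch below, which cuts a read at every RCB so that no completion crosses a boundary or exceeds the RCB, in line with the experimental finding above. The name `split_on_rcb` and the dictionary keys are hypothetical. It assumes a DW-aligned start address and a whole number of DWs, and it fills Byte Count with the remaining byte count, following the field's definition in the completion header table rather than the "all zeros for now" setting.

```python
def split_on_rcb(addr: int, nbytes: int, rcb: int = 64) -> list:
    """Split a read of nbytes at addr into completions that end on RCB boundaries.

    Chunks are produced in ascending address order.
    """
    if addr % 4 or nbytes % 4 or nbytes <= 0:
        raise ValueError("sketch assumes a DW-aligned, non-empty transfer")
    chunks = []
    remaining = nbytes
    while remaining > 0:
        to_boundary = rcb - (addr % rcb)        # bytes left before the next RCB
        size = min(remaining, to_boundary)
        chunks.append({
            "addr": addr,
            "length_dw": size // 4,
            "byte_count": remaining,            # bytes still owed, including this chunk
            "lower_addr": addr & 0x7F,          # low 7 bits of this chunk's start address
        })
        addr += size
        remaining -= size
    return chunks


if __name__ == "__main__":
    for chunk in split_on_rcb(0x30, 0x60):
        print(chunk)
```

With a start address of 0x30 and 0x60 bytes requested, the first completion is cut short at 0x40 because the start address is not RCB aligned.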
Refer to the Lower Address field in Table 4-9 on page 185. The Lower Address field is set up by the completer during completions with data (CplD) to reflect the address of the first enabled byte of data being returned in the completion payload. This must be calculated in hardware by considering both the DW start address and the byte enable pattern in the First DW Byte Enable field provided in the original request. Basically, the address is an offset from the DW start address:
Once calculated, the lower 7 bits are placed in the Lower Address field of the completion header in the event the start address was not aligned on a Read Completion Boundary (RCB) and the read completion must break off at the first RCB. Knowledge of the RCB is necessary because breaking a transaction must be done on RCBs which are based on start address--not transfer size.
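
The offset calculation can be sketched as follows. The function name `lower_address` is hypothetical; the offset rule used is that the offset equals the position of the lowest set bit in the First DW Byte Enable field, with 0 used when no bit is set.

```python
def lower_address(start_addr: int, first_be: int) -> int:
    """Compute the 7-bit Lower Address field for the first completion.

    start_addr is the byte address from the request (low two bits ignored);
    first_be is the 4-bit First DW Byte Enable value from the request.
    """
    if first_be & 0b0001 or first_be == 0:
        offset = 0      # byte 0 enabled, or no byte enabled at all
    elif first_be & 0b0010:
        offset = 1      # first enabled byte is byte 1
    elif first_be & 0b0100:
        offset = 2      # first enabled byte is byte 2
    else:
        offset = 3      # only byte 3 enabled
    return ((start_addr & ~0x3) + offset) & 0x7F


if __name__ == "__main__":
    print(hex(lower_address(0x1044, 0b1100)))
```

For the CPU reads described in these notes, where the First DW Byte Enable is always 1111, the offset is always 0 and the field reduces to the low address bits with two zero bits appended.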