PCIe Packet Structures and Common Reference Material

I. Memory Read Packets

In this system, the CPU issues Read packets.

1. Format of packets sent by the CPU


  

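| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In units of DW |
| Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI |
| EP (Poisoned Data) | Byte 2 Bit 6 | The driver must always send 0 here; anything else is meaningless |
| TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest (the driver must not send a digest) |
| TC (Traffic Class) | Byte 1 Bit 6:4 | 000 |
| Type [4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | 00b = Memory Read (3DW w/o data) |
| 1st DW BE [3:0] (First DW Byte Enables) | Byte 7 Bit 3:0 | The driver must send DW-aligned requests, so this is 1111 (4'b1111). The byte enables are the most important setting: a wrong value hangs the machine. |
| Last BE [3:0] (Last DW Byte Enables) | Byte 7 Bit 7:4 | Length <= 1 DW: 0000 (4'b0); Length > 1 DW: 1111 (4'b1111) |
| Tag [7:0] | Byte 6 Bit 7:0 | Varies |
| Requester ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | All zeros |
| Address [31:2] | Byte 11 Bit 7:2; Byte 10 Bit 7:0; Byte 9 Bit 7:0; Byte 8 Bit 7:0 | Address |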
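As a rough illustration of the field values listed above, the following Python sketch packs a 3DW Memory Read header into 12 bytes, Byte 0 first. It assumes the standard 3DW layout in which Address [31:2] occupies Bytes 8 to 11; the function name `build_mrd32_header` and its parameters are hypothetical, not part of any driver or IP core.

```python
def build_mrd32_header(address, length_dw, tag, requester_id=0x0000):
    """Pack a 3DW Memory Read request header (12 bytes, Byte 0 first).

    Hypothetical helper: TC, Attr, EP and TD are fixed at 0 and the
    request is DW aligned, as the table above requires.
    """
    if address & 0x3 or address >> 32:
        raise ValueError("address must be a DW-aligned 32-bit value")
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 DW")
    length = length_dw & 0x3FF            # 1024 DW is encoded as 0
    first_be = 0b1111                     # DW aligned: all bytes enabled
    last_be = 0b1111 if length_dw > 1 else 0b0000
    return bytes([
        (0b00 << 5) | 0b00000,            # Byte 0: Fmt = 00b, Type = 00000b
        0x00,                             # Byte 1: TC = 000
        (length >> 8) & 0x3,              # Byte 2: TD = EP = 0, Attr = 00, Length[9:8]
        length & 0xFF,                    # Byte 3: Length[7:0]
        (requester_id >> 8) & 0xFF,       # Byte 4: bus number
        requester_id & 0xFF,              # Byte 5: device/function number
        tag & 0xFF,                       # Byte 6: Tag
        (last_be << 4) | first_be,        # Byte 7: Last BE, First BE
    ]) + address.to_bytes(4, "big")       # Bytes 8..11: Address[31:2], 00
```

For a 1 DW read the Last BE nibble comes out as 0000, and for longer reads as 1111, matching the two cases in the table.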

 

II. Memory Write Packets

In this system, both the CPU and the board issue Write packets.

1. Format of packets sent by the CPU


  

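| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In units of DW |
| Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI |
| EP (Poisoned Data) | Byte 2 Bit 6 | The driver must always send 0 here; anything else is meaningless |
| TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest (the driver must not send a digest) |
| TC (Traffic Class) | Byte 1 Bit 6:4 | 000 |
| Type [4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | 10b = Memory Write (3DW w/ data) |
| 1st DW BE [3:0] (First DW Byte Enables) | Byte 7 Bit 3:0 | The driver must send DW-aligned requests, so this is 1111 (4'b1111). The byte enables are the most important setting: a wrong value hangs the machine. |
| Last BE [3:0] (Last DW Byte Enables) | Byte 7 Bit 7:4 | Length <= 1 DW: 0000 (4'b0); Length > 1 DW: 1111 (4'b1111) |
| Tag [7:0] | Byte 6 Bit 7:0 | Varies |
| Requester ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | All zeros |
| Address [31:2] | Byte 11 Bit 7:2; Byte 10 Bit 7:0; Byte 9 Bit 7:0; Byte 8 Bit 7:0 | Address |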

2. Format of packets sent by the board


  

| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In units of DW |
| Attr (Attributes) | Byte 2 Bit 5:4 | 00; non-zero only for PCI |
| EP (Poisoned Data) | Byte 2 Bit 6 | Must always be 0; anything else is meaningless |
| TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest |
| TC (Traffic Class) | Byte 1 Bit 6:4 | 000 |
| Type [4:0] | Byte 0 Bit 4:0 | 00000b = Memory Read or Write |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | 10b = Memory Write (3DW w/ data) |
| 1st DW BE [3:0] (First DW Byte Enables) | Byte 7 Bit 3:0 | 1111 (4'b1111). The byte enables are the most important setting: a wrong value hangs the machine. |
| Last BE [3:0] (Last DW Byte Enables) | Byte 7 Bit 7:4 | Length <= 1 DW: 0000 (4'b0); Length > 1 DW: 1111 (4'b1111) |
| Tag [7:0] | Byte 6 Bit 7:0 | Varies |
| Requester ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | The board's own ID |
| Address [31:2] | Byte 11 Bit 7:2; Byte 10 Bit 7:0; Byte 9 Bit 7:0; Byte 8 Bit 7:0 | Address |
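A minimal Python sketch of how a board-side Memory Write TLP could be assembled from these fields: a 3DW header followed by the payload, lowest address first. It assumes the standard 3DW layout with the address in Bytes 8 to 11; `build_mwr32_tlp` and its parameters are hypothetical names chosen for illustration.

```python
def build_mwr32_tlp(address, payload, requester_id, tag=0):
    """Return a 3DW Memory Write header followed by its data payload.

    Hypothetical helper: the payload must be a whole number of DWs and
    the address DW aligned, so First BE is always 1111b.
    """
    if address & 0x3 or address >> 32:
        raise ValueError("address must be a DW-aligned 32-bit value")
    if not payload or len(payload) % 4:
        raise ValueError("payload must be a non-empty multiple of 4 bytes")
    length_dw = len(payload) // 4
    if length_dw > 1024:
        raise ValueError("payload longer than 1024 DW")
    length = length_dw & 0x3FF                 # 1024 DW is encoded as 0
    last_be = 0b1111 if length_dw > 1 else 0b0000
    header = bytes([
        (0b10 << 5) | 0b00000,                 # Byte 0: Fmt = 10b, Type = 00000b
        0x00,                                  # Byte 1: TC = 000
        (length >> 8) & 0x3,                   # Byte 2: TD = EP = 0, Attr = 00
        length & 0xFF,                         # Byte 3: Length[7:0]
        (requester_id >> 8) & 0xFF,            # Byte 4: the board's bus number
        requester_id & 0xFF,                   # Byte 5: device/function number
        tag & 0xFF,                            # Byte 6: Tag
        (last_be << 4) | 0b1111,               # Byte 7: Last BE, First BE
    ]) + address.to_bytes(4, "big")            # Bytes 8..11
    return header + bytes(payload)
```

The only differences from the CPU-originated write are the Requester ID, which carries the board's own ID, and who is responsible for getting the byte enables right.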

 

III. Memory Read Completion Packets

In this system, the board must reply to the CPU with Memory Read Completion packets.


3DW Completion Header Format

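| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In DW; same as the Length of the Read packet |
| Attr [1:0] (Attributes) | Byte 2 Bit 5:4 | Same as the Attr of the Read packet |
| EP | Byte 2 Bit 6 | 0; anything else is meaningless |
| TD | Byte 2 Bit 7 | 0, no digest |
| TC [2:0] (Traffic Class) | Byte 1 Bit 6:4 | Same as the TC of the Read packet |
| Type [4:0] | Byte 0 Bit 4:0 | TLP packet type field. Always set to 01010b for a completion. |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | 10b = Completion with data (CplD) |
| Byte Count | Byte 7 Bit 7:0; Byte 6 Bit 3:0 | All zeros for now |
| BCM (Byte Count Modified) | Byte 6 Bit 4 | 0 |
| CS [2:0] (Completion Status Code) | Byte 6 Bit 7:5 | 000b = Successful Completion (SC) |
| Completer ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | The board's own ID |
| Lower Address [6:0] | Byte 11 Bit 6:0 | The low 5 bits of the DW address in the Read packet with two zeros appended, plus the byte offset. CPU reads here are required to have both the First and Last byte enables fully enabled, so the offset is 0. |
| Tag [7:0] | Byte 10 Bit 7:0 | Same as the Tag of the Read packet |
| Requester ID [15:0] | Byte 9 Bit 7:0; Byte 8 Bit 7:0 | All zeros when the requester is the CPU |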
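The "same as the Read packet" entries suggest deriving the completion header directly from the request header. The sketch below does that for a 12-byte 3DW Memory Read header, assuming the standard layout (request address in Bytes 8 to 11) and a fully enabled First BE so that the byte offset is 0. `build_cpld_header` is a hypothetical name, and `byte_count` is left to the caller because the table above still treats that field as a placeholder.

```python
def build_cpld_header(req_header, completer_id, byte_count):
    """Derive a 3DW CplD header from a 3DW Memory Read request header.

    Hypothetical helper: copies TC, Attr, Length, Requester ID and Tag
    from the request and fills in Status = SC, BCM = 0, TD = EP = 0.
    """
    if len(req_header) != 12:
        raise ValueError("expected a 12-byte 3DW request header")
    return bytes([
        (0b10 << 5) | 0b01010,            # Byte 0: Fmt = 10b, Type = 01010b
        req_header[1] & 0x70,             # Byte 1: TC copied from the request
        req_header[2] & 0x33,             # Byte 2: Attr and Length[9:8]; TD = EP = 0
        req_header[3],                    # Byte 3: Length[7:0]
        (completer_id >> 8) & 0xFF,       # Byte 4: completer bus number
        completer_id & 0xFF,              # Byte 5: completer device/function
        (byte_count >> 8) & 0x0F,         # Byte 6: Status = 000b, BCM = 0, Byte Count[11:8]
        byte_count & 0xFF,                # Byte 7: Byte Count[7:0]
        req_header[4],                    # Byte 8: requester bus number
        req_header[5],                    # Byte 9: requester device/function
        req_header[6],                    # Byte 10: Tag
        req_header[11] & 0x7C,            # Byte 11: Address[6:2] followed by 00
    ])
```

Masking request Byte 11 with 0x7C keeps Address [6:2] and forces the two low bits to 0, which is the "low 5 bits plus two zeros" rule from the table.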

Appendix: Reference Material

1. Endpoint Block Plus for PCI Express

2. Byte Enable

Using Byte Enables

As in the PCI protocol, PCI Express requires a mechanism for reconciling its DW addressing and data transfers with the need, at times, for byte resolution in transfer sizes and transaction start/end addresses. To achieve byte resolution, PCI Express makes use of the two Byte Enable fields introduced earlier in Figure 4-3 on page 162 and in Table 4-4 on page 163.

Byte Enables are what provide byte resolution.

The First DW Byte Enable field and the Last DW Byte Enable field allow the requester to qualify the bytes of interest within the first and last double words transferred; this has the effect of allowing smaller transfers than a full double word and offsetting the start and end addresses from DW boundaries.

Byte Enables make transfers smaller than one DW possible.

Byte Enable Rules

Byte enable bits are high true. A value of "0" indicates the corresponding byte in the data payload should not be written by the completer. A value of "1" indicates it should.

That is, the bits are active high.

If the valid data transferred is all within a single aligned double word, the Last DW Byte Enable field must be = 0000b.

If the length is 1 DW or less, the Last DW Byte Enable must be 0000b.

If the header Length field indicates a transfer is more than 1DW, the First DW Byte Enable must have at least one bit enabled.

If the length is greater than 1 DW, the First DW Byte Enable must have at least one bit set to 1b.

If the Length field indicates a transfer of 3DW or more, then neither the First DW Byte Enable field nor the Last DW Byte Enable field may have discontinuous byte enable bits set. In these cases, the Byte Enable fields are only being used to offset the effective start address of a burst transaction.

Discontinuous byte enable bit patterns in the First DW Byte Enable field are allowed if the transfer is 1DW.

Discontinuous byte enable bit patterns in both the First and Last DW Byte Enable fields are allowed only if the transfer is Quadword aligned (2DWs).

Discontinuous byte enables are very complicated; do not use them.

A write request with a transfer length of 1DW and no byte enables set is legal, but has no effect on the completer.

If a read request of 1 DW is done with no byte enable bits set, the completer returns a 1DW data payload of undefined data. This may be used as a Flush mechanism. Because of ordering rules, a flush may be used to force all previously posted writes to memory before the completion is returned.

A 1 DW write with all Byte Enables set to 0b is legal but has no effect. For a 1 DW read with all Byte Enables set to 0b, the completer replies with a completion carrying 1 DW of undefined data. This can be used as a Flush mechanism.
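The rules above can be condensed into a small checker. This is only a sketch of the rules as quoted in this section (it does not attempt to cover every check in the specification); `is_contiguous` and `byte_enables_ok` are hypothetical names.

```python
def is_contiguous(be):
    """True if the set bits of a 4-bit byte enable form one unbroken run."""
    if be == 0:
        return True
    while be & 1 == 0:                # drop disabled low bytes
        be >>= 1
    return (be & (be + 1)) == 0       # remaining bits must be 0b1, 0b11, ...


def byte_enables_ok(length_dw, first_be, last_be, address=0):
    """Check First/Last DW Byte Enables against the rules quoted above."""
    if length_dw == 1:
        # Single DW: Last BE must be 0000b; First BE may be anything,
        # including discontinuous patterns and 0000b.
        return last_be == 0
    if first_be == 0:
        # More than 1 DW: First BE needs at least one bit set.
        return False
    if is_contiguous(first_be) and is_contiguous(last_be):
        return True
    # Discontinuous patterns are only allowed for a quadword-aligned 2 DW transfer.
    return length_dw == 2 and address % 8 == 0
```

With the "always 1111b" convention adopted earlier in this document, every request passes trivially; the checker is mainly useful for validating packets that arrive from a driver that does not follow that convention.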

An example of byte enable use in this case is illustrated in Figure 4-4 on page 168. Note that the transfer length must extend from the first DW with any valid byte enabled to the last DW with any valid bytes enabled. Because the transfer is more than 2DW, the byte enables may only be used to specify the start address location (2d) and end address location (34d) of the transfer.

Figure 4-4. Using First DW and Last DW Byte Enable Fields

Judging from the figure, the First Byte appears to be the last byte and the Last Byte the first. That is a misreading: the data immediately following the Header has the lowest address, so the First Byte Enable necessarily governs it. The figure is drawn from the address point of view, not from the packet-structure point of view.
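To make the example concrete, the following sketch derives the DW start address, Length, and both Byte Enable fields from a byte range. It assumes the start and end addresses are inclusive byte addresses and that bit 0 of each Byte Enable field corresponds to the lowest-addressed byte of its DW; `request_fields` is a hypothetical name.

```python
def request_fields(start, end):
    """Return (dw_address, length_dw, first_be, last_be) for bytes start..end."""
    if end < start:
        raise ValueError("end must not be below start")
    dw_address = start & ~0x3                    # DW containing the first byte
    length_dw = (end >> 2) - (start >> 2) + 1    # DWs touched, first to last
    first_be = (0xF << (start & 3)) & 0xF        # enable bytes from start upward
    last_be = 0xF >> (3 - (end & 3))             # enable bytes up to end
    if length_dw == 1:
        # Everything sits in one DW: merge into First BE, Last BE = 0000b.
        first_be &= last_be
        last_be = 0
    return dw_address, length_dw, first_be, last_be
```

Under those assumptions, a transfer from byte 2 to byte 34 gives a Length of 9 DW with First BE = 1100b and Last BE = 0111b: the First BE describes the DW that follows the header, which is the point made in the note above.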

3. Additional Rules For TLPs With Data Payloads

The following rules apply when a TLP includes a data payload.

  1. The Length field refers to data payload only; the Digest field (if present) is not included in the Length.
  2. The first byte of data in the payload (immediately after the header) is always associated with the lowest (start) address.

The DW immediately following the Header has the lowest address.

  3. The Length field always represents an integral number of doublewords (DW) transferred. Partial doublewords are qualified using First and Last Byte Enable fields.
  4. The PCI Express specification states that when multiple transactions are returned by a completer in response to a single memory request, each intermediate transaction must end on a naturally-aligned 64 or 128 byte address boundary for a root complex (this is termed the Read Completion Boundary, or RCB). All other devices must break such transactions at naturally-aligned 128 byte boundaries. This behavior promotes system performance related to cache lines.
  5. The Length field is reserved when sending message TLPs using the transaction Msg. The Length field is valid when sending the message with data variant MsgD.
  6. PCI Express supports load tuning of links. This means that the data payload of a TLP must not exceed the current value in the Max_Payload_Size field of the Device Control Register. Among requests, only writes carry data payloads, so this restriction does not apply to read requests; completions with data are bounded by Max_Payload_Size as well. A receiver is required to check for violations of the Max_Payload_Size limit; violations are handled as Malformed TLPs.
  7. Receivers also must check for discrepancies between the value in the Length field and the actual amount of data transferred in a TLP with data. Violations are also handled as Malformed TLPs.
  8. Requests must not mix combinations of start address and transfer length which will cause a memory space access to cross a 4KB boundary. While checking is optional in this case, receivers checking for violations of this rule will report it as a Malformed TLP.
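The 4KB rule is easy to get wrong when a DMA engine generates addresses on its own. The sketch below checks a request against the boundary and, if needed, splits it into legal pieces. It assumes DW-aligned addresses and lengths in DW; `crosses_4k` and `split_at_4k` are hypothetical helper names.

```python
def crosses_4k(address, length_dw):
    """True if a request of length_dw DWs starting at address crosses a 4KB boundary."""
    last_byte = address + length_dw * 4 - 1
    return (address >> 12) != (last_byte >> 12)


def split_at_4k(address, length_dw):
    """Split a DW-aligned request into (address, length_dw) pieces within 4KB pages."""
    pieces = []
    while length_dw > 0:
        room_dw = (0x1000 - (address & 0xFFF)) // 4   # DWs left in this 4KB page
        n = min(length_dw, room_dw)
        pieces.append((address, n))
        address += n * 4
        length_dw -= n
    return pieces
```

A real sender would also cap each piece at Max_Payload_Size (for writes) or Max_Read_Request_Size (for reads); that cap is left out here to keep the sketch focused on the boundary rule.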

4. Memory Requests

3DW Memory Request Header Formats


      

| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In units of DW |
| Attr (Attributes) | Byte 2 Bit 5:4 | All zeros is fine; non-zero only for PCI |
| EP (Poisoned Data) | Byte 2 Bit 6 | 0 |
| TD (TLP Digest Field Present) | Byte 2 Bit 7 | 0, no digest |
| TC (Traffic Class) | Byte 1 Bit 6:4 | 000 is sufficient |
| Type [4:0] | Byte 0 Bit 4:0 | TLP packet Type field: 00000b = Memory Read or Write (sufficient here); 00001b = Memory Read Locked. The Type field is used with the Fmt [1:0] field to specify transaction type, header size, and whether a data payload is present. |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | Packet Format: 00b = Memory Read (3DW w/o data) (sufficient here); 10b = Memory Write (3DW w/ data) (sufficient here); 01b = Memory Read (4DW w/o data); 11b = Memory Write (4DW w/ data) |
| 1st DW BE [3:0] (First DW Byte Enables) | Byte 7 Bit 3:0 | See the description above |
| Last BE [3:0] (Last DW Byte Enables) | Byte 7 Bit 7:4 | See the description above |
| Tag [7:0] | Byte 6 Bit 7:0 | These bits are used to identify each outstanding request issued by the requester. As non-posted requests are sent, the next sequential tag is assigned. Default: only bits 4:0 are used (32 outstanding transactions at a time). If the Extended Tag Field Enable bit in the Device Control Register is set = 1, then all 8 bits may be used (256 tags). Still to be confirmed: the text asks for sequential tags, but non-sequential tags also seem to work; the main purpose is to guard against packet-ordering errors. |
| Requester ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | Identifies the requester so a completion may be returned, etc. Byte 4, 7:0 = Bus Number; Byte 5, 7:3 = Device Number; Byte 5, 2:0 = Function Number |
| Address [31:2] | Byte 11 Bit 7:2; Byte 10 Bit 7:0; Byte 9 Bit 7:0; Byte 8 Bit 7:0 | The 32-bit start address for the memory transfer. The lower two bits of the address are reserved (00b), forcing the start address to be DW aligned. In a 4DW header these bits are the lower 32 bits of the 64-bit start address and move to Bytes 12 to 15. |
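Going the other way, a received header can be unpacked field by field. The sketch below decodes a Memory Read/Write request header given as raw bytes (Byte 0 first), assuming the standard layout in which a 3DW header carries the address in Bytes 8 to 11 and a 4DW header carries a 64-bit address in Bytes 8 to 15. `decode_mem_request` and the dictionary keys are hypothetical names.

```python
def decode_mem_request(h):
    """Decode a 3DW or 4DW Memory Read/Write request header into a dict."""
    if len(h) < 12:
        raise ValueError("need at least a 3DW header")
    fmt = (h[0] >> 5) & 0x3
    if h[0] & 0x1F != 0b00000:
        raise ValueError("not a Memory Read/Write request")
    four_dw = bool(fmt & 0b01)                 # Fmt bit 0: 4DW header
    if four_dw and len(h) < 16:
        raise ValueError("4DW header is truncated")
    addr_bytes = h[8:16] if four_dw else h[8:12]
    raw_length = ((h[2] & 0x3) << 8) | h[3]
    return {
        "kind": ("MWr" if fmt & 0b10 else "MRd") + (" 4DW" if four_dw else " 3DW"),
        "tc": (h[1] >> 4) & 0x7,
        "td": h[2] >> 7,
        "ep": (h[2] >> 6) & 0x1,
        "attr": (h[2] >> 4) & 0x3,
        "length_dw": raw_length or 1024,       # 0 encodes 1024 DW
        "bus": h[4],
        "device": h[5] >> 3,
        "function": h[5] & 0x7,
        "tag": h[6],
        "last_be": h[7] >> 4,
        "first_be": h[7] & 0xF,
        "address": int.from_bytes(addr_bytes, "big") & ~0x3,
    }
```

Dumping these fields for the first few packets from the CPU is a quick way to confirm that the driver really keeps TC, Attr, EP and TD at zero as required above.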

 

5. Traffic Classes and Virtual Channels

1. Unless you explicitly ask for them, TCs never come into play; they were designed for QoS.

2. The TC travels in the packet, the VC is maintained by the system, and a table maps TCs to VCs.

Software is permitted a great deal of flexibility in assigning VC IDs and mapping the associated TCs. However, the specification states several rules associated with the TC/VC mapping:

TC/VC mapping must be identical for the two ports attached to the same link.

One TC must not be mapped to multiple VCs in any PCI Express Port.

One or multiple TCs can be mapped to a single VC.

In other words, TCs map many-to-one onto VCs.

3. TC = 000 is sufficient; nothing else needs attention for now.
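The three mapping rules can be expressed as two small checks. This is only an illustration of the rules quoted above, using a plain dictionary from VC ID to the set of TCs mapped to it; it does not model the actual configuration registers, and `tc_vc_map_ok` and `link_maps_match` are hypothetical names.

```python
def tc_vc_map_ok(vc_to_tcs):
    """True if no TC (0..7) is mapped to more than one VC."""
    seen = set()
    for tcs in vc_to_tcs.values():
        for tc in tcs:
            if tc not in range(8) or tc in seen:
                return False
            seen.add(tc)
    return True


def link_maps_match(port_a, port_b):
    """True if the two ports on a link use an identical TC/VC mapping."""
    normalise = lambda m: {vc: set(tcs) for vc, tcs in m.items()}
    return normalise(port_a) == normalise(port_b)
```

Several TCs sharing one VC is accepted by `tc_vc_map_ok`; only a TC that shows up under two different VCs is rejected.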

6. Completion

3DW Completion Header Format


| Field Name | Header Byte/Bit | Function |
| --- | --- | --- |
| Length [9:0] | Byte 3 Bit 7:0; Byte 2 Bit 1:0 | In DW; same as the Length of the Read packet |
| Attr [1:0] (Attributes) | Byte 2 Bit 5:4 | Same as the Attr of the Read packet |
| EP | Byte 2 Bit 6 | 0; anything else is meaningless |
| TD | Byte 2 Bit 7 | 0, no digest |
| TC [2:0] (Traffic Class) | Byte 1 Bit 6:4 | Same as the TC of the Read packet |
| Type [4:0] | Byte 0 Bit 4:0 | TLP packet type field. Always set to 01010b for a completion. |
| Fmt [1:0] (Format) | Byte 0 Bit 6:5 | Packet Format. Always a 3DW header. 00b = Completion without data (Cpl); 10b = Completion with data (CplD) |
| Byte Count | Byte 7 Bit 7:0; Byte 6 Bit 3:0 | This is the remaining byte count until a read request is satisfied. Generally, it is derived from the original request Length field. See "Data Returned For Read Requests:" on page 188 for special cases caused by multiple completions. |
| BCM (Byte Count Modified) | Byte 6 Bit 4 | Set = 1 only by PCI-X completers. Indicates that the byte count field (see previous field) reflects the first transfer payload rather than total payload remaining. See "Using The Byte Count Modified Bit" on page 188. |
| CS [2:0] (Completion Status Code) | Byte 6 Bit 7:5 | These bits are encoded by the completer to indicate success in fulfilling the request. 000b = Successful Completion (SC); 001b = Unsupported Request (UR); 010b = Configuration Request Retry Status (CRS); 100b = Completer Abort (CA); others: reserved. See "Summary of Completion Status Codes:" on page 187. |
| Completer ID [15:0] | Byte 5 Bit 7:0; Byte 4 Bit 7:0 | Identifies the completer. While not needed for routing a completion, this information may be useful if debugging bus traffic. Byte 4, 7:0 = Completer Bus #; Byte 5, 7:3 = Completer Dev #; Byte 5, 2:0 = Completer Function # |
| Lower Address [6:0] | Byte 11 Bit 6:0 | The lower 7 bits of address for the first enabled byte of data returned with a read. Calculated from request Length and Byte enables, it is used to determine next legal Read Completion Boundary. See "Calculating Lower Address Field" on page 187. |
| Tag [7:0] | Byte 10 Bit 7:0 | These bits are set to reflect the Tag received with the request. The requester uses them to associate inbound completion with an outstanding request. |
| Requester ID [15:0] | Byte 9 Bit 7:0; Byte 8 Bit 7:0 | Copied from the request into this field to be used in routing the completion back to the original requester. Byte 8, 7:0 = Requester Bus #; Byte 9, 7:3 = Requester Device #; Byte 9, 2:0 = Requester Function # |
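For a memory read, the initial Byte Count follows from the request Length and the two Byte Enable fields: disabled bytes at the start of the first DW and at the end of the last DW are not counted. The sketch below is written from that rule as I understand it rather than copied from the source table, so treat it as an assumption to verify against the specification; `initial_byte_count` is a hypothetical name.

```python
def initial_byte_count(length_dw, first_be, last_be):
    """Byte Count for the first completion of a memory read request."""
    if length_dw == 1:
        if first_be == 0:
            return 1                               # zero-length read still counts 1
        lowest = (first_be & -first_be).bit_length() - 1
        highest = first_be.bit_length() - 1
        return highest - lowest + 1                # span of the enabled bytes
    skipped_front = (first_be & -first_be).bit_length() - 1   # disabled low bytes
    skipped_back = 4 - last_be.bit_length()                   # disabled high bytes
    return length_dw * 4 - skipped_front - skipped_back
```

With the convention used in this document (First BE = 1111b, Last BE = 1111b or 0000b), the result is simply Length x 4.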

Data Returned For Read Requests:

  1. Completions for read requests may be broken into multiple completions, but total data transfer must equal size of original request
  2. Completions for multiple requests may not be combined

One Read may have multiple Completions, but multiple Reads may not share one Completion.

  3. IO and Configuration reads are always 1 DW, so will always be satisfied with a single completion

IO and Configuration reads are always 1 DW, so they always get a single Completion.

  4. A completion with a Status Code other than SC (successful completion) terminates a transaction.

Any Status Code other than SC terminates the transaction.

  5. The Read Completion Boundary (RCB) must be observed when handling a read request with multiple completions. The RCB is 64 bytes or 128 bytes for the root complex; the value used should be visible in a configuration register.
  6. Bridges and endpoints may implement a bit for selecting the RCB size (64 or 128 bytes) under software control.
  7. Completions that do not cross an aligned RCB boundary must complete in one transfer.

Read Completion Boundary (RCB):

This part is a little complicated. Quoting an engineer on the Xilinx forums (see: http://forums.xilinx.com/xlnx/board/message?board.id=PCIe&thread.id=306):

If you are returning a completion in response to a memory read request, you are not required to break up the completion. It's optional, but if you do need to return the completion in multiple packets, you must break the completion on the RCB boundaries. The maximum size of any completion with data packet is bounded by the MPS setting in your device control register.

If you are generating memory read request to the host, then the maximum size of your read is bounded by the Max Read Request size value found in the device control register. But remember that you must account for the fact that even though you send one read request, it may result in multiple completions coming back from the host based on the RCB.

The Read Completion Boundary is defined in Section 7.8.7 - Table 7-15 of the PCIe Base Specification v1.1. Table 7-15 defines the Link Control Register for both Endpoint and Root ports and Bit 3 within this register dictates whether the RCB is 64 bytes or 128 bytes. For root ports, this bit is RO and indicates the actual RCB value. For endpoints, this bit is RW and will be set by configuration software to indicate the RCB value of the Root Port. This bit would have to be changed from the SW side assuming that the Chipset supports 128B RCB. Section 2.3.1.1 also provides more information on RCB.

In other words: for read packets coming from the CPU, we may reply with one large packet. For read packets we send to the Memory Controller, the other side may reply with many packets whose lengths are based on the RCB. But if a Memory Read we send to another board is answered with one large packet, will it be handled correctly? This needs an experiment.

Addendum: experiments show that Cpl packets larger than the RCB cannot be sent.

  8. Multiple completions for a single read request must return data in increasing address order.

The split completion packets must transfer data in ascending address order.
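Taken together, the rules above and the experimental note suggest a conservative splitting strategy: end every completion at the next RCB boundary and emit the pieces in ascending address order. The sketch below follows that strategy; it assumes byte-granular addresses and a default RCB of 64 bytes, and `split_completion` is a hypothetical name.

```python
def split_completion(address, nbytes, rcb=64):
    """Split read data into completions that each stop at an RCB boundary.

    Returns a list of (start_address, payload_bytes, byte_count), where
    byte_count is the number of bytes still owed including this piece.
    """
    if rcb not in (64, 128):
        raise ValueError("RCB is 64 or 128 bytes")
    pieces = []
    remaining = nbytes
    while remaining > 0:
        to_boundary = rcb - (address % rcb)    # bytes up to the next RCB boundary
        n = min(remaining, to_boundary)
        pieces.append((address, n, remaining))
        address += n                           # ascending address order
        remaining -= n
    return pieces
```

Each piece's third element is the value that would go into the Byte Count field of that completion, and the low 7 bits of its start address would go into Lower Address.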

Calculating The Lower Address Field (Byte 11, bits 6:0)

Refer to the Lower Address field in Table 4-9 on page 185. The Lower Address field is set up by the completer during completions with data (CplD) to reflect the address of the first enabled byte of data being returned in the completion payload. This must be calculated in hardware by considering both the DW start address and the byte enable pattern in the First DW Byte Enable field provided in the original request. Basically, the address is an offset from the DW start address:

  • If the First DW Byte Enable field is 1111b, all bytes are enabled in the first DW and the offset is 0. The byte start address is = DW start address.
  • If the First DW Byte Enable field is 1110b, the upper three bytes are enabled in the first DW and the offset is 1. The byte start address is = DW start address + 1.
  • If the First DW Byte Enable field is 1100b, the upper two bytes are enabled in the first DW and the offset is 2. The byte start address is = DW start address + 2.
  • If the First DW Byte Enable field is 1000b, only the upper byte is enabled in the first DW and the offset is 3. The byte start address is = DW start address + 3.

Once calculated, the lower 7 bits are placed in the Lower Address field of the completion header in the event the start address was not aligned on a Read Completion Boundary (RCB) and the read completion must break off at the first RCB. Knowledge of the RCB is necessary because breaking a transaction must be done on RCBs which are based on start address, not transfer size.
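The four cases listed above amount to counting the disabled low-order bytes in the First DW Byte Enable field. A minimal sketch, with `lower_address` as a hypothetical name; treating an all-zero Byte Enable as offset 0 is an assumption beyond the four listed cases.

```python
def lower_address(dw_address, first_be):
    """Lower Address [6:0] for the first completion of a read request."""
    if first_be == 0:
        offset = 0                                        # assumed: no bytes enabled
    else:
        offset = (first_be & -first_be).bit_length() - 1  # index of lowest enabled byte
    return (dw_address + offset) & 0x7F                   # keep the low 7 bits
```

For example, a DW start address of 0x1040 with First BE = 1110b yields a Lower Address of 0x41, and with First BE = 1111b it yields 0x40.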
