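T/s is the number of transfers per second (Transfers per second). The effective bandwidth has to be worked out together with the line encoding, because the T/s figure is the raw rate on the wire and still includes the encoding overhead. Per lane, per direction:
- 2.5 GT/s with 8b/10b encoding (every 10 bits transferred carry 8 bits of data; the other 2 bits are encoding overhead rather than check bits): effective bandwidth 2 Gbps.
- 5 GT/s with 8b/10b encoding: effective bandwidth 4 Gbps.
- 8 GT/s with 128b/130b encoding (every 130 bits transferred carry 128 bits of data plus a 2-bit sync header): effective bandwidth about 7.88 Gbps, usually rounded to 8 Gbps.
Up to 256 Bus Numbers can be assigned by configuration software. The initial Bus Number, Bus 0, is typically assigned by hardware to the Root Complex. Bus 0 consists of a Virtual PCI bus with integrated endpoints and Virtual PCI‐to‐PCI Bridges (P2P), which are hard‐coded with a Device number and Function number.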
Configuration software begins assigning bus numbers by searching for bridges, starting with Bus 0, Device 0, Function 0. When a bridge is found, software assigns the bus beneath it a bus number that is unique and larger than the number of the bus the bridge lives on, then continues the search on that new bus (a depth-first search).
Up to 32 devices can be attached to a single PCI bus. However, the point‐to‐point nature of PCIe means only a single device can be attached directly to a PCIe link, and that device will always end up being Device 0.
Each device can contain up to 8 Functions, numbered 0‐7, that all share the bus interface for that device. The Functions of a multi‐function device do not need to be implemented sequentially.
TODO: what's the difference between a Switch and a Bridge?
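A PCIe Switch and a Bridge are two different kinds of device in a PCIe system. They differ in the following ways:
- Function: a PCIe Switch is a multi-port device that connects several PCIe devices and routes packets between them, so several devices can communicate at the same time.
- Packet forwarding: a PCIe Switch forwards each packet to the correct port based on the target's address, which lets devices talk to each other directly. A PCIe Bridge only passes packets from the bus on one side to the bus on the other side.
- Scalability: because a Switch has multiple ports, it is typically used to fan out to many devices, for example in large systems or data centers. A Bridge connects exactly two buses and is typically found in smaller systems or single-board computers.
- Performance: a Switch usually offers higher aggregate bandwidth, since each port has its own link. A Bridge serves simpler, smaller-scale transfers.
In short, a PCIe Switch is a multi-port device with routing capability, used to connect multiple PCIe devices, while a PCIe Bridge connects two buses and is functionally simpler. Note that, to configuration software, a Switch itself appears as a set of virtual PCI-to-PCI bridges: one for the upstream port and one for each downstream port.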
What is the virtual PCI bus?
Why do we need PCIe configuration?
Each PCIe Function has a configuration space that host software uses to obtain information about the device and to configure it.
Only the Root sends Configuration Requests. Why?
- (1) The ability to originate configuration transactions is restricted to the processor, acting through the Root Complex, to avoid the anarchy (chaos) that could result if any device had the ability to change the configuration of other devices.
- (2) Processors are generally unable to perform configuration read and write requests directly, because they can only generate memory and IO requests.
- From these two reasons we can conclude that the Root Complex needs to translate certain of those accesses (memory and IO requests) into configuration requests.
There are two ways to generate configuration accesses: the legacy PCI configuration mechanism, using IO‐indirect accesses, and the enhanced configuration mechanism, using memory‐mapped accesses.
The legacy PCI configuration mechanism, using IO‐indirect accesses
This mechanism uses two 32‐bit IO ports: the Configuration Address Port, at IO addresses 0CF8h ‐ 0CFBh, and the Configuration Data Port, at IO addresses 0CFCh ‐ 0CFFh.
The Configuration Address Port only latches information when the processor performs a full 32‐bit write to the port, and a 32‐bit read from the port returns its contents. Bit [31] must be set to 1b to enable translation of the subsequent IO access to the Configuration Data Port into a configuration access. If bit 31 is zero and an IO read or write is sent to the Configuration Data Port, the transaction is treated as an ordinary IO Request.
mov dx,0CF8h ;set dx = config address port address
mov eax,80040000h ;enable=1, bus 4, dev 0, func 0, DW 0
out dx,eax ;IO write to set up address port
mov dx,0CFCh ;set dx = config data port address
in ax,dx ;2-byte read from config data port
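The value written to the Configuration Address Port in the assembly above can be built up field by field. The following is a minimal Python sketch, assuming the usual layout of that port (bit 31 enable, bits 23:16 bus, bits 15:11 device, bits 10:8 function, bits 7:2 dword number). The function name `config_address` is made up for illustration and only computes the value; it does not perform any IO.

```python
def config_address(bus, device, function, register_offset, enable=True):
    """Build the 32-bit value for the Configuration Address Port (0CF8h).

    Assumed layout: bit 31 = enable, bits 23:16 = bus, bits 15:11 = device,
    bits 10:8 = function, bits 7:2 = dword number, bits 1:0 = 0.
    """
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register_offset < 256:
        raise ValueError("legacy mechanism only reaches the first 256 bytes")
    value = (bus << 16) | (device << 11) | (function << 8)
    value |= register_offset & 0xFC      # dword-aligned register offset
    if enable:
        value |= 1 << 31                 # bit 31: translate next data-port access
    return value


# Same target as the assembly example: bus 4, device 0, function 0, dword 0.
print(hex(config_address(4, 0, 0, 0)))
```

The byte or word actually read is then chosen by which bytes of the Configuration Data Port the processor accesses, as the 2-byte `in ax,dx` does.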
When a Request is seen, the bridge evaluates whether the target bus number (trgtBusnum) is within the range of bus numbers downstream, from the value of the Secondary Bus Number (sBusnum) to the Subordinate Bus Number (subBusnum).
if (trgtBusnum == sBusnum)
The Request targets the bridge's secondary bus and is passed through as a Type 0 Configuration Request.
else if ((trgtBusnum > sBusnum) && (trgtBusnum <= subBusnum))
The Request is forwarded as a Type 1 Configuration Request on the bridge's secondary bus.
For the Host/PCI bridge: if bit 31 of the Configuration Address Port is set to 1b and the target bus is within the downstream range of bus numbers, the bridge translates a subsequent processor access targeting its Configuration Data Port into a configuration request on bus 0.
The enhanced configuration mechanism, using memory‐mapped accesses
Rather than try to conserve address space, the spec writers chose to create a single‐step, uninterruptible process by mapping all of configuration space into memory addresses.
Mapping 4KB per Function for all the possible implementations requires allocating 256MB (256 buses * 32 devices * 8 Functions * 4KB) of memory address space.
To handle this mapping, each Function’s 4KB configuration space starts at a 4KB‐aligned address within the 256MB memory address space set aside for configuration access, and the address bits now carry the identifying information about which Function is targeted.
Example Enhanced Configuration Access
mov ax,[E0400000h] ;memory-mapped Config read
• Address bits 63:28 indicate the upper 36 bits of the 256MB‐aligned base address of the overall Enhanced Configuration address range (in this case, 00000000 E0000000h).
• Address bits 27:20 select the target bus (in this case, 4).
• Address bits 19:15 select the target device (in this case, 0) on the bus.
• Address bits 14:12 select the target Function (in this case, 0) within the device.
• Address bits 11:2 select the target dword (in this case, 0) within the selected Function’s configuration space.
• Address bits 1:0 define the start byte location within the selected dword (in this case, 0).
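The bit fields listed above translate directly into shifts and masks. Here is a small Python sketch that composes and decodes an enhanced-configuration address. The base `E000_0000h` is taken from the example and is platform-specific in practice; the names `ecam_address` and `decode_ecam` are illustrative, not part of any real API.

```python
ECAM_BASE = 0xE000_0000              # base from the example; differs per platform
ECAM_SIZE = 256 * 32 * 8 * 4096      # buses * devices * functions * 4KB


def ecam_address(bus, device, function, offset, base=ECAM_BASE):
    """Memory address of a byte in a Function's 4KB configuration space."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 4096:
        raise ValueError("offset must be within the 4KB configuration space")
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)


def decode_ecam(address, base=ECAM_BASE):
    """Split an address inside the 256MB window back into its fields."""
    rel = address - base
    if not 0 <= rel < ECAM_SIZE:
        raise ValueError("address is outside the configuration window")
    return {
        "bus": (rel >> 20) & 0xFF,       # address bits 27:20
        "device": (rel >> 15) & 0x1F,    # address bits 19:15
        "function": (rel >> 12) & 0x7,   # address bits 14:12
        "dword": (rel >> 2) & 0x3FF,     # address bits 11:2
        "byte": rel & 0x3,               # address bits 1:0
    }


print(hex(ecam_address(4, 0, 0, 0)))     # the address used in the example
print(decode_ecam(0xE040_0000))
print(ECAM_SIZE // (1024 * 1024), "MB")
```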
A configuration Request of one of two types, Type 0 or Type 1, may be generated by bridges in response to a configuration access.
Type 0: the target is on the bridge's secondary bus, and the targeted Function handles the Request:
if (trgtBusnum == sBusnum) {
if (trgtDevnum == Devnum) {
if (trgtFuncnum == Funcnum) {
The Function uses the Register Number field of the Request to select the target dword in its configuration space, and uses the First Dword Byte Enable field to select which bytes to read or write within the selected dword.
}
}
}
Type 1: non-bridge devices ignore Type 1 Requests since the target resides on a different bus, but bridges that see one will make the same comparison of the target bus number to the range of buses downstream:
if (trgtBusnum == sBusnum) {
The Request is converted to a Type 0 Request on the secondary bus and handled by the targeted Function as described above.
}
else if ((trgtBusnum > sBusnum) && (trgtBusnum <= subBusnum)) {
The Request is forwarded as a Type 1 configuration Request on the bridge’s secondary bus.
}
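The bus-number comparison a bridge makes can be written as a small function. This is only a sketch of the decision in the pseudocode above; the function name and the returned strings are made up for illustration.

```python
def route_config_request(target_bus, secondary_bus, subordinate_bus):
    """Decide what a bridge does with a configuration Request seen on its
    primary side, given its Secondary and Subordinate Bus Number registers."""
    if target_bus == secondary_bus:
        return "type0"      # target is on the bus directly below the bridge
    if secondary_bus < target_bus <= subordinate_bus:
        return "type1"      # target is further downstream: forward unchanged
    return "ignore"         # target is not beneath this bridge


# A bridge with Secondary = 1 and Subordinate = 4:
for bus in (1, 3, 9):
    print(bus, route_config_request(bus, secondary_bus=1, subordinate_bus=4))
```

Each bridge on the path repeats the same test, so a Request stays Type 1 until it reaches the bridge whose secondary bus is the target bus.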
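Enumeration software assumes that the Root Complex contains a Host/PCI bridge and that bus number 0 will be on the secondary side of that bridge.
In PCI, a configuration read of the Vendor ID register that no device claimed returned all 1s (FFFFh). If enumeration software saw that result for the read, it understood that the device wasn’t present.
In PCIe, a configuration read that targets a non-existent device is answered with a Completion that reports an error status, and the Root Complex returns all 1s (FFFFh) to the processor for the data when this Completion is seen during enumeration.
Software must also give devices time to become ready after reset:
if (datRate <= 5.0GT/s)
Configuration software waits 100ms after reset before sending a Configuration Request.
else if (datRate > 5.0GT/s)
Configuration software waits 100ms after Link training completes before sending a Configuration Request.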
The lower 7 bits (6:0) of the Header Type register (offset 0Eh in the configuration space header) identify the basic category of the Function, and three values are defined:
• 0 = not a bridge (Endpoint in PCIe)
• 1 = PCI‐to‐PCI bridge (abbreviated as P2P) connecting two buses
• 2 = CardBus bridge (legacy interface not often used today)
Bit 7 of the Header Type register shows whether the device is a single‐function device (0) or a multi‐function device (1).
Figure 3‐13 (the picture above) illustrates an example system after the buses and devices have been enumerated. The enumeration proceeds as follows:
1. Software updates the Host/PCI bridge Secondary Bus Number to zero and
the Subordinate Bus Number to 255. Setting this to the max value means
that it won’t have to be changed again until all the bus numbers downstream
have been identified. For the moment, buses 0 through 255 are identified
as being downstream.
2. Starting with Device 0 (bridge A), the enumeration software attempts to
read the Vendor ID from Function 0 in each of the 32 possible devices on
bus 0. If a valid Vendor ID is returned from Bus 0, Device 0, Function 0, the
device exists and contains at least one Function. If not, go on to probe bus 0,
device 1, Function 0.
3. The Header Type field in this example (Figure 3‐12 on page 108) contains
the value one (01h) indicating this is a PCI‐to‐PCI bridge. The Multifunction
bit (bit 7) in the Header Type register is 0, indicating that Function 0 is the
only Function in this bridge. The spec doesn’t preclude implementing multiple
Functions within this Device and each of these Functions, in turn, could represent
other virtual PCI‐to‐PCI bridges or even non‐bridge functions.
4. Now that software has found a bridge, it performs a series of configuration
writes to set the bridge’s bus number registers as follows:
• Primary Bus Number Register = 0
• Secondary Bus Number Register = 1
• Subordinate Bus Number Register = 255
The bridge is now aware that the number of the bus directly attached
downstream is 1 (Secondary Bus Number = 1) and that the largest bus number
downstream of it is 255 (Subordinate Bus Number = 255).
5. Enumeration software must perform a depth‐first search. Before proceeding
to discover additional Devices/Functions on bus 0, it must proceed to
search bus 1.
6. Software reads the Vendor ID of Bus 1, Device 0, Function 0, which targets
bridge C in our example. A valid Vendor ID is returned, indicating that
Device 0, Function 0 exists on Bus 1.
7. The Header Type field in the Header register contains the value one
(0000001b) indicating another PCI‐to‐PCI bridge. As before, bit 7 is a 0, indicating that bridge C is a single‐function device.
8. Software now performs a series of configuration writes to set bridge C’s bus
number registers as follows:
• Primary Bus Number Register = 1
• Secondary Bus Number Register = 2
• Subordinate Bus Number Register = 255
9. Continuing the depth‐first search, a read is performed from bus 2, device 0,
Function 0’s Vendor ID register. The example assumes that bridge D is
Device 0, Function 0 on Bus 2.
10. A valid Vendor ID is returned, indicating bus 2, device 0, Function 0 exists.
11. The Header Type field in the Header register contains the value one
(0000001b) indicating that this is a PCI‐to‐PCI bridge, and bit 7 is a 0, indicating
that bridge D is a single‐function device.
12. Software now performs a series of configuration writes to set bridge D’s bus
number registers as follows:
• Primary Bus Number Register = 2
• Secondary Bus Number Register = 3
• Subordinate Bus Number Register = 255
13. Continuing the depth‐first search, a read is performed from bus 3, device 0,
Function 0’s Vendor ID register.
14. A valid Vendor ID is returned, indicating bus 3, device 0, Function 0 exists.
15. The Header Type field in the Header register contains the value zero
(0000000b) indicating that this is an Endpoint function. Since this is an endpoint
and not a bridge, it has a Type 0 header and there are no PCI‐compatible
buses beneath it. This time, bit 7 is a 1, indicating that this is a
multifunction device.
16. Enumeration software performs accesses to the Vendor ID of all 8 possible
functions in bus 3, device 0 and determines that only Function 1 exists in
addition to Function 0. Function 1 is also an Endpoint (Type 0 header), so
there are no additional buses beneath this device.
17. Enumeration software continues scanning across on bus 3 to look for valid
functions on devices 1 ‐ 31 but does not find any additional functions.
18. Having found every function there was to find downstream of bridge D,
enumeration software updates bridge D, with the real Subordinate Bus
Number of 3. Then it backs up one level (to bus 2) and continues scanning
across on that bus looking for valid functions. The example assumes that
bridge E is device 1, Function 0 on bus 2.
19. A valid Vendor ID is returned, indicating that this Function exists.
20. The Header Type field in bridge E’s Header register contains the value one
(0000001b) indicating that this is a PCI‐to‐PCI bridge, and bit 7 is a 0, indicating a single‐function device.
21. Software now performs a series of configuration writes to set bridge E’s bus
number registers as follows:
• Primary Bus Number Register = 2
• Secondary Bus Number Register = 4
• Subordinate Bus Number Register = 255
22. Continuing the depth‐first search, a read is performed from bus 4, device 0,
Function 0’s Vendor ID register.
23. A valid Vendor ID is returned, indicating that this Function exists.
24. The Header Type field in the Header register contains the value zero
(0000000b) indicating that this is an Endpoint device, and bit 7 is a 0, indicating that this is a single‐function device.
25. Enumeration software scans bus 4 to look for valid functions on devices 1 ‐
31 but does not find any additional functions.
26. Having reached the bottom of this tree branch, enumeration software
updates the bridge above that bus, E in this case, with the real Subordinate
Bus Number of 4. It then backs up one level (to bus 2) and moves on to read
the Vendor ID of the next device (device 2). The example assumes that
devices 2 ‐ 31 are not implemented on bus 2, so no additional devices are
discovered on bus 2.
27. Enumeration software updates the bridge above bus 2, C in this case, with
the real Subordinate Bus Number of 4 and backs up to the previous bus
(bus 1) and attempts to read the Vendor ID of the next device (device 1). The
example assumes that devices 1 ‐ 31 are not implemented on bus 1, so no
additional devices are discovered on bus 1.
28. Enumeration software updates the bridge above bus 1, A in this case, with
the real Subordinate Bus Number of 4, then backs up to the previous bus
(bus 0) and moves on to read the Vendor ID of the next device (device 1).
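The steps above can be condensed into a short recursive routine. The following Python sketch simulates the depth-first walk over a hand-built model of the part of the example that the steps describe (bridges A, C, D and E). The `Function` class, the tree literal and all names are assumptions made for illustration; a real enumerator would issue configuration reads and writes instead of walking Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Function:
    name: str
    header_type: int                  # Header Type register: bit 7 + bits 6:0
    downstream: dict = field(default_factory=dict)  # bridges: dev -> {fn: Function}
    primary: int = 0
    secondary: int = 0
    subordinate: int = 0


def enumerate_bus(devices, bus, found):
    """Scan `bus` depth-first; return the highest bus number in use at or below it."""
    highest = bus
    for dev in range(32):
        functions = devices.get(dev, {})
        if 0 not in functions:        # models a Vendor ID read returning FFFFh
            continue
        multi = bool(functions[0].header_type & 0x80)
        for fn in (range(8) if multi else range(1)):
            node = functions.get(fn)
            if node is None:
                continue
            found.append((bus, dev, fn, node.name))
            if (node.header_type & 0x7F) == 0x01:    # PCI-to-PCI bridge
                node.primary = bus
                node.secondary = highest + 1         # next unused bus number
                node.subordinate = 255               # placeholder while scanning
                highest = enumerate_bus(node.downstream, node.secondary, found)
                node.subordinate = highest           # the real value
    return highest


# Hypothetical model of the example: A -> C -> (D, E), endpoints below D and E.
bridge_d = Function("D", 0x01, {0: {0: Function("ep3.0", 0x80),
                                    1: Function("ep3.1", 0x80)}})
bridge_e = Function("E", 0x01, {0: {0: Function("ep4.0", 0x00)}})
bridge_c = Function("C", 0x01, {0: {0: bridge_d}, 1: {0: bridge_e}})
bridge_a = Function("A", 0x01, {0: {0: bridge_c}})

found = []
highest_bus = enumerate_bus({0: {0: bridge_a}}, 0, found)
for b in (bridge_a, bridge_c, bridge_d, bridge_e):
    print(b.name, b.primary, b.secondary, b.subordinate)
```

Writing 255 before descending and the real value afterwards mirrors what the steps do with the Subordinate Bus Number, so requests can reach every bus below the bridge while the scan is still in progress.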
During enumeration of the left‐hand tree structure in Figure 3‐14 on page 116,
the Host/PCI bridge in the secondary Root Complex ignores all configuration
accesses because the targeted bus number is no greater than 9. Note that,
although detected and numbered, Bus 8 has no device attached. Once that enumeration process has been completed, the enumeration software takes the following steps to enumerate the secondary Root Complex:
1. The enumeration software changes the Secondary and Subordinate Bus
Number values in the secondary Root Complex’s Host/PCI bridge to bus 64
in this example. (The values of 64 and 128 are commonly used as the starting
bus number in multi‐root systems, but this is just a software convention.
There are no PCI or PCIe rules requiring that configuration. There would be
nothing wrong with starting the secondary Root Complex’s bus numbers at 10 in this example.)
2. Enumeration software then starts searching on bus 64 and discovers the
bridge attached to the downstream Root Port.
3. A series of configuration writes are performed to set its bus number registers
as follows:
• Primary Bus Number Register = 64
• Secondary Bus Number Register = 65
• Subordinate Bus Number Register = 65
The bridge is now aware that the number of the bus directly attached to its
downstream side is 65 (Secondary Bus Number = 65) and the number of the
bus farthest downstream of it is 65 (Subordinate Bus Number = 65).
4. Device 0 is discovered on Bus 65; it implements only Function 0, and
further searching reveals no other Devices are present on Bus 65, so the
search process moves back up one Bus level.
5. Enumeration continues on bus 64 and no additional devices are discovered,
so the Host/PCI’s Subordinate Bus Number is updated to 65.
6. This completes the enumeration process.
MindShare Arbor is a computer system debug, validation, analysis and learning tool that allows the user to read and write any memory, IO or configuration space address. It may not be free :-(.
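- PCIe supports the same address spaces that were supported in PCI (see chapter 2.1).
- The locations inside a device that software accesses are typically control, status or pointer registers.
- Devices are expected to map their registers into memory address space (MMIO), while legacy software is still allowed to access the internal registers of devices using IO address space.
- For each request a device makes, system software assigns an address range of the appropriate type (IO, NP‐MMIO or P‐MMIO) to that device.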
- This is all accomplished through the Base Address Registers (BARs) in the header of configuration space: system software assigns each device an address range of suitable size and of the appropriate type by programming the BARs.
How does the device provide this information to system software?
- The device designer knows the collective size of the internal registers/storage that should be accessible via IO or MMIO.
- The device designer also knows how the device will behave when those registers are accessed. This determines whether prefetchable MMIO (reads have no side‐effects) or non‐prefetchable MMIO (reads do have side‐effects) should be requested.
- The device designer hard‐codes the lower bits of the BARs to certain values indicating the type and size of the address space being requested.
The lower bits of a BAR indicate the size and type of address space being requested by a device. In this example, as shown in Figure 4‐5, BAR1 and BAR2 are being used to request a 64MB block of prefetchable memory address space.
Two sequential BARs are being used here because the device supports a 64‐bit address for this request, meaning that software can allocate the requested address space above the 4GB address boundary if it wants to.
Figure 4‐5 shows the uninitialized state of the BAR pair. The device designer has hard‐coded the lower bits of the lower BAR (BAR1 in our example) to indicate the request type and size, while the bits of the upper BAR (BAR2) are all read‐write.
System software’s first step is to write all 1s to every BAR.
System software’s next step is to read back the next BAR (BAR1) and evaluate it to see whether the device is requesting additional address space. Once BAR1 is read, software sees that more address space is being requested and that this request is for prefetchable memory address space that can be allocated anywhere in the 64‐bit address range. Since the BAR supports a 64‐bit address, the next sequential BAR (BAR2 in this case) is treated as the upper 32 bits of BAR1.
System software’s final step is to allocate an address range to the BARs (2_4000_0000h ‐ 2_43FF_FFFFh in this example).
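Summary: system software first writes all 1s to the BAR and then reads it back. Any bit that does not take the written value is hard‐coded by the designer, and from those bits software can work out how much address space is being requested. In this example, the low 26 bits do not change after the write (bits 25:4 still read as 0, and bits 3:0 hold the hard‐coded type encoding), so the requested address space is 2^26 bytes (64MB).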
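The sizing rule can be checked with a few lines of Python. This sketch decodes the value read back from a BAR after all 1s were written, assuming the standard BAR layout (bit 0: IO or memory; for memory BARs, bits 2:1: 32‐bit or 64‐bit, bit 3: prefetchable). The readback value `FC00_000Ch` is not quoted in these notes; it is derived from the description of a 64MB, prefetchable, 64‐bit request. The function name is illustrative.

```python
def decode_bar(low, high=0):
    """Decode a BAR from the value read back after writing all 1s.

    `low` is the readback of the BAR itself, `high` the readback of the next
    BAR, which is only used when `low` announces a 64-bit memory request.
    Returns None for an unimplemented BAR (all bits hard-coded to 0).
    """
    if low == 0:
        return None
    if low & 0x1:                                  # bit 0 = 1: IO space
        mask = low & 0xFFFF_FFFC
        return {"space": "io", "size": mask & -mask}
    is_64bit = ((low >> 1) & 0x3) == 0b10          # bits 2:1 = 10b: 64-bit
    mask = low & 0xFFFF_FFF0                       # drop the type bits 3:0
    if is_64bit:
        mask |= high << 32
    return {
        "space": "memory",
        "bits": 64 if is_64bit else 32,
        "prefetchable": bool(low & 0x8),           # bit 3
        # The lowest writable address bit gives the size of the request.
        "size": mask & -mask,
    }


info = decode_bar(0xFC00_000C, 0xFFFF_FFFF)
print(info, info["size"] // (1024 * 1024), "MB")
```

Taking the lowest set bit of the address mask is equivalent to the common "invert and add one" calculation, and it also copes with BARs whose upper bits are not implemented.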
In the example, the IO address range that gets allocated is 4000h ‐ 40FFh.
For BARs that are not used, the device designer must hard‐code all bits to 0s (BAR4 and BAR5 in the example).
Each bridge (including switch ports and root complex ports) needs to know what address ranges live beneath it so it can determine which requests should be forwarded from its primary interface (upstream side) to its secondary interface (downstream side). It is the Base and Limit registers in the Type 1 headers that are programmed with the range of addresses that live beneath this bridge.
If the Prefetchable Memory space is 2_4000_0000h ‐ 2_43FF_FFFFh, how should the Prefetchable Memory Base and Limit fields in the Type 1 header be filled in?
- Base: write the upper 32 bits (bit 63 ~ bit 32) of the base address into Prefetchable Memory Base Upper 32 Bits, then write bits 31 ~ 20 of the base address into bits 15 ~ 4 of Prefetchable Memory Base.
- Limit: write the upper 32 bits (bit 63 ~ bit 32) of the limit address into Prefetchable Memory Limit Upper 32 Bits, then write bits 31 ~ 20 of the limit address into bits 15 ~ 4 of Prefetchable Memory Limit.
If the Non-Prefetchable Memory space is F900_0000h ‐ F90F_FFFFh, how should the Non-Prefetchable Memory Base and Limit fields in the Type 1 header be filled in?
- Base: write bits 31 ~ 20 of the base address into bits 15 ~ 4 of Non-Prefetchable Memory Base.
- Limit: write bits 31 ~ 20 of the limit address into bits 15 ~ 4 of Non-Prefetchable Memory Limit.
If the IO space is 4000h ‐ 4FFFh, how should the IO Base and Limit fields in the Type 1 header be filled in?
- Base uses IO Base and IO Base Upper 16 Bits. First write bits 31 ~ 16 of the base address 4000h into IO Base Upper 16 Bits, then put bits 15 ~ 12 of 4000h into bits 7 ~ 4 of IO Base.
- Limit uses IO Limit and IO Limit Upper 16 Bits. First write bits 31 ~ 16 of the limit address 4FFFh into IO Limit Upper 16 Bits, then put bits 15 ~ 12 of 4FFFh into bits 7 ~ 4 of IO Limit.
A device that receives traffic on one of its ports makes one of three decisions:
- Accept the traffic and use it internally.
- Forward the traffic to the appropriate outbound (egress) port.
- Reject the traffic because it is neither the intended target, nor an interface to it.
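For address-routed memory TLPs, a bridge makes this three-way decision by comparing the address against its own BARs and against the Base/Limit windows in its Type 1 header. The Python sketch below first packs the example ranges used in these notes into Base/Limit register values and then applies the accept / forward / reject rule. The function names are made up for illustration, and the low-nibble values (1h meaning 64-bit prefetchable or 32-bit IO addressing is supported) are an assumption about what the bridge implements.

```python
def prefetchable_window(base, limit):
    """(Base, Limit, Base Upper 32 Bits, Limit Upper 32 Bits) register values."""
    low_nibble = 0x1      # read-only field: 1h = 64-bit addressing (assumed)
    return (
        (((base >> 20) & 0xFFF) << 4) | low_nibble,    # address bits 31:20
        (((limit >> 20) & 0xFFF) << 4) | low_nibble,
        base >> 32,
        limit >> 32,
    )


def memory_window(base, limit):
    """(Memory Base, Memory Limit) for the non-prefetchable window."""
    return ((base >> 20) & 0xFFF) << 4, ((limit >> 20) & 0xFFF) << 4


def io_window(base, limit, io32=True):
    """(IO Base, IO Limit, IO Base Upper 16 Bits, IO Limit Upper 16 Bits)."""
    low_nibble = 0x1 if io32 else 0x0   # read-only field: 1h = 32-bit IO (assumed)
    return (
        (((base >> 12) & 0xF) << 4) | low_nibble,      # address bits 15:12
        (((limit >> 12) & 0xF) << 4) | low_nibble,
        (base >> 16) & 0xFFFF,
        (limit >> 16) & 0xFFFF,
    )


def route_memory_tlp(address, own_ranges, downstream_windows):
    """Three-way decision for a TLP arriving on the upstream side.
    Both range lists hold inclusive (base, limit) pairs."""
    if any(lo <= address <= hi for lo, hi in own_ranges):
        return "accept"
    if any(lo <= address <= hi for lo, hi in downstream_windows):
        return "forward"
    return "reject"


print([hex(v) for v in prefetchable_window(0x2_4000_0000, 0x2_43FF_FFFF)])
print([hex(v) for v in memory_window(0xF900_0000, 0xF90F_FFFF)])
print([hex(v) for v in io_window(0x4000, 0x4FFF)])

windows = [(0x2_4000_0000, 0x2_43FF_FFFF), (0xF900_0000, 0xF90F_FFFF)]
print(route_memory_tlp(0x2_4000_1000, [], windows))
print(route_memory_tlp(0x1000, [], windows))
```

Because only address bits 31:20 are stored, memory windows are aligned to and sized in multiples of 1MB; the IO window, which stores bits 15:12, works in 4KB units.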
Ordered Sets and DLLPs are local to a link and thus are never routed to another link. TLPs can and do move from link to link, based on routing information contained in the packet headers.
Devices with multiple ports, namely Root Complexes and Switches, can forward TLPs between their ports and are sometimes called Routing Agents or Routing Elements. They accept TLPs that target internal resources and forward TLPs between ingress and egress ports.
Endpoints have only one Link and never expect to see ingress traffic other than what is targeting them. They simply accept or reject incoming TLPs.
Why Messages?
Message transactions were introduced with PCIe. The main reason for adding Messages as a packet type was to pursue the PCIe design goal to drastically reduce the number of sideband signals implemented in PCI (e.g. interrupt pins, error pins, power management signals, etc.). Consequently, most of the sideband signals were replaced with in‐band packets in the form of Message TLPs.
Implicit routing takes advantage of the fact that Switches and other routing elements understand the concept of upstream and downstream, and that the Root Complex is found at the top of the topology while Endpoints are found at the bottom.
With split transactions, a target device can receive one or more requests and then respond to each request with a separate completion. This is a significant improvement over the PCI bus protocol, which used wait‐states or delayed transactions (retries) to deal with latencies in accessing targets.