Reposted from "5-considerations-25g-vs-40g-ethernet-debate," published June 9, 2017.
I have recently heard some self-styled industry experts declare that 40G Ethernet is dead and that everyone must jump to the 25/50/100G path. That conclusion is crude and misleading. To see the question clearly, consider the following five points:
First, be clear about whether we are talking about server uplinks or switch-to-switch interconnects. So far, 25G is used only for server uplinks; I have yet to see switches interconnected at 25G. Cisco does not even offer a 25G optical transceiver, and HPE's 25G transceiver is priced higher than its 40G/100G ones. In other words, only someone out of their mind would use 25G for switch interconnects today. That may change next year, but that is a separate matter.
Second, look at the switches on the market. Cisco's 93180YC-EX, for example, supports 10G and 25G per port at roughly the price of the older 10G products, and the whole platform is backward compatible: each SFP28 port supports 1G/10G/25G, and each QSFP28 port supports 10G/25G/40G/50G/100G.
NICs also matter. Servers are only beginning to get 25G interfaces: Intel's 25G XXV710 NIC has just started shipping, priced about the same as the 40G XL710 and slightly above the 10G X710. Since 2016, HPE servers have been offered with Mellanox's 640SFP28 NIC, which is better value than a 10G NIC. Cisco's C-Series Rack-Mount Servers are also starting to support 25G NICs, priced between the 10G VIC 1225 and the 40G VIC 1385.
I believe no blade server offers a 25G interface today, though every vendor is working on one. Cisco's UCS B-Series is currently the only blade server with 40G support, at roughly a 40 percent premium over the 10G products; I expect prices to drop with the next-generation 25G products.
The last reason to consider 25G is power and heat. According to Mellanox and QLogic, 25G is simpler than 40G and should therefore draw less power. Intel's 710-series NICs, however, do not follow that pattern: the 25G XXV (dual-port 25GBase-CR-L) has a typical draw of 8.6W, versus 4.3W for the 10G X (dual-port 10GBase-SR), 3.3W with a dual-port DAC, and 3.6W for the 40G XL (dual-port 40GBase-CR4). Still, other vendors' 25G NICs do draw less power than 40G NICs (17W for Cisco, 10W for Intel's older cards).
In short, I too lean toward 25G-capable switches for the data center, but this is not a to-be-or-not-to-be decision; it comes down to your needs, and 25G's distance limitations will shape how you deploy it.
I think switch-to-switch bandwidth matters more. In data center builds, a 100G port costs only about 25 percent more than a 40G port yet delivers 250 percent of the bandwidth, so the price/performance advantage is clear.
Campus and access networks, by contrast, need long-distance runs and comparatively little bandwidth, so there is far less pressure to grow beyond 10G uplinks. If you must upgrade now, 40G is available today; in a few years, 25G uplinks may become the cheaper option.
I’ve recently come across a number of comments from industry experts declaring that 40G Ethernet is dead — that it’s a given for organizations to jump on the 25/50/100G path. Those conclusions are simplistic and misleading. To really get at the heart of this question, organizations need to look at five critical factors.
The first consideration is whether you’re thinking of 25G for switch-to-server or switch-to-switch (or switch-to-blade switch). Right now, network vendors are positioning 25G only for switch-to-server. I don’t see any network vendor advertising 25G for switch-to-switch — Cisco doesn’t even offer a 25G fiber transceiver, and HPE has priced theirs higher than 40G and 100G transceivers! In other words, no one is talking about 25G for switch-to-switch links right now. It might be a different story in 2018; we’ll see.
The second place to focus your attention is on the switches. Most switches currently sold, like Cisco’s 93180YC-EX, support both 10G and 25G at a price matching older 10G products and with full backward compatibility — for example, each SFP28 port supports 1G, 10G or 25G, and each QSFP28 port supports 10G, 25G, 40G, 50G or 100G.
NICs are an important consideration. Servers are just beginning to see 25G. Intel recently began shipping their first 25G NIC (the XXV710), at about the same price as their equivalent 40G NIC (the XL710) and a bit more than their 10G adapters (the X710). Meanwhile, HPE has been selling a 25G Mellanox NIC since 2016 (the 640SFP28) with a price comparable to their 10G NICs. Cisco will introduce a 25G VIC for C-Series Rack-Mount Servers sometime later this year, priced between the 10G VIC-1225 and their current 40G VIC-1385.
I don’t believe any blade server does 25G today, but I’m sure all vendors are working on it. Cisco UCS B-Series is the only blade server to natively support 40G today, at about a 40 percent premium over 10G. When Cisco introduces 25G in its next-generation UCS B-Series products, I expect the price premium to drop.
Another hardware angle that needs to be covered is cabling. It's a big mistake to ignore cabling or assume it's the same across vendors. Specifically, 25G twinax works best within a single rack, with a top-of-rack switch and 1- or 2-meter cables. 25G over cables of 3 meters or more requires forward error correction (FEC), which adds ~250ns of one-way latency and may introduce vendor interop issues. If you're adopting 25G, plan to densely pack compute into 10kVA–12kVA racks.
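To put that 10kVA–12kVA guidance in perspective, here is a minimal back-of-the-envelope sketch. The per-server wattage, top-of-rack switch wattage and power factor are illustrative assumptions of mine, not vendor figures:

```python
# Rough rack-fit check for a dense 25G top-of-rack design.
# All wattages below are assumed values for illustration, not vendor specs.

RACK_BUDGET_KVA = (10, 12)   # the power budget range mentioned above
SERVER_WATTS = 450           # assumed draw of one dual-25G 1U server under load
TOR_SWITCH_WATTS = 350       # assumed draw of a pair of 25G/100G ToR switches
POWER_FACTOR = 0.95          # assumed PF for converting kVA to usable watts

for kva in RACK_BUDGET_KVA:
    usable_watts = kva * 1000 * POWER_FACTOR - TOR_SWITCH_WATTS
    servers = int(usable_watts // SERVER_WATTS)
    print(f"{kva} kVA rack -> roughly {servers} servers within 1-2 m twinax reach")
```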
Here is a cabling comparison with market prices — there are some big swings in vendor pricing:
Length & Type | 10G | 25G | 40G | 100G
---|---|---|---|---
1 to 5 meters / Passive Twinax DAC | $50 – $75 | $63 – $110 (Cisco)<br>$122 – $196 (HPE)<br>$175 – $316 (Intel)<br>Note: 3- and 5-meter cables not recommended | $125 – $188 | $162 – $237
7 to 10 meters / Active Twinax DAC | $180 – $205 | N/A | $550 – $625 | N/A
1 to 30 meters / Active Optical | $105 – $130 | N/A | $425 – $700 | $500 – $1,200
1 to 100 meters / SR Fiber Transceiver (for use with OM3/OM4 MMF LC cables)<br>Note: Each connection needs two transceivers and an MMF LC cable ($20 – $100) | $478 (Cisco SFP-10G-SR-S)<br>$370 (Proline SFP-10G-SR) | $226 (Intel NIC E25GSFP28SR)<br>$1,112 (HPE NIC 845398-B21)<br>$911 (Proline)<br>No Cisco option currently<br>Note: 100 meters on OM4, 70 meters on OM3 | $550 (Cisco QSFP-40G-SR-BD)<br>Note: This transceiver is now available from Proline, Arista and HPE | N/A<br>Cisco has announced 100G-BiDi coming later this year, estimated at $800
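To make the passive twinax row easier to compare, here is a quick sketch that converts the listed price ranges into dollars per Gbps. Using the midpoint of each range, and Cisco's range for 25G, are my own simplifications:

```python
# Rough $-per-Gbps comparison for the 1-5 m passive twinax DAC row above.
# Each value is the midpoint of the listed price range (Cisco pricing for 25G).

passive_dac_usd = {
    "10G": (50 + 75) / 2,
    "25G": (63 + 110) / 2,
    "40G": (125 + 188) / 2,
    "100G": (162 + 237) / 2,
}

for speed, price in passive_dac_usd.items():
    gbps = int(speed.rstrip("G"))
    print(f"{speed:>4}: ~${price:.0f} per cable, ~${price / gbps:.2f} per Gbps")
```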
A final consideration for adopting 25G is power and heat. Since 25G is simpler, it should use less power and produce less heat than 40G (a common claim from Mellanox and QLogic). Curiously, Intel breaks that assumption – Intel's XXV 25G adapter draws 8.6W, while their 10G and 40G adapters draw 3.3W and 3.4W respectively (dual ports, twinax cabling). Still, today's 25G NICs do draw less power than other vendors' 40G NICs, such as Cisco's (17W) and Intel's older 40G adapters (10W).
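Normalizing those figures into watts per Gbps of port capacity makes the comparison easier to read. The sketch below uses only the wattages quoted above; treating every card as dual-port is an assumption on my part for the Cisco adapter:

```python
# Watts per Gbps of raw port capacity for the adapters cited above.
# Wattages are the figures quoted in the text; dual ports are assumed for all.

adapters = {
    # name: (watts, ports, gbps_per_port)
    "Intel XXV710 (25G)": (8.6, 2, 25),
    "Intel X710 (10G)":   (3.3, 2, 10),
    "Intel XL710 (40G)":  (3.4, 2, 40),
    "Cisco 40G VIC":      (17.0, 2, 40),
}

for name, (watts, ports, gbps) in adapters.items():
    capacity = ports * gbps
    print(f"{name:<20} {watts:>5.1f} W / {capacity:>3} Gbps = {watts / capacity:.3f} W per Gbps")
```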
And the Winner Is…
In conclusion, I agree that you should prefer 25G-capable switches for your data center, but it's not a do-or-die decision. When you do have a need for 25G (e.g., an all-flash hyperconverged cluster or a high-I/O NSX Edge cluster), you'll probably deploy dedicated 25G switches because of the 25G distance limitations.
I think it’s probably more important that you think about switch-to-switch bandwidth. Within the data center, 100G connectivity now costs just 25 percent more than 40G, but provides 250 percent of the bandwidth. So, choosing data center switches with 40/100G QSFP28 ports is a safe bet.
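The arithmetic behind that claim is worth spelling out. The sketch below simply normalizes the 40G port cost to 1.0 and applies the 25 percent premium quoted above:

```python
# The 25%-more-cost / 250%-of-bandwidth claim, expressed as bandwidth per
# dollar. Prices are normalized so that a 40G port costs 1.0 unit.

ports = {
    "40G":  {"cost": 1.00, "gbps": 40},
    "100G": {"cost": 1.25, "gbps": 100},  # 25% premium, per the text
}

for label, port in ports.items():
    print(f"{label:>4}: {port['gbps'] / port['cost']:.0f} Gbps per cost unit")
# Under these assumptions, 100G delivers twice the bandwidth per dollar of 40G.
```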
Meanwhile, in campus and access networks with their long fiber runs and low bandwidth needs, there is much less pressure to move from 10G uplinks. If you must, 40G is available today. Perhaps in a few years, 25G uplinks will be a cheaper alternative.
There was also a reply:
The move to 25G Server connections will happen when IT and Network Managers need more bandwidth at the top of the rack to the next layer of the network.
This isn’t really a migration to 25G at the server but a migration to 100GbE at the top of the rack. When are 40GbE uplinks not enough for the density within the rack?
25G was a snoozer when the IEEE ratified the 100GbE standard, 802.3bj. No networking vendor pursued 25G downlinks. And rightfully so… we already had something faster: 40GbE. Why would anyone want 25GbE?
Here’s another reason 25GbE links make sense: What is the most abundant slot connector in servers today? If you answered PCIe Gen3 x8, you are right. How much PCIe bandwidth does that support? Roughly 56 Gbps. So match two ports of 25GbE to a ~56Gbps PCIe slot and you get better efficiency from the network through to the server architecture, and better utilization per slot.
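Here is a rough sketch of that slot math. The PCIe Gen3 lane rate and 128b/130b encoding are standard; the protocol-efficiency factor is an assumption chosen to land near the ~56 Gbps figure quoted above:

```python
# Why two 25GbE ports line up well with a PCIe Gen3 x8 slot.

LANES = 8
GT_PER_LANE = 8.0            # GT/s per lane for PCIe Gen3
ENCODING = 128 / 130         # 128b/130b line coding
PROTOCOL_EFFICIENCY = 0.89   # assumed TLP/DLLP overhead factor (illustrative)

raw_gbps = LANES * GT_PER_LANE * ENCODING
usable_gbps = raw_gbps * PROTOCOL_EFFICIENCY

print(f"PCIe Gen3 x8 raw:    ~{raw_gbps:.0f} Gbps")
print(f"PCIe Gen3 x8 usable: ~{usable_gbps:.0f} Gbps")
print("2 x 25GbE = 50 Gbps fits; 2 x 40GbE = 80 Gbps oversubscribes the slot")
```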
25GbE switches are really 100GbE switches today. Eventually these 100GbE switches will move downstream. Think about what we typically call 10GbE switches today. Don’t they have 40GbE uplinks? Shouldn’t we consider them 40GbE switches?