A Tale of Two FCoEs
November 21, 2011 14 Comments
A favorite topic of discussion among the data center infrastructure crowd is the state of FCoE. Depending on who you ask, FCoE is dead, stillborn, or thriving.
So, which is it? Are we dealing with FUD or are we dealing with vendor hype? Is FCoE a success, or is it a failure? The quick answer is… yes? FCoE is both thriving, and yet-to-launch. So… are we dealing with Schrödinger’s protocol?
Not quite. To understand the answer, it’s important to make the distinction between two very different ways that FCoE is implemented: Edge FCoE and Multi-hop FCoE (a subject I’ve written about before, although I’ve renamed things a bit).
Edge FCoE
Edge FCoE is thriving, and has been for the past few years. Edge FCoE is when you take a server (or sometimes a storage array) and connect it to an FCoE switch, and everything beyond that first switch is either native Fibre Channel or native Ethernet.
Edge FCoE is distinct from multi-hop for one main reason: it’s a helluva lot easier to pull off than multi-hop FCoE. With edge FCoE, the only switch that needs to understand FCoE is that edge FCoE switch. It plugs into a traditional Fibre Channel network over traditional Fibre Channel links (typically in NPV mode).
Essentially, no other part of your network needs to do anything you haven’t done already. You do traditional Ethernet, and traditional Fibre Channel. FCoE only exists in that first switch, and is invisible to the rest of your LAN and SAN.
Here are the things you (for the most part) don’t have to worry about configuring on your network with Edge FCoE:
- Data Center Bridging (DCB) technologies
- Priority Flow Control (PFC) which enables lossless Ethernet
- Enhanced Transmission Selection (ETS), which lets you dedicate bandwidth to various traffic classes (not required but recommended -Ivan Pepelnjak)
- DCBx: A method to communicate DCB functionality between switches over LLDP (oh, hey, you do PFC? Me too!)
- Whether or not your aggregation and core switches support FCoE (they probably don’t, or at least the line cards don’t)
There are PFC and DCBx on the server-to-edge FCoE link, but they’re typically inherent, supported by the CNA and the edge FCoE switch, and turned on by default or auto-detected. In some implementations there’s nothing to configure: PFC is there, and unalterable. Even if there are some settings to tweak, it’s generally easier to do it on edge ports than on an aggregation/core network.
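That auto-detected behavior comes largely from DCBx’s “willing” bit: the switch advertises its PFC settings over LLDP, and a CNA that signals it is willing simply adopts them. Here’s a minimal Python sketch of that adoption logic (toy classes and names, not a real DCBX/LLDP implementation):

```python
# Toy model of DCBX "willing" negotiation on an edge FCoE link.
# Not a real DCBX/LLDP implementation -- just the adoption logic.

from dataclasses import dataclass, field

@dataclass
class PfcConfig:
    # Which 802.1p priorities get lossless (per-priority PAUSE) treatment.
    lossless_priorities: set = field(default_factory=set)

@dataclass
class DcbxPeer:
    name: str
    willing: bool                 # the "willing" bit from the DCBX TLV
    local_config: PfcConfig
    operational: PfcConfig = None

def negotiate(switch: DcbxPeer, cna: DcbxPeer) -> None:
    """If one side is willing and the other is not, the willing side
    adopts the peer's advertised PFC configuration."""
    for a, b in ((switch, cna), (cna, switch)):
        if a.willing and not b.willing:
            a.operational = b.local_config
        else:
            a.operational = a.local_config

# Typical edge case: the switch advertises FCoE's priority (3) as lossless,
# the CNA ships as "willing" and just inherits it -- nothing to configure.
switch = DcbxPeer("edge-fcoe-switch", willing=False,
                  local_config=PfcConfig({3}))
cna = DcbxPeer("server-cna", willing=True, local_config=PfcConfig(set()))
negotiate(switch, cna)
print(cna.operational.lossless_priorities)   # {3}
```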
Edge FCoE is the vast majority of how FCoE is implemented today. Everything from Cisco’s UCS to HP’s C7000 series can do it, and do it well.
Multi-Hop
The very term multi-hop FCoE is controversial in nature (just check the comments section of my FCoE terminology article), but for the sake of this article, multi-hop FCoE is any topological implementation of FCoE where FCoE frames move around a converged network beyond a single switch.
Multi-hop FCoE requires a few things: a Fibre Channel-aware network, losslessness through Priority Flow Control (PFC), DCBx (Data Center Bridging Exchange), and Enhanced Transmission Selection (ETS). Add that up and you’ve got a recipe for a switch that I’m pretty sure ain’t in your rack right now. For instance, the old man of the data center, the Cisco Catalyst 6500, doesn’t do FCoE now, and likely never will.
Switch-wise, there are two ways to do multi-hop FCoE: a switch can either forward FCoE frames based on the Ethernet headers (source/destination MAC address), or forward frames based on the Fibre Channel headers (source/destination FCID).
Ethernet-forwarded/Pass-through Multi-hop
If you build a multi-hop network with switches that forward based on Ethernet headers (as Juniper and Brocade do), then you’ll want something other than spanning-tree to do loop prevention and enable multi-pathing. Brocade uses a method based on TRILL, and Juniper uses their proprietary QFabric (based on unicorn tears).
Ethernet-forwarded FCoE switches don’t have a full Fibre Channel stack, so they’re unaware of what goes on in the Fibre Channel world, such as zoning, with the exception of FIP (FCoE Initialization Protocol), which handles discovery of attached Fibre Channel devices (connecting virtual N_Ports to virtual F_Ports).
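One common way an Ethernet-forwarded switch stays FIP-aware without a Fibre Channel stack is FIP snooping: watch the FIP exchanges (Ethertype 0x8914) to learn which end nodes have successfully logged in to an FCF, then only permit FCoE frames (Ethertype 0x8906) between those addresses. A rough Python sketch of that bookkeeping (simplified, and not any particular vendor’s implementation):

```python
# Toy FIP-snooping bookkeeping: learn which ENode MACs have logged in
# to which FCF, then police FCoE frames against that table.
# The Ethertypes are real (FIP = 0x8914, FCoE = 0x8906); everything else
# here is simplified for illustration.

FIP_ETHERTYPE = 0x8914
FCOE_ETHERTYPE = 0x8906

class FipSnoopingBridge:
    def __init__(self):
        # (enode_mac, fcf_mac) pairs that completed fabric login
        self.sessions = set()

    def on_fip_frame(self, src_mac, dst_mac, operation):
        # Simplified: a FLOGI accept from the FCF opens a session.
        # A real bridge would also watch LOGOs and FIP keep-alives
        # to tear sessions down.
        if operation == "FLOGI_ACCEPT":
            self.sessions.add((dst_mac, src_mac))   # (ENode, FCF)

    def permit_fcoe(self, src_mac, dst_mac):
        # Only forward FCoE frames that belong to a snooped session,
        # in either direction.
        return ((src_mac, dst_mac) in self.sessions or
                (dst_mac, src_mac) in self.sessions)

bridge = FipSnoopingBridge()
bridge.on_fip_frame(src_mac="fcf-A", dst_mac="enode-1",
                    operation="FLOGI_ACCEPT")
print(bridge.permit_fcoe("enode-1", "fcf-A"))   # True
print(bridge.permit_fcoe("rogue-mac", "fcf-A")) # False
```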
FC-Forwarded/Dual-stack Multi-hop
If you build a multi-hop network with switches that forward based on Fibre Channel headers, your FCoE switch needs to have both a full DCB-enabled Ethernet stack, and a full Fibre Channel stack. This is the way Cisco does it on their Nexus 5000s, Nexus 7000s, and MDS 9000 (with FCoE line cards), although the Nexus 4000 blade switch is the Ethernet-forwarded kind of switch.
The benefit of using an FC-forwarded switch is that you don’t need a network that does TRILL or anything fancier than spanning-tree (spanning-tree isn’t enabled on any VLAN that passes FCoE). It’s pretty much a Fibre Channel network, with the ports being Ethernet instead of Fibre Channel. In fact, in Cisco’s FCoE reference design, storage and networking traffic are still port-gapped (a subject of a future blog post): FCoE frames and regular networking frames don’t run over the same links; there are dedicated FCoE links.
It’s like running a Fibre Channel SAN that just happens to sit on top of your Ethernet network. As Victor Moreno, the LISP project manager at Cisco, says: “The only way is to overlay”.
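The practical difference between the two multi-hop approaches is which header each hop looks at. A very rough Python sketch of the two lookup paths (toy tables and field names, purely for illustration):

```python
# Toy contrast of the two multi-hop forwarding models.
# A pass-through (Ethernet-forwarded) switch looks only at the outer
# Ethernet destination MAC; an FC-forwarding switch (FCF) strips the
# Ethernet header, looks at the FC destination ID (D_ID), and builds
# a new Ethernet header for the next hop -- much like an IP router.

def ethernet_forward(frame, mac_table):
    """Pass-through: forward on the outer Ethernet header, untouched."""
    return mac_table[frame["dst_mac"]]          # -> egress port

def fc_forward(frame, fcid_table, my_mac):
    """Dual-stack FCF: decapsulate, route on the FC D_ID, re-encapsulate."""
    fc = frame["fc"]                            # inner FC frame
    next_hop_mac, egress_port = fcid_table[fc["d_id"]]
    new_frame = {"src_mac": my_mac,             # Ethernet header is
                 "dst_mac": next_hop_mac,       # rewritten hop by hop
                 "fc": fc}                      # FC frame untouched
    return new_frame, egress_port

# Example FCoE frame carrying an FC frame to FCID 0x010203 (made up).
frame = {"src_mac": "cna-mac", "dst_mac": "fcf1-mac",
         "fc": {"s_id": 0x010101, "d_id": 0x010203}}

mac_table = {"fcf1-mac": "eth1/1"}
fcid_table = {0x010203: ("fcf2-mac", "eth1/2")}

print(ethernet_forward(frame, mac_table))            # eth1/1
print(fc_forward(frame, fcid_table, "fcf1-mac")[1])  # eth1/2
```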
State of FCoE
It’s not accurate to say that FCoE is dead, or that FCoE is a success, or anything in between really, because the answer is very different once you separate multi-hop and edge-FCoE.
Currently, multi-hop has yet to launch in a significant way. In the past two months, I have heard rumors of a customer here or there implementing it, but I’ve yet to hear any confirmed reports or firsthand tales. I haven’t even configured it personally. I’m not sure I’m quite as wary as Greg Ferro is, but I do agree with his wariness. It’s new, it’s not widely deployed, and that makes it riskier. There are interoperability issues, which in some ways are mitigated by the fact that no one is doing Ethernet fabrics in a multi-vendor way, and NPV/NPIV can help keep things “native”. But historically, Fibre Channel vendors haven’t played well together. Stephen Foskett lists interoperability among his reasonable concerns with FCoE multi-hop. (Greg, Stephen, and everyone else I know are totally fine with edge FCoE.)
Edge FCoE is of course vibrant and thriving. I’ve configured it personally, and it fits easily and seamlessly into an existing FC/Ethernet network. I have no qualms about deploying it, and anyone doing convergence should at least consider it.
Crystal Ball
In terms of networking and storage, it’s impossible to tell what the future will hold. There are a number of different directions FCoE, iSCSI, NFS, DCB, Ethernet fabrics, et al. could go. FCoE could end up replacing Fibre Channel entirely, or it could be relegated to the edge and never move from there. Another possibility, suggested to me by Stephen Foskett, is that Ethernet will become the connection standard for Fibre Channel devices. They would still be called Fibre Channel switches, and SANs would be set up just like they always have been, but instead of having 8/16/32 Gbit FC ports, they’d have 10/40/100 Gbit Ethernet ports. To paraphrase Bob Metcalfe, “I don’t know what will come after Fibre Channel, but it will be called Ethernet”.
You forgot one aspect of Multihop FCoE – the hysteria level of storage people about non-deterministic networks. FC has been oversold for a decade as a ‘guaranteed’ circuit network, and the idea of Ethernet is anathema to nearly all storage people.
No one cares much about a few servers having problems, so edge FCoE is OK, but end-to-end FCoE sends more storage industry people into fits.
Hard to undo a decade of overselling to a psychotically paranoid team of individuals for whom FUD on reliability, performance, and uptime is a core feature.
I think some of the benefit of FC comes not from the technology necessarily, but from an odd Layer 8 effect. I think some of the benefit that mainframes provide is their inflexibility. On a Java server, you can make code changes all day, every day. It’s very dynamic and flexible, but it’s very easy to muck things up. Mainframes are much less flexible, and people tend to be far more deliberate in how they work with them, in terms of technology and in terms of how we think about them.
Fibre Channel, I think, is like that. It’s treated with such reverence that it becomes much more static and inflexible. It’s the mindset that gives Fibre Channel much of its advantage, not just the technology.
FC over Ethernet? Why not Ethernet over Ethernet?
Most people nowadays hear Ethernet and immediately think TCP/IP, as this is by far the most common use of Ethernet fabrics today. TCP/IP and similar protocols are well-suited for lossy connections over many hops and long distances, e.g. sending an email from here to California. However, they are not the most efficient way to send data from Rack 1 to Rack 2 in the datacenter, where you have very high-performance and low-loss connections. This is why protocols like iSCSI yield less-than-desirable throughput and IOPS, and why companies are forced to use more expensive fabrics like Fibre Channel.
Enter AoE (ATA over Ethernet): ATA commands sent over Ethernet. It’s a layer-2 block protocol that is the most lightweight and efficient way to leverage raw Ethernet (layer 2), providing near bare-metal performance. Things like multipathing, redundancy, and out-of-order packet delivery mechanisms are built in to simplify SAN deployments and maximize performance, availability, and scalability in a way not possible with any other fabric. AoE is a non-connection-based protocol, and folks like Coraid have storage appliances that are controller-less. This means that as you grow, the network and controllers do not become bottlenecks. AoE can provide bare-metal performance to all of the disks (SATA, SAS, and SSD) whether you have 2 shelves or 2,000 shelves, best described as a one-tier-for-all design. This, coupled with the use of commodity hardware, means customers are able to deliver price-performance advantages that are 5-10x those of Fibre Channel rivals while still delivering industry-leading performance.
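To give a sense of how small the protocol is: AoE rides directly in an Ethernet frame with its own Ethertype (0x88A2) and a compact fixed header (version/flags, error, shelf “major” and slot “minor” address, command, tag). A rough Python sketch of packing one such header, with made-up addresses and values:

```python
# Rough sketch of an AoE (ATA over Ethernet) frame header in Python.
# The Ethertype (0x88A2) and the common header layout (version/flags,
# error, shelf "major", slot "minor", command, tag) follow the published
# AoE spec; the addresses and values below are made up for illustration.

import struct

AOE_ETHERTYPE = 0x88A2

def aoe_header(dst_mac, src_mac, shelf, slot, command, tag, version=1):
    eth = dst_mac + src_mac + struct.pack("!H", AOE_ETHERTYPE)
    ver_flags = (version << 4)            # response/error flags left clear
    common = struct.pack("!BBHBBI", ver_flags, 0, shelf, slot, command, tag)
    return eth + common

# Query-config request to shelf 1, slot 2 (command 1 = query config).
frame = aoe_header(dst_mac=b"\xff" * 6,            # broadcast discovery
                   src_mac=b"\x02\x00\x00\x00\x00\x01",
                   shelf=1, slot=2, command=1, tag=0x1234)
print(frame.hex())
```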
Seems to me Ethernet’s biggest value is $500 per port for 10GbE, bringing a commoditized infrastructure to the datacenter. And isn’t commoditization the reason the compute layer (which had been made up of Sun SPARC, HP-UX, AIX, and DEC Alpha servers) gave way to low-cost x86 commodity servers? Why won’t the same thing happen to the storage layer, especially as applications become smarter and continue to invade the turf of the expensive storage arrays, which used to be the only place in town to get data management tools like replication, snapshots, mirroring, etc.? Today VMware and Exchange 2010 are good examples of apps that are minimizing the need for expensive storage arrays.
I realize beauty is in the eye of the beholder, but it sure seems to me that sticking FC and its connection-based, point-to-point architecture inside of Ethernet misses the target of creating flexible, simple, low-cost transport layers for data… At the very least I have serious doubts that we’re going to hear a lot of “hey, you got peanut butter on my chocolate, and boy does that taste great!” type reactions from customers looking at FCoE.
“Simplicity is the Ultimate Sophistication” – Leonardo da Vinci
AoE has a couple of issues, as I see it, that it needs to overcome before it’s taken seriously:
The “lightweightness” of a protocol has never been that big of a deal. Overhead, other than causing MTU problems, has never been a performance issue in the past. It seems like it should make a difference, but really, when has it?
“Near bare-metal performance”: what does that even mean, and how is it quantified? I haven’t seen any performance comparisons between AoE and other protocols. The bottleneck usually isn’t the protocol, but the disks underneath it.
I’ve never used AoE, which makes it a risky protocol, especially in the storage world. Only one vendor I even know of has an AoE solution. Is AoE an open standard? Who maintains the standard? Lots of unknowns.
And perhaps the biggest issue of all is that people I trust in the storage world don’t know it/aren’t down with it either. The storage world, probably more so than other facets of IT, is built on trust.
Do we really need another storage protocol? We’ve already got iSCSI, FC, and FCoE as block protocols, and NFS/CIFS as file protocols. There would have to be lots of compelling reasons to adopt yet another storage protocol.
I try not to totally discount protocols, because you never know where the next solution is going to come from. But AoE has some challenges in terms of adoption.
Hi,
Thanks for great summary.
One thing though, you said Cisco is doing multihop FCoE with Nexus 5K, 7K and MDSes.
Brocade VCS Fabric Technology also works as dual-stack. I reckon you should check the blog post below. Please take a look at: FCoE Multi-hop Forwarding Use Case
http://community.brocade.com/docs/DOC-2133
Cheers.
Dumlu
Hi Dumlu,
I asked about this at the Networking Field Day Brocade event, and it seemed that the frames were forwarded via Ethernet headers, not FC headers. From the document you linked: “is encapsulated in an FCoE frame and then switched to a 10 GE port for forwarding across the VCS Ethernet fabric.”
That would seem to indicate single stack for Ethernet forwarding, wouldn’t it?
Tony
Tony,
This document provides additional detail on the VCS technical architecture.
Click to access vcs-technical-architecture-tb.pdf
In the section “BROCADE VCS ETHERNET FABRIC AND TRILL” starting on pg 38 you will find a description of frames used in a VCS fabric. FCoE “data” frames are encapsulated inside TRILL while Fibre Channel “protocol” frames do not use TRILL encapsulation and are forwarded between switches inside Ethernet frames.
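To make that nesting concrete, an FCoE data frame crossing the fabric looks roughly like this (a schematic Python sketch; the Ethertypes are standard, but the MACs, nicknames, and FCIDs are made up):

```python
# Schematic of the frame nesting described above for FCoE "data" frames
# crossing a TRILL-based Ethernet fabric. The layer names and Ethertypes
# (TRILL 0x22F3, FCoE 0x8906) are standard; the MACs, nicknames, and
# FCIDs are made up for illustration.

fcoe_data_frame_in_fabric = {
    "outer_ethernet": {                 # rewritten hop by hop between RBridges
        "dst_mac": "next-hop-rbridge",
        "src_mac": "this-rbridge",
        "ethertype": 0x22F3,            # TRILL
    },
    "trill_header": {
        "ingress_nickname": 0x0001,     # RBridge where the frame entered
        "egress_nickname": 0x0002,      # RBridge where it will exit
        "hop_count": 5,
    },
    "inner_ethernet": {                 # the original FCoE frame, untouched
        "dst_mac": "fcf-mac",
        "src_mac": "enode-fpma-mac",
        "ethertype": 0x8906,            # FCoE
    },
    "fc_frame": {"s_id": 0x010101, "d_id": 0x010203, "payload": "SCSI..."},
}

# Per the linked document, FIP / FC "protocol" frames skip the TRILL
# layer and travel between switches as plain Ethernet frames.
```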
Unlike Cisco, which requires separate ports, links, and paths for FCoE traffic and for IP traffic, the VCS fabric sends FCoE and IP traffic on the same ports, links, and paths. Or said differently, for multi-hop FCoE use cases, the VCS fabric actually implements traffic convergence while Cisco does not. At least that’s how I interpret the differences.
I think this has an implication for the cost to deploy FCoE, since there is no need to pay for (or configure) separate ports and paths when you use a VCS fabric to transport converged traffic (FCoE + IP). With Brocade ISL Trunks providing almost perfect utilization of multiple links within a trunk (as Ivan Pepelnjak points out in this post http://blog.ioshints.info/2011/04/brocade-vcs-fabric-has-almost-perfect.html), the hot spots commonly found with LAG and hashing approaches are avoided.
I hope this helps.
Hi Brook,
I actually like the idea of port-gapping storage traffic and network traffic (except at the edge). It allows for easy A/B separation, and keeps stuff like VoIP out of the same buffers/queues as storage (iSCSI/FCoE/NFS).
If the FCoE frames aren’t forwarded by TRILL, how does VCS prevent/handle Ethernet multi-pathing without TRILL or without decap and utilizing FC-forwarding?
Tony
Nice article, and much more balanced than most I see. I have to say that I am seeing a reasonable level of FCoE, including with an extra L2 hop, though as QFabric is a single hop we usually only add one (or two, if there is a DCB blade switch as well) L2 hops.
Of course Ethernet, fabric or otherwise, can be deterministic if you want, and indeed whenever I did iSCSI (and I did some big installs) we always set up Ethernet to be as deterministic as possible, and certainly lossless.
I take some exception to the suggestion that an L2 hop means no full FC stack and so no FC visibility other than FIP. The reality is that your FC stack can be as sophisticated as you want. In our case we have full exchange-based routing with in-order delivery within the exchange, and when we snoop we don’t just program the ACL: we also implicitly learn what FCoE devices are going through us, who they are, where they are going from and to, and we monitor the FIP keep-alives. We also then present this in nice FC/FCoE show commands. In addition, good DCB means good monitoring of the individual priority. So far this seems to have given our customers all the visibility they needed, whilst avoiding any issues with overlap of management, as the SAN guys only need monitoring of the FIP snooping bridge.