Jinkies! It’s an FCoE Mystery!
August 16, 2011 21 Comments
Preamble: Chances are I’m going to get something wrong in this article. Please feel free to point anything out so long as you state the correction. You can’t just say “that’s wrong” and not say why. One of the great mysteries of the data center right now is FCoE.
Ah, Fibre Channel over Ethernet. It promises to do away with separate data and storage networks, and run everything on a single unified fabric. The problem though is that FCoE is a bit of a mystery. It involves two very different protocols (Ethernet and Fibre Channel), it involves the interaction between the protocols, and vendors can bicker over requirements, make polar opposite statements, and both can be technically correct.
So that makes it kind of a mess. I’ve been teaching basics of FCoE (mostly single-hop) for a bit now, and I think I’ve come across a way to simplify perception of FCoE: Realize FCoE is implemented in three different ways.
- Single-hop FCoE (SHFCoE)
- Dense-mode FCoE (DMFCoE) [multi-hop]
- Sparse-mode FCoE (SMFCoE) [multi-hop]
When we talk about FCoE in general, we should be talking about which specific method that’s being referenced. That came to me when I read Ivan Pepelnjak’s article on the two ways to implement multi-hop FCoE , although I’m also adding single-hop as a separate way to implement FCoE.
While all three ways are technically “FCoE”, they are implemented in very different manners, have very different hardware and topology requirements, and different vendors support different methods. They’re almost three completely different beasts. So let’s talk about them separately, and be specific when we talk about it.
So let’s talk about FCoE.
Single Hop FCoE (SHFCoE)
This is the simplest way to implement FCoE, as it doesn’t really require any of the new data center standards on the rest of your network devices. Typically, a pair of switches is enabled for FCoE, as well as some server network/storage adapters known as CNAs (Converged Network Adapter).
In the Cisco realm, this is either a Nexus 5000 series or Fabric Interconnects which are part of the Cisco UCS server system. In HP, this might be part of Virtual Connect. A CNA is a Ethernet/Fibre Channel combo networking card. The server’s operating system is presented with separate native Ethernet and native Fibre Channel devices, so the OS doesn’t even know that FCoE is going on. It just thinks there’s native Ethernet and native Fibre Channel.
Ethernet frames containing FC frames are isolated onto their own FCoE VLANs. When the Ethernet frames reach the FCoE switch they are de-encapsulated and forwarded via regular Fibre Channel methods to their final destination as native Fibre Channel.
This method has been in place for a few years now, and it works (and works well). It’s pretty well understood, and there’s plenty of stick time for it. You also don’t need to do anything special on your Ethernet networks, and most of the time nothing special needs to be done on your Fibre Channel SAN (although NPV/NPIV may be needed to get the FCoE switch connected to the Fibre Channel switch). You don’t have to worry about any of the new DCB standards, such as DCBX, PFC, ETS, etc., because they only need to be on the FCoE single-hop switch, and are already there. No tweaking of those standards is typically necessary.
There are two types of multi-hop FCoE, where the FCoE goes beyond just the initial switch. J Metz from Cisco elaborated on the various definitions (and types) of multi-hop in this great blog article here, but I think we can even make it more simple by saying that multi-hop means more than one FCoE switch.
Dense-Mode FCoE (DMFCoE)
With DMFCoE, a FCoE frame is received at the DMFCoE switch and de-encapsulated into a regular FC frame. The FCF (Fibre Channel Forwarder) portion of the DMFCoE switch makes the forwarding decision and sends it to the next port. At that port, the FC frame is re-encapsulated into an FCoE Ethernet frame and send out an Ethernet port to the next hop.
With DMFCoE, each of your Ethernet switches is also a full-stack Fibre Channel switch. You’re running essentially a Fibre Channel SAN overlay on top of your Ethernet switches. Zoning, name services, FSPF, etc., are all the same as on your regular Fibre Channel network. Also, FCoE frames are routed along not by Ethernet, but by Fibre Channel routing (FSPF) which is multi-path (so no bridging loops).
The drawback is that it requires a pretty advanced switch to do it. In fact, it wasn’t until July of 2011 that Cisco had more than one switch that could even do DMFCoE (the MDS and Nexus 7000 needed 5.2 to do DMFCoE, which wasn’t released until July).
Alternative names for dense-mode FCoE:
- FC-Forwarded FCoE
- Full FCoE
- Heavy FCoE
- Overlay Mode
Sparse Mode FCoE (SMFCoE)
Sparse Mode FCoE (SMFCoE) is when an Ethernet network forwards FCoE frames via regular Ethernet forwarding mechanisms. Unlike DMFCoE, the Fibre Channel frame is not de-encapsulated (although but it might be snooped with FIP snooping if the switch supports it). For the most part, the Ethernet switches have little to no awareness of the Fibre Channel layers.
The benefit of SMFCoE is that it doesn’t require quite the beefiness that DMFCoE needs, as you don’t need silicon that can understand and forward FCP (Fibre Channel Protocol) traffic. You still need priority flow control and other DCB standards, and probably DCBx (to set up the FCoE lossless CoS and so forth).
The drawback is that you’ll usually need some sort of multi-path Ethernet protocol, such as TRILL/SPB/Fabric Path as spanning-tree would likely be a disaster for a storage protocol. Since none of the potential multi-path Ethernet protocols are in wide use with the various vendors, that makes SMFCoE somewhat dead right now.
Alternative names for SMFCoE might be:
- Ethernet-forwarded FCoE
- FCoE light
Because it gets damn confusing otherwise. Recently Juniper and Cisco had a dustup about the requirement of TRILL for FCoE. Juniper posted the article on why TRILL won’t scale for data centers, and mentioned that TRILL is required for FCoE. J Metz from Cisco counter-reponded with essentially “no, FCoE doesn’t need TRILL“. Who’s right? Well they both are.
Cisco has gone the DMFCoE route, so no you don’t need TRILL (or other multi-path Ethernet). Since Juniper is going SMFCoE, it will need some sort of multi-path (and his article is calling for QFabric to be that solution).
So can you do FCoE multi-hop right now, either DMFCoE or SMFCoE? It probably would be wise to wait. In the Cisco realm, the code that supports DMFCoE was just released in July for their Nexus 7K and MDS lines, and the 5Ks could have done DMFCoE since December I think (although I don’t know any one that did).
Right now, I don’t know of any customers actually doing mutli-hop FCoE (and I don’t know anyone who’s all that interested). SMFCoE is a moot point right now until more switches can get multi-path Ethernet, whether that be QFabric, TRILL, SPB or another method.