FCoE: I’m not Dead! Arista: You’ll Be Stone Dead in a Moment!

I was at Arista on Friday for Tech Field Day 8, and when FCoE was brought up (always a good way to get a lively discussion going), Andre Pech from Arista (who did a fantastic job as a presenter) brought up an article written by Douglas Gourlay, another Arista employee, entitled “Why FCoE is Dead, But Not Buried Yet”.

FCoE: “I feel happy!”

It’s an interesting article, because much of the player-hating seems to be directed at TRILL, not FCoE, and as J Metz has said time and time again, you don’t need TRILL to do FCoE if you do FCoE the way Cisco does (by putting a Fibre Channel Forwarder in each FCoE switch). Arista, not having any Fibre Channel chops, can’t do it this way. If Arista were to do FCoE, it (like Juniper) would have to do it the sparse-mode/FIP-snooping way, which requires a non-STP method of handling multi-pathing, such as TRILL or SPB.

Jayshree Ullal, the CEO of Arista, hated on TRILL and spoke highly of VXLAN and NVGRE (Arista is on the standards bodies for both). I think part of that is that, like Cisco, not all of Arista’s switches will be able to support TRILL, since TRILL requires new Ethernet silicon.

Even the CEO of Arista acknowledged that FCoE worked great at the edge, where you plug a server with an FCoE CNA into an FCoE switch, and the traffic is sent along to native Ethernet and native Fibre Channel networks from there (what I call single-hop or no-hop FCoE). This doesn’t require any additional FCoE infrastructure in your environment, just the edge switch. The Cisco UCS Fabric Interconnects are a great example of this no-hop architecture.

I don’t think FCoE is quite dead, but I have to imagine it’s not going as well as vendors like Cisco had hoped. At least, it hasn’t been the success some vendors imagined. And I think there are two major contributors to FCoE’s failure to launch, both of them more Layer 8 than Layer 2.

Old Man of the Data Center

Reason number one is also the reason why we won’t see TRILL/FabricPath deployed widely: It’s this guy:

Don’t let him trap you into hearing him tell stories about being an FDDI bridge, whatever FDDI is

The Catalyst 6500 series switch. This is “The Old Man of the Data Center”. And he’s everywhere. The switch is a bit long in the tooth, and although capacity is much higher on the Nexus 7000s (and even the 5000s in some cases), the Catalyst 6500 still has a huge install base.

And it won’t ever do FCoE.

And it (probably) won’t ever do TRILL/FabricPath (spanning-tree fo-evah!)

The 6500s aren’t getting replaced in significant numbers from what I can see. Especially with the release of the Sup 2T supervisor, the 6500s aren’t going anywhere anytime soon. I can only speculate as to why Cisco is still pursuing the 6500 so aggressively, but the sheer size of the install base is probably reason enough.

Another reason why customers haven’t replaced the 6500s is that the Nexus 7000 isn’t a full-on replacement: it has no service modules, limited routing capability (it only recently gained MPLS), and a form factor that’s much larger than the 6500’s (although the 7009 just hit the streets with a very similar form factor, which raises the question: why didn’t Cisco release the 7009 first?).

Premature FCoE

So reason number two? I think Cisco jumped the gun. They’ve been pushing FCoE for a while, but they weren’t quite ready. It wasn’t until July 2011 that Cisco released NX-OS 5.2, which is what’s required to do multi-hop FCoE on the Nexus 7000s and MDS 9000s. They’ve had the ability to do multi-hop FCoE on the Nexus 5000s for a bit longer, but not much. Yet they’ve been talking about multi-hop for longer than it was possible to actually implement. Cisco has had a multi-hop FCoE reference architecture posted on their website since March 2011, showing a beautifully designed multi-hop FCoE network of 5000s, 7000s, and MDS 9000s that for months wasn’t possible to implement. Even today, if you wanted to implement multi-hop FCoE with Cisco gear (or anyone else’s), you’d be a very, very early adopter.

So no, I don’t think FCoE is dead. No-hop FCoE is certainly successful (even Arista’s CEO acknowledged as much), and I don’t think even multi-hop FCoE is dead, but it certainly hasn’t caught on (yet). Will multi-hop FCoE catch on? I’m not sure. We’ll have to see.

Fibre Channel and Ethernet: The Odd Couple

Fibre Channel? Meet Ethernet. Ethernet? Meet Fibre Channel. Hilarity ensues.

The entire thesis of this blog is that the traditional data center silos are collapsing. We are witnessing the rapid convergence of networking, storage, virtualization, server administration, security, and who knows what else. It’s becoming more and more difficult to be “just a networking/server/storage/etc person”.

One of the byproducts of this is the often hilarious fallout from conflicting interests, philosophies, and mentalities. And perhaps the greatest friction comes from the conflict of storage and network administrators. They are the odd couple of the data center.

Storage and Networking: The Odd Couple

Ethernet is the messy roommate. Ethernet just throws its shit all over the place, dirty clothes never end up in the hamper, and I think you can figure out Ethernet’s policy on dish washing. It’s disorganized and loses stuff all the time. Overflow a receive buffer? No problem. Hey, Ethernet, why’d you drop that frame? Oh, I dunno, because WRED, that’s why.

WRED is the Yosemite Sam of Networking

But Ethernet is also really flexible, and compared to Fibre Channel (and virtually every other networking technology), inexpensive. Ethernet can afford to be messy, because it either relies on higher-layer protocols to handle dropped frames (TCP) or it just doesn’t care (UDP).

Fibre Channel, on the other hand, is the anal-retentive network: A place for everything, and everything in its place. Fibre Channel never loses anything, and keeps track of it all.

There now, we’re just going to put this frame right here in this reserved buffer space.

The overall philosophies are vastly different between the two. Ethernet (and TCP/IP on top of it) is meant to be flexible, mostly reliable, and lossy. You’ll probably get the Layer 2 frames and Layer 3 packets from one end to the other, but there’s no guarantee. Fibre Channel is meant to be inflexible (compared with Ethernet), absolutely reliable, and lossless.

Fibre Channel and Ethernet have very different philosophies when it comes to building out a network. In Ethernet networks, we cross-connect the hell out of everything. Network administrators haven’t met two switches they didn’t want to cross-connect.

Did I miss a way to cross-connect? Because I totally have more cables

It’s just one big cloud to Ethernet administrators. For Fibre Channel administrators, a single “SAN” is an abomination. There are always two air-gapped, completely separate fabrics.

The greatest SAN diagram ever created

The Fibre Channel host at the bottom is connected into two separate, Gandalf-separated, non-overlapping Fibre Channel fabrics. This gives the host two independent paths to the same storage array for full redundancy. You’ll note that the Fibre Channel switches on both sides have two links from switch to switch within the same fabric. Guess what? They’re both active. Multi-pathing in Fibre Channel comes courtesy of the FSPF protocol (Fabric Shortest Path First). Switch-to-switch forwarding in Fibre Channel is what we in the Ethernet world would consider Layer 3 routed. It’s enough to give one multi-path envy.

One of the common ways (though by no means the only way) an Ethernet frame meets an unfortunate demise is tail drop or WRED on a receive buffer. As a buffer fills, WRED or a similar mechanism starts randomly dropping frames; the closer the buffer gets to full, the more aggressively frames are dropped. WRED prevents tail drop, which is bad for TCP, by shedding frames before the buffer fills completely.

Essentially, an Ethernet buffer is a bit like Thunderdome: Many frames enter, not all frames leave. With Ethernet, if you tried to do full line rate of two 10 Gbit links through a single 10 Gbit choke point, half the frames would be dropped.
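
To make the Thunderdome dynamics a little more concrete, here’s a rough sketch of WRED’s drop decision in Python. The thresholds and drop probabilities are made up for illustration; real implementations work on a moving average of queue depth, with per-class profiles.

```python
import random

# A rough sketch of WRED's drop decision. Real implementations use a
# moving average of queue depth and per-class profiles; these numbers
# are made up for illustration.
MIN_THRESHOLD = 40    # below this fill level (percent), never drop
MAX_THRESHOLD = 90    # at or above this, drop everything (tail drop)
MAX_DROP_PROB = 0.5   # drop probability just below MAX_THRESHOLD

def wred_should_drop(buffer_fill_percent: float) -> bool:
    """Decide whether an arriving frame gets randomly dropped."""
    if buffer_fill_percent < MIN_THRESHOLD:
        return False    # plenty of room: everyone gets in
    if buffer_fill_percent >= MAX_THRESHOLD:
        return True     # effectively full: Thunderdome
    # In between, the drop probability ramps up with the fill level.
    ramp = (buffer_fill_percent - MIN_THRESHOLD) / (MAX_THRESHOLD - MIN_THRESHOLD)
    return random.random() < ramp * MAX_DROP_PROB

for fill in (30, 50, 70, 95):
    drops = sum(wred_should_drop(fill) for _ in range(10_000))
    print(f"{fill}% full: dropped {drops} of 10,000 frames")
```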

To a Fibre Channel administrator, this is barbaric. Fibre Channel is much more civilized, with its use of buffer-to-buffer (B2B) credits. Before a Fibre Channel frame is sent from one port to another, the sending port reserves space in the receiving port’s buffer. A Fibre Channel frame won’t get sent unless there’s guaranteed space at the receiving end. This ensures that no matter how much you oversubscribe a port, no frames will get lost. Also, when a Fibre Channel frame meets another Fibre Channel frame in a buffer, it asks for the Grey Poupon.

With Fibre Channel, if you tried to push two 8 Gbit links through a single 8 Gbit choke point, no frames would be lost, and each 8 Gbit port would end up throttled back to roughly 4 Gbit through the use of B2B credits.
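
Here’s a toy model of that credit dance, just to illustrate the mechanics. The credit counts are invented; in real life, B2B credits are negotiated at fabric login, and the receiver returns an R_RDY primitive for each buffer it frees.

```python
from collections import deque

# Toy model of Fibre Channel buffer-to-buffer (B2B) credits. The credit
# counts here are invented; real values are negotiated at fabric login.
class Port:
    def __init__(self, credits: int):
        self.credits = credits       # frames we're allowed to send
        self.rx_buffer = deque()     # frames waiting to be processed

    def send(self, frame, receiver: "Port"):
        # A frame is only ever sent when a credit (a guaranteed buffer
        # at the far end) is available. No credit, no transmission.
        assert self.credits > 0
        self.credits -= 1
        receiver.rx_buffer.append(frame)

    def process_one(self, sender: "Port"):
        # Freeing a receive buffer returns an R_RDY, restoring a credit.
        self.rx_buffer.popleft()
        sender.credits += 1

host, switch = Port(credits=8), Port(credits=8)

sent = 0
for i in range(20):                  # try to blast 20 frames
    if host.credits > 0:
        host.send(f"frame-{i}", switch)
        sent += 1
print(f"sent {sent} of 20 frames, then stalled (never dropped)")  # 8

switch.process_one(host)             # the switch drains two frames...
switch.process_one(host)
print(f"credits restored: {host.credits}")  # ...and the host may send again
```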

Why is Fibre Channel so anal-retentive? Because SCSI, that’s why. SCSI is the protocol that most enterprise servers use to communicate with storage. (I mean, there’s also SATA, but SCSI makes fun of SATA behind SATA’s back.) Fibre Channel runs the Fibre Channel Protocol (FCP), which encapsulates SCSI commands into Fibre Channel frames (as odd as it sounds, Fibre Channel and Fibre Channel Protocol are two distinct technologies). FCP is essentially SCSI over Fibre Channel.

SCSI doesn’t take kindly to dropped commands. It’s a bit of a misconception that SCSI can’t tolerate a lost command; it can, it just takes a long time to recover (relatively speaking). I’ve seen plenty of SCSI errors, and they’ll slow a system down to a crawl. So it’s best not to lose any SCSI commands.

The Converged Clusterfu… Network

We used to have separate storage and networking environments. Now we’re seeing an explosion of convergence: Putting data and storage onto the same (Ethernet) wire.

Ethernet is the obvious choice, because it’s the most popular networking technology. Port for port, Ethernet is the most inexpensive, most flexible, most widely deployed networking technology around. It has slain the FDDI dragon, put down the Token Ring rebellion, and now it has its sights set on the Fibre Channel Jabberwocky.

The two technologies currently competing for this convergence are iSCSI and FCoE. SCSI doesn’t tolerate undelivered commands very well, so both iSCSI and FCoE have ways to guarantee delivery. With iSCSI, delivery is guaranteed because iSCSI runs on TCP, the reliable Layer 4 protocol. If a lower-level frame or packet carrying a TCP segment gets lost, no big deal: TCP uses sequence numbers, which are like FedEx tracking numbers, and it can re-send a lost segment. So go ahead, WRED, do your worst.
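
If you want to see the tracking-number idea in miniature, here’s a grossly simplified sketch: number every segment, and re-send anything the receiver never acknowledged. Real TCP has windows, cumulative ACKs, and congestion control; none of that is modeled here.

```python
import random

# Grossly simplified sketch of TCP-style recovery: number every segment
# and re-send anything the receiver never acknowledged. Real TCP has
# windows, cumulative ACKs, and congestion control; this is just the
# tracking-number idea.
def lossy_link(segment):
    return None if random.random() < 0.3 else segment   # WRED does its worst

segments = {seq: f"data-{seq}" for seq in range(10)}
received = {}

while len(received) < len(segments):
    for seq in segments:
        if seq not in received:                  # unacknowledged: send again
            delivered = lossy_link(segments[seq])
            if delivered is not None:
                received[seq] = delivered        # receiver ACKs this number

print(f"all {len(received)} segments delivered, despite a 30% drop rate")
```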

FCoE provides losslessness through priority flow control (PFC), which serves the same purpose as B2B credits in Fibre Channel. Instead of reserving space in the receiving buffer, PFC keeps track of how full a particular buffer is (the one dedicated to FCoE traffic). If that FCoE buffer gets close to full, the receiving Ethernet port sends a PAUSE MAC control frame to the sending port, and the sending port stops. This is done on a port-by-port basis, so end-to-end FCoE traffic is guaranteed to arrive without dropped frames. For this to work, though, the Ethernet switches need to speak PFC, which isn’t part of the regular Ethernet standard; it’s part of the DCB (Data Center Bridging) set of standards.
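
For the curious, a PFC frame is nothing exotic: it’s a MAC control frame (EtherType 0x8808) with opcode 0x0101, a bitmask of which priorities to pause, and a 16-bit pause time (in quanta) for each priority. Here’s a sketch of building one by hand; putting FCoE on priority 3 is the common convention, not a requirement.

```python
import struct

# Sketch of an 802.1Qbb Priority Flow Control frame: a MAC control
# frame (EtherType 0x8808), opcode 0x0101, a priority-enable bitmask,
# and a 16-bit pause time for each of the 8 priorities.
PFC_DST = bytes.fromhex("0180c2000001")   # reserved MAC-control multicast
MAC_CONTROL = 0x8808
PFC_OPCODE = 0x0101

def build_pfc_frame(src_mac: bytes, pauses: dict) -> bytes:
    """pauses maps priority (0-7) -> pause time in quanta (512 bit times)."""
    enable_vector, times = 0, [0] * 8
    for priority, quanta in pauses.items():
        enable_vector |= 1 << priority
        times[priority] = quanta
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *times)
    frame = PFC_DST + src_mac + struct.pack("!H", MAC_CONTROL) + payload
    return frame + b"\x00" * max(0, 60 - len(frame))   # pad to minimum size

# "My FCoE buffer is filling up -- pause priority 3, please."
frame = build_pfc_frame(bytes.fromhex("001122334455"), {3: 0xFFFF})
print(frame.hex())
```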

Hilarity Ensues

Like the shields of the Enterprise, converged networking is in a state of flux. Network administrators and storage administrators are not very happy with the result. Network administrators don’t want storage traffic (with its silly demands for losslessness) on their data networks. Storage administrators are appalled by Ethernet and its devil-may-care attitude toward frames. They’re also not terribly fond of iSCSI, and only grudgingly accepting of FCoE. But convergence is happening, whether they like it or not.

Personally, I’m not invested in any particular technology. I’m a bit more pro-iSCSI than pro-FCoE, but I’m warming to the latter (and certainly curious about it).

But given how dyed-in-the-wool some network and storage administrators are, the biggest problems in convergence won’t be the technology, but the Layer 8 issues it generates. My take is that it’s time to think like a data center administrator, and not a storage or network administrator. That will take time, though. Until then, hilarity ensues.

The Case for FCoE Terminology

A previous post of mine (Jinkies! It’s an FCoE Mystery) talked about the need for some additional terminology in the FCoE world, specifically for three different types of FCoE deployments. It’s generated a lot of comments, some of which seem even longer than the actual post. I wanted to do a follow-up, specifically regarding my reasoning for having the topology definitions.

FCoE, as a term, is very broad: it means that you’re taking a Fibre Channel frame and encapsulating it in an Ethernet frame. That’s it. There’s only one “FCoE” as far as the encapsulation goes. However, my point is that there are a number of very different ways you can go about moving those FCoE frames across your Ethernet network.
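
To underline how simple the encapsulation itself is, here’s a simplified sketch based on my reading of FC-BB-5: an Ethernet header with EtherType 0x8906, a small FCoE header carrying a start-of-frame code, the untouched FC frame, and an end-of-frame trailer. Treat the exact field sizes and SOF/EOF code points as illustrative, and the MAC addresses as stand-in FPMA-style addresses.

```python
import struct

FCOE_ETHERTYPE = 0x8906
SOF_N3 = 0x36   # start-of-frame, normal, class 3 (code point illustrative)
EOF_N  = 0x41   # end-of-frame, normal (code point illustrative)

def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Wrap a complete FC frame (header + payload + CRC) in Ethernet."""
    ethernet_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_header = b"\x00" * 13 + bytes([SOF_N3])   # version + reserved, then SOF
    fcoe_trailer = bytes([EOF_N]) + b"\x00" * 3    # EOF, then reserved padding
    return ethernet_header + fcoe_header + fc_frame + fcoe_trailer

fc_frame = b"\x00" * 36   # stand-in for a real Fibre Channel frame
wire = encapsulate(bytes.fromhex("0efc00010203"),   # stand-in FPMA addresses
                   bytes.fromhex("0efc000a0b0c"),
                   fc_frame)
print(len(wire), "bytes on the wire (before the Ethernet FCS)")
```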

Take this scenario: You’re presented with a switch. It has a nice sticker on it that says “FCoE switch”. Now what does that tell you about how you can fit it in your network?

Attention Cisco: You’re welcome

Almost nothing.

If you said it was a data center bridging (DCB) switch, you would then know that it’s a transit switch: no FCoE frames will be encapsulated or de-encapsulated on that switch, but it supports at least PFC (priority flow control), so FCoE frames can be guaranteed lossless.

Now, if you were told the FCoE switch is an FCF switch (one with a full Fibre Channel stack), what does that tell you about how you can deploy it?

Still, almost nothing.

Take the example of a Cisco UCS 6X00 Fabric Interconnect, the brains behind Cisco’s UCS server system. They are FCoE devices, and they are Fibre Channel Forwarders (FCFs). However, you can’t do what I would consider multi-hop FCoE with them. You can connect one to a native Fibre Channel fabric, but not to an FCoE fabric. That is, you can’t set up an FCoE ISL (Inter-Switch Link; not Cisco’s old pre-802.1Q VLAN tagging, ISL means something different in Fibre Channel) to another FCoE-capable switch. This is why I added a third method to Ivan Pepelnjak’s sparse-mode (SMFCoE) and dense-mode (DMFCoE) definitions. (Note: that’s an embarrassing number of acronyms.)

So with those three distinctions (dense-mode/FCF, sparse-mode/DCB, one-hop/zero-hop), you can tell immediately how an FCoE switch can be deployed in your network. Some switches will likely support multiple modes, but most right now are limited to one.

I understand the concerns that both J Metz from Cisco and Erik Smith from EMC have about adding complexity, but I think having these three topology definitions goes a long way toward simplifying discussions of FCoE topology, and in fact removes a lot of complexity (and mystery).

This morning I attended a webinar held by the Ethernet Alliance (based near me in Beaverton, Oregon) and I was happy to hear they also make a distinction between FCF FCoE switches and non-FCF FCoE switches. It really helps simplify things in terms of deployment.

Jinkies! It’s an FCoE Mystery!

Preamble: Chances are I’m going to get something wrong in this article. Please feel free to point anything out, so long as you state the correction. You can’t just say “that’s wrong” and not say why.

One of the great mysteries of the data center right now is FCoE.

Ah, Fibre Channel over Ethernet. It promises to do away with separate data and storage networks, and run everything on a single unified fabric. The problem though is that FCoE is a bit of a mystery. It involves two very different protocols (Ethernet and Fibre Channel), it involves the interaction between the protocols, and vendors can bicker over requirements, make polar opposite statements, and both can be technically correct.

So that makes it kind of a mess. I’ve been teaching the basics of FCoE (mostly single-hop) for a bit now, and I think I’ve come across a way to simplify how we think about FCoE: realize that FCoE is implemented in three different ways.

  • Single-hop FCoE (SHFCoE)
  • Dense-mode FCoE (DMFCoE) [multi-hop]
  • Sparse-mode FCoE (SMFCoE) [multi-hop]

When we talk about FCoE in general, we should be specific about which method is being referenced. This came to me when I read Ivan Pepelnjak’s article on the two ways to implement multi-hop FCoE, although I’m also adding single-hop as a separate way to implement FCoE.

While all three ways are technically “FCoE”, they are implemented in very different manners, have very different hardware and topology requirements, and different vendors support different methods. They’re almost three completely different beasts. So let’s treat them separately, and be specific about which one we mean.

So let’s talk about FCoE.

Single Hop FCoE (SHFCoE)

This is the simplest way to implement FCoE, as it doesn’t really require any of the new data center standards on the rest of your network devices. Typically, a pair of switches is enabled for FCoE, as well as some server network/storage adapters known as CNAs (Converged Network Adapters).

In the Cisco realm, this is either a Nexus 5000 series switch or the Fabric Interconnects that are part of the Cisco UCS server system. In HP’s world, this might be part of Virtual Connect. A CNA is an Ethernet/Fibre Channel combo networking card. The server’s operating system is presented with separate native Ethernet and native Fibre Channel devices, so the OS doesn’t even know that FCoE is going on. It just thinks there’s native Ethernet and native Fibre Channel.

Oh hey, look! An actual diagram. Not just proof you were alive in the 80’s.

Ethernet frames containing FC frames are isolated onto their own FCoE VLANs. When the Ethernet frames reach the FCoE switch they are de-encapsulated and forwarded via regular Fibre Channel methods to their final destination as native Fibre Channel.

This method has been in place for a few years now, and it works (and works well). It’s pretty well understood, and there’s plenty of stick time for it. You also don’t need to do anything special on your Ethernet networks, and most of the time nothing special needs to be done on your Fibre Channel SAN (although NPV/NPIV may be needed to get the FCoE switch connected to the Fibre Channel switch). You don’t have to worry about any of the new DCB standards, such as DCBX, PFC, ETS, etc., because they only need to be on the FCoE single-hop switch, and are already there. No tweaking of those standards is typically necessary.

The Multi-Hops

There are two types of multi-hop FCoE, where FCoE goes beyond just the initial switch. J Metz from Cisco elaborated on the various definitions (and types) of multi-hop in a great blog article, but I think we can make it even simpler by saying that multi-hop means more than one FCoE switch.

Dense-Mode FCoE (DMFCoE)

With DMFCoE, an FCoE frame is received at the DMFCoE switch and de-encapsulated into a regular FC frame. The FCF (Fibre Channel Forwarder) portion of the switch makes the forwarding decision and sends the frame to the next port. At that port, the FC frame is re-encapsulated into an FCoE Ethernet frame and sent out an Ethernet port to the next hop.

With DMFCoE, each of your Ethernet switches is also a full-stack Fibre Channel switch. You’re essentially running a Fibre Channel SAN overlay on top of your Ethernet switches. Zoning, name services, FSPF, etc., are all the same as on your regular Fibre Channel network. FCoE frames are forwarded not by Ethernet bridging, but by Fibre Channel routing (FSPF), which is multi-path (so no bridging loops).
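
In rough sketch form, here’s what each dense-mode hop does per frame. All the names here are mine for illustration, not any vendor’s API, and the decapsulation is a stand-in.

```python
from dataclasses import dataclass

@dataclass
class FCFrame:
    d_id: int            # 24-bit Fibre Channel destination ID
    payload: bytes

@dataclass
class EgressPort:
    name: str
    is_ethernet: bool
    def transmit(self, fc: FCFrame):
        how = "re-encapsulated as FCoE" if self.is_ethernet else "native FC"
        print(f"{self.name}: frame to {fc.d_id:06x} sent {how}")

class DenseModeSwitch:
    def __init__(self, fspf_table):
        self.fspf_table = fspf_table   # d_id -> EgressPort, built by FSPF

    @staticmethod
    def decapsulate(fcoe_frame: bytes) -> FCFrame:
        # Stand-in: pretend the first 3 bytes are the FC destination ID.
        return FCFrame(int.from_bytes(fcoe_frame[:3], "big"), fcoe_frame[3:])

    def forward(self, fcoe_frame: bytes):
        fc = self.decapsulate(fcoe_frame)       # strip the Ethernet wrapper
        egress = self.fspf_table[fc.d_id]       # Fibre Channel routing decision
        egress.transmit(fc)                     # re-wrap only if port is Ethernet

switch = DenseModeSwitch({0x010203: EgressPort("eth1/1", True),
                          0x0a0b0c: EgressPort("fc2/1", False)})
switch.forward(bytes.fromhex("010203") + b"scsi stuff")   # next hop speaks FCoE
switch.forward(bytes.fromhex("0a0b0c") + b"scsi stuff")   # next hop is native FC
```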

The drawback is that it requires a pretty advanced switch to do it. In fact, it wasn’t until July of 2011 that Cisco had more than one switch that could even do DMFCoE (the MDS and Nexus 7000 needed NX-OS 5.2 to do DMFCoE, which wasn’t released until July).

Alternative names for dense-mode FCoE:

  • FC-Forwarded FCoE
  • DMFCoE
  • Full FCoE
  • Heavy FCoE
  • Overlay Mode

Sparse Mode FCoE (SMFCoE)

Sparse Mode FCoE (SMFCoE) is when an Ethernet network forwards FCoE frames via regular Ethernet forwarding mechanisms. Unlike DMFCoE, the Fibre Channel frame is never de-encapsulated (although it might be snooped with FIP snooping, if the switch supports it). For the most part, the Ethernet switches have little to no awareness of the Fibre Channel layers.
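
Contrast that with the dense-mode sketch above: a sparse-mode hop is conceptually just a MAC-table lookup, optionally guarded by FIP-snooped filters so FCoE frames only flow to or from a legitimate FCF. Again, the names are illustrative, not any vendor’s API.

```python
FCOE_ETHERTYPE = 0x8906

class SparseModeSwitch:
    def __init__(self):
        self.mac_table = {}       # dst MAC -> egress port (plain L2 learning)
        self.known_fcfs = set()   # FCF MACs learned by snooping FIP

    def forward(self, dst_mac: str, src_mac: str, ethertype: int):
        if ethertype == FCOE_ETHERTYPE:
            # FIP snooping: FCoE frames must involve a known FCF on one end.
            if dst_mac not in self.known_fcfs and src_mac not in self.known_fcfs:
                print("dropping rogue FCoE frame (failed FIP-snooping check)")
                return
        egress = self.mac_table.get(dst_mac, "flood")
        print(f"forwarding out {egress} -- no decapsulation, just Ethernet")

sw = SparseModeSwitch()
sw.known_fcfs.add("0e:fc:00:ff:ff:01")
sw.mac_table["0e:fc:00:ff:ff:01"] = "eth1/1"
sw.forward("0e:fc:00:ff:ff:01", "0e:fc:00:00:00:02", FCOE_ETHERTYPE)  # to an FCF
sw.forward("aa:bb:cc:dd:ee:ff", "0e:fc:00:00:00:99", FCOE_ETHERTYPE)  # dropped
```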

The benefit of SMFCoE is that it doesn’t require quite the beefiness that DMFCoE needs, as you don’t need silicon that can understand and forward FCP (Fibre Channel Protocol) traffic. You still need priority flow control and the other DCB standards, and probably DCBX (to set up the lossless FCoE CoS and so forth).

The drawback is that you’ll usually need some sort of multi-path Ethernet protocol, such as TRILL/SPB/Fabric Path as spanning-tree would likely be a disaster for a storage protocol. Since none of the potential multi-path Ethernet protocols are in wide use with the various vendors, that makes SMFCoE somewhat dead right now.

Alternative names for SMFCoE might be:

  • Ethernet-forwarded FCoE
  • FCoE light
  • Diet-FCoE

Why Differentiate?

Because it gets damn confusing otherwise. Recently Juniper and Cisco had a dustup about whether FCoE requires TRILL. Juniper posted an article on why TRILL won’t scale for data centers, and mentioned that TRILL is required for FCoE. J Metz from Cisco responded with, essentially, “no, FCoE doesn’t need TRILL”. Who’s right? Well, they both are.

Cisco has gone the DMFCoE route, so no, you don’t need TRILL (or any other multi-path Ethernet). Since Juniper is going SMFCoE, it will need some sort of multi-path (and Juniper’s article calls for QFabric to be that solution).

Whither FCoE?

So can you do multi-hop FCoE right now, either DMFCoE or SMFCoE? It would probably be wise to wait. In the Cisco realm, the code that supports DMFCoE on the Nexus 7K and MDS lines was just released in July, and the 5Ks have been able to do DMFCoE since December, I think (although I don’t know anyone that did).

Right now, I don’t know of any customers actually doing multi-hop FCoE (and I don’t know anyone who’s all that interested). SMFCoE is a moot point until more switches get multi-path Ethernet, whether that be QFabric, TRILL, SPB, or another method.