Jinkies! It’s an FCoE Mystery!

August 16, 2011

Preamble: Chances are I’m going to get something wrong in this article. Please feel free to point anything out so long as you state the correction. You can’t just say “that’s wrong” and not say why. One of the great mysteries of the data center right now is FCoE.

Ah, Fibre Channel over Ethernet. It promises to do away with separate data and storage networks, and run everything on a single unified fabric. The problem though is that FCoE is a bit of a mystery. It involves two very different protocols (Ethernet and Fibre Channel), it involves the interaction between the protocols, and vendors can bicker over requirements, make polar opposite statements, and both can be technically correct.

So that makes it kind of a mess. I’ve been teaching basics of FCoE (mostly single-hop) for a bit now, and I think I’ve come across a way to simplify perception of FCoE: Realize FCoE is implemented in three different ways.

Single-hop FCoE (SHFCoE)
Dense-mode FCoE (DMFCoE) [multi-hop]
Sparse-mode FCoE (SMFCoE) [multi-hop]

When we talk about FCoE in general, we should be talking about which specific method is being referenced. That came to me when I read Ivan Pepelnjak’s article on the two ways to implement multi-hop FCoE, although I’m also adding single-hop as a separate way to implement FCoE.

While all three ways are technically “FCoE”, they are implemented in very different manners, have very different hardware and topology requirements, and different vendors support different methods. They’re almost three completely different beasts. So let’s talk about them separately, and be specific when we talk about it.

So let’s talk about FCoE.

Single Hop FCoE (SHFCoE)

This is the simplest way to implement FCoE, as it doesn’t really require any of the new data center standards on the rest of your network devices. Typically, a pair of switches is enabled for FCoE, as well as some server network/storage adapters known as CNAs (Converged Network Adapters).

In the Cisco realm, this is either a Nexus 5000 series or Fabric Interconnects which are part of the Cisco UCS server system. In HP, this might be part of Virtual Connect. A CNA is an Ethernet/Fibre Channel combo networking card. The server’s operating system is presented with separate native Ethernet and native Fibre Channel devices, so the OS doesn’t even know that FCoE is going on. It just thinks there’s native Ethernet and native Fibre Channel.

Oh hey, look! An actual diagram. Not just proof you were alive in the 80’s.

Ethernet frames containing FC frames are isolated onto their own FCoE VLANs. When the Ethernet frames reach the FCoE switch they are de-encapsulated and forwarded via regular Fibre Channel methods to their final destination as native Fibre Channel.
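To make the encapsulation step a bit more concrete, here is a minimal Python sketch. The EtherType values (0x8100 for the 802.1Q VLAN tag, 0x8906 for FCoE) are the standard ones; everything else is an assumption for illustration only: the header layout is simplified from FC-BB-5, the start/end-of-frame codes are placeholders, VLAN 100 and priority 3 are just example values, and the function names are made up.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q VLAN tag
ETHERTYPE_FCOE = 0x8906  # FCoE data frames

def encapsulate(fc_frame: bytes, dst_mac: bytes, src_mac: bytes,
                vlan_id: int, priority: int = 3,
                sof: int = 0x2E, eof: int = 0x42) -> bytes:
    """Wrap a native FC frame in a VLAN-tagged FCoE frame (Ethernet FCS omitted).

    sof/eof are placeholder delimiter codes, not a statement about real traffic.
    """
    tci = (priority << 13) | (vlan_id & 0x0FFF)        # priority bits + VLAN ID
    eth = dst_mac + src_mac + struct.pack(
        "!HHH", ETHERTYPE_VLAN, tci, ETHERTYPE_FCOE)
    fcoe_header = bytes(13) + bytes([sof])             # version/reserved, then SOF
    fcoe_trailer = bytes([eof]) + bytes(3)             # EOF, then reserved padding
    return eth + fcoe_header + fc_frame + fcoe_trailer

def decapsulate(frame: bytes) -> bytes:
    """Return the native FC frame, as the FCoE switch would before FC forwarding."""
    tpid, _tci, ethertype = struct.unpack("!HHH", frame[12:18])
    if tpid != ETHERTYPE_VLAN or ethertype != ETHERTYPE_FCOE:
        raise ValueError("not a VLAN-tagged FCoE frame")
    return frame[18 + 14:-4]                           # strip Ethernet + FCoE wrapping
```

The point of the sketch is only that the FC frame rides inside the Ethernet frame unchanged: whatever goes into `encapsulate` comes back out of `decapsulate` byte for byte.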

This method has been in place for a few years now, and it works (and works well). It’s pretty well understood, and there’s plenty of stick time for it. You also don’t need to do anything special on your Ethernet networks, and most of the time nothing special needs to be done on your Fibre Channel SAN (although NPV/NPIV may be needed to get the FCoE switch connected to the Fibre Channel switch). You don’t have to worry about any of the new DCB standards, such as DCBX, PFC, ETS, etc., because they only need to be on the FCoE single-hop switch, and are already there. No tweaking of those standards is typically necessary.

The Multi-Hops

There are two types of multi-hop FCoE, where the FCoE goes beyond just the initial switch. J Metz from Cisco elaborated on the various definitions (and types) of multi-hop in this great blog article, but I think we can make it even simpler by saying that multi-hop means more than one FCoE switch.

Dense-Mode FCoE (DMFCoE)

With DMFCoE, an FCoE frame is received at the DMFCoE switch and de-encapsulated into a regular FC frame. The FCF (Fibre Channel Forwarder) portion of the DMFCoE switch makes the forwarding decision and sends it to the next port. At that port, the FC frame is re-encapsulated into an FCoE Ethernet frame and sent out an Ethernet port to the next hop.

With DMFCoE, each of your Ethernet switches is also a full-stack Fibre Channel switch. You’re running essentially a Fibre Channel SAN overlay on top of your Ethernet switches. Zoning, name services, FSPF, etc., are all the same as on your regular Fibre Channel network. Also, FCoE frames are routed along not by Ethernet, but by Fibre Channel routing (FSPF) which is multi-path (so no bridging loops).
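Here is a rough Python sketch of that decap, forward, re-encap cycle across two dense-mode hops. It is a toy model, not how any real switch is built: the class names, the string "MAC addresses", and the static next-hop table (standing in for what FSPF would compute) are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FcFrame:
    d_id: int       # 24-bit destination FC address (domain, area, port)
    s_id: int       # 24-bit source FC address
    payload: bytes

@dataclass(frozen=True)
class FcoeFrame:
    dst_mac: str
    src_mac: str
    inner: FcFrame  # the encapsulated native FC frame

class DenseModeSwitch:
    """Toy FCF: every hop terminates the Ethernet frame and forwards on FC info."""

    def __init__(self, mac, next_hop_mac_by_domain):
        self.mac = mac
        # Hypothetical static table; a real FCF would learn this via FSPF.
        self.next_hop_mac_by_domain = next_hop_mac_by_domain

    def forward(self, frame: FcoeFrame) -> FcoeFrame:
        fc = frame.inner                                  # 1. de-encapsulate
        domain = (fc.d_id >> 16) & 0xFF                   # 2. decide on the FC domain byte
        next_mac = self.next_hop_mac_by_domain[domain]
        return FcoeFrame(next_mac, self.mac, fc)          # 3. re-encapsulate

sw1 = DenseModeSwitch("fcf-1", {0x0B: "fcf-2"})
sw2 = DenseModeSwitch("fcf-2", {0x0B: "target-mac"})

original = FcoeFrame("fcf-1", "host-cna", FcFrame(0x0B0100, 0x0A0200, b"SCSI"))
hop1 = sw1.forward(original)
hop2 = sw2.forward(hop1)
```

Note that the Ethernet addresses change at every hop (much like IP routing rewrites MAC addresses), while the FC frame inside is carried through untouched.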

The drawback is that it requires a pretty advanced switch to do it. In fact, it wasn’t until July of 2011 that Cisco had more than one switch that could even do DMFCoE (the MDS and Nexus 7000 needed 5.2 to do DMFCoE, which wasn’t released until July).

Alternative names for dense-mode FCoE:

FC-Forwarded FCoE
DMFCoE
Full FCoE
Heavy FCoE
Overlay Mode

Sparse Mode FCoE (SMFCoE)

Sparse Mode FCoE (SMFCoE) is when an Ethernet network forwards FCoE frames via regular Ethernet forwarding mechanisms. Unlike DMFCoE, the Fibre Channel frame is not de-encapsulated (although it might be snooped with FIP snooping if the switch supports it). For the most part, the Ethernet switches have little to no awareness of the Fibre Channel layers.
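For contrast with dense mode, a sparse-mode switch can be sketched as nothing more than a learning Ethernet bridge. This is a deliberately simplified Python model (no FIP snooping, no DCB features); the class and method names are made up for illustration.

```python
class SparseModeBridge:
    """Toy Ethernet bridge: forwards on (VLAN, destination MAC), never opens the payload."""

    def __init__(self):
        self.mac_table = {}  # (vlan, mac) -> port the MAC was last seen on

    def receive(self, in_port, vlan, dst_mac, src_mac, payload):
        self.mac_table[(vlan, src_mac)] = in_port         # learn the source
        out_port = self.mac_table.get((vlan, dst_mac))    # look up the destination
        # out_port is None when the destination is unknown (a real bridge would flood).
        # The FCoE payload is handed on exactly as it arrived.
        return out_port, payload

bridge = SparseModeBridge()
first = bridge.receive(1, 100, "fcf-mac", "cna-mac", b"fcoe-frame-1")
second = bridge.receive(2, 100, "cna-mac", "fcf-mac", b"fcoe-frame-2")
```

There is no Fibre Channel logic anywhere in the forwarding path, which is both the appeal (cheaper silicon) and the complaint (no storage visibility on that hop).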

The benefit of SMFCoE is that it doesn’t require quite the beefiness that DMFCoE needs, as you don’t need silicon that can understand and forward FCP (Fibre Channel Protocol) traffic. You still need priority flow control and other DCB standards, and probably DCBx (to set up the FCoE lossless CoS and so forth).
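Since priority flow control keeps coming up, here is a toy Python sketch of the basic idea: a port has eight priorities, and a pause on one priority holds that traffic without dropping it and without stopping the other seven. The class and method names are hypothetical, and using priority 3 for FCoE is just an example value.

```python
from collections import deque

class PfcPort:
    """Toy egress port with eight priority queues and per-priority pause."""

    def __init__(self):
        self.queues = [deque() for _ in range(8)]
        self.paused = [False] * 8

    def enqueue(self, priority, frame):
        self.queues[priority].append(frame)

    def pause(self, priority):
        self.paused[priority] = True      # peer asked us to stop this priority only

    def resume(self, priority):
        self.paused[priority] = False

    def transmit(self):
        """Send one frame from each priority that is not paused; paused queues just wait."""
        sent = []
        for prio in range(7, -1, -1):
            if not self.paused[prio] and self.queues[prio]:
                sent.append(self.queues[prio].popleft())
        return sent

port = PfcPort()
port.enqueue(3, "fcoe-1")   # assumed lossless FCoE priority
port.enqueue(0, "lan-1")    # ordinary Ethernet traffic
port.pause(3)
while_paused = port.transmit()
port.resume(3)
after_resume = port.transmit()
```

While priority 3 is paused, only the LAN frame goes out; the FCoE frame is held rather than dropped and is sent once the pause is lifted.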

The drawback is that you’ll usually need some sort of multi-path Ethernet protocol, such as TRILL, SPB, or FabricPath, as spanning tree would likely be a disaster for a storage protocol. Since none of the potential multi-path Ethernet protocols are in wide use with the various vendors, that makes SMFCoE somewhat dead right now.

Alternative names for SMFCoE might be:

Ethernet-forwarded FCoE
FCoE light
Diet-FCoE

Why Differentiate?

Because it gets damn confusing otherwise. Recently Juniper and Cisco had a dustup about the requirement of TRILL for FCoE. Juniper posted an article on why TRILL won’t scale for data centers, and mentioned that TRILL is required for FCoE. J Metz from Cisco counter-responded with essentially “no, FCoE doesn’t need TRILL”. Who’s right? Well, they both are.

Cisco has gone the DMFCoE route, so no, you don’t need TRILL (or other multi-path Ethernet). Since Juniper is going SMFCoE, it will need some sort of multi-path (and its article is calling for QFabric to be that solution).

Whither FCoE?

So can you do FCoE multi-hop right now, either DMFCoE or SMFCoE? It probably would be wise to wait. In the Cisco realm, the code that supports DMFCoE was just released in July for their Nexus 7K and MDS lines, and the 5Ks could have done DMFCoE since December I think (although I don’t know anyone that did).

Right now, I don’t know of any customers actually doing multi-hop FCoE (and I don’t know anyone who’s all that interested). SMFCoE is a moot point right now until more switches can get multi-path Ethernet, whether that be QFabric, TRILL, SPB or another method.


21 Responses to Jinkies! It’s an FCoE Mystery!

Robert says:

August 17, 2011 at 10:15 am

I wanted to say great post! You did an excellent job explaining the different multi-hop technologies, this cleared a few things up for me. Keep up the good work.

J Metz (@jmichelmetz) says:

August 17, 2011 at 1:47 pm

Hi Tony,

Thanks for the props and links to articles that I’ve written. I appreciate your effort to complement my own in attempting to clarify some of the confusion surrounding FCoE.

(I also appreciate you spelling my name right – it seems difficult for many people. 🙂)

I’m still of two minds when it comes to the nomenclature which seems to have taken hold. The idea of “dense modes” and “sparse modes” seems to imply (at least to me) that these are design considerations and/or proscriptions for deployment. Nevertheless, it appears that Ivan’s terminology has taken hold. Oh well. 🙂

The issue for me has always been to illustrate some of the apparent attempts at dissembling about what is “needed” for FCoE to work beyond just the access layer. I think, to a certain extent, I’ve been relatively successful.

At first people said that FCoE was not standardized. This was untrue.
Then they said that Multihop FCoE was not standardized. This was untrue.
Then they said that you *needed* Ethernet-based forwarding to do Multihop. This was untrue.
Then they said that you *needed* Ethernet-based congestion notification to do Multihop. This was untrue.
Now they’re saying that you *need* FC-BB-6 to accomplish high availability. This is untrue as well.

Ultimately there are consequences to choices. FCoE *is* FC in its fundamental nature. Love or hate FC, it is what it is. When *I* talk about FCoE I talk about it from a storage perspective. I use the same design principles and best practices that *storage* has long required. When I talk about the difference between a multi-*hop* environment (storage-based) versus multi-*tier* (mixed Ethernet/FC-based), I make the distinction because it breaks storage best practices.

I have yet to meet *any* storage administrator who is willing to voluntarily put a “black box” in the midst of the storage fabric(s) he controls. None. And yet the “sparse mode” DCB-bridging switch model espoused by some commits this sin. This is not storage design, it’s Ethernet design.

In this way I still have an issue with the nomenclature because it risks confusing either group (Eth or FC admins) into thinking that there is some “hybrid” method that is required for them to implement FCoE, whether it be solely at the access layer or beyond.

With your permission I’d like to include one more blog for your collection, specifically related to what a true converged network is and how this “hybrid” approach is not the way that we at Cisco are trying to solve customers’ problems.

- tonybourke says:
  
  August 18, 2011 at 11:17 am
  
  >I also appreciate you spelling my name right – it seems difficult for many people.
  
  Hah, honestly, I’d gotten it wrong, and Thomas Jones pointed it out 🙂
  
  >I’m still of two minds when it comes to the nomenclature which seems to have taken hold. The idea of “dense modes” and “sparse modes” seems to imply (at least to me) that these are design considerations and/or proscriptions for deployment. Nevertheless, it appears that Ivan’s terminology has taken hold. Oh well.
  
  I didn’t realize it was Ivan’s, I’d assumed it was part of the standard, but it looks like you’re right. How about FCF-mode and bridge-mode?
  
  >At first people said that FCoE was not standardized. This was untrue.
  >Then they said that Multihop FCoE was not standardized. This was untrue.
  >Then they said that you *needed* Ethernet-based forwarding to do Multihop. This was untrue.
  
  >Then they said that you *needed* Ethernet-based congestion notification to do Multihop. This was untrue.
  
  According to Cisco, you do. In “I/O Consolidation In The Data Center”: “One of the downsides of lossless Ethernet discussed in page 19 is that, in the presence of congestion, it tends to create undesirable Head Of Line (HOL) blocking. This is because it spreads congestions across the network”.
  
  You have a great post (which also wonderfully describes DMFCoE/FCF FCoE) http://blogs.cisco.com/datacenter/the-napkin-dialogues-fcoe-vs-qcn/ which shows that you *can’t* do QCN with DMFCoE. Given Cisco’s own book, I haven’t heard why not being able to do QCN wouldn’t be a problem.
  
  So if a vendor says that you need X to do FCoE, they might be thinking of one way, not another. It’s this kind of confusion that really hampers adoption of FCoE. And it’s not FUD, it’s genuine confusion.
  
  So that’s one of the reasons why I think they should be discussed as separate implementations. They involve somewhat different technologies under the hood as well as dramatically different deployment considerations.
  
  >I have yet to meet *any* storage administrator who is willing to voluntarily put a “black box” in the midst of the storage fabric(s) he controls. None. And yet the “sparse mode” DCB-bridging switch model espoused by some commits this sin. This is not storage design, it’s Ethernet design.
  
  Even with DMFCoE, you’re still putting storage on devices that, at least in part, are not controlled or understood by storage people. You’re still doing some Ethernet design with DMFCoE. While DMFCoE is closer to regular FC than SMFCoE, it’s still Ethernet.
  
  That’s somewhat mitigated in SHFCoE, since the deployment is very simple compared to multi-switch designs and considerations.
  
  >In this way I still have an issue with the nomenclature because it risks confusing either group (Eth or FC admins) into thinking that there is some “hybrid” method that is required for them to implement FCoE, whether it be solely at the access layer or beyond.
  
  Other vendors are planning on some sort of SMFCoE, so if you’re not a fan of SMFCoE, then I think Cisco should make the case that DMFCoE is better than SMFCoE. But to prevent confusion when a vendor says X and Cisco says Y, be specific about the mode.
  
  >With your permission I’d like to include one more blog for your collection, specifically related to what a true converged network is and how this “hybrid” approach is not the way that we at Cisco are trying to solve customers’ problems.
  
  I’ll add it to the article 🙂
  
  -Tony
  
J Metz (@jmichelmetz) says:

August 19, 2011 at 10:55 am

Hey Tony,

Sorry it took a little while to get back to you, but I wanted to make sure I addressed some of your questions.

“According to Cisco, you do [need congestion notification]. In “I/O Consolidation In The Data Center”: “One of the downsides of lossless Ethernet discussed in page 19 is that, in the presence of congestion, it tends to create undesirable Head Of Line (HOL) blocking. This is because it spreads congestions across the network”.”

I’m afraid this is one of the most misunderstood phrases in the book. In this section the conversation is a general discussion about “is lossless better?”. I asked Claudio DeSanti (the author) to explain, since it seemed better to get it from him than have my interpretation. 🙂 Here’s what he said:

“In that generic context the potential drawbacks of lossless ethernet are shown, and potential HOL is one of the drawbacks and QCN is mentioned there as a possible way to mitigate the problem (e.g., QCN is useful if you have a congested layer 2 only network). However, this generic discussion has nothing to do with the specific case of FCoE, which is a layer 3 protocol, and as such goes outside the realm of operation of QCN.

So, to summarize:

a) QCN is a layer 2 technology intended to mitigate long-lived congestion in a layer 2 network, and it is useful in that context;

b) FCoE is a layer 3 protocol and as such it goes beyond the boundaries of QCN domains, therefore QCN is useless for FCoE.”

I hope that helps put the phrase in the book into better perspective.

You also write:

“So if a vendor says that you need X to do FCoE, they might be thinking of one way, not another. It’s this kind of confusion that really hampers adoption of FCoE. And it’s not FUD, it’s genuine confusion.”

I agree here. You’ll notice that I rarely, if ever, use the phrase FUD, and only in the context of some of the vendor blogs where they really should know better. I believe that the confusion lies in the fact that people don’t know what they don’t know. In other words, they know the “other side” has ways of doing things that will affect them, but they don’t know by how much. They’re relying on us (people like you and me) to be able to turn it into plain English so that they can at least take the first steps of knowing the right questions to ask.

“Even with DMFCoE, you’re still putting storage on devices that, at least in part, are not controlled or understood by storage people. You’re still doing some Ethernet design with DMFCoE. While DMFCoE is closer to regular FC than SMFCoE, it’s still Ethernet.”

I agree with your first part, but vehemently disagree with your second. Now, I can only speak for Cisco products (and wouldn’t dream of embarrassing myself by talking about the innards of other vendors’) but the fact that the Ethernet-based switches that run storage use the exact same commands and OS as the FC-based switches (with some minor exceptions) means that storage admins see no difference in the way they do their zoning, FLOGI, etc.

As far as the operational/management side of it, I’ve always been a fan and an advocate of the human element. (See my post on the SLAM Team before I joined Cisco 🙂 )

“Other vendors are planning on some sort of SMFCoE, so if you’re not a fan of SMFCoE, then I think Cisco should make the case that DMFCoE is better than SMFCoE. But to prevent confusion when a vendor says X and Cisco says Y, be specific about the mode.”

I agree. I feel I’m fighting a losing battle when it comes to nomenclature, but I can live with that, I suppose. 🙂 To me, the first step is simple: Do you want to maintain storage best practices and design principles? If yes, then you need to have visibility everywhere you intend to have storage. If people want to call this “dense mode,” then so be it. (Just, dear god, do not call it “overlay mode!” There’s nothing “overlay” about it.)

If IT teams want to have an Ethernet-based system, that’s fine too. There are times when this is appropriate and even necessary. But the reasons for doing this should be clear for both the IT teams and the vendors.

Basically, place visibility into storage where you want visibility into storage. That’s where an FCF belongs. If you do it on one switch, great. Two switches, fine. Whatever you call it should not override or confuse the logic behind doing it. 🙂

- tonybourke says:
  
  August 20, 2011 at 12:31 am
  
  >I’m afraid this is one of the most misunderstood phrases in the book. In this section the conversation is a general discussion about “is lossless better?”. I asked Claudio DeSanti (the author) to explain, since it seemed better to get it from him than have my interpretation. Here’s what he said:
  
  >“In that generic context the potential drawbacks of lossless ethernet are shown, and potential HOL is one of the drawbacks and QCN is mentioned there as a possible way to mitigate the problem (e.g., QCN is useful if you have a congested layer 2 only network). However, this generic discussion has nothing to do with the specific case of FCoE, which is a layer 3 protocol, and as such goes outside the realm of operation of QCN.
  
  >So, to summarize:
  
  >a) QCN is a layer 2 technology intended to mitigate long-lived congestion in a layer 2 network, and it is useful in that context;
  
  >b) FCoE is a layer 3 protocol and as such it goes beyond the boundaries of QCN domains, therefore QCN is useless for FCoE.”
  
  >I hope that helps put the phrase in the book into better perspective.
  
  I realize that QCN is Layer 2, and wouldn’t work for DMFCoE since it’s Layer 3 (explained beautifully by your post), but I’m not clear on what mechanism in FCP would negate the HOL problem described.
  
  Thinking about it a bit more, I think PFC would negate HOL blocking for other QoS values, but anything granted as lossless CoS could be a problem.
  
  >“So if a vendor says that you need X to do FCoE, they might be thinking of one way, not another. It’s this kind of confusion that really hampers adoption of FCoE. And it’s not FUD, it’s genuine confusion.”
  
  >I agree here. You’ll notice that I rarely, if ever, use the phrase FUD, and only in the context of some of the vendor blogs where they really should know better. I believe that the confusion lies in the fact that people don’t know what they don’t know. In other words, they know the “other side” has ways of doing things that will affect them, but they don’t know by how much. They’re relying on us (people like you and me) to be able to turn it into plain English so that they can at least take the first steps of knowing the right questions to ask.
  
  A lot of mistakes are going to be made in terms of assumptions and predictions, part of why I’m starting this conversation 🙂
  
  >“Even with DMFCoE, you’re still putting storage on devices that, at least in part, are not controlled or understood by storage people. You’re still doing some Ethernet design with DMFCoE. While DMFCoE is closer to regular FC than SMFCoE, it’s still Ethernet.”
  
  >I agree with your first part, but vehemently disagree with your second. Now, I can only speak for Cisco products (and wouldn’t dream of embarrassing myself by talking about the innards of other vendors’) but the fact that the Ethernet-based switches that run storage use the exact same commands and OS as the FC-based switches (with some minor exceptions) means that storage admins see no difference in the way they do their zoning, FLOGI, etc.
  
  >As far as the operational/management side of it, I’ve always been a fan and an advocate of the human element. (See my post on the SLAM Team before I joined Cisco.)
  
  I agree with your SLAM post (hence my creation of the new role of data center overlord). I suspect that it won’t be a storage team and a network team administering an FCoE network, even if it’s possible to separate out the tasks. To a storage admin, the commands may be the same, but the transport would be alien to them unless they brushed up on their Ethernet, and vice versa.
  
  >“Other vendors are planning on some sort of SMFCoE, so if you’re not a fan of SMFCoE, then I think Cisco should make the case that DMFCoE is better than SMFCoE. But to prevent confusion when a vendor says X and Cisco says Y, be specific about the mode.”
  
  >I agree. I feel I’m fighting a losing battle when it comes to nomenclature, but I can live with that, I suppose. To me, the first step is simple: Do you want to maintain storage best practices and design principles? If yes, then you need to have visibility everywhere you intend to have storage. If people want to call this “dense mode,” then so be it. (Just, dear god, do not call it “overlay mode!” There’s nothing “overlay” about it.)
  
  “The only way is to overlay” – Cisco’s LISP PM 🙂 If you think about it, it is somewhat of an overlay. You’re putting FC on Ethernet, a full FCF mesh in DMFCoE, overlaid on an Ethernet network.
  
  Overlay has some negative connotations, and we’ve been taught that it’s bad to tunnel for a long time. But we do it everywhere. MPLS, VPN, GRE, FCoE, LISP. He’s right, overlay is the only way. I think we need to come to terms with the fact that it’s an overlay world.
  
  - J Metz (@jmichelmetz) says:
    
    August 22, 2011 at 9:23 am
    
    “I realize that QCN is Layer 2, and wouldn’t work for DMFCoE since it’s Layer 3 (explained beautifully by your post), but I’m not clear on what mechanism in FCP would negate the HOL problem described. Thinking about it a bit more, I think PFC would negate HOL blocking for other QoS values, but anything granted as lossless CoS could be a problem.”
    
    There are two elements here that really do need to be cleared up: One has to do with HOL blocking (which takes us a bit outside the scope of your blog article), and the other has to do with PFC.
    
    Re: HOL blocking – an undesirable situation, sure, which is one of the reasons why FCC was created in the first place. (FCC, as you may know, was an inspiration for QCN when the technology was in its embryonic form.) The issue with FCC was that people never used it. The buffer-to-buffer credit system was perfectly capable of handling sequential frames on a link-by-link basis, and the SCSI CRC handled the retransmission concerns. The benefit of FCC was apparently not worth the processing power to keep it in practice for FC.
    
    In other words, FC works fine without it currently.
    
    The second issue is a bit more important. The statement that “PFC would negate HOL blocking for other queues” (emphasis added) is simply incorrect. PFC, also known as per-priority pause, affects only one priority. Priorities (somewhat of a misnomer as they do not reflect importance of traffic queues) are non-hierarchical. That is, what happens in one does not affect the others.
    
    I tend to visualize them as an 8 lane highway divided by barriers, and PFC acts as a stoplight on one of those lanes. PFC stops only that one lane, and all the other lanes are free to ignore it.
    
    “If you think about it, it is somewhat of an overlay. You’re putting FC on Ethernet, a full FCF mesh in DMFCoE, overlaid on an Ethernet network.”
    
    I think this is going to cause far more confusion than it’s worth. By conflating OSI layers as being “overlaid” on top of the physical media you have wound up eliminating the reason why they were separated in the first place.
    
    The key is in your next statement:
    
    “Overlay has some negative connotations, and we’ve been taught that it’s bad to tunnel for a long time.”
    
    That’s precisely the point. There is nothing in FCoE that requires a “tunnel.” Does IP “tunnel” through MAC addresses? No. So why should we think FCoE does when it doesn’t?
J Metz (@jmichelmetz) says:

August 19, 2011 at 10:56 am

Damn. Can’t edit a missing html tag. Sorry!

- tonybourke says:
  
  August 20, 2011 at 12:20 am
  
  Fixed it 🙂
  
Erik Smith says:

August 19, 2011 at 12:58 pm

Hi Tony, I agree that FCoE can function over a wide range of topologies.

I also agree there is no common FCoE taxonomy and that this is something that should be addressed.

That having been said, I have to slightly disagree with the idea that there are three different implementations of FCoE (and it could just be a problem on my part with the term implementation…). “At the end of the day”, with today’s version of FCoE (FC-BB-5), there’s really only one way to use FCoE.

In all cases, FCoE frames that originate from a CNA traverse an Ethernet network and are received by an FCF. Once the FCF receives the frame, it decides how to forward the frame based on the Fibre Channel D_ID and S_ID (taking into account FC zoning) and then it will either:
a) forward the frame to another switch/FCF via FC or FCoE (if the Destination Domain ID does not equal the local domain ID); or
b) forward the frame directly to the destination (e.g., the CNA attached to the storage port if the Destination Domain ID matches the local Domain ID).
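The decision described in (a) and (b) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: zoning is reduced to a set of permitted (S_ID, D_ID) pairs, the port names are invented, and the function and table names are hypothetical. The one hard fact relied on is that a 24-bit FC address carries the Domain ID in its top byte.

```python
def fcf_forward(s_id, d_id, local_domain, attached_ports, isl_ports, zones):
    """Toy version of an FCF's forwarding decision for one frame.

    attached_ports: D_ID -> local port, for devices logged in to this FCF
    isl_ports:      Domain ID -> port leading toward that domain (FC or FCoE ISL)
    zones:          set of (S_ID, D_ID) pairs allowed to talk (simplified zoning)
    """
    if (s_id, d_id) not in zones:
        return ("drop", None)                       # zoning does not permit this pair
    dest_domain = (d_id >> 16) & 0xFF               # top byte of the 24-bit FC address
    if dest_domain != local_domain:
        return ("to_fcf", isl_ports[dest_domain])   # case (a): hand off to another FCF
    return ("to_node", attached_ports[d_id])        # case (b): deliver locally

zones = {(0x0A0001, 0x0A0002), (0x0A0001, 0x0B0001)}
attached = {0x0A0002: "eth1/5"}
isls = {0x0B: "eth1/20"}

local = fcf_forward(0x0A0001, 0x0A0002, 0x0A, attached, isls, zones)
remote = fcf_forward(0x0A0001, 0x0B0001, 0x0A, attached, isls, zones)
blocked = fcf_forward(0x0A0001, 0x0C0001, 0x0A, attached, isls, zones)
```

Nothing in this logic depends on what the Ethernet network underneath looks like, which is the point being made about the topology being orthogonal to the protocol.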

The Ethernet network between the CNA and FCF or between two FCFs can be as simple as a single twinax cable or as complex as several FIP Snooping Bridges that are making use of TRILL. In any case, the topology or Ethernet features in use between the CNA and FCF or between two FCFs are orthogonal to the FCoE protocol.

Personally, when I’m talking about FCoE, I break things down into supported topologies.
1. A CNA directly connected to an FCF (I call this “zero-hop” due to topology #4 below)
2. A CNA connected to a FIP Snooping bridge and then to an FCF (I call this a “FIP Snooping Bridge” topology)
3. FCFs connected together via FCoE ISLs (I refer to as “multi-hop”)
4. Very soon we will be talking about directly connecting ENodes to other ENodes using either PT2PT or VN2VN (see FC-BB-6). I refer to this as a “direct connect” topology but I’ll probably also refer to it based on the protocol in use, either “Point to Point” or “VN to VN”.

The really challenging part is figuring out which vendor supports which topology. I took a stab at documenting this here ( http://brasstacksblog.typepad.com/brass-tacks/2011/06/fcfcoe-connectivity-options-as-of-6272011.html )

If someone has better terminology, I’d gladly use it. The terms Dense mode and Sparse mode seem to make things a bit more confusing rather than less (although I’m still a huge fan of Ivan’s 🙂 )

- J Metz (@jmichelmetz) says:
  
  August 22, 2011 at 10:13 am
  
  Yeah…. what Erik said. 😛
  
- tonybourke says:
  
  August 22, 2011 at 6:33 pm
  
  >That having been said, I have to slightly disagree with the idea that there are three different implementations of FCoE (and it could just be a problem on my part with the term implementation…). “At the end of the day”, with today’s version of FCoE (FC-BB-5), there’s really only one way to use FCoE.
  
  >In all cases, FCoE frames that originate from a CNA traverse an Ethernet network and are received by an FCF. Once the FCF receives the frame, it decides how to forward the frame based on the Fibre Channel D_ID and S_ID (taking into account FC zoning) and then it will either:
  >a) forward the frame to another switch/FCF via FC or FCoE (if the Destination Domain ID does not equal the local domain ID); or
  >b) forward the frame directly to the destination (e.g., the CNA attached to the storage port if the Destination Domain ID matches the local Domain ID).
  
  Your description above is an example of a “hybrid” of SMFCoE and DMFCoE. A mixture of Ethernet switches that are FCFs and ones that aren’t.
  
  >The Ethernet network between the CNA and FCF or between two FCFs can be as simple as a single twinax cable or as complex as several FIP Snooping Bridges that are making use of TRILL. In any case, the topology or ethernet features in use between the CNA and FCF or between two FCFs is orthogonal to the FCoE protocol.
  
  >Personally, when I’m talking about FCoE, I break things down into supported topologies.
  >1. A CNA directly connected to an FCF (I call this “zero-hop” due to topology #4 below)
  
  I like that term, “zero hop” might be a better way to describe something like the Cisco UCS or Nexus 5000 (and possibly HP’s Virtual Connect).
  
  >2. A CNA connected to a FIP Snooping bridge and then to an FCF (I call this a “FIP Snooping Bridge” topology)
  
  I would call this SMFCoE. I think we’re essentially saying the same thing. We’re not decapping the FC, just forwarding based on (enhanced) Ethernet forwarding methods, with some look into the FCoE control plane (FIP). It doesn’t expressly require Layer 2 multi-pathing, but any STP-based network would suck. You could do this with A/B path separation (which I think is what the Nexus 4000s do).
  
  >3. FCFs connected together via FCoE ISLs (I refer to as “multi-hop”)
  
  What I would call DMFCoE. Every switch has a full FCF stack. Every switch decaps at the port, reencaps before sending it to another Ethernet port. In addition, it would be the border between regular FC and FCoE.
  
  >4. Very soon we will be talking about directly connecting ENodes to other ENodes using either PT2PT or VN2VN (see FC-BB-6). I refer to this as a “direct connect” topology but I’ll probably also refer to it based on the protocol in use, either “Point to Point” or “VN to VN”.
  
  Would this require a DMFCoE network? Or two devices on a lossless Ethernet network that could communicate to each other (so FCoE decap only happens at the N_Ports)?
  
  >The really challenging part is figuring out which vendor supports which topology.
  
  I think the trick to that is having a good consistent terminology. You’ve mentioned a lot of different ways FCoE can traverse a network. I think vendors have taken a pretty divergent path in terms of which ones they support. For those of us that have to figure out how it works, we need a way to describe it to each other. In load balancing we had similar issues, and terminology helped out. You can put a load balancer in a network in one-armed mode, direct-server return (hardly used anymore), routed mode, or bridged mode. It’s pretty simple terminology that all the vendors use (more or less) and makes things a lot easier for network admins.
  
  I took a stab at documenting this here ( http://brasstacksblog.typepad.com/brass-tacks/2011/06/fcfcoe-connectivity-options-as-of-6272011.html )
  
  If someone has better terminology, I’d gladly use it. The terms Dense mode and Sparse mode seem to make things a bit more confusing rather than less (although I’m still a huge fan of Ivan’s )
  
  Sparse and dense I believe come from the multicast world, which is a dreadful world I never want to visit. I think over all, a huge distinction in FCoE would be:
  
  * Does the switch decap the FC frame out of the Ethernet frame?
  * Or, is it set up so that I don’t really care (single or zero hop).
  
  That’s a critical distinction for those responsible for setting up the FC and FCoE networks.
  
  Take Cisco UCS for example: Even though the Fabric Interconnect is a FCF, I treat it differently than a DMFCoE device, because while there is PFC, ETS, and some other goodies in there, none of it is configured by the UCS administrator. The blades themselves see a regular FC interface, and zoning is done on the FC network they plug into.
  
  Right now, multi-hop isn’t supported on Cisco UCS, and I don’t think it’s going to be supported on the next release (UCSM 2.0) either. So while it’s an FCoE switch, how you could implement it is very different depending on the supported modes.
  
  
  Reply
  - Erik Smith says:
    
    August 23, 2011 at 6:30 am
    
    Tony, I’ve provided some specific responses below. Before I get to those, I’d like to point out that there is a similar situation with native FC today.
    Today if I want to connect two sites over distance via an FC ISL, I can do so via several methods. A couple of examples that come to mind are via dark-fiber or via DWDM. When we discuss these links, we do not call the dark-fiber “Dense Mode” and the DWDM case “Sparse Mode”, we call them ISLs and then talk about the connectivity used to create that ISL (are they “directly connected” via a dark fiber or are they using some kind of virtual link that is created from multiple sub-components). The point is, in both cases, the protocol used doesn’t change. In the multicast space, Dense Mode and Sparse Mode appear to be different protocols.
    Responses to your points are included below:
    Your description above is an example of a “hybrid” of SMFCoE and DMFCoE. A mixture of Ethernet switches that are FCFs and ones that aren’t.
    Right, but my issue is that there are not two different modes of operation. In both the “SMFCoE” and “DMFCoE” cases, the exact same protocol is used. If such a distinction is necessary, perhaps “Sparse Topology FCoE” and “Dense Topology FCoE” would be better, but to be clear, I am not advocating the use of these terms.
    2. A CNA connected to a FIP Snooping bridge and then to an FCF (I call this a “FIP Snooping Bridge” topology)
    
    I would call this SMFCoE. I think we’re essentially saying the same thing. We’re not decapping the FC, just forwarding based on (enhanced) Ethernet forwarding methods, with some look into the FCoE control plane (FIP). It doesn’t expressly require Layer 2 multi-pathing, but any STP-based network would suck. You could do this with A/B path separation (which I think is what the Nexus 4000s do).
    I agree we are really close on this one as well. I’m just concerned that some readers will interpret “just forwarding based on (enhanced) Ethernet forwarding methods” to mean that Frames are sent directly from one FCoE end device (ENode) to another ENode without needing to transit the FCF and this is not the case. All FCoE Frames that are transmitted from an ENode have the FCF-MAC as the DA and the FPMA as the SA.
    3. FCFs connected together via FCoE ISLs (I refer to as “multi-hop”)
    What I would call DMFCoE. Every switch has a full FCF stack. Every switch decaps at the port, reencaps before sending it to another Ethernet port. In addition, it would be the border between regular FC and FCoE.
    OK I see what you are saying here but consider the fact that from an FCoE point of view, a SMFCoE topology is indistinguishable from a physical cable between the ENode and the FCF. This is why I’m having such heartburn with the usage of “mode”.
    4. Very soon we will be talking about directly connecting ENodes to other ENodes using either PT2PT or VN2VN (see FC-BB-6). I refer to this as a “direct connect” topology but I’ll probably also refer to it based on the protocol in use, either “Point to Point” or “VN to VN”.
    Would this require a DMFCoE network? Or two devices on a lossless Ethernet network that could communicate to each other (so FCoE decap only happens at the N_Ports)?
    Both PT2PT and VN2VN allow for FCoE ENodes to be directly connected to each other with a physical cable. See (http://brasstacksblog.typepad.com/brass-tacks/2011/03/pt2pt-and-vn2vn-directly-connecting-an-fcoe-initiator-to-an-fcoe-target-part-1-overview.html) for more information.
    The really challenging part is figuring out which vendor supports which topology.
    I think the trick to that is having good, consistent terminology. You’ve mentioned a lot of different ways FCoE can traverse a network. I think vendors have taken a pretty divergent path in terms of which ones they support. For those of us that have to figure out how it works, we need a way to describe it to each other. In load balancing we had similar issues, and terminology helped out. You can put a load balancer in a network in one-armed mode, direct-server return (hardly used anymore), routed mode, or bridged mode. It’s pretty simple terminology that all the vendors use (more or less) and makes things a lot easier for network admins.
    OK, we agree that consistent terminology is needed.
    I took a stab at documenting this here ( http://brasstacksblog.typepad.com/brass-tacks/2011/06/fcfcoe-connectivity-options-as-of-6272011.html )
    If someone has better terminology, I’d gladly use it. The terms Dense mode and Sparse mode seem to make things a bit more confusing rather than less (although I’m still a huge fan of Ivan’s )
    Sparse and dense I believe come from the multicast world, which is a dreadful world I never want to visit. I think over all, a huge distinction in FCoE would be:
    * Does the switch decap the FC frame out of the Ethernet frame?
    * Or, is it set up so that I don’t really care (single or zero hop).
    That’s a critical distinction for those responsible for setting up the FC and FCoE networks.
    I agree the physical connectivity between FCoE devices needs to be taken into account when configuring FCoE.
    Take Cisco UCS for example: Even though the Fabric Interconnect is a FCF, I treat it differently than a DMFCoE device, because while there is PFC, ETS, and some other goodies in there, none of it is configured by the UCS administrator. The blades themselves see a regular FC interface, and zoning is done on the FC network they plug into.
    The provisioning process you are describing is the same for UCS and a host that is directly attached to an FCF. The administrator will still see two different functions (network and FC).
    Right now, multi-hop isn’t supported on Cisco UCS, and I don’t think it’s going to be supported on the next release (UCSM 2.0) either. So while it’s an FCoE switch, how you could implement it is very different depending on the supported modes.
    Right, the uplink between the 6100 and the fabric can either be NPV or FC-SW. In the case of FC-SW, the ISLs between the 6100s and the fabric just happen to use physical FC. I don’t know when Cisco plans to support FCoE ISLs from the 6100 to another Cisco FCF, but it shouldn’t make a difference to the FCoE protocol once it is supported.
  - Erik Smith says:
    
    August 23, 2011 at 11:03 am
    
    Apologies for the double reply. I just realized that the tags I used in my previous response were removed, probably because I used less-than and greater-than signs as delimiters. Here’s another shot:
    
    Tony, I’ve provided some specific responses below. Before I get to those, I’d like to point out that there is a similar situation with native FC today.
    
    Today if I want to connect two sites over distance via an FC ISL, I can do so via several methods. A couple of examples that come to mind are via dark-fiber or via DWDM. When we discuss these links, we do not call the dark-fiber “Dense Mode” and the DWDM case “Sparse Mode”, we call them ISLs and then talk about the connectivity used to create that ISL (are they “directly connected” via a dark fiber or are they using some kind of virtual link that is created from multiple sub-components). The point is, in both cases, the protocol used doesn’t change. In the multicast space, Dense Mode and Sparse Mode appear to be different protocols.
    
    Responses to your points are included below:
    
    Your description above is an example of a “hybrid” of SMFCoE and DMFCoE. A mixture of Ethernet switches that are FCFs and ones that aren’t.
    
    ErikS: Right, but my issue is that there are not two different modes of operation. In both the “SMFCoE” and “DMFCoE” cases, the exact same protocol is used. If such a distinction is necessary, perhaps “Sparse Topology FCoE” and “Dense Topology FCoE” would be better, but to be clear, I am not advocating the use of these terms.
    
    2. A CNA connected to a FIP Snooping bridge and then to an FCF (I call this a “FIP Snooping Bridge” topology)
    
    I would call this SMFCoE. I think we’re essentially saying the same thing. We’re not decapping the FC, just forwarding based on (enhanced) Ethernet forwarding methods, with some look into the FCoE control plane (FIP). It doesn’t expressly require Layer 2 multi-pathing, but any STP-based network would suck. You could do this with A/B path separation (which I think is what the Nexus 4000s do).
    
    ErikS: I agree we are really close on this one as well. I’m just concerned that some readers will interpret “just forwarding based on (enhanced) Ethernet forwarding methods” to mean that Frames are sent directly from one FCoE end device (ENode) to another ENode without needing to transit the FCF and this is not the case. All FCoE Frames that are transmitted from an ENode have the FCF-MAC as the DA and the FPMA as the SA.
    
    3. FCFs connected together via FCoE ISLs (I refer to as “multi-hop”)
    
    What I would call DMFCoE. Every switch has a full FCF stack. Every switch decaps at the port, reencaps before sending it to another Ethernet port. In addition, it would be the border between regular FC and FCoE.
    
    ErikS: OK I see what you are saying here but consider the fact that from an FCoE point of view, a SMFCoE topology is indistinguishable from a physical cable between the ENode and the FCF. This is why I’m having such heartburn with the usage of “mode”.
    
    4. Very soon we will be talking about directly connecting ENodes to other ENodes using either PT2PT or VN2VN (see FC-BB-6). I refer to this as a “direct connect” topology but I’ll probably also refer to it based on the protocol in use, either “Point to Point” or “VN to VN”.
    
    Would this require a DMFCoE network? Or two devices on a lossless Ethernet network that could communicate to each other (so FCoE decap only happens at the N_Ports)?
    
    ErikS: Both PT2PT and VN2VN allow for FCoE ENodes to be directly connected to each other with a physical cable. See (http://brasstacksblog.typepad.com/brass-tacks/2011/03/pt2pt-and-vn2vn-directly-connecting-an-fcoe-initiator-to-an-fcoe-target-part-1-overview.html) for more information.
    
    The really challenging part is figuring out which vendor supports which topology.
    
    I think the trick to that is having good, consistent terminology. You’ve mentioned a lot of different ways FCoE can traverse a network. I think vendors have taken a pretty divergent path in terms of which ones they support. For those of us that have to figure out how it works, we need a way to describe it to each other. In load balancing we had similar issues, and terminology helped out. You can put a load balancer in a network in one-armed mode, direct-server return (hardly used anymore), routed mode, or bridged mode. It’s pretty simple terminology that all the vendors use (more or less) and makes things a lot easier for network admins.
    
    ErikS: OK, we agree that consistent terminology is needed.
    
    I took a stab at documenting this here ( http://brasstacksblog.typepad.com/brass-tacks/2011/06/fcfcoe-connectivity-options-as-of-6272011.html )
    If someone has better terminology, I’d gladly use it. The terms Dense mode and Sparse mode seem to make things a bit more confusing rather than less (although I’m still a huge fan of Ivan’s )
    
    Sparse and dense I believe come from the multicast world, which is a dreadful world I never want to visit. I think over all, a huge distinction in FCoE would be:
    * Does the switch decap the FC frame out of the Ethernet frame?
    * Or, is it set up so that I don’t really care (single or zero hop).
    That’s a critical distinction for those responsible for setting up the FC and FCoE networks.
    
    ErikS: I agree the physical connectivity between FCoE devices needs to be taken into account when configuring FCoE.
    
    Take Cisco UCS for example: Even though the Fabric Interconnect is a FCF, I treat it differently than a DMFCoE device, because while there is PFC, ETS, and some other goodies in there, none of it is configured by the UCS administrator. The blades themselves see a regular FC interface, and zoning is done on the FC network they plug into.
    
    ErikS: The provisioning process you are describing is the same for UCS and a host that is directly attached to an FCF. The administrator will still see two different functions (network and FC).
    
    Right now, multi-hop isn’t supported on Cisco UCS, and I don’t think it’s going to be supported on the next release (UCSM 2.0) either. So while it’s an FCoE switch, how you could implement it is very different depending on the supported modes.
    
    ErikS: Right, the uplink between the 6100 and the fabric can either be NPV or FC-SW. In the case of FC-SW, the ISLs between the 6100s and the fabric just happen to use physical FC. I don’t know when Cisco plans to support FCoE ISLs from the 6100 to another Cisco FCF, but it shouldn’t make a difference to the FCoE protocol once it is supported.
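Erik's addressing point above (every FCoE frame sent by an ENode carries the FCF-MAC as the destination and the FPMA as the source) can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes the FPMA is the 24-bit FC-MAP prefix followed by the 24-bit FC_ID, that the FC-MAP is the FC-BB-5 default of 0E-FC-00, and that the FCoE EtherType is 0x8906. None of those values appear in the thread itself, and the function names and the example FCF-MAC are made up.

```python
# Assumed constants (not stated in the thread): FC-BB-5 default FC-MAP and FCoE EtherType.
DEFAULT_FC_MAP = 0x0EFC00
FCOE_ETHERTYPE = 0x8906


def fpma(fc_id: int, fc_map: int = DEFAULT_FC_MAP) -> str:
    """Hypothetical helper: build an FPMA as FC-MAP (24 bits) followed by FC_ID (24 bits)."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("FC_ID must fit in 24 bits")
    if not 0 <= fc_map <= 0xFFFFFF:
        raise ValueError("FC-MAP must fit in 24 bits")
    value = (fc_map << 24) | fc_id
    # Format the 48-bit value as six colon-separated bytes, most significant first.
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def enode_frame_addresses(fc_id: int, fcf_mac: str) -> dict:
    """Hypothetical helper: Ethernet addressing for a frame sent by an ENode.

    The destination is always the FCF, even when a FIP snooping bridge
    sits in between; the source is the ENode's FPMA.
    """
    return {
        "dst": fcf_mac.lower(),
        "src": fpma(fc_id),
        "ethertype": FCOE_ETHERTYPE,
    }


if __name__ == "__main__":
    # The FCF-MAC below is an invented example value.
    print(enode_frame_addresses(0x010203, "00:0D:EC:AA:BB:CC"))
```

The destination never being another ENode's MAC is the reason a FIP snooping bridge in the path looks, from the FCoE point of view, like a cable to the FCF.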
J Metz (@jmichelmetz) says:

August 23, 2011 at 10:49 am

Tony, I think Erik nailed it on the head with his points, but there’s one thing that I would like to call back to:

Erik: 4. Very soon we will be talking about directly connecting ENodes to other ENodes using either PT2PT or VN2VN (see FC-BB-6). I refer to this as a “direct connect” topology but I’ll probably also refer to it based on the protocol in use, either “Point to Point” or “VN to VN”.

Tony: Would this require a DMFCoE network? Or two devices on a lossless Ethernet network that could communicate to each other (so FCoE decap only happens at the N_Ports)?

I believe this question illustrates the exact reason why I have issues with the nomenclature. It brings up questions like this which are, IMHO, attempting to shoehorn the way the technology works into a very small and ill-fitting metaphor.

When we start talking about FC-BB-6, which goes into different ways of obtaining and forwarding frames with something called an FDF, the underlying principles of the way FCoE works do not change. But what would the upshot be? It would be to have to come up with Yet Another Acronym to try to explain it. I don’t understand how that wouldn’t be adding more confusion into the mix, unfortunately, making it far more complex than it needs to be.

Reply
- tonybourke says:
  
  August 24, 2011 at 7:31 pm
  
  I believe this question illustrates the exact reason why I have issues with the nomenclature. It brings up questions like this which are, IMHO, attempting to shoehorn the way the technology works into a very small and ill-fitting metaphor.
  
  Ah, but I think we do need the nomenclature. FCoE isn’t just FCoE in terms of deployment options. If you present me with a switch, and tell me it’s FCoE capable, what does that tell me in terms of how I can fit it into my infrastructure? Almost nothing (just how Juniper and Cisco can both be correct yet contradictory in terms of TRILL).
  
  However, if you said the FCoE switch is an FCF FCoE switch (DMFCoE), or a FIP snooping bridge/DCBx compliant switch (SMFCoE), I’m getting closer. But still not quite there in terms of what I can do with it. Because a UCS 6x00 series is an FCF FCoE (DMFCoE) switch, but as Erik mentioned, doesn’t support ISLs currently over FCoE. You need to use NPV (preferred) or switching mode with a Cisco FC switch (oh boy, interoperability issues!).
  
  By adding a little bit of terminology, you can clear up a lot in terms of how you fit a given device into a network.
  
  Everyone I speak to regarding FCoE, day in and day out, is utterly confused by the concepts. Saying a switch is “FCoE capable” means absolutely nothing to those responsible for designing and deploying FCoE.
  
  Perhaps the terminology needs to be tweaked in name, and perhaps it should be tweaked in terms of what we differentiate, but the way I see it, we do need some way to expound upon “FCoE”.
  
  Reply
  - Scott Lowe says:
    
    September 8, 2011 at 10:25 am
    
    Tony, I am far from an expert in these matters, but in my simplistic view, it sounds like the key thing confusing people here is simply the use of the term “FCoE capable” to describe different levels or types of FCoE support. For example, I can call a Nexus 5000 “FCoE capable” and I can call a Nexus 4000 “FCoE capable,” but one of them is an FCF and one of them is an FSB (FIP Snooping Bridge), so naturally their role in an FCoE network would be different. I’m with Erik and J Metz that it’s not the protocol that’s confusing, it’s how the vendors are using “FCoE capable” to describe differing levels or types of support for FCoE.
    
    My 2 cents…
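One way to make the “FCoE capable” problem from this thread concrete is to record each device's role instead of a single yes/no flag. The sketch below is only an illustration: the class and field names are hypothetical, and the device entries reflect only what commenters said above (Nexus 5000 as an FCF, Nexus 4000 as an FSB, the UCS 6100 as an FCF without FCoE ISLs as of this 2011 discussion). Where the thread says nothing, the value is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    FCF = "Fibre Channel Forwarder"
    FSB = "FIP Snooping Bridge"


@dataclass(frozen=True)
class FcoeDevice:
    """Hypothetical record of what 'FCoE capable' means for one device."""
    name: str
    role: Role
    # True/False if the thread states it; None if the thread does not say.
    fcoe_isl: Optional[bool]

    def decaps_fc(self) -> bool:
        # Per the thread: an FCF decaps the FC frame at the port,
        # while an FSB forwards on Ethernet and only snoops FIP.
        return self.role is Role.FCF

    def describe(self) -> str:
        isl = {True: "yes", False: "no", None: "not stated"}[self.fcoe_isl]
        return (f"{self.name}: {self.role.name}, "
                f"decaps FC: {'yes' if self.decaps_fc() else 'no'}, "
                f"FCoE ISLs: {isl}")


# Entries as described by commenters in this thread (2011), not vendor data sheets.
DEVICES = [
    FcoeDevice("Nexus 5000", Role.FCF, None),
    FcoeDevice("Nexus 4000", Role.FSB, False),  # an FSB is not an FCF, so no ISLs
    FcoeDevice("UCS 6100", Role.FCF, False),
]

if __name__ == "__main__":
    for device in DEVICES:
        print(device.describe())
```

All three devices would pass an “FCoE capable” checkbox, yet each answers the two questions Tony raises (does it decap, and can it link to another FCF over FCoE) differently.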
Pingback: The Case For FCoE Terminology | The Data Center Overlords
Pingback: Technology Short Take #14 - blog.scottlowe.org - The weblog of an IT pro specializing in virtualization, storage, and servers
Pingback: A Tale of Two FCoEs | The Data Center Overlords
Pingback: A Tale of Two FCoEs | Storage CH Blog