Requiem for the ACE

Ah, the Cisco ACE. As we mourn our fallen product, I’ll take a moment to reflect on this development as well as what the future holds for Cisco and load balancing/ADC. First off, let me state that I’m not an employee of Cisco and have no inside knowledge of their plans in this regard. I teach Cisco ACE courses for Firefly, and I develop Firefly’s courseware for both ACE products as well as bootcamp material for the CCIE Data Center, but everything that follows is pure speculation.

Also, it should be made clear that Cisco has not EOL’d (End of Life) or even EOS’d (End of Sale) the ACE product, and in a post on the CCIE Data Center group Walid Issa, the project manager for CCIE Data Center, made a statement reiterating this. And just as I was about to publish this post, there’s a great post by Brad Casemore also reflecting on the ACE, and there’s an interesting comment from Steven Schuchart of Cisco (analyst relations?) making a claim that ACE is, in fact, not dead.

However, there was a statement Cisco sent to CRN confirming the rumor, and my conversations with people inside Cisco have confirmed that yes, the ACE is dead. Or at least, that’s the understanding of Cisco employees in several areas. The word I’m getting is that it will be bug-fixed and security-fixed, but further development will halt. The ACE may not officially be EOL/EOS, but for all intents and purposes, and until I hear otherwise, it’s a dead-end product.

The news of ACE’s probable demise was kind of like a red-shirt getting killed. We all knew it was coming, and you’re not going to see a Spock-like funeral, either. 

We do know one thing: For now at least, the ACE 4710 appliance is staying inside the CCIE Data Center exam. Presumably in the written (I’ve yet to sit the non-beta written) as well as in the lab. Though it seems certain now that the next iteration (2.0) of the CCIE Data Center will be ACE-less.

Now let’s take a look down memory lane, to the Ghosts of Load Balancers Past…

Ghosts of Load Balancers Past

As many are aware, Cisco has had a long yet… imperfect relationship with load balancing. This is somewhat ironic considering that Cisco was, in fact, the very first vendor to bring a load balancer to market. In 1996, Cisco released the LocalDirector, the world’s first load balancer. The product itself sprang from the Cisco purchase of Network Translation Incorporated in 1996, which also brought about the PIX firewall platform.

The LocalDirector did relatively well in the market, at least at first. It addressed a growing need for scaling out websites (rather than the more expensive, less resilient method of scaling up). It had a bit of a cult following, especially from the routing and switching crowd, which I suspect had a lot to do with its relatively simple functionality: for most of its product life, the LocalDirector was just a simple Layer 4 device. While other vendors went higher up the stack with Layer 7 functionality, the LocalDirector stayed at Layer 4 until near the end, when it got cookie-based persistence. In terms of functionality and performance, however, other vendors were able to surpass the LocalDirector pretty quickly.

The most important feature that the other vendors developed in the late 90s was arguably cookie persistence. (The LocalDirector didn’t get this feature until about 2001 if I recall correctly.) This allowed the load balancer to treat multiple people coming from the same IP address as separate users. Without cookie-based persistence, load balancers could only do persistence based on an IP address, and were thus susceptible to the AOL megaproxy problem (you could have thousands of individual users coming from a single IP address). There was more than one client in the 1999-2000 time period where I had to yank out a LocalDirector and put in a Layer 7-capable device because of AOL.
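To make the megaproxy problem concrete, here is a minimal Python sketch of the two persistence methods. It is not how the LocalDirector or any other product implements persistence; the server names, the `LB_SERVER` cookie name, and every function and class name are hypothetical and exist only for illustration.

```python
import hashlib
import itertools

SERVERS = ["web1", "web2", "web3"]  # hypothetical real servers
COOKIE_NAME = "LB_SERVER"           # hypothetical persistence cookie


def pick_by_source_ip(client_ip):
    """Source-IP persistence: hash the address, so one IP always maps to one server."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return SERVERS[digest[0] % len(SERVERS)]


class CookiePersistence:
    """Cookie persistence: each browser is pinned individually via a cookie."""

    def __init__(self):
        self._next_server = itertools.cycle(SERVERS)

    def pick(self, cookies):
        # A returning browser presents its cookie and goes back to the same server.
        server = cookies.get(COOKIE_NAME)
        if server not in SERVERS:
            # First visit (or stale cookie): choose the next server round-robin.
            server = next(self._next_server)
        # Second element is the cookie the load balancer would set in the response.
        return server, {COOKIE_NAME: server}


def simulate_megaproxy(n_users=9, proxy_ip="198.51.100.7"):
    """Send n_users first-time visitors through one shared proxy IP."""
    ip_counts = {name: 0 for name in SERVERS}
    cookie_counts = {name: 0 for name in SERVERS}
    balancer = CookiePersistence()
    for _ in range(n_users):
        ip_counts[pick_by_source_ip(proxy_ip)] += 1
        server, _set_cookie = balancer.pick({})  # empty cookie jar on first visit
        cookie_counts[server] += 1
    return ip_counts, cookie_counts
```

With source-IP persistence, every user behind the proxy piles onto a single server, while the cookie method spreads the same users across the farm and still sends each browser back to its own server on later requests.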

Cookie persistence is a tough habit to break

At some point Cisco came to terms with the fact that the LocalDirector was pretty far behind and must have concluded it was an evolutionary dead end, so it paid $5.7 billion (with a B) to buy ArrowPoint, a load balancing company that had a much better product than the LocalDirector. That product became the Cisco CSS, and for a short time Cisco was on par with offerings from other vendors. Unfortunately, as with the LocalDirector, development and innovation seemed to stop after the purchase, and the CSS was forever a product frozen in the year 2000. Other vendors innovated (especially F5), and as time went on the CSS won fewer and fewer deals. By 2007, the CSS was largely a joke in load balancing circles. Many sites were happily running the CSS, of course (and some still are today), but feature-wise, it was getting its ass handed to it by the competition.

The next load balancer Cisco came up with had a very short lifecycle. The Cisco CSM (Content Switching Module), a load balancing module for the Catalyst 6500 series, didn’t last very long and as far as I can remember never had a significant install base. I don’t recall ever using it, and know it only through legend (as being not very good). It was replaced quickly by the next load balancing product from Cisco.

And that brings us to the Cisco ACE. Available in two iterations, the Service Module and the ACE 4710 Appliance, it looked like Cisco might have learned from its mistakes when it released the Cisco ACE. Out of the gate it was a bit more of a modern load balancer, offering features and capabilities that the CSS lacked, such as a three-tiered VIP configuration mechanism (real servers, server farms, and VIPs, which made URL rules much easier) and the ability to insert the client’s true source IP address in an HTTP header in SNAT situations. The latter was a critical function that the CSS never had.
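Why the header insertion matters: with SNAT, the web server sees the load balancer’s address as the source of every connection, so the real client address has to travel some other way. The Python sketch below shows the idea only. It is not ACE code; the function name is hypothetical, and the use of the `X-Forwarded-For` header is an assumption (it is the common convention, but the header name is typically configurable).

```python
def snat_with_client_header(client_ip, headers, snat_ip, header_name="X-Forwarded-For"):
    """Return (source IP the server sees, headers the server receives).

    The connection to the server is sourced from snat_ip, so the only place
    the server can learn the real client address is the inserted header.
    """
    forwarded = dict(headers)  # copy; leave the caller's headers untouched
    existing = forwarded.get(header_name)
    # If an upstream proxy already added the header, append to the chain.
    forwarded[header_name] = f"{existing}, {client_ip}" if existing else client_ip
    return snat_ip, forwarded
```

Without that header, server logs and any IP-based application logic see nothing but the load balancer’s own address.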

But the ACE certainly had its downsides. The biggest issue is that the ACE could never go toe-to-toe with the other big names in load balancing in terms of features. F5 and NetScaler, as well as A10, Radware, and others, always had a far richer feature set than the ACE. It is, as Greg Ferro said, a moderately competent load balancer in that it does what it’s supposed to do, but it lacked the features the other guys had.

The number one thing keeping the ACE from eating at the big-boy table is the lack of an answer to F5’s iRules. F5’s iRules give a huge amount of control over how to load balance and manipulate traffic. You can use them to create a login page on the F5 that authenticates against AD (without ever touching a web server), rewrite http:// URLs to https:// (very useful in certain SSL termination setups), and even calculate Pi every time someone hits a web page. Many of the other high-end vendors have something similar, but F5’s iRules are the king of the hill.

In contrast, the ACE can evaluate existing HTTP headers, and can manipulate headers to a certain extent, but the ACE cannot do anything with HTTP content. There’s more than one installation where I had to replace the ACE with another load balancer because of that issue.
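The difference between touching headers and touching content is easier to see in code. The following is a rough Python sketch of the kind of rewrite a programmable load balancer performs when it terminates SSL; it is not iRules syntax and not any vendor’s API, and the function name and the example host are hypothetical.

```python
import re


def rewrite_to_https(headers, body, host):
    """Rewrite absolute http:// links for `host` to https:// in a response."""
    pattern = re.compile(r"http://" + re.escape(host), re.IGNORECASE)
    secure = "https://" + host

    out = dict(headers)
    # Header manipulation: fix up a redirect target.
    if "Location" in out:
        out["Location"] = pattern.sub(secure, out["Location"])

    # Content manipulation: the body itself has to be parsed and changed,
    # which also means the Content-Length must be recomputed.
    new_body = pattern.sub(secure, body)
    out["Content-Length"] = str(len(new_body.encode("utf-8")))
    return out, new_body
```

The `Location` fix is header work. The body rewrite is the content half, and it requires the device to buffer and inspect the full response, which is where general-purpose processing power comes in.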

The ACE never had a FIPS-compliant SSL implementation either, which prevented the ACE from being in a lot of deals, especially with government and financial institutions. ACE was very late to the game with OCSP support and IPv6 (both were part of the 5.0 release in 2011), and the ACE10 and ACE20 Service Modules will never, ever be able to do IPv6. You’d have to upgrade to the ACE30 Module to do IPv6, though right now you’d be better off with another vendor.

For some reason, Cisco decided to make use of MQC (Modular QoS CLI) as the configuration framework in the ACE. This meant configuring a VIP required setting up class-maps, policy-maps, and service-policies in addition to real servers and server farms. This was far more complicated than configuring most of the competition, despite the fact that the ACE had less functionality. If you weren’t at a CCNP level or higher, the MQC could be maddening. (On the upside, if you mastered it on the ACE, QoS was a lot easier to learn, as was my case.)

If the CLI was too daunting, there was always the GUI on the ACE 4710 Appliance and/or the ACE Network Manager (ANM), which was a separate user interface that ran on Red Hat and later became its own OVA-based virtual appliance. The GUI in the beginning wasn’t very good, and the ACE Service Modules (ACE10, ACE20, and now the ACE30) lacked a built-in GUI. Also, when it hits the fan, the CLI is the best way to quickly diagnose an issue. If you weren’t fluent in the MQC and the ACE’s rather esoteric utilization of such, it was tough to troubleshoot.

There was also a brief period of time when Cisco was selling the ACE XML Gateway, a product obtained through the purchase of Reactivity in 2007, which provided some (but not nearly all) of the features the ACE lacked. It still couldn’t do something like iRules, but it did have Web Application Firewall abilities, FIPS compliance, and could do some interesting XML validation and other security. Of course, that product was short-lived as well, and Cisco pulled the plug in 2010.

Despite these shortcomings, the ACE was a decent load balancer. The ACE service module was a popular service module for the Catalyst 6500 series, and could push up to 16 Gbps of traffic, making it suitable for just about any site. The ACE 4710 appliance was also a popular option at a lower price point, and could push 4 Gbps (although it only had four 1 Gbit ports, never 10 Gbit). Those that were comfortable with the ACE enjoyed it, and there are thousands of happy ACE customers with deployments.

But “decent” isn’t good enough in the highly competitive load balancing/ADC market. Industry juggernauts like F5 and scrappy startups like A10 smoke the ACE in terms of features, and unless a shop is going all-Cisco, the ACE almost never wins in a bake-off. I even know of more than one occasion where Cisco had to essentially invite itself to a bake-off (and in those cases it never won). The ACE’s market share continued to drop from its release, and from what I’ve heard is in the low teens in terms of percentage, while F5 has about 50%.

In short, the ACE was the knife that Cisco brought to the gunfight. And F5 had a machine gun.

I’d thought for years that Cisco might just up and decide to drop the ACE. Even with the marketing might and sales channels of Cisco, the ACE could never hope to usurp F5 with the feature set it had. Cisco didn’t seem committed to developing new features, and it fell further behind.

Then Cisco included ACE in the CCIE Data Center blueprint, so I figured they were sticking with it for the long haul. Then the CRN article came out, and surprised everybody (including many in Cisco from what I understand).

So now the big question is whether or not Cisco is bowing out of load balancing entirely, or coming out with something new. We’re certainly getting conflicting information out of Cisco.

I think both are possible. Cisco has made a commitment (that they seem to be living up to) to drop businesses and products that they aren’t successful in. While Cisco has shipped tens of thousands of load balancing units since the first LocalDirector was unboxed, except for the beginning they’ve never led the market. Since somewhere in the early 2000s, that title has belonged almost exclusively to F5.

For a company as broad as Cisco is, load balancing as a technology is especially tough to sell and support. It takes a particular skill set that doesn’t relate fully to Cisco’s traditional routing and switching strengths, as load balancing sits in two distinct worlds: Server/app development, and networking. With companies like F5, A10, Citrix, and Radware, it’s all they do, and every SE they have knows their products forwards and backwards.

The hardware platform that the ACE is based on (Cavium Octeon network processors) is, I think, one of the reasons why the ACE hasn’t caught up in terms of features. To do things like iRules, you need fast, generalized processors, and most of the vendors have gone with x86 cores, and lots of them. Vendors can use pure x86 power to do both Layer 4 and Layer 7 load balancing, or, like F5 and A10, incorporate FPGAs to hardware-assist the Layer 4 load balancing and distribute flows to x86 cores for the more advanced Layer 7 processing.

The Cavium network processors don’t have the horsepower to handle the advanced Layer 7 functionality, and the ACE Modules don’t have x86 at all. The ACE 4710 Appliance has an x86 core, but it’s several generations back (it’s seriously a single Pentium 4 with one core). As Greg Ferro mentioned, they could be transitioning completely away from that dead-end hardware platform, and going all virtualized x86. That would make a lot more sense, and would allow Cisco to add features that it desperately needs.

But for now, I’m treating the ACE as dead.

12 Responses to Requiem for the ACE

  1. tonybourke says:

    The more I think about it, the current ACE hardware could never catch up to the competition. They’d need a completely new platform, one with lots of x86. Either physical boxes, or virtual. Either way, the data plane of the ACE can’t do what F5, A10, etc. do. There’s just not enough horsepower for the Layer 7 stuff.

  2. This has been an interesting turn of events Tony. I read about it online whilst at a Cisco event last week but unfortunately, the key guy in the room who could have shed light on the matter left before I had a chance to nab him.

    From what I’ve heard, this ‘announcement’ was internal and the information was then leaked in bits and bytes causing all sorts of confusion. I understand that, whilst no EOL has been declared, the platform will continue to be developed but here’s the kicker…only within its current feature set, which sounds like it will be supported as you would expect, but it’s not going to be catching up to the competition.

    I’ve been following the discussion of SDN and what that means to various people with some interest and also some amusement over the last year as some people made wacky claims about how quickly it would explode, or what it meant for a network engineer.

    This latest news regarding the ACE has started me thinking about what the bigger picture is and I think that Cisco are looking 5-10 years down the line with this decision. They are starting to catch up with the whole SDN discussion and I would guess throwing a lot of R&D behind it. I wouldn’t be surprised if they have a virtualised SDN based load balancer of sorts under development already.

    Who will want a static hardware load balancer in 5 years time? Everything is going to be dynamic and mobile and the services that are being hosted in data centres will be as fluid as the underlying services that support them must be to make it all work.

  3. mnarayan67 says:

    Very interesting take. Not sure how an x86 solution would improve the feature set, other than lowering costs and adding processing power. And a fully virtual solution, will it have the feature set of F5, A10, Radware, Citrix? If not, other than a different form factor what does it bring to the table? It will lower development and production costs but not improve market share dynamics. Maybe they are looking 2-3 years down the road, moving all development to software, hoping to leapfrog and catch up in the SDN-driven market transition. But that offers customers a product vacuum in the interim. I can imagine the Cisco management team grappling with these choices.

  4. Pingback: Internets of Interest for 25th September 2012 — EtherealMind

  5. Steven Iveson says:

    F5’s iRules are SDN and they have been around since 2004 in their current form. Cisco SDN will still be L2-L4 focussed, I don’t think they have a chance.

  6. In addition to “F5, A10, Citrix, and Radware*”, Riverbed’s Stingray product (formerly Zeus Traffic Manager) is a worthy addition to anyone’s short list in this space. It’s a seriously fast pure software ADC, so you can deploy it on pretty much whatever physical/virtual/cloud platform you like and scale up onto faster servers or out via N-way clustering as needed. Additionally, complex rewrite/routing/whatever rules in its TrafficScript language are easily a match for F5 iRules in capability while generally being much easier to write and debug.

    (*Radware? Seriously??)

    • mnarayan67 says:

      Stingray/Riverbed (and Juniper QFabric in 2 years) is clearly another choice for users. Any reason why Stingray gets a lower ranking by Gartner vs Radware? Perhaps due to market share and multiple form factors. Also if Cisco does choose a buy vs build strategy, it’s only Radware. F5 and Citrix are too large, Stingray is already licensed, and given the way A10 wants to poach ACE customers I doubt they are a candidate.

  7. rmalayter says:

    The other nail in the coffin was, of course, the availability of solid open-source software load-balancing solutions such as HAproxy, nginx, TrafficServer, lighttpd, and commanche running on Linux or BSD. The rise of Linux in the DataCenter meant that server and application teams were quite comfortable handling load balancing themselves using these tools. On commodity hardware they match the performance and features of even big F5s, and are easily scaled out. Nginx, for example, has embedded versions of the Perl and Lua languages as a powerful alternative to iRules.

    In my own shop, load balancing has always been a server team or developer function, and not part of the networking sphere of influence. Mostly because LB configuration is usually quite application-dependent, and the server and developer types already understand an application’s behavior with regards to HTTP caching, cookies, session state, and pathing.

  8. Will Hogan says:

    This article was awesome!!!! Thanks. I was drawn in and read till the end. How funny with the local director. That was a bit before my time but I have a few sitting in storage for some reason.

    Can you tell me why the below would be helpful?
    “calculate Pi everytime someone hits a web page”

  9. Richard says:

    I was cursed from the start with the ACE. In late 2006 / early 2007, I attempted to roll out a couple of ACE10 modules after getting a great many assurances from Cisco that the product was ready for prime-time since it had only been released 9 months earlier; the claim was that the software was stable and on v3. I could not launch a new site after running into a couple of severe bugs in the platform, from being unable to persist persistent connections to randomly resetting the TCP MSS to values greater than negotiated in the beginning of the flow.

    I became the first CAP (Customer Assurance Program) case for the ACE that resulted in CSS load-balancers (which we had been using at another data center and knew quite well) being shipped to me in the eleventh-hour so I could launch.

    The ACE modules sat unused in our 6500’s from 2007-2011 when I brought them back to life to handle the load-balancing for the sites of a recent acquisition. It’s worked fairly well for them once you get over the MQC framework discussed in the article, which sends shivers down the spines of non-network engineers.

    And now, the news! F5 or NetScaler (via Cisco?), here I come.

  10. Pingback: General Network Error

  11. Pingback: Link Aggregation | The Data Center Overlords
