Cisco ACE: Insert Client IP Address

Source-NAT (also referred to as one-armed mode) is a common way of implementing load balancers into a network. It has several advantages over routed-mode (where the load balancer is the default gateway of the servers), most importantly that the load balancer doesn’t need to be Layer 2 adjacent/on the same subnet as the servers.  As long as the SNAT IP address of the load balancer has bi-directional communication with the IP address of the servers, the load balancer can be anywhere. A different subnet, a different data center, even a different continent.

However, one drawback is that with Source NAT the client’s IP address is obscured. The server’s logs will show only the load balancer’s SNAT address(es).

There is a way to remedy that if the traffic is HTTP/HTTPS, and that’s by having the load balancer insert the true source IP address into the HTTP request header from the client. You can do it with the ACE by putting it into the load balance policy-map.

policy-map type loadbalance http first-match VIP1_L7_POLICY
  class class-default
     serverfarm FARM1
     insert-http x-forwarded-for header-value "%is"

But that alone is not enough. There are two extra steps you need to take.

The first step is you need to tell the web server to log the x-forwarded-for. For Apache, it’s a configuration file change. For IIS, you need to run an ISAPI filter in IIS.

The other thing you need to do is fix the ACE’s attention span. You see, by default the ACE has a short attention span. The HTTP protocol allows you to make multiple HTTP requests on a single TCP connection. By default, the ACE will only evaluate/manipulate the first HTTP request in a TCP connection.

So your log files will look like this:

"GET /lb/archive/10-2002/index.htm"
- "GET /lb/archive/10-2003/index.html"
- "GET /lb/archive/05-2004/0100.html HTTP/1.1"
"GET /lb/archive/10-2007/0010.html"
- "GET /lb/archive/index.php"
- "GET /lb/archive/09-2002/0001.html"

The “-” indicates Apache couldn’t find the header, because the ACE didn’t insert it. The ACE did add the first source IP address, but every request after it in the same TCP connection was ignored.
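If you want to measure how many requests are losing the client IP, you can count the log entries where the header field came through as "-". Here is a minimal Python sketch, assuming a hypothetical log format in which the X-Forwarded-For value is the first field on each line. The function name and the sample address 192.0.2.10 (from the documentation address range) are placeholders, not taken from real logs.

```python
def count_missing_xff(lines):
    """Return (missing, total) for log lines whose first field is the
    X-Forwarded-For value; Apache logs "-" when the header is absent."""
    missing = 0
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        total += 1
        first_field = line.split(None, 1)[0]
        if first_field == "-":
            missing += 1
    return missing, total


# Hypothetical sample: one TCP connection carrying three requests,
# where only the first request had the header inserted.
sample = [
    '192.0.2.10 "GET /lb/archive/10-2002/index.htm"',
    '- "GET /lb/archive/10-2003/index.html"',
    '- "GET /lb/archive/05-2004/0100.html HTTP/1.1"',
]

missing, total = count_missing_xff(sample)
print(f"{missing} of {total} requests are missing the client IP")
```

Run against a real access log, a high ratio of missing entries is a good hint that the load balancer is only touching the first request of each connection.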

Why does the ACE do this? For one, it’s less work to evaluate and manipulate only the first request in a connection. Since browsers can make dozens or even hundreds of requests over a single connection, this is a significant saving of resources. After all, most of the time when L7 configurations are used, it’s for cookie-based persistence. If that’s the case, all the requests in the same TCP connection are going to contain the same cookies anyway.

How do you fix it? By using a very ill-named feature called persistence-rebalance. This gives the ACE a longer attention span, telling the ACE to look at every HTTP request in the TCP connection.

First, create an HTTP parameter-map.

parameter-map type http HTTP_LONG_ATTENTION_SPAN
  persistence-rebalance

Then apply the parameter-map to the VIP in the multi-match policy map.

policy-map multi-match VIPsOnInterface
  class VIP1
    loadbalance vip inservice
    loadbalance policy VIP1_L7_POLICY
    appl-parameter http advanced-options HTTP_LONG_ATTENTION_SPAN

When that happens, the IP address will show up in all of the log entries.

"GET /lb/archive/10-2002/index.htm"
"GET /lb/archive/10-2003/index.html"
"GET /lb/archive/05-2004/0100.html HTTP/1.1"
"GET /lb/archive/10-2007/0010.html"
"GET /lb/archive/index.php"
"GET /lb/archive/09-2002/0001.html"

But remember, configuring the ACE (or load balancer in general) isn’t the only step you need to perform. You also need to tell the web service (Apache, Nginx, IIS) to use the header as well. None of them automatically use the X-Forwarded-for header.

I don’t know if they’ll try to trick you with this in the CCIE Lab, but it’s something to keep in mind for the CCIE and for implementations.

Health Checking On Load Balancers: More Art Than Science

One of the trickiest aspects of load balancing (and load balancing has lots of tricky aspects) is how to handle health checking. Health checking is of course the process whereby the load balancer (or application delivery controller) does periodic checks on the servers to make sure they’re up and responding. If a server is down for any reason, the load balancer should detect this and stop sending traffic its way.

Pretty simple functionality, really. Some load balancers call it keep-alives or other terms, but it’s all the same: Make sure the server is still alive.

One of the misconceptions about health checking is that it can instantly detect a failed server. It can’t. Instead, a load balancer can detect a server failure within a window of time. And that window of time is dependent upon a couple of factors:

  • Interval (how often is the health check performed)
  • Timeout (how long does the load balancer wait before it gives up)
  • Count (some load balancers will try several times before marking a server as “down”)

As an example, take a very common interval setting of 15 seconds, a timeout of 5 seconds, and a count of 2. If I took a shotgun to a server (which would ensure that it’s down), how long would it take the load balancer to detect the failure?

In the worst case scenario for time to detection, the failure occurred right after that last successful health check, so that would be about 14 seconds before the first failure was even detected. The health check fails once, so we wait another 15 seconds before the second health check. Now that’s two down, and we’ve got a server marked as down.

So that’s about 29 seconds at a worst case scenario, or 16 seconds on a best case scenario. Sometimes server administrators hear that and want you to tune the variables down, so they can detect a failure quicker. However, that’s about as low as they go.
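The arithmetic above can be sketched in a few lines of Python. This is a simplified model, not how any particular load balancer implements it: it assumes probes fire every interval seconds and the server is marked down after count consecutive failures. The function name and the optional timeout term are my own additions. The figures quoted above leave the timeout out, so it defaults to zero here.

```python
def detection_window(interval, count, timeout=0):
    """Return (best_case, worst_case) seconds until a dead server is marked down.

    Assumed model: probes fire every `interval` seconds, each failed probe
    is declared failed `timeout` seconds after it is sent, and the server is
    marked down after `count` consecutive failures.
    """
    # Best case: the server dies just before a probe is sent.
    best = (count - 1) * interval + timeout
    # Worst case: the server dies just after a successful probe.
    worst = count * interval + timeout
    return best, worst


best, worst = detection_window(interval=15, count=2)
print(f"detected somewhere between {best} and {worst} seconds")
```

With an interval of 15 and a count of 2, the bounds are 15 and 30 seconds, which brackets the roughly 16 and 29 seconds quoted above. If you assume each failed probe also has to wait out the 5-second timeout, pass timeout=5 and the window shifts to 20 through 35 seconds.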

If you set the interval lower than 15 seconds, depending on the load balancer, it can unduly burden the control plane processor with all those health checks. This is especially true if you have hundreds of servers in your server farm. You can adjust the count down to 1, which is common, but remember that a server would then be marked down on just a single health check failure.

I see you have failed a single health check. Pity.

The worst value to tune down, however, is the timeout value. I had a client once tell me that the load balancer was causing all sorts of performance issues in their environment. A little bit of investigating, and it turned out that they had set the timeout value to 1 second. If a server didn’t come up with the appropriate response to the health check in 1 second, the server would be marked down. As a result, every server in the farm was bouncing up and down more than a low-rider in a Dr Dre video.

As a result, users were being bounced from one server to another, with lots of TCP RSTs and re-logging in (the application was stateful, requiring users to be tied to a specific server to keep their session going). Also, when one server took 1.1 seconds to respond, it was taken out of rotation. The other servers would have to pick up the slack, and thus had more load. It wasn’t long before one of them took more than a second to respond. And it would cascade over and over again.

When I talked to the customer about this, they said they wanted their site to be as fast as possible, so they set the timeout very low. They didn’t want users going onto a slow server. A noble aspiration, but the wrong way to accomplish that goal. The right way would be to add more servers. We tweaked the timeout value to 5 seconds (about as low as I would set it), and things calmed down considerably. The servers were much happier.

So tweaking those knobs (interval, timeout, count) is always a compromise between detecting a server failure quickly and giving a server a decent chance to respond, all without overwhelming the control plane. As a result, it’s not an exact science. Still, there are guidelines to keep in mind, and if you set the expectations correctly, the server/application team will be a lot happier.

Creating Your Own SSL Certificate Authority (and Dumping Self Signed Certs)

Jan 11th, 2016: New Year! Also, there was a comment below about adding -sha256 to the signing (both self-signed and CSR signing) since browsers are starting to reject SHA1. Added (I ran through a test, it worked out for me at least).

November 18th, 2015: Oops! A few have mentioned additional errors that I missed. Fixed.

July 11th, 2015: There were a few bugs in this article that went unfixed for a while. They’ve been fixed.

SSL (or TLS if you want to be super totally correct) gives us many things (despite many of the recent shortcomings).

  • Privacy (stop looking at my password)
  • Integrity (data has not been altered in flight)
  • Trust (you are who you say you are)

All three of those are needed when you’re buying stuff from say, Amazon (damn you, Amazon Prime!). But we also use SSL for web user interfaces and other GUIs when  administering devices in our control. When a website gets an SSL certificate, they typically purchase one from a major certificate authority such as DigiCert, Symantec (they bought Verisign’s registrar business), or if you like the murder of elephants and freedom, GoDaddy.  They range from around $12 USD a year to several hundred, depending on the company and level of trust. The benefit that these certificate authorities provide is a chain of trust. Your browser trusts them, they trust a website, therefore your browser trusts the website (check my article on SSL trust, which contains the best SSL diagram ever conceived).

Your devices, on the other hand, the ones you configure and only your organization accesses, don’t need that trust chain built upon the public infrastructure. For one, it could get really expensive buying an SSL certificate for each device you control. And secondly, you set the devices up, so you don’t really need that level of trust. So web user interfaces (and other SSL-based interfaces) are almost always protected with self-signed certificates. They’re easy to create, and they’re free. They also provide you with the privacy that comes with encryption, although they don’t do anything about trust. Which is why when you connect to a device with a self-signed certificate, your browser greets you with a certificate warning.

So you have the choice: buy an overpriced SSL certificate from a CA (certificate authority), or get those errors. Well, there’s a third option, one where you can create a private certificate authority, and setting it up is absolutely free.


OpenSSL is a free utility that comes with most installations of Mac OS X, Linux, the *BSDs, and Unixes. You can also download a binary copy to run on your Windows installation. And OpenSSL is all you need to create your own private certificate authority. The process for creating your own certificate authority is pretty straightforward:

  1. Create a private key
  2. Self-sign
  3. Install root CA on your various workstations

Once you do that, every device that you manage via HTTPS just needs to have its own certificate created with the following steps:

  1. Create CSR for device
  2. Sign CSR with root CA key

You can have your own private CA set up in less than an hour. And here’s how to do it.

Create the Root Certificate (Done Once)

Creating the root certificate is easy and can be done quickly. Once you do these steps, you’ll end up with a root SSL certificate that you’ll install on all of your desktops, and a private key you’ll use to sign the certificates that get installed on your various devices.

Create the Root Key

The first step is to create the private root key, which takes a single command. In the example below, I’m creating a 2048-bit key:

openssl genrsa -out rootCA.key 2048

The standard key sizes today are 1024, 2048, and to a much lesser extent, 4096. I go with 2048, which is what most people use now. 4096 is usually overkill (and a 4096-bit key is 5 times more computationally intensive than 2048), and people are transitioning away from 1024.

Important note: Keep this private key very private. This is the basis of all trust for your certificates, and if someone gets a hold of it, they can generate certificates that your browser will accept.

You can also create a key that is password protected by adding -des3:

openssl genrsa -des3 -out rootCA.key 2048

You’ll be prompted to give a password, and from then on you’ll be challenged for the password every time you use the key. Of course, if you forget the password, you’ll have to do all of this all over again.

The next step is to self-sign this certificate.

openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.pem

This will start an interactive script which will ask you for various bits of information. Fill it out as you see fit.

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Oregon
Locality Name (eg, city) []:Portland
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Overlords
Organizational Unit Name (eg, section) []:IT
Common Name (eg, YOUR name) []:Data Center Overlords
Email Address []:

Once done, this will create an SSL certificate called rootCA.pem, signed by itself, valid for 1024 days, and it will act as our root certificate. The interesting thing about traditional certificate authorities is that their root certificates are also self-signed. But before you can start your own certificate authority, remember the trick is getting those certs into every browser in the entire world.

Install Root Certificate Into Workstations

For your laptops/desktops/workstations, you’ll need to install the root certificate into your trusted certificate repositories. This can get a little tricky. Some browsers use the default operating system repository. For instance, in Windows both IE and Chrome use the default certificate management. In IE, go to Internet Options, then the Content tab, then hit the Certificates button. In Chrome, go to Options, then Under The Hood, then Manage certificates. They both take you to the same place, the Windows certificate repository. You’ll want to install the root CA certificate (not the key) under the Trusted Root Certificate Authorities tab.

However, in Windows, Firefox has its own certificate repository, so if you use IE or Chrome as well as Firefox, you’ll have to install the root certificate into both the Windows repository and the Firefox repository. On a Mac, Safari, Firefox, and Chrome all use the Mac OS X certificate management system, so you just have to install it once. With Linux, I believe it’s on a browser-per-browser basis.

Create A Certificate (Done Once Per Device)

Every device on which you wish to install a trusted certificate will need to go through this process. First, just like with the root CA step, you’ll need to create a private key (different from the root CA key).

openssl genrsa -out device.key 2048

Once the key is created, you’ll generate the certificate signing request.

openssl req -new -key device.key -out device.csr

You’ll be asked various questions (Country, State/Province, etc.). Answer them how you see fit. The important question to answer though is common-name.

Common Name (eg, YOUR name) []:

Whatever you see in the address field in your browser when you go to your device must be what you put under common name, even if it’s an IP address.  Yes, even an IP (IPv4 or IPv6) address works under common name. If it doesn’t match, even a properly signed certificate will not validate correctly and you’ll get the “cannot verify authenticity” error. Once that’s done, you’ll sign the CSR, which requires the CA root key.

openssl x509 -req -in device.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out device.crt -days 500 -sha256

This creates a signed certificate called device.crt which is valid for 500 days (you can adjust the number of days of course, although it doesn’t make sense to have a certificate that lasts longer than the root certificate). The next step is to take the key and the certificate and install them in your device. Most network devices that are controlled via HTTPS have some mechanism for you to install them. For example, I’m running F5’s LTM VE (virtual edition) as a VM on my ESXi 4 host. Log into F5’s web GUI (this should be the last time you’re greeted by the warning), and go to System, Device Certificates, and Device Certificate. In the drop down select Certificate and Key, and either paste the contents of the key and certificate files, or upload them from your workstation.
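On the point about a device certificate not outliving the root: the further you get into the root’s 1024 days, the smaller the -days value you can sensibly hand to a device. A quick Python sketch of that date arithmetic follows. The function name and the dates are hypothetical, picked only for illustration.

```python
from datetime import date, timedelta


def max_device_days(root_issued, root_days, device_issued):
    """Days a device cert issued on `device_issued` can run without
    outliving a root issued on `root_issued` and valid for `root_days`."""
    root_expiry = root_issued + timedelta(days=root_days)
    return max((root_expiry - device_issued).days, 0)


# Hypothetical dates: root created 1 Jan 2016 with -days 1024,
# device certificate signed 600 days later.
root_issued = date(2016, 1, 1)
device_issued = root_issued + timedelta(days=600)
print(max_device_days(root_issued, 1024, device_issued))
```

In this example only 424 days remain on the root, so signing the device certificate with -days 500 would leave it nominally valid after the root that vouches for it has expired.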

After that, all you need to do is close your browser and hit the GUI site again. If you did it right, you’ll see no warning and a nice greenness in your address bar.

And speaking of VMware, you know that annoying message you always get when connecting to an ESXi host?

You can get rid of that by creating a key and certificate for your ESXi server and installing them as /etc/vmware/ssl/rui.crt and /etc/vmware/ssl/rui.key.

Cisco ACE Gets IPv6 Support

Last month (with little fanfare) Cisco released 5(1.0) for the ACE 4710 appliance and ACE30 Service Modules, bringing IPv6 support for the first time.

Wait, what?

IPv6 was around when we were partying like it was… 1999

Yes, September of 2011, and Cisco’s load balancing platform finally gets IPv6. It’s a dual-stack implementation for free, and with an extra license fee, you can get the protocol translation (IPv6 VIP with an IPv4 server as the most common example) as well. Honestly, I’m not sure why Cisco decided to charge extra for the NAT64, since IPv6 is pretty much useless on load balancers without that ability. F5, A10, and several other load balancing vendors don’t charge for the IPv6/4 translation component. Also, the ACE10 and ACE20 service modules (the latter of which has a pretty large install base) will never have IPv6 support. (Cisco has an aggressive pricing plan for ACE10/20 to ACE30 upgrades.)

So why are IPv6 load balancers worthless without 6/4 conversion? It’s very likely that web application servers will be among the laggards in the transition from IPv4 to IPv6. You’ll pry IPv4 out of their cold, dead, unpatched hands. The 6/4 conversion allows you to set up an IPv6 VIP to communicate with the future Internet, while the servers run their familiar IPv4.

Honestly, I’m very underwhelmed by the Cisco ACE product line lately. They’re pretty far behind the competition (F5, A10, Citrix NetScaler, Radware) in terms of features, and Cisco doesn’t seem to be doing much about it. Don’t get me wrong, it’s fine for what it does. But other companies are innovating, and Cisco seems to be content with letting the ACE lineup stagnate, just like they did with the LocalDirector and the CSS. I’d like to see Cisco up their game with true content logic (like F5’s iRules). But considering Cisco discontinued their line of XML Gateways/Web Application Firewalls, it seems pretty unlikely they will.

Traffic control languages like iRules are double-edged swords: They can solve a lot of problems, but they can also create a lot of new ones along the way. I’ve seen them save the day, and I’ve seen them consume an entire network department in a DevOps nightmare worthy of DevOps Borat. Still, I’d rather have it than not.

TLS 1.2: The New Hotness for Load Balancers

Alright, implementors of services that utilize TLS/SSL, shit just got real. TLS 1.0/SSL 3.0? Old and busted. TLS 1.2? New hotness.

We config together, we die together. Bad admins for life.

There’s an exploit for SSL and TLS, and it’s called BEAST. It takes advantage of a previously known (but thought to be too impractical to exploit) weakness in CBC. Only BEAST was able to exploit that weakness in a previously unconsidered way, making it much more than a theoretical problem. (If you’re keeping track, that’s precisely the moment that shit got real).

The cure is an update to the TLS/SSL standard called TLS 1.2, and it’s been around since 2008 (TLS 1.1 also fixes it, and has been available since 2006, but we’re talking about new hotness here).

So no problem, right? Just TLS 1.2 all the things.

Well, virtually no one uses it. It’s a chicken and egg problem. Clients haven’t supported it, so servers haven’t. Servers didn’t support it, so why would clients put the effort in? Plus, there wasn’t any reason to. The CBC weakness had been known, but it was thought to be too impractical to exploit.

But now we’re in a state of shit-is-real, so it’s time to TLS up.

So every browser and server platform running SSL is going to need to be updated to support TLS 1.2. On the client side, Google Chrome, Apple Safari, Firefox, and IE will need to be updated (IE 9 supports TLS 1.1, but previous versions will need a backport).

On the server side, it might be a bit simpler than we think. Most of the time when we connect to a website that utilizes SSL (HTTPS), the client isn’t actually talking SSL to the server, instead they’re talking to a load balancer that terminates the SSL connection.

Since most of the world’s websites have a load balancer terminate the SSL, we can update the load balancers with TLS 1.2 and take care of a major portion of the servers on the Internet.

Right now, most of the load balancing vendors don’t support TLS 1.2. If asked, they’ll likely say that there’s been no demand for it since clients don’t support it, which was fine until now. Now is the time for the various vendors to upgrade to 1.2, and if you’re a vendor and you’re not sure if it’s worth the effort, listen to Yoda:

Right now the only vendor I know of that supports TLS 1.2 is the market leader F5 Networks with version 11 of their LTM, for which they should be commended. However, that’s not good enough; they need to backport it to version 10 (which has a huge install base). Vendors like Cisco, A10 Networks, Radware, KEMP Technologies, etc., need to also update their software to TLS 1.2. We can no longer use the excuse “because browsers don’t support it”. Because of BEAST, they will soon, and so should the vendors.

In the meantime, if you’re running a load balancer that terminates SSL, you may want to change your cipher settings to prefer RC4-SHA instead of AES (which uses CBC). It’s cryptographically weaker, but is immune to the CBC issue. In the next few days, I’ll be putting together a page on how to prefer RC4 for the various vendors.

Remember, TLS 1.0/SSL 3.0: Old and busted. TLS 1.2? New hotness.

SSL’s No Good, Very Bad Couple of Months

The world of SSL/TLS security has had, well, a bad couple of months.

‘Tis but a limited exploit!

First, we’ve had a rash of very serious certificate authority security breaches. An Iranian hacker was able to hack Comodo, a certificate authority, and create valid, signed certificates for several high-profile sites. Then another SSL certificate authority in the Netherlands got p0wned so bad the government stepped in and took them over, probably by the same group of hackers from Iran.

Iran and China have both been accused of spying on dissidents through government or paragovernment forces. The Comodo and DigiNotar hacks may have led to spying on Iranian dissidents, and a suspected attack by China a while ago prompted Google to SSL all Gmail connections at all times (not just username/password) by default, for everyone.

Between OCSP and CRLs, browser updates and rogue certificates, it’s called into question the very fabric of trust that we’ve taken for granted. Some even claim that PKI is totally broken (and there’s a reasonable argument for this).

This is what happens when there’s no trust. Also, this is how someone loses an eye. Do you want to lose an eye? Because this will totally do it.

Then someone found a way to straight up decrypt an SSL connection without any of the keys.

Wait, what?

It’s gettin’ all “hack the planet” up in here.

The exploit is called BEAST, and it’s one that can decrypt SSL communications without having the secret. Thai Duong, one of the authors (the other is Juliano Rizzo) of the tool saw my post on BEAST, and invited me to watch the live demonstration from the hacker conference. Sure enough, they could decrypt SSL. Here’s video from the presentation:

Let me say that again. They could straight up decrypt that shit. 

Granted, there were some caveats, and the exploit can only be used in a somewhat limited fashion. It was a man-in-the-middle attack, but one that didn’t terminate the SSL connection anywhere but at the server. They found a new way to attack a known (but thought to be too impractical to exploit) vulnerability in the CBC part of some encryption algorithms.

The security community has known this was a potential problem for years, and it’s been fixed in TLS 1.1 and 1.2.

Wait, TLS? I thought this was about SSL?

Quick sidebar on SSL/TLS. SSL is the old term (the one everyone still uses, however). TLS replaced SSL, and we’re currently on TLS 1.2, although most people use TLS 1.0.

And that’s the problem. Everyone, and I do mean the entire planet, uses SSL 3.0 and TLS 1.0.  TLS 1.1 has been around since 2006, and TLS 1.2 has been around since 2008. But most web servers and browsers, as well as all manner of other types of SSL-enabled devices, don’t use anything beyond TLS 1.0.

And, here’s something that’s going to shock you:

Microsoft IIS and Windows 7 support TLS 1.1. OpenSSL, the project responsible for the vast majority of SSL infrastructure used by open source products (and much of the infrastructure for closed-source projects), doesn’t. As of writing, TLS 1.1 or above hasn’t made it yet into OpenSSL libraries, which means Apache, OpenSSH, and other tools that make use of the OpenSSL libraries can’t use anything above TLS 1.0. Look at how much of a pain in the ass it is right now to enable TLS 1.2 in Apache.

We’re into double-facepalm territory now

No good, very bad month indeed.

So now we’re going to have to update all of the web servers out there, as well as all the clients. That’s going to take a bit of doing. OpenSSL runs the SSL portion of a lot of sites, and they’ve yet to bake TLS 1.1/1.2 into the versions that everyone uses (0.9.8 and 1.0.x). Load balancers are going to play a central role in all of this, so we’ll have to wait for F5, Cisco, A10 Networks, Radware, and others to support TLS 1.2. As far as I can tell, only F5’s LTM version 11 supports anything above TLS 1.0.

The tougher part will be all of the browsers out there. There are a lot of systems that run non-supported and abandoned browsers. At least SSL/TLS is smart enough to be able to downgrade to the highest common denominator, but that would mean potentially vulnerable clients.
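That fallback behavior can be pictured as picking the newest protocol version both sides understand. The Python sketch below is a deliberate simplification of the real handshake, and the function name and version lists are illustrative assumptions, but it shows why one stale client drags the whole connection down to TLS 1.0.

```python
# Protocol versions from oldest to newest.
VERSIONS = ["SSL 3.0", "TLS 1.0", "TLS 1.1", "TLS 1.2"]


def negotiate(client_supported, server_supported):
    """Return the newest version both sides support, or None if there is none."""
    common = set(client_supported) & set(server_supported)
    for version in reversed(VERSIONS):
        if version in common:
            return version
    return None


# An old browser talking to a fully updated server or load balancer.
print(negotiate(["SSL 3.0", "TLS 1.0"], VERSIONS))
```

Even with the server side fully updated, the old browser in this example still lands on TLS 1.0, which is exactly the potentially vulnerable case.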

In the meantime something that web servers and load balancers can do is take Google’s lead and prefer the RC4 symmetric algorithm. While cryptographically weaker, it’s immune to the CBC attack.

This highlights the importance of keeping your software, both clients and servers, up to date. Out of convenience we got stale with SSL, and even the security-obsessed OpenSSL project got caught with their pants down. This is really going to shake a lot of systems down, and hopefully be a kick in the pants to those that think it’s OK to not run current software.

I worked at an organization once where the developers forked off Apache (1.3.9 I believe) to interact with an application they developed. This meant that we couldn’t update to the latest version of Apache, as they refused to put the work in to port their application module to the updated versions of Apache. I bet you can guess how this ended. Give up? The Apache servers got p0wned.

So between BEAST and PKI problems, SSL/TLS has had a rough go at it lately. It’s not time to resort to smoke signals just yet, but it’s going to take some time before we get back to the level of confidence we once had. It’s a scary-ass world out there. Stay secure, my friends.

Zeus Gets Acquired by Riverbed

It looks like Riverbed is making a move into F5’s territory by acquiring Zeus. F5 has WOC (WAN optimization controller) technology, but the last time I took a look at it (admittedly several years ago) it was pretty terrible. However, F5 is the market leader in load balancing. Riverbed is known as the leader in WOCs, but until now had no load balancing capabilities.

Strong WOCs, new load balancing. F5 has strong load balancing, iffish WOC. Is Riverbed looking to be F5 with a goatee?

Meet Riverbed

Everyone I’ve ever talked to loves the Riverbed WOCs. Their primary competitors in this space have been Cisco (with WAAS), Bluecoat, F5 (used to have WanJets, now they’re modules in the LTMs) and a handful of others. But they haven’t had a load balancer, until now.

It makes sense: the WOC and load balancer spaces both live in that tricky realm of Layer 7 network devices. It is a realm where many network admins dare not tread. It’s tough enough keeping up with STP, OSPF, and IS-IS, let alone all the new stuff like FCoE, TRILL, DCBX, and so forth, without it all bursting one’s cranial cavity like an over-ripened fruit. These products throw all those Layer 7 protocols (HTTP, IMAP, POP3, Exchange, etc.) into the mix.

To be honest, I haven’t seen a lot of Zeus in the market (while I always see F5).  My impression is that, like my favorite band Pop Will Eat Itself, they’re bigger in Europe.

See press release here: