Run a Cisco ACE? Then Do This Command Right Now!
July 2, 2011 5 Comments
It may already be too late! OK, it’s not too late, but there’s a common scenario I run into with Cisco ACE load balancers. Around 25% of the redundant ACE deployments I see (both the 4710 appliance and the Service Module) have a peer sitting in a state called STANDBY_COLD.
So here’s a command you should run when logged into the Admin context of your redundant ACE deployment:
show ft group detail
You’re looking for “Peer State” to stay STANDBY_HOT. STANDBY_HOT is good, and you don’t need to do anything else. However, it’s very common to see something else:
FT Group                     : 1
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_COLD
Peer Id                      : 1
No. of Contexts              : 1
STANDBY_COLD is a peer state where the standby ACE context is not receiving automatic configuration syncs from the active ACE. If you had a failover right now with the status of STANDBY_COLD, you would be running on an older version of the configuration, potentially months old.
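If you have several FT groups to look through, the check can be scripted. Here is a minimal Python sketch that scans saved "show ft group detail" output and reports any FT group whose peer is not in STANDBY_HOT. It assumes one "Name : value" field per line, as in the excerpt above; the function names and the sample text are my own, not anything the ACE provides.

```python
# Sample text modeled on the excerpt above (one field per line is an assumption).
SAMPLE = """\
FT Group                     : 1
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_COLD
Peer Id                      : 1
No. of Contexts              : 1
"""


def peer_states(output):
    """Return {ft_group_id: peer_state} parsed from the command output text."""
    states = {}
    group = None
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name : value" line
        key, value = key.strip(), value.strip()
        if key == "FT Group":
            group = value  # remember which group the following fields belong to
        elif key == "Peer State" and group is not None:
            states[group] = value
    return states


def not_hot(output):
    """Return only the FT groups whose peer state is not STANDBY_HOT."""
    return {
        group: state
        for group, state in peer_states(output).items()
        if not state.endswith("STANDBY_HOT")
    }


if __name__ == "__main__":
    for group, state in not_hot(SAMPLE).items():
        print(f"FT group {group}: peer is {state}")
```

Paste the real command output in place of SAMPLE; an empty result means every peer it found was in STANDBY_HOT.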
How Did We Get Here?
When you make a configuration change on the primary ACE, it DOES get automatically copied to the standby ACE.
When you upload a certificate and key to the primary ACE, it DOES NOT get automatically copied to the standby.
The problem is typically that the configuration on the standby ACE references key and certificate files that exist only on the active, not on the standby. The standby ACE looks for the files, can’t find them, then stops accepting configuration updates.
How Do We Fix It?
The fix is to manually upload to the standby ACE all of the certificates and keys that are referenced in the configuration. You can import them into the ACE with the crypto import command through either terminal (cut and paste in the SSH/Telnet window), SFTP, TFTP, or FTP.
Then, reboot the standby. To fix STANDBY_COLD you need to reboot. It will do a fresh configuration sync (it might take a few minutes), and then it should be in STANDBY_HOT again. You’ll need to do this on a context-by-context basis, as you can have some contexts in STANDBY_HOT and others in STANDBY_COLD. If that doesn’t fix it, make sure that the file names match exactly.
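The file name check is easy to get wrong by eye, so here is a rough Python sketch of it. It assumes the configuration names its files on lines of the form "key <file>" and "cert <file>" (as in an ssl-proxy service block), and that you have gathered the list of files present on the standby yourself. The sample configuration, the file names, and the function names are all made up for illustration.

```python
# Hypothetical configuration fragment; the service and file names are invented.
SAMPLE_CONFIG = """\
ssl-proxy service EXAMPLE_SSL
  key example.key
  cert example.pem
"""


def referenced_files(config_text):
    """Collect file names from lines of the form 'key <file>' or 'cert <file>'."""
    files = set()
    for line in config_text.splitlines():
        words = line.split()
        if len(words) == 2 and words[0] in ("key", "cert"):
            files.add(words[1])
    return files


def missing_on_standby(config_text, standby_files):
    """Return referenced files absent from the standby, compared exactly."""
    # Set difference is an exact, case-sensitive match, so "Example.key"
    # does not satisfy a reference to "example.key".
    return sorted(referenced_files(config_text) - set(standby_files))


if __name__ == "__main__":
    # File list as you might have copied it from the standby (assumed input).
    on_standby = ["Example.key", "example.pem"]
    print(missing_on_standby(SAMPLE_CONFIG, on_standby))
```

The comparison is deliberately strict, so a name that differs only in capitalization or extension is reported as missing.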
How Do We Avoid It In The Future?
Keep in mind that when you add SSL certificates and keys, you must add them manually to both the active and standby ACE contexts. So far, no version of the ACE code (that I’m aware of) does certificate and key automatic sync. And make sure to add the files before you put them in the configuration file.
Hate to rain on your parade, but to claim 25% of ACEs have this is a bit wrong. It’s definitely not a “condition” as you imply – it happens when the admin forgets to add the certificates to both ACEs. That’s…kind…of….user error, not a condition of the ACE.
You can end up in STANDBY_COLD if the FT Vlan is down as well, but this is obviously a different problem requiring a different fix.
“sh ft group brief” will give you the details of the STANDBY HOT/COLD contexts in a more concise way, especially if the ACE runs a lot of contexts.
You can also force a bulk config sync on a per-context basis by issuing the command “no ft auto-sync run” in config mode followed by “ft auto-sync run” which saves you having to reload the ACE.
I agree it’s not really a condition, however it is very common. The need to upload keys and certs to both the active and standby isn’t obvious, and easy to overlook. People new to Cisco ACE often assume that the keys and certs are copied automatically as with the configuration, and there’s nothing to contradict that unless you really peruse the documentation.
When I teach ACE, I mention that to my students, and at least one of them typically says “oh yeah, look at that… STANDBY_COLD…”