Automation Conundrum

During Tech Field Day 8, we saw lots of great presentations from amazing companies. One presentation that disappointed, however, was Symantec’s. It was dull (lots of marketing), and a lot of what they had to offer (like dedupe and compression in their storage file system) had been around in competing products for many years. If you use the products, that’s probably pretty exciting, but it’s not exactly making me want to jump in. And I think Robin Harris had the best description of their marketing position when he called it “Cloudwashing”. I’ve taken the liberty of creating a dictionary-style definition for Cloudwashing:

Cloudwashing: Verb. 1. The act of taking your shitty products and contriving some relevance to the blah blah cloud. 

One product that I particularly wasn’t impressed by was Symantec’s Veritas Operations Manager. It’s a suite that’s supposed to automate and report on a disparate set of operating systems and platforms, providing a single pane of glass for virtualized data center operations. “With just a few clicks on Veritas Operations Manager, you can start and stop multi-tier applications decreasing downtime.” That’s the marketing, anyway.

In reality, what they seemed to have created was an elaborate system to… automatically restart a service if it failed. You’d install this pane of glass, pay the licensing fee or whatever, configure the hooks into all your application servers, web servers, database servers… and what does it do? It restarts the process if it fails. What does it do beyond that? Not much more, from what I could see in the demo. I pressed them on a few issues during the presentation (which you can see here, the Virtual Business Services part starts around the 32-minute mark), and that’s all they seemed to have. Restarting a process.
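To put “restarting a process” in perspective, here is a minimal sketch of that behavior in Python. It is an illustration only: `supervise` and its parameters are names made up for this example, and it says nothing about how Veritas Operations Manager works internally, which the demo didn’t show.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=3, delay=0.0):
    """Run cmd and start it again whenever it exits with a non-zero code.

    Gives up once max_restarts restarts have been used.
    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        # subprocess.call blocks until the child exits and returns its exit code.
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay)  # optional pause before the next attempt
```

Calling `supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2)` runs a child that always fails, restarts it twice, and then gives up. A real supervisor would add logging, backoff, and dependency ordering between tiers, but the core loop really is this small.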

So, not terribly useful. But I don’t think the problem is one of engineering. Instead, I think it’s the overall philosophy of top-down automation.

See, we all have visions of Iron Man in our heads.

Iron Man Says: Cloud That Shit

Wait, what? Iron Man?

Yes, Iron Man. In the first Iron Man movie, billionaire industrialist Tony Stark built three suits: the first to get him out of the cave, the second, all silver, which had an icing problem, and the third, which was used in the rest of the movie to punch the shit out of a lot of bad guys. He built the first two by hand. The third, however, was built through complete automation. He said “build it” and Jarvis, his computer, said: “Commencing automated assembly. Estimated completion time is five hours.”

And then he takes his super car to go to a glamorous party.

Fuck you, Tony Stark. I bet you never had to restart a service manually.

How many times have you said “Jarvis, spin up 1,000 new desktop VMs, replicate our existing environment to the standby datacenter, and resolve the last three trouble tickets” and then gone off in an Audi R8 to a glamorous party? I’m guessing none.

So, are we stuck doing everything manually, by hand, like chumps? No, but I don’t believe the solution will be top-down. It will be bottom-up.

The real benefit of automation that we’re seeing today comes from automating the simple, mundane tasks here and there, not from orchestrating some amazing AI with a self-healing, self-replicating Skynet-type singularity. The time savings are enormous, and while it isn’t as glamorous as a self-healing data center, it does give us a lot more time to do actual glamorous things.

Since I teach Cisco’s UCS blade systems, I’ll use them as an example (sorry, HP). In UCS, there is the concept of service profiles, which abstract the aspects of a server that are usually tied to the physical hardware and found in disparate places. Boot order (BIOS), connectivity (SAN and LAN switches), BIOS and HBA firmware (typically flashed separately and manually), MAC and WWN addresses (burnt in), and more are all stored and configured via a single service profile, and that profile is then assigned to a blade. Cisco even made a demonstration video showing they could take a new chassis with a single blade from sitting in the box to up and online with ESXi in less than 30 minutes.
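As a rough way to picture the service profile idea, here is a small Python sketch of the concept. It is not the UCS Manager API or object model: the class names, field names, functions, and address values are all hypothetical, chosen only to show identity living in a profile rather than in the hardware.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical stand-in for the settings a service profile carries."""
    name: str
    boot_order: List[str]
    mac_addresses: List[str]
    wwn_addresses: List[str]
    firmware: Dict[str, str]


@dataclass
class Blade:
    """A physical blade; it has no identity until a profile is associated."""
    slot: int
    profile: Optional[ServiceProfile] = None


def associate(profile: ServiceProfile, blade: Blade) -> None:
    """Give the blade the identity described by the profile."""
    blade.profile = profile


def move_profile(source: Blade, target: Blade) -> None:
    """Move the identity from one blade to another, e.g. after a failure."""
    target.profile = source.profile
    source.profile = None


# Made-up example values, for illustration only.
esx_host = ServiceProfile(
    name="esx-host-01",
    boot_order=["san", "lan"],
    mac_addresses=["00:00:5E:00:53:01"],
    wwn_addresses=["20:00:00:00:5E:00:53:01"],
    firmware={"bios": "example-1.0", "hba": "example-2.0"},
)
blade_1 = Blade(slot=1)
blade_2 = Blade(slot=2)
associate(esx_host, blade_1)
move_profile(blade_1, blade_2)  # blade_2 now carries the same MACs and WWNs
```

The point of the sketch is that the addresses, boot order, and firmware versions travel with the profile, so swapping the hardware underneath doesn’t mean reconfiguring SAN zoning or LAN settings by hand.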

The Cisco UCS system isn’t particularly intelligent and doesn’t respond dynamically to increased load, but it automates a lot of tasks that we used to have to do manually. It “lubes” the process, to borrow the term Chris Sacca used in a great talk he did at Le Web 2009. Any day, I’ll take that over some overly complicated pane-of-glass solution that essentially restarts processes when they stop.

Perhaps at some point we’ll get to the uber-smart self-healing data center, but right now everyone who has tried has come up really, really short. Instead, there have been tremendous benefits in automating the mundane tasks, the unsexy tasks.

5 Responses to Automation Conundrum

  1. Pingback: Tech Field Day 8: The Links

  2. Mike Kantowski says:

    That’s hilarious. One could just use SEC or something similar to automate restarting of tasks.

  3. Great post. Egenera believes in your ideas around automation too.

    Here is a short video showing how Egenera PAN Manager can automatically heal a VMware cluster after a failure:

    http://www.egenera.com/egenera-failover

    and another one showing how Egenera PAN Manager™ can automatically scale an overloaded VMware cluster:

    http://www.egenera.com/vShere-Capacity-on-Demand-through-Virtual-BMC

    Regards,

    Dan Bettinger

  4. Pingback: Brace Yourself: Networking Field Day 2 Posts Are Coming | The Data Center Overlords

  5. Pingback: The Problem | The Data Center Overlords
