In the networking world, we’re starting to see the term “cloud” more and more. When I teach classes, if I so much as mention the word cloud, I start to see some eyes roll. That’s completely understandable, as the term cloud was such an overused buzzword, having only recently been supplanted by “software defined”.
Here’s real-life supervillain (dude owns a MiG-29 and an island with a volcano on it… seriously) Larry Ellison freaking out about the term cloud.
“It’s not water vapor! All it is, is a computer attached to a network!”
But here’s the thing: it’s actually a thing now. Rather than a catch-all buzzword, it’s being used more and more to define a particular type of operational model. And it’s defined by NIST, the National Institute of Standards and Technology, part of the US Department of Commerce. With the term cloud, we now get a higher degree of specificity.
That first item on the list, on-demand self-service, is a huge change in how we will be doing networking. Right now network configurations are mostly done by network administrators. If you have a network need and aren’t a network admin, you open up a ticket and wait.
In (private) cloud computing, which will include a large networking component, the network elements, endpoints, and devices will be configured by end users and developers, not the IT staff. The IT staff will maintain the overall cloud infrastructure, but won’t make the day-to-day changes; those will happen far too frequently, and they will happen in the middle of the day. Change control will probably still apply to the underlying infrastructure, but the tenants will likely make many changes throughout the day. The fault domains will be much smaller, so a mistake only impacts a small segment, and automation makes it far more likely that a change (such as adding a new load balancing VIP) is done correctly.
This is how things have been done in public clouds (Amazon, Rackspace, etc.) for a while now.
When people talk about the death of the CLI, this is what they’re referring to. The configuration changes we make won’t be on a Cisco or Juniper CLI, but through some sort of portal (which can be either GUI, CLI, or API calls) and will be largely automated. We’ve hit the twilight of the age of Conf T.
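To make that concrete, here’s a minimal sketch of what a self-service change might look like, written in Python against a made-up portal API (the URL, token, and payload are purely illustrative, not any real product’s interface): a developer adds a load balancing VIP without ever touching a switch CLI.

    # Hypothetical example: a tenant adds a load balancer VIP through a
    # self-service portal's REST API instead of opening a ticket.
    # The URL, token, and payload shape are illustrative only.
    import requests

    PORTAL = "https://cloud-portal.example.com/api/v1"
    HEADERS = {"Authorization": "Bearer <tenant-api-token>"}

    vip_request = {
        "name": "web-frontend-vip",
        "ip": "10.20.30.40",
        "port": 443,
        "pool_members": ["10.20.31.11:8443", "10.20.31.12:8443"],
    }

    resp = requests.post(f"{PORTAL}/loadbalancers/vips",
                         json=vip_request, headers=HEADERS, timeout=10)
    resp.raise_for_status()
    print("VIP created:", resp.json())

The point isn’t the specific API; it’s that the change is a small, automated, tenant-scoped request rather than a ticket in someone’s queue.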
With OpenStack, Docker, CoreOS, containers, DevOps, ACI, NSX, and all of the new operational models, technologies, and platforms, the next generation data center will be a self-service data center.
Hey, remember vTax/vRAM? It’s dead and gone, but with servers now available with 6 terabytes of RAM, imagine what could have been (your insanely high licensing costs).
Set the wayback machine to 2011, when VMware introduced vSphere version 5. It had some really great enhancements over version 4, but no one was talking about the new features. Instead, they talked about the new licensing scheme and how much it sucked.
While some defended VMware’s position, most were critical, and my own opinion… let’s just say I’ve likely ensured I’ll never be employed by VMware. Fortunately, VMware came to their senses and realized what a bone-headed, dumbass move vRAM/vTax was, and repealed the vRAM licensing one year later in 2012. So while I don’t want to beat a dead horse (which, seriously, disturbing idiom), I do think it’s worth looking back for a moment to see how monumentally stupid that licensing scheme was for customers, as a lesson in the economics of scaling on the x86 platform and a reminder about the ramifications of CapEx- versus OpEx-oriented licensing.
Why am I thinking about this almost two years after they got rid of vRAM/vTax? I’ve been reading up on Intel’s newly released E7 v2 processors, and among the updates to Intel’s high-end server chip is the ability to have 24 DIMMs per socket (the previous limit was 12) and support for 64 GB DIMMs. This means that a 4-way motherboard (which you can order now from Cisco, HP, and others) can support up to 6 TB of RAM, using 96 DIMM slots and 64 GB DIMMs. And you’d get up to 60 cores/120 threads with that much RAM, too.
And I remembered one (of many) aspects of vRAM that I found horrible: just how quickly costs could spiral out of control, because server vendors (which weren’t happy about vRAM either) keep cramming more and more RAM into these servers.
The original vRAM licensing with vSphere 5 was that for every socket you paid for, you were entitled to (and limited to) 48 GB of vRAM with Enterprise Plus. To be fair, the licensing scheme didn’t care how much physical RAM (pRAM) you had, only how much RAM was consumed by spun-up VMs (vRAM). With vSphere 4 (and the current vSphere licensing, thankfully), RAM had been essentially free: you only paid per socket, and you could use as much RAM as you could cram into a server. But with the vRAM licensing, if you had a dual-socket motherboard with 256 GB of RAM, you would have to buy 6 licenses instead of 2. At the time, 256 GB servers weren’t super common, but you could order them from the various server vendors (IBM, Cisco, HP, etc.). So with vSphere 4, you would have paid about $7,000 to license that system. With vSphere 5, assuming you used all the RAM, you’d pay about $21,000 to license the same system, tripling your licensing costs. And that was day one.
Now let’s see how much it would cost to license a system with 6 TB of RAM. If you use the original vRAM allotments from 2011, each socket granted you 48 GB of vRAM with Enterprise Plus (they did up the allotments after all of the backlash, but that amended vRAM licensing model was so convoluted you literally needed an application to tell you how much you owed). That means to use all 6 TB (and after all, why would you buy that much RAM and not use it?), you would need 128 socket licenses, which would have cost $448,000. A cluster of 4 such vSphere hosts would cost just shy of $2 million to license. With current, non-insane licensing, the same 4-way 6 TB server costs a whopping $14,000. That’s a 32x difference in licensing cost.
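If you want to check my math, here’s a quick back-of-the-envelope calculation in Python, assuming the roughly $3,495 list price of an Enterprise Plus socket license at the time (exact pricing varied, so treat the figures as approximations):

    # Back-of-the-envelope comparison of per-socket vs. vRAM-based licensing.
    # ~$3,495 was roughly the Enterprise Plus list price per socket circa 2011;
    # exact pricing varied, so these numbers are approximations.
    LICENSE_COST = 3495      # USD per Enterprise Plus license
    VRAM_PER_LICENSE = 48    # GB of vRAM entitlement per license (original vSphere 5 scheme)

    def vram_licensing(total_ram_gb):
        """Licenses needed if every GB of physical RAM is consumed as vRAM."""
        licenses = -(-total_ram_gb // VRAM_PER_LICENSE)   # ceiling division
        return licenses, licenses * LICENSE_COST

    def per_socket_licensing(sockets):
        return sockets, sockets * LICENSE_COST

    # The 2011 example: dual-socket host with 256 GB of RAM
    print(per_socket_licensing(2))    # (2, 6990)     -> ~$7,000 under vSphere 4
    print(vram_licensing(256))        # (6, 20970)    -> ~$21,000 under vRAM

    # The E7 v2 example: 4-socket host with 6 TB (6144 GB) of RAM
    print(per_socket_licensing(4))    # (4, 13980)    -> ~$14,000 per socket today
    print(vram_licensing(6144))       # (128, 447360) -> ~$448,000 under vRAM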
Again, this is all old news. VMware got rid of the awful licensing, so it’s a non-issue now. But it’s still worth remembering what almost happened, and how insane licensing costs could have become just a few years later.
My graph from 2011 was pretty accurate.
Rumor has it VMware is having trouble getting customers to go for OpEx-oriented licensing for NSX. While VMware hasn’t publicly discussed licensing, it’s a poorly kept secret that VMware is looking to charge for NSX on a per-VM, per-month basis. The number I’d been hearing is $10 per month ($120 per year) per VM. I’ve also heard as high as $40 and as low as $5. But whatever the numbers are, VMware is gunning for OpEx-oriented licensing, and no one seems to be biting. And it’s not the technology (everyone agrees it’s pretty nifty); it’s the licensing terms that are a concern. NSX is viewed as network infrastructure, and in that world we’re used to CapEx-oriented licensing. Some of VMware’s products are OpEx-oriented, but their attempt to switch vSphere over to OpEx was disastrous. And it seems to be the same for NSX.
There are very few technologies in the data center that have had as significant an impact as VMware’s vMotion. It allowed us to decouple operating system and server operations. We could maintain, update, and upgrade the underlying compute layer without disturbing the VMs running on it. We could keep writing web applications in the same model we were used to from when we wrote them for specific physical servers. From an application developer’s perspective, nothing needed to change. From a system administrator’s perspective, it made (virtual) server administration easier and more flexible. vMotion helped us move almost seamlessly from the physical world to the virtual world with nary a hiccup. Combined with HA and DRS, it’s made VMware billions of dollars.
And it’s time for it to go.
From a networking perspective, vMotion has wreaked havoc on our data center designs. Starting in the mid-2000s, we all of a sudden needed to build huge Layer 2 networking domains instead of beautiful and simple Layer 3 fabrics. East-West traffic went insane. With multi-layer switches (Ethernet switches that could route as fast as they could switch), we had just gotten to the point where we could build really fast Layer 3 fabrics and get rid of spanning tree. vMotion required us to undo all that and go back to Layer 2 everywhere.
But that’s not why it needs to go.
Redundant data centers and/or geographic diversification is another area where vMotion is being applied. Having the ability to shift a workload from one data center to another is one of the holy grails of data centers, but to accomplish this we need Layer 2 data center interconnects (DCI), with technologies like OTV, VPLS, EoVPLS, and others. There’s also a distance limitation, as the latency between two data centers needs to be 10 milliseconds or less. And since light can only travel so far in 10 ms, there is a fairly limited distance over which you can effectively vMotion (about 200 kilometers, or a bit over 120 miles). That is, unless you have a Stargate.
You do have a Stargate in your data center, right?
And that’s just getting a VM from one data center to another, which someone once described to me as a parlor trick. By itself, it serves no purpose to move a VM from one data center to another. You have to get its storage over as well (VMDK files if you’re lucky, raw LUNs if you’re not) and deal with the traffic tromboning from one data center to another.
The IP address is still coupled to the server (identity and location are coupled in normal operations, something LISP is meant to address), so traffic still comes to the server via the original data center, traverses the DCI, then the server responds through its default gateway, which is still likely the original data center. All that work to get a VM to a different data center, wasted.
All for one very simple reason: a VM needs to keep its IP address when it moves. It’s IP statefulness, and there are various solutions that attempt to address its limitations. Some DCI technologies like OTV will help keep the default gateway local to each data center, so when a server responds it at least doesn’t trombone back through the original data center. LISP is (another) overlay protocol meant to decouple the location from the identity of a VM, helping with mobility. But as you stack these stopgap solutions on top of each other, they become more and more cumbersome (and expensive) to manage.
All of this because a VM doesn’t want to give its IP address up.
But that isn’t the reason why we need to let go of vMotion.
The real reason why it needs to go is that it’s holding us back.
Do you want to really scale your application? Do you want to fail over from one data center to another, especially over distances greater than 200 kilometers? Do you want to be able to “follow the Sun” in terms of moving your workload? You can’t rely on vMotion. It’s not going to do it, even with all the band-aids meant to help it.
The sites that are doing this type of scaling are not relying on vMotion, they’re decoupling the application from the VM. It’s the metaphor of pets versus cattle (or as I like to refer to it, bridge crew versus redshirts). Pets is the old way, the traditional virtualization model. We care deeply what happens to a VM, so we put in all sorts of safety nets to keep that VM safe. vMotion, HA, DRS, even Fault Tolerance. With cattle (or redshirts), we don’t really care what happens to the VMs. The application is decoupled from the VM, and state is not solely stored on a single VM. The “shopping cart” problem, familiar to those who work with load balancers, isn’t an issue. So a simple load balancer is all that’s required, and can send traffic to another server without disrupting the user experience. Any VM can go away at any level (database, application, presentation/web layer) and the user experience will be undisturbed. We don’t shed a tear when a redshirt bites it, thus vMotion/HA/DRS are not needed.
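Here’s a minimal sketch of what that decoupling looks like in practice, assuming a shared Redis session store behind a Flask front end (the hostnames and endpoints are illustrative, not anyone’s actual architecture): state lives outside the web VM, so any VM can vanish and the load balancer just sends the next request somewhere else.

    # "Cattle" sketch: session state lives in a shared store (Redis here),
    # not on the web VM, so any web VM can die or be replaced at will.
    # Flask/Redis and all names are illustrative assumptions.
    import uuid
    from flask import Flask, request, jsonify
    import redis

    app = Flask(__name__)
    sessions = redis.Redis(host="session-store.example.internal", port=6379)

    @app.route("/cart/add", methods=["POST"])
    def add_to_cart():
        session_id = request.cookies.get("session_id") or str(uuid.uuid4())
        item = request.json["item"]
        # State goes to the shared store, not local memory or local disk.
        sessions.rpush(f"cart:{session_id}", item)
        cart = [i.decode() for i in sessions.lrange(f"cart:{session_id}", 0, -1)]
        resp = jsonify(cart=cart)
        resp.set_cookie("session_id", session_id)
        return resp

Because the “shopping cart” lives in the shared store, no individual web VM is precious, and nothing like vMotion is needed to keep it alive.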
If you write your applications and build your application stack as if vMotion didn’t exist, scaling, redundancy, and geographic diversification get a lot easier. If your platform requires Layer 2 adjacency, you’re doing it wrong (and you’ll be severely limited in how you can scale).
And don’t take my word for it. Take a look at any of the huge web sites: Netflix, Twitter, Facebook. They all shard their workloads across the globe and across their infrastructure (or Amazon’s). Most of them don’t even use virtualization. Traditional servers sitting behind a load balancer with an active/standby pair of databases on the back end isn’t going to cut it.
When you talk about sharding, make sure people know it’s spelled with a “D”.
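The idea itself is simple. Here’s a minimal illustration, assuming a user ID as the shard key (the region names and shard count are made up): a stable hash decides which deployment owns a given user, so no single site has to hold everything.

    # Minimal sharding illustration: route each user to one of several
    # regional deployments based on a stable hash of their user ID.
    # Region names and shard count are made-up examples.
    import hashlib

    REGIONS = ["us-east", "us-west", "eu-central", "ap-southeast"]

    def shard_for(user_id: str) -> str:
        # Stable hash so the same user always lands on the same shard.
        digest = hashlib.sha256(user_id.encode()).hexdigest()
        return REGIONS[int(digest, 16) % len(REGIONS)]

    print(shard_for("alice@example.com"))   # always the same region for this user
    print(shard_for("bob@example.com"))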
If you write an application on Amazon’s AWS, you’re probably already doing this, since there’s no vMotion in AWS. If an Amazon data center has a problem, as long as the application is architected correctly (again, handled in the application itself), then I can still watch my episodes of Star Trek: Deep Space Nine. It takes more work to do it this way, but it’s a far more effective way to scale and geographically diversify.
It’s much easier (and quicker) to write a web application for the traditional model of virtualization, and most sites’ first outing will probably be done that way. But if you want to scale, it will be way easier (and more effective) to build and scale the application as if vMotion didn’t exist.
VMware’s vMotion (and Live Migration, and other similar technologies by other vendors) had their place, and they helped us move from the physical to the virtual. But now it’s holding us back, and it’s time for it to go.