Recently, my colleague Cody Bunch (he’s the one presenting in bare feet) and I had the privilege of giving a keynote at the New England Virtualization Technology User Group (NE VTUG) Winter Warmer. There, we were able to address a room of VMware administrators and architects who were interested in learning about OpenStack. The keynote was a revised version of the talk Scott Lowe, from VMware, and I had given at the recent OpenStack Summit in Hong Kong. As I’ve now had multiple opportunities to give this talk and have sat in on numerous conversations with customers to discuss this topic, it is clear to me that there is growing curiosity and interest about OpenStack in the VMware community.
The revised deck from the aforementioned talk can be found on SlideShare and can also be viewed below. My goal in this post is not to walk through the deck slide by slide but to provide a summary of the talk and to hit on some major points.
The outline of the talk is as follows:
- Starting with Cloudy principles
- Introducing OpenStack
- Comparing architectures
- Understanding operational differences
- VMware’s relationship to OpenStack
To understand the how and the why of the OpenStack architecture, one must begin with the design principles of Cloud Computing and how Infrastructure-as-a-Service (IaaS) aims to deliver resources and services to tenants:
- Resources are ephemeral – In the Cloud, resources such as virtual machines are expected to be temporary and short-lived. While you may have some steady-state instances, many or most are created for short periods and then terminated when no longer needed. If an instance fails, don’t waste time fixing it; just spin up a new instance and move on.
- Failure is assumed – In the Cloud, failure is expected and built into the design of the applications that run on it. This is not because a Cloud platform is inherently more brittle than a legacy infrastructure, but because, as a Cloud platform scales in size, the probability of component failure increases, so the environment should be designed to tolerate any number of such failures.
- Focus on application resiliency – Cloud design takes the approach that overall availability is not solely the responsibility of the underlying infrastructure. As such, there is a strong focus on application resiliency and application owners taking responsibility for their own availability. This requires a class of application that is stateless in nature and is designed to survive the failure of one or more instances underneath it.
- Scale out is better than scale up – The Cloud works best with applications that are distributed in nature and make good use of an elastic infrastructure. This type of elasticity delivers multiple benefits, including the following:
  - Workload bursting – When there is an unexpected spike in resource demands, create new resources, add them to the cluster, and rebalance the workload. When demand drops, remove resources from the cluster and terminate them. Not only is this a more efficient use of resources, but it also allows IT to be more responsive to the needs of its users.
  - Fault-tolerance – By distributing a workload across multiple instances, you reduce the impact from the failure of any single instance while ensuring better service levels for end-users. In a failure scenario, surviving instances can take over the application workload while the failed instance is terminated and a new instance is automatically created to replace it.
  - Scalability – By distributing an application across many nodes, you can scale that application more quickly and dynamically than would be possible with an application that requires a scale-up architecture.
- Automate all the things – Cloud is both an enabler and a beneficiary of automation. By automating the creation of resources, the Cloud delivers an on-demand, self-service infrastructure that can be consumed by end-users with little to no interaction with IT personnel. Simultaneously, as an IT organization embraces automation in other areas such as configuration management and application deployment, the benefits of the Cloud are magnified.
- Embrace cultural change – The Cloud is neither the magic elixir for all of IT’s problems nor the final destination for IT. It is a part of the toolkit for an IT organization that recognizes that its members must collaborate to better serve the Business. An end result of this collaborative culture and use of Cloud Computing should be a more agile IT that helps the Business succeed in meeting its goals. For example, what if a company that used to spin up 10 projects a year in the hope that 40% (4 projects) would succeed could now spin up 40 projects in the same amount of time and using the same amount of resources? Even if only 20% (8 projects) succeeded, that would be 2x the number of successes.
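To make the "ephemeral resources", "failure is assumed", and "scale out" principles above a little more concrete, here is a minimal Python sketch of a pool that is kept at a desired size. It is only an illustration: `Instance`, `Pool`, and `reconcile` are hypothetical names invented for this post, not OpenStack APIs, and "creating" an instance here is just appending an object to a list.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID source so every new instance gets a fresh name.
_ids = count(1)


@dataclass
class Instance:
    """A disposable worker; it holds no state worth repairing."""
    name: str
    healthy: bool = True


@dataclass
class Pool:
    """A group of identical instances kept at a desired size."""
    desired: int
    instances: list = field(default_factory=list)

    def reconcile(self):
        """Drop failed instances, then grow or shrink to match `desired`."""
        # Failure is assumed: discard unhealthy instances instead of fixing them.
        self.instances = [i for i in self.instances if i.healthy]
        # Scale out: create replacements or extra capacity.
        while len(self.instances) < self.desired:
            self.instances.append(Instance(f"web-{next(_ids)}"))
        # Scale in: terminate surplus instances when demand drops.
        del self.instances[self.desired:]
        return [i.name for i in self.instances]


if __name__ == "__main__":
    pool = Pool(desired=3)
    print(pool.reconcile())            # initial capacity
    pool.instances[0].healthy = False  # simulate a failure
    print(pool.reconcile())            # failed instance replaced, not repaired
    pool.desired = 5                   # workload burst
    print(pool.reconcile())
    pool.desired = 2                   # demand drops
    print(pool.reconcile())
```

The point of the sketch is that the same loop handles failure, bursting, and scale-in; nothing in it tries to repair an individual instance.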
Some folks, including VMware administrators, mistakenly think that OpenStack is a hypervisor like vSphere. In fact, OpenStack is a manager of multiple hypervisors, including vSphere, KVM, Xen, Hyper-V, etc. But more than just a hypervisor manager, OpenStack is a collection of tools for managing and orchestrating multiple types of virtual resources. Vendors such as EMC and VMware have historically focused on virtualizing hardware resources, such as compute, network, and storage, so that disparate components can be pooled and used more efficiently. However, these are still siloed resources with their own distinct control planes and tools. OpenStack sits above these pools of virtualized resources as a master control plane that provides users with a single point of control and orchestration; this enables end-users to access IT resources on-demand and at scale using a dashboard, command line, or open APIs. Unlike vSphere, which is a monolithic product with a certain set of features, OpenStack is a collection of open source software projects that are tied together via APIs and a common authentication mechanism.
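One way to picture "a manager of multiple hypervisors" is a control plane that talks to each hypervisor through a common driver interface. The Python sketch below is a deliberately simplified illustration of that idea; the class and method names (`HypervisorDriver`, `ControlPlane`, `spawn`, `boot`) are made up for this post and are not the actual OpenStack code or API.

```python
from abc import ABC, abstractmethod


class HypervisorDriver(ABC):
    """Hypothetical common interface every hypervisor backend implements."""

    @abstractmethod
    def spawn(self, name: str) -> str:
        """Start an instance and return an identifier for it."""


class KVMDriver(HypervisorDriver):
    def spawn(self, name: str) -> str:
        return f"kvm:{name}"


class VSphereDriver(HypervisorDriver):
    def spawn(self, name: str) -> str:
        return f"vsphere:{name}"


class ControlPlane:
    """Single point of control sitting above several hypervisor pools."""

    def __init__(self):
        self._hosts = {}

    def register_host(self, host: str, driver: HypervisorDriver) -> None:
        self._hosts[host] = driver

    def boot(self, name: str, host: str) -> str:
        # The caller never talks to the hypervisor directly.
        return self._hosts[host].spawn(name)


if __name__ == "__main__":
    cloud = ControlPlane()
    cloud.register_host("compute-1", KVMDriver())
    cloud.register_host("compute-2", VSphereDriver())
    print(cloud.boot("app-01", "compute-1"))
    print(cloud.boot("legacy-db", "compute-2"))
```

The user-facing call is the same regardless of which hypervisor sits underneath, which is the sense in which OpenStack acts as a control plane rather than as a hypervisor.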
VMware’s flagship vSphere product has a monolithic and tightly coupled architecture. The vCenter Server can be designed to have its database running on a separate server; however, it doesn’t take advantage of a message queuing system and cannot distribute its services across multiple control nodes, effectively limiting the scalability of vSphere itself. In contrast, OpenStack was designed from the very beginning to deploy and manage resources at scale and, as such, has a distributed and loosely coupled architecture with an enterprise message queuing system.
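To illustrate why a message queue makes a control plane loosely coupled, here is a small Python sketch in which requests are dropped onto a queue and picked up by whichever worker is free. This is an assumption-laden toy: an in-process `queue.Queue` and threads stand in for a real message broker and separate control nodes, and `worker` and `run_control_plane` are hypothetical names, not OpenStack components.

```python
import queue
import threading


def worker(name, requests, results):
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = requests.get()
        if job is None:          # sentinel: shut this worker down
            requests.task_done()
            break
        results.put((job, name))  # record which worker handled the job
        requests.task_done()


def run_control_plane(jobs, worker_count=3):
    """Publish jobs to a shared queue and let any free worker handle them."""
    requests = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{n}", requests, results))
        for n in range(worker_count)
    ]
    for t in threads:
        t.start()
    for job in jobs:
        requests.put(job)        # the publisher does not know who will pick it up
    for _ in threads:
        requests.put(None)       # one sentinel per worker
    for t in threads:
        t.join()
    handled = {}
    while not results.empty():
        job, name = results.get()
        handled[job] = name
    return handled


if __name__ == "__main__":
    print(run_control_plane(["boot vm-1", "boot vm-2", "attach volume-7"]))
```

Adding capacity in this model means starting more workers; the code that publishes requests does not change. Which worker handles which job will vary from run to run.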
Now, let’s compare the management, compute, networking, and storage components of vSphere to OpenStack:
- Management – The vSphere client is strictly an admin tool designed to allow administrators to manage resources and fulfill user requests for those resources. The OpenStack dashboard provides both admin and user access with separation of responsibilities. While the admin role allows an administrator to manage the overall Cloud platform, the user role allows end-users to request their own infrastructure resources, subject to quotas and boundaries set by the Cloud Administrator.
- Compute – vSphere has the concept of virtual machines while OpenStack uses the concept of Cloud instances. Out of the box, vSphere assumes each of these virtual machines is a unique entity with a very specific hardware configuration. OpenStack enforces standardization by having users choose from a catalog of server instances called flavors; each flavor represents a virtual server template mapped to specific compute resource specifications. Finally, both vSphere and OpenStack enable VMs/instances to be created with a guest operating system installed on request; the difference is that vSphere requires the administrator to create and use the OS templates while OpenStack allows end-users to create, upload, and choose the OS images.
- Network – Both vSphere and OpenStack can leverage a kernel-level virtual switch; for vSphere that can be the Standard vSwitch, the Virtual Distributed Switch, or the NSX Switch, while OpenStack can leverage the Open vSwitch. In addition, OpenStack can leverage third-party hardware and software switches via plugins, which allow switch vendors to expose their specific capabilities to OpenStack.
- Storage – vSphere maintains a traditional storage architecture where physical storage resources are virtualized and presented to the virtual machine as standard SCSI drives; this is the case regardless of whether the hypervisor accesses the physical storage via SCSI, iSCSI, Fibre Channel, or NFS. In contrast, OpenStack offers a mix of storage options, each fit for a different use case. There are currently three storage options available with OpenStack – ephemeral, block (Cinder), and object (Swift) storage. The table below from the OpenStack Operations Guide helpfully outlines the differences between the three options:
- vMotion and Live Migration – VMware administrators are very familiar with vMotion and Storage vMotion. OpenStack with KVM (which is the most frequently used hypervisor with OpenStack) has comparable capabilities via Live Migration. Typically, this is used for maintenance operations where VMs/instances can be manually migrated to other hypervisor nodes so the initial node can be serviced. What OpenStack does not support is Distributed Resource Scheduler (DRS), which leverages vMotion, or Storage Distributed Resource Scheduler (SDRS), which leverages Storage vMotion.
- HA and Instance Evacuation – A key feature of vSphere is High-Availability, which allows VMs hosted on a failed hypervisor node to be automatically restarted on surviving nodes. While OpenStack has the concept of instance evacuation, which allows a Cloud Administrator to recreate instances on surviving compute nodes, this is a manual process.
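The difference between automatic HA and manual evacuation can be sketched in a few lines of Python. The function below models what an administrator-triggered evacuation does conceptually: instances from a failed node are recreated on surviving nodes that still have room. It is a toy model under stated assumptions; `evacuate`, `placement`, and `capacity` are hypothetical names for this post, and the "most free slots first" placement rule is my own simplification, not how the OpenStack scheduler works.

```python
def evacuate(placement, capacity, failed_host):
    """Return a new placement with the failed host's instances rebuilt elsewhere.

    placement: dict mapping host name -> list of instance names
    capacity:  dict mapping host name -> maximum number of instances
    """
    # Copy the surviving hosts so the caller's data is not mutated.
    survivors = {h: list(v) for h, v in placement.items() if h != failed_host}
    if not survivors:
        raise RuntimeError("no surviving compute nodes")
    for instance in placement.get(failed_host, []):
        # Assumed rule: pick the surviving host with the most free slots.
        target = max(survivors, key=lambda h: capacity[h] - len(survivors[h]))
        if len(survivors[target]) >= capacity[target]:
            raise RuntimeError(f"no capacity left for {instance}")
        survivors[target].append(instance)
    return survivors


if __name__ == "__main__":
    placement = {"node1": ["a", "b"], "node2": ["c"], "node3": []}
    capacity = {"node1": 2, "node2": 2, "node3": 2}
    # Nothing happens until an operator decides node1 is dead and calls this.
    print(evacuate(placement, capacity, "node1"))
```

The key detail is the last comment: in this model, as in the manual process described above, someone has to notice the failure and invoke the evacuation. vSphere HA, by contrast, detects the failure and restarts the VMs on its own.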
When I explain this to VMware customers and administrators, they are often surprised by the lack of “Enterprise-level capabilities” such as DRS and HA. This is where an understanding of the Cloud principles that inform the design of OpenStack, contrasted with the legacy data center principles that informed the design of vSphere, is important. Recall that we assume Cloud resources are ephemeral, failure is inevitable, and scalability is a requirement; the best way to handle that reality is to design for failure at the application layer and adopt a shared-nothing, scale-out design. A Cloud platform built on those principles and assumptions should not require features like HA or DRS. Instead, we can focus on building an anti-fragile system that is adaptable to changing conditions and flexible enough that each failure can be learned from to create a more robust system.
VMware with OpenStack
So is VMware strictly a competitor to OpenStack, which is what most people would assume? The reality is that VMware is both a competitor and a partner, depending on the product and technology in question. While their vCloud Suite is a clear competitor to OpenStack, VMware has increased their development efforts around integrating vSphere with OpenStack. But why would a customer consider using vSphere with OpenStack? Recall that OpenStack was designed for a certain class of applications that are not dependent on the underlying infrastructure for resiliency and fault-tolerance; however, a feature-rich hypervisor such as vSphere is probably better suited for legacy applications, such as Oracle and Microsoft Exchange. So a user who has chosen OpenStack as their Cloud platform but wants to run legacy applications on that platform may opt for using vSphere as one of its hypervisors. Another reason may be that an existing VMware customer already has a sunk investment, in terms of money and personnel, in vSphere and wants to leverage that investment while using OpenStack as their Cloud platform of choice.
These users have the option of deploying vSphere as the lone or second hypervisor in an OpenStack Cloud. To help enable this option, VMware has created a dedicated OpenStack team with the charter of making VMware technologies a viable option for running with OpenStack. Besides software development, Team OpenStack @VMware has been putting out content and demo software to show off their OpenStack integration.
For VMware administrators who are interested in learning more about OpenStack, there are a number of options available to them, including books, blog posts, videos, white papers, and podcasts; examples of these are in the “Resource” section of this deck. Also, a video of the original Summit talk I collaborated with Scott on can be seen below:
BTW, if you have any questions, need further information, or would like to have me give this talk at an OpenStack or VMware user group meeting, feel free to contact me via e-mail or Twitter.