In part 1 of this series, I provided a basic overview of the OpenStack block storage service project, called Cinder, and how it is implemented with commodity storage as well as various other storage solutions. As you would expect, most of these solutions and the use cases defined are for the KVM hypervisor. In the Grizzly release, integration with VMware vSphere and ESXi worked pretty much the same way as KVM, with Cinder volumes attached via iSCSI as a Raw Device Mapping (RDM) disk. In Havana, however, VMware made a significant change in how vSphere and ESXi implement Cinder. Note that I will focus on how Cinder is implemented in vSphere, using ESXi and vCenter. There is integration with ESXi as a stand-alone hypervisor but, as I’ve argued in my recent blog series, there is little point in implementing stand-alone ESXi with OpenStack due to the lack of advanced features such as HA and DRS; you may as well deploy KVM with libvirt since it is free and offers the same basic functionality.
What’s The Big Deal With VMDK?
The big architectural change in Havana, in terms of vSphere integration, is the development of a Virtual Machine Disk (VMDK) driver for Cinder that allows VMDKs to function as the backing storage for Cinder volumes, in place of the RDM disks used in Grizzly. Those who are unfamiliar with vSphere may wonder why this is significant given that other hypervisors, such as KVM, also use flat files to encapsulate a VM’s disk device. The key is to understand the relationship between VMDKs and their underlying datastores. Because of the abstraction provided by VMDKs and their associated datastores, various services and features can be enabled in the underlying storage without any modifications to the vSphere hosted virtual machines. By mapping the Cinder volume construct to VMDK files, VMs hosted on ESXi hypervisors can attach to persistent block volumes with the following characteristics:
- Multiple VMDK backed Cinder volumes can be hosted on the same datastore.
- A Cinder volume can be created with different disk formats:
  - Thin Provisioned Disk (default format with Cinder) - This is simply an as-used consumption model. Space is not allocated on the datastore up front, and blocks are not zeroed out until they are first written at run time.
  - Thick Provisioned Disk (Zeroed Thick) - The full size of the VMDK is pre-allocated on the datastore when the disk is created, but its blocks are not zeroed at that point.
  - Thick Provisioned Disk (Eager Zeroed Thick) - The full size of the VMDK is pre-allocated and every block is zeroed when the disk is created. Because of the increased I/O requirement, writing out the VMDK takes additional time.
- Advanced vSphere features, such as Storage vMotion and Storage Distributed Resource Scheduler (SDRS), can be utilized to balance workloads across a pool of datastores.
- Perhaps most importantly, with a VMDK as the backing storage, a Cinder volume can be hosted on any storage solution and be accessed by any storage protocol supported by vSphere.
In Grizzly, only iSCSI was supported with Cinder but in Havana, a VMDK backed volume is hosted on a datastore that could be using iSCSI, Fibre Channel, FCoE, or NFS. More interestingly, the VMDK driver will allow a Cinder volume to leverage future vSphere storage features such as Virtual SAN (VSAN) and Virtual Volumes (VVOLs).
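To make the difference between the three disk formats listed above concrete, here is a minimal Python sketch of my own; it is not taken from the driver. The dictionary keys mirror the values that, to my understanding, the driver accepts in the `vmware:vmdk_type` volume-type extra spec, but treat that as an assumption and check the OpenStack Configuration Reference. The `DiskFormat` class and the `initial_datastore_usage_gb` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskFormat:
    """Illustrative description of one VMDK disk format."""
    label: str
    preallocated: bool      # is the full size reserved on the datastore at creation?
    zeroed_at_create: bool  # are all blocks zeroed when the disk is created?


# Assumed keys (hedged above); the behaviors follow the descriptions in this post.
DISK_FORMATS = {
    "thin": DiskFormat("Thin Provisioned", False, False),
    "thick": DiskFormat("Thick Provisioned (Zeroed Thick)", True, False),
    "eagerZeroedThick": DiskFormat(
        "Thick Provisioned (Eager Zeroed Thick)", True, True
    ),
}


def initial_datastore_usage_gb(fmt_key: str, size_gb: int, written_gb: int = 0) -> int:
    """Return the datastore space consumed by a volume of the given format.

    Thin disks consume only what has been written so far; both thick
    formats reserve the full size as soon as the VMDK exists.
    """
    fmt = DISK_FORMATS[fmt_key]
    if fmt.preallocated:
        return size_gb
    return min(written_gb, size_gb)
```

In this model a freshly created 100 GB thin volume consumes nothing on the datastore, while either thick format consumes the full 100 GB immediately; only the eager-zeroed format also pays the zeroing cost at creation time.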
Now I expect that as more folks in both the OpenStack and VMware communities come to understand this change in Cinder integration with vSphere, divergent opinions about it will surface. Some will see it as a show of commitment on the part of VMware to the OpenStack project and a welcome enhancement to the Cinder and Nova projects. Others will view it as a move to promote the vSphere franchise without giving back to the OpenStack project, and as proof that VMware has no real interest in the core principles of open source technology. Time will tell which viewpoint gains ascendancy.
Under The Hood: What In The World Is A Shadow VM?
To understand the mechanics of how VMware is able to leverage VMDKs with Cinder, I need to introduce the concept of the “shadow VM.”
In the diagram above, the vSphere Web Client shows a VM with a VMDK attached; notice that this VM has a host name that begins with the word “volume.” This shadow VM is created via API calls and is never actually powered on; so why does it exist? Due to the hierarchical nature of vSphere, a VMDK must exist as a child object of a parent VM; that’s why a VMDK is always created as part of the VM creation process and not as a “stand-alone” process. Cinder, however, assumes a block volume is a first class object; VMware gets around this by creating a shadow VM for every VMDK that will be used as backing storage for a Cinder volume. This allows the VMDK to be created and enables features that require an attached VM, such as snapshots and clones. The key here is to understand that while the shadow VM is required, it is an abstracted object that is not seen or used by the Cloud instance that is attaching the associated Cinder volume.
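The parent/child relationship described above can be sketched as a small Python data model. This is only an illustration of the concept, not VMware's code: the `Vmdk` and `ShadowVm` classes, the `create_backing` function, and the exact naming and path layout are all hypothetical, chosen to match the "volume" prefix mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vmdk:
    path: str
    size_gb: int


@dataclass
class ShadowVm:
    """Hypothetical stand-in for the parent VM that exists only to own a VMDK."""
    name: str
    disks: List[Vmdk] = field(default_factory=list)
    powered_on: bool = False  # a shadow VM is never powered on


def create_backing(volume_id: str, size_gb: int, datastore: str) -> ShadowVm:
    """Create the parent VM first, then create the VMDK as its child.

    The VMDK cannot be created on its own, so the shadow VM is what lets
    Cinder treat the volume as a first class object.
    """
    vm = ShadowVm(name=f"volume-{volume_id}")
    disk_path = f"[{datastore}] {vm.name}/{vm.name}.vmdk"  # illustrative layout
    vm.disks.append(Vmdk(path=disk_path, size_gb=size_gb))
    return vm
```

Calling `create_backing("1234", 10, "ds1")` in this sketch yields a powered-off VM named `volume-1234` that owns exactly one 10 GB disk; the instance that later attaches the volume only ever sees the disk, never the shadow VM.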
Under The Hood: The Life-Cycle Of A Cinder Volume
So let’s walk through how a Cinder volume works with a vSphere hosted VM. We’ll focus on the creation, attachment, detachment, and deletion of a Cinder volume; there are other functions that are detailed in the OpenStack Configuration Reference Guide.
- create_volume – If you recall from my previous post, the default behavior with non-vSphere instances is for the cinder create command to instantiate a block volume immediately, for example as a logical volume on the Cinder volume node or on a storage array. With VMDK backed Cinder volumes, the backing storage is not created at this point; instead, the VMDK and its associated shadow VM are created only upon an attach_volume request, in what VMware terms a lazy-create process. The reason for this difference in workflow is that the datastore that will host the VMDK is chosen from a pool; holding off creation until an attach request is made allows a datastore to be chosen that is already visible to the ESXi server hosting the VM that will attach the volume. If the VMDK were created first, there would potentially be the extra cost of having to move it to a different datastore.
- attach_volume - When a request is made to attach a VMDK backed Cinder volume to a VM, one of three processes may occur:
  - If this is a new volume and therefore does not have backing yet, a VMDK and its associated shadow VM are created on an appropriate datastore that is visible to the appropriate ESXi server. The Cinder volume is then attached by Nova Compute to the appropriate VM.
  - If this is a previously attached volume and the backing VMDK is on a datastore visible to the appropriate ESXi server, the Cinder volume is attached by Nova Compute to the appropriate VM.
  - If this is a previously attached volume and the backing VMDK is NOT on a datastore visible to the appropriate ESXi server, a Storage vMotion process is invoked to move the VMDK to a datastore that is visible to the appropriate ESXi server. The Cinder volume is then attached by Nova Compute to the appropriate VM.
- detach_volume – The VMDK backed Cinder volume is removed from the appropriate VM.
- delete_volume – The VMDK backed Cinder volume is removed from inventory.
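The life-cycle above can be sketched as a toy Python model. This is a simplified illustration under my own assumptions and not the actual driver: the `VmdkBackendSketch` class, its `log` of actions, and the "pick the first visible datastore" selection rule are all hypothetical placeholders for whatever the real driver and vCenter do.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Volume:
    volume_id: str
    size_gb: int
    datastore: Optional[str] = None    # None until a backing VMDK exists
    attached_to: Optional[str] = None  # name of the instance using the volume


class VmdkBackendSketch:
    """Toy model of lazy-create and the three attach cases."""

    def __init__(self, host_datastores: Dict[str, Set[str]]):
        # Maps an ESXi host name to the datastores visible to that host.
        self.host_datastores = host_datastores
        self.volumes: Dict[str, Volume] = {}
        self.log: List[str] = []  # records which attach path was taken

    def create_volume(self, volume_id: str, size_gb: int) -> Volume:
        # Lazy create: only record the volume; no VMDK or shadow VM yet.
        volume = Volume(volume_id, size_gb)
        self.volumes[volume_id] = volume
        return volume

    def attach_volume(self, volume_id: str, instance: str, host: str) -> None:
        volume = self.volumes[volume_id]
        visible = self.host_datastores[host]
        if volume.datastore is None:
            # Case 1: new volume, so create the backing on a visible datastore.
            volume.datastore = sorted(visible)[0]  # placeholder selection rule
            self.log.append("create_backing")
        elif volume.datastore in visible:
            # Case 2: existing backing is already visible to this host.
            self.log.append("reuse_backing")
        else:
            # Case 3: existing backing must be moved (Storage vMotion).
            volume.datastore = sorted(visible)[0]
            self.log.append("storage_vmotion")
        volume.attached_to = instance

    def detach_volume(self, volume_id: str) -> None:
        self.volumes[volume_id].attached_to = None

    def delete_volume(self, volume_id: str) -> None:
        del self.volumes[volume_id]
```

With two hosts that each see a different datastore, creating a volume, attaching it on the first host, detaching it, and then attaching it on the second host exercises the first and third cases in turn; a further detach and re-attach on the second host takes the second case, since the backing is now already visible there.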
Hopefully this post will be helpful to those who want to gain a better understanding of how VMware is integrating Cinder with vSphere. One caveat: though I have contacts on the OpenStack team at VMware, I am not a VMware employee, so I welcome any corrections to what I’ve written.
I would also encourage readers who want to learn more to watch the video of the “Cinder And vSphere” talk that was given by VMware at the recent OpenStack Summit.
I would also encourage readers to view the video of the “VMware With OpenStack” demo from the same OpenStack Summit.
- Laying Cinder Block (Volumes) In OpenStack, Part 1: The Basics (cloudarchitectmusings.com)
- VMware Doubles Down on OpenStack Havana Cloud (eweek.com)
- VMware and OpenStack: Better Together? (chucksblog.emc.com)
- VMware Advances Support for OpenStack Framework (sys-con.com)