When 60 SCSI Devices Are Not Enough: Virtualizing Databases On vSphere 5.1

I came across a particular issue with several customers last year who were looking to virtualize their databases on vSphere; they ran into the long-standing limit of 60 SCSI devices per virtual machine.  For those who are unfamiliar with the 60 device limit, you can refer to the “vSphere 5.1 Configuration Maximums Guide,” which lists the following virtual machine storage maximums:

[Screenshot: table of virtual machine storage maximums from the vSphere 5.1 Configuration Maximums Guide]

My understanding is that these limits are in place due to the continued use of the SCSI-2 standard for SCSI controller emulation in ESXi.  As many of you know, the SCSI-2 and SCSI-3 standards both specify a maximum of 16 devices per SCSI bus, including the host adapter.  This leaves 15 devices that can be addressed by a single adapter.  The same SCSI-2 standard is also the reason for the 2 TB minus 512 bytes virtual disk size limit; the calculations for this limit can be found in VMware KB article 3371739, under the section, “Understanding the Limitation.”

Cormac Hogan, from VMware’s Technical Marketing group, has a helpful blog post outlining how much capacity can be presented to a VM, based on the maximums stated in the table above.  Particularly helpful is the fact that Cormac breaks out the theoretical maximum capacity using virtual disks and both variants of Raw Device Mappings (RDMs).  To summarize, here are the maximum capacities supported with vSphere 5.1:

  • VMDK – ~120 TB (2 TB minus 512 bytes per device)
  • Virtual RDMs – ~120 TB (2 TB minus 512 bytes per device)
  • Physical RDMs – 3.75 PB (64 TB per device)
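The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the count of 4 virtual SCSI adapters per VM is an assumption taken from the Configuration Maximums guide rather than from the text above, the constant and function names are made up for this example, and "TB" and "PB" are treated as binary units (powers of 1024), which is what makes 60 × 64 TB come out to 3.75 PB.

```python
# Sketch of the per-VM capacity ceilings on vSphere 5.1.
# Assumption: 4 virtual SCSI adapters per VM (Configuration Maximums guide).
TB = 2**40  # binary terabyte, in bytes
PB = 2**50  # binary petabyte, in bytes

ADAPTERS_PER_VM = 4                      # assumed, see lead-in
IDS_PER_BUS = 16                         # SCSI IDs per bus, including the adapter
DEVICES_PER_ADAPTER = IDS_PER_BUS - 1    # the adapter itself takes one ID
MAX_DEVICES = ADAPTERS_PER_VM * DEVICES_PER_ADAPTER

VMDK_MAX_BYTES = 2 * TB - 512            # also applies to virtual RDMs
PRDM_MAX_BYTES = 64 * TB                 # physical RDMs


def max_capacity_bytes(per_device_bytes, devices=MAX_DEVICES):
    """Theoretical capacity if every device slot holds a maximum-size disk."""
    return per_device_bytes * devices


vmdk_total = max_capacity_bytes(VMDK_MAX_BYTES)
prdm_total = max_capacity_bytes(PRDM_MAX_BYTES)

print(f"Devices per VM:      {MAX_DEVICES}")
print(f"VMDK / virtual RDM:  {vmdk_total / TB:.2f} TB")
print(f"Physical RDM:        {prdm_total / PB:.2f} PB")
```

These are theoretical ceilings; in practice some of the 60 slots go to boot, swap, and log devices, so the usable figure is lower.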

VMware has consistently recommended, and rightly so, that virtual disks should be used whenever possible.  And over the years, the 60 SCSI devices and ~120 TB limits have generally not been a barrier to VMware’s expansion within the data center.  However, as more customers consider virtualizing their Business-Critical Applications (BCAs), we are starting to see these limits tested.  Michael Webster has a great blog post arguing for an increase in the 2 TB limit for a virtual disk and presenting various options for how to work around this limit.  My experience is that most enterprise customers who are looking to virtualize BCAs, such as Oracle and Exchange, do not view NFS or in-guest iSCSI as viable options due to performance concerns.  This means the best option for them has been to use physical RDMs, which provide the capacity and performance enterprises need but also impose certain feature limitations, such as losing the ability to perform Storage vMotion or cloning of those RDM disks.  But, for many customers, that is an acceptable trade-off since they can work around the feature limitations.

However, I’ve encountered two use cases where customers have had issues with virtualizing databases due to the 60 SCSI device limit and where using physical RDMs is not a viable workaround.  These use cases are Microsoft Exchange 2010 and Oracle 11g with Automatic Storage Management (ASM).

  • Microsoft Exchange – Exchange 2010 has a default soft limit of 1 TB per database which, with a Windows registry change, can be increased up to the NTFS maximum of 16 TB.  Microsoft has a recommended maximum of 2 TB per database (physical or virtualized), with most customers I know keeping to the 1 TB soft limit or smaller.  Exchange 2010 also allows a maximum of 100 databases per server which, assuming a 1:1 mapping between databases and virtual disks, means that customers would hit the 60 device limit of what a VM running Exchange can address before they hit the 100 database limit; that is even without accounting for log volumes.  Now, it is the case that most shops tend to scale out their Exchange environment instead of scaling up, so they are unlikely to need more than 60 devices on a single Exchange VM.  However, as bigger enterprises look to virtualize their Exchange environments, “monster Exchange” VMs become more of a possibility.
  • Oracle 11g with ASM – ASM is a volume manager and a file system for Oracle database files that supports single-instance Oracle Database and Oracle RAC configurations.  It is Oracle’s recommended storage management solution and an alternative to conventional volume managers, file systems, and raw devices.  Increasingly, I am speaking with customers who are, at Oracle’s prompting, looking to deploy ASM with their databases, whether on physical servers or virtualized.  And that is where the 60 device limit starts to become an issue for users with large Oracle databases that they want to virtualize on vSphere.  According to the Oracle ASM Administrator’s Guide, the stated limits for ASM without Exadata storage are as follows:
  • 2 terabytes (TB) maximum storage for each Oracle ASM disk
  • 20 petabytes (PB) maximum for the storage system

This effectively means that a customer looking to virtualize Oracle with ASM will be limited to ~110 TBs of capacity per server, whether they are using virtual disks or physical RDM disks.  While this is still sizable, it falls below the requirements of certain enterprises; for example, I have a customer who virtualized Oracle on a Vblock but has a database server that requires 250 TBs of capacity.  For customers like them, there are several options, none of which I would consider ideal:

  1. Re-architect their Oracle environment to use smaller databases across a larger number of VMs.  While there are certainly benefits to going this route beyond serving as a workaround to the 60 device limit, this may not be a desirable option for customers who may already have concerns about virtualizing their BCAs.
  2. Forgo ASM in favor of the native volume manager in the guest OS, leveraging physical RDM disks.  This is not a great option for customers who may already have standardized on ASM.
  3. Run Oracle with ASM on physical servers so they can leverage more than 60 devices per server.  An obvious loss for VMware but also a loss for customers who won’t be able to take advantage of VMware features such as HA or DRS and won’t be able to realize the benefits of virtualization.
  4. Run Oracle with ASM on Exadata. An obvious loss for VCE and VMware but also a loss for customers who won’t be able to realize the benefits of hyper-consolidation in their data center since they would have to run their databases on single-purpose systems.  EMC vSpecialist, Larry Grant, worked at Oracle in a former life and has a blog post that lays out the reasons why it is better for customers to run Oracle on a Vblock instead of on Exadata.
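To put numbers on why these workarounds come up at all, here is a rough sizing sketch for the two use cases above. It assumes every device is filled to its maximum size and ignores log, boot, and swap devices; the function names are hypothetical and only serve the illustration.

```python
import math

MAX_DEVICES_PER_VM = 60  # vSphere 5.1 limit discussed above


def devices_needed(capacity_tb, per_device_tb):
    """Minimum number of devices needed to hold capacity_tb."""
    return math.ceil(capacity_tb / per_device_tb)


def vms_needed(capacity_tb, per_device_tb, limit=MAX_DEVICES_PER_VM):
    """Minimum number of VMs if the capacity is spread across full VMs."""
    return math.ceil(devices_needed(capacity_tb, per_device_tb) / limit)


# Oracle with ASM: 250 TB database, 2 TB maximum per ASM disk.
oracle_devices = devices_needed(250, 2)
oracle_vms = vms_needed(250, 2)

# Exchange 2010: 100 databases at the 1 TB soft limit, one disk per database.
exchange_devices = devices_needed(100 * 1, 1)

print(f"Oracle 250 TB:  {oracle_devices} devices, at least {oracle_vms} VMs")
print(f"Exchange 100 DB: {exchange_devices} devices "
      f"(limit is {MAX_DEVICES_PER_VM})")
```

Under these assumptions the 250 TB database needs more than twice the devices a single VM can address, which is why splitting the database across VMs or leaving the hypervisor are the only options on the table.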

You may be wondering why Oracle with Exadata is an option for large databases; it’s actually not only because they are able to address more than 60 devices.  According to the same Oracle ASM Administrator’s Guide, cited above, the stated limits for ASM with Exadata storage are as follows:

  • 4 PB maximum storage for each Oracle ASM disk
  • 40 exabytes (EB) maximum for the storage system

I am not an expert in Oracle Enterprise Linux (OEL), which is the OS that runs on Exadata, but it is not obvious to me why ASM can support a 4 PB disk with OEL and only a 2 TB disk with any other version of Linux running as a guest OS and addressing physical RDM disks.  Is it strictly a technical difference or more of a support/we-want-you-to-buy-Exadata-from-us stance?

VMware has indicated on multiple occasions that they plan to rewrite their storage stack to accommodate virtual disks larger than ~2 TBs and more than 60 devices per VM.  I would think that may necessitate moving away from the SCSI-2 standard for their controller emulation and moving to another standard such as SAS.  Until then, anyone architecting virtualized BCA solutions should understand the configuration maximums of vSphere 5.1 and account for them in their designs.

