Back to Basics: vSphere Sizing For Business-Critical Applications

Following up on my previous post on “vSphere general sizing,” I want to briefly discuss sizing Business-Critical Applications (BCAs) for vSphere.  Business-Critical, a.k.a. Tier-One, Applications are databases, ERP systems, e-mail servers, and other applications that a customer has indicated are critical to the operation of their business, and there are numerous benefits to virtualizing BCAs as part of the journey to IT-as-a-Service/Cloud.  However, special considerations need to be taken into account when sizing for BCAs.

[Table: Top Reasons to Virtualize Business-Critical Applications, taken from VMware’s “Virtualizing Business-Critical Applications” white paper]

Here are some guidelines to consider:

vCPU Sizing

Ideally, we would have actual processor utilization data so we can determine whether we have enough CPU cycles to over-subscribe vCPUs.  In the absence of that data, it is recommended that we use a 1:1 ratio of vCPUs to pCPU cores; so when sizing a database server that requires 4 vCPUs, it should be assumed that 4 pCPU cores will be assigned exclusively to that BCA VM.  If a customer insists on over-subscribing by co-hosting a BCA VM with a tier 2 or 3 VM, consider using CPU reservations to guarantee resources for the BCA VM.  Otherwise, start with a 1:1 ratio and adjust over time if it is shown that pCPU resources are not fully utilized and over-subscription is safe to implement.
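As a rough illustration, here is a minimal Python sketch of the 1:1 arithmetic.  The VM names, vCPU counts, and the 16-core host are assumptions for the example, not recommendations.

```python
# Minimal sketch: estimate physical core demand for BCA VMs at a 1:1
# vCPU-to-pCPU-core ratio. VM names and sizes below are illustrative only.

bca_vms = {
    "sql-prod-01": 4,   # vCPUs required by each BCA VM (hypothetical)
    "erp-app-01": 8,
    "exchange-01": 4,
}

host_cores = 16         # physical cores per ESXi host (assumed)

# At 1:1, every vCPU maps to a dedicated physical core.
required_cores = sum(bca_vms.values())

print(f"vCPUs (= pCPU cores at 1:1): {required_cores}")
print(f"Fits on one {host_cores}-core host: {required_cores <= host_cores}")
```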

Memory Sizing

Generally, VMs tend to be memory constrained more than CPU constrained.  This means that memory over-subscription should be avoided for any BCA VM unless hard data dictates otherwise.  If a customer insists on over-subscribing by co-hosting a BCA with a tier 2 or 3 VM, consider using RAM reservations to guarantee resources for the BCA VM.  Otherwise, start with a 1:1 ratio and adjust over time if it is shown that physical memory is not fully utilized and over-subscription is safe to implement.
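A similar back-of-the-envelope check can be done for memory.  The sketch below assumes illustrative VM sizes and an arbitrary allowance for hypervisor overhead; actual overhead varies by host and VM configuration.

```python
# Minimal sketch: check that fully reserved BCA VM memory fits on a host
# without over-subscription. All figures are placeholders, not guidance.

bca_vm_ram_gb = {
    "sql-prod-01": 64,
    "erp-app-01": 96,
}

host_ram_gb = 256
hypervisor_overhead_gb = 8   # assumed allowance for ESXi and per-VM overhead

reserved_gb = sum(bca_vm_ram_gb.values())
available_gb = host_ram_gb - hypervisor_overhead_gb

print(f"Reserved for BCA VMs: {reserved_gb} GB of {available_gb} GB usable")
print(f"Over-subscribed: {reserved_gb > available_gb}")
```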

For both vCPU and vRAM sizing, consider failure modes and maintenance requirements when sizing for BCAs.  At minimum, an N+1 design should be considered so that if a single ESXi server hosting a BCA instance fails or has to be taken offline for maintenance, there are enough resources on the surviving HA cluster nodes to handle the load.  For certain BCAs, the HA cluster may call for an N+2 design to allow for a double-node failure, or for a node failure while another node is offline.
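To make the N+1/N+2 idea concrete, here is a small sketch that estimates how many hosts a cluster needs once spare capacity for failures or maintenance is included.  The per-host capacities and workload totals are placeholder values.

```python
# Minimal sketch: N+1 / N+2 cluster sizing. Given total BCA resource demand
# and per-host capacity, how many hosts are needed so the cluster can still
# carry the load with one (or two) hosts down? Inputs are illustrative.

import math

def hosts_needed(total_vcpus, total_ram_gb, cores_per_host, ram_per_host_gb,
                 spare_hosts=1):
    """Return host count including `spare_hosts` for failures/maintenance."""
    for_cpu = math.ceil(total_vcpus / cores_per_host)
    for_ram = math.ceil(total_ram_gb / ram_per_host_gb)
    return max(for_cpu, for_ram) + spare_hosts

# Example: 40 vCPUs and 480 GB of reserved RAM across the BCA VMs.
print("N+1:", hosts_needed(40, 480, 16, 256, spare_hosts=1), "hosts")
print("N+2:", hosts_needed(40, 480, 16, 256, spare_hosts=2), "hosts")
```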

When possible, size a BCA VM along Non-Uniform Memory Access (NUMA) node boundaries for best performance.  Because of NUMA design considerations, consider using smaller VMs instead of “Monster” VMs when feasible.  You can go back to my previous post on the impact of NUMA for details.
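As a simple illustration of the NUMA boundary check, the sketch below assumes hypothetical per-node core and memory figures; the real values come from the actual server hardware.

```python
# Minimal sketch: does a VM fit inside a single NUMA node? The per-node
# core and memory figures below are assumptions for the example.

cores_per_numa_node = 8
ram_per_numa_node_gb = 128

def fits_numa_node(vcpus, vram_gb):
    """True if the VM's vCPUs and vRAM fit within one NUMA node."""
    return vcpus <= cores_per_numa_node and vram_gb <= ram_per_numa_node_gb

print(fits_numa_node(vcpus=8, vram_gb=96))    # True  - stays NUMA-local
print(fits_numa_node(vcpus=12, vram_gb=96))   # False - spans NUMA nodes
```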

Storage Sizing

The main thing to understand with storage sizing is that, unlike CPU and RAM, there is no over-subscription.  Some people mistakenly believe that if a given RAID group can deliver 500 IOPS for a physical server, it can somehow deliver 800 IOPS for a virtual server.  The fact is, storage sizing for a virtual environment is not much different from sizing for a physical environment, except that you need to take into account what the storage design needs to look like once you aggregate the workload requirements of multiple servers onto a single ESXi host.  However, special care should be taken to understand the I/O profile of the BCA you intend to virtualize (see the sketch after the list below).  This includes knowing information such as:

  • Block size
  • Random vs. sequential I/O pattern
  • Read/Write ratio split
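To show why the read/write split matters, here is a small sketch that converts a front-end IOPS requirement into back-end disk IOPS using a RAID write penalty.  The workload figures and the RAID 5 penalty of 4 are illustrative; use the real I/O profile and array characteristics when sizing.

```python
# Minimal sketch: translate a BCA's front-end IOPS requirement into back-end
# disk IOPS using the read/write split and a RAID write penalty.

def backend_iops(frontend_iops, read_ratio, raid_write_penalty):
    """Back-end IOPS = reads + (writes * RAID write penalty)."""
    reads = frontend_iops * read_ratio
    writes = frontend_iops * (1 - read_ratio)
    return reads + writes * raid_write_penalty

# Example: a database needing 500 IOPS, 70/30 read/write, on RAID 5 (penalty 4).
print(backend_iops(500, read_ratio=0.70, raid_write_penalty=4))  # 950.0
```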

In most cases, VMFS should be used instead of Raw Device Mappings (RDMs).  In the past, an argument could be made for using RDMs with BCA servers because of virtualization overhead, which could be in the 10% to 15% range.  However, in vSphere 5 the storage stack has been rewritten to allow more parallelization of the tasks involved in an I/O request; as a result, the overhead is now more in the 1% to 5% range.

For best performance, spread the BCA across multiple datastores and use smaller VMs instead of a “Monster” VM if possible.  Doing either or both will give you more I/O queues and facilitate more parallelization of I/O requests.

Consider using VMware’s paravirtual SCSI (PVSCSI) adapter, since it incurs less CPU overhead and delivers better throughput for applications that use large-block I/Os.  There had been issues with small I/Os, but those were addressed back in vSphere 4.1.

For more information regarding virtualizing Business-Critical Applications, reference the BCA page and BCA blog on the VMware website.  In particular, take a look at the Virtualizing Business-Critical Applications white paper.  As mentioned in a previous post, I’ve also been helped by reading and listening to Michael Webster; you can hear him talk about “Virtualizing Tier One Applications” on the APAC Virtualization Roundtable Podcast.
