VSPHERE 5.1 STORAGE DRS LOAD BALANCING AND SIOC THRESHOLD ENHANCEMENTS
Lately I have been receiving questions on best practices and considerations for aligning the Storage DRS latency and the Storage I/O Control (SIOC) latency: how they are correlated and how to configure them to work optimally together. Let's start by identifying the purpose of each setting, review the enhancements vSphere 5.1 has introduced and discover the impact of misaligning both thresholds in vSphere 5.0.

Purpose of the SIOC threshold
The main goal of the SIOC latency threshold is to give fair access to the datastores, throttling virtual machine outstanding I/O to the datastores across multiple hosts to keep the measured latency below the threshold. It can have a restrictive effect on the I/O flow of virtual machines.

Purpose of the Storage DRS latency threshold
The Storage DRS latency is a threshold to trigger virtual machine migrations. To be more precise, if the average latency (VMObservedLatency) of the virtual machines on a particular datastore is higher than the Storage DRS threshold, Storage DRS marks that datastore as a "source" for load balancing migrations. In other words, that datastore provides the candidates (virtual machines) for Storage DRS to move around to solve the imbalance of the datastore cluster. This means that the Storage DRS threshold metric has no "restrictive" access limitations. It does not limit the ability of the virtual machines to send I/O to the datastore. It is just an indicator that tells Storage DRS which datastore to pick for load balancing operations.

SIOC throttling behavior
When the average device latency detected by SIOC is above the threshold, SIOC throttles the outstanding I/O of the virtual machines on the hosts connected to that datastore. However, due to differences in the number of shares, various I/O sizes, random versus sequential workloads and the spatial locality of the changed blocks on the array, it is almost certain that no two virtual machines will experience the same performance. Some virtual machines will experience a higher latency than other virtual machines running on that datastore. Remember that SIOC is driven by shares, not reservations; it cannot guarantee I/O slots (reservations). Long story short, when the datastore is experiencing latency, the VMkernel manages the outbound queue, resulting in a buildup of I/O somewhere higher up in the stack. The SIOC latency threshold is compared against the weighted average of D/AVG per host, where the weight is the number of IOPS on that host. For more information on how SIOC calculates the device average latency, please read the article "To which host-level latency statistic is the SIOC congestion threshold related?"

Does SIOC throttling have any effect on Storage DRS load balancing?
Whether SIOC throttling has an impact on Storage DRS load balancing depends on which vSphere version you run. As stated in the previous paragraph, if SIOC throttles the queues, the virtual machine I/O does not disappear; vSphere always allows the virtual machine to generate I/O to the datastore, it just builds up somewhere higher in the stack, between the virtual machine and the HBA queue. In vSphere 5.0, Storage DRS measures latency by averaging the device latency of the hosts running VMs on that datastore. This is almost the same metric as the SIOC latency. This means that when you set the SIOC latency equal to the Storage DRS latency, the latency builds up in the stack above the Storage DRS measurement point.
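Both SIOC (for throttling) and Storage DRS in vSphere 5.0 (for load balancing) work from host-level device latency, and the datastore-wide number is an IOPS-weighted average of D/AVG across the hosts. The short Python sketch below only illustrates that weighted-average arithmetic; the host figures are invented and this is not the actual VMkernel implementation.

```python
# Conceptual sketch: datastore-wide latency as an IOPS-weighted average of
# per-host device latency (D/AVG). Values are illustrative only.

def datastore_latency(host_stats):
    """host_stats: list of (iops, davg_ms) tuples, one per host using the datastore."""
    total_iops = sum(iops for iops, _ in host_stats)
    if total_iops == 0:
        return 0.0
    return sum(iops * davg for iops, davg in host_stats) / total_iops

# Example: one host pushes 2000 IOPS at 25 ms, another pushes 500 IOPS at 10 ms.
stats = [(2000, 25.0), (500, 10.0)]
latency = datastore_latency(stats)   # 22 ms, dominated by the busier host
print(f"Weighted datastore latency: {latency:.1f} ms")
print("Above 30 ms congestion threshold" if latency > 30 else "Below 30 ms congestion threshold")
```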
In the worst-case scenario, SIOC throttles the I/O and keeps it above the measurement point of Storage DRS, which makes the latency invisible to Storage DRS and therefore does not trigger a load balancing operation for that datastore.

Introducing vSphere 5.1 VMObservedLatency
To avoid this scenario, Storage DRS in vSphere 5.1 uses the metric VMObservedLatency. This metric measures the round trip of I/O from the moment the VMkernel receives it from the Virtual Machine Monitor (VMM), down to the datastore and all the way back to the VMM. This means that even when you set the SIOC latency to a lower threshold than the Storage DRS latency, Storage DRS still observes the latency building up in the kernel layer.

vSphere 5.1 automatic SIOC latency threshold
To help you avoid building up I/O in the host queue, vSphere 5.1 offers automatic threshold computation for SIOC. SIOC sets the latency threshold to the latency observed at 90% of the throughput level of the device. To determine this, SIOC derives a latency setting after a series of tests, mapping maximum throughput to a latency value. During the tests SIOC detects where the I/O throughput levels out while the latency keeps increasing. To be conservative, SIOC derives a latency value that allows the host to generate up to 90% of that throughput, leaving a burst space of 10%. This provides the best performance of the devices while avoiding unnecessary restrictions caused by latency building up in the queues. In my opinion, this feature alone warrants the upgrade to vSphere 5.1.

How to set the two thresholds to work optimally together
SIOC in vSphere 5.1 allows the host to go up to 90% of the throughput before it adjusts the queue length of each host and generates queuing in the kernel instead of queuing on the storage array. As Storage DRS uses VMObservedLatency, it monitors the complete stack. It observes the overall latency, regardless of where in the stack that latency occurs, and tries to move VMs to other datastores to level out the overall experienced latency in the datastore cluster. Therefore you do not need to worry about misaligning the SIOC latency and the Storage DRS I/O latency thresholds. If you are running vSphere 5.0, it is recommended to set the SIOC threshold to a higher value than the Storage DRS I/O latency threshold. Please refer to your storage vendor for the appropriate SIOC latency threshold.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
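As a rough illustration of the automatic threshold computation described above, the sketch below derives a latency value from a set of latency/throughput samples by picking the lowest latency at which the device reaches 90% of its peak throughput. The sample points are invented and the real SIOC injector logic is more sophisticated; this only demonstrates the idea of mapping a throughput percentage back to a latency value.

```python
# Conceptual sketch: derive a latency threshold from latency/throughput samples,
# picking the lowest latency at which the device reaches 90% of its peak throughput.
# Sample values are illustrative only; the real SIOC injector logic differs.

def derive_threshold(samples, fraction=0.9):
    """samples: list of (latency_ms, throughput_iops) measured at increasing load."""
    peak = max(tp for _, tp in samples)
    target = fraction * peak
    for latency, throughput in sorted(samples):      # walk from low to high latency
        if throughput >= target:
            return latency                           # first latency reaching 90% of peak
    return None

# Throughput levels out around 10k IOPS while latency keeps climbing.
samples = [(5, 4000), (10, 7000), (15, 9000), (20, 9800), (30, 10000), (40, 10050)]
print(f"Derived SIOC threshold: {derive_threshold(samples)} ms")   # 20 ms in this example
```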
STORAGE DRS AND ALTERNATIVE SWAP FILE LOCATIONS
By default a virtual machine swap file is stored in the working directory of the virtual machine. However, it is possible to configure the hosts in the DRS cluster to place the virtual machine swap files on an alternative datastore. A customer asked if he could create a datastore cluster exclusively used for swap files. Although you can use the datastores inside the datastore cluster to store the swap files, Storage DRS does not load balance the swap files inside the datastore cluster in this scenario. Here's why:

Alternative swap location
This alternative datastore can either be a local datastore or a shared datastore. Please note that placing the swap file on a non-shared local datastore impacts the vMotion lead time, as the vMotion process needs to copy the contents of the swap file to the swap file location of the destination host. However, a datastore that is shared by the hosts inside the DRS cluster can also be used; a valid use case is a small pool of SSD drives, providing a platform that reduces the impact of swapping in case of memory contention.

Storage DRS
What about Storage DRS and using alternative swap file locations? By default a swap file is placed in the working directory of a virtual machine. This working directory is "encapsulated" in a DRMdisk. A DRMdisk is the smallest entity Storage DRS can migrate. For example, when placing a VM with 3 hard disks in a datastore cluster, Storage DRS creates 4 DRMdisks: one DRMdisk for the working directory and a separate DRMdisk for each hard disk.

Extracting the swap file out of the working directory DRMdisk
When selecting an alternative location for swap files, vCenter will not place the swap file in the working directory of the virtual machine, in essence extracting it from, or placing it outside, the working directory DRMdisk entity. Therefore Storage DRS will not move the swap file during load balancing or maintenance mode operations, as it can only move DRMdisks.

Can a datastore inside a datastore cluster be the alternative swap file location?
Storage DRS does not encapsulate a swap file in its own DRMdisk, and therefore it is not recommended to use a datastore that is part of a datastore cluster as the DRS cluster swap file location. As Storage DRS cannot move these files, they can impact load balancing operations. The user interface actually reveals the incompatibility of a datastore cluster as a swap file location: when you configure the alternative swap file location, the user interface only displays datastores, not datastore cluster entities.

DRS responsibility
Storage DRS can operate with swap files placed on datastores external to the datastore cluster. It will move the DRMdisks of the virtual machines and leave the swap file in its specified location. It is the task of DRS to move the swap file, if it is placed on a non-shared swap file datastore, when migrating the compute state between two hosts.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
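To make the DRMdisk granularity tangible, here is a small conceptual Python sketch of how a virtual machine decomposes into the units Storage DRS can move: one DRMdisk for the working directory and one per virtual disk, while a swap file on an alternative location falls outside every DRMdisk. This is an illustration of the concept only, not Storage DRS code; the VM name and disk count are made up.

```python
# Conceptual sketch: decompose a VM into DRMdisks, the smallest units Storage DRS
# can migrate. Not actual Storage DRS code; names and counts are illustrative.

def decompose_vm(vm_name, num_disks, alternative_swap_location=False):
    """Return the DRMdisks Storage DRS can move, plus files it cannot move."""
    drmdisks = [f"{vm_name}/workingdir"]          # config files, logs, default swap location
    drmdisks += [f"{vm_name}/harddisk{i}" for i in range(1, num_disks + 1)]
    immovable = []
    if alternative_swap_location:
        # The swap file sits outside the working directory DRMdisk, so Storage DRS
        # cannot relocate it during load balancing or maintenance mode operations.
        immovable.append(f"{vm_name}.vswp (on the alternative swap datastore)")
    return drmdisks, immovable

# A VM with 3 hard disks yields 4 DRMdisks, regardless of where the swap file lives.
drm, fixed = decompose_vm("vm01", num_disks=3, alternative_swap_location=True)
print(f"{len(drm)} DRMdisks Storage DRS can move: {drm}")
print(f"Left in place: {fixed}")
```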
HA ADMISSION CONTROL IS NOT A CAPACITY MANAGEMENT TOOL.
I receive a lot of questions on why HA doesn't work when virtual machines are not configured with VM-level reservations. If no VM-level reservations are used, the cluster will indicate a failover capacity of 99%, ignoring the CPU and memory configuration of the virtual machines. Usually my reply is that HA admission control is not a capacity management tool, and I noticed I have been using this statement more and more lately. As it doesn't scale well to explain it on a per-customer basis, it might be a good idea to write a blog article about it.

The basics
Sometimes it's better to review the basics again and understand where the perception of HA and the actual intended purpose of the product part ways. Let's start with what HA admission control is designed for. In the Availability Guide the following two statements can be found: Quote 1:
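To show where the 99% figure comes from, here is a minimal sketch of the failover-capacity arithmetic, assuming the "Percentage of Cluster Resources Reserved" admission control policy: only reservations plus memory overhead are counted, not the configured sizes of the virtual machines. The cluster size, VM count and overhead figures are invented for illustration.

```python
# Conceptual sketch of the percentage-based admission control calculation.
# Only reservations and memory overhead are counted, not configured VM sizes.
# All numbers are invented for illustration.

def current_failover_capacity(total_capacity_mb, vms):
    """vms: list of (memory_reservation_mb, memory_overhead_mb) for powered-on VMs."""
    reserved = sum(res + overhead for res, overhead in vms)
    return (total_capacity_mb - reserved) / total_capacity_mb * 100

# 200 GB cluster, 50 VMs with no reservations and ~60 MB memory overhead each.
cluster_mb = 200 * 1024
vms = [(0, 60)] * 50
print(f"Current memory failover capacity: {current_failover_capacity(cluster_mb, vms):.0f}%")
# Prints ~99%: without VM-level reservations only the overhead is subtracted,
# so admission control reports almost the whole cluster as still available.
```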
WANT TO HAVE A VSPHERE 5.1 CLUSTERING DEEPDIVE BOOK FOR FREE?
Want to have a vSphere 5.1 clustering deepdive book for free? CloudPhysics is giving away some vSphere 5.1 clustering deepdive books. Do the following if you want to receive a copy:

Action required
1. Email [email protected] with a subject of "Book". No message is needed.
2. Register at http://www.cloudphysics.com/ by clicking "SIGN UP".
3. Install the CloudPhysics Observer vApp to activate your dashboard.

Eligibility rules: you are a new CloudPhysics user, and you fully install the CloudPhysics 'Observer' vApp in your vSphere environment.

The first 150 users get a free book, but what's even better, the CloudPhysics service gives you great insights into your current environment. For more info read the following blog posts: CloudPhysics in a nutshell and VM reservations and limits card - a closer look.
PARTIALLY CONNECTED DATASTORE CLUSTERS - WHERE CAN I FIND THE WARNINGS AND HOW TO SOLVE IT VIA THE WEB CLIENT?
During my Storage DRS presentation at VMworld I talked about datastore cluster architecture and covered the impact of partially connected datastore clusters. In short: when a datastore in a datastore cluster is not connected to all hosts of the connected DRS cluster, the datastore cluster is considered partially connected. This situation can occur when not all hosts are configured identically, or when new ESXi hosts are added to the DRS cluster.

The problem
I/O load balancing does not support partially connected datastores in a datastore cluster, and Storage DRS disables I/O load balancing for the entire datastore cluster. Not just on that single partially connected datastore, but on the entire cluster, effectively degrading a complete feature set of your virtual infrastructure. Therefore, having a homogeneous configuration throughout the cluster is imperative.

Warning messages
An entry is listed in the Storage DRS Faults window. In the vSphere web client:
1. Go to Storage.
2. Select the datastore cluster.
3. Select Monitor.
4. Select Storage DRS.
5. Select Faults.
The Connectivity menu option shows the Datastore Connection Status; in the case of a partially connected datastore, the message "Datastore Connection Missing" is listed. When clicking on the entry, the details are shown in the lower part of the view.

Returning to a fully connected state
To solve the problem, you must connect or mount the datastores to the newly added hosts. In the web client this is considered a host operation, therefore select the datacenter view and select the Hosts menu option.
1. Right-click a newly added host.
2. Select New Datastore.
3. Provide the name of the existing datastore.
4. Click Yes when the warning "Duplicate NFS Datastore Name" is displayed.
5. As the UI is using existing information, select Next until Finish.
6. Repeat the steps for the other new hosts.
After connecting all the new hosts to the datastore, check the Connectivity view in the Monitor menu of the datastore cluster.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
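If you would rather detect partially connected datastores from a script than click through the web client, the pyVmomi sketch below compares the hosts of a DRS cluster against the hosts that have each datastore of the datastore cluster mounted. The vCenter address, credentials, cluster name "Cluster01" and datastore cluster name "DatastoreCluster01" are placeholders; treat this as a starting point rather than a supported tool.

```python
# Sketch: report datastores in a datastore cluster that are not mounted on every
# host of a DRS cluster. Assumes pyVmomi is installed; names below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_by_name(content, vimtype, name):
    # Walk the inventory and return the first object of the given type with this name.
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(obj for obj in view.view if obj.name == name)

si = SmartConnect(host="vcenter.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

cluster = find_by_name(content, vim.ClusterComputeResource, "Cluster01")
pod = find_by_name(content, vim.StoragePod, "DatastoreCluster01")
cluster_hosts = {host.name for host in cluster.host}

for ds in pod.childEntity:
    if not isinstance(ds, vim.Datastore):
        continue
    mounted = {mount.key.name for mount in ds.host if mount.mountInfo.mounted}
    missing = cluster_hosts - mounted
    if missing:
        print(f"{ds.name} is partially connected, missing on: {sorted(missing)}")

Disconnect(si)
```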
VSPHERE 5.1 DRS ADVANCED OPTION LIMITVMSPERESXHOST
During the Resource Management group discussion here at VMworld Barcelona, a customer asked me about limiting the number of VMs per host. vSphere 5.1 contains an advanced option on DRS clusters to do this. If the advanced option "LimitVMsPerESXHost" is set, DRS will not admit or migrate more VMs to a host than that number. For example, when setting LimitVMsPerESXHost to 40, each host allows up to 40 virtual machines.

No correction of existing violations
Please note that DRS will not correct any existing violation if the advanced option is set while virtual machines are active in the cluster. This means that if you set LimitVMsPerESXHost to 40 and at that time 45 virtual machines are running on an ESXi host, DRS will not migrate virtual machines off that host. However, it does not allow any more virtual machines on the host. DRS will not allow any power-on or migration to the host, whether manual (by the administrator) or automatic (by DRS).

High Availability
As this is a DRS cluster setting, HA will not honor the setting during a host failover operation. This means that HA can power on as many virtual machines on a host as it deems necessary. This avoids a denial of service caused by not allowing virtual machines to power on when LimitVMsPerESXHost is set too conservatively.

Impact on load balancing
Please be aware that this setting can impact VM happiness, as it can restrict DRS in finding a balance with regards to CPU and memory distribution.

Use cases
This setting is primarily intended to contain the failure domain. A popular analogy to describe this setting would be "limiting the number of eggs in one basket". As virtual infrastructures are generally dynamic, try to find a setting that restricts the impact of a host failure without restricting the growth of the virtual machines. I'm really interested in feedback on this advanced setting, especially if you consider implementing it, the use case, and whether you want to see this setting developed further.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
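For those who prefer to set the advanced option from a script rather than the client, the following pyVmomi sketch shows one way to add "LimitVMsPerESXHost" to a cluster's DRS configuration. It assumes an existing authenticated ServiceInstance ("si"), the cluster name "Cluster01" is a placeholder, and the reconfigure call is expected to merge the change with the existing configuration; please validate it in a lab first, as this is a sketch and not a validated tool.

```python
# Sketch: set the DRS advanced option LimitVMsPerESXHost on a cluster with pyVmomi.
# Assumes an existing authenticated ServiceInstance 'si'; the cluster name is a placeholder.
from pyVmomi import vim

def set_vm_limit_per_host(si, cluster_name, limit):
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == cluster_name)

    spec = vim.cluster.ConfigSpecEx()
    spec.drsConfig = vim.cluster.DrsConfigInfo()
    # DRS advanced options are key/value pairs on the DRS configuration.
    spec.drsConfig.option = [vim.option.OptionValue(key="LimitVMsPerESXHost",
                                                    value=str(limit))]
    # modify=True applies this change on top of the existing cluster configuration.
    return cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)

# Example: allow at most 40 VMs per host in cluster "Cluster01" (placeholder name).
# task = set_vm_limit_per_host(si, "Cluster01", 40)
```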
CHANGING YOUR VCENTER LOGGING LEVEL USING THE WEBCLIENT
In order to monitor some load behavior I needed to increase the logging level of the vCenter server. The logging level is still included in the vCenter Server Settings; however, it takes a few more clicks to get to the Statistics option compared to the old vSphere client.
1. In the home screen, click on vCenter.
2. Click on the vCenter Servers link in the Inventory list.
3. Select the vCenter server (probably Localhost).
4. Select the Manage tab in the right pane.
5. Select Settings.
6. Click on the Edit button located on the far right side of the screen.
7. Change the appropriate Statistics setting.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
CLOUDPHYSICS VM RESERVATION & LIMITS CARD – A CLOSER LOOK
The VM Reservation and Limits card was released yesterday. CloudPhysics decided to create this card based on the popularity of this topic in the contest. So what does this card do? Let's have a closer look. This card provides you an easy overview of all the virtual machines configured with any reservation or limit for CPU and memory.

Reservations are a great tool to guarantee a virtual machine continuous access to physical resources. When running business critical applications, reservations can provide a constant performance baseline that helps you meet your SLA. However, reservations can impact your environment, as VM reservations impact the resource availability of other virtual machines in your virtual infrastructure. They can lower your consolidation ratio (The Admission Control Family) and can even impact other vSphere features such as vSphere High Availability. The CloudPhysics HA Simulation card can help you understand the impact of reservations on HA.

Besides reservations, virtual machine limits are displayed. A limit restricts the virtual machine's access to physical resources. A limit can be helpful to test an application during various levels of resource availability. However, virtual machine limits are not visible to the guest OS; therefore it cannot scale and size its own memory management (or even worse, the application's memory management) to reflect the availability of physical memory. For more information about memory limits, please read this post by Duncan: Memory limits. As the VMkernel is forced to provide alternative memory resources, limits can lead to increased use of VM swap files. This can lead to performance problems for the application, but can also impact other virtual machines and subsystems used in the virtual infrastructure. The following article zooms in on one of the many problems of relying on swap files: Impact of host local VM swap on HA and DRS.

Color indicators
As virtual machine level limits can impact the performance of the entire virtual infrastructure, the CloudPhysics engineers decided to add an additional indicator to help you easily detect limits. When a virtual machine is configured with a memory limit that is still greater than 50% of its configured size, an amber dot is displayed next to the configured limit size. If the limit is smaller than or equal to 50% of its configured size, a red dot is displayed next to the limit size. Similarly for CPU limits: an amber dot is displayed when the limit of a virtual machine is set but is more than 500MHz; a red dot indicates that the virtual machine is configured with a CPU limit of 500MHz or less. For example: virtual machine Load06 is configured with 16GB of memory. A limit is set at 8GB (8192MB); this limit is equal to 50% of the configured size. Therefore the VM Reservation and Limits card displays the configured limit in red and presents an additional red dot.

Flow of information
The indicators also form a natural divider between the memory resource controls and the CPU controls. As memory resource controls impact the virtual infrastructure more than the CPU resource controls, the card displays the memory resource controls on the left side of the screen. We are very interested in hearing feedback about this card, please leave a comment.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
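To make the color rules explicit, here is a tiny Python sketch that classifies a configured limit the same way the card is described to do it: memory limits above 50% of the configured size get an amber dot, limits at or below 50% get a red dot, and CPU limits get an amber dot above 500MHz and a red dot at 500MHz or below. The Load06 values are taken from the example in the post; the function names are my own.

```python
# Sketch of the color-indicator rules described above (illustrative only).

def memory_limit_color(configured_mb, limit_mb):
    if limit_mb is None:                       # no memory limit configured
        return None
    return "amber" if limit_mb > 0.5 * configured_mb else "red"

def cpu_limit_color(limit_mhz):
    if limit_mhz is None:                      # no CPU limit configured
        return None
    return "amber" if limit_mhz > 500 else "red"

# Load06: 16GB configured, limit set to 8GB (8192MB) -- exactly 50%, so red.
print(memory_limit_color(configured_mb=16 * 1024, limit_mb=8192))   # red
print(cpu_limit_color(limit_mhz=400))                                # red
print(cpu_limit_color(limit_mhz=1000))                               # amber
```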
FROM THE ARCHIVES - AN OLD ISOMETRIC DIAGRAM
While searching for a diagram I stumbled upon an old one I made in 2007. I think this diagram started my whole obsession with diagrams and with adding "cleanness" to my diagrams. This diagram depicts a virtual infrastructure located in two datacenters with replication between them. The infrastructure is no longer in use, but to make absolutely sure, I changed the device names into generic text labels such as ESX host, array, SW switch, etc. Back then I really liked to draw in the isometric style. Now I'm more focused on block diagrams and on trying to minimize the number of components in a diagram. In essence I follow the words of Colin Chapman: "Simplify, then add lightness." But then applied to diagrams :) The fact that this diagram is still stored on my system tells me that I'm still very proud of it. So that made me wonder: which diagram did you design that you are proud of?

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman