CLOUDPHYSICS VM RESERVATION & LIMITS CARD – A CLOSER LOOK
The VM Reservation and Limits card was released yesterday. CloudPhysics decided to create this card based on the popularity of the topic in the contest. So what does this card do? Let’s have a closer look.

This card provides an easy overview of all the virtual machines configured with a reservation or limit for CPU and memory. Reservations are a great tool to guarantee a virtual machine continuous access to physical resources. When running business-critical applications, reservations can provide a constant performance baseline that helps you meet your SLA. However, reservations can impact your environment, as a VM reservation affects the resource availability of the other virtual machines in your virtual infrastructure. It can lower your consolidation ratio (see: The Admission Control Family) and it can even impact other vSphere features such as vSphere High Availability. The CloudPhysics HA Simulation card can help you understand the impact of reservations on HA.

Besides reservations, virtual machine limits are displayed. A limit restricts the virtual machine’s access to physical resources. A limit can be helpful to test an application under various levels of resource availability. However, virtual machine limits are not visible to the guest OS; therefore it cannot scale and size its own memory management (or, even worse, the application memory management) to reflect the availability of physical memory. For more information about memory limits, please read this post by Duncan: Memory limits. As the VMkernel is forced to provide alternative memory resources, limits can lead to increased use of VM swap files. This can lead to performance problems for the application, but can also impact other virtual machines and subsystems used in the virtual infrastructure. The following article zooms in on one of the many problems when relying on swap files: Impact of host local VM swap on HA and DRS.

Color indicators
As virtual machine level limits can impact the performance of the entire virtual infrastructure, the CloudPhysics engineers decided to add an additional indicator to help you easily detect limits. When a virtual machine is configured with a memory limit that is still greater than 50% of its configured size, an amber dot is displayed next to the configured limit size. If the limit is smaller than or equal to 50% of its configured size, a red dot is displayed next to the limit size. Similarly for CPU limits: an amber dot is displayed when a limit is set but is more than 500MHz, while a red dot indicates that the virtual machine is configured with a CPU limit of 500MHz or less. (A small sketch at the end of this post expresses these thresholds in code.)

For example: virtual machine Load06 is configured with 16GB of memory. A limit is set to 8GB (8192MB); this limit is equal to 50% of the configured size. Therefore the VM Reservation and Limits card displays the configured limit in red and presents an additional red dot.

Flow of information
The indicators are also a natural divider between the memory resource controls and the CPU resource controls. As memory resource controls impact the virtual infrastructure more than the CPU resource controls, the card displays the memory resource controls on the left side of the screen.

We are very interested in hearing feedback about this card, please leave a comment. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
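To make the indicator thresholds described under “Color indicators” concrete, here is a minimal sketch in Python. This is not CloudPhysics code; the function names and return values are my own, only the 50% and 500MHz rules come from the post.

```python
# Minimal sketch (not CloudPhysics code) of how the card's limit indicators
# could be derived. Thresholds follow the rules described in the post;
# function names and return values are hypothetical.

def memory_limit_indicator(configured_mb, limit_mb):
    """Return the indicator color for a memory limit, or None if no limit is set."""
    if limit_mb is None or limit_mb >= configured_mb:
        return None                      # no effective limit
    if limit_mb > 0.5 * configured_mb:
        return "amber"                   # limit still above 50% of configured size
    return "red"                         # limit at or below 50% of configured size

def cpu_limit_indicator(limit_mhz):
    """Return the indicator color for a CPU limit, or None if no limit is set."""
    if limit_mhz is None:
        return None
    return "amber" if limit_mhz > 500 else "red"

# Example from the post: Load06 has 16GB configured and an 8GB (8192MB) limit,
# which is exactly 50% of the configured size, so the card shows a red dot.
print(memory_limit_indicator(16384, 8192))  # -> "red"
print(cpu_limit_indicator(500))             # -> "red"
```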
FROM THE ARCHIVES - AN OLD ISOMETRIC DIAGRAM
While searching for a diagram I stumbled upon an old diagram I made in 2007. I think this diagram started my whole obsession with diagrams and with adding “cleanness” to my diagrams. This diagram depicts a virtual infrastructure located in two datacenters with replication between them. This infrastructure is no longer in use, but to make absolutely sure, I changed the device names into generic text labels such as ESX host, array, SW switch, etc. Back then I really liked to draw in the isometric style. Now I’m more focused on block diagrams and on trying to minimize the number of components in a diagram. In essence I follow the words of Colin Chapman: “Simplify, then add lightness.” But then applied to diagrams :) The fact that this diagram is still stored on my system tells me that I’m still very proud of it. So that made me wonder: which diagram did you design and are you proud of? Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
STORAGE DRS AUTOMATION LEVEL AND INITIAL PLACEMENT BEHAVIOR
Recently I was asked why Storage DRS was missing a “Partially Automated” mode. Storage DRS has two automation levels: No Automation (Manual mode) and Fully Automated mode. When comparing this with DRS, we notice that Storage DRS is missing a “Partially Automated” mode. But in reality the modes of Storage DRS cannot be compared to DRS at all. This article explains the difference in behavior.

DRS automation modes
There are three cluster automation levels:
Manual automation level: When a virtual machine is configured with the manual automation level, DRS generates both initial placement and load-balancing migration recommendations, but the user needs to manually approve these recommendations.
Partially automated level: DRS automatically places a virtual machine configured with the partially automated level, but it will generate migration recommendations that require manual approval.
Fully automated level: DRS automatically places a virtual machine on a host, and vCenter automatically applies the migration recommendations generated by DRS.

Storage DRS automation modes
There are two datastore cluster automation levels:
No Automation (Manual mode): Storage DRS will make migration recommendations for virtual machine storage, but will not perform automatic migrations.
Fully Automated: Storage DRS will make migration recommendations for virtual machine storage, and vCenter automatically confirms these migration recommendations.

No automatic initial placement in Storage DRS
Storage DRS does not provide placement recommendations for vCenter to apply automatically. (Remember that DRS and Storage DRS only generate recommendations; it is vCenter that actually applies these recommendations if set to automatic.) The automation level only applies to migration recommendations for existing virtual machines inside the datastore cluster. However, Storage DRS does analyze the current state of the datastore cluster and generates initial placement recommendations based on the space utilization and I/O load of the datastores and the disk footprint and affinity rule set of the virtual machine. When provisioning a virtual machine, the summary screen in the user interface displays a datastore recommendation. When clicking on “more recommendations”, less optimal recommendations are displayed. This screen provides information about the space utilization % before placement, the space utilization % after the virtual machine is placed, and the measured I/O latency before placement. Please note that even when I/O load balancing is disabled, Storage DRS uses overall vCenter I/O statistics to determine the best placement for the virtual machine. In this case the I/O latency metric is a secondary metric, which means that Storage DRS applies weighting to the space utilization and the overall I/O latency: it will satisfy space utilization first before selecting a datastore with an overall lower I/O latency. (A small illustrative sketch at the end of this post mimics this ranking order.)

Adding new hard disks to an existing VM in a datastore cluster
As vCenter does not apply initial placement recommendations automatically, adding new disks to an existing virtual machine will also generate an initial placement recommendation. The placement of the disk is determined by the default affinity cluster rule. The datastore recommendation depicted below shows that the new hard disk is placed on datastore nfs-f-01. Why? Because Storage DRS needs to satisfy storage initial placement requests, and in this case that means satisfying the datastore cluster default affinity rule. If the datastore cluster were configured with a VMDK anti-affinity rule, the datastore recommendation would show any other datastore except datastore nfs-f-01. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
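The following sketch is only an illustration of the ranking behavior described above (space utilization first, I/O latency as a secondary criterion); it is not the actual Storage DRS algorithm, and the datastore values and names are made up.

```python
# Illustrative sketch only - not the actual Storage DRS algorithm. It mimics the
# ranking described in the post: projected space utilization after placement is
# the primary criterion and observed I/O latency is the secondary tie-breaker.
# The Datastore tuple and the candidate values are invented for the example.

from collections import namedtuple

Datastore = namedtuple("Datastore", "name capacity_gb used_gb latency_ms")

def rank_placements(datastores, vm_footprint_gb):
    """Rank datastores for initial placement: lowest projected space utilization
    first, lower observed latency as the secondary criterion."""
    def key(ds):
        projected_util = (ds.used_gb + vm_footprint_gb) / ds.capacity_gb
        return (round(projected_util, 2), ds.latency_ms)
    return sorted(datastores, key=key)

candidates = [
    Datastore("nfs-f-01", capacity_gb=500, used_gb=200, latency_ms=4.0),
    Datastore("nfs-f-02", capacity_gb=500, used_gb=350, latency_ms=2.0),
]
for ds in rank_placements(candidates, vm_footprint_gb=50):
    print(ds.name)   # nfs-f-01 ranks first: lower projected space utilization
```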
STORAGE DRS DEVICE MODELING BEHAVIOR
During a recent meeting the behavior of Storage DRS device modeling was discussed. When I/O load balancing is enabled, Storage DRS leverages the SIOC injector to determine the device characteristics of the disks backing the datastore. Because the injector stops when activity is detected on the datastore, the customer was afraid that Storage DRS wasn’t able to get a proper model of his array due to the high levels of activity seen on the array. Storage DRS was designed to cope with these environments, and as the customer was reassured after I explained the behavior, I thought it might be interesting enough to share it with you too.

The purpose of device modeling
Device modeling is used by Storage DRS to characterize the performance levels of a datastore. This information is used when Storage DRS needs to predict the benefit of a possible migration of a virtual machine. The workload model provides information about the I/O behavior of the VM; Storage DRS uses that as input and combines it with the device model of the datastore in order to predict the increase in latency after the move. The device modeling of the datastore is done with the SIOC injector.

The workload
To get a proper model, the SIOC injector issues random read I/O to the disk. SIOC uses different amounts of outstanding I/O to measure the latency. The duration of the complete cycle is 30 seconds and it is triggered once a day per datastore. Although it’s a short-lived process, this workload does generate some overhead on the array, and Storage DRS is designed to enable storage performance for your virtual machines, not to interfere with them. Therefore this workload will not run when activity is detected on the devices backing the datastore.

Timer
As mentioned, the device modeling process runs for 30 seconds in order to characterize the device. If the I/O injector starts and the datastore is active or becomes active, the I/O injector will wait for 1 minute before trying again. If the datastore is still busy, it will try again in 2 minutes, after that it idles for 4 minutes, then 8 minutes, 16 minutes, 32 minutes, 1 hour and finally 2 hours. If the datastore is still busy two hours after the initial start, it will try to start the device modeling at an interval of 2 hours until the end of the day. (The sketch at the end of this post lists this retry schedule.) If SIOC is not able to characterize the disk during that day, it will use the average value of all the other datastores, in order not to influence the load-balancing operations with false information that would favor this disk over other datastores that did provide actual data. The next day the SIOC injector will try to model the device again, but it shifts the start time back and forth by 2 hours from the previous period; this way, over the course of the year, Storage DRS will retrieve info across every period of the day.

Key takeaway
Overall we do not expect the array to be busy 24/7; there is always a window of 30 seconds where the datastore is idling. Having troubleshot many storage-related problems, I know arrays are not stressed all day long, therefore I’m more than confident that Storage DRS will have accurate device models to use for its prediction models. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
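As a rough sketch, the retry schedule described in the Timer section can be written out as follows. This only reproduces the back-off intervals from the post; it is not SIOC injector code, and the function name is mine.

```python
# A rough sketch of the retry schedule described above (minutes between
# attempts). It only reproduces the back-off intervals from the post; it is
# not SIOC injector code.

def modeling_retry_schedule(day_minutes=24 * 60):
    """Yield the waiting time in minutes before each retry of the 30-second
    device-modeling run, for as long as the day lasts."""
    backoff = [1, 2, 4, 8, 16, 32, 60, 120]      # doubling up to two hours
    elapsed = 0
    for wait in backoff:
        elapsed += wait
        if elapsed > day_minutes:
            return
        yield wait
    while elapsed + 120 <= day_minutes:          # then every two hours until end of day
        elapsed += 120
        yield 120

print(list(modeling_retry_schedule()))
# [1, 2, 4, 8, 16, 32, 60, 120, 120, 120, ...]
```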
APPLY USER-DEFINED STORAGE CAPABILITIES TO MULTIPLE DATASTORES AT ONCE
To get a datastore cluster to surface a (user-defined) storage capability, all datastores inside the datastore cluster must be configured with the same storage capability. When creating storage capabilities, the UI does not contain a view for associating a storage capability with multiple datastores. However, that does not mean the web client does not provide you with the ability to do so; just use the multi-select function of the web client. Go to Storage, select the datastore cluster, select Related Objects and go to the Datastores view. To select all datastores, click the first datastore, hold Shift and select the last datastore. Right-click and select Assign Storage Capabilities. Select the appropriate storage capability and click OK. The datastore cluster summary tab now shows the user-defined storage capability. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
AVOIDING VMDK LEVEL OVER-COMMITMENT WHILE USING THIN DISKS AND STORAGE DRS
The behavior of thin-provisioned VMDKs in a datastore cluster is quite interesting. Storage DRS supports the use of thin-provisioned disks and is aware of both the configured size and the actual data usage of the virtual disk. When determining placement of a virtual machine, Storage DRS verifies the disk usage of the files stored on the datastore. To avoid getting caught out by instant data growth of the existing thin-disk VMDKs, Storage DRS adds a buffer space to each thin disk. This buffer zone is determined by the advanced setting “PercentIdleMBinSpaceDemand”. This setting controls how conservative Storage DRS is when determining the available space on the datastore for load-balancing and initial placement operations of virtual machines.

IdleMB
The main element of the advanced option “PercentIdleMBinSpaceDemand” is the amount of IdleMB a thin-provisioned VMDK file contains. When a thin disk is configured, the user determines the maximum size of the disk. This configured size is referred to as “provisioned space”. When a thin disk is in use, it contains a certain amount of data. The size of the actual data inside the thin disk is referred to as “allocated space”. The space between the allocated space and the provisioned space is identified as the IdleMB. Let’s use this in an example. VM1 has a single VMDK on Datastore1. The total configured size of the VMDK is 6GB. VM1 has written 2GB to the VMDK, which means the amount of IdleMB is 4GB.

PercentIdleMBinSpaceDemand
The PercentIdleMBinSpaceDemand setting defines the percentage of IdleMB that is added to the allocated space of a VMDK during the free space calculation of the datastore. The default value is set to 25%. Using the previous example, the PercentIdleMBinSpaceDemand is applied to the 4GB of unallocated space: 25% of 4GB = 1GB.

Entitled space use
Storage DRS adds the result of the PercentIdleMBinSpaceDemand calculation to the consumed space to determine the “entitled space use”. In this example the entitled space use is: 2GB + 1GB = 3GB.

Calculation during placement
The size of Datastore1 is 10GB. VM1’s entitled space use is 3GB, which means that Storage DRS determines that Datastore1 has 7GB of available free space. (A small worked sketch of this calculation follows the use cases below.)

Changing the PercentIdleMBinSpaceDemand default setting
Any value from 0% to 100% is valid. This setting is applied at the datastore cluster level. There can be multiple reasons to change the default percentage. By using 0%, Storage DRS will only use the allocated space, allowing high consolidation. This might be useful in environments with static or extremely slow data growth. There are multiple use cases for setting the percentage to 100%, effectively disabling over-commitment at the VMDK level. Setting the value to 100% forces Storage DRS to use the full size of the VMDK in its space usage calculations. Many customers are comfortable managing over-commitment of capacity only at the storage array layer. This change allows the customer to use thin disks on thin-provisioned datastores.

Use case 1: NFS datastores
One use case is NFS datastores. The default behavior of vSphere is to create thin disks when a virtual machine is placed on an NFS datastore. This forces the customer to accept a risk of over-commitment at the VMDK level. By setting the value to 100%, Storage DRS will use the provisioned space during free space calculations instead of the allocated space.

Use case 2: Safeguard against unintentional use of thin disks
This setting can also be used as a safeguard against unintentional use of thin disks. Many customers have multiple teams managing the virtual infrastructure: one team manages the architecture, while another team is responsible for provisioning the virtual machines. The architecture team does not want over-commitment at the VMDK level, but is dependent on the provisioning team to follow guidelines and only use thick disks. By setting “PercentIdleMBinSpaceDemand” to 100%, the architecture team ensures that Storage DRS calculates datastore free space based on provisioned space, simulating “thick disks only” behavior.

Use case 3: Reducing Storage vMotion overhead while avoiding over-commitment
By setting the percentage to 100%, no over-commitment is allowed on the datastore, yet the efficiency advantage of using thin disks remains. Storage DRS uses the allocated space to calculate the risk and the cost of a migration recommendation when a datastore exceeds its I/O or space utilization threshold. This allows Storage DRS to select the VMDK that generates the lowest amount of overhead: vSphere only needs to move the used data blocks instead of all the zeroed-out blocks, reducing CPU cycles, and overhead on the storage network is reduced, as only used blocks need to traverse the storage network. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
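Here is the worked sketch of the entitled-space calculation referenced above. Only the arithmetic from the post is reproduced; the function and variable names are mine.

```python
# A small worked sketch of the entitled-space calculation described in the post.
# Only the arithmetic from the post is reproduced; the function names are hypothetical.

def entitled_space_gb(provisioned_gb, allocated_gb, percent_idle_mb_in_space_demand=25):
    """Allocated space plus a buffer of PercentIdleMBinSpaceDemand % of the idle space."""
    idle_gb = provisioned_gb - allocated_gb
    return allocated_gb + (percent_idle_mb_in_space_demand / 100.0) * idle_gb

# Example from the post: a 6GB thin disk with 2GB written and the default 25%.
entitled = entitled_space_gb(provisioned_gb=6, allocated_gb=2)   # 2 + 0.25 * 4 = 3GB
free_after_placement = 10 - entitled                             # 10GB datastore -> 7GB free
print(entitled, free_after_placement)                            # 3.0 7.0

# Setting the percentage to 100 makes Storage DRS use the full provisioned size,
# effectively disabling over-commitment at the VMDK level:
print(entitled_space_gb(6, 2, 100))                              # 6.0
```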
STORAGE DRS DEMO AVAILABLE ON VMWARE TV
If you haven’t seen Storage DRS in action, check out the Storage DRS demo I created for VMware TV. Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
HOW TO CREATE VM TO HOST AFFINITY RULES USING THE WEBCLIENT
This article shows you how to create a VM to Host affinity rule using the new web client.
1. Select Hosts and Clusters in the home screen.
2. Select the appropriate cluster.
3. Select the Manage tab and click on Settings.
4. Click on the » to expand the cluster settings menu.
5. Select DRS Groups.
6. Click on Add to create a DRS group. The dropdown box provides the ability to create a VM DRS group and a Host DRS group. The behavior of this window is a little tricky: when you create a group, you need to click on OK to actually create the group. If you create a VM DRS group first and then select the Host DRS group in the dropdown box before you click OK, the VM DRS group configuration is discarded.
7. Create the VM DRS group and give the VM group a meaningful name.
8. Click on Add to select the virtual machines.
9. Click on OK to add the virtual machines to the group.
10. Review the configuration and click on OK to create the VM DRS group.
11. Click on Add again to create the Host DRS group.
12. Select Host DRS Group in the dropdown box and provide a name for the Host DRS group.
13. Click on Add to select the hosts that participate in this group.
14. Click on OK to add the hosts to the group.
15. Review the configuration and click on OK to create the Host DRS group.
16. The DRS Groups view displays the different DRS groups in a single view.
The groups are created, now it’s time to create the rules.
17. Select DRS Rules in the cluster settings menu.
18. Click on Add to create the rule.
19. Provide a name for the rule and check whether the rule is enabled (enabled by default).
20. Select the “Virtual Machines to Hosts” rule in the Type dropdown box.
21. Select the appropriate VM group and the corresponding Host group.
22. Select the type of affinity rule. For more information about the difference between should and must rules, read the article: “Should or Must VM-Host affinity rules?”. In this example I’m selecting the should rule.
23. Click on OK to create the rule.
24. Review your configuration in the DRS Rules screen.
Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
TECHNICAL PAPER: “VMWARE VCLOUD DIRECTOR RESOURCE ALLOCATION MODELS” AVAILABLE FOR DOWNLOAD
Today the technical paper “VMware vCloud Director Resource Allocation Models” has been made available for download on VMware.com. This whitepaper covers the allocation models used by vCloud Director 1.5 and how they interact with the vSphere layer. The paper helps you correlate the vCloud allocation model settings with the vSphere resource allocation settings; for example, what happens at the vSphere layer when you set a guarantee on an Org vDC configured with the Allocation Pool model. It provides insight into the distribution of resources at both the vCloud layer and the vSphere layer, and illustrates the impact of various allocation model settings on vSphere admission control. The paper contains a full chapter about allocation models in practice and demonstrates the effect of using various combinations of allocation models within a single provider vDC. Please note that this paper is based on vCloud Director 1.5. http://www.vmware.com/resources/techresources/10325