LIMITING THE NUMBER OF STORAGE VMOTIONS

When entering datastore maintenance mode, Storage DRS moves virtual machines out of the datastore as fast as it can. The number of virtual machines that can be migrated in or out of a datastore concurrently is 8. This limit is related to the concurrent migration limits of hosts, networks and datastores. To manage and limit the number of concurrent migrations, whether by vMotion or Storage vMotion, a cost and limit factor is applied. Although the term limit is used, a better description of limit is maximum cost: for a migration operation to start, the in-use cost plus the cost of the new operation cannot exceed the max cost (limit).

A vMotion and a Storage vMotion are considered operations; the ESXi host, network and datastore are considered resources. A resource has both a max cost and an in-use cost. When an operation is started, the in-use cost plus the new operation cost cannot exceed the max cost. The operation cost of a Storage vMotion on a host is 4, and the max cost of a host is 8. If one Storage vMotion operation is running, the in-use cost of the host resource is 4, allowing one more Storage vMotion process to start without exceeding the host limit. As a Storage vMotion operation also counts against the storage resource, the max cost and in-use cost of the datastore need to be factored in as well. The operation cost of a Storage vMotion for datastores is set to 16 and the max cost of a datastore is 128. This means that 8 concurrent Storage vMotion operations can be executed on a datastore. These operations can be started from multiple hosts, but no more than 2 Storage vMotions from the same host due to the max cost of a Storage vMotion operation at the host level.

[Image: Storage vMotion in progress]

How to throttle the number of Storage vMotion operations?
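Before throttling anything, it helps to check the cost arithmetic described above. A minimal sketch (the cost values are taken from the text; the function name is mine):

```python
# Sketch of the vMotion/Storage vMotion cost model described above.
# Cost values from the text: host max 8, Storage vMotion host cost 4,
# datastore max 128, Storage vMotion datastore cost 16.

HOST_MAX_COST = 8
DATASTORE_MAX_COST = 128
SVMOTION_HOST_COST = 4
SVMOTION_DATASTORE_COST = 16

def can_start_svmotion(host_in_use: int, datastore_in_use: int) -> bool:
    """An operation may start only if the in-use cost plus the new
    operation cost stays within the max cost of every resource involved."""
    host_ok = host_in_use + SVMOTION_HOST_COST <= HOST_MAX_COST
    datastore_ok = datastore_in_use + SVMOTION_DATASTORE_COST <= DATASTORE_MAX_COST
    return host_ok and datastore_ok

# Per-host ceiling: 8 // 4 = 2 concurrent Storage vMotions per host
print(HOST_MAX_COST // SVMOTION_HOST_COST)
# Per-datastore ceiling: 128 // 16 = 8 concurrent Storage vMotions
print(DATASTORE_MAX_COST // SVMOTION_DATASTORE_COST)
```

This makes the two ceilings in the text visible at a glance: the datastore allows 8 concurrent Storage vMotions, but each individual host tops out at 2.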
To throttle the number of Storage vMotion operations and reduce the I/O hit on a datastore during maintenance mode, it is preferable to reduce the max cost for provisioning operations to the datastore. Adjusting host costs is strongly discouraged: host costs are defined as they are due to host resource limitations, and adjusting them can impact other host functionality unrelated to vMotion or Storage vMotion processes. The max cost per datastore can be adjusted by editing vpxd.cfg or via the advanced settings of the vCenter Server Settings in the administration view. If done via vpxd.cfg, the value vpxd.ResourceManager.MaxCostPerEsx41Ds is added as follows:
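As a sketch only: the dotted setting name suggests the following nesting inside vpxd.cfg. The value 32 is a hypothetical example (2 concurrent Storage vMotions × the datastore operation cost of 16); treat the exact element path as an assumption, back up vpxd.cfg before editing, and note that a restart of the vCenter service is typically required for the change to take effect.

```xml
<config>
  <vpxd>
    <ResourceManager>
      <!-- example value: 32 = 2 concurrent Storage vMotions x datastore cost of 16 -->
      <MaxCostPerEsx41Ds>32</MaxCostPerEsx41Ds>
    </ResourceManager>
  </vpxd>
</config>
```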

FAB-FOUR: VMWORLD 2012 SESSIONS APPROVED

This morning I found out that my four sessions have been accepted. I'm really pleased and I am looking forward to presenting each one of them. Two sessions, Architecting Storage DRS Datastore Clusters and vSphere Cluster Resource Pool Best Practices, are also scheduled for VMworld Barcelona.

Session ID: STO1545
Session Title: Architecting Storage DRS Datastore Clusters
Track: Infrastructure
Presenting at: US and Barcelona
Presenting with: Valentin Hamburger

Session ID: VSP1504
Session Title: Ask the Expert vBloggers
Track: Infrastructure
Presenting at: US
Presenting with: Duncan Epping, Scott Lowe, Rick Scherer and Chad Sakac

Session ID: VSP1683
Session Title: vSphere Cluster Resource Pools Best Practices
Track: Infrastructure
Presenting at: US and Barcelona
Presenting with: Rawlinson Rivera

Session ID: CSM1167
Session Title: Architecting for vCloud Allocation Models
Track: Operations
Presenting at: US
Presenting with: Chris Colotti

Can't wait to attend VMworld 2012! See you there.

VMWARE VSPHERE STORAGE DRS INTEROPERABILITY TECHNICAL PAPER AVAILABLE

Today my second white paper, VMware vSphere Storage DRS Interoperability, was made available for download at the Technical Resource Center at VMware.com. This white paper presents an overview of best practices for customers considering the implementation of VMware vSphere Storage DRS in combination with advanced storage device features or other VMware products. The document zooms in on Storage DRS interoperability with array-based features such as auto-tiering, thin provisioning and deduplication, but also covers VMware features such as snapshots. A small preview:

VMware vSphere Snapshots
Storage DRS supports virtual machine snapshots. By default, it collocates them with the virtual machine disk file to prevent fragmentation of the virtual machine. Also by default, Storage DRS applies a VMDK affinity rule to each new virtual machine. If it migrates the virtual machine to another datastore, all the files, including the snapshot files, move with it. If the virtual machine is configured with an inter-VMDK affinity setting, the snapshot is placed in the directory of its related disk and is moved to the same destination datastore when migrated by a Storage vMotion operation. VMware supports the use of vSphere snapshots in combination with Storage DRS.

Go and download it here: http://www.vmware.com/resources/techresources/10286

VMWORLD SESSION PROPOSALS

Here is just a quick overview of the sessions I submitted for the VMworld events in San Francisco and Barcelona. I've submitted three sessions in total; as my passion for resource management and Storage DRS is a public secret, it should be no surprise that all sessions I participate in focus on either vSphere resource management or Storage DRS. :) I've split them up into two categories, vSphere centric and vCloud centric. The fourth session is the annual Ask the Expert vBloggers with the all-star crew Scott Lowe, Duncan Epping, Rick Scherer, and Chad Sakac. I hope to see you at VMworld!

vSphere centric sessions

Session 1545 Architecting Storage DRS Datastore Clusters
Abstract: In this session Frank Denneman and Valentin Hamburger will cover and explain in great detail what to consider when building a Storage DRS datastore cluster. Introducing the concept of datastore clusters can affect or shift the paradigm of storage management in virtual infrastructures. The goal is to demonstrate the relationship between the datastore cluster and existing objects in the virtual infrastructure, and how the introduction of datastore clusters can affect various design decisions. This session is a must for anyone implementing Storage DRS who wants to maximize their cluster and vSphere resource designs.

Session 1683 vSphere Cluster Resource Pools Best Practices
Abstract: In this session Frank Denneman and Rawlinson Rivera will cover and explain in great detail what to consider when using resource pools inside a vSphere cluster. Introducing the concept of resource pools can affect virtual machine performance and overall resource management in virtual infrastructures. Join Frank and Rawlinson and discover both common pitfalls and best practices of resource pool design. This session is a must for anyone implementing resource pools who wants to maximize their cluster and vSphere resource designs.
vCloud Director centric sessions

Session 1167 Architecting for vCloud Allocation Models
Abstract: In this session Frank Denneman and Chris Colotti will break down the three vCloud Director allocation models in depth. Each model's settings will be shown in detail to explain their effect on vSphere resource scheduling. They will then show how allocation models of the same type with different configurations, as well as different allocation models, can live on the same provider vDC. The goal is to demonstrate that by fully understanding not only the allocation models but also vSphere resource allocation, you can design for multiple allocation models on a single provider vDC. This session is a must for anyone implementing vCloud Director who wants to maximize their cluster and vCloud resource designs.

Session 504 Ask the Expert vBloggers - Scott Lowe, Duncan Epping, Rick Scherer, Frank Denneman, Chad Sakac
Abstract: One of the highest rated sessions at VMworld is back for its fifth year! Come meet four VMware Certified Design Experts (VCDX) on stage answering your questions. We get the top virtualization bloggers in the industry on stage answering your questions on a wide array of topics.

Simon at Techhead.co.uk wrote a nice article about how to vote for your favorite session at the VMworld.com portal.

BLOG POST ON BLOGS.VMWARE.COM/VSPHERE/

As part of the Technical Marketing team of VMware focusing on the vSphere platform, I contribute to the vSphere blog on VMware.com. From this point forward I will post a link to new articles posted on the vSphere blog.

SDRS maintenance mode impossible because "The virtual machine is pinned to a host."

VMWARE VSPHERE METRO STORAGE CLUSTER CASE STUDY TECHNICAL PAPER AVAILABLE

As of today, the VMware vSphere Metro Storage Cluster Case Study Technical Paper is available at http://www.vmware.com/resources/techresources/10284. VMware vSphere Metro Storage Cluster (VMware vMSC) is a new configuration within the VMware Hardware Compatibility List. This type of configuration is commonly referred to as a stretched storage cluster or metro storage cluster, and it is implemented in environments where disaster/downtime avoidance is a key requirement. This case study was developed to provide additional insight and information regarding the operation of a VMware vMSC infrastructure in conjunction with VMware vSphere. The paper explains how vSphere handles specific failure scenarios and discusses various design considerations and operational procedures.

DRS CLUSTERS AND ALLOCATING RESERVED MEMORY

As mentioned in the admission control family article, multiple features at multiple layers check whether there is enough unused reserved memory available. This article is part of a short series on how memory is claimed and provided as reserved memory; other articles will be posted throughout the week.

Refresher
I've published two articles that describe memory reservation at the VM level and the resource pool level. These two sources are an excellent way to refresh your memory (no pun intended) on the reservation construct:
• Impact of memory reservation (VM-level)
• Resource pool memory reservations

"Unclaimed" reserved memory?
If a memory reservation is configured on a child object (virtual machine or resource pool), admission control checks whether there is enough reserved memory available. Which memory can be claimed as reserved memory? And how about host-level and cluster-level memory? Let's dissect the cluster tree of resource providers and resource consumers and start with a bottom-up approach.

Host-level to DRS cluster
Both the host and the DRS cluster are resource providers to the resource consumers, i.e. resource pools and virtual machines. When a host is made a member of a DRS cluster, all its available memory resources are placed at DRS's disposal. The available memory of a host is the memory that is left after the VMkernel claims host memory. The DRS cluster, also called the root resource pool, reserves this remaining memory. As the DRS cluster reserves this memory per host, all this memory is aggregated inside the root resource pool and is actually designated as reserved memory. However, to prevent confusion, this reserved memory is labeled as unused reserved memory in the vSphere Client user interface and as such is provided to the child resource pools and child virtual machines. At the Resource Allocation tab of the cluster, the Total memory capacity of the cluster is listed as well as the Reserved capacity.
The Available capacity is the result of Total capacity − Reserved capacity. Note that if HA is configured, the amount of resources reserved for failover is automatically added to the Reserved capacity.

Child resource pools
Resource pools allow for hierarchical partitioning of the cluster, but they always span the entire cluster. Resource pools draw resources from the root resource pool and do not pick and select resources from specific hosts. The root resource pool functions as an abstraction layer. When configuring a reservation at the resource pool level, the specified amount of memory is claimed by that specific resource pool and cannot be allocated by other resource pools. Note that the claim of reserved resources by the resource pool happens immediately, during the creation of the resource pool. It does not matter whether there are running virtual machines inside the resource pool or not. The total configured memory is withdrawn from the root resource pool and is thus unavailable to other resource pools. Please keep this in mind when sizing resource pools. The next article will expand on virtual machines inside a resource pool.
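The capacity and reservation math above can be checked with a quick sketch. All numbers here (host count, host size, HA failover capacity, pool reservation) are made up for illustration, not values from the text:

```python
# Worked example of the cluster capacity math described above.

hosts = 4
per_host_available_gb = 128            # memory left after the VMkernel claims its share
total_capacity_gb = hosts * per_host_available_gb   # root resource pool: 512 GB

ha_failover_gb = 128                   # HA failover capacity is added to Reserved capacity
reserved_capacity_gb = ha_failover_gb

# Available capacity = Total capacity - Reserved capacity
available_capacity_gb = total_capacity_gb - reserved_capacity_gb   # 384 GB

# Creating a resource pool with a 64 GB reservation claims the memory
# immediately, whether the pool contains running virtual machines or not.
pool_reservation_gb = 64
reserved_capacity_gb += pool_reservation_gb
available_capacity_gb = total_capacity_gb - reserved_capacity_gb   # 320 GB
```

The last two lines mirror the sizing warning: the pool's full reservation disappears from the root resource pool the moment the pool is created.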

ADMISSION CONTROL AND VCLOUD ALLOCATION POOL MODEL

The previous article outlined the multiple admission controls active in a virtual infrastructure. One that has always interested me in particular is the admission control feature that verifies resource availability. With the introduction of vCloud Director another level of resource constructs was introduced. Along with provider virtual datacenters (vDCs) and organization vDCs, allocation models were introduced. An allocation model defines how resources are allocated from the provider vDC. An organization vDC must be configured with one of the following three allocation models: "Pay As You Go", "Allocation Pool" or "Reservation Pool". It is out of the scope of this article to describe all three models; please visit Chris Colotti's blog or Yellow Bricks to read more about allocation models. As mentioned before, the distinction between the allocation models is how resources are consumed. Depending on the chosen allocation model, reservations and limits will be set at the resource pool level, the virtual machine level, or both. One of the most interesting allocation models is the Allocation Pool model, as it sets reservations at both the resource pool level and the virtual machine level simultaneously. During configuration of the Allocation Pool model, an amount of guaranteed resources can be specified (guaranteed is the vCloud term for a vSphere reservation). The question I was asked is: will lowering the default value of 100% guaranteed memory allow more virtual machines inside the organization vDC? The answer lies within the workings of vSphere admission control.

Allocation Pool model settings
By default the Allocation Pool model sets a 100% memory reservation at both the resource pool level and the virtual machine level. By lowering the default guarantee, it allows for opportunistic memory allocation at both the resource pool level and the virtual machine level.
Creating this burstable space (resources available for opportunistic access) usually provides a higher consolidation ratio of virtual machines; however, due to the simultaneous configuration of reservations at both the resource pool and virtual machine level, this is not the case here.

Virtual machine level reservation
During a power-on operation, admission control checks whether the resource pool can satisfy the virtual machine level reservation. Because expandable reservation is disabled in this model, the resource pool is not able to allocate any additional resources from the provider vDC. Therefore the virtual machine memory reservation can only be satisfied by the resource pool level reservation of the organization vDC itself. When a virtual machine is using memory protected by a virtual machine level reservation, this memory is withdrawn from the resource pool level reservation. If the resource pool does not have enough available memory to guarantee the virtual machine reservation, the power-on operation fails. Let's use a scenario to visualize the process a bit better.

Scenario
An organization vDC is created with the Allocation Pool model; the memory allocation is set to 20GB and the memory guarantee is set to 50%. These settings result in a resource pool memory limit of 20GB and a memory reservation of 10GB. When powering on a 2GB virtual machine, 1GB of reserved resources is allocated to that virtual machine and withdrawn from the available reserved memory pool. Admission control allows virtual machines to be powered on until the reserved memory pool is reduced to zero. Following the previous example, virtual machine 2 is powered on. The resource pool providing resources to the organization vDC has 9GB available in its pool of reserved memory. Admission control allows the power-on operation as this pool can provide the reserved resources specified by the virtual machine level reservation.
During each power-on operation, 1GB of reserved memory is withdrawn from the reserved memory pool available to the organization vDC, resulting in admission control allowing ten virtual machines to be powered on. When attempting to deploy virtual machine 11, admission control fails the power-on operation as the organization vDC has no available reserved memory left to satisfy the virtual machine level reservation.

Note: This scenario excludes the impact of the memory overhead reservation of each virtual machine. Under normal circumstances, the number of virtual machines that could be powered on would be closer to 8 than 10, as the reserved pool available to the organization vDC is used to satisfy the memory overhead reservation of each virtual machine as well.

Because the guarantee setting of the Allocation Pool model configures resource pool and virtual machine memory reservation settings simultaneously, the supply and demand of reserved memory resources are always equal, regardless of the configured percentage. Therefore offering opportunistic access to resources inside the organization vDC does not increase the number of virtual machines that fit inside the organization vDC. The next question then arises: why should you lower the percentage of guaranteed resources? Because providing burstable space increases the number of organization vDCs that fit inside the provider vDC.

Resource pool memory reservation
Upon creation, resource pools claim and withdraw the configured reserved resources from their parent immediately. This memory cannot be provided or distributed to other organization vDCs, regardless of the utilization of these resources. Although new resource constructs are introduced in a vCloud environment, consolidation ratios and resource management still leverage traditional vSphere resource management constructs and rules.
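The scenario above can be replayed step by step. The numbers are taken from the text; as in the scenario, per-VM memory overhead is ignored and the function name is mine:

```python
# The Allocation Pool scenario described above, replayed numerically.

allocation_gb = 20
guarantee = 0.50                                    # 50% guaranteed memory
pool_reservation_gb = allocation_gb * guarantee     # resource pool reservation: 10 GB

vm_memory_gb = 2
vm_reservation_gb = vm_memory_gb * guarantee        # per-VM reservation: 1 GB

def power_on_capacity(pool_reservation: float, vm_reservation: float) -> int:
    """Admission control admits virtual machines until the pool of
    reserved memory can no longer satisfy the per-VM reservation."""
    return int(pool_reservation // vm_reservation)

# Ten 2GB VMs fit; the eleventh power-on fails for lack of reserved memory.
print(power_on_capacity(pool_reservation_gb, vm_reservation_gb))
```

Lowering the guarantee shrinks both sides of the equation at once (pool reservation and per-VM reservation), which is exactly why the VM count does not change.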
Chris Colotti and I are currently working on a technical paper describing the allocation models in detail and the way they interact with vSphere resource management. We hope to see it published soon.

THE ADMISSION CONTROL FAMILY

It's funny how sometimes something, in this case a vSphere feature, becomes a "trending topic" on any given day or week. Yesterday I was discussing admission control policies with Rawlinson Rivera (@punchingclouds) and we discussed how to properly calculate a percentage for the percentage-based admission control while keeping consolidation ratios in mind. And today Gabe published an article about his misconception of admission control, which triggered me to write an article of my own about admission control. When discussing admission control, usually only HA admission control policies are mentioned. However, HA isn't the only feature using some sort of admission control: Storage DRS, DRS and the ESX(i) host each have their own admission control. Let's take a closer look at what admission control actually is and see how each admission control fits into the process of a virtual machine power-on operation.

What's the function of admission control? I like to call it our team of virtual bouncers. Admission control is there to ensure that sufficient resources are available for the virtual machine to function within its specified parameters / resource requirements. The last part about the parameters and requirements is the key to understanding admission control. During a virtual machine power-on or move operation, admission control checks whether sufficient unreserved resources are available before allowing the virtual machine to be powered on or moved into the cluster. If a virtual machine is configured with a reservation (CPU, memory or both), admission control needs to make sure that the datastore cluster, compute cluster, resource pool and host can provide these resources. If one of these constructs cannot allocate and provide these resources, then the datastore cluster, compute cluster, resource pool or host cannot provide an environment where the virtual machine can operate within its required parameters.
In other words, the moment a virtual machine is configured to have an X amount of resources guaranteed, you want the environment to actually honor that guarantee, and that is why admission control was developed. As a vSphere environment can be configured in many different ways, each feature sports its own admission control, as you do not want to introduce dependencies for such a crucial component. Let's take a closer look at each admission control feature and its checkpoints.

High Availability admission control: During a virtual machine power-on operation, HA checks whether the virtual machine can be powered on without violating the capacity required to cope with a host failure event. Depending on the HA admission control policy, HA checks if the cluster can provide enough unreserved resources to satisfy the virtual machine reservation. The internals of each type of admission control policy are outside the scope of this article; more information can be found in the clustering deep dive books or online at Yellow-bricks.

After HA admission control gives the green light, it's up to Storage DRS admission control if the virtual machine is placed in a Storage DRS datastore cluster. Storage DRS admission control checks datastore connectivity amongst the hosts and selects the host with the highest datastore connectivity to ensure the highest portability of the virtual machine. If there are multiple hosts with the same number of datastores connected, it selects the host with the lowest compute utilization.

Up next is DRS admission control to review the cluster state. DRS ensures that sufficient unreserved resources are available in the cluster before allowing the virtual machine to power on. If the virtual machine is placed inside a resource pool, DRS checks if the resource pool can provide enough resources to satisfy the reservation.
Depending on the "expandable reservation" setting, the resource pool checks its own pool of unreserved resources or borrows resources from its parent. If a virtual machine is moved into the cluster and EVC is enabled in the DRS cluster, EVC admission control verifies that the EVC mode applied to the virtual machine does not exceed the current EVC baseline of the cluster. DRS then selects a host based on the configured VM-VM and VM-Host affinity rules.

The last step is host admission control. In the end it's the host that actually needs to provide the compute environment in which the virtual machine operates. A cluster can have enough unreserved resources available, yet be in a fragmented state, where not enough resources are available on any single host to satisfy the virtual machine reservation. To solve this problem a DRS invocation is triggered to recommend virtual machine migrations that rebalance the cluster and free up space on a particular host for the new virtual machine. If DRS is not enabled, the host rejects the virtual machine due to its inability to provide the required resources. Host admission control also verifies that the virtual machine configuration is compatible with the host: the VM networks and datastores must be available in order to accommodate the virtual machine. The virtual machine compatibility list also determines the suitable hosts; if the virtual machine is placed inside a "must" VM-Host affinity rule, admission control checks whether the host is listed in the compatibility list. The last check is whether the host can create the VM swap file at the designated VM swap location.

So there you have it: before a virtual machine is powered on or moved into a cluster, all these admission controls make sure the virtual machine can operate within its required parameters and no cluster feature requirement is violated.
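The ordered chain of checks described above can be sketched as a single function. This is an illustration, not VMware code: all names and numbers are mine, the Storage DRS placement step is omitted, and the real implementations live in separate components precisely to avoid dependencies:

```python
# Sketch of the admission-control chain for a VM power-on operation.
# Illustrative only; each check maps to one of the "bouncers" above.

def admit_power_on(vm_reservation_gb: int,
                   cluster_unreserved_gb: int,
                   ha_failover_gb: int,
                   pool_unreserved_gb: int,
                   host_unreserved_gb: int,
                   host_compatible: bool) -> str:
    # 1. HA: honour the capacity set aside to cope with a host failure.
    if cluster_unreserved_gb - ha_failover_gb < vm_reservation_gb:
        return "rejected by HA admission control"
    # 2. DRS: the resource pool must provide enough unreserved memory
    #    for the VM-level reservation (expandable reservation would let
    #    the pool borrow from its parent instead).
    if pool_unreserved_gb < vm_reservation_gb:
        return "rejected by DRS admission control"
    # 3. Host: the chosen host must be compatible (networks, datastores,
    #    swap file location) and able to satisfy the reservation locally;
    #    in a fragmented cluster DRS first migrates VMs to free up a host.
    if not host_compatible or host_unreserved_gb < vm_reservation_gb:
        return "rejected by host admission control"
    return "admitted"

print(admit_power_on(4, 32, 16, 10, 8, True))
```

The point of the sketch is the ordering: a power-on must clear every bouncer in turn, and a rejection at any level stops the operation.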

MIXING RESOURCE POOLS AND VIRTUAL MACHINES ON THE SAME HIERARCHICAL LEVEL

One of the most frequent questions I receive is about mixing resource pools and virtual machines at the same hierarchical level. In almost all cases we recommend committing to one type of child object: either use resource pools and place virtual machines within the resource pools, or place only virtual machines at that hierarchical level. The main reason for this is how resource shares work. Shares determine the relative priority of the child objects (virtual machines and resource pools) at the same hierarchical level and decide how the excess resources (total system resources − total reservations) made available by other virtual machines and resource pools are divided. Shares are level-relative, which means that the number of shares is compared between the child objects of the same parent. Since they signify relative priorities, the absolute values do not matter: comparing 2:1 or 20,000 to 10,000 will have the same result.

Let's use an example to clarify. In this scenario the DRS cluster (root resource pool) has two child objects, resource pool 1 (RP1) and virtual machine 1 (VM1). The DRS cluster issues shares amongst its children: 4000 shares to the resource pool and 2000 shares to the virtual machine. This results in 6000 shares being active on that particular hierarchical level. During contention the child objects compete for resources, as they are siblings and belong to the same parent. This means that the virtual machine with 2000 shares needs to compete with the resource pool that has 4000 shares. As 6000 shares are issued on that hierarchical level, the relative value of each child entity is 33% for the virtual machine (2000 of 6000 shares) and 66% for the resource pool (4000 of 6000 shares). The catch with this configuration is that the resource pool is not only a resource consumer but also a resource provider, and must claim resources on behalf of its children.
Expanding on the first scenario, two virtual machines are placed inside the resource pool. The resource pool issues shares amongst its children: 1000 shares to virtual machine 2 (VM2) and 2000 shares to virtual machine 3 (VM3). This results in 3000 shares being active on that particular hierarchical level. During contention the child objects compete for resources, as they are siblings and belong to the same parent, the resource pool. This means that VM2, owning 1000 shares, needs to compete with VM3, which has 2000 shares. As 3000 shares are issued on that hierarchical level, the relative value of each child entity is 33% for VM2 (1000 of 3000 shares) and 66% for VM3 (2000 of 3000 shares). As the resource pool needs to compete for resources with the virtual machine on the same level, the resource pool can only obtain 66% of the cluster resources, and these resources are divided between VM2 and VM3. That means that VM2 can obtain up to 22% of the cluster resources (1/3 of 66% of the total cluster resources is 22%).

Forward to scenario 3: two additional virtual machines are created at the same level as resource pool 1 and virtual machine 1. The DRS cluster issues 1000 shares to VM4 and 1000 shares to VM5. As the DRS cluster issued an additional 2000 shares, the total number of issued shares increases to 8000, resulting in a dilution of the relative share values of resource pool 1 and VM1. Resource pool 1 now owns 4000 shares of a total of 8000, bringing its relative value down from 66% to 50%. VM1 owns 2000 shares of 8000, bringing its value down to 25%. VM4 and VM5 each own 12.5% of the shares. As resource pool 1 provides resources to its child objects VM2 and VM3, fewer resources are divided between VM2 and VM3. That means that in this scenario VM2 can obtain up to 16% of the cluster resources (1/3 of 50% of the total cluster resources is roughly 16%).
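The share arithmetic of the three scenarios can be verified with a quick sketch (the share values come from the text; the helper function is mine):

```python
# Checking the share arithmetic of the scenarios above. Shares only
# matter relative to siblings under the same parent.

def relative(shares: dict) -> dict:
    """Relative priority of each sibling: own shares divided by the
    total shares issued at that hierarchical level."""
    total = sum(shares.values())
    return {name: s / total for name, s in shares.items()}

# Scenario 1: RP1 (4000 shares) and VM1 (2000 shares) under the root.
root = relative({"RP1": 4000, "VM1": 2000})        # RP1 ~66%, VM1 ~33%

# Scenario 2: VM2 (1000) and VM3 (2000) inside RP1. VM2's share of the
# cluster is its share inside RP1 times RP1's share of the cluster.
inside = relative({"VM2": 1000, "VM3": 2000})
vm2_of_cluster = inside["VM2"] * root["RP1"]       # 1/3 * 2/3, ~22%

# Scenario 3: VM4 and VM5 (1000 shares each) join RP1 and VM1 at the
# root level, diluting RP1 from ~66% to 50% of the cluster.
root3 = relative({"RP1": 4000, "VM1": 2000, "VM4": 1000, "VM5": 1000})
vm2_of_cluster3 = inside["VM2"] * root3["RP1"]     # 1/3 * 1/2, ~16%
```

Note how VM2's effective entitlement drops without anything changing inside the resource pool; the dilution happens entirely one level up.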
Introducing more virtual machines at the same sibling level as resource pool 1 dilutes the resources available to the virtual machines inside resource pool 1. This is the reason why we recommend committing to a single type of entity at a specific sibling level: if you create resource pools, stick with resource pools at that level and provision virtual machines inside the resource pools. Another fact to consider is that a resource pool receives shares similar to a 4-vCPU, 16GB virtual machine, resulting in a default share value of 4000 shares of CPU and 163840 shares of memory when the Normal share level is selected. When you create a monster virtual machine and place it next to the resource pool, the resource pool will be dwarfed by that monster virtual machine, resulting in resource starvation.

Note: Shares are not simply a weighting system for resources. All scenarios used to demonstrate the way shares work are based on a worst-case situation: every virtual machine claims 100% of its resources, the system is overcommitted and contention occurs. In real life this situation (hopefully) does not occur very often. During normal operations, not every virtual machine is active and not every active virtual machine is 100% utilized. Activity and the amount of contention are the two elements determining the resource entitlement of active virtual machines. For ease of presentation, we tried to avoid as many variable elements as possible and used a worst-case situation in each example.

So when can it be safe to mix and match virtual machines and resource pools at the same level? When all child objects are configured with reservations equal to their configured size and with limits, the result is an environment where shares are overruled by reservations and no opportunistic allocation of resources exists. But this design introduces other constraints to consider.