VM STORAGE PROFILES AND STORAGE DRS - PART 2 - DISTRIBUTED VMS

As mentioned in part 1 of the Storage DRS and VM Storage Profiles series, Storage DRS expects “storage characteristics-alike” datastores inside a single datastore cluster. But what if you have multiple tiers of storage and you want to span the virtual machine across them? Storage profiles can assist in deploying the VMs across multiple datastore clusters safely and in line with your SLAs.

Storage DRS Datastore architecture

When you have multiple tiers of storage, it’s recommended to create multiple datastore clusters, with each datastore cluster containing disks from a single tier. Let’s assume you have three different kinds of disks in your array: SSD, FC 15K and SATA. Datastores backed by disks out of a single pool are aggregated into a single datastore cluster, resulting in three datastore clusters. Having multiple datastore clusters can increase the complexity of the provisioning process; using VM storage profiles ensures that virtual machines or disk files are placed in the correct datastore cluster.

Assign storage capabilities to datastores

All datastores within a single datastore cluster are associated with the same storage capability.
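
As a minimal sketch of the idea above, the snippet below maps each file of a virtual machine to the datastore cluster whose storage capability matches the file's VM storage profile. The cluster names, capability labels and profile names are hypothetical and only illustrate the three-tier example (SSD, FC 15K, SATA); this is not how vCenter implements the lookup.

```python
# Hypothetical names: one datastore cluster per storage tier.
CLUSTER_CAPABILITY = {
    "DSC-SSD": "SSD",
    "DSC-FC15K": "FC 15K",
    "DSC-SATA": "SATA",
}

# Hypothetical VM storage profiles, each connected to one storage capability.
PROFILE_CAPABILITY = {
    "Tier 1": "SSD",
    "Tier 2": "FC 15K",
    "Tier 3": "SATA",
}


def compatible_clusters(profile):
    """Return the datastore clusters that surface the capability the profile needs."""
    wanted = PROFILE_CAPABILITY[profile]
    return [name for name, cap in CLUSTER_CAPABILITY.items() if cap == wanted]


def place_vm(file_profiles):
    """Map each VM file (working directory or VMDK) to a compatible datastore cluster."""
    placement = {}
    for file_name, profile in file_profiles.items():
        candidates = compatible_clusters(profile)
        if not candidates:
            raise LookupError(f"no compatible datastore cluster for {file_name}")
        placement[file_name] = candidates[0]
    return placement


vm = {"VM home": "Tier 2", "disk1.vmdk": "Tier 1", "disk2.vmdk": "Tier 3"}
print(place_vm(vm))
```

Each file ends up in a different datastore cluster, while every cluster still contains only datastores of a single tier.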

VM STORAGE PROFILES AND STORAGE DRS - PART 1

In my previous article about how to configure storage profiles using the web client I stated that different storage profiles could be assigned to a single virtual machine. Storage profiles can be used together with Storage DRS. Let’s take a closer look at how to use storage profiles with Storage DRS.

Architectural view

VM storage profiles need to be connected to a storage capability to function. The storage capability itself needs to be associated with one or more datastores. A virtual machine as a whole can be associated with a storage profile, or you can use a more granular configuration and associate different storage profiles with the VM working directory and/or VMDK files.

Datastore cluster storage capabilities

You might have noticed that there isn’t a datastore cluster element in this architectural view. The storage capabilities of a datastore cluster are derived from the storage capabilities associated with each member datastore. If all datastores are configured with the same storage capability, the datastore cluster surfaces this storage capability and becomes compliant with the connected VM storage profiles.

For example, “Datastore cluster – Tier 1 VMs and VMDKs” contains four datastores. NFS-F-01, NFS-F-02 and NFS-F-03 are associated with the storage capability “SSD low latency disk (Tier-1 VMs and VMDKs)”, while datastore NFS-F-04 is associated with the storage capability “FC 15K – High Speed disk (Tier 2 VMs and VMDKs)”. When reviewing the storage capabilities of the datastore cluster, no storage capability is displayed.

The VM Storage Profile “Tier 1 VMs and VMDK” is connected to the Storage Capability “SSD low latency disk (Tier-1 VMs and VMDKs)”. When selecting storage during the deployment of a virtual machine, the datastore cluster is considered incompatible with the selected VM Storage Profile.

Incompatible, even though three datastores with the correct storage capabilities are available? Although this is true, Storage DRS does not incorporate storage profile compliance in its balancing algorithms. Storage DRS is designed with the assumption that all disks backing the datastores are “storage characteristics-alike”. Manually selecting a datastore in the datastore cluster is only possible if the option “Disable Storage DRS for this virtual machine” is selected. Placing the VM on the specific datastore and then enabling Storage DRS on that VM later is futile: Storage DRS will load balance the VM if necessary, but it doesn’t take VM storage profile compatibility into account when load balancing. So if you have this “workaround” in your operation manuals, please remove it :)

After removing the datastore with the dissimilar storage capability (NFS-F-04), the datastore cluster surfaces “SSD – Low Latency disk (Tier-1 VMs and VMDKs)” and becomes compatible with virtual machines associated with the Tier-1 VM storage profile.

Part 2 will cover distributing virtual machines across multiple datastores using Storage Profiles.

Get notification of these blog postings and more DRS and Storage DRS information by following me on Twitter: @frankdenneman
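
The rule described in this article (a datastore cluster only surfaces a storage capability when all member datastores share it) can be sketched in a few lines. The function names are made up for illustration; the datastore names and capability labels come from the example above.

```python
def surfaced_capability(datastore_capabilities):
    """Return the capability a datastore cluster surfaces, or None.

    datastore_capabilities maps datastore name -> storage capability.
    The cluster surfaces a capability only if every member has the same one.
    """
    distinct = set(datastore_capabilities.values())
    return distinct.pop() if len(distinct) == 1 else None


def is_compatible(profile_capability, datastore_capabilities):
    """A cluster is compatible when it surfaces the capability the profile is connected to."""
    return surfaced_capability(datastore_capabilities) == profile_capability


SSD = "SSD low latency disk (Tier-1 VMs and VMDKs)"
FC = "FC 15K – High Speed disk (Tier 2 VMs and VMDKs)"

cluster = {"NFS-F-01": SSD, "NFS-F-02": SSD, "NFS-F-03": SSD, "NFS-F-04": FC}
print(is_compatible(SSD, cluster))   # mixed capabilities: nothing is surfaced

del cluster["NFS-F-04"]              # remove the dissimilar datastore
print(is_compatible(SSD, cluster))   # all members alike: the cluster is compatible
```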

VMWORLD EUROPE SESSIONS

At VMworld Europe I will participate in Meet the Expert sessions, Group discussions and three breakout sessions. Here’s an overview of my public schedule. I hope to see you all in my sessions or participate in the group discussions. Or schedule a time slot in Meet the Expert if you have a question about Storage DRS or DRS that you want to ask me.

Tuesday 9 October:
11:00 – 12:00: Group Discussion 28 Resource Management
12:30 – 13:30: INF-VSP1168 Architecting a Cloud Infrastructure
14:00 – 15:00: INF-VSP1683 Resource Pool Best Practices

Wednesday 10 October:
11:00 – 12:00: Meet the Experts 05
12:30 – 13:30: Group Discussion 28 Resource Management
15:30 – 16:30: INF-STO1545 Architecting and designing (SDRS) Datastore Clusters

Thursday 11 October:
10:30 – 11:30: Meet the Experts 11

HOW TO ATTACH VM STORAGE PROFILES TO A VIRTUAL MACHINE USING THE WEB CLIENT

Virtual Machine Storage Profiles are used to identify the storage capabilities necessary to properly run the application within the virtual machine. VM Storage Profiles need to be enabled on the hosts and/or cluster before you are able to assign them to a virtual machine.

1. Select VM Storage Profiles in the Home screen
2. Select the icon in the middle to enable VM Storage Profiles
3. Select the cluster or host that you want to enable. Check if the host has the appropriate license
4. Create Storage Profiles. A VM storage profile is attached to a storage capability. In turn, a storage capability is attached to a datastore. For more information about storage capabilities please read the article: vSphere 5.0 Storage Features Part 11 – Profile Driven Storage by Cormac Hogan.
5. Go back to Home
6. Select VM and Templates
7. Select Datacenter in the left pane
8. Select the menu option “Related Objects” in the right pane
9. Select the menu option “Virtual Machines”
10. Right click on a virtual machine
11. Select All vCenter Actions
12. Select Storage Profiles
13. Select Manage Storage Profiles
14. Apply the VM storage profiles to the working directory and disk

Please note that you can assign different storage profiles to the virtual machine working directory and to each individual VMDK. The working directory is where the .vmx, .nvram and .log files reside. This is listed in the UI as the Home VM location. Each VMDK can be assigned a different storage profile to align it with your SLAs.

HOW TO CREATE A DATASTORE CLUSTER USING THE NEW WEB CLIENT

The main user interface of vSphere 5.1 is provided by the web client. During beta testing I spent some time getting accustomed to the new user interface. To save you some time, I created this write-up on how to create a datastore cluster using the web client. I assume you have already installed the new vCenter 5.1. If not, check out Duncan’s post on how to install the new vCenter Server Appliance.

Before showing the eight easy steps for creating a datastore cluster, I want to list some constraints and recommendations for creating datastore clusters.

Constraints:
• VMFS and NFS cannot be part of the same datastore cluster.
• Similar disk types should be used inside a datastore cluster.
• Maximum of 64 datastores per datastore cluster.
• Maximum of 256 datastore clusters per vCenter Server.
• Maximum of 9000 VMDKs per datastore cluster.

Recommendations:
• Group disks with similar characteristics (RAID-1 with RAID-1, Replicated with Replicated, etc.)
• Leverage information provided by vSphere Storage APIs - Storage Awareness

The Steps

1. Go to the Home screen and select Storage

2. Select the Datastore Clusters icon in Related Objects view.

3. Name and Location
The first step is to specify the datastore cluster name and check if the “Turn on Storage DRS” option is enabled. When “Turn on Storage DRS” is activated, the following functions are enabled:
• Initial placement for virtual disks based on space and I/O workload
• Space load balancing among datastores within a datastore cluster
• I/O load balancing among datastores within a datastore cluster
The “Turn on Storage DRS” check box enables or disables all of these components at once. If necessary, I/O balancing functions can be disabled independently. If Storage DRS is not enabled, a datastore cluster will be created which lists the datastores underneath, but Storage DRS won’t recommend any placement action for provisioning or migration operations on the datastore cluster.
When you disable Storage DRS on an active datastore cluster, please note that all the Storage DRS settings (automation level, aggressiveness controls, thresholds, rules and Storage DRS schedules) are saved, so they can be restored to the state they were in at the moment Storage DRS was disabled.

4. Storage DRS Automation
Storage DRS offers two automation levels:
• No Automation (Manual Mode): Manual mode is the default mode of operation. When the datastore cluster is operating in manual mode, placement and migration recommendations are presented to the user, but are not executed until they are manually approved.
• Fully Automated: Fully automated allows Storage DRS to apply space and I/O load-balance migration recommendations automatically. No user intervention is required. However, initial placement recommendations still require user approval.
Storage DRS allows virtual machines to have individual automation level settings that override the datastore cluster-level automation level setting. Similar to when DRS was introduced, I recommend starting in manual mode and reviewing the generated recommendations. Once you are comfortable with the decision matrix of Storage DRS, you can switch to fully automated. Please note that you can switch between modes on the fly and without incurring downtime.

5. Storage DRS Runtime Settings
Keep the defaults for now. Future articles will expand upon the Storage DRS thresholds and advanced options.

6. Select Clusters and Hosts
The “Select Clusters and Hosts” view allows the user to select one or more (DRS) clusters to work with. Only clusters within the same vCenter datacenter can be selected, as the vCenter datacenter is the boundary for Storage DRS to operate in.

7. Select Datastores
By default, only datastores connected to all hosts in the selected (DRS) cluster(s) are shown. The Show datastore dropdown menu provides the option to show partially connected datastores. The article on partially connected datastore clusters gives you insight into the impact of this design decision.

8. Ready to Complete
The “Ready to Complete” screen provides an overview of all the settings configured by the user. Review the configuration of your new datastore cluster and click Finish.
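
Before walking through the wizard, it can help to check a planned datastore cluster against the constraints listed in this article. The following is a minimal sketch: the `Datastore` record and the `validate_cluster` function are hypothetical helpers, not part of any VMware API, and only the limits quoted in the constraints list (no mixing of VMFS and NFS, 64 datastores, 9000 VMDKs) are encoded.

```python
from dataclasses import dataclass

MAX_DATASTORES_PER_CLUSTER = 64   # from the constraints list
MAX_VMDKS_PER_CLUSTER = 9000      # from the constraints list


@dataclass
class Datastore:
    name: str
    fs_type: str      # "VMFS" or "NFS"
    vmdk_count: int


def validate_cluster(datastores):
    """Return a list of constraint violations for a planned datastore cluster."""
    problems = []
    if len({ds.fs_type for ds in datastores}) > 1:
        problems.append("VMFS and NFS datastores cannot be part of the same datastore cluster")
    if len(datastores) > MAX_DATASTORES_PER_CLUSTER:
        problems.append(f"more than {MAX_DATASTORES_PER_CLUSTER} datastores in the datastore cluster")
    if sum(ds.vmdk_count for ds in datastores) > MAX_VMDKS_PER_CLUSTER:
        problems.append(f"more than {MAX_VMDKS_PER_CLUSTER} VMDKs in the datastore cluster")
    return problems


planned = [Datastore("ds-01", "VMFS", 120), Datastore("ds-02", "NFS", 80)]
for problem in validate_cluster(planned):
    print(problem)
```

An empty list means none of the encoded constraints is violated; the recommendation to group disks with similar characteristics still needs a manual review.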

WHERE IS MY NEW VMOTION FUNCTIONALITY?

Just a reminder as I received a lot of questions and comments about this: The new vMotion functionality - migrating virtual machines between hosts without shared storage - is only available via the web client. Please note that in the vSphere 5.1 release all new features are only visible via the web client and not in the old vSphere client. For more information about the vMotion functionality: vSphere 5.1 vMotion deepdive

VSPHERE 5.1 VMOTION DEEP DIVE

vSphere 5.1 vMotion enables a virtual machine to change its datastore and host simultaneously, even if the two hosts don’t have any shared storage in common. For me this is by far the coolest feature in the vSphere 5.1 release, as this technology opens up new possibilities and lays the foundation for true portability of virtual machines. As long as two hosts have an (L2) network connection we can live migrate virtual machines. Think about the possibilities we have with this feature as some of the current limitations will eventually be solved: think inter-cloud migration, think follow the moon computing, think big!

The new vMotion provides a new level of ease and flexibility for virtual machine migrations, and the beauty of this is that it spans the complete range of customers. It lowers the barrier for vMotion use for small SMB shops, allowing them to leverage local disk and simpler setups, while big datacenter customers can now migrate virtual machines between clusters that may not have a common set of datastores between them.

Let’s have a look at what the feature actually does. In essence, this technology combines vMotion and Storage vMotion. But instead of either copying the compute state to another host or the disks to another datastore, it is a unified migration where both the compute state and the disks are transferred to a different host and datastore. All of this is done via the vMotion network (usually). The moment the new vMotion was announced at VMworld, I started to receive questions. Here are the most interesting ones, which allow me to give you a little more insight into this new enhancement.

Migration type

One of the questions I have received is: will the new vMotion always move the disk over the network? This depends on the vMotion type you have selected. When selecting the migration type, three options are available. This may be obvious to most, but I just want to highlight it again.
A Storage vMotion will never move the compute state of a VM to another host while migrating the data to another datastore. Therefore, when you just want to move a VM to another host, select vMotion; when you only want to change datastores, select Storage vMotion.

Which network will it use?
vMotion will use the designated vMotion network to copy the compute state and the disks to the destination host when copying disk data between non-shared disks. This means that you need to take the extra load into account when the disk data is being transferred. Luckily the vMotion team improved the vMotion stack to reduce the overhead as much as possible.

Does the new vMotion support multi-NIC for disk migration?
The disk data is picked up by the vMotion code, which means vMotion transparently load balances the disk data traffic over all available vMotion vmknics. vSphere 5.1 vMotion leverages all the enhancements introduced in vSphere 5.0, such as Multi-NIC support and SDPS (Stun During Page Send). Duncan wrote a nice article on these two features.

Is there any limitation to the new vMotion when the virtual machine is using shared vs. unshared swap?
No, either will work, just as with the traditional vMotion.

Will the new vMotion features be leveraged by DRS/DPM/Storage DRS?
In vSphere 5.1 DRS, DPM and Storage DRS will not issue a vMotion that copies data between datastores. DRS and DPM continue to leverage traditional vMotion, while Storage DRS issues Storage vMotions to move data between datastores in the datastore cluster. Maintenance mode, a part of the DRS stack, will not issue a data-moving vMotion operation. Data-moving vMotion operations are more expensive than traditional vMotion, and the cost/risk benefit must be taken into account when making migration decisions. A major overhaul of the DRS algorithm code is necessary to include this in the framework, and this was not feasible during this release.
How many vMotion operations that copy data between datastores can I run concurrently?
A vMotion that copies data between datastores counts against the per-host limits for concurrent vMotion and Storage vMotion operations. In vSphere 5.1 one cannot perform more than 2 concurrent Storage vMotions per host. As a result, no more than 2 concurrent vMotions that copy data will be allowed. For more information about the costs of the vMotion process, I recommend reading the article: “Limiting the number of Storage vMotions”

How is disk data migration via vMotion different from a Storage vMotion?
The main difference between vMotion and Storage vMotion is that vMotion does not “touch” the storage subsystem for copy operations of non-shared datastores, but transfers the disk data via an Ethernet network. Due to the possibility of longer distances and higher latency, disk data is transferred asynchronously. To cope with higher latencies, a lot of changes were made to the buffer structure of the vMotion process. However, if vMotion detects that the Guest OS issues I/O faster than the network transfer rate, or that the destination datastore is not keeping up with the incoming changes, vMotion can switch to synchronous mirror mode to ensure the correctness of data.

I understand that the vMotion module transmits the disk data to the destination, but how are blocks that change during the migration handled?
For disk data migration vMotion uses the same architecture as Storage vMotion to handle disk content. There are two major components in play: the bulk copy and the mirror mode driver. vMotion kicks off a bulk copy and copies as much content as possible to the destination datastore via the vMotion network. During this bulk copy, blocks can be changed; some of these blocks are not yet copied, but some of them can already reside on the destination datastore.
If the Guest OS changes blocks that are already copied by the bulk copy process, the mirror mode driver will write them to the source and destination datastore, keeping them both in lock-step. The mirror mode driver ignores all the blocks that are changed but not yet copied, as the ongoing bulk copy will pick them up. To keep the I/O performance as high as possible, a buffer is available for the mirror mode driver. If high latencies are detected on the vMotion network, the mirror mode driver can write the changes to the buffer instead of delaying the I/O writes to both source and destination disk. If you want to know more about the mirror mode driver, Yellow Bricks contains an out-take of our book about the mirror mode driver.

What is copied first, disk data or the memory state?
If data is copied from non-shared datastores, vMotion must migrate the disk data and the memory across the vMotion network. It must also process additional changes that occur during the copy process. The challenge is to get to a point where the number of changed blocks and the amount of changed memory are so small that they can be copied over, and the virtual machine switched over between the hosts, before any new changes are made to either disk or memory. Usually the change rate of memory is much higher than the change rate of disk, and therefore the vMotion process starts off with the bulk copy of the disk data. After the bulk copy process is completed and the mirror mode driver processes all ongoing changes, vMotion starts copying the memory state of the virtual machine.

But what if I share datastores between hosts; can I still use this feature and leverage the storage network?
Yes, and this is a very cool piece of code: to avoid overhead as much as possible, the storage network will be leveraged if both the source and destination host have access to the destination datastore.
For instance, if a virtual machine resides on a local datastore and needs to be copied to a datastore located on a SAN, vMotion will use the storage network to which the source host is connected. In essence a Storage vMotion is used to avoid vMotion network utilization and additional host CPU cycles.

Because you use Storage vMotion, will vMotion leverage VAAI hardware offloading?
If both the source and destination host are connected to the destination datastore and the datastore is located on an array that has VAAI enabled, Storage vMotion will offload the copy process to the array.

Hold on, you are mentioning Storage vMotion, but I have an Essentials Plus license. Do I need to upgrade to Standard?
To be honest I try to keep away from the licensing debate as far as I can, but this seems to be the most popular question. If you have an Essentials Plus license you can leverage all these enhancements of vMotion in vSphere 5.1. You are not required to use a Standard license if you are going to migrate to a shared storage destination datastore. For any other licensing question or remark, please contact your local VMware SE / account manager.

Update: Essentials Plus customers, please update to vCenter 5.1.0a. For more details read the following article: “vMotion bug fixed in vCenter Server 5.1.0a”.
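
The interplay between the bulk copy and the mirror mode driver described in this article can be illustrated with a small simulation. This is a deliberately simplified sketch under stated assumptions: the disk is a list of blocks, the bulk copy walks the blocks in order, guest writes arrive between copy steps, and the buffer and the asynchronous transfer are left out. The function name and data layout are hypothetical and do not reflect the actual vMotion code.

```python
def migrate_disk(source, writes_during_copy):
    """Simulate a bulk copy with mirroring of guest writes.

    source: list of block values on the source disk (updated by guest writes).
    writes_during_copy: dict mapping a bulk-copy step to a list of
        (block_index, new_value) guest writes that arrive just before that step.
    Returns (destination_blocks, number_of_mirrored_writes).
    """
    destination = [None] * len(source)
    copied_up_to = 0   # blocks [0, copied_up_to) already reside on the destination
    mirrored = 0
    for step in range(len(source)):
        for index, value in writes_during_copy.get(step, []):
            source[index] = value
            if index < copied_up_to:
                # Block was already copied: mirror the write to the destination.
                destination[index] = value
                mirrored += 1
            # Otherwise the write is ignored here; the bulk copy picks it up later.
        destination[step] = source[step]   # bulk copy of the next block
        copied_up_to = step + 1
    return destination, mirrored


disk = ["a", "b", "c", "d"]
# A write to block 0 arrives after it was copied; a write to block 3 arrives before.
dest, mirrored = migrate_disk(disk, {1: [(0, "A")], 2: [(3, "D")]})
print(dest == disk, mirrored)
```

Only the write to the already-copied block needs mirroring, yet source and destination end up identical, which is the property that allows the switch-over to happen.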

STORAGE DRS DATASTORE CLUSTER DEFAULT AFFINITY RULE

In vSphere 5.1 you can configure the default (anti) affinity rule of the datastore cluster via the user interface. Please note that this feature is only available via the web client. The vSphere client does not contain this option. By default Storage DRS applies an intra-VM VMDK affinity rule, forcing Storage DRS to place all the files and VMDK files of a virtual machine on a single datastore. By deselecting the option “Keep VMDKs together by default” the opposite becomes true and an intra-VM anti-affinity rule is applied. This forces Storage DRS to place the VM files and each VMDK file on a separate datastore. Please read the article: “Impact of intra-vm affinity rules on storage DRS” to understand the impact of both types of rules on load balancing.
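
The effect of the two default rules on placement can be sketched as follows. This is an illustration only: the function is hypothetical, and it simply takes datastores in the order given, whereas Storage DRS weighs space and I/O load when it picks a datastore.

```python
def place_vm_files(files, datastores, keep_vmdks_together=True):
    """Return a mapping of VM file -> datastore under the default intra-VM rule.

    keep_vmdks_together=True models the intra-VM affinity rule (the default);
    False models the intra-VM anti-affinity rule.
    """
    if not datastores:
        raise ValueError("the datastore cluster has no datastores")
    if keep_vmdks_together:
        # Affinity: every file of the VM lands on a single datastore.
        return {f: datastores[0] for f in files}
    if len(files) > len(datastores):
        raise ValueError("anti-affinity needs a separate datastore for each file")
    # Anti-affinity: VM files and each VMDK go to a separate datastore.
    return dict(zip(files, datastores))


files = ["VM home", "disk1.vmdk", "disk2.vmdk"]
cluster = ["Datastore1", "Datastore2", "Datastore3"]
print(place_vm_files(files, cluster))
print(place_vm_files(files, cluster, keep_vmdks_together=False))
```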

VSPHERE 5.1 STORAGE VMOTION PARALLEL DISK MIGRATIONS

Where previous versions of vSphere copied disks serially, vSphere 5.1 allows up to 4 parallel disk copies per Storage vMotion operation. When you migrate a virtual machine with five VMDK files, Storage vMotion copies the first four disks in parallel, then starts the next disk copy as soon as one of the first four finishes.

To reduce the performance impact on other virtual machines sharing the datastores, parallel disk copies only apply to disk copies between distinct datastores. This means that if a virtual machine has multiple VMDK files on Datastore1 and Datastore2, parallel disk copies will only happen if the destination datastores are Datastore3 and Datastore4.

Let’s use an example to clarify the process. Virtual machine VM1 has four VMDK files. VMDK1 and VMDK2 are on Datastore1, VMDK3 and VMDK4 are on Datastore2. The VMDK files are moved from Datastore1 to Datastore4 and from Datastore2 to Datastore3. VMDK1 and VMDK3 are migrated in parallel, while VMDK2 and VMDK4 are queued. The migration of VMDK2 starts the moment the migration of VMDK1 is complete; likewise, the migration of VMDK4 starts when the migration of VMDK3 is complete.

A fan-out disk copy, in other words copying two VMDK files on datastore A to datastores B and C, will not have parallel disk copies. The common use case for parallel disk copies is the migration of a virtual machine configured with an anti-affinity rule inside a datastore cluster.
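
The queuing behavior in the example above can be approximated with a short scheduling sketch. It rests on an assumption drawn from the examples in this article rather than from VMware documentation: a disk copy may run in parallel only if no running copy uses the same source or destination datastore. It also simplifies the timing by grouping copies into waves, whereas the real process starts a queued copy as soon as the blocking one completes. All names are hypothetical.

```python
from typing import NamedTuple

MAX_PARALLEL_COPIES = 4   # per Storage vMotion operation, as stated in the article


class DiskCopy(NamedTuple):
    vmdk: str
    source: str
    destination: str


def plan_waves(copies):
    """Group disk copies into waves; the copies within one wave run in parallel."""
    pending = list(copies)
    waves = []
    while pending:
        wave, busy, deferred = [], set(), []
        for copy in pending:
            conflict = copy.source in busy or copy.destination in busy
            if len(wave) < MAX_PARALLEL_COPIES and not conflict:
                wave.append(copy)
                busy.update((copy.source, copy.destination))
            else:
                deferred.append(copy)   # queued until a later wave
        waves.append([c.vmdk for c in wave])
        pending = deferred
    return waves


vm1 = [
    DiskCopy("VMDK1", "Datastore1", "Datastore4"),
    DiskCopy("VMDK2", "Datastore1", "Datastore4"),
    DiskCopy("VMDK3", "Datastore2", "Datastore3"),
    DiskCopy("VMDK4", "Datastore2", "Datastore3"),
]
print(plan_waves(vm1))
```

For VM1 this yields VMDK1 and VMDK3 in the first wave and VMDK2 and VMDK4 in the second, matching the example. A fan-out from one source datastore to two destinations ends up fully serialized.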

STORAGE DRS DATASTORE CORRELATION DETECTOR

One of the cool new features of Storage DRS in vSphere 5.1 is the datastore correlation detector used by the SIOC injector. Storage arrays have many ways to configure datastores from among the available physical disk and controller resources in the array. Some arrays allow sharing of back-end disks and RAID groups across multiple datastores. When two datastores share back-end resources, their performance characteristics are tied together: when one datastore experiences high latency, the other datastore will experience similar high latency, since I/Os from both datastores are being serviced by the same disks. These datastores are considered “performance-correlated”. I/O load balancing operations in vSphere 5.1 avoid recommending migration of virtual machines between two performance-correlated datastores.

I/O load balancing algorithm

Storage DRS collects several virtual machine metrics to analyze the workload generated by the virtual machines within the datastore cluster. These metrics are aggregated in a workload model. To effectively distribute the different loads of the virtual machines across the datastores, Storage DRS needs to understand the performance (latency) of each datastore, which is captured in a device model. When a datastore violates its I/O load threshold, Storage DRS moves virtual machines out of the datastore. By linking workload models to device models, Storage DRS is able to select a datastore with a low I/O load when placing a virtual machine with a high I/O load during load balance operations.

Performance-correlated datastores

However, if data is moved between datastores that are backed by the same disks, the move may not decrease the latency experienced on the source datastore, as the same set of disks, spindles or RAID groups services the destination datastore as well. I/O load balancing recommendations should avoid using two performance-correlated datastores, since moving a virtual machine from the source datastore to the destination datastore has no effect on the datastore latency.
How does Storage DRS discover performance-correlated datastores?

The datastore correlation detector measures performance during isolation and when concurrent I/Os are pushed to multiple datastores. The basic mechanism of the correlation detector is rather straightforward: compare the overall latency when two datastores are being used alone in isolation with the latency when there are concurrent I/O streams on both datastores. If there is no performance correlation, the concurrent I/O to the other datastore should have no effect. Conversely, if two datastores are performance-correlated, the concurrent I/O streams should increase the average I/O latency on both datastores. Please note that datastores will be checked for correlation on a regular basis. This allows Storage DRS to detect changes to the underlying storage configuration.

Example scenario

In this scenario Datastore1 and Datastore2 are backed by disk devices grouped in Diskgroup1, while Datastore3 and Datastore4 are backed by disk devices grouped in Diskgroup2. All four datastores belong to a single datastore cluster. After SIOC has run the workload and device models on a datastore, SIOC picks a random datastore in the datastore cluster to check for correlations. If both datastores are idle, the datastore correlation detector uses the same workload to measure the average I/O latency in isolation and in concurrent I/O mode.

Isolation

The SIOC injector measures the average I/O latency of Datastore1 in isolation. This means it measures the latency of the outstanding I/O of Datastore1 alone. Next, it measures the average I/O latency of Datastore2 in isolation.

Concurrent I/Os

The first two steps are used to establish the baseline for each datastore. In the third step the SIOC injector sends concurrent I/O to both datastores simultaneously.
If the average latency on both datastores increases significantly compared to their baselines, the two datastores are marked as performance-correlated. In this scenario, this results in the behavior that Storage DRS does not recommend any I/O load balancing operations between Datastore1 and Datastore2 or between Datastore3 and Datastore4, but it can recommend, for example, moving virtual machines from Datastore1 to Datastore3 or from Datastore2 to Datastore4. All moves are possible as long as the datastores are not correlated.

Enable Storage DRS on performance-correlated datastores?

When two datastores are marked as performance-correlated, Storage DRS does not generate I/O load balancing recommendations between those two datastores. However, Storage DRS can still be used for initial placement and can still generate recommendations to move virtual machines between two correlated datastores to address out-of-space situations or to correct rule violations. Please keep in mind that some arrays use a subset of disks out of a larger disk pool to back a single datastore. With these configurations, it appears that all disks in a disk pool back all the datastores, but in reality they don’t. Therefore I recommend setting the Storage DRS automation mode to manual and reviewing the migration recommendations to understand whether all datastores within the disk pool are performance-correlated.
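
The comparison the correlation detector makes, and its effect on I/O load balancing, can be sketched as follows. The threshold factor of 1.5 and both function names are assumptions chosen for illustration; the article does not state how large the latency increase must be before two datastores are marked as correlated.

```python
CORRELATION_FACTOR = 1.5   # hypothetical threshold, not a documented value


def is_correlated(isolated_a, isolated_b, concurrent_a, concurrent_b,
                  factor=CORRELATION_FACTOR):
    """Compare average latencies (ms) measured in isolation and under concurrent I/O.

    Two datastores are treated as performance-correlated when concurrent I/O
    raises the latency on both of them well above their isolated baselines.
    """
    return (concurrent_a >= isolated_a * factor
            and concurrent_b >= isolated_b * factor)


def io_balance_targets(source, datastores, correlated_pairs):
    """Return the datastores eligible as I/O load balancing destinations."""
    return [ds for ds in datastores
            if ds != source and frozenset((source, ds)) not in correlated_pairs]


datastores = ["Datastore1", "Datastore2", "Datastore3", "Datastore4"]
correlated = {
    frozenset(("Datastore1", "Datastore2")),   # both backed by Diskgroup1
    frozenset(("Datastore3", "Datastore4")),   # both backed by Diskgroup2
}
print(io_balance_targets("Datastore1", datastores, correlated))
```

For Datastore1 only Datastore3 and Datastore4 remain as I/O load balancing destinations. Space-driven moves and rule corrections would not be filtered this way, as noted above.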