UPGRADING VMFS DATASTORES AND SDRS

Among the many cool new features introduced by vSphere 5 is the new VMFS file system for block storage. Although vSphere 5 can use VMFS-3, VMFS-5 is the native VMFS level of vSphere 5 and it is recommended to migrate to the new VMFS level as soon as possible. Jason Boche wrote about the differences between VMFS-3 and VMFS-5. vSphere 5 offers a pain-free upgrade path from VMFS-3 to VMFS-5. The upgrade is an online and non-disruptive operation, which allows the resident virtual machines to continue to run on the datastore. But upgraded VMFS datastores may have an impact on SDRS operations, specifically virtual machine migrations.

When upgrading a VMFS datastore from VMFS-3 to VMFS-5, the current VMFS-3 block size is maintained, and this block size may be larger than the VMFS-5 block size, as VMFS-5 uses a unified 1MB block size. For more information about the differences between native VMFS-5 datastores and upgraded VMFS-5 datastores, please read Cormac’s article about the new storage features.

Although the upgraded VMFS file system leaves the block size unmodified, it removes the maximum file size related to a specific block size, so why exactly would you care about having a non-unified block size in your SDRS datastore cluster? In essence, mixing different block sizes in a datastore cluster may lead to a loss in efficiency and an increase in the lead time of a Storage vMotion process. As you may remember, Duncan wrote an excellent post about the impact of different block sizes and the selection of datamovers. To make an excerpt, vSphere 5 offers three datamovers:

• fsdm
• fs3dm
• fs3dm – hardware offload

The following diagram depicts the datamover placement in the stack. Basically, the longer the path the IO has to travel before being handled by a datamover, the slower the process. In the most optimal scenario, you want to leverage the VAAI capabilities of your storage array. vSphere 5 is able to leverage the capabilities of the array, allowing hardware offload of the IO copy. Most IOs will remain within the storage controller and do not travel up the fabric to the ESXi host. But unfortunately not every array is VAAI capable. If the attached array is not VAAI capable or enabled, vSphere will leverage the FS3DM datamover. FS3DM was introduced in vSphere 4.1 and contains some substantial optimizations so that data does not travel through all stacks. However, if a different block size is used, ESXi reverts to FSDM, commonly known as the legacy datamover.

To illustrate the difference in Storage vMotion lead time, read the following article (once again) by Duncan: Storage vMotion performance difference. This article contains the results of a test in which a virtual machine was migrated between two different types of disks, configured first with deviating block sizes and at a later stage with a similar block size. To emphasize: the results illustrate the lead time of the FS3DM datamover versus the FSDM datamover. The results can be found in the Yellow-Bricks.com article.
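To inventory the block sizes of your datastores before grouping them into a datastore cluster, you can query vCenter. Below is a minimal sketch using the pyVmomi library; the hostname and credentials are placeholders, and it simply reads the VMFS version and block size the vSphere API exposes per datastore:

```python
# Minimal sketch (hostname and credentials are placeholders):
# list the VMFS version and block size of every datastore, so
# upgraded VMFS-5 datastores with a non-unified (>1MB) block
# size can be spotted before mixing them in a datastore cluster.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host='vcenter.local', user='administrator',
                  pwd='password', sslContext=ctx)
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.Datastore], True)

for ds in view.view:
    # Only VMFS datastores carry a block size; skip NFS and others.
    if ds.summary.type == 'VMFS':
        vmfs = ds.info.vmfs
        print('%s: VMFS %s, block size %s MB'
              % (ds.name, vmfs.version, vmfs.blockSizeMb))

view.Destroy()
Disconnect(si)
```

Any datastore reporting a block size larger than 1 MB next to native 1MB datastores is a candidate for forcing the legacy FSDM datamover during Storage vMotion.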

MULTI-NIC VMOTION SUPPORT IN VSPHERE 5.0

There are fundamental changes to vMotion scalability and performance in vSphere 5.0, and one of the most visible is multi-NIC support. In vSphere 5.0, vMotion is capable of using multiple NICs concurrently to decrease the lead time of a vMotion operation. With multi-NIC support, even a single vMotion can leverage all of the configured vMotion NICs, contrary to previous ESX releases where only a single NIC was used.

Allocating more bandwidth to the vMotion process will result in faster migration times, which in turn affects the DRS decision model. DRS evaluates the cluster and recommends migrations based on demand and the cluster balance state. This process is repeated each invocation period. To minimize CPU and memory overhead, DRS limits the number of migration recommendations per DRS invocation period. Ultimately, there is no advantage in recommending more migrations than can be completed within a single invocation period. On top of that, the demand could change after an invocation period, which would render the previous recommendations obsolete. vCenter calculates the limit per host based on the average time per migration, the number of simultaneous vMotions and the length of the DRS invocation period (PollPeriodSec); a back-of-the-envelope sketch of how these inputs interact follows at the end of this section.

PollPeriodSec: By default, PollPeriodSec – the length of a DRS invocation period – is 300 seconds, but it can be set to any value between 60 and 3600 seconds. Shortening the interval will likely increase the overhead on vCenter due to additional cluster balance computations. It also reduces the number of allowed vMotions due to the smaller time window, resulting in longer periods of cluster imbalance. Increasing the PollPeriodSec value decreases the frequency of cluster balance computations on vCenter and allows more vMotion operations per cycle. Unfortunately, this may also leave the cluster in a state of imbalance for longer due to the prolonged evaluation cycle.

Estimated total migration time: DRS considers the average migration time observed from previous migrations. The average migration time depends on many variables, such as source and destination host load, active memory in the virtual machine, link speed, available bandwidth and latency of the physical network used by the vMotion process.

Simultaneous vMotions: Similar to vSphere 4.1, vSphere 5 allows you to perform 8 concurrent vMotions on a single host with 10GbE capabilities. For 1GbE, the limit is 4 concurrent vMotions.

Design considerations: When designing a virtual infrastructure leveraging converged networking or Quality of Service to impose bandwidth limits, please remember that vCenter determines the vMotion limits based on the link speed reported by the physical NICs used as vMotion uplinks. In other words, if the physical NIC reports a link speed of at least 10GbE, vCenter allows 8 concurrent vMotions, but if the physical NIC reports less than 10GbE but at least 1GbE, vCenter allows a maximum of 4 concurrent vMotions on that host. For example, HP Flex technology sets a hard limit on the FlexNICs, resulting in a reported link speed equal to or less than the bandwidth configured at the Flex Virtual Connect level. I’ve come across many Flex environments configured with more than 1Gb of bandwidth, ranging from 2Gb to 8Gb. Although such a configuration offers more bandwidth per vMotion process, it does not offer an increase in the number of concurrent vMotions. Therefore, when designing a DRS cluster, take the possibilities of vMotion into account and how vCenter determines the number of concurrent vMotion operations.
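To make the interaction between these three inputs concrete, here is the promised back-of-the-envelope sketch. The formula is an assumed simplification for illustration only; vCenter’s actual calculation is internal:

```python
# Back-of-the-envelope sketch (assumed simplification, not
# vCenter's actual algorithm): estimate how many migrations fit
# in one DRS invocation period.

def migrations_per_invocation(poll_period_sec=300,
                              avg_migration_sec=60,
                              link_speed_gbps=10):
    # vSphere 5 allows 8 concurrent vMotions on a 10GbE uplink,
    # 4 on anything reporting less than 10GbE but at least 1GbE.
    concurrent = 8 if link_speed_gbps >= 10 else 4
    # Rounds of migrations completing within the period, times
    # the number of vMotions that can run in each round.
    return (poll_period_sec // avg_migration_sec) * concurrent

print(migrations_per_invocation())                   # 10GbE uplink: 40
print(migrations_per_invocation(link_speed_gbps=4))  # 4Gb FlexNIC: 20
```

Note how the 4Gb FlexNIC halves the outcome even though it offers four times the bandwidth of a 1GbE NIC: the concurrency limit is driven by the reported link speed tier, not the raw bandwidth.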
By providing enough bandwidth, the cluster can reach a balanced state more quickly, resulting in better resource allocation (performance) for the virtual machines. **Disclaimer: this article contains excerpts from our book: vSphere 5 Clustering Technical Deepdive**

BLACK AND WHITE EDITION CLUSTERING DEEPDIVE AVAILABLE

It looks like Amazon is getting its act together. As of now, the Black and White paperback edition is available at Amazon.com. Get it here: VMware vSphere Clustering Technical Deepdive. We are still waiting for the Full Color edition to become available, but hey, it’s a start :)

VMWARE VSPHERE 5 CLUSTERING TECHNICAL DEEPDIVE

As of today, the paperback versions of the VMware vSphere 5 Clustering Technical Deepdive are available at Amazon. We took the feedback into account when creating this book and are offering a Full Color version and a Black and White edition. Initially we planned to release an ebook and a Full Color version only, but due to the high production costs associated with full color publishing, we decided to add a Black and White edition to the line-up as well. At this stage we do not have plans to produce any other formats. As this is a self-published release, we developed, edited and created everything from scratch. Writing and publishing a book on new technology has a serious impact on one’s life, reducing every social contact – even family life – to a minimum. Because of this, our focus is not on releasing additional formats such as iBooks or Nook at the moment. Maybe at a later stage, but VMworld is already knocking on our doors, so little time is left to spend with our families.

When producing the book, the page count rapidly exceeded 400 pages using the 4.1 HA and DRS layout. Many readers told us they loved the compactness of the book, so our goal was to keep the page count increase to a minimum. Adjusting the inner margins of the book was the way to increase the amount of space available for the content. A tip for all who want to start publishing: get accustomed to publisher jargon early in the game; this will save you many failed proof prints! We believe we got the right balance between white space and content in the book, reducing the number of pages while still offering the best reading experience. Nevertheless, the number of pages grew from 219 to 348.

While writing the book, we received a lot of help, and although Duncan listed all the people in his initial blog post, I want to take a moment to thank them again. First of all, I want to thank my co-author Duncan for his hard work creating content, but also for spending countless hours on communication with engineering and management. Anne Holler – DRS and SDRS engineer – really went out of her way to help us understand the products. I frequently received long and elaborate replies regardless of the time and day. Thanks Anne! Next up is Doug – “it’s number, Frank, not amount!” – Baer. I think most of the time Doug’s comments equaled the amount of content inside the documents. Your commitment to improving the book impressed us very much. Thanks to Gabriel Tarasuk-Levin for helping me understand the intricacies of vMotion. A special thanks goes out to our technical reviewers and editors: Keith Farkas and Elisha Ziskind (HA engineering), Irfan Ahmad and Rajesekar Shanmugam (DRS and SDRS engineering), Puneet Zaroo (VMkernel scheduling), Ali Mashtizadeh, Doug Fawley and Divya Ranganathan (EVC engineering). Thanks for keeping us honest and contributing to this book. I want to thank the VMware management team for supporting us on this project. Doug “VEEAM” Hazelman, thanks for writing the foreword!

Availability
This weekend Amazon made both the Black and White edition and the Full Color edition available. Amazon lists the Black and White edition as: VMware vSphere 5 Clustering Technical Deepdive (Volume 2) [Paperback], whereas the Full Color edition is listed with Full Color in its subtitle.
Or select the following links to go to the desired product page:

Black and White paperback: $29.95
Full Color paperback: $49.95
Ebook: VMware vSphere 5 Clustering Technical Deepdive (price might vary based on location)

If you prefer a European distributor, ComputerCollectief has both books available:

Black and White edition: http://www.comcol.nl/detail/74615.htm
Full Color edition: http://www.comcol.nl/detail/74616.htm

Pick it up, leave a comment and of course feel free to make those great mugshots again and ping them over via Facebook or our Twitter accounts! For those looking to buy in bulk (> 20), contact [email protected].

AMAZON INDEXING PROBLEMS

All week, Amazon has been unable to properly index both editions of the vSphere 5 Clustering Technical Deepdive. We are working with CreateSpace to fix these problems. In the meantime, both the Full Color and Black and White editions can be ordered at CreateSpace:

Black and White: https://www.createspace.com/3641804 – $29.95
Full Color: https://www.createspace.com/3586911 – $49.95

An update follows as soon as Amazon lists the paperbacks.

A LOOK INSIDE THE UPCOMING VSPHERE CLUSTERING BOOK

I just received the second proof copy of the new book and I’m really (really) stoked about it. The full color print is awesome and truly adds that special feeling to the book. I’m so excited about how the diagrams turned out that I must share some pictures of the inside of the book. (For the CSI fans: yes, I have blurred some text fields as they contain NDA material.) Besides the diagrams, the whole interior is redesigned. The spread is reviewed, the inner and outer margins are adjusted and we have even taken the gutter space into account, providing a better and nicer reading experience. We decided that we are going to offer the book in a Full Color format and a (full color) ebook. After seeing the full color version, we believe publishing a black and white version would not do the content any justice. And due to time constraints, we cannot invest time in offering a black and white version of the book. We are still finalizing the book, but we hope to provide the possibility of pre-ordering near the publish date. Stay tuned for more information!

KEEP ALIVE PING - SOME UPDATES

Lately the content on frankdenneman.nl has been getting a bit stale, so here is a small update from my side to show what I’ve been up to and that this blog is still alive.

vSphere x Clustering Deepdive
Duncan and I have been working (feverishly) on a new book for a while now. Calling it an update of the 4.1 HA and DRS Technical Deepdive won’t do the book any justice, as the old chapters are completely rewritten. The HA section will cover the new HA stack of the upcoming vSphere version, while the focus of the DRS section leans more towards resource management. In addition, this book covers Storage DRS and introduces a cool new feature called “supporting deep dives”. These additional deep dives expand on supporting technologies of the main cluster feature set, with in-depth information on technologies such as vMotion, Storage vMotion, EVC and certain new technologies introduced in the upcoming version of vSphere. I bet you guys will love this stuff.

Speaking at VMworld 2011
Last Friday I received very good news. In my previous post I asked everyone to consider voting on the sessions I participate in, and the good news is that both sessions were accepted. Session VSP1682 – vSphere 5 Clustering Q&A, which I will co-present with Duncan, is accepted for both VMworld Las Vegas and VMworld Europe in Copenhagen. In the second session, VSP1425 – Ask the Experts vBloggers, I’m proud to join the incredible line-up of Chad Sakac, Duncan Epping, Rick Scherer and Scott Lowe to help answer questions on virtualization design. After 5 years of visiting VMworld as an attendee, I finally get to experience what it’s like to be on the other side of the room.

VMWORLD PUBLIC VOTING

VMworld 2011 session voting opened a week ago and there are still a few days left to cast your vote. About 300 in-depth sessions will be presented at VMworld this year, and two sessions in which I participate have been submitted. Both sessions are not the typical PowerPoint slide sessions, but are based on interaction with the attending audience.

TA1682 – vSphere Clustering Q&A
Duncan Epping and Frank Denneman will answer any question with regards to vSphere clustering in this session. You as the audience will have the chance to validate your own environment and design decisions with the Subject Matter Experts on HA, DRS and Storage DRS. Topics could include, for instance, misunderstandings around admission control policies, the impact of limits and reservations on your environment, the benefits of using resource pools, anti-affinity rule gotchas, DPM and of course anything regarding Storage DRS. This is your chance to ask what you’ve always wanted to know! Duncan and I conducted this very successful session at the Dutch VMUG. Audience participation led to a very informative session where both general principles and in-depth details were explained and misconceptions were addressed.

TA1425 – Ask the Expert vBloggers
Four VMware Certified Design Experts (VCDX) on stage! Are you running a virtual environment and experiencing some problems? Are you planning your company’s private cloud strategy? Looking to deploy VDI and have some last-minute questions? Do you have a virtual infrastructure design and want it blessed by the experts? Come join us for a one-hour panel session where your questions are the topic of discussion! Join the virtualization experts Frank Denneman (VCDX), Duncan Epping (VCDX), Scott Lowe (VCDX) and Chad Sakac, VP VMware Alliance at EMC, as they answer your questions on virtualization design. Moderated by VCDX #21, Rick Scherer from VMwareTips.com.

Many friends of the business have submitted great sessions and there are really too many to list them all, but there is one I would like to ask you to vote on, and that is the ESXi Quiz Show.

A1956 – The ESXi Quiz Show
Join us for our very first ESXi Quiz Show, where teams of vExperts and VMware engineers will match expertise on technical facts and trivia related to VMware ESXi and related products. You as the audience will get 40% of the vote. We will cover topics around ESXi migration, storage, networking, security and VMware products. As an attendee of this session you will get to see the experts battle each other. For the very first time at VMworld, you get to decide who leaves the stage as a winner and who does not. This could become the most awesome thing that ever hit VMworld. Can you imagine the gossip, the hype and the sensation this will introduce during VMworld, as top vExperts, bloggers, VMware engineers and just the lone sys admin (no, not you Bob Plankers ;)) compete with each other? Will the usual suspects win or will there be upsets? Who will dethrone whom? I really think this will become the hit of VMworld 2011 and the talk of the day at every party during VMworld week.

Session voting is open until May 18. The competition is very fierce and it’s very difficult to choose between the excellent submitted sessions; however, I would like to ask for your help and I hope you guys are willing to vote for these three sessions. http://www.vmworld.com/cfp.jspa

CONTENTION ON LIGHTLY UTILIZED HOSTS

Often I receive the question why a virtual machine is not receiving resources while the ESXi host is lightly utilized and is accumulating idle time. This behavior is observed while reviewing the DRS distribution chart or the host’s Summary tab in the vSphere Client. A common misconception is that low utilization (low MHz) equals scheduling opportunities. Before focusing on the complexities of scheduling and workload behavior, let’s begin by reviewing the CPU distribution chart. The chart displays the sum of all the active virtual machines and their utilization per host. This means that in order to reach 100% CPU utilization of the host, every active vCPU on the host needs to consume 100% of its assigned physical CPU (pCPU). For example, an ESXi host equipped with two quad-core CPUs needs to simultaneously run eight vCPUs, and each vCPU must consume 100% of “its” physical CPU. Generally this is a very rare condition and is only seen during boot storms or incorrectly configured scheduled anti-virus scanning. But what causes latency (ready time) during low host utilization? Let’s take a closer look at some common factors that affect or prohibit the delivery of the entitled resources:

RESTART VCENTER RESULTS IN DRS LOAD BALANCING

Recently I had to troubleshoot an environment that appeared to have a DRS load-balancing problem. Every time a host was brought out of maintenance mode, DRS didn’t migrate virtual machines to the empty host. Eventually virtual machines were migrated to the empty host, but only after a couple of hours had passed. After a restart of vCenter, however, DRS immediately started migrating virtual machines to the empty host. Restarting vCenter removes the cached historical information about the impact of previous vMotions. This vMotion impact information is part of the cost-benefit-risk analysis. DRS uses this cost-benefit metric to determine the return on investment of a migration. By comparing the cost, benefit and risk of each migration, DRS tries to avoid migrations with insufficient improvement of the load balance of the cluster. When the historical information is removed, a big part of the cost segment is lost, leading to a more positive ROI calculation, which in turn results in a more “aggressive” load-balancing operation. A simplified sketch of this effect follows below.
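The sketch below is a hypothetical illustration of the cost-benefit idea, not DRS’s actual code or weights; the function name and values are made up for the example:

```python
# Hypothetical sketch of the cost-benefit-risk idea (not DRS's
# actual algorithm or weights): cached historical vMotion impact
# sits on the cost side, so clearing it with a vCenter restart
# makes the same migration look like a better investment.

def recommend_migration(balance_benefit, migration_cost,
                        cached_vmotion_impact, risk):
    roi = balance_benefit - (migration_cost + cached_vmotion_impact) - risk
    return roi > 0

# Before the restart: cached history weighs down the ROI -> no migration.
print(recommend_migration(10, 4, 8, 1))  # False
# After the restart the cached impact is gone -> migration recommended.
print(recommend_migration(10, 4, 0, 1))  # True
```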