KUBERNETES AND VSPHERE COMPUTE AND STORAGE DOUBLEHEADER VMWORLD SESSION

Kubernetes is hot! It is one of the most talked-about technologies of the year. For some, it’s the next platform; for others, it’s just another technology making its way into the datacenter. Will it replace virtual machines? Will it displace vSphere? Some ask why you would run Kubernetes on top of vSphere when you can run it on bare metal. We’d rather not go back to 2005 and deal with a sprawl of bare-metal servers; we believe Kubernetes and vSphere are better together! In the session “CNA1553BU - Deep Dive: The Value of Running Kubernetes on vSphere,” Michael Gasch and I review the behavior of Kubernetes resource management, optimization, and availability for container orchestration.

Kubernetes is a system optimized for cloud-native workloads, where failure and disruption are anticipated. But what about the infrastructure that is required to run these cloud-native apps? What about Kubernetes’ ability to economically and optimally consume the available resources? We will answer these questions and reveal why vSphere, with extensive features such as high availability, NUMA optimization, and the Distributed Resource Scheduler, is such a good match. In this session, we explore the critical elements of a container and demonstrate that Kubernetes does not run in thin air. Whether you run Linux on bare metal or inside a VM determines your scalability, your recoverability, and your portability. If you spin up a Kubernetes cluster at Amazon or Google, they deploy it for you in virtual machines. If these cloud-native giants use VMs, why would you use bare metal?

Adding vSphere to the picture, Kubernetes gains several advantages for both cloud-native and traditional workloads. vSphere also plays a critical role in keeping the Kubernetes control plane components highly available during planned and unplanned downtime. We are going to detail recommended DRS and HA settings and many other best practices for Kubernetes on vSphere based on real-world customer scenarios. Of course, an outlook on upcoming improvements to the Kubernetes on vSphere integration should not be missing in a deep-dive session! Last but not least, you’ll definitely learn how to respond to common objections to win back your end users. Still not convinced? Let’s dive into the behavior of Linux CPU scheduling versus ESXi CPU and NUMA scheduling and help you understand how to size and deploy your Kubernetes cluster on vSphere correctly.

Developers shouldn’t need to worry about all these settings and the underlying layers. They just want to deploy the application, but it’s our job to cater to the needs of the application and make sure it runs consistently and constantly. This applies to compute, but also to storage. Seven out of 10 applications that run on Kubernetes are stateful, so it makes sense to incorporate persistent storage in your Kubernetes design. Some applications can provide certain services, such as replication, themselves, so it makes no sense to “replicate” that service at the infrastructure layer. vSAN and its storage policies allow the admin to provide storage services that are tailor-made to the application stack. Cormac Hogan and Christos Karamanolis talk about why vSAN is the ultimate choice for running next-gen apps. Visit their session “HCI1338BU - HCI: The Ideal Operational Environment for Cloud-Native Applications” to hear about real-world use cases and learn what you need to do when dealing with these next-gen apps.
Please note that if you attempt to add these sessions to your schedule, you might get a warning that you are on the waiting list. As we understand it, all sessions are initially booked into small rooms and are moved to bigger rooms depending on the size of the waiting list. So sign up for these sessions even if they state waiting list only; it will be sorted out during the upcoming weeks. Hope to see you in our session!

INTRODUCTION TO ELASTIC DRS

VMware Cloud on AWS allows you to deploy physical ESXi hosts on demand. You can scale your cluster in and out by logging into the console. This elasticity allows you to right-size your SDDC environment for the current workload demand. No more long procurement processes, no more waiting for the vendor to ship the goods. No more racking and stacking in a cold, dark datacenter. With just a few clicks, you get new physical resources added to your cluster: ESXi and vSAN fully installed, configured, patched, and ready to go! Having physical resources available on demand is fantastic, but it still requires manual monitoring and manual operations to scale the vSphere cluster out or in. Wouldn’t it be more comfortable if the cluster automatically responded to the dynamic nature of the workloads? As of today, you can enable Elastic DRS.

Introducing Elastic DRS

Elastic Distributed Resources Scheduler (EDRS) is a policy-based solution that automatically scales a vSphere cluster in VMware Cloud on AWS based on utilization. EDRS monitors CPU, memory, and storage resources for scaling operations. It monitors the vSphere cluster continuously, and every 5 minutes it runs its algorithm to determine whether a scale-out or scale-in operation is necessary.

Algorithm Behavior

EDRS is configured with thresholds for each resource and generates scaling recommendations if utilization consistently remains above or below the respective thresholds. The EDRS algorithm takes spikes and randomness in utilization into consideration when generating these recommendations.

Scaling Operations

Thresholds are defined for scale-out operations and scale-in operations. To avoid generating recommendations based on spikes, EDRS generates a scaling operation only if resource utilization shows consistent progress toward a threshold. To trigger a scale-out operation, a single threshold must be exceeded. That means that if CPU utilization shows consistent progress toward the threshold and at some point exceeds it, EDRS triggers an event and adds an ESXi host to the vSphere cluster. As with adding an ESXi host manually, the new host is installed with the same ESXi version and patch level, is configured with the appropriate logical networks, and adds its capacity to the vSAN datastore. To automatically scale in the cluster, utilization across ALL three resources must be consistently below the specified scale-in thresholds. A small code sketch at the end of this section makes this any-versus-all distinction concrete.

Minimum and Maximum Number of ESXi Hosts

You can set bounds on the minimum and maximum number of ESXi hosts. EDRS can be enabled once the cluster consists of four ESXi hosts, and it does not scale in below that four-host minimum. When setting a maximum number of ESXi hosts, all ESXi hosts in the vSphere cluster, including those in maintenance mode, are included in the count. Only active ESXi hosts count toward the minimum. As a result, the VMware Cloud on AWS SDDC ignores EDRS recommendations during maintenance and hardware remediation operations. Currently, the maximum number of hosts in an Elastic DRS-enabled cluster is 16.

Scaling Policies

EDRS provides policies to adjust the behavior of scaling operations: two scaling policies, one optimized for cost and one for performance. Both policies have the same scale-out thresholds; they only differ in their scale-in thresholds.
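To illustrate the any-versus-all logic, here is a minimal PowerShell sketch. The real EDRS algorithm runs as a service inside VMware Cloud on AWS and is not exposed as code; the threshold values below are purely hypothetical, and the actual algorithm also smooths out spikes before acting.

    # Illustrative sketch only: hypothetical thresholds, simplified logic.
    function Get-EdrsRecommendation {
        param (
            [double]$CpuUtil,      # cluster CPU utilization (0-100)
            [double]$MemUtil,      # cluster memory utilization (0-100)
            [double]$StorageUtil   # vSAN datastore utilization (0-100)
        )

        $scaleOut = @{ Cpu = 90; Mem = 80; Storage = 70 }   # hypothetical scale-out thresholds
        $scaleIn  = @{ Cpu = 50; Mem = 50; Storage = 40 }   # hypothetical scale-in thresholds

        # Scale out if ANY single resource exceeds its threshold.
        if ($CpuUtil -gt $scaleOut.Cpu -or $MemUtil -gt $scaleOut.Mem -or $StorageUtil -gt $scaleOut.Storage) {
            return 'Scale-Out'
        }
        # Scale in only if ALL three resources are below their scale-in thresholds.
        if ($CpuUtil -lt $scaleIn.Cpu -and $MemUtil -lt $scaleIn.Mem -and $StorageUtil -lt $scaleIn.Storage) {
            return 'Scale-In'
        }
        return 'No-Action'
    }

    Get-EdrsRecommendation -CpuUtil 92 -MemUtil 60 -StorageUtil 45   # 'Scale-Out': one resource over
    Get-EdrsRecommendation -CpuUtil 40 -MemUtil 45 -StorageUtil 30   # 'Scale-In': all resources under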

RESOURCE POOLS AND SIBLING RIVALRY

One of the most powerful constructs in the Software Defined Data Center is the resource pool. The resource pool allows you to abstract and isolate cluster compute resources. Unfortunately, it’s mostly misunderstood, and it received a bad rap in the past that it can’t seem to shake. One of the challenges with resource pools is that you have to commit to them fully: placing virtual machines as siblings next to resource pools can have an impact on resource distribution. This article zooms in on that sibling rivalry. But before this adventure begins, I would like to stress that the examples provided in this article describe a worst-case scenario in which all VMs are 100% active. That is an uncommon situation, but it makes the resource distribution easy to explain. Later in the article, I use a few examples in which some VMs are active and some are idle. And as you will see, resource pools aren’t that bad after all.

Resource Pool Size

Because resource pool shares are relative to other resource pools or virtual machines with the same parent resource pool, it is important to understand how vCenter sizes resource pools. The CPU and memory share values applied to resource pools are similar to those of virtual machines. By default, a resource pool is sized like a virtual machine with 4 vCPUs and 16 GB of RAM. Depending on the selected share level, a predefined number of shares is issued. Similar to VMs, four share levels can be selected. There are three predefined settings, High, Normal, and Low, which specify share values in a 4:2:1 ratio, and the Custom setting, which can be used to specify a different relative relationship.
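To see why a sibling VM can rival an entire pool, compare the share values directly. Here is a quick PowerCLI sketch; the pool and VM names are hypothetical, and the share figures assume the common defaults of 1,000 CPU shares per vCPU at the Normal level, which gives both the 4-vCPU-sized pool and a 4-vCPU VM the same 4,000 shares.

    # Hypothetical names; requires an active Connect-VIServer session.
    $pool = Get-ResourcePool -Name 'Production'          # sized like a 4 vCPU / 16 GB VM
    $vm   = Get-VM -Name 'BigMonsterVM'                  # a 4 vCPU sibling of the pool

    $pool.NumCpuShares                                   # 4000 at the Normal share level
    (Get-VMResourceConfiguration -VM $vm).NumCpuShares   # also 4000: 4 vCPUs at Normal

    # Under contention, siblings divide parent resources by share ratio.
    # 4000 : 4000 means this single VM is entitled to as much CPU as the
    # entire pool, no matter how many VMs run inside the pool.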

VIRTUALLY SPEAKING PODCAST ABOUT TECHNICAL WRITING

Last week Duncan and I were guests on the ever-popular Virtually Speaking Podcast. In this show, we discussed the differences in technical writing, i.e., writing a blog post versus writing a book. We spoke a lot about the challenges of writing a book and the importance of a supporting cast. We received a lot of great feedback on social media, and Pete told me the episode was downloaded more than 1,000 times in the first 24 hours. I think this is especially impressive as he published the podcast on a Saturday afternoon. Due to this popularity, I thought it might be cool to share the episode in case you missed the announcement. Pete and John shared the links to our VMworld sessions on this page. During the show, I mentioned the VMworld session of Katarina Wagnerova and Mark Brookfield. If you go to VMworld, I recommend attending this session. It’s always interesting to hear people talk about how they designed an environment and dealt with problems in a very isolated place on earth. Enjoy listening to the show.

RESOURCE CONSUMPTION OF ENCRYPTED VMOTION

vSphere 6.5 introduced encrypted vMotion, which encrypts vMotion traffic if both the source and destination host are capable of supporting it. When they are, vMotion traffic consumes more CPU cycles on both the source and destination host. This article zooms in on the CPU consumption impact of encrypted vMotion on the vSphere cluster and how DRS leverages this new(ish) technology.

CPU Consumption of the vMotion Process

ESXi reserves CPU resources on both the destination and source host to ensure vMotion can consume the available bandwidth. ESXi takes only the number of vMotion NICs and their respective speeds into account; the number of vMotion operations does not affect the total CPU resources reserved: 10% of a CPU core for a 1 GbE NIC, 100% of a CPU core for a 10 GbE NIC. vMotion is configured with a minimum reservation of 30%. Therefore, if you have a 1 GbE NIC configured for vMotion, it reserves at least 30% of a single core.
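As a back-of-the-envelope illustration of this reservation rule, here is a small PowerShell sketch. The 10%, 100%, and 30% figures come from the text above; the function itself is purely illustrative and not an actual ESXi interface.

    # Compute the vMotion CPU reservation (in fractions of a core) for a host,
    # given the speeds of its vMotion NICs.
    function Get-VMotionCpuReservation {
        param ([int[]]$NicSpeedsGbE)   # e.g. @(1) for one 1 GbE NIC, @(10,10) for two 10 GbE NICs

        $total = 0.0
        foreach ($speed in $NicSpeedsGbE) {
            if ($speed -ge 10) { $total += 1.0 }   # 100% of a core per 10 GbE NIC
            else               { $total += 0.1 }   # 10% of a core per 1 GbE NIC
        }
        # vMotion reserves at least 30% of a single core.
        return [math]::Max($total, 0.3)
    }

    Get-VMotionCpuReservation -NicSpeedsGbE @(1)       # 0.3 -> the 30% minimum applies
    Get-VMotionCpuReservation -NicSpeedsGbE @(10,10)   # 2.0 -> two full cores reserved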

NEW FLING: DRS ENTITLEMENT

I’m proud to announce the latest fling: DRS Entitlement. This fling is built by the performance team, and it provides insight into the demand and entitlement of the virtual machines and resource pools within a vSphere cluster. By default, it shows the active CPU and memory consumption, which by itself helps you understand the dynamics within the cluster, especially when you are using resource pools with different share values. In this example, I have two resource pools: one containing the high-value workloads for the organization, and one containing virtual machines that are used for test and dev operations. The high-value workloads should receive the resources they require at all times.

The What-If functionality allows you to simulate a few different scenarios: a 100% demand option and a simulation of resource allocation settings. The screenshot below shows the what-if entitlement. What if these workloads generate 100% activity? What resources do these workloads require if they go to the max? This allows you to set the appropriate resource allocation settings, such as reservations and limits, on the resource pools or maybe even on particular virtual machines.

Another option is to specify particular Reservation, Limit, and Shares (RLS) settings on an object. Select the RLS option and select the object you want to use in the simulation. In this example, I selected the Low Value Workload resource pool and changed the share value setting of the resource pool. You can verify the new setting before running the analysis. Please note that this is an analysis; it does not affect the resource allocation of active workloads whatsoever. You can simulate different settings and understand the outcome. Once the correct setting is determined, you can apply it to the object manually, or you can export the PowerCLI one-liner to programmatically change the RLS settings, as shown in the sketch below. Follow the instructions on the Flings website to install it on your vCenter. I would like to thank Sai Inabattini and Adarsh Jagadeeshwaran for creating this fling and for listening to my input! RUN DRS!
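For context, the kind of one-liner the fling can export could look something like the sketch below. This is a hypothetical example, not the fling’s actual output; the pool name and values are made up, but the cmdlets are standard PowerCLI.

    # Apply custom RLS settings to a resource pool (hypothetical name and values).
    Get-ResourcePool -Name 'Low Value Workload' |
        Set-ResourcePool -CpuSharesLevel Custom -NumCpuShares 1000 `
                         -MemSharesLevel Custom -NumMemShares 40960 `
                         -CpuReservationMhz 0 -CpuLimitMhz 8000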

STRETCHED CLUSTERS ON VMWARE CLOUD ON AWS, A REALLY BIG THING

This week Emad published an excellent article about the stretched cluster functionality of VMware Cloud on AWS. To sum up: you can now deploy a single vSphere cluster across two AWS Availability Zones.

A Trip Down Memory Lane

I think the ability to stretch a vSphere cluster across two Availability Zones is a really big thing. Go back to the days when we had to refactor an application to make it highly available. To reduce application downtime, you typically used clustering software such as Microsoft Cluster Server or Veritas Cluster Server, but not all applications were fit for this solution. When we introduced VMware High Availability back in 2006, we brought a big change to the industry. From that point on, you could provide crash-consistent failover ability to all your workloads. No need to refactor any application, no need to build outlandish hardware solutions. Just enable a few tickboxes at the infrastructure layer, and every workload running inside a VM is protected. And to this day, HA remains the most popular functionality of vSphere.

Amazon Web Services Resiliency Strategy

Amazon urges you to design your application to be resilient to infrastructure outages. AWS is hosted in multiple locations worldwide. These locations are composed of Regions and Availability Zones. Each Region is a separate geographic area that has multiple, isolated locations known as Availability Zones. AWS provides the ability to place instances and data in multiple locations, and you can take advantage of the safety and reliability of geographic redundancy by spanning your Auto Scaling group across multiple Availability Zones within a Region and then attaching a load balancer to distribute incoming traffic across those Availability Zones. Incoming traffic is distributed equally across all Availability Zones enabled for your load balancer. This works very well if you are refactoring your application or building a completely new cloud-native stack. The challenge we face today is that not all applications lend themselves to refactoring, and some applications do not require the journey from monolithic to full FaaS.

Hybrid-Cloud Experience

With stretched clusters in VMware Cloud on AWS, we introduce the same ease of infrastructure resiliency to workloads that run on AWS infrastructure. Merely expand your vSphere cluster to six hosts and select a multi-AZ deployment. After that, the workloads in the Cloud SDDC are protected against AZ outages. If something happens, HA detects the failed VMs and restarts them on physical servers in the remaining AZ without human involvement. The ability to stretch your vSphere cluster across AZs allows you to easily provide resiliency to your workloads within the AWS infrastructure without the Herculean effort of refactoring all your applications.

DYING HOME LAB - FEEDBACK WELCOME

The servers in my home lab are dying on a daily basis. After four years of active duty, I think they have the right to retire. So I need something else. But what? I can’t rent lab space as I work with unreleased ESXi code. I’ve been waiting for the Intel Xeon D 21xx Supermicro systems, but I have the feeling that Elon will reach Mars before we see these systems widely available. The system that I have in mind is the following:

DEDICATED HARDWARE IN A PUBLIC CLOUD WORLD

One of the more persistent misconceptions is that the components of VMware’s Software Defined Data Center (SDDC) on VMware Cloud on AWS are virtualized, or that the deployed VMs run natively on Amazon. And to be honest, it’s not even weird that most people think this way. After all, Amazon Web Services launched in March 2006, 12 years ago, and AWS is synonymous with Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3). All of a sudden, you can now “run vSphere on AWS”.

VBROWNBAG TECHTALKS VMWORLD CALL FOR PAPERS NOW OPEN

Although the selection process for the submitted VMworld 2018 sessions is still ongoing, vBrownBag announced its call for papers. As Duncan mentioned in his call-for-papers article: ‘Good luck, and remember: if you don’t end up getting selected, submit the proposal to a VMUG near you instead. They are always begging for community sessions.’ Think about signing up for a vBrownBag TechTalk as well. Since last year, all the vBrownBag sessions have been published in the content catalog, so your session is visible to all 23,000+ attendees. Go right ahead and fill out this form.