VMWARE CLOUD™ ON AWS – A CLOSER LOOK
After a long time of keeping this silent, I can finally share a little bit of what I’ve been focusing on at VMware. (This is a repost of content on blogs.vmware.com.) Today, VMware and Amazon Web Services (AWS) are announcing a strategic partnership providing the ability to run a full VMware Software-Defined Data Center (SDDC) as a cloud service on AWS. This service will include all the enterprise tools you’re familiar with, including vSphere, ESXi, VSAN and NSX. This article provides a technical preview of the new service, VMware Cloud on AWS (VMC), allowing me to give you a sneak peek of the incredibly cool stuff that is coming.

This architecture is a match made in heaven if you ask me. It allows administrators and architects who are used to vSphere to leverage the agility of AWS without re-architecting applications and reconstructing operational procedures. One great advantage is that vCenter will be the main platform of operations; therefore, all tools that you currently run against vCenter in your on-premises vSphere deployment will work with the in-cloud SDDC environment. All the tools and functionality that have been developed over the years now come together and provide an environment that allows workload mobility between clouds while pushing data center agility to new levels. In short, once signed up, you select a cluster size and an SDDC environment is created for you in a very short time.

To emphasize (and to avoid any misconception), VMware Cloud on AWS will run native ESXi on next-generation, bare-metal AWS infrastructure. It will be deployed as a private cloud containing vSphere ESXi hosts, VSAN and NSX on AWS infrastructure. This will allow you to run enterprise workloads with the same performance, reliability and availability levels as your on-premises vSphere deployments, but now on AWS infrastructure.

The main difference between the on-premises and in-cloud deployment is that VMware manages and operates the infrastructure of VMware Cloud on AWS. It is important to note that this is a fully managed service. That is to say, VMware will install, manage and maintain the underlying ESXi, VSAN, vCenter and NSX infrastructure. Routine operations like patching or hardware failure remediation will be taken care of by VMware as part of the service. Customers will have delegated permissions to components like vCenter and will be able to use vCenter to perform administrative tasks, but some actions, like patching, will be provided by VMware as part of the service. This means that VMware takes care of the core infrastructure in partnership with AWS.

VMware Cloud on AWS will be available as a stand-alone deployment, a hybrid cloud deployment or a cloud-to-cloud deployment. With hybrid and cloud-to-cloud deployments, vCenter Enhanced Linked Mode provides a single pane of glass that helps IT operations teams manage the SDDC deployments from a centralized console. NSX extends this single pane of glass by providing consistent network and security services between the various deployments. However, NSX is not a requirement! If you are not running NSX on-premises right now, you will still be able to run VMware Cloud on AWS, but you won’t be able to utilize the hybrid cloud features of NSX until you do. With the ability to span networks and clouds, vMotion provides workload mobility, allowing the movement of workloads in and out of the various cloud deployments. Yes, you read that correctly, you can vMotion from your existing on-premises vSphere environment to AWS!
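Because vCenter remains the operational control plane, automation you already run against an on-premises vCenter should keep working against the in-cloud SDDC. Below is a minimal sketch using pyVmomi (my choice for illustration, not something the service mandates); the vCenter hostname and credentials are hypothetical placeholders.

```python
# A minimal sketch, assuming pyVmomi (pip install pyvmomi) and a reachable
# vCenter; hostname, user and password below are hypothetical placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

context = ssl.create_default_context()  # relax verification only in a lab

# Whether this vCenter fronts an on-premises cluster or a VMware Cloud on AWS
# SDDC, the script talks to the same API.
si = SmartConnect(host="vcenter.sddc.example.com",
                  user="cloudadmin@vmc.local",
                  pwd="********",
                  sslContext=context)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], recursive=True)
    for host in view.view:
        print(host.name, host.summary.hardware.numCpuCores, "cores")
finally:
    Disconnect(si)
```

The same pattern applies to PowerCLI or any other vCenter-based tooling: the endpoint changes, not the toolchain.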
One of the interesting concepts is elastic scaling. Elastic scaling would help to solve one of the toughest challenges an IT architect can face: capacity planning. Key factors in capacity planning are current and future resource demand, failure recovery capacity and maintenance capacity. Finding the right balance between maintaining workload performance and the CAPEX and OPEX downside of reserved failover capacity is difficult. Think about how elastic scaling would transform vSphere clusters into agile powerhouses. Instead of going through the tedious procurement and installation process yourself, you benefit from the IT-at-scale mindset and services delivered by AWS.

Since ESXi 4.0, vSphere HA has enabled workloads to restart on the surviving hosts in the cluster. However, when a host outage is not temporary, host resources can become constrained due to the reduction in available hosts. Auto-remediation builds upon DR solutions, ensuring that available host resources remain consistent during an ESXi host outage. When a host failure is detected, auto-remediation adds other hosts to the cluster, ensuring that workload performance will not be impacted in the long run by a host failure. If a partial (hardware) failure occurs, auto-remediation ensures that VSAN operations complete before ejecting the degraded host. Another benefit of this framework is the ability to retain similar levels of resources during maintenance. During maintenance operations, the cluster size is not reduced, so workloads are not impacted by a loss of resources and continue to perform as they do during normal operations.

I believe one of the strengths of the VMware Cloud on AWS service is that it allows administrators, operations teams and architects to use their existing skill set and tools to consume AWS infrastructure. You can move workloads to the cloud without having to replatform them in any way: no conversion of virtual machines, no repackaging and, very importantly, no extensive testing; you just migrate the VM. Another strength is the ability to pair current workloads with the advanced feature set of AWS. As a result, IT teams will be able to extend their skill set by discovering the vast catalog of services AWS has to offer. This creates an environment that works seamlessly with both on-premises private clouds and advanced AWS public cloud services. There are so many other great features that I want to cover, but let’s save that for future articles.
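To make the elastic scaling and auto-remediation idea described above a bit more concrete, here is a purely hypothetical sketch. The real service implements this as managed logic that is not exposed to customers, so the class, function and figures below are illustrative only.

```python
# Hypothetical sketch of the elastic-scaling / auto-remediation idea; names
# and numbers are illustrative, not the actual VMware Cloud on AWS logic.
from dataclasses import dataclass

@dataclass
class Cluster:
    desired_hosts: int      # cluster size the customer selected
    healthy_hosts: int      # hosts currently passing health checks
    in_maintenance: int     # hosts temporarily out for patching

def hosts_to_add(cluster: Cluster) -> int:
    """How many replacement hosts to request from the AWS capacity pool so
    usable capacity stays at the desired cluster size."""
    usable = cluster.healthy_hosts - cluster.in_maintenance
    return max(cluster.desired_hosts - usable, 0)

# Example: an 8-host cluster loses one host to a hardware failure while one
# host is being patched -> two replacement hosts keep capacity constant.
print(hosts_to_add(Cluster(desired_hosts=8, healthy_hosts=7, in_maintenance=1)))  # 2
```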
VMWORLD GEEK WHISPERERS PODCAST - CHOOSING TITLES YOU WANT TO HAVE
Amy Lewis asked me to appear on the Geek Whisperers Live podcast at VMworld 2016 in Las Vegas, and as always I had a blast discussing various topics with Amy, Matt, and John. In this talk, we spoke about becoming an evangelist, what the challenges of being an evangelist are, and why you wouldn’t want to pick the title of evangelist yourself. Of course, while interacting with this magnificent group of people you tend to talk about a lot more things. So go on and check it out. http://geek-whisperers.com/2016/09/choosing-titles-you-want-to-have-wfrank-denneman-at-vmworld-2016-episode-120/
I'M COMING HOME
I’m excited to announce that I’ve accepted a position at VMware as Senior Staff Architect. I can’t share the details right now of the next-level product that I will be working on, but I look forward to sharing more information when the time is right. I cannot wait to get started. #GameOn!
NUMA DEEP DIVE PART 5: ESXI VMKERNEL NUMA CONSTRUCTS
ESXi is optimized for NUMA systems and contains both a NUMA scheduler and a CPU scheduler. When ESXi runs on a NUMA platform, the VMkernel activates the NUMA scheduler. The primary role of the NUMA scheduler is to optimize the CPU and memory allocation of virtual machines by managing initial placement and by dynamically load-balancing virtual machine workloads across the NUMA nodes. Allocation of physical CPU resources to virtual machines is carried out by the CPU scheduler.
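As a quick way to see the physical NUMA layout the VMkernel is working with, you can read the host hardware information through the vSphere API. A minimal sketch, assuming pyVmomi and an authenticated ServiceInstance `si` (obtained with SmartConnect); the property names are as I recall them from the vSphere API, so verify them against your environment.

```python
# Hedged sketch: lists the NUMA nodes and their physical CPUs per ESXi host,
# assuming `si` is an authenticated pyVmomi ServiceInstance.
from pyVmomi import vim

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.HostSystem], recursive=True)
for host in view.view:
    numa = host.hardware.numaInfo            # vim.host.NumaInfo
    print(f"{host.name}: {numa.numNodes} NUMA node(s), type={numa.type}")
    for node in numa.numaNode or []:
        print(f"  node {node.typeId}: pCPUs {list(node.cpuID)}")
```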
NUMA DEEP DIVE PART 4: LOCAL MEMORY OPTIMIZATION
If a cache miss occurs, the memory controller responsible for that memory line retrieves the data from RAM. Fetching data from local memory could take 190 cycles, while it could take the CPU a whopping 310 cycles to load the data from remote memory. Creating a NUMA architecture that provides enough capacity per CPU is a challenge, considering the impact the memory configuration has on bandwidth and latency. Part 2 of the NUMA Deep Dive covered QPI bandwidth configurations; with those QPI bandwidth ‘restrictions’ in mind, optimizing the memory configuration contributes the most to local access performance. Like the CPU, memory is a very complex subject and I cannot cover all the intricate details in one post. Last year I published the Memory Deep Dive series and I recommend reviewing that series as well to get a better understanding of the characteristics of memory.
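To put those cycle counts into perspective, here is a quick back-of-the-envelope conversion to time, assuming a hypothetical 2.4 GHz core clock; the cycle counts come from the text above, the frequency is just an example.

```python
# Convert the local/remote cycle counts above into nanoseconds,
# assuming a hypothetical 2.4 GHz core clock.
clock_ghz = 2.4
for label, cycles in [("local memory", 190), ("remote memory", 310)]:
    print(f"{label}: {cycles} cycles ≈ {cycles / clock_ghz:.0f} ns")
# local memory: 190 cycles ≈ 79 ns
# remote memory: 310 cycles ≈ 129 ns
```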
NUMA DEEP DIVE PART 3: CACHE COHERENCY
When people talk about NUMA, most talk about the RAM and the core count of the physical CPU. Unfortunately, the importance of cache coherency in this architecture is mostly ignored. Locating memory close to CPUs increases scalability and reduces latency as long as data locality occurs. However, a great deal of the efficiency of a NUMA system depends on the scalability and efficiency of the cache coherence protocol! When researching older NUMA material, you will notice that today’s architecture is primarily labeled ccNUMA, Cache Coherent NUMA.
NUMA DEEP DIVE PART 2: SYSTEM ARCHITECTURE
Reviewing the physical layers helps to understand the behavior of the CPU scheduler of the VMkernel, and it helps to select a physical configuration that is optimized for performance. This part covers the Intel Xeon microarchitecture and zooms in on the Uncore, primarily focusing on Uncore frequency management and QPI design decisions.

Terminology: there are a lot of different names used for what is apparently the same thing. Let’s review the terminology of the physical CPU and the NUMA architecture. The CPU package is the device you hold in your hand; it contains the CPU die and is installed in the CPU socket on the motherboard. The CPU die contains the CPU cores and the system agent. A core is an independent execution unit and can present two virtual cores to run simultaneous multithreading (SMT). Intel’s proprietary SMT implementation is called Hyper-Threading (HT). Both SMT threads share components such as the cache layers and access to the scalable ring on-die interconnect for I/O operations. Interesting etymology: the word “die” is the singular of dice. Elements such as processing units are produced on a large round silicon wafer. The wafer is cut (“diced”) into many pieces, and each of these pieces is called a die.
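To connect this terminology to something you can inspect yourself, here is a small sketch that reads the socket, core and SMT counts on a Linux machine (not ESXi) by parsing the output of lscpu; it assumes lscpu is available and uses its standard field names.

```python
# Hedged sketch: maps packages, cores and SMT threads on a Linux host,
# assuming the lscpu utility is available.
import subprocess

info = {}
for line in subprocess.check_output(["lscpu"], text=True).splitlines():
    key, _, value = line.partition(":")
    info[key.strip()] = value.strip()

sockets = int(info["Socket(s)"])                     # CPU packages
cores_per_socket = int(info["Core(s) per socket"])   # cores per die
threads_per_core = int(info["Thread(s) per core"])   # SMT / Hyper-Threading
print(f"{sockets} package(s) x {cores_per_socket} cores x {threads_per_core} "
      f"threads = {sockets * cores_per_socket * threads_per_core} logical CPUs")
```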
NUMA DEEP DIVE PART 1: FROM UMA TO NUMA
Non-uniform memory access (NUMA) is a shared memory architecture used in today’s multiprocessing systems. Each CPU is assigned its own local memory and can also access memory owned by the other CPUs in the system. Local memory access provides low latency and high bandwidth, while accessing memory owned by another CPU comes with higher latency and lower bandwidth. Modern applications and operating systems such as ESXi support NUMA by default, yet to provide the best performance, virtual machines should be configured with the NUMA architecture in mind. If configured incorrectly, inconsistent behavior or overall performance degradation can occur for that particular virtual machine or, in the worst-case scenario, for all VMs running on that ESXi host. This series aims to provide insight into the CPU architecture, the memory subsystem and the ESXi CPU and memory scheduler, allowing you to create a high-performing platform that lays the foundation for higher-level services and increased consolidation ratios. Before we arrive at modern compute architectures, it’s helpful to review the history of shared-memory multiprocessor architectures to understand why we are using NUMA systems today.
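As a concrete example of configuring a VM with the NUMA architecture in mind, the sketch below checks whether a VM’s vCPU and memory footprint fits inside a single NUMA node; the host figures are hypothetical examples, not properties of any specific server.

```python
# Hypothetical sizing check: keep a VM's vCPU and memory footprint within a
# single NUMA node where possible. Host figures below are examples only.
def fits_in_one_numa_node(vm_vcpus, vm_mem_gb,
                          cores_per_node=10, mem_per_node_gb=128):
    return vm_vcpus <= cores_per_node and vm_mem_gb <= mem_per_node_gb

# On a dual-socket host with 10 cores and 128 GB per NUMA node:
print(fits_in_one_numa_node(8, 96))    # True  -> memory access stays local
print(fits_in_one_numa_node(12, 96))   # False -> the VM spans NUMA nodes
```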
INTRODUCTION 2016 NUMA DEEP DIVE SERIES
Recently I’ve been analyzing traffic to my site and it appears that a lot of CPU and memory articles are still very popular. Even my first article about NUMA, published in February 2010, is still in high demand. And although you see a lot of talk about the upper levels and overlay technology today, the focus on proper host design and management remains. After all, it’s the correct selection and configuration of these physical components that produces a consistently high-performing platform. And it’s this platform that lays the foundation for the higher services and increased consolidation ratios.