One of the most important aspects of VMware vSphere is that it allows users to provision and manage one or more virtual machines (VMs) on individual physical servers using their underlying resources. This means you don’t have to allocate a physical server for each application you wish to run: with virtualization you can share a physical server’s resources across multiple VMs, hosting multiple isolated operating systems that run different workloads on a single physical server (or host). This allows more efficient use of the same physical resources and reduces spending on physical servers, storage space (server racks and the like), and hardware maintenance, all of which lowers your Total Cost of Ownership (TCO).
However, you’ll want to ensure that your physical resources are readily available, used efficiently, and not over-consumed by specific parts of your infrastructure at the expense of the rest of your environment. The importance of monitoring VMware performance metrics cannot be overstated. You’ll need a good understanding of how your VMs are performing across your infrastructure to know whether they are functioning correctly, because vSphere is a shared-resource environment: the VMs on a host share that host’s CPU, memory, network bandwidth, and storage.
Ideally, we need to understand device performance. Monitoring VMware performance metrics is a critical step in keeping your VMs healthy, and using a performance monitor ensures that all VMs, servers, CPUs, and other devices are running optimally. A single performance issue on a VM host affects not only the physical server but also all the other VMs and their applications. Over- or under-allocation of resources to a VM leads to wasted or unbalanced use of resources, which can result in significant losses for a company, since businesses today host many devices and servers in a virtual environment. To prevent performance issues, set thresholds for individual devices and take remedial action when they are exceeded.
Performance and capacity management go hand in hand. For instance, a bottleneck in your applications and workloads can lead to degraded performance or even downtime if you do not have the necessary resource capacity. For vSphere administrators, monitoring can help right-size virtual machines so that resources are optimally distributed between them. A large number of the performance issues experienced with VMware vSphere are associated with a lack of available physical memory on a host.
To refresh some basic vSphere truths: physical servers running the ESXi hypervisor are called ESXi hosts. By default, ESXi hosts allocate physical resources to each running VM based on a variety of factors, including what resources are available (on a host or across a group of hosts), the number of VMs currently running, and the resource usage of those VMs. If resources are overcommitted (i.e., total resource allocation exceeds capacity), there are three settings that you can customize to manage and optimize how the ESXi host allocates resources. These are:
- Shares – which allow you to prioritize certain VMs by defining their relative claim to resources. For example, under contention, a VM with half as many memory shares as another is entitled to only half as much memory.
- Reservations – which define a guaranteed minimum amount of a resource that the ESXi host will allocate to a VM, and finally
- Limits – which define a maximum amount of a resource that the ESXi host will allocate to a VM.
By default, VMs receive shares in proportion to their allocated resources. In other words, if a VM is configured with twice as many vCPUs as another VM, it will have twice as many CPU shares. Shares are also bound by any configured reservations or limits, so even if a VM has more shares, the host will not allocate more resources than its set limit.
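To make the interplay of shares, reservations, and limits concrete, here is a minimal Python sketch. It is purely illustrative and not how ESXi’s scheduler actually works (in particular, shares only matter under contention); the VM names and figures are hypothetical.

```python
# Illustrative only: divide a host's contended memory between VMs in
# proportion to their shares, then clamp each allocation to the VM's
# reservation (guaranteed floor) and limit (hard ceiling).

def allocate_by_shares(host_capacity_mb, vms):
    """vms: list of dicts with 'name', 'shares', 'reservation_mb', 'limit_mb'."""
    total_shares = sum(vm["shares"] for vm in vms)
    allocations = {}
    for vm in vms:
        proportional = host_capacity_mb * vm["shares"] / total_shares
        allocations[vm["name"]] = max(vm["reservation_mb"],
                                      min(proportional, vm["limit_mb"]))
    return allocations

vms = [
    {"name": "web01", "shares": 2000, "reservation_mb": 1024, "limit_mb": 4096},
    {"name": "db01",  "shares": 1000, "reservation_mb": 2048, "limit_mb": 8192},
]
print(allocate_by_shares(host_capacity_mb=6144, vms=vms))
# {'web01': 4096.0, 'db01': 2048.0}
```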
You can also partition the physical resources of one or more ESXi hosts into logical units called resource pools. Resource pools are organized hierarchically. A parent pool can contain one or more VMs, or it can be subdivided into child resource pools that share the parent pool’s resources. Each child resource pool can in turn be subdivided further. Resource pools add flexibility to vSphere resource management, making it possible to silo resource consumption across your organization (e.g., different departments and administrators are assigned their own resource pools).
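Because resource pools nest, a quick way to picture them is as a tree in which each pool’s demand rolls up into its parent. The Python sketch below models that hierarchy; the pool names and memory figures are hypothetical.

```python
# Illustrative model of a resource pool hierarchy: a parent pool subdivided
# into child pools, each containing VMs (represented here by memory footprints).

from dataclasses import dataclass, field

@dataclass
class ResourcePool:
    name: str
    vm_memory_mb: list = field(default_factory=list)   # VMs in this pool
    children: list = field(default_factory=list)       # child resource pools

    def total_demand_mb(self):
        # A pool's demand is its own VMs plus everything in its child pools.
        return sum(self.vm_memory_mb) + sum(c.total_demand_mb() for c in self.children)

engineering = ResourcePool("Engineering", vm_memory_mb=[2048, 4096])
finance = ResourcePool("Finance", vm_memory_mb=[1024])
production = ResourcePool("Production", children=[engineering, finance])
print(production.total_demand_mb())  # 7168
```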
Now that we’ve reminded ourselves of a few of vSphere’s main components and its general architecture, I want to drill down into one of the most critical vSphere resources (in my opinion): memory. You’ll want to pay close attention to it when monitoring your vSphere environment at the VM, host, and cluster levels.
In vSphere, there are three layers of memory to be aware of:
- Host physical memory (the memory available to the ESXi hypervisor from the underlying host)
- Guest physical memory (the memory available to the guest operating systems (O/S) running on VMs), and
- Guest virtual memory (the memory available at the application level of a VM).
Each VM has a configured memory size: the amount of memory its guest operating system may access. This configured size is different, however, from how much memory the host may actually allocate to it, which depends on the VM’s demand as well as any configured shares, limits, or reservations. For example, though a VM may have a configured size of 2 GB, the ESXi host might only need to allocate 1 GB to it based on its actual workload (i.e., any running applications or processes). Note that if no memory limit has been set on a VM, its configured size acts as its default limit.
When a virtual machine starts, the ESXi hypervisor of its underlying host creates a set of memory addresses matching the memory addresses presented to the guest operating system running on the virtual machine. When an application running on a VM attempts to read from or write to a memory page, the VM’s guest O/S translates between guest virtual memory and guest physical memory as if it were a non-virtualized system. The guest O/S, however, does not have access to and so cannot allocate host physical memory. Instead, the host’s ESXi hypervisor intercepts memory requests and maps them to memory from the underlying physical host. ESXi also maintains a mapping (called shadow page tables) of each memory translation: guest virtual to guest physical, and guest physical to host physical. This ensures consistency across all layers of memory.
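The two translation layers described above can be pictured as a pair of lookup tables: the guest O/S resolves guest virtual pages to guest physical pages, and the hypervisor resolves guest physical pages to host physical pages. The Python sketch below is a toy model of that idea (real hypervisors use shadow page tables or hardware-assisted MMU virtualization, not dictionaries); the page numbers are hypothetical.

```python
# Toy model of the two memory translation layers.
guest_page_table = {0x01: 0x10, 0x02: 0x11}   # guest virtual  -> guest physical (guest O/S)
hypervisor_map   = {0x10: 0xA0, 0x11: 0xA1}   # guest physical -> host physical  (ESXi)

def resolve(guest_virtual_page):
    guest_physical = guest_page_table[guest_virtual_page]  # translation by the guest O/S
    host_physical = hypervisor_map[guest_physical]         # translation by the hypervisor
    return host_physical

print(hex(resolve(0x01)))  # 0xa0
```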
This approach to memory virtualization means that each VM only sees its own memory usage while the ESXi host can allocate and manage memory across all running VMs. However, the ESXi host cannot determine when a VM frees up, or deallocates, guest physical memory, nor does a VM know when the ESXi host needs to allocate memory to other VMs. If guest physical memory across all running VMs (plus any necessary overhead memory) is equal to or less than the host’s physical memory, this is not a problem. But in cases where memory is overcommitted (i.e., aggregate guest memory is greater than the host’s physical memory), ESXi hosts will resort to memory reclamation techniques such as swapping and ballooning in order to reclaim memory from some VMs and allocate it to others. Resource overcommitment and memory reclamation strategies help optimize memory usage, but it’s important to monitor metrics that track ballooning and swapping, because excessive use of either can degrade VM performance.
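As a simple way to reason about overcommitment, the sketch below checks whether the total configured guest memory (plus a rough per-VM overhead allowance) exceeds a host’s physical memory. The overhead figure and VM sizes are hypothetical; actual overhead varies with VM configuration.

```python
# Illustrative overcommitment check: does aggregate guest memory demand
# (plus estimated virtualization overhead) exceed the host's physical memory?

def is_overcommitted(host_memory_mb, vm_configured_mb, overhead_mb_per_vm=100):
    demand = sum(vm_configured_mb) + overhead_mb_per_vm * len(vm_configured_mb)
    return demand > host_memory_mb

vm_sizes_mb = [4096, 8192, 8192, 16384]
print(is_overcommitted(host_memory_mb=32768, vm_configured_mb=vm_sizes_mb))  # True
```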
One recommended metric to set an alert on: Balloon driver (vmmemctl) capacity.
Each VM in vSphere can have a balloon driver (named vmmemctl, installed with VMware Tools). If an ESXi host runs low on physical memory that it needs to allocate (i.e., less than 6 percent free memory), it can reclaim memory from the guest physical memory of virtual machines. However, because ESXi hypervisors have no knowledge of which memory is no longer in use by VMs, they send requests to the balloon drivers to “inflate” by claiming unused memory inside the guest. The ESXi host can then deallocate the host physical memory mapped to the pages held by the “inflated” balloon driver and allocate it to other VMs. This technique is known as memory ballooning.
While ballooning can assist ESXi hosts when they are short on memory, if used too often it can degrade VM performance, since the guest operating system may later need memory that the balloon driver has claimed and the host has reclaimed. And if ballooning isn’t sufficient, ESXi hosts may begin to swap memory to disk to meet the memory demands of VMs, which results in a severe degradation in application performance.
If your environment is healthy and virtual machines are properly rightsized, memory ballooning should generally be unnecessary. To be proactive, set up an alert for any positive value of mem.vmmemctl, which indicates that the ESXi host is running low on available memory.
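As an illustration of that alert, the sketch below flags any positive mem.vmmemctl reading. The sample values and the way the metrics are supplied are placeholders; in practice the readings would come from your monitoring tool or the vSphere API.

```python
# Illustrative alert check: any sustained positive mem.vmmemctl value means
# the host has started reclaiming memory from this VM via ballooning.

def check_ballooning(samples_kb):
    """samples_kb: recent mem.vmmemctl readings (KB) for one VM."""
    if any(value > 0 for value in samples_kb):
        return "ALERT: ballooning active - host is under memory pressure"
    return "OK: no ballooning observed"

print(check_ballooning([0, 0, 0]))            # OK: no ballooning observed
print(check_ballooning([0, 524288, 812000]))  # ALERT: ballooning active
```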
One recommended metric to set an alert on: Memory swapped in/out.
When a virtual machine is powered on, its ESXi host creates physical disk storage files known as swap files. Swap file size is determined by the VM’s configured size, less any reserved memory. For instance, if a VM is configured with 3 GB of memory and has a 1 GB reservation, it will have a 2 GB swap file. By default, a VM’s swap files are co-located with its virtual disk, on shared storage.
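The swap file sizing rule is simple arithmetic, as the short sketch below shows (the figures mirror the example above; any other numbers are hypothetical).

```python
# Illustrative swap file sizing: configured memory minus any reservation.

def swap_file_size_mb(configured_mb, reservation_mb=0):
    return max(configured_mb - reservation_mb, 0)

print(swap_file_size_mb(configured_mb=3072, reservation_mb=1024))  # 2048 (2 GB swap file)
print(swap_file_size_mb(configured_mb=4096, reservation_mb=4096))  # 0 (fully reserved)
```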
If the host’s physical memory begins to run low and memory ballooning isn’t reclaiming sufficient memory quickly enough (or if the balloon driver is disabled or was never installed), the ESXi host will begin using this swap space to read and write data that would normally go to memory. This process is called memory swapping.
Reading and writing data to disk takes much longer than using memory and can drastically slow down a VM, so memory swapping should be regarded as a last resort. Keep swapping to a minimum by setting alerts that notify you of any spikes, so you can decide whether to resize virtual machines. If you notice an increase in swapping, also check the status of your VM balloon drivers, since swapping may indicate that ballooning has failed to reclaim sufficient memory.
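One simple way to catch such spikes, sketched below, is to compare the latest swap rate (e.g. mem.swapinRate or mem.swapoutRate) against a rolling baseline. The threshold factor, noise floor, and sample data are hypothetical; tune them to your environment.

```python
# Illustrative spike detection on swap rates: flag a sample that is well
# above the recent average and above a noise floor.

def is_swap_spike(history_kbps, latest_kbps, factor=3.0, floor_kbps=100):
    baseline = sum(history_kbps) / len(history_kbps) if history_kbps else 0
    return latest_kbps > floor_kbps and latest_kbps > factor * max(baseline, 1)

history = [0, 0, 20, 10, 0]                      # recent swap-in rate samples (KBps)
print(is_swap_spike(history, latest_kbps=15))    # False - normal noise
print(is_swap_spike(history, latest_kbps=900))   # True - investigate
```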
One recommended metric to watch: Active memory versus consumed memory.
In order for the VMkernel to accurately discern how much memory is actively in use by VMs, it would need to monitor every memory page that has been read from or written to. This process, however, would require too much overhead. Instead, the VMkernel uses statistical sampling to generate an estimate of each VM’s active memory usage. The VMkernel reports this estimate, measured in KB, as the mem.active metric. Consumed memory, measured by the mem.consumed metric, represents the amount of the underlying host’s memory that is allocated to a VM.
Active memory is a good real-time gauge of your VMs’ memory usage, and monitoring it alongside consumed memory can help you determine whether virtual machines have the right amount of memory allocated to them. If a VM’s active memory is consistently well below its consumed memory, the VM has more memory allocated to it than it needs, and as a consequence the host has less memory available for other VMs than it otherwise would. To address this, consider changing the VM’s configured size or memory reservation.
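A rightsizing check along those lines might look like the sketch below: flag VMs whose active memory stays well below consumed memory across every recent sample. The 25 percent ratio, VM names, and sample values are hypothetical.

```python
# Illustrative rightsizing check comparing mem.active to mem.consumed.

def oversized_vms(metrics, ratio=0.25):
    """metrics: {vm_name: list of (active_kb, consumed_kb) samples}."""
    flagged = []
    for name, samples in metrics.items():
        # Flag a VM only if every sample shows active memory far below consumed memory.
        if samples and all(active < ratio * consumed for active, consumed in samples):
            flagged.append(name)
    return flagged

metrics = {
    "web01": [(200_000, 2_000_000), (180_000, 2_000_000)],    # mostly idle allocation
    "db01":  [(1_500_000, 2_000_000), (1_700_000, 2_000_000)],
}
print(oversized_vms(metrics))  # ['web01'] - a candidate for downsizing
```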
One recommended metric to watch: Memory usage.
At the VM level, the mem.usage metric measures what percentage of its configured memory a VM is actively using. Ideally, a VM should not always be using all of its configured memory. If it is consistently using a large portion of its configured memory, the VM will be less resilient to any spikes in memory usage if its ESXi host cannot allocate additional memory. If this is the case, consider reconfiguring VM memory size, updating its memory allocation settings (shares, reservations, etc.), or migrating the VM to a cluster with more memory.
At the host level, memory usage represents the percentage of an ESXi host’s physical memory that is being consumed. If memory usage at the host level is consistently high, the host may not be able to provision memory to the VMs that need it, and it will need to resort to memory ballooning more often or may even begin swapping memory.
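A simple way to act on mem.usage at both levels is to classify each reading against warning and critical thresholds, as in the sketch below. The threshold values and entity names are hypothetical; set them according to your own alerting policy.

```python
# Illustrative threshold classification for mem.usage (percent), applicable
# to both VM-level and host-level readings.

def classify_memory_usage(usage_percent, warn=80.0, crit=95.0):
    if usage_percent >= crit:
        return "CRITICAL"
    if usage_percent >= warn:
        return "WARNING"
    return "OK"

readings = {"vm: app01": 62.0, "vm: db01": 88.5, "host: esxi-01": 97.2}
for entity, usage in readings.items():
    print(entity, classify_memory_usage(usage))
```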
Just how do we monitor performance? There are numerous options available, from the default vSphere performance charts and the various guest operating system performance monitors to third-party products from suppliers such as SolarWinds, DataDog, NinjaOne, ManageEngine OpManager, and Veeam (just to name a few!).
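If you prefer to script your own collection, the open-source pyVmomi library can query these counters directly from vCenter. The sketch below is a rough outline only: the hostname, credentials, VM lookup, and counter name are placeholders, certificate verification is disabled purely for brevity, and you should confirm counter names against your vSphere version.

```python
# Rough sketch: pull the mem.vmmemctl counter for one VM via pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.com", user="monitor@vsphere.local",
                  pwd="********", sslContext=ctx)
try:
    content = si.RetrieveContent()
    perf = content.perfManager

    # Map counter names (e.g. "mem.vmmemctl.average") to their numeric counter IDs.
    counter_ids = {f"{c.groupInfo.key}.{c.nameInfo.key}.{c.rollupType}": c.key
                   for c in perf.perfCounter}

    # Placeholder lookup: just take the first VM found; filter by name in practice.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = view.view[0]

    spec = vim.PerformanceManager.QuerySpec(
        entity=vm,
        metricId=[vim.PerformanceManager.MetricId(
            counterId=counter_ids["mem.vmmemctl.average"], instance="")],
        intervalId=20,   # 20-second real-time samples
        maxSample=1)
    for result in perf.QueryPerf(querySpec=[spec]):
        for series in result.value:
            print(vm.name, "mem.vmmemctl (KB):", series.value)
finally:
    Disconnect(si)
```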
To be honest, there are so many aspects to successfully implementing and using VMware vSphere that we could be here all week! If your head is spinning from the complexity and variety of configuration issues (and in this article we only discussed memory; there are three more resources, CPU, network, and storage, to consider as well), perhaps it’s time to get an expert in to look over your infrastructure and offer advice, or to get some in-depth, hands-on training with our vSphere Install, Configure and Manage course (or any of our other VMware courses). Either way, do contact us at SureSkills; we are Ireland’s premier VMware partner.
#SureSkills #VMware #virtualisation #training #Veeam #SolarWinds