
vSphere Storage Latency – View from the vmkernel

The storage latency data that vCenter provides is a 20-second average. With vRealize Operations, and other management tools that keep the data much longer, it is a 5-minute average. 20 seconds may seem short, but if the storage is doing 10K IOPS, that is an average of 10,000 x 20 = 200,000 reads or writes. An average of 200,000 numbers will certainly hide the peak. If 100 reads or writes experienced bad latency, but the remaining 199,900 operations returned fast, those 100 slow operations will be hidden.
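To make the arithmetic concrete, here is a minimal Python sketch with hypothetical numbers, showing how an average over 200,000 operations buries 100 slow ones:

```python
# A minimal sketch with hypothetical numbers: 200,000 IOs in a 20-second
# window, of which 100 were slow (100 ms) and the rest fast (0.5 ms).
latencies_ms = [0.5] * 199_900 + [100.0] * 100

average = sum(latencies_ms) / len(latencies_ms)
print(f"average: {average:.2f} ms")             # ~0.55 ms, looks healthy
print(f"worst:   {max(latencies_ms):.0f} ms")   # 100 ms, hidden by the average
```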

With esxtop, you can get the sampling interval down to 2 seconds. That's great, but you need to do it per ESXi host. If you have a large farm, that means logging into every host. Running it also has an impact on performance, so you do not want to do it all the time, 24 x 7. There is also the issue of presenting all that data on one screen.

The ESXi vmkernel actually logs when storage latency deteriorates or improves. It does this out of the box, so there is nothing to enable. This log entry is not an average. No, it does not log every single IO the vmkernel issues; that would generate far too many log entries. It only writes an entry when the situation improves or deteriorates, which is what you want anyway. As a bonus, the data is in microseconds, so it is also more granular if you need something more precise than milliseconds.

This is where vRealize Log Insight comes in. With just a few clicks, I got the screenshot below. Notice that Log Insight already has a variable (field) for VMware ESXi SCSI Latency, so all I needed to do was retrieve all the log entries where this field exists. From there, it is a matter of presentation. In the following screenshot, I grouped them by the Device ID (LUN). Only one device had this issue in the past, so my apologies for the limited example.

The line chart is an average. You can take the Max if you want to see the worst. I zoomed the chart to a narrow window of just 2 minutes and 15 seconds (from 18:06:00 to 18:08:15), and each data point represents 1 second. That is 1-second granularity, 20x finer than the 20-second average you get from vCenter.

Storage latency
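As a small illustration with made-up samples, this sketch shows why charting the Max per 1-second bucket exposes a spike that the Average smooths away:

```python
from collections import defaultdict

# Made-up samples: (timestamp in seconds, latency in microseconds).
samples = [(0.1, 800), (0.4, 900), (0.7, 25_000),   # one bad IO in second 0
           (1.2, 850), (1.9, 700)]

buckets = defaultdict(list)
for ts, latency_us in samples:
    buckets[int(ts)].append(latency_us)             # 1-second buckets

for second in sorted(buckets):
    values = buckets[second]
    avg = sum(values) / len(values)
    print(f"second {second}: avg={avg:.0f} us, max={max(values)} us")
```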

What do the charts look like when you have multiple LUNs and a storage issue? Here is an example. I have grouped them by Device, as it is easier to discuss with the storage admin if you can tell them the Device ID.

Max Storage Latency by LUN

Here is another example. This time the latency went up beyond 20,000 ms!

Max Storage Latency by LUN 2

Besides the line chart, you can also create a table. Log Insight automatically shows the fields above it. All I did was hide some of them (14 to be exact, as shown next to Columns). I find the table a useful complement to the chart.

You may be curious about what a field is. Log Insight comes with many out-of-the-box fields, provided by the content packs. To see how a default field is defined, just duplicate it, as I did below. Log Insight automatically highlights in green how it determines the field. In the example below, it looks for the strings “microseconds to” and “microseconds”, and the value between those strings is extracted as the field.

You can also set the type of the field. In this case, the field is specified as an Integer.

Storage latency - field
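As a rough illustration of the extraction described above, here is a Python sketch using a regular expression. The sample log line is illustrative (shortened device ID), modeled on the vmkernel's deteriorated-latency message:

```python
import re

# Illustrative log line, not a real entry.
line = ("Device naa.600508b1001c4d49 performance has deteriorated. "
        "I/O latency increased from average value of 1832 microseconds "
        "to 19403 microseconds.")

# The field value sits between the strings "microseconds to" and
# "microseconds", and its type is Integer.
match = re.search(r"microseconds to (\d+) microseconds", line)
if match:
    latency_us = int(match.group(1))
    print(latency_us)   # 19403
```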

You may think: why not search for the string “latency”, so we get everything? Well, this is what I got. There are many log entries with the word latency in them that are not relevant. I have 169 entries below.

Storage latency - text latency

The following screenshot shows more examples of the log entries.

Storage latency - text latency 2

That’s all. I hope you find it useful. If you do, you may also find this and this helpful.

vCenter and vRealize counters – part 3

Storage

If you look at the ESXi and VM metric groups for storage in the vCenter performance chart, it is not immediately clear how they relate to one another. You have the storage network, storage adapter, storage path, datastore, and disk metric groups to check. How do they impact one another?

I have created the following diagram to explain the relationships. The beige boxes are the objects you are likely to be familiar with: the ESXi host, which can have NFS datastores, VMFS datastores, or RDM objects. The blue boxes represent the metric groups.

Book - chapter 4 - 01

NFS and VMFS datastores differ drastically in terms of counters, as NFS is file-based while VMFS is block-based. NFS uses the vmnic, so the storage adapter types (FC, FCoE, or iSCSI) are not applicable, and multipathing is handled by the network, which is why you do not see it in the storage layer. For VMFS or RDM, you have more detailed visibility of the storage. To start with, each ESXi storage adapter is visible and you can check the counters for each of them. In terms of relationships, one adapter can have many devices (disk or CD-ROM). One device is typically accessed via two storage adapters (for availability and load balancing), and via two paths per adapter, with the paths diverging at the storage switch. A single path originates from a specific adapter and connects that one adapter to one device. The following diagram shows the four paths:

Book - chapter 4 - 02
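To summarize those relationships, here is a toy Python model with hypothetical adapter and device names; it simply enumerates the path combinations described above:

```python
# One adapter serves many devices, each device is reachable via two
# adapters, and each adapter contributes two paths per device (the paths
# diverge at the storage switch). Names below are hypothetical.
adapters = ["vmhba1", "vmhba2"]
devices = ["naa.001", "naa.002", "naa.003"]

# One path per (adapter, device, switch branch) combination.
paths = [(hba, dev, branch)
         for dev in devices
         for hba in adapters
         for branch in ("A", "B")]

for dev in devices:
    count = sum(1 for _, d, _ in paths if d == dev)
    print(dev, "has", count, "paths")   # 2 adapters x 2 branches = 4 paths
```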

A storage path takes data from the ESXi host to the LUN (the term used by vSphere is Disk), not to the datastore. So if the datastore has multiple extents, there are four paths per extent. This is one reason why I do not use more than one extent: each extent adds four paths. If you are not familiar with extents, Cormac Hogan explains them well in this blog post.

For VMFS, you can see the same counters at both the Datastore level and the Disk level. Their values will be identical if you follow the recommended configuration of a 1:1 relationship between a datastore and a LUN, meaning you present an entire LUN to a single datastore (using all of its capacity).

The following screenshot shows how we manage ESXi storage. Click on the ESXi host you need to manage, select the Manage tab, and then the Storage subtab. In this subtab, we can see the adapters, devices, and the host cache. The screen shows an ESXi host with the list of its adapters. I have selected vmhba2, which is an FC HBA. Notice that it is connected to 5 devices. Each device has 4 paths, so I have 20 paths in total.

ESX - Adapter

Let’s move on to the Storage Devices tab. The following screenshot shows the list of devices. Because NFS is not a disk, it does not appear here. I have selected one of the devices to show its properties.

ESXi - device

If you click on the Paths tab, you will be presented with the information shown in the next screenshot, including whether a path is active. Note that not all paths carry I/O; it depends on your configuration and multipathing software. Because each LUN typically has four paths, path management can be complicated if you have many LUNs.

ESXi - device path

The story is quite different at the VM layer. A VM does not see the underlying shared storage; it sees local disks only. So regardless of whether the underlying storage is NFS, VMFS, or RDM, the VM sees all of them as virtual disks. You lose visibility into the physical adapters (for example, you cannot tell how many IOPS on vmhba2 are coming from a particular VM) and the physical paths (for example, how many disk commands traveling on a given path are coming from a particular VM). You can, however, see the impact at the Datastore level and the physical Disk level. The Datastore counter is especially useful. For example, if you notice that your IOPS is higher at the Datastore level than at the virtual Disk level, it means you have a snapshot: the snapshot IO is not visible at the virtual Disk level, as the snapshot is stored on a different virtual disk.

Book - chapter 4 - 03
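Here is a minimal sketch of that reasoning, with made-up numbers and hypothetical names (these are not actual counter or API names):

```python
# If the Datastore-level IOPS for a VM is clearly higher than the sum of
# its virtual Disk IOPS, the extra IO hints at a snapshot (the delta disk
# is not visible at the virtual Disk level).
virtual_disk_iops = {"scsi0:0": 400, "scsi0:1": 150}
datastore_iops = 900    # what the Datastore counter reports for this VM

vdisk_total = sum(virtual_disk_iops.values())
if datastore_iops > vdisk_total:
    extra = datastore_iops - vdisk_total
    print(f"{extra} IOPS unaccounted for at the virtual Disk level -- "
          "check whether the VM is running on a snapshot")
```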

Network

My apologies that I cannot publish the information on Network, as it is not part of the free pages provided by the publisher. The information is covered in my book.

vCenter and vRealize counters – part 2

Compute

The following diagram shows how a VM gets its resources from ESXi. Unlike a physical server, a VM is given dynamic resources; they are not static. Contention, Demand, and Entitlement are concepts that do not exist in the physical world. It is a pretty complex diagram, so let me walk you through it.

Book - chapter 3 - 02

The tall rectangular area represents a VM. Say this VM is given 8 GB of virtual RAM; the bottom line represents 0 GB and the top line represents 8 GB. This configured amount is called Provisioned. It is what the Guest OS sees, so if it is running Windows, you will see 8 GB of RAM when you log into Windows.

Unlike on a physical server, you can configure a Limit and a Reservation. This is done outside the Guest OS, so Windows or Linux does not know about it. You should minimize the use of Limit and Reservation, as they make operations more complex.

Entitlement means what the VM is entitled to. In this example, the hypervisor entitles the VM to a certain amount of memory. I did not use a solid line, and used an italic font, to mark that Entitlement is not a fixed value but a dynamic value determined by the hypervisor. It varies every minute, determined by the Limit, Shares, and Reservation of the VM itself and by contention with the other VMs running on the same host.

Obviously, a VM can only use what it is entitled to at any given point in time, so the Usage counter does not go higher than the Entitlement counter. The green line shows that Usage ranges from 0 to the Entitlement value.

In a healthy environment, the ESXi host has enough resources to meet the demands of all the VMs on it with sufficient overhead. In this case, you will see that the Entitlement, Usage, and Demand counters are similar to one another when the VM is highly utilized. This is shown by the green line, where Demand stops at Usage and Usage stops at Entitlement. The numerical values may not be identical, because vCenter reports Usage in percentage as an average over the sample period, reports Entitlement in MHz taking the latest value in the sample period, and reports Demand in MHz as an average over the sample period. This also explains why you may see Usage slightly higher than Entitlement on a highly utilized vCPU. If the VM has low utilization, you will see the Entitlement counter much higher than Usage.
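Here is a toy illustration, with made-up per-interval samples, of how those different aggregation methods can make reported Usage appear higher than reported Entitlement within one sample period:

```python
# Within one sample period, Usage is averaged while Entitlement keeps
# only the latest value. All numbers are hypothetical.
usage_mhz       = [2000, 2400, 2600, 2200]   # averaged over the period
entitlement_mhz = [2100, 2500, 2700, 2250]   # only the last value is kept

reported_usage = sum(usage_mhz) / len(usage_mhz)   # 2300 MHz
reported_entitlement = entitlement_mhz[-1]         # 2250 MHz

# Usage never exceeded Entitlement at any single point in time, yet the
# reported Usage ends up higher than the reported Entitlement.
print(reported_usage > reported_entitlement)       # True
```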

An environment in which the ESXi host is resource constrained is unhealthy, as it cannot give every VM the resources it asks for. The VMs demand more than they are entitled to use, so the Usage and Entitlement counters will be lower than the Demand counter. The Demand counter can naturally go higher than Limit. For example, if a VM is limited to 2 GB of RAM and it wants to use 14 GB, then Demand will exceed Limit. Obviously, Demand cannot exceed Provisioned, which is why the red line stops at Provisioned; that is as high as it can go.

The difference between what the VM demands and what it gets to use is the Contention counter. Contention is a special counter that tracks this competition for resources; it is a counter that only exists in the virtual world.

So Contention, simplistically speaking, is Demand – Usage. I say simplistically because that is not the actual formula. The actual formula does not really matter for practical purposes, as it is all relative to the expectation you have set with your customers (the VM owners).
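Expressed in Python, the simplified relationship looks like this. As noted above, this is not the actual vR Ops formula, and the numbers are hypothetical:

```python
# Simplified: Contention is roughly Demand minus Usage.
def contention(demand_mhz: float, usage_mhz: float) -> float:
    """Approximate unmet demand; 0 means the VM used all it demanded."""
    return max(demand_mhz - usage_mhz, 0.0)

print(contention(3000, 3000))   # 0.0 -> no contention, physical-world feel
print(contention(3000, 2200))   # 800.0 MHz of demand went unserved
```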

Contention happens when what the VM demands is more than it gets to use. So if the Contention is 0, the VM can use everything it demands. This is the ultimate goal, as performance will match the physical world. This Contention value is useful to demonstrate that the infrastructure provides a good service to the application team. If a VM owner comes to see you and says that your shared infrastructure is unable to serve his or her VM well, both of you can check the Contention counter.

The Contention counter should become part of your Performance SLA or Key Performance Indicators (KPIs). It is not sufficient to track utilization alone. When there is contention, it is possible that both your VMs and your ESXi host have low utilization, and yet your customers (the VMs running on that host) perform poorly. This typically happens when the VMs are relatively large compared to the ESXi host. Let me give you a simple example. The ESXi host has two sockets and 20 cores; hyper-threading is not enabled, to keep the example simple. You run just 2 VMs, but each VM has 11 vCPUs. As a result, they cannot run concurrently: the ESXi VMkernel has to schedule them sequentially, as there are only 20 physical cores to serve 22 vCPUs. Here, both VMs will experience high contention.
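A back-of-the-envelope check of that example:

```python
# Two 11-vCPU VMs on 20 physical cores (hyper-threading off) cannot both
# run at once, so the scheduler must alternate them and both VMs
# experience contention even though utilization looks low.
physical_cores = 20
vcpus_needed = 2 * 11                  # two VMs, 11 vCPUs each

print(vcpus_needed)                    # 22
print(vcpus_needed > physical_cores)   # True -> cannot be co-scheduled
```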

Hold on! You might say, “There is no Contention counter in vSphere and no memory Demand counter either.”

This is where vR Ops comes in. It does not just regurgitate the values in vCenter. It has implicit knowledge of vSphere and a set of derived counters with formulae that leverage that knowledge.

You need to have an understanding of how the vSphere CPU scheduler works. The following diagram shows the various states that a VM can be in:

Book - chapter 3 - 03

The preceding diagram is taken from The CPU Scheduler in VMware vSphere® 5.1: Performance Study. This is a whitepaper that documents the CPU scheduler in good depth for VMware administrators. I highly recommend you read this paper, as it will help you explain to your customers (the application team) how your shared infrastructure juggles all those VMs at the same time. It will also help you pick the right counters when you create your custom dashboards in vRealize Operations.