vRealize Operations 7.0 enhances the widgets and dashboards, which lets us create a better user experience. With that, I’m happy to share the VMware ESXi Performance dashboard:
The dashboard above is color-coded. The idea is that a quick glance should confirm everything is green. You only need to look at a counter if it is not green.
Layout-wise, it’s split into 4 levels. Do click the image to enlarge it, as descriptions have been added to it. The dashboard shows Performance first, then Utilization. Can you guess why?
Performance: What counters define your ESXi Performance?
- We know that utilization is not performance. It’s related, but it’s not the same thing. An ESXi with low utilization could be a sign of something wrong. Could it be that CPU and RAM are waiting for disk? Could it be that the network is dropping packets?
- A high-performing ESXi is one that does its job well. It serves its workload easily. It’s not struggling to juggle the demands from all the VMs running on it. So performance must be measured in terms of how well the VMs are being served. There are 2 sub-dimensions to this.
- How bad is the problem? This covers the depth.
- How widespread is the problem? This covers the breadth.
- How bad the problem is can be quantified by taking the worst CPU Contention or RAM Contention experienced among all the VMs.
- How widespread the problem is can be quantified by the percentage of VMs facing contention.
- The 2 sub-dimensions complement each other. Together they give you insight into the performance of your ESXi. If you have very bad contention, but it only impacts a small percentage of VMs, then the problem is narrow. This could be a sign of monster VMs. If the worst contention is not that bad, but it impacts almost all VMs, then the ESXi itself is struggling.
- Do you know why I don’t add VM Disk Latency? Even on vSAN, the solution may not be on the ESXi you’re looking at.
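To make the depth and breadth idea concrete, here is a minimal Python sketch. It is not the super metric shipped with the dashboard. The `VmSample` structure, the function names, and all three thresholds (1% to count a VM as facing contention, 10% for "bad", 50% for "widespread") are assumptions picked purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """One VM's contention readings on a host (hypothetical structure)."""
    name: str
    cpu_contention_pct: float
    ram_contention_pct: float


def host_performance(vms, threshold_pct=1.0):
    """Return (depth, breadth) for one ESXi host.

    depth   = worst CPU or RAM contention experienced by any VM
    breadth = percentage of VMs whose worst contention exceeds threshold_pct
              (the threshold value is an assumption, not from the dashboard)
    """
    if not vms:
        return 0.0, 0.0
    worst_per_vm = [max(vm.cpu_contention_pct, vm.ram_contention_pct) for vm in vms]
    depth = max(worst_per_vm)
    affected = sum(1 for worst in worst_per_vm if worst > threshold_pct)
    breadth = 100.0 * affected / len(vms)
    return depth, breadth


def diagnose(depth, breadth, bad_depth_pct=10.0, wide_breadth_pct=50.0):
    """Classify the host. Both cut-offs are illustrative assumptions."""
    deep = depth >= bad_depth_pct
    wide = breadth >= wide_breadth_pct
    if deep and not wide:
        return "narrow"      # bad for a few VMs: possibly monster VMs
    if wide and not deep:
        return "widespread"  # mild but almost everywhere: the host is struggling
    if deep and wide:
        return "both"        # label added here; not a case discussed in the post
    return "ok"              # label added here; nothing stands out


if __name__ == "__main__":
    host = [
        VmSample("monster-vm", 25.0, 0.0),
        VmSample("web-01", 0.2, 0.0),
        VmSample("web-02", 0.0, 0.5),
        VmSample("db-01", 0.1, 0.0),
    ]
    depth, breadth = host_performance(host)
    print(depth, breadth, diagnose(depth, breadth))
```

In the sample data only one VM out of four is above the threshold, so the problem is deep but narrow, which matches the monster VM case described above.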
Utilization: Drive it high as you paid for the whole box
- Now that you can measure Performance, you have the confidence to drive utilization high. There is no need to add artificial headroom. Hence Utilization is shown below Performance, as it’s secondary.
- For RAM, both Consumed and Active are shown. If Active is low, there is no need to upgrade RAM, as Consumed contains disk cache. For me, it’s fine for Consumed to be 95% so long as RAM Contention is 0.
- For CPU, both Demand and Usage are shown. Do you know the difference between the two?
- Download the dashboard from VMware Code.
- Import the dashboard, view, and super metric.
- Enable the super metric in your base policy. Hope it’s a good introduction to the awesome power of super metrics!
- Replace your ESXi Summary page with this dashboard. My brother Sunny has documented how here.
Hope you find it useful. Next is the vSphere Cluster Performance dashboard.