The accuracy of Guest OS memory metrics has been debated for a long time. I’ve been at VMware for over a decade and remember the debate between Active RAM and Consumed. Then vR Ops gained visibility into the Guest OS, which was a major step forward. It is not the final solution yet, however, as we still have to consider applications.
Take a look at the following utilization diagram:
The first bar is a generic guidance. The 2nd bar is specific to RAM.
- When you spend on infrastructure, you want to use it well. After all, you pay for the whole box. So ideally, it’s 100%. The first bar above shows the utilization range (0% – 100%). Green is where you want the utilization to be. Below 50% is blue, symbolizing cold. The company is wasting money if the utilization is below 50%. So below the green zone, there is a wastage zone. On the other hand, utilization higher than 75% opens the risk that performance may be impacted. Hence the yellow and red thresholds. The green zone is actually a narrow band.
- In general, applications tend to work on a portion of their Working Set at any given time. A process does not touch all its memory all the time. As a result, the rest becomes cache. This is why it’s fine to have active + cache beyond 95%. If your ESXi is showing 98%, do not panic. Windows and Linux are doing this too. The modern-day OS is doing its job caching all those pages for you. So you want to keep the Free pages low.
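The generic guidance of the first bar can be sketched as a small classifier. This is only an illustration: the function name and zone labels are made up here, and the boundary between yellow and red is not stated in the text, so both are folded into a single "at risk" zone. As the second bullet explains, RAM is the exception, since high active + cache utilization is normal.

```python
def utilization_zone(percent: float) -> str:
    """Classify a generic utilization percentage into the zones described above."""
    if not 0 <= percent <= 100:
        raise ValueError("utilization must be between 0 and 100")
    if percent < 50:
        return "wastage"  # blue: paying for capacity that is not used
    if percent <= 75:
        return "green"    # the narrow band you want to be in
    return "at risk"      # yellow/red: performance may be impacted


for sample in (30, 60, 90):
    print(sample, utilization_zone(sample))
```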
Cache is an integral part of memory management, as the more you cache, the lower your chance of hitting a cache miss. This makes sense. RAM is much faster than Disk, so if you have it, why not use it? Remember when Windows XP introduced pre-fetch, and subsequently Windows SuperFetch? If not, here is a good read.
As you can read from the SuperFetch article, memory management is a complex topic. There are many techniques involved. Unfortunately, this is simplified in the UI. All you see is something like this:
Linux and VMkernel also do their fair share of simplifying this information. The Linux Ate My RAM page is pretty famous. For ESXi, a common misperception is “we are short on RAM, but fine on CPU”, when it is actually the other way around! To prove it, check the Max VM CPU Contention and Max VM RAM Contention counters for each cluster.
Windows Memory Counters
- In Use: this is what Windows needs to operate. Not all of it is actively in use though, which is why the VM Active Memory counter from the hypervisor can be lower than this. Notice that Windows compresses its active RAM even though it has plenty of Free RAM available. This is different behaviour from ESXi, which does not compress unless it’s running low on Free. The formula is In Use = Total - (Modified + Standby + Free)
- Committed: the currently committed virtual memory, although not all of it has been written to the pagefile yet. Commit can go up without In Use going up, as Brandon shares here.
- Commit Limit: physical RAM + size of the pagefile. Since the pagefile is normally configured to match the physical RAM, the Commit Limit tends to be 2x the physical RAM.
- Modified: pages that were modified but are no longer used. Their contents must be written to disk before the pages can be reused, so they are counted as part of Cached rather than Available. The API name is guest.mem.modifiedPages (Win32_PerfRawData_PerfOS_Memory#ModifiedPageListBytes)
- Standby: Windows has 3 levels of standby. They are:
  - guest.mem.standby.core (Win32_PerfRawData_PerfOS_Memory#StandbyCacheCoreBytes)
  - guest.mem.standby.normal (Win32_PerfRawData_PerfOS_Memory#StandbyCacheNormalPriorityBytes)
  - guest.mem.standby.reserve (Win32_PerfRawData_PerfOS_Memory#StandbyCacheReserveBytes)
- Free: guest.mem.free (Win32_PerfRawData_PerfOS_Memory#FreeAndZeroPageListBytes)
- Cached = Modified + Standby
- Available = Free + Standby
- Paged pool: this is a part of Cache Bytes. Based on this great article, it includes Pool Paged Resident Bytes, the System Cache Resident Bytes, the System Code Resident Bytes and the System Driver Resident Bytes.
- Non-paged pool: this is kernel RAM. It cannot be paged out. It’s part of In Use.
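The relationships between these counters can be sketched in a few lines of Python. This is a minimal illustration of the formulas listed above, not how Windows or vR Ops computes them. The class, its field names, and the sample numbers (in MB) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowsMemory:
    """Guest memory counters in MB. Field names are illustrative, not API names."""
    total: float
    modified: float
    standby_core: float
    standby_normal: float
    standby_reserve: float
    free: float

    @property
    def standby(self) -> float:
        # Standby is the sum of the 3 standby levels.
        return self.standby_core + self.standby_normal + self.standby_reserve

    @property
    def cached(self) -> float:
        # Cached = Modified + Standby
        return self.modified + self.standby

    @property
    def available(self) -> float:
        # Available = Free + Standby
        return self.free + self.standby

    @property
    def in_use(self) -> float:
        # In Use = Total - (Modified + Standby + Free)
        return self.total - (self.modified + self.standby + self.free)


def commit_limit(physical_ram: float, pagefile: float) -> float:
    """Commit Limit = physical RAM + size of the pagefile."""
    return physical_ram + pagefile


# Made-up sample values for a 16 GB VM.
vm = WindowsMemory(total=16384, modified=200, standby_core=300,
                   standby_normal=4000, standby_reserve=1500, free=2000)
print("Cached:", vm.cached)
print("Available:", vm.available)
print("In Use:", vm.in_use)
print("Commit Limit:", commit_limit(16384, 16384))
```

Note that In Use plus Modified, Standby, and Free always adds back up to Total, which is a handy sanity check when you pull these counters through the API.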
As a result, determining what Windows actually uses is difficult. If the UI is not clear, the API is even more challenging. The names in the API do not map to the names in the UI.
Linux Memory Counters
As you can guess from above, Linux does it differently 🙂 There are 5 counters that we’re interested in:
- Total: guest.mem.total (/proc/meminfo#MemTotal)
- Buffers: guest.mem.buffers (/proc/meminfo#Buffers)
- Cache: the sum of the following 2 counters:
  - guest.mem.cached (/proc/meminfo#Cached)
  - guest.mem.slabReclaim (/proc/meminfo#SReclaimable)
- Available: guest.mem.available (/proc/meminfo#MemAvailable since Linux 3.14)
- Free: guest.mem.free (/proc/meminfo#MemFree)
Used = total - free - buffers - cache
Cache = guest.mem.cached + guest.mem.slabReclaim
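To make the Linux formula concrete, here is a small sketch that parses /proc/meminfo-style text and applies it. The function names and the sample values are made up for illustration. On a real Linux guest you could pass the contents of /proc/meminfo instead of the sample string.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: value} (values are in kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def linux_used_kb(meminfo: dict[str, int]) -> int:
    """Used = total - free - buffers - cache, where cache = Cached + SReclaimable."""
    cache = meminfo["Cached"] + meminfo["SReclaimable"]
    return meminfo["MemTotal"] - meminfo["MemFree"] - meminfo["Buffers"] - cache


# Hypothetical sample, trimmed to the fields discussed above.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    5500000 kB
Buffers:          200000 kB
Cached:          4000000 kB
SReclaimable:     300000 kB
"""

info = parse_meminfo(SAMPLE)
print("Used (kB):", linux_used_kb(info))
print("Available (kB):", info["MemAvailable"])
```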
vR Ops Counters
How do the above Guest OS metrics appear in vR Ops? VMware vSphere Tools comes to the rescue! It provides a set of counters (details here), which is crucial as the hypervisor can’t see inside the Guest OS.
5 July 2020: Special thanks to Wasantha Jayawardane for pointing out a mistake in my blog. The following section has been corrected.
Take Windows for example. There are 4 counters (Modified and the 3 Standby levels), so the result can vary depending on which of these 4 you take.
Example 1: Standby Normal fluctuates and can be as high as 90%. The other 2 Standby counters remain constantly negligible. The chart below is based on >26000 samples, so there is plenty of chance for each of the 3 counters to fluctuate.
Example 2: This time around, the VM usable memory was increased 2x in the last 3 months. Standby Normal hardly moved, but Standby Reserve took advantage of it.
As a result, vR Ops Needed Memory is not based on the above 4 counters. Instead, the formula is
Needed Memory = Physical Memory - MAX(0, memAvailable - 5% of Physical Memory)
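The formula is easy to try out with your own numbers. The sketch below is only a restatement of it in Python, not the actual vR Ops implementation. The function and parameter names are made up.

```python
def needed_memory(physical: float, available: float,
                  headroom_fraction: float = 0.05) -> float:
    """Physical Memory - MAX(0, memAvailable - 5% of Physical Memory).

    Both inputs must use the same unit (for example MB).
    """
    headroom = headroom_fraction * physical
    return physical - max(0.0, available - headroom)


# Hypothetical VM with 16000 MB of RAM.
print(needed_memory(physical=16000, available=8000))  # plenty of Available memory
print(needed_memory(physical=16000, available=500))   # Available is below the 5% headroom
```

In the first call, everything in Available beyond the 5% headroom is subtracted, so Needed lands well below the configured RAM. In the second call, Available is already below the headroom, so the MAX term is 0 and Needed equals the full physical memory, which is the early warning.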
My point is that determining the required memory accurately depends on the application. And no, a complaint from the Apps Team is not a reliable metric at all.
It’s impossible to know the needs of each app. As a result, vR Ops Needed Memory is designed to give an early warning, so you can take your time to investigate and discuss with the application team.
For higher accuracy, use application-level monitoring such as Blue Medora, or install an agent such as Telegraf.
The change in vR Ops 6.7 is the mapping of VM counters. We did not come up with a new counter. All the counters have been available since vR Ops 6.3, and I explained the counters in this post. In 6.7, the Memory Workload counter and Memory Usage now map to Guest OS instead of Active. If the Guest OS counter is not available, it falls back to Consumed. This is why you start seeing a jump in memory usage. It’s not that vR Ops 6.7 chooses the wrong counter. It’s merely using a different counter.
- vR Ops 6.6 uses VM Active
- vR Ops 6.7 uses Guest OS, then VM Consumed
Do not use Active for the Guest OS. It’s a counter for ESXi.
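The 6.7 fallback behaviour described above can be pictured like this. It is a conceptual sketch only, not vR Ops code. The function name, parameter names, and returned labels are hypothetical.

```python
from typing import Optional


def memory_usage_source(guest_os_needed: Optional[float],
                        consumed: float) -> tuple[str, float]:
    """Pick the value behind Memory Usage, following the 6.7 mapping.

    Prefer the Guest OS counter. If it is missing (for example because the
    in-guest counters are not being collected), fall back to Consumed.
    """
    if guest_os_needed is not None:
        return ("guest_os", guest_os_needed)
    return ("consumed", consumed)


print(memory_usage_source(guest_os_needed=9000.0, consumed=15000.0))
print(memory_usage_source(guest_os_needed=None, consumed=15000.0))
```

Because Consumed is typically much higher than Active, a VM that falls back to Consumed shows the kind of jump in memory usage mentioned above.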
What counters to use?
It depends on the context. Are you troubleshooting performance or analysing for right sizing? Performance is about “Does the application get what it needs?”. The long term or big picture is less relevant. Right sizing is about “Overall, what is the right amount needed?”. You look at the long term. A 5-minute burst should not dictate your overall sizing recommendation.
- For Performance:
  - VM Memory Contention. There is no point increasing or shrinking the Guest if the problem is with the hypervisor.
  - Guest OS Page-In Rate. Heavy page-in may impact performance. Page-In is not a metric for capacity due to pre-fetch.
  - Guest OS Memory Needed.
- For Capacity:
  - For best performance: Guest OS Memory Needed
  - For lowest cost: Guest OS Used (needs agent), Guest OS Cache (needs agent)
I hope the article helps to clarify. What counter do you want to use in a future version of vR Ops? Discuss with your TAM or SE, as they will champion you internally at VMware. If you do not have one, let me know at email@example.com.