
vSphere visibility for Network Team

This post continues from the Operationalize Your World post. Do read it first so you get the context.

Similar to the problem faced between the Storage Team and the Platform Team, the VMware Admin needs to reach out to the Network Team. A set of purpose-built dashboards will enable both teams to look at an issue from the same point of view.

The dashboards must answer the following basic questions for Network Team:

  1. What have I got in this virtual environment?
    • What is the virtual network configuration? What are the networks, and how big are they?
    • We have Distributed vSwitches, Distributed Port Group, Datacenter, Cluster, ESXi, etc. How are they related?
    • Who are the consumers of my network? Where are they located?
  2. Are they healthy?
    • Do we have any errors in our networks? Which port groups see dropped packets? If there is a problem, which VMs or ESXi hosts are affected?
    • Do we have too many special packets (broadcast, multicast, and unknown packets)? Who generates them, and when?
  3. Are they optimized?
    • Just because something is healthy does not mean it is optimized. Look for opportunities to right-size.

Once the Network Admin knows what they are facing, they are in a better position to analyse:

  1. Utilization
    • Is any VM or ESXi near its peak?
    • Who are the top consumers for each physical datacenter?
    • How is the workload distributed in this shared environment?
  2. Performance
    • When a VM Owner thinks the network is the culprit, can both the Network Team and the Platform Team verify that quickly?
  3. Configuration
    • Are the configurations consistent?
    • Do they follow best practices?

Once we know the questions and requirements, we can plan a set of dashboards. I’ll show some sample dashboards to get you going. They follow the dashboard best practices.

What have I got? 

This first dashboard provides overall visibility to the Network team. It gives insight into the SDDC.

  • It shows the total environment at a glance. A Network Admin can see how the virtual network maps to the virtual environment (ESXi, vCenter Datacenter, vCenter).
  • It quickly shows the structure of the virtual network. For each Distributed vSwitch, you can see its port groups and the ESXi hosts connected to it. You see the config of both objects (port group and ESXi), so you can spot inconsistent configuration.
  • The heatmap quickly shows all port groups by size, so you can find out your largest ones easily. The color code also lets you see which ones are used the most.

overview

Once you know your environment, you are in a position to do monitoring. I don’t feel comfortable doing monitoring unless I have the big picture. It gives me context.

Are they healthy?

The next dashboard quickly shows whether there are dropped packets or error packets in your network.

  • The first line chart shows the maximum packets dropped among all Distributed Switches, so if any switch drops packets, it will show up.
  • The second line chart sums all the error packets.

Both line charts are color coded, and you should expect to see green. Green means no dropped packets and no error packets.

performance

The dashboard above has interaction, allowing you to drill down when the line charts are not showing green.

  • If you do not see green, you can drill down into each Distributed Switch.
    • As a virtual switch can span thousands of ports, it helps if you can drill down by Port Group and ESXi host. The dashboard automatically shows the relevant Port Group and ESXi of the distributed switch.
  • If there is a need to, you can even drill down to an individual VM.
    • The table at the bottom is collapsed by default. Expand it and you’ll see all VMs with their dropped-packet information.

Other than dropped packets and error packets, the Network Admin can also check for multicast, broadcast, and unknown packets. You don’t want to have too many of them zipping around your DC.

special-packets

The same concept is applied, so the Network team only needs to learn the dashboard once.

The line charts show the total broadcast and total multicast packets. We are not doing an hourly average because, at this global level, the law of large numbers keeps the line smooth. A significant deviation is required to move the number, so if there is a big jump, you know something is amiss.

Just like the previous dashboard, this dashboard lets you drill down too. You can check which VMs are generating broadcast packets.

Is Utilisation running high? 

Another factor that can impact performance is high utilization. Network team can see the total utilization of the network. The following dashboard answers questions such as:

  • What’s the total workload hitting our physical switches?
  • Is the total workload increasing?
  • Any crazy pattern in utilization? Any sudden spike that should not have happened?

capacity

Line charts are used instead of a single number, as you can see pattern over time. In fact, 2 line charts are provided: detailed and big picture.

The chart gives you the total throughput hitting the physical switches, so you know how much bandwidth is being generated. The chart defaults to 1 month, as this is more of a long-term view, not really for troubleshooting.

Based on the line charts, you can drill down into a specific time period where the peak was high. The Top-N lists the ESXi hosts with the highest usage. Click on one, and its detailed utilization will be shown. You can see if any of its vmnics is near the physical limit. The super metric takes into account both RX and TX, as you can hit a limit on either.

If you see one ESXi host is saturated while others are barely used, your workload is not well distributed. Note that vSphere 6.0 DRS does not consider network utilization, so imbalance is possible. vSphere 6.5 DRS takes network into account.

Customisation:

  • You can add the total line chart as above, but for VM.
    • VM traffic does not include hypervisor traffic (vMotion, management network, IP storage). So it’s pure business workload.
    • We should be expecting this number to slowly rise, reflecting growth.
    • A sudden spike is bad, and so is a sudden drop. We can turn on analytics on it so you get an alert.
  • For details on how to do it, see http://virtual-red-dot.info/is-any-of-your-esxi-vmnics-saturated/

How is the workload distributed?

A Distributed Switch does not span beyond a vSphere Data Center, so the data center is a logical place to start analysing the traffic. The following dashboard compares the workload of each data center. Using a color code makes it easy to see which DC has a high workload.

You can drill down inside the datacenter object. Click on it, and all the ESXi hosts and port groups connected to it will be automatically shown. Click on an ESXi host, and you can drill down into a VM.

workload-distribution

You can change the threshold (limitation: the same limit applies to every DC) to suit your needs.

Who are the Top Talkers? 

Another cause of performance problems is VMs consuming excessive network resources, or a peak period where the total is simply much higher than during a low period. The next dashboard provides 2 line charts. Again, line charts are used so you can see the pattern.

The table provides a list of VMs, sorted by their peak utilization. You can find out who the bursty users are (highest 5-minute value), and who the sustained users are (highest 1-hour and 1-day values).

top-consumers

This example is only for VMs. We can build one for ESXi hosts if that’s needed; the concept is the same.
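The bursty vs. sustained distinction is easy to reproduce outside vROps too. Below is a minimal Python sketch, assuming you have exported a VM’s 5-minute network usage samples; the sample data, the pandas library choice, and the numbers are my own illustration, not part of the dashboard.

# A minimal sketch (not the dashboard itself): classify a VM's network usage
# as bursty or sustained from its 5-minute samples.
import pandas as pd

# Assumed input: one week of 5-minute network usage samples (KBps), indexed by timestamp.
idx = pd.date_range("2016-01-01", periods=12 * 24 * 7, freq="5min")
usage = pd.Series(100.0, index=idx)              # flat baseline for illustration
usage.iloc[500] = 9000.0                         # one short burst

peak_5min = usage.max()                          # bursty view: highest 5-minute sample
peak_1hour = usage.resample("1h").mean().max()   # sustained view: highest hourly average
peak_1day = usage.resample("1D").mean().max()    # sustained view: highest daily average

print(f"5-min peak:  {peak_5min:.0f} KBps")
print(f"1-hour peak: {peak_1hour:.0f} KBps")
print(f"1-day peak:  {peak_1day:.0f} KBps")
# A VM with a high 5-min peak but low 1-hour/1-day peaks is a bursty user;
# a VM where all three are high is a sustained user.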

Is Network the culprit?

Lastly, it’s all about the Service, not the System or Architecture. When a VM Owner complains that the IaaS is causing a problem, the Network Admin and the VMware Admin can quickly look at the same dashboard and agree whether the network is the culprit or not.

vm-troubleshooting

Hope you find the material useful. If you do, go back to the Main Page for the complete coverage of SDDC Operations. It gives you the big picture so you can see how everything fits together. If you already know how it all fits, you can go straight to download here.

VMware SDDC Architecture: Network

In the previous article, I covered the requirements and overall architecture. We have also covered the Compute architecture. To some extent, I’ve covered the Storage architecture, as it uses VSAN (we will dive into it in a future blog). In this blog, I will cover the Network architecture. I’m not a Network Architect, and have benefited from great blogs by Ivan and Scott.

Logical Architecture

There are 4 kinds of network in VMware SDDC, namely:

  1. VMkernel network
  2. ESXi Agents network
  3. VM network
  4. iLO network

The VMkernel network has grown as VMware adds more capabilities into the ESXi kernel. In vSphere 6, you may need up to 8 IP addresses for each ESXi host. Since they are on the physical network, you will need a VLAN for each. While you can place the VMkernel networks on VXLAN, for simplicity and cleaner separation, I’d put them on VLAN. Operationally, I’ve put them on 2-digit VLAN IDs, so it’s easier for anyone in the company to remember that VLANs 10 – 99 are for VMkernel.

network 1

There are many articles on the various VMkernel networks, so I will just touch on them briefly:

  • Management Network. This is predominantly for ESXi. I put all of them on the same VLAN to prevent operations from becoming too complex. There is no need for isolation as this is out of band; VM traffic does not go here.
  • vMotion network. I keep them on separate VLANs as there is no need to vMotion a VM across cluster types. For example, in the Network Edge clusters, the only VMs living there will be NSX Edge VMs and NSX Distributed Router VMs; there is no need for other VMs to live in those clusters. To minimize human error and ensure the segregation, the vMotion network does not go across different types of clusters. Let’s take another example to make it clearer. In the VDI cluster type, we will have 1 – 5 clusters per physical DC. They can vMotion among these 5 VDI clusters, so the 5 clusters form just 1 logical pool. There is no business need for a VDI VM to vMotion to the Management Cluster. Also, the VDI server infrastructure (e.g. Horizon View Connection Server) lives in the Management Cluster, so the separation helps simplify operations. This interface needs 2 IP addresses if you are doing multi-NIC vMotion.
  • Fault Tolerance network. I apply the same restriction to prevent a VM and its shadow VM from spanning 2 different types of cluster.
  • Storage network. This can be NFS, iSCSI or VSAN. From the above diagram, you can see that I share the Storage network between the Server workload, Desktop workload and Non Production clusters. To keep things simpler, you can actually share it with the Management and Edge clusters too, which means you only have 1 VLAN. The reason it is safe to do that is that there is no live migration across cluster types: a VM has to be shut down to move, as there is no shared vMotion network. You also cannot have an FT VM spanning cluster types, as there is no FT network across them.
  • vSphere Replication network. Having a separate network makes monitoring easier.
  • VXLAN network. This is the network where the VM traffic will be tunnelled. Having a separate network makes monitoring easier.

The above will need 6 – 7 IP addresses per host. Plan your IP addresses carefully. I personally prefer an easy correlation to my ESXi host names. For example, ESXi-01 in my environment will have the x.x.x.1 address, and ESXi-99 will have the x.x.x.99 address.
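To make the convention concrete, here is a minimal Python sketch that generates such a plan. The VLAN IDs and subnets in it are made-up examples of the 2-digit VLAN and x.x.x.<host number> convention, not my actual design.

# A minimal sketch of the addressing convention: each VMkernel network gets a
# 2-digit VLAN and its own /24, and ESXi-NN always gets the .NN address.
# The VLAN IDs and subnets below are made-up examples, not the actual design.
VMKERNEL_NETWORKS = {
    "Management":          {"vlan": 10, "subnet": "10.10.10"},
    "vMotion":             {"vlan": 20, "subnet": "10.10.20"},
    "Fault Tolerance":     {"vlan": 30, "subnet": "10.10.30"},
    "Storage":             {"vlan": 40, "subnet": "10.10.40"},
    "vSphere Replication": {"vlan": 50, "subnet": "10.10.50"},
    "VXLAN (VTEP)":        {"vlan": 60, "subnet": "10.10.60"},
}

def vmkernel_plan(host_number: int) -> dict:
    """Return the VLAN and IP of every VMkernel interface for ESXi-<host_number>."""
    return {
        name: {"vlan": net["vlan"], "ip": f"{net['subnet']}.{host_number}"}
        for name, net in VMKERNEL_NETWORKS.items()
    }

# ESXi-01 gets x.x.x.1 on every VMkernel network, ESXi-99 gets x.x.x.99, etc.
for name, cfg in vmkernel_plan(1).items():
    print(f"ESXi-01 {name:<20} VLAN {cfg['vlan']:<3} {cfg['ip']}")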

ESXi Agent Network

Data Center components are moving into the hypervisor. When they move, they either move into the kernel as a kernel module (e.g. VSAN), or they take the VM form factor. Examples of the VM form factor are the Nutanix storage VM and Trend Micro Deep Security. Regardless, you need an IP address for every ESXi host.

This network is not a VMkernel network; it is a VM network. However, it is backed by VLAN, not VXLAN. That means these agents are on the physical network and are given physical IP addresses, so you need to plan them.

Now that we’ve covered the ESXi networks, let’s move to the VM Network.

VM Network

All of them, without exception, will be on VXLAN. This decouples them from the physical DC network; in other words, we virtualize the VM network. Defining it in software allows inter-DC mobility. There will be many VXLAN networks, so I need to plan them carefully too. In the above diagram, I have grouped them into 4 top-level groups. I’d give each group its own range, so I know what kind of workload is running given a VXLAN number. Here is an example:

  • VXLAN 10000 – 19999: Server Workload
  • VXLAN 20000 – 29999: DMZ Workload
  • VXLAN 30000 – 39999: Desktop Workload
  • VXLAN 40000 – 49999: Non Production Workload

I have a wide range for each group as I have sub-categories. Yes, this means you will have a lot more VXLANs than you do VLANs. This is a fundamental difference between networking in the SDDC and networking prior to network virtualization. You do not want too many VLANs, as they are operationally taxing. VXLAN does not have that issue: networks become cheap, and you can have lots of them. For example, the server workload is split per application. If I give each application up to 10 networks, I can have 1,000 applications. By having 10 networks per application, I can have a numbering convention, as the sketch after this list illustrates. For example:

  • Web server network: xxxx1. Examples are 10001, 10011, 10021, and 10031 for the first 4 applications. This means I know that anything on 1xxx1 is a production web server network.
  • Application server network: xxxx2
  • DB server network: xxxx3
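Here is a minimal Python sketch of how such a numbering convention can be decoded; the ranges follow the example above, while the function itself is purely illustrative.

# A minimal sketch: decode a VXLAN ID into workload group, application slot
# and tier, following the example numbering convention above.
WORKLOAD_RANGES = {
    1: "Server Workload",
    2: "DMZ Workload",
    3: "Desktop Workload",
    4: "Non Production Workload",
}
TIERS = {1: "Web server", 2: "Application server", 3: "DB server"}

def decode_vxlan(vxlan_id: int) -> str:
    if not 10000 <= vxlan_id <= 49999:
        return f"VXLAN {vxlan_id}: outside the planned ranges"
    group = WORKLOAD_RANGES[vxlan_id // 10000]
    app_slot = (vxlan_id % 10000) // 10      # up to 1,000 applications per group
    tier = TIERS.get(vxlan_id % 10, "Reserved / other")
    return f"VXLAN {vxlan_id}: {group}, application slot {app_slot}, {tier}"

print(decode_vxlan(10001))   # Server Workload, application slot 0, Web server
print(decode_vxlan(10031))   # Server Workload, application slot 3, Web server
print(decode_vxlan(20013))   # DMZ Workload, application slot 1, DB server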

Lastly, but certainly not least, you should have the iLO network for lights-out management. This is the management network for the physical boxes.

Physical Architecture

[10 Nov 2015: I got a correction from Raj Yavatkar and T. Sridhar that we should not have the spine connect to the northbound switch; if you do that, it creates some interesting issues. Spines should be devoted to carrying inter-rack, East-West traffic. I will update the diagrams once I have some time. Thanks Raj and Sridhar for the correction. Much appreciated].

As the SDDC Architect, you need to know how to implement the above logical architecture. It is software defined, but the hardware plays an important role. If necessary, tap the expertise of a Network Architect. In my case, I have requested YJ Huang from Arista to help me. I also benefited from Ivan Pepelnjak’s post here.

We start with the base connectivity. The diagram below shows 2 ESXi hosts and how they are physically connected to the network devices. I am using 2x 10 GE cables for the data network, and 1x 1 GE cable for iLO. There is no need for HA on the iLO network. In most cases, 2x 10 GE is more than enough; know your workload before you decide on 4x 10 GE.

network 1-1

Now that we’ve covered the basics, let’s see what the overall picture looks like when we attach all the ESXi hosts. The diagram below shows the 2 physical data centers. They have an identical physical setup, but different IP addresses. Data Center 1 could have 10.10.x.x, while Data Center 2 has 20.20.x.x. By not extending the physical network, you contain the failure domain.

The diagram shows how my 5 clusters are connected to the switches. I use a spine-leaf design as I want to be able to scale without major re-architecting. I’ve drawn the future switches in grey; they naturally come in pairs. I draw the spine-leaf connections thicker as they are 40G.

network 2

Let’s see how the architecture scales to the requirements, which are 2,000 server VMs and 5,000 VDI VMs. As you can see, it’s essentially an extension; fundamentally, it remains the same. I do change the cluster placement to keep it simpler, which comes at the cost of re-cabling.

Architecture 2000

You may be wondering why I use 40G between spine and leaf. For the VM network, 10G is more than sufficient. The reason is the VMkernel network: the vMotion boundary cuts across pods.

network 5

I hope you find this useful. Keen to hear your thoughts!

Is any of your ESXi vmnics saturated?

You have a lot of ESXi hosts, and you know that traffic such as vMotion and VSAN can consume a lot of bandwidth. With vRealize Operations, you can see whether any of the vmnics (the physical NICs) across all ESXi hosts is hitting its limit. Since the network sports full-duplex transmission, we need to prove that neither Transmit (TX) nor Receive (RX) hits the limit. The limit can be 1 GE or 10 GE.

An ESXi typically has 2x 10 GE, or many 1 GE. In my example below, each ESXi has 4 vmnics. That means I need to check 4 x 2 = 8 counters, and make sure none of them hit a limit.

So I need to have the following equation for each ESXi Host:

Max (vmnic0 TX, vmnic0 RX, vmnic1 TX, vmnic1 RX, vmnic2 TX, vmnic2 RX, vmnic3 TX, vmnic3 RX)
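Before building it in vROps, here is a minimal Python sketch of the same logic, assuming you already have the per-vmnic RX/TX values in KBps; the sample numbers and the 10 GE threshold are made up for illustration.

# A minimal sketch of the per-host check: take the max across all vmnic
# TX/RX counters (KBps), then convert to Mbps the same way the super metric
# below does (* 8 to get kilobits, / 1024 to get megabits).
vmnic_kbps = {                      # made-up sample values for one ESXi host
    "vmnic0": {"rx": 120000.0, "tx": 80000.0},
    "vmnic1": {"rx": 45000.0,  "tx": 910000.0},
    "vmnic2": {"rx": 30000.0,  "tx": 20000.0},
    "vmnic3": {"rx": 15000.0,  "tx": 10000.0},
}

def busiest_vmnic_mbps(counters: dict) -> float:
    """Return the highest single-direction vmnic throughput, in Mbps."""
    peak_kbps = max(v for nic in counters.values() for v in nic.values())
    return peak_kbps * 8 / 1024

peak = busiest_vmnic_mbps(vmnic_kbps)
print(f"Busiest vmnic direction: {peak:.0f} Mbps")
if peak > 0.8 * 10_000:             # e.g. flag anything above 80% of a 10 GE link
    print("At least one vmnic is approaching saturation")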

I will use vRealize Operations to implement the above idea. You can use either 5.x or 6.x for this. The example below uses 6.0.1.

The first step is to create a new super metric. Since I need to apply the formula to the ESXi host in question, I need to use the This option. Click on the little blue icon above the formula bar, then double-click on the metric you want. Follow the steps in the screenshot below, as it’s different from a super metric that does not use the This option.

  1. Click the This icon. In the screenshot, I’ve clicked it.
  2. Choose the Object Type: ESXi Host.
  3. Choose any of your ESXi hosts. It does not matter which one, as the name will be replaced with “This Resource“.
  4. Choose the counter you want to add to the super metric. Double-click on it.

00

I performed the above steps repeatedly, 10 times to be exact. Yes, I added vmnic4 as I needed to verify whether the formula would result in an error, since my ESXi does not have a vmnic4. Why did I do this? In an environment with hundreds of ESXi hosts globally, you may have variances, so you want to create 1 super metric and apply it to all.

I then did the usual preview, to verify that it works. Notice the actual numbers and pattern of the chart.

10 Is any of my vmnics saturated

I know you don’t want to double click so many times, so here it is for you to copy-paste:

Max ([
Max(${this, metric=net:vmnic0|received_average}),
Max(${this, metric=net:vmnic0|transmitted_average}),
Max(${this, metric=net:vmnic1|received_average}),
Max(${this, metric=net:vmnic1|transmitted_average}),
Max(${this, metric=net:vmnic2|received_average}),
Max(${this, metric=net:vmnic2|transmitted_average}),
Max(${this, metric=net:vmnic3|received_average}),
Max(${this, metric=net:vmnic3|transmitted_average}),
Max(${this, metric=net:vmnic4|received_average}),
Max(${this, metric=net:vmnic4|transmitted_average})
]) * 8 / 1024

In the formula above, I manually added * 8 / 1024 so the value gets converted from KBps to Mbps (multiply by 8 to go from kilobytes to kilobits, then divide by 1024 to go from kilobits to megabits).

One habit that I recommend when creating super metrics is to always verify the formula. In the screenshot below, I manually plotted each vmnic’s TX and RX, giving me 8 lines. Notice that the Max of these 8 lines matches the super metric I created above. With this, I have proof that it works as expected.

10 Is any of my vmnics saturated -2

As usual, do not forget to associate the super metric with the object type, and also enable it in your policy. You should know the drill by now 🙂 If you are not sure of the steps, search my blog.

10 Is any of my vmnics saturated - 3 - see my blog on enabling it

The above is great for 1 ESXi host. You can enable an alert on it, and since you have visibility at the ESXi level, you know for sure which ESXi host was affected.

You may say that with DRS and HA, you should also apply it at the cluster level. OK, we can do that too. We just need to create another super metric. Since I already have the data for each host, all I need to do is apply Max to the host-level super metric.

11 applying at cluster level

As usual, I did a preview on the above screen, and then verified it manually on the following screen. Notice that the patterns and numbers match!

11 applying at cluster level - verifying

The above works for a cluster. What if you need to apply it at a higher level? You need to adjust the depth= parameter. In the example below, I use the World object. Notice I set depth=4, as there are 4 levels between the ESXi host and the World object:

  1. Cluster
  2. vCenter Data Center
  3. vCenter
  4. World
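The roll-up idea itself is simple. Here is a minimal Python sketch of it (plain Python, not vROps super metric syntax), using a made-up hierarchy and made-up per-host values.

# A minimal sketch of the roll-up: the host-level super metric is computed per
# ESXi host, and every ancestor (Cluster, Datacenter, vCenter, World) simply
# takes the Max of the hosts underneath it. The names and values are made up.
host_mbps = {                        # host-level super metric values (Mbps)
    ("vc01", "dc01", "cluster01", "esxi-01"): 7100.0,
    ("vc01", "dc01", "cluster01", "esxi-02"): 3200.0,
    ("vc01", "dc02", "cluster02", "esxi-03"): 8900.0,
    ("vc02", "dc03", "cluster03", "esxi-04"): 1500.0,
}

def rollup_max(prefix: tuple) -> float:
    """Max of the host metric across all hosts under the given ancestor."""
    return max(v for path, v in host_mbps.items() if path[:len(prefix)] == prefix)

print(rollup_max(("vc01", "dc01", "cluster01")))   # Cluster level: 1 level above the hosts
print(rollup_max(("vc01",)))                       # vCenter level
print(rollup_max(()))                              # World level: 4 levels above the hosts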

Max ESXi vmnic usage at physical DC

There you go. You now have proof of whether any of your vmnics has hit the limit on either TX or RX, and you can see it at both the cluster level and the ESXi level 🙂