In the previous article, I covered the requirements and overall architecture. We have also covered the Compute architecture. To some extend, I’ve covered Storage architecture as it’s using VSAN (we will dive in a future blog). In this blog, I will cover the Network architecture. I’m not a Network Architect, and have benefited from great blogs by Ivan and Scott.
There are 4 kinds of network in VMware SDDC, namely:
- VMkernel network
- ESXi Agents network
- VM network
- iLO network
The VMkernel network has grown as VMware adds more capabilities into the ESXi kernel. In vSphere 6, you may need up to 8 IP addresses for each ESXi host. Since they are on the physical network, you will need VLAN for each. While you can place the vmkernel network on VXLAN, for simplicity and cleaner separatio, I’d put them on VLAN. Operationally, I’ve put them on a 2-digit VLAN ID, so it’s easier for anyone in the company to remember that VLAN 10 – 99 are for VMkernel.
There are many articles on the various VMkernel network, so I will just touch them briefly:
- Management Network. This is predominantly for ESXi. I put all of them on the same VLAN to prevent operations from becoming too complex. There is no need for isolation as this is out of band. VMs traffic do not go here.
- vMotion network. I keep them on separate VLAN as there is no need to do inter-cluster vMotion a VM across Clusters Type. For example, In the Network Edge Clusters, the only VMs living here will be NSX Edge VMs and NSX Distributed Router VMs. There is no need for other VMs to live in this cluster. To minimize human error and ensure the segregation, the vMotion network does not go across different type of clusters. Let’s take another example to make it clearer. In the VDI cluster, we will have 1 – 5 clusters per physical DC. They can vMotion among these 5 VDI clusters, so the entire 5 clusters is just 1 logical pool. There is no business need for the VDI VM to vMotion to the Management Cluster. Also, the VDI server infrastructure (e.g. Horizon View Connection Server) live in the Management Cluster, so separation helps simplify operation. This interface needs 2 IP addresses if you are doing multi-NIC vMotion.
- Fault Tolerant network. I apply the same restriction to prevent a VM and its shadow VM spans across 2 different type of cluster.
- Storage network. This can be NFS, iSCSI or VSAN. From the above diagram, you can see that I share the Storage network between Server workload, Desktop workload and Non Production. To keep things simpler, you can actually share with the Management and Edge clusters also. This means you only have 1 VLAN. The reason it is safe to do that is there is no life migration. The VM has to be shut down as there is no vMotion network. You also cannot have an FT VM spanning as there is no FT network across the cluster type.
- vSphere Replication network. Having a separate network makes monitoring easier.
- VXLAN network. This is the network where the VM traffic will be tunnelled. Having a separate network makes monitoring easier.
The above will need 6-7 IP addresses. Plan your IP address carefully. I personally prefer an easy correlation to my ESXi. For example, ESXi-01 in my environment will have x.x.x.1 address, and ESXi-99 will have the x.x.x.99 address.
ESXi Agent Network
Data Center components are moving to the hypervisor. When they move, they either move to the kernel as kernel module (e.g. VSAN), or they take the VM form factor. An example of VM form factor is Nutanix Storage VM and TrendMicro Deep Security. Regardless, you need an IP address for every ESXi.
This Network is not a vmkernel network. They are VM network. However, they are backed by VLAN, not VXLAN. That means they are on the physical network, given a physical IP address. So you need to plan them.
Now that we’ve covered the ESXi networks, let’s move to the VM Network.
All of them, without exception, will be on VXLAN. This allows decoupling with the physical DC network. Another word, we virtualize the VM Network. Defining it on software allows inter-DC mobility. There will be many VXLAN networks, so I need to plan them carefully too. In the above diagram, I have grouped into 4 top-level groups. I’d give each group its own range, so I know what kind of workload is running given a VXLAN number. Here is an example:
- VXLAN 10000 – 19999: Server Workload
- VXLAN 20000 – 29999: DMZ Workload
- VXLAN 30000 – 39999: Desktop Workload
- VXLAN 40000 – 49999: Non Production Workload
I have a wide range on each as I have a sub-category. Yes, this means you will have a lot more VXLAN than you do VLAN. This is a fundamental difference between networking in SDDC and networking prior network virtualization. You do not want to have too many VLANs as it’s operationally taxing. VXLAN does not have that issue. Network becomes cheap. You can lots of them. For example, the server workload is split per application. If I give each application up to 10 networks, I can have 1000 applications. By having 10 networks, I can have numbering convention. For example:
- Web server network: xxxx1. Example is 10001, 10011, 10021, and 10031 for the first 4 applications. This means I know that anything on 1xxx1 is my production web servers.
- Application server network: xxxx2
- DB server network: xxxx3
Lastly, but certainly not the least important, you should have the iLO network for light-out management. This is the physical boxes management network.
[10 Nov 2015: I got a correction from Raj Yavatkar and T. Sridhar that we should not have spine connect to the northbound switch – if you do that, that creates some interesting issues; spines should be devoted to carry, inter-rack, East-West traffic. I will update the diagrams once I have some time. Thanks Raj and Sridhar for the correction. Much appreciated].
As SDDC Architect, you need to know how to implement the above Logical Architecture. It is software defined, but the hardware plays an important role. If necessary, tap the expertise of the Network Architect. In my case, I have requested YJ Huang from Arista to help me. I also benefit from Ivan Pepelnjak’s post here.
We start from the base connectivity. The diagram below shows 2 ESXi hosts, and how they are physically connected to network devices. I am using 2x 10 GE cables for data network, and 1x 1 GE cable for iLO. There is no need for HA for the iLO network. In most cases, 2x 10 GE is more than enough. Know your workload before you decide with 4x 10 GE.
Now that we’ve covered the basic, let’s see what the overall picture look like when we attach all the ESXi Hosts. The diagram below shows the 2 physical data centers. They have identical physical setup, but different IP addresses. Data Center 1 could have 10.10.x.x, while Data Center 2 has 20.20.x.x. By not extending the physical network, you contain the failure domain.
The diagram shows how my 5 clusters are connected to the switches. I use a Spine Leaf as I want to be able to scale without major re-architecting. I’ve drawn the future switches in grey. They naturally come in pair. I draw the spine-leaf connection thicker as that is 40G.
Let’s see how the architecture scale to the requirements, which is 2000 server VM and 5000 VDI VM. As you can see, it’s essentially an extension. Fundamentally, it remains the same. I do change the cluster placement to keep it simpler. This comes at a cost of re-cabling.
You maybe wondering why I use 40G between spine and leaf. For the VM Network, 10G is more than sufficient. The reason is the VMkernel Network. The vMotion boundary cut across pods.
I hope you find useful. Keen to hear your thought!