
Flex-10 ESX design with simplicity and scalability: Part 2

February 17th, 2011

This post continues from Part 1, where the Flex-10 configuration was completed.

ESX Networking

So now we have all the Flex-10 networking configured, it’s time to see how this works in ESX.

All VLANs are now being passed down all the uplinks, through the Flex-10 switches and onto the vm_trunk_1a and vm_trunk_2a Ethernet Networks.

What we need to do now is install ESX or ESXi and get them connected to the networks.

You will have defined a particular VLAN to use for your physical host IP addressing.

A good idea is to plan out your IP addressing scheme so that the Service Console/Management Network, vmkernel and vMotion IP addresses are consistent. This also opens up some clever scripting possibilities.

For vMotion, as the traffic will only ever be internal to a rack, you can use a private address range such as 192.168.x.x, which keeps it separate from the rest of your traffic.

If you have a /24 subnet and you will have fewer than 100 hosts, you could do something like this:

ESX Host Name   Service Console IP   vmkernel IP    vMotion IP
esxhost01       10.1.100.1           10.1.100.101   192.168.0.1
esxhost02       10.1.100.2           10.1.100.102   192.168.0.2
esxhost03       10.1.100.3           10.1.100.103   192.168.0.3
esxhost04       10.1.100.4           10.1.100.104   192.168.0.4
...             ...                  ...            ...
esxhost99       10.1.100.99          10.1.100.199   192.168.0.99

If you have many more hosts you could use /23 subnets, or even separate VLANs for the Service Console and vmkernel. Just think ahead so you have options when you need many more hosts.
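Because the addressing is consistent, all three addresses can be derived from the host number, which is where the scripting possibilities come in. Here is a minimal PowerShell sketch assuming the scheme in the table above; adjust the prefixes to your own ranges:

    # Derive all three IPs from the host number (scheme from the table above)
    $hostNumber = 1                                  # e.g. esxhost01
    $name      = "esxhost{0:D2}" -f $hostNumber
    $scIP      = "10.1.100.$hostNumber"              # Service Console/Management
    $vmkIP     = "10.1.100.$(100 + $hostNumber)"     # vmkernel (NAS)
    $vmotionIP = "192.168.0.$hostNumber"             # vMotion (internal only)
    "{0}: SC={1} vmk={2} vMotion={3}" -f $name, $scIP, $vmkIP, $vmotionIP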

Install ESX(i), select vmnic0 from the adapter list and put the Service Console/Management Network on this VLAN.

Add the ESX(i) host into vCenter or connect directly.

The default networking for ESX will look like this:

Remove any VM networks that may have been created as part of the build.

vmnic0 is connected to the vm_trunk_1a Ethernet Network.

Next, add vmnic1 into vSwitch0, as this is the NIC connected to the vm_trunk_2a Ethernet Network.

Select Properties for vSwitch0 and select the Network Adapters Tab.

Click Add and select vmnic1 and click Next.

Accept the default Policy Failover Order and click Finish to add vmnic1.

It’s a good idea to increase the number of ports on vSwitch0 so you can support more VMs.

Select vSwitch0 Properties, select the General tab and change Number of Ports to 120, which should be plenty for the number of VMs you can run on a BL460/BL490.
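If you would rather script these two steps, PowerCLI can do both in one call. A sketch, assuming an existing Connect-VIServer session and a hypothetical host name:

    # Assumes: Connect-VIServer vcenter01 has already been run
    $vmhost = Get-VMHost "esxhost01.example.com"    # hypothetical host name
    $vs0 = Get-VirtualSwitch -VMHost $vmhost -Name "vSwitch0"
    # Add vmnic1 alongside vmnic0 and grow the port count to 120
    Set-VirtualSwitch -VirtualSwitch $vs0 -Nic vmnic0,vmnic1 -NumPorts 120 -Confirm:$false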

Then you need to enable Beacon Probing on vSwitch0.

Beacon Probing adds an extra level of protection for network failover. It sends beacon packets out of each uplink in the team and expects them to arrive on the others, which lets ESX determine which links are actually available. Smart Link is the HP technology that fails the Flex-NICs when uplinks are down, but unfortunately firmware and driver incompatibilities sometimes mean this functionality doesn’t work. Enabling Beacon Probing allows failover even when Smart Link doesn’t work, giving you that extra level of protection.

Edit vSwitch0 and select the NIC Teaming tab.

Change the Network Failover Detection to Beacon Probing and click OK.
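The same change in PowerCLI is a one-liner against the vSwitch teaming policy (a sketch, reusing $vs0 from the earlier snippet):

    # Switch failover detection from Link Status to Beacon Probing
    Get-NicTeamingPolicy -VirtualSwitch $vs0 |
        Set-NicTeamingPolicy -NetworkFailoverDetectionPolicy BeaconProbing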

Then you need to add a port group for the vmkernel.

Under vSwitch0 Properties click Add and select VMkernel.

Enter vmkernel as the Network Label and set the VLAN ID and click Next.

Enter the IP address details and click Next.

Enter the Default Gateway.
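Scripted, the vmkernel port group and interface can be created together. A sketch reusing $vmhost; the VLAN ID, addresses and mask below are placeholders for your own values:

    # Create the vmkernel port group and interface (VLAN 101 is a placeholder)
    New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "vSwitch0" `
        -PortGroup "vmkernel" -IP 10.1.100.101 -SubnetMask 255.255.255.0
    # Tag the newly created port group with its VLAN
    Get-VirtualPortGroup -VMHost $vmhost -Name "vmkernel" |
        Set-VirtualPortGroup -VLanId 101
    # The vmkernel default gateway is set host-wide (gateway is a placeholder)
    Get-VMHostNetwork -VMHost $vmhost | Set-VMHostNetwork -VMKernelGateway 10.1.100.254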

Then go ahead and add all the VM port groups to vSwitch0.

Under vSwitch0 Properties click Add.

Select Virtual Machine.

Enter the Network Label and VLAN ID:

Repeat for all other VM Port Groups.
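With a handful of VM networks, a small loop saves repetition. A sketch with hypothetical network names and VLAN IDs, reusing $vs0:

    # Hypothetical VM networks: label => VLAN ID
    $vmNetworks = @{ "vm_lan_10" = 10; "vm_lan_20" = 20; "vm_lan_30" = 30 }
    foreach ($pg in $vmNetworks.GetEnumerator()) {
        New-VirtualPortGroup -VirtualSwitch $vs0 -Name $pg.Key -VLanId $pg.Value
    }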

Once you have added your VM port groups, vSwitch0 will now look something like this:

Now we need to segregate the network traffic by configuring Port Groups to use particular vmnics in vSwitch0.

We want all VMkernel NAS traffic to flow primarily over vmnic1.

Edit the properties of the VMkernel port group.

Select the NIC Teaming Tab.

Check the Override vSwitch failover order:

Move vmnic0 down to a Standby Adapter and click OK.
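The equivalent PowerCLI, reusing $vmhost from earlier (a sketch):

    # vmkernel port group: vmnic1 active, vmnic0 standby
    Get-VirtualPortGroup -VMHost $vmhost -Name "vmkernel" |
        Get-NicTeamingPolicy |
        Set-NicTeamingPolicy -InheritFailoverOrder:$false `
            -MakeNicActive vmnic1 -MakeNicStandby vmnic0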

We want all VM LAN traffic to flow primarily over vmnic0.

Edit the properties of a Virtual Machine Port Group.

Select the NIC Teaming Tab.

Check the Override vSwitch failover order:

Move vmnic1 down to a Standby Adapter and click OK.

Repeat for all other Virtual Machine Port Groups.
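And the mirror image for every VM port group, looping over the $vmNetworks table from the earlier sketch:

    # VM port groups: vmnic0 active, vmnic1 standby
    foreach ($name in $vmNetworks.Keys) {
        Get-VirtualPortGroup -VMHost $vmhost -Name $name |
            Get-NicTeamingPolicy |
            Set-NicTeamingPolicy -InheritFailoverOrder:$false `
                -MakeNicActive vmnic0 -MakeNicStandby vmnic1
    }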

We don’t need to amend the Service Console port group, as it carries minimal traffic and so can run over both uplinks.

You will now have all VMkernel NAS traffic primarily over vmnic1. If vmnic1 fails, it will fail over to use vmnic0.

You will now have all VM LAN traffic primarily over vmnic0. If vmnic0 fails, it will fail over to use vmnic1.

Then we need to add another vSwitch and configure vMotion.

Click Add Networking and select VMkernel.

Select vmnic2 (vm_vmotion_1b) and vmnic3 (vm_vmotion_2b) and click Next.

Enter vmotion as the Network Label and, as this is an internal network, you don’t need a VLAN ID.

Enter the IP address you have allocated and click Next.

You may want to enable Beacon Probing on vSwitch1 as well for the sake of consistency. It isn’t strictly required, as this is an internal-only network, but if you add any uplinks in the future you will already have Beacon Probing enabled.
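The whole vMotion vSwitch can be scripted in two calls. A sketch, reusing $vmhost; the IP is esxhost01’s vMotion address from the table earlier:

    # Create vSwitch1 on the vMotion Flex-NICs with a vMotion-enabled vmkernel port
    New-VirtualSwitch -VMHost $vmhost -Name "vSwitch1" -Nic vmnic2,vmnic3
    New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch "vSwitch1" `
        -PortGroup "vmotion" -IP 192.168.0.1 -SubnetMask 255.255.255.0 `
        -VMotionEnabled:$true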

Your ESX Network will now look something like this:

So, now that you have your networking configured, you can go ahead and mount your storage and configure the rest of the host.

I will be posting a guide soon on using PowerCLI to do the networking config, as these manual steps are painful, especially if you have many hosts to deploy.

The next step, which is absolutely critical, is to test your network design. Every single component needs to be failed and the result recorded so you are sure your network failover works as expected. Using blade chassis all linked together means you have many more eggs in one basket, and if something doesn’t work when an uplink switch dies you could be facing a big outage. Don’t assume, test!

Use the testing spreadsheet in my previous post as a guide: Planning for and testing against failure, big and small

Future Expansion

The best way to add additional functionality is to present it as just another VLAN. You can add the VLAN to your upstream switches and send it down the uplink trunks, and as you are using Tunneled VLANs it will be available for ESX to use with another port group.

As this design is only using 4 out of the 8 available Flex-Nics you do have capacity to add additional networking.

You may require a separate backup network, have an IP load balancing environment that needs to be uplinked to separate upstream switches, or want to connect your blades to an old network environment to allow migration. To connect to anything else, run the additional uplinks and connect them to the Ethernet Networks unused_nic_1c and unused_nic_2c, or unused_nic_1d and unused_nic_2d, and they will be available to your hosts.

Some Limitations

Now, this design may not suit everybody, so here are some of the limitations to be aware of. You can then decide whether they apply to you and amend the design if need be.

  1. vMotion has a rack boundary. I like to think of a rack as a modular building block and design clusters so that they are contained within a single rack. If you need to temporarily vMotion across racks, you can enable vMotion on the VMkernel network, which means that during the vMotion you will be sharing bandwidth with NAS traffic, but if you are happy with that it could be a solution. If you need to vMotion across racks more often, you can always add uplinks to the vm_vmotion_1b and vm_vmotion_2b Ethernet Networks.
  2. As Virtual Connect isn’t managing the VLANs and is only passing them through, any VM traffic that needs to go from one VLAN to another has to leave the rack and go up to the upstream switch. If you have a lot of heavy VM-to-VM traffic this may not be efficient. You can put the VMs on the same VLAN, which keeps the traffic within the rack, but think about your traffic requirements if this applies.
  3. I haven’t catered for other types of servers you may want in the rack running Windows or Linux that need a PXE boot or have an OS that doesn’t support VLAN tagging. You could run additional uplinks for this, or set the native/default VLAN on your uplink ports to a usable VLAN so that untagged packets are also sent through Flex-10.

Hopefully this post has given you some more information on Flex-10 and shown you how you can fairly easily integrate it into your environment in a simple and scalable way.

  1. Simon
    March 7th, 2012 at 07:22 | #1

    I just read the article to see if I could pick up any new tweaks for my setup and I have only one question: why “Beacon Probing”? From what I’ve learned, beacon probing is useless when used with fewer than three interfaces and I cannot understand why it is the default in ESX. In beacon probing, one member of the trunk sends/broadcasts a beacon which should be received by the other members. When the beacon is received by an interface, the link is confirmed. In the case of fewer than three interfaces: if the beacon isn’t received by an interface, it has learned nothing, because it could be the sending interface that has the failure, plus you have spammed the network with useless bits. So in this case I’d choose the other option. *what was it? link status?*

    To confirm: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005577

  2. WoodITWork
    March 27th, 2012 at 12:37 | #2

    @Simon

    Hi Simon.

    You are right that in a normal situation beacon probing works best with three or more interfaces, but in a blade chassis you have a different problem to contend with.

    The NIC on the blade is always connected to the Flex-10 switch, as it is hard-wired in the back of the chassis. If you rely on just link status, this link will never go down unless the chassis backplane fails or the actual NIC fails. Network traffic will therefore not fail over if the uplink from the Flex-10 switch to the outside world fails.

    HP uses Smart Link which, when it detects that an upstream uplink has failed, fails the actual Flex-NIC so the blade can see a network-down condition and fail over. This works pretty well but has had historic issues when the drivers and firmware didn’t work properly.

    Beacon Probing sends beacon probes out of the uplinks to check whether a network path is actually available. If the NIC hard-wired to the chassis is still permanently connected to the Flex-10 switch but the upstream connection is down, the probes will fail and the NIC will fail over correctly even though the physical NIC is still connected to the switch.

    Hope that helps.
