Configuring Cisco Ports with HP Flex-10 to avoid loops

April 28th, 2011

When deploying HP Flex-10 switches in your racks, one of the things that is often overlooked is ensuring you have the correct upstream switch port configuration.

Your Flex-10 switches make up only one half of your network topology and need to be connected to upstream switches to complete the networking design. I’m going to use the example of Cisco upstream switches.

I’ve blogged previously on my ideas for a Flex-10 ESX design with simplicity and scalability; see Part 1 and Part 2.

Ensuring your Flex-10 uplinks are connected to your upstream switch ports correctly is vital to ensure your Flex-10 infrastructure is always available and stable.

You may also have separate teams looking after your Flex-10 switches and your upstream Cisco switches. HP Flex-10 has often been sold as a networking technology that is easy enough for server engineers to implement, so it may not be supported by your networking team. Always ensure your teams are talking to each other so that both fully understand the technology; finger pointing when things go wrong is the last thing you need!

HP has published a few documents to assist Cisco people in understanding Flex-10:

Loops and Spanning Tree
The thing that makes Cisco people the most nervous about HP Flex-10 is how it handles loops as Cisco switches don’t like loops.

A network loop exists when there is more than one active Layer 2 path between two points, so frames (broadcasts in particular) can circulate endlessly. One symptom is the same MAC address appearing to come from more than one port. Spanning tree is a networking protocol used to detect and prevent loops: switches exchange BPDU packets and block redundant ports so there is only one active path between switches, while the redundant links remain available in case of failure. When the topology changes, spanning tree has to work out the flow of traffic again, and the affected ports do not forward traffic for that vlan until it has decided which ports can be opened without creating a loop. This process can take some time, and as traffic doesn’t flow while spanning tree is working it out, it could bring down your environment.

For more information on Spanning Tree Protocol see Cisco’s explanation:
http://www.cisco.com/univercd/cc/td/doc/product/rtrmgmt/sw_ntman/cwsimain/cwsi2/cwsiug2/vlan2/stpapp.htm

HP sells Flex-10 saying it doesn’t participate in Spanning Tree and so your Cisco admins should not worry.
Quoting from page 27 of HP Virtual Connect for the Cisco Network Administrator:

Since Virtual Connect does not present itself to the external Cisco network as another “Ethernet switch”, the subject of “spanning tree interoperability” is not an applicable topic when discussing the two products.

I don’t particularly like HP’s use of language. Calling discussions about spanning tree “not an applicable topic” makes it sound as though you should avoid the topic rather than understand it and configure your ports accordingly.

HP Flex-10 avoids spanning tree by allowing only one logical path for traffic to flow. All loop detection takes place within the Virtual Connect Domain, so by the time traffic is ready to pass out of the Virtual Connect Domain to upstream switches you can be sure there are no loops.

This is similar to how ESX manages traffic. An ESX host may have a vSwitch attached to two redundant switches. The ESX NIC bonding software ensures there are no loops and traffic only flows over a single logical path out of the vSwitch.

So if HP Flex-10 doesn’t participate in Spanning Tree, is there anything you need to do on the Cisco port side? The answer is yes, you need to ensure that you turn OFF any spanning tree detection on your Cisco ports. This may seem counterintuitive. If HP Flex-10 doesn’t participate in spanning tree, what is the harm in leaving the detection on as it will never see any loops?

Unfortunately, networking doesn’t always play nicely. Cisco may see a loop when HP Flex-10 isn’t creating one, and having spanning tree kick in incorrectly and unexpectedly may bring your environment down.

All things Ports
Your Flex-10 uplinks will be connected into upstream ports (also called interfaces).
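To stop spanning tree from blocking or delaying these ports, they need to be configured as edge ports (or have PortFast enabled on older Cisco switches), which is the same port type you would use to connect normal servers. Strictly speaking this does not switch spanning tree off on the port: an edge port skips the listening and learning states and goes straight to forwarding.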

These ports will be in either of two modes, access or trunk.
Access ports are configured with a single vlan and trunk ports have multiple vlans.
Having access or trunk ports doesn’t make a difference with loop detection in terms of your Cisco port config.

You can bond Cisco ports together to create an LACP group to increase bandwidth. In newer Cisco Nexus switches these groups can span separate physical Cisco Nexus switches and are called Virtual Port Channels (vPC). LACP groups and Virtual Port Channels act as a single logical interface, and some of the config which would previously have been done on the individual ports is done at the LACP group/vPC level.

Looking again at my post Flex-10 ESX design with simplicity and scalability: Part 1 you can see the difference between the three proposed options.

  • 20Gb Option: No LACP Group, 10 GbE per uplink set.
  • 40Gb Option: LACP Group to single switch, 20 GbE per uplink set.
  • 40Gb vPC Option: Virtual Port Channel (vPC) split across 2 x Cisco Switches, 20 GbE per uplink set.

Cisco configuration commands sometimes differ between switch models, so check with your networking team if the commands are not quite the same. For example, on Cisco Nexus switches you configure ports/vPCs as edge ports, while on pre-Nexus switches you enable PortFast on the port/group to do the same thing.

If you are using an LACP/vPC group, it is good practice to set all configuration settings on the LACP/vPC group rather than the individual interfaces.

BPDU Guard
BPDU Guard is an extra layer of protection you can configure on edge ports. An edge port connected to Flex-10 should never receive spanning tree BPDU packets, because Flex-10 does not participate in spanning tree. If the cables were moved from your Flex-10 uplinks to another Cisco switch, BPDU Guard would recognise that two switches are now connected together by listening for those BPDU packets, and would shut down the port to protect against a loop.

BPDU Guard can be set as a global setting on all your Cisco ports and therefore doesn’t need to be set on individual ports.
To enable BPDU Guard globally for all edge ports the command is:

spanning-tree port type edge bpduguard default

If you don’t want to make this a global setting you can enable BPDU Guard on an individual LACP/vPC group or access/trunk port with the following command:

spanning-tree bpduguard enable
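
If BPDU Guard does shut a port down, you will want to confirm that and bring the port back once the cabling is fixed. The following is a rough sketch for a Cisco Nexus (NX-OS) switch; the port-channel number 10 is a placeholder, not something from a real environment, and pre-Nexus switches may need slightly different show commands.

```text
! Check whether BPDU Guard is on by default for edge ports
show spanning-tree summary

! List any ports that have been error-disabled (for example by BPDU Guard)
show interface status err-disabled

! Once the cause is fixed, bounce the port to bring it back.
! "port-channel 10" is a placeholder number.
configure terminal
interface port-channel 10
  shutdown
  no shutdown
```

Bouncing the port before the cause has been removed will just trip BPDU Guard again, so check what is plugged in at the other end first.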

BPDU Filter
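BPDU Filter is a setting that can bring normal spanning tree behaviour back into play, or remove your loop protection altogether, even though you have configured the LACP/vPC group or access/trunk port as an edge port. When BPDU Filter is enabled globally for edge ports, a port that receives BPDU packets loses its edge status and goes back to normal spanning tree operation, so spanning tree can reconverge across the vlans on that port. When it is enabled on an individual port, the port ignores BPDU packets entirely, so BPDU Guard never gets to act on them. Either way, you definitely want to ensure you do NOT have BPDU Filter enabled on any ports.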

Ensure you do not have BPDU Filter enabled globally for all edge ports by removing the following command:

spanning-tree port type edge bpdufilter default

or have it set to disabled for edge ports by setting the following command:

no spanning-tree port type edge bpdufilter default

The damage-control difference between BPDU Guard and BPDU Filter is that BPDU Guard will just shut down the port, which may isolate your rack, while BPDU Filter can leave spanning tree to reconverge across all affected vlans, which could bring down every rack in your environment if they share vlans. You don’t want a network issue spreading across racks, otherwise you have a far bigger problem on your hands.
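
A quick way to check for stray BPDU Filter settings is to search the running configuration. This is only a sketch and assumes a Cisco Nexus (NX-OS) switch, where default settings are not normally listed in the running configuration.

```text
! Any line returned here is an explicit BPDU Filter setting,
! either global or under an interface
show running-config | include bpdufilter

! The summary also reports the global BPDU Filter default for edge ports
show spanning-tree summary
```

If the first command returns a line, look at the full running configuration to see whether it is the global setting or sits under a particular interface.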

Disabling Spanning Tree
To stop spanning tree from interfering with the Cisco ports your Flex-10 switches are connected to, you need to configure the following:

Remember, if you are using an LACP Group or a Nexus vPC, make the configuration changes on the LACP/vPC Group rather than on the individual ports.

Ensure you do not have BPDU Filter enabled on any LACP/vPC group or access/trunk ports by REMOVING the following command:

spanning-tree bpdufilter enable

If you do not have BPDU Guard enabled as a global setting, configure BPDU Guard on the LACP/vPC group or access/trunk port with the following command:

spanning-tree bpduguard enable

Set the LACP/vPC group or access/trunk ports as edge ports (PortFast on pre-Nexus switches):
Cisco Nexus Trunked Ports:

spanning-tree port type edge trunk

Cisco Nexus Access Ports:

spanning-tree port type edge

Cisco Pre-Nexus Trunked Ports:

spanning-tree portfast trunk

Cisco Pre-Nexus Access Ports:

spanning-tree portfast
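
Putting the commands from this section together, a trunked vPC to a Flex-10 uplink set might look something like the sketch below. The interface numbers (port-channel 10, vPC 10, Ethernet1/1) and the description are placeholders, and it assumes the LACP and vPC features and the vPC domain are already configured on the Nexus switches.

```text
! Cisco Nexus (NX-OS), trunked vPC to Flex-10. All numbers are placeholders.
interface port-channel 10
  description Flex-10 uplink set (placeholder)
  switchport mode trunk
  vpc 10
  spanning-tree port type edge trunk
  spanning-tree bpduguard enable

! Member port: only the channel membership is set here,
! the spanning tree settings live on the port-channel
interface Ethernet1/1
  switchport mode trunk
  channel-group 10 mode active
```

On a pre-Nexus switch the spanning tree lines on the group would be the PortFast equivalents. Again the interface number is a placeholder and the existing switchport settings are assumed to be in place already.

```text
! Cisco pre-Nexus (IOS), trunked LACP group to Flex-10
interface Port-channel10
 spanning-tree portfast trunk
 spanning-tree bpduguard enable
```

Neither example contains a bpdufilter line, which is the point of the checks above.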

Once you have your Cisco Ports configured I would suggest you start some testing. Run pings to the following:

  • IP Address of each Flex-10 Switch
  • Service console/management IP Address and vmkernel IP Address of an ESX(i) host in each chassis
  • IP Address of a VM on each ESX(i) host in each chassis

Reboot each Flex-10 switch in turn, leaving enough time for each switch to come back and pass traffic again. After each reboot, ensure your ESX(i) hosts and VMs are still available and check your Cisco logs for any errors or loops.
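
On the Cisco side, checks along the following lines after each Flex-10 reboot should show whether the uplinks came back cleanly. This is a sketch assuming Cisco Nexus (NX-OS) switches; the port-channel number is a placeholder and the vPC command only applies if you are using vPCs.

```text
! Are the port-channels and their member ports up?
show port-channel summary

! Are the vPCs up and consistent on both Nexus switches?
show vpc brief

! Spanning tree state for the Flex-10 uplink group (placeholder number)
show spanning-tree interface port-channel 10

! Has anything been shut down, for example by BPDU Guard?
show interface status err-disabled

! Recent log entries, to look for loop or MAC flapping messages
show logging last 50
```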

For more detailed testing steps, have a look at the example testing steps and spreadsheet in my post, Planning for and testing against failure, big and small.

Categories: Flex-10, HP
  1. May 21st, 2011 at 05:47 | #1

    I just wanted to drop a line and thank you for all the information you’ve posted around HP’s Virtual Connect. I’ve been struggling to bring some stability to a C7000 enclosure filled with BL-490c blades all running VMware ESX along with Cisco Nexus 1000V and HP’s VC Flex-10.

    Last week we struggled to upgrade all the blades to ESX 4.1 Update 1.

    Thanks to your site we were able to resolve the majority of issues we were experiencing with the bnx2x (1.62.15.v41.2) driver. We still need to occasionally issue a “vem restart” after rebooting to get everything working properly but I’m hopeful that will be fixed when we upgrade the Nexus 1000V next week.

    Cheers!

  2. WoodITWork
    May 25th, 2011 at 11:17 | #2

    @Michael McNamara
    Glad the information has been of use.
    When do you need to do a “vem restart” and what does it fix?

  3. Derrick Brown
    July 19th, 2011 at 21:20 | #3

    Hello,

    I’m receiving tons of these messages daily on my nexus switches:
    2011 Jul 19 20:41:43 ncmec-hqnx02 %FWM-2-STM_LEARNING_RE_ENABLE: Re enabling dynamic learning on all interfaces
    2011 Jul 19 21:07:05 ncmec-hqnx02 %FWM-2-STM_LOOP_DETECT: Loops detected in the network among ports Po101 and Po102 – Disabling dynamic learning for 180 seconds

    Will performing “spanning-tree port type edge trunk” resolve these issues?

  4. WoodITWork
    August 8th, 2011 at 11:48 | #4

    @Derrick Brown
    Hi Derrick.

    Ensuring your Cisco ports are configured correctly to avoid loops should get rid of these messages, but you need to follow all the recommendations, as you don’t want to get into a situation where you are turning off loop detection but still have loops in your network.

  5. swampie51
    November 18th, 2011 at 17:46 | #5

    Just to clarify. Spanning tree has no knowledge of the CAM table or the MAC addresses seen on individual ports. In general, spanning tree is a Layer 2 feature that sends out a hello packet (BPDU) at regular intervals (e.g. every 2 seconds). If a port sees a BPDU from itself on a separate port it views the connection as a loop and will error disable one of the ports. Loops are very bad things on a Layer 2 network due to the way in which a broadcast can be propagated forever and become all-consuming.

  7. Doug Eastman
    November 15th, 2012 at 17:32 | #7

    We have been having some issues with Flex 10 modules and HPC7000 as well.
    We are using Nexus 5596’s with VPC configured…Po100 is our VPC peer/domain between the switches.

    At times….usually during a flex10 firmware upgrade this happens to us on both switches. It appears that the “Bulk” movement of the VM’s causes the N5K’s to think there is a loop, thus disabling learning for 180 seconds…

    The problem is when the loop is detected for our VLAN 997 (NFS link to our storage) and both switches simultaneously disable learning.

    So when the VM’s move…multiple devices lose their NFS storage. Because VLAN 997 isn’t available during that 180 second time frame. Needless to say it wreaks havoc on our VM’s that need the (NFS Vlan)

    FYI Po29 and Po30 are the VPC’s to the C7000 Blade enclosure, and again Po100 is the VPC connection between the two N5K’s

    Log entry from both switches.
    2012 Nov 2 23:15:17 stp1n5k-1 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 0050.5677.0863 among ports Po100 and Po30 vlan 997 – Disabling dynamic learn notifications for 180 seconds

    2012 Nov 2 23:15:17 stp1n5k-2 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 0050.567a.ba67 among ports Po29 and Po30 vlan 997 – Disabling dynamic learn notifications for 180 seconds
    2012 Nov 2 23:18:17 stp1n5k-2 %FWM-2-STM_LEARNING_RE_ENABLE: Re enabling dynamic learning on all interfaces

    We do have BPDUguard enabled globally on the N5K’s
    spanning-tree port type edge bpduguard default

    Also on the port-channel/VPC’s to the Flex10’s we have these two spanning tree commands

    spanning-tree port type edge trunk
    spanning-tree guard loop

    I believe that the spanning-tree guard loop is the command that is causing our problems but have not been able to get anyone to tell me it should be removed…Cisco TAC included.

    This command stems from the worries that the Flex10 modules could loop the network if they do indeed “Act like a Switch”, which everything I have been told tells me they do not.

    I am in the process of setting up a CM to remove this command from my port channels to our HP-C7K’s and would appreciate any feedback on the matter.

  8. WoodITWork
    November 16th, 2012 at 11:31 | #8

    Hi @Doug Eastman
    How are your Ethernet Networks configured on the Virtual Connect and how do they connect to your ESX(i) hosts? Do you have active / passive uplinks or all active uplinks? If you have all active uplinks then no loop should be possible.

    Virtual Connect doesn’t participate in spanning tree as it puts uplinks into standby when it can’t create an Etherchannel connection.

    I haven’t seen the spanning-tree guard loop command before but my recommendation would be to remove it as it does seem the Cisco switch is seeing a loop where one doesn’t exist.

  9. Doug Eastman
    November 19th, 2012 at 16:05 | #9

    Thanks for the response.
    I do not do the administration on the C7K/Flex 10’s.
    Here is what I do know.
    All active uplinks *0G to each Blade Enclosure.
    We are using Nexus 5K’s and VPC technology.
    We did not see these issues before going to VDS and performing firmware upgrades on the flex10 modules.

  10. Doug Eastman
    November 19th, 2012 at 16:07 | #10

    Sorry, 80GB to each Blade Enclosure…2X 40G vpc’s

  11. squebel
    November 20th, 2012 at 17:32 | #11

    @Doug Eastman – Are you saying that you didn’t start seeing this issue until you moved to the vSphere vDS (Distributed Switch) or did you mean something else?

    We’re seeing all sorts of weird connectivity issues as well which appears to be caused by spanning tree reconvergence in the network but we can’t figure out why it’s affecting our hosts and vm’s.

  12. July 31st, 2014 at 10:27 | #12

    Some incorrect details were given on Cisco networking.

    Switches running STP detect loops by seeing their own BPDUs come back at them (i.e., a loop). MAC addresses flapping from one port to another have nothing to do with STP; they only trigger flapping log entries and force the MAC table to be updated frequently, which can impact switch CPU.

    Portfast does not disable STP on an interface. Portfast still runs with STP, but just skips the learning and listening states and jumps right to forwarding. See http://networklessons.com/spanning-tree/does-portfast-disable-spanning-tree/

  13. Ashish
    October 19th, 2015 at 10:59 | #13

    Hi,

    For STM loop detect messages, either make port-channels 101 and 102 a single port channel or ask the server admin to change the NIC mode from Active/Active to Active/Backup.
