enabling data science: February 2011

Saturday, February 19, 2011

Where to place VSM and vCenter

With Nexus 1000v, VSM and vCenter can run as VMs behind a VEM, but that doesn’t mean they always should.

VSM is the “supervisor” for the VEMs (virtual line cards). It also communicates with vCenter, which is the central management and provisioning point for VMware virtual switching.

As network designers, we need to work with the host team to determine the VSM’s form factor:

As a VM running under VEM (taking a veth port)
As a VM running under a vSwitch
As a separate physical machine
As an appliance (Nexus 1010 VSA)

As you can see, options range from complete integration in the virtualized environment, to complete separation, at increasing cost. Arguably, in a large and complex virtualization environment, the advantage of having separate control points will become more apparent. Here we briefly touch on two practical considerations.

Failure Scenarios

When everything works, there is really no disadvantage to having VSM and vCenter plugged into a VEM. In theory, VSM can communicate even before the VEMs have fully booted, through the control and packet VLANs, which should be configured as system VLANs. However, troubleshooting can become a lot more complex when something is not right. Examples include a misconfiguration on vCenter leading to communication failure, a software bug on the Nexus 1000v leading to partial VLAN failures, or a faulty line card dropping packets.
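As a rough illustration of the system VLAN point, an uplink port profile on the VSM might look like the sketch below. The profile name and VLAN IDs 260 and 261 are placeholders chosen for this example, not values from any particular design; substitute your own control and packet VLANs.

```
! Hypothetical uplink port profile (name and VLAN IDs are placeholders)
port-profile type ethernet system-uplink
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 260-261
  ! system VLANs can forward on a VEM before the VSM has programmed it
  system vlan 260-261
  no shutdown
  state enabled
```

The system VLANs must also appear in the allowed VLAN list of the same profile.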

The point is that when a failure occurs, we want to know quickly whether it is in the control plane or the data plane. We often rely on the control plane to analyze what is going on in the data plane. Placing VSM behind a VEM increases the risk of the control plane and data plane failing at the same time, which makes root cause isolation more difficult. However unlikely they may seem, failures do happen. When they do, access to VSM and vCenter is essential for troubleshooting and problem isolation. We know a VEM does not rely on the availability of VSM to pass packets; however, placing VSM behind a VEM puts it under the same DVS that it manages, and therefore exposes it to problems such as DVS port corruption. When a VEM fails, imagine losing access to VSM and vCenter as well because they are running behind it.

Administrative Boundary

VSM and vCenter, due to their critical nature, need to be protected. To prevent administrators from mistakenly changing vCenter and VSM while making changes to other VMs, there should be as much of an administrative boundary as the infrastructure supports.

Having VSM and vCenter in a separate control cluster with dedicated hosts creates a clear administrative boundary. Using a VMware virtual switch (vDS) instead of a VEM for vCenter and VSM further reduces the dependency. The vDS should be clearly named so that all administrators understand its special purpose, which minimizes the chance of mistakes.

The diagram shows a sample placement of VSM and vCenter as VMs on a control cluster that is separate from the application VMs they manage.

Saturday, February 12, 2011

Spanning tree in a Nexus virtualized data center

Nexus VPC reduces the reliance on Spanning Tree in a data center design, and improves link utilization and load sharing. However, Spanning Tree is still a necessary component of the design, and it often seems more complex because of the many options available and the need to connect third party devices.

There was an earlier post about Bridge Assurance. This note provides a more complete reference model (R-PVST) that captures recommended best practices in a typical design, followed by notes explaining why some of the choices were made.

Nexus 7000 VPC peer link and non-VPC link

Enable BA with “spanning-tree port type network” on both ends of the trunk (port channel)

Why Bridge Assurance: BA is a newer Cisco STP feature that works in conjunction with Rapid-PVST BPDUs to protect against bridging loops (it is also supported with MST). BA must be supported and configured on both switches of a point-to-point link; otherwise the port will be blocked. BA uses bidirectional hellos to prevent looping conditions caused by unidirectional links or a malfunctioning switch. If a BA port stops receiving BPDUs, it is moved into the blocking state.
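A minimal sketch of the peer link side, assuming port-channel 1 is the VPC peer link (the number is a placeholder). The same port type would be applied on the other Nexus 7000.

```
! Hypothetical vPC peer link (port-channel number is a placeholder)
interface port-channel1
  switchport
  switchport mode trunk
  vpc peer-link
  ! network port type makes the port subject to Bridge Assurance
  spanning-tree port type network
```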

Nexus 7000 connection to third party Load Balancer or Firewall devices

Enable Port Fast with “spanning-tree port type edge trunk”

Why Port Fast: This is for third party devices that can be treated like hosts connected to the access layer (they do not send BPDUs). Setting the port type to edge enables Port Fast, which allows the port to enter the forwarding state immediately instead of waiting for STP to converge. These ports should not receive bridge protocol data units (BPDUs); otherwise they will immediately transition to the blocking state. The trunk keyword enables edge behavior on a trunk port.

Why BPDU Guard: BPDU Guard works together with Port Fast on edge ports. In a valid design, edge ports should not receive BPDUs. Reception of a BPDU indicates an error in configuration, such as connection of an unauthorized device. By shutting down a port that receives a BPDU, BPDU Guard protects the network, since only an administrator can put the edge port back in service.
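Port Fast and BPDU Guard together could be configured roughly as follows. Ethernet1/10 is a placeholder interface standing in for the trunk to the load balancer or firewall.

```
! Hypothetical trunk to a third party device (interface is a placeholder)
interface Ethernet1/10
  switchport
  switchport mode trunk
  ! edge trunk: forward immediately on all trunked VLANs
  spanning-tree port type edge trunk
  ! shut the port down if a BPDU is ever received
  spanning-tree bpduguard enable
```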

Nexus 7000 downlink to Nexus 5000 access switches (back to back VPC)

Enable Root Guard on a “spanning-tree port type normal” port; note that this is applied on the VPC

Why Root Guard: Note that the Nexus 7000s are the STP root. Root Guard is a feature placed on an aggregation port facing the access switches, preventing that port from becoming a root port. In this case it prevents a Nexus 5000 from becoming the STP root switch, so that a configuration error on an access layer switch does not cause STP disruption and instability.

STP on VPC is a topic in itself. Cisco has a good document here with more details. Note that Bridge Assurance is usually not needed on VPC links, since the VPC consistency check ensures the integrity of the configuration.
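On each Nexus 7000, the downlink VPC might be configured along these lines. The port-channel and VPC numbers are placeholders for this sketch.

```
! Hypothetical vPC toward the Nexus 5000 pair (numbers are placeholders)
interface port-channel20
  switchport
  switchport mode trunk
  vpc 20
  spanning-tree port type normal
  ! block the port if a superior BPDU arrives from the access layer
  spanning-tree guard root
```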

Nexus 5000 uplink to Nexus 7000 (back to back VPC)

Enable Loop Guard on a “spanning-tree port type normal” port; note that this is applied on the VPC

Why Loop Guard: For additional protection against loops, Loop Guard is enabled on root and alternate ports (those facing the root bridge). When Loop Guard on a Nexus 5000 detects that BPDUs are no longer being received, the port is moved into a loop-inconsistent state instead of transitioning through STP convergence to forwarding. This prevents a Layer 2 loop that could otherwise form because of a misconfiguration or a unidirectional link.
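The matching uplink on each Nexus 5000 could look like the sketch below, again with placeholder port-channel and VPC numbers.

```
! Hypothetical vPC uplink toward the Nexus 7000 pair (numbers are placeholders)
interface port-channel20
  switchport mode trunk
  vpc 20
  spanning-tree port type normal
  ! go loop-inconsistent rather than forwarding if BPDUs stop arriving
  spanning-tree guard loop
```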

Nexus 5000 connection to hosts

Enable Port Fast with “spanning-tree port type edge trunk”

Why Port Fast: see above, same as Nexus 7000 connection to third party.

Why BPDU Guard: see above, same as Nexus 7000 connection to third party.

A final note: Cisco Nexus 1000V does not run STP. It does not generate BPDUs, nor does it respond to them. Instead, Nexus 1000V examines source and destination MAC addresses to prevent loops. This is why a Nexus 5000 port connected to a Nexus 1000v virtual switch should be set to type edge, just like a port connected to a physical host.

Saturday, February 5, 2011

VMware ESX/ESXi host network load sharing options - Simplified

Why load sharing

With virtualization, network and server domains converge; there are usually multiple vendors for server, NIC, storage, and networks. A typical example is using Nexus 1000v which is a Cisco product, embedded in ESX which is VMware, utilizing an HP NIC, and interacting with NAS which is yet another vendor.

Why load sharing? It is highly desirable to have redundant uplinks from an ESX host for high availability. In addition, load sharing over redundant uplinks improves performance and utilization. So what are the load sharing options from the host?

A fundamental design question is how traffic flows between the VMs and the rest of the network. In this example, VMs reside in ESX, but the concept is the same for any virtualization host interacting with the network.

There have been numerous vendor documents, often covering a certain aspect in detail, occasionally conflicting and confusing as technologies have been evolving. I posted earlier about load sharing mainly on Nexus switch side. Scott Lowe has a series of excellent articles on the topic. Why the summary here? I found it necessary to organize multiple concepts around host load sharing under a simple framework to make it easier to understand and apply.

Options

The following table summarizes common load sharing options, from the least desirable to the most.


Additional Considerations - LACP

The last two options require some clarification. Without Nexus 1000v, VMware does not support dynamic LACP. Therefore VMware documentation usually specifies “mode on” as the recommended configuration. Cisco definitely supports dynamic LACP, although earlier Nexus 1000v releases may have had some LACP-specific bugs. The latest Nexus 1000v release, 4.2(1)SV1(4), contains many fixes but has yet to be proven in a production system. It also has the additional benefit of LACP offload from VSM to VEM. If you know of more recent developments or road map, please kindly let me know.
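On the upstream switch side, the difference between the two options comes down to the channel-group mode. The sketch below assumes a Nexus switch with two host-facing ports; the interface range and channel-group number are placeholders, and the two variants are alternatives, not meant to be applied together.

```
! Variant 1: static bundling ("mode on"), no LACP negotiation
interface Ethernet1/1-2
  switchport mode trunk
  channel-group 10 mode on

! Variant 2: dynamic LACP, which requires the LACP feature to be enabled
feature lacp
interface Ethernet1/1-2
  switchport mode trunk
  channel-group 10 mode active
```

Whichever variant is used, the host side has to match: a static bundle on one end and LACP on the other will not form a working port channel.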