Saturday, August 7, 2010

Nexus7000/5000/1000v – A closer look at VPC and port channel load balancing

Port channel is a great redundancy and load-sharing feature in data centers. Cisco Nexus takes it one step further with Virtual Port Channel (VPC). There is plenty of good documentation about VPC, including the downloadable design guide.
Sometimes you need to look under the covers and see exactly how each physical port is utilized by port channels. This note highlights a few useful commands.

Strictly speaking, port channel does load sharing, not load balancing. So there should not be an expectation for 50/50 balance. How well load sharing works largely depends on the traffic and the hashing method selected. For example, Nexus 1000v supports 17 hashing algorithms to load-share traffic across physical interfaces in a PortChannel, including source-based hashing and flow-based hashing. The default is source-mac.
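To see why hashing gives sharing rather than an even split, here is a minimal Python sketch. It assumes a CRC32 of the source and destination IP, taken modulo the number of member links; this is only an illustration and not the hash a Nexus switch actually computes. The function names and the address list are made up for the example.

```python
import ipaddress
import zlib
from collections import Counter

def pick_member(src_ip: str, dst_ip: str, members: list[str]) -> str:
    """Map a flow to one member link with an illustrative hash (not Cisco's)."""
    key = ipaddress.ip_address(src_ip).packed + ipaddress.ip_address(dst_ip).packed
    return members[zlib.crc32(key) % len(members)]

def share_by_member(flows, members):
    """Count how many flows land on each member link."""
    counts = Counter({m: 0 for m in members})
    for src, dst in flows:
        counts[pick_member(src, dst, members)] += 1
    return dict(counts)

members = ["Eth1/5", "Eth1/6"]

# Twenty different sources talking to one destination: the split depends
# entirely on how the addresses happen to hash.
flows = [(f"10.17.19.{i}", "10.17.21.15") for i in range(1, 21)]
print(share_by_member(flows, members))

# One source/destination pair repeated: every packet takes the same link.
single = [("10.17.19.15", "10.17.21.15")] * 20
print(share_by_member(single, members))
```

The second case is the one that surprises people in testing: a single heavy flow between two endpoints will never be spread across links, whatever hashing method is selected.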

Another important concept: hashing is uni-directional, determined by the sending party. Therefore there is no guarantee that load sharing will be symmetrical. For illustration, Nexus 7k and 5k are connected in what is known as “back to back” VPCs (port channel 75). A Nexus 7k has multiple physical connections southbound on the same logical channel. It determines the physical port to send traffic based on local hashing, as illustrated by the green arrow.


Here is the first command, which shows the hashing algorithm used:
show port-channel load-balance
N-7010-1# sh port-c load
Port Channel Load-Balancing Configuration:

System: source-dest-ip-vlan
Port Channel Load-Balancing Addresses Used Per-Protocol:
Non-IP: source-dest-mac
IP: source-dest-ip-vlan

The second command shows how well load sharing is working on your port channels. Note that the statistics are cumulative and are reset by clearing the corresponding interface counters.
show port-channel traffic

N-7010-1# show port-c traffic
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
75 Eth1/5 53.16% 42.58% 49.91% 53.70% 44.03% 44.86%
75 Eth1/6 46.83% 57.41% 50.08% 46.29% 55.96% 55.13%
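If you collect this output regularly, a small script can flag lopsided channels. The following is a rough Python sketch that assumes the table layout shown above (channel ID, port, then six percentage columns); the function names are made up for illustration, and other NX-OS releases may format the table differently.

```python
import re

# One data row: channel id, port name, then six percentage columns.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+((?:[\d.]+%\s*){6})$")
COLUMNS = ["Rx-Ucst", "Tx-Ucst", "Rx-Mcst", "Tx-Mcst", "Rx-Bcst", "Tx-Bcst"]

def parse_traffic(text):
    """Return {channel_id: {port: {column: percent}}} from the CLI table."""
    result = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip prompt, header, and separator lines
        chan, port = int(m.group(1)), m.group(2)
        values = [float(v.rstrip("%")) for v in m.group(3).split()]
        result.setdefault(chan, {})[port] = dict(zip(COLUMNS, values))
    return result

def spread(table, chan, column="Tx-Ucst"):
    """Difference between the busiest and the idlest member for one column."""
    shares = [cols[column] for cols in table[chan].values()]
    return max(shares) - min(shares)

SAMPLE = """\
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
75 Eth1/5 53.16% 42.58% 49.91% 53.70% 44.03% 44.86%
75 Eth1/6 46.83% 57.41% 50.08% 46.29% 55.96% 55.13%
"""

table = parse_traffic(SAMPLE)
print(round(spread(table, 75), 2))  # Tx-Ucst gap between the two members
```

Tx columns are the ones to watch on a given switch, since only the transmit side reflects that switch's own hashing decision.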

As in the diagram, the orange arrow indicates Nexus 5k northbound load sharing based on its hashing algorithm. The blue arrow indicates Nexus 1kv northbound load sharing based on its hashing.

Since NetFlow is not yet supported on Nexus 5k, how do we tell which physical interface a certain flow will take? Here is the third command:

show port-channel load-balance forwarding-path interface ...

The following example shows how different IP address pairs yield different outgoing ports.
N-5010-1# show port-channel load-balance forwarding-path interface port-channel 75 vlan 25 src-ip 10.17.19.15 dst-ip 10.17.21.15
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 90 Outgoing port id: Ethernet1/3
Param(s) used to calculate load-balance:
dst-ip: 10.174.21.15
src-ip: 10.174.19.15
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000

N-5010-sw1# show port-channel load-balance forwarding-path interface port-channel 75 vlan 25 src-ip 10.17.19.10 dst-ip 10.17.34.15
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 179 Outgoing port id: Ethernet1/4
Param(s) used to calculate load-balance:
dst-ip: 10.174.34.15
src-ip: 10.174.19.10
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000
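When there are many endpoint pairs to check, it helps to generate the commands and pull out the answer line automatically. The Python sketch below only builds the command string and parses text shaped like the output above; how you run the command on the switch (SSH, console capture, and so on) is left out, and the helper names are hypothetical.

```python
import re

def forwarding_path_cmd(channel, vlan, src_ip, dst_ip):
    """Build the CLI string in the same form as the examples above."""
    return (f"show port-channel load-balance forwarding-path "
            f"interface port-channel {channel} vlan {vlan} "
            f"src-ip {src_ip} dst-ip {dst_ip}")

def outgoing_port(output):
    """Extract the port name after 'Outgoing port id:', or None if absent."""
    m = re.search(r"Outgoing port id:\s*(\S+)", output)
    return m.group(1) if m else None

# Output text copied from the first example above.
SAMPLE = """\
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 90 Outgoing port id: Ethernet1/3
"""

print(forwarding_path_cmd(75, 25, "10.17.19.15", "10.17.21.15"))
print(outgoing_port(SAMPLE))
```

Running the generated commands for each source/destination pair in a test plan and tallying the returned ports shows quickly whether the test traffic can spread across the members at all.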

If you are running a test and notice that the traffic is not well balanced, you now have a method to check the physical port allocation for your particular endpoints. You also have the option of experimenting with different hashing algorithms to suit your needs.

2 comments:

  1. I am seeing this on my VPC 1049: on Tx-Ucst, only one port is doing all the transmissions. This is the case on both controllers for VPC 1049. The po is LACP (mode active) and goes to a Windows 2003 server that is running LACP on the teamed interfaces. Any suggestions on how to get this to balance out? Thanks.

    interface po1049
    ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
    ------ --------- ------- ------- ------- ------- ------- -------
    1049 Eth102/1/15 31.12% 99.90% 33.33% 29.06% 32.42% 0.0%
    1049 Eth102/1/14 31.78% 0.06% 33.33% 41.84% 19.59% 77.98%
    1049 Eth102/1/13 37.08% 0.03% 33.32% 29.09% 47.98% 22.01%

    ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
    ------ --------- ------- ------- ------- ------- ------- -------
    1049 Eth103/1/15 22.09% 99.17% 33.26% 33.21% 21.93% 0.0%
    1049 Eth103/1/14 55.21% 0.82% 33.39% 33.42% 68.17% 3.55%
    1049 Eth103/1/13 22.68% 0.00% 33.33% 33.35% 9.89% 96.44%

  2. Tony, I can only speak in general terms without topology and device model. You can start by checking the current hashing method: "sh port-c load"

    You probably want to check whether the hashing method used matches your traffic (you need variation in source and destination IP for the default source-dest-ip-vlan method to be effective).

    You can then "test" load sharing with commands like:
    "show port-channel load-balance forwarding-path interface port-channel 1049 vlan .. src-ip .... dst-ip ...."

    The above should tell you which physical interface gets the specific flow. From there maybe you will get some clues to improve your load sharing?
