Thursday, August 19, 2010

Nexus7000 OSPF failure due to MFDM crash – still searching for root cause

This occurred a while ago, and we are still waiting for word. Unfortunately the condition has since cleared (this is a production network, so we had to restore service). Just wondering if anybody else out there has a clue about the root cause.

The most noticeable symptom was an OSPF adjacency problem, with no error messages. OSPF was able to establish at least partial adjacency with some neighbors, but none at all with others.

Thinking the OSPF process had gone bad, we restarted it, but to no avail. The OSPF process ran fine with no errors, but the adjacency trouble remained. Further review of the logs revealed that a process known as MFDM (Multicast FIB Distribution, we believe) had crashed, attempted to restart three times, and never recovered. The connection to OSPF makes sense: OSPF hellos are sent to the multicast addresses 224.0.0.5 and 224.0.0.6, so OSPF adjacency depends on the Multicast FIB (MFIB).

2010 Aug 2 17:29:57 Nexus-7010 %SYSMGR-2-SERVICE_CRASHED: Service "mfdm" (PID 16377) hasn't caught signal 11 (core will be saved).
2010 Aug 2 17:30:03 Nexus-7010 %SYSMGR-2-SERVICE_CRASHED: Service "mfdm" (PID 16471) hasn't caught signal 11 (core will be saved).
2010 Aug 2 17:30:03 Nexus-7010 %SYSMGR-2-SERVICE_CRASHED: Service "mfdm" (PID 16524) hasn't caught signal 11 (core will be saved).
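
For anyone checking the same condition, here is roughly what we ran; the OSPF instance tag "1" is just an example from our config, and the output details vary by NX-OS release:

restart ospf 1
show ip ospf neighbors
show system internal sysmgr service name mfdm

The last command shows sysmgr's view of the mfdm service, including its current state and how many times it has been restarted.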

We could not recover MFDM individually and ended up reloading the VDC to recover it. Magically, OSPF started working again!

The ticket is still open, and no root cause has been identified. Part of the difficulty is that we seem to have lost some of the core files, which makes it hard for the vendor to trace it down. Just wondering if there is any similar experience out there?
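
In hindsight, we should have copied the cores off-box right away. Something along these lines should work on NX-OS; the TFTP server address is a placeholder, and the module/PID pair comes from the show cores listing:

show cores
copy core://5/16377 tftp://192.0.2.1/mfdm-16377.core vrf management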

And, could a hardware or ASIC-related failure cause an MFDM crash, or is it more likely a software bug?

Thanks for your thoughts and comments.

Saturday, August 7, 2010

Nexus7000/5000/1000v – A closer look at VPC and port channel load balancing

Port channel is a great redundancy and load-sharing feature in data centers. Cisco Nexus takes it one step further with Virtual Port Channel (VPC). There is plenty of good documentation about VPC, including the downloadable design guide.
Sometimes you need to look under the covers and see exactly how each physical port is utilized by port channels. This note highlights a few useful commands.

Strictly speaking, port channel does load sharing, not load balancing, so there should be no expectation of a 50/50 split. How well load sharing works largely depends on the traffic mix and the hashing method selected. For example, the Nexus 1000v supports 17 hashing algorithms to load-share traffic across the physical interfaces in a PortChannel, including source-based hashing and flow-based hashing. The default is source-mac.
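
If the default does not fit your traffic pattern, the hash can be changed globally. On the 7k it looks roughly like the line below; the exact keywords differ between the 7k, 5k and 1000v, so check the configuration guide for your platform and release:

N-7010-1(config)# port-channel load-balance ethernet source-dest-ip-vlan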

Another important concept: hashing is unidirectional, determined by the sending party, so there is no guarantee that load sharing will be symmetrical. For illustration, the Nexus 7k and 5k are connected in what is known as "back to back" VPCs (port channel 75). The Nexus 7k has multiple physical connections southbound on the same logical channel; it picks the physical port for each frame based on its local hashing, as illustrated by the green arrow.


Here is the first command, which shows the hashing algorithm used:
show port-channel load-balance
N-7010-1# sh port-c load
Port Channel Load-Balancing Configuration:

System: source-dest-ip-vlan
Port Channel Load-Balancing Addresses Used Per-Protocol:
Non-IP: source-dest-mac
IP: source-dest-ip-vlan

The second command shows how well load sharing is working on your port channels. Note that the statistics are cumulative and are reset by clearing the corresponding interface counters.
show port-channel traffic

N-7010-1# show port-c traffic
ChanId Port Rx-Ucst Tx-Ucst Rx-Mcst Tx-Mcst Rx-Bcst Tx-Bcst
------ --------- ------- ------- ------- ------- ------- -------
75 Eth1/5 53.16% 42.58% 49.91% 53.70% 44.03% 44.86%
75 Eth1/6 46.83% 57.41% 50.08% 46.29% 55.96% 55.13%
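
Since the percentages accumulate from the last clear, reset the counters before a test window so the numbers reflect only your test traffic:

N-7010-1# clear counters interface port-channel 75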

As in the diagram, the orange arrow indicates the Nexus 5k's northbound load sharing, based on its own hashing algorithm. The blue arrow indicates the Nexus 1000v's northbound load sharing, based on its hashing.

Since Netflow is not yet supported on the Nexus 5k, how do we tell which physical interface a certain flow will take? Here is the third command:

show port-channel load-balance forwarding-path interface ...

The following example shows how different IP address pairs yield different port selections.
N-5010-1# show port-channel load-balance forwarding-path interface port-channel 75 vlan 25 src-ip 10.174.19.15 dst-ip 10.174.21.15
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 90 Outgoing port id: Ethernet1/3
Param(s) used to calculate load-balance:
dst-ip: 10.174.21.15
src-ip: 10.174.19.15
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000

N-5010-1# show port-channel load-balance forwarding-path interface port-channel 75 vlan 25 src-ip 10.174.19.10 dst-ip 10.174.34.15
Missing params will be substituted by 0's.
Load-balance Algorithm on switch: source-dest-ip
crc8_hash: 179 Outgoing port id: Ethernet1/4
Param(s) used to calculate load-balance:
dst-ip: 10.174.34.15
src-ip: 10.174.19.10
dst-mac: 0000.0000.0000
src-mac: 0000.0000.0000
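
The same lookup should work for non-IP traffic, which per the first command hashes on source-dest-mac; supply MAC addresses instead (the addresses below are made up for illustration):

N-5010-1# show port-channel load-balance forwarding-path interface port-channel 75 vlan 25 src-mac 0000.1111.aaaa dst-mac 0000.2222.bbbb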

If you are running a test and notice that the traffic is not well balanced, you now have a method to check the physical port allocation for your particular endpoints. You also have the option of experimenting with different hashing algorithms to suit your needs.