Sunday, November 25, 2012

Another look at Flow Control in the cloud

I posted about Cisco Flow Control with NetApp NAS during the first round of cloud implementation almost two years ago. Paul commented on updates in the NetApp documentation, so now is a good time for an update, with a fresh look at the general use of flow control.

Let’s start with NetApp’s “Ethernet Storage Best Practices”, which recommends:
  1. Do not enable flow control throughout the network
  2. Set storage to “send” flow control, set the switch to “receive”
We can all agree on the first point. 802.3x Ethernet Flow Control has not been widely adopted in practice, due to implementation complexity and hardware dependency. Higher-layer mechanisms, such as TCP windowing, are more predictable and effective for end-to-end flow control.
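To see why windowing handles congestion end to end, here is a toy sketch (not the actual TCP implementation; names and numbers are illustrative) of the idea: the sender may only have a window's worth of unacknowledged bytes in flight, so a slow receiver paces the sender without any link-level PAUSE frames.

```python
# Toy model of window-based flow control: the sender never exceeds the
# receiver's advertised window, so the receiver's drain rate throttles
# the transfer end-to-end.

def simulate(total_bytes, window, recv_per_tick):
    """Return the number of ticks needed to deliver total_bytes."""
    in_flight = 0      # bytes sent but not yet consumed by the receiver
    delivered = 0
    ticks = 0
    while delivered < total_bytes:
        # Sender fills the advertised window.
        can_send = min(window - in_flight,
                       total_bytes - delivered - in_flight)
        in_flight += can_send
        # Receiver drains at its own pace; ACKs reopen the window.
        drained = min(recv_per_tick, in_flight)
        in_flight -= drained
        delivered += drained
        ticks += 1
    return ticks

# A receiver draining 1,000 bytes per tick paces the whole transfer,
# no matter how fast the sender could push data.
print(simulate(total_bytes=10_000, window=4_000, recv_per_tick=1_000))
# -> 10
```

The point of the sketch is that the throttling decision lives at the endpoints, which is exactly the "higher in the network stack" behavior the NetApp guidance favors.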

So does the second point still make sense? Let’s revisit the basic feedback concept in 802.3x: upon receiving a PAUSE frame, the sender stops transmitting new packets until the receiver is ready to accept them again. NetApp here assumes that there is more buffer available on the switch side, so the receiver (storage) signals the sender (switch) to hold packets in its buffer until it is ready to receive more. In today’s high-speed 10G networks, the opposite is often true: there is typically more buffer space on the storage side than on the switch side.
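For concreteness, the pause_time field in an 802.3x PAUSE frame is expressed in quanta of 512 bit times, with a maximum value of 65535, so the duration of a pause depends on link speed. A quick calculation:

```python
# 802.3x PAUSE timing: pause_time is measured in quanta of 512 bit
# times, so one quantum shrinks as the link gets faster, and so does
# the maximum pause a single frame can request.

QUANTUM_BITS = 512
MAX_QUANTA = 65535

def quantum_seconds(link_bps):
    return QUANTUM_BITS / link_bps

def max_pause_seconds(link_bps):
    return MAX_QUANTA * quantum_seconds(link_bps)

for bps in (1e9, 10e9):
    print(f"{bps / 1e9:.0f}G: quantum = {quantum_seconds(bps) * 1e9:.1f} ns, "
          f"max pause = {max_pause_seconds(bps) * 1e3:.2f} ms")
# -> 1G: quantum = 512.0 ns, max pause = 33.55 ms
# -> 10G: quantum = 51.2 ns, max pause = 3.36 ms
```

So at 10G a single PAUSE frame can halt a link for at most a few milliseconds, which frames the buffer question in the next paragraph.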

Cisco’s Nexus 5k is a typical access switch for connecting storage. On the Nexus 5500 platform, the packet buffer per port is 640KB, up from 480KB on the previous-generation Nexus 5000. At 1G/10G speeds, a buffer of that size only allows the link to be paused for a short time, typically measured in microseconds. The question is, at what point does the storage receiver need to send PAUSE? Doing so unnecessarily and pushing the bottleneck onto the switch side is unpredictable at best, and may do more harm than good.
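A rough estimate of how much paused traffic the switch buffer can absorb, assuming the full per-port buffer is available to the flow and traffic arrives at line rate:

```python
# How long can a per-port buffer absorb line-rate traffic while the
# downstream receiver has paused the link? buffer_bytes at link_bps.

def buffer_absorb_us(buffer_bytes, link_bps):
    """Microseconds of line-rate traffic the buffer can hold."""
    return buffer_bytes * 8 / link_bps * 1e6

print(f"{buffer_absorb_us(640 * 1024, 10e9):.0f} us")  # Nexus 5500 at 10G
print(f"{buffer_absorb_us(480 * 1024, 10e9):.0f} us")  # Nexus 5000 at 10G
# -> 524 us
# -> 393 us
```

Roughly half a millisecond of headroom at 10G: far less than the maximum pause a single PAUSE frame can request, which is why leaning on the switch buffer is a fragile strategy.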

Another NetApp document, “NetApp storage best practices for VMware vSphere”, makes this clearer by stating “NetApp recommends turning off flow control and allowing congestion management to be performed higher in the network stack”.

A newer development in flow control is PFC. Approved in June 2011, Priority-based Flow Control (PFC) was standardized as IEEE 802.1Qbb; it extends the 802.3x PAUSE mechanism on point-to-point links to support multiple priorities. Designed to enable “lossless” Ethernet (a key component of DCB), its implementation is highly vendor- and hardware-specific. For example, Cisco’s Nexus 5k has a unique buffer-allocation implementation to support PFC. For now it is a special-purpose technology whose success is largely tied to that of FCoE.
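The per-priority idea can be sketched from the fields a PFC frame carries: an 8-bit class-enable vector plus eight 16-bit pause timers, in the same 512-bit-time quanta as 802.3x. The function name below is illustrative, not from any library:

```python
# Sketch of 802.1Qbb PFC semantics: only priorities flagged in the
# class-enable vector are paused; the other classes keep flowing.
# This is the "lossless per traffic class" building block of DCB.

def pfc_pause_map(enable_vector, quanta):
    """Map priority -> pause quanta for priorities set in the vector."""
    assert 0 <= enable_vector < 256 and len(quanta) == 8
    return {prio: quanta[prio]
            for prio in range(8)
            if enable_vector & (1 << prio)}

# Pause only priority 3 (commonly the FCoE class) for the maximum
# duration, leaving the other seven priorities unaffected.
print(pfc_pause_map(0b00001000, [0, 0, 0, 65535, 0, 0, 0, 0]))
# -> {3: 65535}
```

Contrast this with plain 802.3x, where a single PAUSE halts everything on the link: with PFC, storage traffic can be made lossless without stalling ordinary IP traffic sharing the same wire.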

I will summarize the revised recommendations as:
  • Turn off flow control unless explicitly requested by endpoint vendors
  • For FCoE and specialized lossless requirements, implement 802.1Qbb Priority-based Flow Control (PFC) only when necessary and when supported by hardware
