Monday, July 4, 2011

Data Center ISP Load Sharing Part 4 – Tuning

Does a full Internet routing table work best for load sharing across multi-homed ISP connections? As the earlier parts of this series showed, it often does not.

Part 1 of this series described the challenges of a dual-ISP design: the traditional approach of outbound load sharing based on the entire Internet routing table depends largely on the particular ISPs.

Part 2 showed the advantages of a simple load sharing design based on default routes.

Part 3 introduced a design that combines the simplicity of default-route load sharing across two ISPs with the flexibility of selectively filtering in subsets of Internet routes for optimal path selection.

In this final part we look at why and how the results should be fine-tuned.

At the initial design stage, you can estimate the number of routes each filter will allow in from the respective ISP, using the BGP regular expressions you designed. These route count estimates provide the basis for your filtering design. For example, you might count approximately 50000 routes allowed in by a filter that matches only networks adjacent to a tier one ISP, and approximately 40000 routes allowed in by another filter that matches networks adjacent to, or one hop away from, a tier two ISP.
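The estimate can be rehearsed offline before any filter is applied. The following is a minimal sketch, assuming you have exported the AS paths received from an ISP as space-separated strings. The AS numbers (taken from the range reserved for documentation), the sample paths, and the two patterns are all hypothetical, and the patterns use Python regular expression syntax rather than router syntax. It also assumes no AS path prepending, which would make a nearby network look further away.

```python
import re

def count_allowed(as_paths, pattern):
    """Count routes whose whole AS path matches the filter pattern."""
    regex = re.compile(pattern)
    return sum(1 for path in as_paths if regex.fullmatch(path))

# Hypothetical AS paths as received from an ISP whose AS number is 64500.
sample_paths = [
    "64500",                    # originated by the ISP itself
    "64500 64501",              # adjacent network
    "64500 64502",              # adjacent network
    "64500 64501 64503",        # one hop beyond an adjacent network
    "64500 64502 64504 64505",  # two hops beyond, outside both filters
]

# Hypothetical filters: the ISP plus at most one (or two) further AS hops.
ADJACENT_ONLY = r"64500( \d+)?"
ADJACENT_PLUS_ONE_HOP = r"64500( \d+){0,2}"

adjacent_count = count_allowed(sample_paths, ADJACENT_ONLY)
plus_one_count = count_allowed(sample_paths, ADJACENT_PLUS_ONE_HOP)
print(adjacent_count, plus_one_count)
```

Run against a full export from each ISP, the same function would give the kind of per-filter counts used in the example above.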

After implementation, you will notice that the actual number of specific networks allowed in is less than the combined total of 50000 plus 40000. The actual number (for example, 80000) is lower than the sum of 90000 because of duplicates. In other words, you learn the same 10000 routes from both ISPs because those networks match both filters. This is common and to be expected.
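The duplicate count is simply the overlap between the two sets of prefixes. Here is a small sketch of that arithmetic, using integers as stand-ins for real prefixes; the function name and the synthetic data are illustrative assumptions, sized to reproduce the example figures.

```python
def summarize_overlap(isp1_prefixes, isp2_prefixes):
    """Return (unique route count, duplicate route count) for two ISPs."""
    isp1, isp2 = set(isp1_prefixes), set(isp2_prefixes)
    duplicates = isp1 & isp2   # prefixes learned from both ISPs
    unique = isp1 | isp2       # distinct prefixes in the routing table
    return len(unique), len(duplicates)

# Synthetic stand-ins: 50000 prefixes from ISP1, 40000 from ISP2,
# with 10000 of them (40000..49999) present in both sets.
isp1_routes = range(0, 50000)
isp2_routes = range(40000, 80000)

unique_count, duplicate_count = summarize_overlap(isp1_routes, isp2_routes)
print(unique_count, duplicate_count)
```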

You might expect the duplicate routes to be split more or less evenly across the two ISPs, but this is often not the case, so the effect of duplicate routes on load sharing requires careful observation. ISPs may operate in different tiers of the Internet hierarchy, which makes the AS paths of the routes they advertise shorter or longer. Because AS path length is a primary criterion in BGP path selection, you will likely see almost all of the duplicate routes favoring one ISP. This may skew load sharing and require further adjustment of the filters.

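A second example is the influence of ISP metrics. Some ISPs advertise routes with a nonzero metric (MED), while others advertise all routes with a metric of zero. When all higher-priority criteria are equal, the route with the lower metric, here zero, is preferred. Note that many BGP implementations compare metrics only between routes from the same neighboring AS by default, so this effect depends on the router being configured to compare metrics across ISPs.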
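The two tie-breakers described above can be illustrated with a deliberately simplified model. This sketch assumes that everything ranked higher in the selection process (such as weight and local preference) is equal, and that metrics are compared across ISPs. The `Route` class, the `best_route` function, and the AS numbers are hypothetical; a real BGP implementation evaluates more steps than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    isp: str        # which ISP advertised this copy of the route
    as_path: tuple  # AS numbers, nearest first
    med: int = 0    # metric advertised by the ISP

def best_route(candidates):
    """Prefer the shortest AS path, then the lowest metric."""
    return min(candidates, key=lambda r: (len(r.as_path), r.med))

# Same destination learned from both ISPs with different AS path lengths.
via_isp1 = Route("ISP1", (64500, 64510))
via_isp2 = Route("ISP2", (64501, 64505, 64510))
shorter_path_winner = best_route([via_isp1, via_isp2])

# Equal AS path lengths: the zero metric wins.
with_metric = Route("ISP1", (64500, 64510), med=50)
zero_metric = Route("ISP2", (64501, 64510), med=0)
lower_metric_winner = best_route([with_metric, zero_metric])

print(shorter_path_winner.isp, lower_metric_winner.isp)
```

If one ISP consistently has the shorter path or the lower metric, it wins nearly every duplicate, which is the skew described above.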

The diagram shows that the original design may have expected a load sharing ratio of 5/4 between ISP1 and ISP2. However, the actual ratio turns out to be 5/3, because all duplicate routes favor ISP1: ISP1 remains the best path for its 50000 routes, while ISP2 is the best path for only 30000 of its 40000. Depending on your specific requirements, fine-tuning of the filters may be necessary.
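The shift from 5/4 to 5/3 can be checked with a few lines of arithmetic. This is a rough sketch that measures load sharing by route count only, which is an assumption, since actual traffic volume per route varies. The function names and the `isp1_share` parameter (the fraction of duplicates won by ISP1) are introduced here for illustration.

```python
from math import gcd

def simplify(a, b):
    """Reduce a pair of counts to its simplest ratio."""
    g = gcd(a, b)
    return (a // g, b // g)

def effective_split(isp1_allowed, isp2_allowed, duplicates, isp1_share=1.0):
    """Count the routes for which each ISP ends up as the best path."""
    won_by_isp1 = round(duplicates * isp1_share)
    won_by_isp2 = duplicates - won_by_isp1
    isp1_best = isp1_allowed - duplicates + won_by_isp1
    isp2_best = isp2_allowed - duplicates + won_by_isp2
    return isp1_best, isp2_best

expected_ratio = simplify(50000, 40000)
actual_split = effective_split(50000, 40000, 10000)  # all duplicates favor ISP1
actual_ratio = simplify(*actual_split)
print(expected_ratio, actual_split, actual_ratio)
```

Changing `isp1_share` shows how far the ratio would move if the filters were adjusted so that some duplicates favored ISP2 instead.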

The design of a redundant Internet architecture is unique to every organization’s requirements, its national or global data center architecture, the ISPs selected, and the nature of its Internet traffic. Hopefully the scenarios described have provided simple templates that you can adjust for your particular data center.
