Showing posts with label routing. Show all posts

Saturday, March 18, 2017

Three Networking features AWS should support

AWS is continuously enhancing and adding new features. However, a number of fundamental networking features have been discussed for a while and, based on recent interactions with the AWS team, are still not on the roadmap.

Here are three of those features high on my list, and why.

1. Multi-Path Routing (ECMP)
Currently, an AWS route table does not allow multiple routes to the same destination. For example, I can only point the default route in a private route table at a single target (which can be a single point of failure).
If ECMP were supported, users would have many load-sharing and resiliency options. For example, I could define multiple default routes pointing to redundant, load-sharing gateways in multiple zones.
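
To make the load-sharing idea concrete, here is a minimal sketch of how ECMP implementations commonly spread traffic: each flow is hashed onto one of the equal-cost next hops, so packets of the same flow stay on the same path. This is not an AWS feature or API; the gateway names and the function `pick_next_hop` are hypothetical and only illustrate the behavior.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops for a flow.

    `flow` is a (src_ip, dst_ip, protocol, src_port, dst_port) tuple. Hashing
    it keeps every packet of the same flow on the same next hop.
    """
    if not next_hops:
        raise ValueError("no next hop available")
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical gateways in two zones, both acting as targets for 0.0.0.0/0.
gateways = ["gateway-zone-a", "gateway-zone-b"]
flow = ("10.0.1.15", "198.51.100.7", "tcp", 49152, 443)

chosen = pick_next_hop(flow, gateways)

# If one gateway is removed from the route, the flow falls onto the survivor.
surviving = pick_next_hop(flow, ["gateway-zone-b"])
```

With two default routes, losing one gateway only shrinks the list of next hops; the remaining route keeps carrying traffic without any route table rewrite.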

However, users would still need to keep those routes up to date if the target instances change. This can be done by keeping the ENI persistent and reattaching it to new instances, or by triggering a Lambda function to update the routes when an instance is refreshed.

2. ELB as Route Table target
Supporting a load balancer as a route target may not seem natural as a network solution; it would need an internal implementation that forwards traffic to the resolved load balancer and the instances behind it.
This type of capability would allow users to fully benefit from the scalability and resiliency of the load balancer, and have "native" high availability without the need for a self-maintained layer of Lambda checks and actions.

Azure shows that this can be done: a User Defined Route (UDR) can point to an Azure Load Balancer (ALB), which lets the route table send traffic to a cluster of gateway nodes behind the load balancer, leading to simple and elegant resiliency.

3. Native Transit VPC
In large-scale enterprise use of AWS, as the number of VPCs goes up, a transit VPC can really help to scale by consolidating connectivity. Currently, there is a Cisco CSR based solution, but any third-party appliance requires maintenance overhead and introduces bottlenecks.

The ideal solution would be AWS-enabled transit that users can define themselves, much like peering connections.

I hope these requirements are echoed by the user community.

Friday, May 18, 2012

BGP RIB-failure and effect on route advertisement


When examining the routes advertised to a BGP neighbor, notice that some routes are tagged with “r”:

rtr1#sh ip bgp neighbor 10.11.19.21 advertised
BGP table version is 1735468, local router ID is 10.115.254.254
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter, a additional-path
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
r>i10.115.254.0/30 10.115.254.9            0    100      0 i
*> 10.115.254.0/23 10.115.254.9           21         32768 i
r>i10.115.254.4/30 10.115.254.9            0    100      0 i
r>i10.115.254.8/30 10.115.254.9            0    100      0 i

Note that “r” means BGP “RIB-failure”, which indicates that BGP failed to install the route in the routing table. According to this link, the likely cause is that the route is already installed by an IGP, which has a lower AD.
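
The AD comparison behind a RIB-failure can be sketched as follows. The distances are the Cisco IOS defaults; the assumption that the competing IGP is OSPF is mine, since the output above does not say which IGP installed the routes, and the function name is hypothetical.

```python
# Default Cisco IOS administrative distances (lower is preferred).
DEFAULT_AD = {
    "connected": 0,
    "static": 1,
    "ebgp": 20,
    "eigrp": 90,
    "ospf": 110,
    "eigrp_external": 170,
    "ibgp": 200,
}

def bgp_rib_failure(bgp_source, competing_sources):
    """Return True if another source with a lower AD also offers the prefix.

    In that case the other source owns the routing table entry and BGP
    flags its own copy of the route as RIB-failure.
    """
    bgp_ad = DEFAULT_AD[bgp_source]
    return any(DEFAULT_AD[source] < bgp_ad for source in competing_sources)

# The "r>i" routes above are internal (iBGP, AD 200). Assume OSPF (AD 110)
# already offers the same prefix.
ibgp_vs_ospf = bgp_rib_failure("ibgp", ["ospf"])

# An eBGP route (AD 20) would win against OSPF instead.
ebgp_vs_ospf = bgp_rib_failure("ebgp", ["ospf"])
```

Here `ibgp_vs_ospf` is true and `ebgp_vs_ospf` is false, which is consistent with only the internal /30 routes carrying the “r” flag.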

Thursday, December 29, 2011

Beware of unpredictable OSPF to OSPF Mutual Redistribution


Just as it may be necessary to run multiple OSPF processes, it may also be necessary to redistribute routes between them. What happens when two OSPF processes redistribute routes to each other? The results can be quite surprising.

As shown in the illustration, R1 runs two OSPF processes. R1 learns a network 10.0.0.0/24 from both routing processes. From OSPF 1 (left) R1 learns it as an inter-area route. From OSPF 2 (right) R1 learns it as an external route. Which direction would R1 prefer?

A simple test demonstrates how unpredictable the results can be. After shutting down the interface towards OSPF 1 and turning it back on, R1 prefers the E1 route to reach 10.0.0.0/24 via OSPF 2. Subsequently, after resetting the interface towards OSPF 2, R1 prefers the inter-area route to reach 10.0.0.0/24 via OSPF 1.

OSPF’s preference of intra-area over inter-area over external applies to routes learned via the same process only; it does not apply to routes learned from multiple OSPF processes.

So what determines preference between routing processes? Administrative distance (AD). Because the default AD is the same for every OSPF process, the results are unpredictable and can be “swung” by resetting interfaces.

When Administrative Distances are equal, the process that first installs the route in the routing table wins, regardless of metric and type.
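
The tie-break rule can be modeled with a toy routing table. This is a simplified sketch of the behavior described above, not router code; the class and helper names are hypothetical.

```python
class RoutingTable:
    """Toy RIB: one entry per prefix, owned by the process that installed it."""

    def __init__(self):
        self.routes = {}  # prefix -> (process, admin_distance)

    def offer(self, prefix, process, admin_distance):
        current = self.routes.get(prefix)
        # Replace only on a strictly lower AD; on a tie the incumbent stays,
        # regardless of route type or metric.
        if current is None or admin_distance < current[1]:
            self.routes[prefix] = (process, admin_distance)

    def withdraw(self, prefix, process):
        current = self.routes.get(prefix)
        if current is not None and current[0] == process:
            del self.routes[prefix]

    def owner(self, prefix):
        return self.routes[prefix][0]

def flap(rib, prefix, flapped, other, admin_distance=110):
    """Reset the interface towards `flapped`: its route goes away, the other
    process installs its copy, then `flapped` comes back and offers again."""
    rib.withdraw(prefix, flapped)
    rib.offer(prefix, other, admin_distance)
    rib.offer(prefix, flapped, admin_distance)

rib = RoutingTable()
prefix = "10.0.0.0/24"
rib.offer(prefix, "ospf 1", 110)   # inter-area route, installed first
rib.offer(prefix, "ospf 2", 110)   # E1 route, same AD, arrives second
after_boot = rib.owner(prefix)

flap(rib, prefix, flapped="ospf 1", other="ospf 2")
after_flap_1 = rib.owner(prefix)

flap(rib, prefix, flapped="ospf 2", other="ospf 1")
after_flap_2 = rib.owner(prefix)
```

The owner moves from OSPF 1 to OSPF 2 and back purely because of timing, which mirrors the interface reset test.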

How do we make it deterministic? The key is obviously AD. But to apply AD properly as a solution, the desired behavior must be clearly defined. First, identify potentially overlapping networks, that is, networks that can be advertised by both processes. Next, decide how routing should behave for those networks.

If all overlapping networks should prefer one process, for example, OSPF 1 inter-area should be preferred over OSPF 2 E1, then the external AD of OSPF 2 can be increased:
router ospf 2
distance ospf external 120

The diagram illustrates the result of the fix: routes from OSPF 1 are now preferred because the AD comparison is deterministic.

If the desired behavior is specific to certain networks, then the AD must be adjusted selectively using a filter list. The AD may also need to be adjusted on both OSPF processes to arrive at the desired preference for each network.

The end results should always be deterministic and predictable, verified using tests in normal and failure scenarios.

As a side note, before the fix for Cisco bug ID CSCdw10987 (integrated in Cisco IOS Software Releases 12.2(07.04)S, 12.2(07.04)T, and later), the last process to run a shortest path first (SPF) calculation would have won, and the two processes would overwrite each other's routes in the routing table. Now, if a route is installed via one process, it is not overwritten by another OSPF process with the same administrative distance (AD), unless the route is first deleted from the routing table by the process that initially installed it.


Saturday, December 11, 2010

OSPF EIGRP BGP dual point mutual redistribution - Part 3

In part 1 and part 2 of the post, we mainly focused on OSPF and EIGRP mutual redistribution. We used administrative distance to control preference, and tags to prevent loops.

In a typical enterprise WAN environment, carrier MPLS can also be used to carry traffic between sites and data centers. In a resilient architecture, there are multiple paths to the same destination. The business requirement may be such that certain traffic should take one type of link as its primary path, while still having a backup path in case of failure.

In the illustration, MPLS provides the WAN backup path for direct facilities (OSPF-EIGRP). BGP is used as the dynamic routing protocol through the MPLS cloud.

Recall from part 2 that tag 25 marks routes originated in OSPF so that they are not fed back from EIGRP into OSPF. Why is there a third issue with BGP? Because the same route is also advertised from the OSPF side to MPLS via BGP. The data center running EIGRP then learns the same route from the MPLS cloud as a BGP route, in this case untagged. At the EIGRP-to-OSPF redistribution point, the tag filter does not stop feedback of a route learned via BGP. As long as the east coast has a feasible successor (one whose reported distance is lower than the current best FD), this route will be advertised to the west coast with an EIGRP external AD of 100, thus preventing redistribution.

Here is an example of a network with the feedback issue. Note that the update tagged “1979” is sent because its next hop reports a better (lower) distance. The end result is that a network originated on the west coast and advertised out through MPLS is advertised back from the east coast over EIGRP, preventing the desired redistribution from OSPF into EIGRP.

East-RTR1#sh ip eigrp top 172.31.44.0 255.255.254.0
EIGRP-IPv4 (AS 100): Topology default(0) entry for 172.31.44.0/23
State is Passive, Query origin flag is 1, 2 Successor(s), FD is 30464
Routing Descriptor Blocks:
10.48.137.101 (GigabitEthernet0/0/1), from 10.48.137.101, Send flag is 0x0
Composite metric is (30464/30208), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1030 microseconds
Reliability is 255/255
Load is 31/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 168.147.152.166
AS number of route is 1
External protocol is OSPF, external metric is 20
Administrator tag is 250 (0x000000FA)
10.48.138.101 (GigabitEthernet0/0/2), from 10.48.138.101, Send flag is 0x0
Composite metric is (30464/30208), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1030 microseconds
Reliability is 255/255
Load is 2/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 168.147.152.166
AS number of route is 1
External protocol is OSPF, external metric is 20
Administrator tag is 250 (0x000000FA)
10.250.32.205 (GigabitEthernet0/2/0), from 10.250.32.205, Send flag is 0x0
Composite metric is (32768/7168), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1120 microseconds
Reliability is 255/255
Load is 2/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 10.250.248.1
AS number of route is 64601
External protocol is BGP, external metric is 0
Administrator tag is 1979 (0x0000369B)


The fix is to also set the tag (25) on routes coming from the west coast data center (identified by originating AS number) at the redistribution point from BGP into EIGRP. These tagged routes can then be prevented from “feedback” on the EIGRP link with a route map, applied as a distribute-list on the EIGRP interface.
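
The effect of the fix can be sketched as a small model. Tag 25 and the prefix come from the post; the AS number 65010, the `Route` type, and both function names are hypothetical and stand in for the actual route-map logic.

```python
from dataclasses import dataclass, replace
from typing import Optional

OSPF_ORIGIN_TAG = 25     # tag value used in this series
WEST_COAST_AS = 65010    # hypothetical originating AS of the west coast site

@dataclass(frozen=True)
class Route:
    prefix: str
    learned_via: str
    origin_as: Optional[int] = None
    tag: int = 0

def redistribute_bgp_into_eigrp(route, tag_west_coast):
    """Model BGP->EIGRP redistribution, optionally tagging west coast routes."""
    tag = route.tag
    if tag_west_coast and route.origin_as == WEST_COAST_AS:
        tag = OSPF_ORIGIN_TAG
    return replace(route, learned_via="eigrp-external", tag=tag)

def accepted_on_eigrp_link(route):
    """Model the distribute-list route map: drop routes carrying the OSPF tag."""
    return route.tag != OSPF_ORIGIN_TAG

west_route = Route("172.31.44.0/23", "bgp", origin_as=WEST_COAST_AS)

# Before the fix the BGP-learned copy is untagged, so it leaks back.
before_fix = accepted_on_eigrp_link(
    redistribute_bgp_into_eigrp(west_route, tag_west_coast=False))

# After the fix the copy carries the OSPF tag and is filtered.
after_fix = accepted_on_eigrp_link(
    redistribute_bgp_into_eigrp(west_route, tag_west_coast=True))
```

Routes from any other AS keep their original tag, so only the west coast prefixes are blocked from feeding back.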

Wednesday, June 9, 2010

OSPF EIGRP BGP dual point mutual redistribution - Part 2

In part 1 of the post, we solved the feedback on the OSPF side, but the issue in the opposite direction still exists. OSPF routes are redistributed into EIGRP on R1; R2 learns them from R1 via EIGRP and prefers them (since we lowered the EIGRP EXT AD to fix issue 1). Because R2 now prefers the feedback route, it will not redistribute these OSPF routes into EIGRP.

First, between R1 and R2, stop EIGRP from advertising routes redistributed from OSPF.
router eigrp 1
  distance eigrp 90 100
  distribute-list route-map Deny_routes_from_OSPF in

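route-map Deny_routes_from_OSPF deny 10
  match tag [ospf tags]
! Without a permit clause, the implicit deny would drop every other route too
route-map Deny_routes_from_OSPF permit 20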

As a best practice, also use the tag to stop feedback at the redistribution points, by blocking OSPF-originated routes from being redistributed back into OSPF.
route-map redist_EIGRP-to-OSPF deny 10
 match tag [ospf tags]
route-map redist_EIGRP-to-OSPF permit 20
...



What about routes learned from MPLS? Those are typically remote offices, which we don’t want to be redistributed from OSPF.

Consider an MPLS remote network, 10.103.2.0/24. The EIGRP side gets it from BGP as EIGRP EXT. The OSPF side also gets it from BGP. Could the redistributing router prefer EIGRP (since we set a lower AD)? The router has two paths (left and right) out to the WAN to reach the remote site. We want it to use the “local” MPLS exit point, which in this case is through OSPF.

This is where the EIGRP metric becomes important. Note that EIGRP must have at least a default metric for redistribution to happen. The router is learning the same route from both BGP and OSPF. At the point of redistribution, the metric assigned to routes redistributed from OSPF into EIGRP should be lower than the EIGRP metric going the other way (typically achieved by setting a lower bandwidth on the WAN links). As a result, the redistribution router prefers the locally redistributed routes from OSPF.
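
As a rough illustration of how bandwidth and delay feed the comparison, here is the classic EIGRP composite metric with default K values (only bandwidth and delay count). The function name is hypothetical; the bandwidth and delay inputs are the ones shown in the output below, and the 1030 microsecond figure is just an example of a longer path.

```python
def eigrp_composite_metric(min_bandwidth_kbps, total_delay_usec):
    """Classic EIGRP metric with default K values (K1 = K3 = 1, others 0)."""
    bandwidth_term = 10_000_000 // min_bandwidth_kbps
    delay_term = total_delay_usec // 10   # delay is counted in tens of microseconds
    return 256 * (bandwidth_term + delay_term)

# Locally redistributed route: 625000 Kbit minimum bandwidth, 10 usec delay.
local_metric = eigrp_composite_metric(625_000, 10)

# Same bandwidth over a longer path with 1030 usec of accumulated delay.
remote_metric = eigrp_composite_metric(625_000, 1030)
```

The local value works out to 4352, matching the FD in the output below, while the longer path comes to 30464, so the locally redistributed route wins on metric.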

Redistribution-rtr-1#sh ip eigrp top 10.103.2.0 255.255.255.0
EIGRP-IPv4 (AS 100): Topology default(0) entry for 10.103.2.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 4352
Routing Descriptor Blocks:
168.147.152.161, from Redistributed, Send flag is 0x0
Composite metric is (4352/0), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 10 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 0
External data:
Originating router is 10.250.248.7 (this system)
AS number of route is 1
External protocol is OSPF, external metric is 7
Administrator tag is 200 (0x000000C8)

But MPLS does cause an additional feedback issue, which we will cover in part 3.

Monday, June 7, 2010

OSPF EIGRP BGP dual point mutual redistribution - Part 1

It is still a common design to use an IGP in the enterprise core. For a large enterprise, the requirement to integrate OSPF with EIGRP may be the result of mergers and acquisitions.

Although it may look like a CCIE bootcamp lab, mutual redistribution can be even more challenging in a production environment, due to these factors:
- As will be shown, dual router mutual redistribution can be very tricky
- "Backdoor" paths such as multiple MPLS WAN networks add more complexity
- Multiple data centers need to route different traffic in different failure scenarios

As I worked through the issues in a very large enterprise environment, we saw multiple issues set off a chain reaction, making the symptoms very difficult to diagnose. Sometimes, fixing one issue introduced new ones. Unless we have an absolutely crisp grasp of the design and a systematic approach, the chance of confusion is extremely high.

This is the first of a three-part series in which I share some basics and break the interacting issues down into separate, manageable pieces.


The much simplified diagram shows dual mutual redistribution points (blue arrow). So what is the issue? With dual router mutual redistribution, feedback can occur in both directions, resulting in inconsistency between the two redistribution points, sub-optimal routing, and even potential routing loops.

Issue 1: EIGRP->OSPF (feedback from OSPF)

This only applies to EIGRP EXT routes, which have a higher AD (170) than OSPF (110). These routes are redistributed into OSPF via R1. R2 learns them from R1 via OSPF and prefers that path. Therefore R2 will not redistribute these routes from EIGRP to OSPF, resulting in only one path being used. Any EIGRP EXT route that doesn’t otherwise exist in OSPF will show up on one of the routers as preferring OSPF due to the lower admin distance, breaking EIGRP->OSPF redistribution on that router. This means all traffic will be directed to one side.

This issue is resolved by setting the EIGRP EXT AD lower than OSPF's (distance eigrp 90 100).
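
Issue 1 and its fix can be sketched as a two-router model. It is deliberately simplified: it assumes R1 redistributes first and that a router only redistributes an EIGRP EXT route when the EIGRP version is the one in its routing table. The function names are hypothetical.

```python
OSPF_AD = 110

def redistributes_from_eigrp(eigrp_ext_ad, hears_route_in_ospf):
    """Does a redistribution router export an EIGRP EXT route into OSPF?

    It only does so if its own routing table holds the EIGRP version,
    i.e. there is no OSPF copy or EIGRP EXT wins the AD comparison.
    """
    if not hears_route_in_ospf:
        return True
    return eigrp_ext_ad < OSPF_AD

def simulate(eigrp_ext_ad):
    # R1 redistributes first, so it has not yet heard the route via OSPF.
    r1 = redistributes_from_eigrp(eigrp_ext_ad, hears_route_in_ospf=False)
    # R2 hears R1's redistributed copy via OSPF before making its decision.
    r2 = redistributes_from_eigrp(eigrp_ext_ad, hears_route_in_ospf=r1)
    return {"R1": r1, "R2": r2}

default_ads = simulate(170)   # EIGRP EXT default AD
lowered_ads = simulate(100)   # after "distance eigrp 90 100"
```

With the default AD only R1 redistributes, so all traffic is drawn to one side; with the external AD lowered to 100, both routers redistribute.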

Note there is a side effect: the redistribution router will always prefer the EIGRP path (due to the lower EIGRP AD). But within OSPF, routes redistributed from EIGRP will have a higher metric (set with the redistribution route map), so there is no risk of disturbing preference within OSPF.
Setting the OSPF EXT AD (distance ospf ext 200) is similar. Both solutions will introduce issue 2, which we will cover in part 2.