AWS is continuously enhancing its platform and adding new features. However, a number of fundamental networking features that have been discussed for a while are, based on recent interactions with the AWS team, still not on the roadmap.
Here are three of those features high on my list, and why.
1. Multi-Path Routing (ECMP)
Currently, an AWS route table does not allow multiple routes to the same destination. For example, the default route in a private route table can only point to a single target, which can be a single point of failure.
If ECMP were supported, users would have many more load sharing and resiliency options. For example, I could define multiple default routes pointing to redundant, load-sharing gateways in multiple zones.
However, users would still need to keep those routes up to date when the target instances change. This can be done by keeping the ENI persistent and reattaching it to new instances, or by triggering a Lambda function to update routes when an instance is refreshed.
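As a minimal sketch of the route-maintenance logic such a Lambda function might run, the pure-Python function below only decides which routes need a new target. The ENI IDs and the function name are hypothetical; in a real function the resulting plan would still have to be applied through the EC2 ReplaceRoute API, which is not shown here.

```python
def plan_route_updates(routes, healthy_enis):
    """Return {destination: replacement_eni} for routes whose target ENI is unhealthy.

    routes: dict mapping destination CIDR -> ENI ID currently used as the target.
    healthy_enis: set of ENI IDs attached to gateways that pass health checks.
    """
    if not healthy_enis:
        return {}  # nothing healthy to fail over to; leave routes untouched
    replacement = sorted(healthy_enis)[0]  # deterministic choice among healthy targets
    return {
        destination: replacement
        for destination, eni in routes.items()
        if eni not in healthy_enis
    }


# Hypothetical IDs: the default route points at a gateway that was just replaced.
current_routes = {"0.0.0.0/0": "eni-old", "10.1.0.0/16": "eni-b"}
print(plan_route_updates(current_routes, {"eni-a", "eni-b"}))
# {'0.0.0.0/0': 'eni-a'}
```

Because a route table holds only one target per destination, the plan replaces the target rather than adding a second one, which is exactly the limitation ECMP would remove.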
2. ELB as Route Table target
Supporting a load balancer as a routing target may not seem natural as a network solution; there needs to be an internal implementation that forwards traffic to the resolved load balancer and the instances behind it.
This capability would allow users to fully benefit from the scalability and resiliency of the load balancer, and to have "native" high availability without a self-maintained layer of Lambda checks and actions.
Azure shows that this can be done: a User Defined Route (UDR) can point to an Azure Load Balancer (ALB). This enables the route table to send traffic to a cluster of gateway nodes behind the load balancer, which leads to simple and elegant resiliency.
3. Native Transit VPC
In large-scale enterprise use of AWS, as the number of VPCs goes up, a transit VPC can really help to scale by consolidating connectivity. Currently, there is a Cisco CSR based solution. But any third-party appliance requires maintenance overhead and introduces bottlenecks.
The ideal solution would be AWS-enabled transit that users can define themselves, much like peering connections.
I hope these requirements are echoed by the user community.
Showing posts with label routing. Show all posts
Saturday, March 18, 2017
Friday, May 18, 2012
BGP RIB-failure and effect on route advertisement
When examining the routes advertised to a BGP neighbor, notice that some routes are tagged with “r”:
rtr1#sh ip bgp neighbor 10.11.19.21 advertised
BGP table version is 1735468, local router ID is 10.115.254.254
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, x best-external, f RT-Filter, a additional-path
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network            Next Hop         Metric LocPrf Weight Path
r>i10.115.254.0/30    10.115.254.9          0    100      0 i
*> 10.115.254.0/23    10.115.254.9         21         32768 i
r>i10.115.254.4/30    10.115.254.9          0    100      0 i
r>i10.115.254.8/30    10.115.254.9          0    100      0 i
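Note that “r” stands for BGP “RIB-failure”, which indicates that BGP failed to install the route in the routing table. According to this link, the likely cause is that the route is already installed by an IGP, which has a lower AD. As the output shows, RIB-failure routes are still advertised to the neighbor.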
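To pick the RIB-failure entries out of a long advertised-routes listing, a small script can check the status columns. This is a sketch that assumes each route sits on a single line with the status codes in the first three columns, as in the sample rows above; the function name is hypothetical.

```python
def rib_failure_prefixes(lines):
    """Return the prefixes whose status columns contain 'r' (RIB-failure)."""
    prefixes = []
    for line in lines:
        status, fields = line[:3], line[3:].split()
        # Only accept rows whose first field looks like a network (starts with a digit),
        # which skips legend lines such as the status code descriptions.
        if "r" in status and fields and fields[0][0].isdigit():
            prefixes.append(fields[0])
    return prefixes


sample = [
    "   Network            Next Hop         Metric LocPrf Weight Path",
    "r>i10.115.254.0/30    10.115.254.9          0    100      0 i",
    "*> 10.115.254.0/23    10.115.254.9         21         32768 i",
    "r>i10.115.254.4/30    10.115.254.9          0    100      0 i",
    "r>i10.115.254.8/30    10.115.254.9          0    100      0 i",
]
print(rib_failure_prefixes(sample))
# ['10.115.254.0/30', '10.115.254.4/30', '10.115.254.8/30']
```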
Labels: BGP, BGP route advertisement, RIB failure, RIB-failure, routing
Thursday, December 29, 2011
Beware of unpredictable OSPF to OSPF Mutual Redistribution
Just as it may be necessary to run multiple OSPF processes, it may also be necessary to redistribute routes between them. What happens when two OSPF processes redistribute routes to each other? The results can be quite surprising.
As shown in the illustration, R1 runs two OSPF processes. R1 learns the network 10.0.0.0/24 from both routing processes. From OSPF 1 (left), R1 learns it as an inter-area route. From OSPF 2 (right), R1 learns it as an external route. Which direction would R1 prefer?
A simple test demonstrates how unpredictable the results can be. After shutting down the interface towards OSPF 1 and turning it back on, R1 prefers the E1 route to reach 10.0.0.0/24 via OSPF 2. Subsequently, after resetting the interface towards OSPF 2, R1 prefers the inter-area route to reach 10.0.0.0/24 via OSPF 1.
OSPF’s preference of intra-area over inter-area over external applies only to routes learned via the same process; it does not apply to routes learned from multiple OSPF processes.
So what determines preference between routing processes? It’s administrative distance (AD). Because the default AD is the same for all OSPF processes, the results are unpredictable and can be “swung” by resetting interfaces.
When administrative distances are equal, the process that first installs the route in the routing table wins, regardless of metric and route type.
How to make it deterministic? The key is obviously AD. But to apply AD properly as a solution, the desired behavior must be clearly defined. First, identify potentially overlapping networks, that is, networks that can be advertised by both processes. Next, decide how the network should behave for those networks.
If all overlapping networks should prefer one process, for example, OSPF 1 inter-area should be preferred over OSPF 2 E1, then the AD of OSPF 2 external routes can be increased:
router ospf 2
 distance ospf external 120
The diagram illustrates the result of the fix: routes from OSPF 1 are now preferred because the AD is deterministic.
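The behavior before and after the fix can be sketched with a toy routing table. This is a simplified model, not how IOS is implemented: the `Rib` class and `winner` helper are hypothetical, and the only rule modeled is that an installed route is replaced only by an offer with a strictly lower AD.

```python
class Rib:
    """Toy routing table: one installed route per prefix, chosen by AD only."""

    def __init__(self):
        self.installed = {}  # prefix -> (process, ad)

    def offer(self, prefix, process, ad):
        current = self.installed.get(prefix)
        # An installed route is only replaced by a strictly lower AD,
        # so with equal ADs whichever process arrived first keeps the slot.
        if current is None or ad < current[1]:
            self.installed[prefix] = (process, ad)


def winner(arrival_order, ad_by_process, prefix="10.0.0.0/24"):
    rib = Rib()
    for process in arrival_order:
        rib.offer(prefix, process, ad_by_process[process])
    return rib.installed[prefix][0]


equal_ad = {"ospf1": 110, "ospf2": 110}   # defaults: order of arrival decides
print(winner(["ospf1", "ospf2"], equal_ad))  # ospf1
print(winner(["ospf2", "ospf1"], equal_ad))  # ospf2

tuned_ad = {"ospf1": 110, "ospf2": 120}   # after "distance ospf external 120"
print(winner(["ospf1", "ospf2"], tuned_ad))  # ospf1
print(winner(["ospf2", "ospf1"], tuned_ad))  # ospf1
```

With equal ADs the answer flips with arrival order, which matches the interface-reset test; with the raised AD on OSPF 2 the answer no longer depends on order.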
If the desired behavior is specific to certain networks, then AD must be selectively adjusted using a filter list. AD may also need to be adjusted on both OSPF processes to arrive at the desired preference for specific networks.
The end result should always be deterministic and predictable, verified by tests in both normal and failure scenarios.
As a side note, before Cisco bug ID CSCdw10987 (integrated in Cisco IOS Software Releases 12.2(07.04)S, 12.2(07.04)T, and later), the last process to run the shortest path first (SPF) algorithm would win, and the two processes would overwrite each other's routes in the routing table. Now, if a route is installed via one process, it is not overwritten by another OSPF process with the same administrative distance (AD), unless the route is first deleted from the routing table by the process that initially installed it.
Labels: Administrative Distance, mutual redistribution, OSPF, routing
Saturday, December 11, 2010
OSPF EIGRP BGP dual point mutual redistribution - Part 3
In part 1 and part 2 of this series, we mainly focused on OSPF and EIGRP mutual redistribution. We used administrative distance to control preference, and tags to prevent loops.
In a typical enterprise WAN environment, carrier MPLS can also be used to carry traffic between sites and data centers. In a resilient architecture, there are multiple paths to the same destination. The business requirement may be such that certain traffic should take one type of link as its primary path, while still having a backup path in case of failure.
In the illustration, MPLS provides the WAN backup path for direct facilities (OSPF-EIGRP). BGP is used as the dynamic routing protocol through the MPLS cloud.
Recall from part 2 that tag 25 is used to mark routes originated in OSPF, so that they are prevented from feeding back from EIGRP into OSPF. Why is there a third issue with BGP? Because the same route is also advertised out of OSPF to MPLS via BGP. The data center running EIGRP will learn the same route from the MPLS cloud as a BGP route, in this case not tagged. At the EIGRP-to-OSPF redistribution point, the tag filter does not stop feedback from a route learned via BGP. As long as the east coast has a feasible successor (one whose reported distance is lower than the current best FD), this route will be advertised to the west coast with an EIGRP external AD of 100, thus preventing redistribution.
This is an example of a network with the feedback issue. Note that the update tagged “1979” is sent due to the better next hop FD. The end result is that a network originated from the west coast and advertised out to MPLS is advertised back from the east coast over EIGRP, preventing the desired redistribution from OSPF into EIGRP.
East-RTR1#sh ip eigrp top 172.31.44.0 255.255.254.0
EIGRP-IPv4 (AS 100): Topology default(0) entry for 172.31.44.0/23
State is Passive, Query origin flag is 1, 2 Successor(s), FD is 30464
Routing Descriptor Blocks:
10.48.137.101 (GigabitEthernet0/0/1), from 10.48.137.101, Send flag is 0x0
Composite metric is (30464/30208), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1030 microseconds
Reliability is 255/255
Load is 31/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 168.147.152.166
AS number of route is 1
External protocol is OSPF, external metric is 20
Administrator tag is 250 (0x000000FA)
10.48.138.101 (GigabitEthernet0/0/2), from 10.48.138.101, Send flag is 0x0
Composite metric is (30464/30208), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1030 microseconds
Reliability is 255/255
Load is 2/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 168.147.152.166
AS number of route is 1
External protocol is OSPF, external metric is 20
Administrator tag is 250 (0x000000FA)
10.250.32.205 (GigabitEthernet0/2/0), from 10.250.32.205, Send flag is 0x0
Composite metric is (32768/7168), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 1120 microseconds
Reliability is 255/255
Load is 2/255
Minimum MTU is 1500
Hop count is 3
External data:
Originating router is 10.250.248.1
AS number of route is 64601
External protocol is BGP, external metric is 0
Administrator tag is 1979 (0x0000369B)
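The feasibility condition can be checked against the three paths in the output above. This is a simplified sketch: it takes FD to be the lowest current distance (on a router, FD is tracked historically), and the function name is hypothetical. Each path is given as (computed distance, reported distance), copied from the "Composite metric" lines.

```python
def classify_paths(paths):
    """paths: next hop -> (computed distance, reported distance)."""
    fd = min(distance for distance, _ in paths.values())
    result = {}
    for next_hop, (distance, reported) in paths.items():
        if distance == fd:
            result[next_hop] = "successor"
        elif reported < fd:
            # Feasibility condition: the neighbor's reported distance is below FD.
            result[next_hop] = "feasible successor"
        else:
            result[next_hop] = "not feasible"
    return fd, result


paths = {
    "10.48.137.101": (30464, 30208),  # OSPF-originated, tag 250
    "10.48.138.101": (30464, 30208),  # OSPF-originated, tag 250
    "10.250.32.205": (32768, 7168),   # BGP-originated, tag 1979
}
fd, roles = classify_paths(paths)
print(fd)     # 30464
print(roles)  # two successors, and 10.250.32.205 as a feasible successor
```

The BGP-learned path is not the best path, but its reported distance of 7168 is below the FD of 30464, so it qualifies as a feasible successor.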
The fix is to also set tag 25 on routes coming from the west coast data center (identified by originating AS number) at the redistribution point from BGP to EIGRP. These tagged routes can then be prevented from “feeding back” on the EIGRP link with a route map. The route map can be applied with a distribute-list on the EIGRP interface.
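The two steps of the fix, tagging at the BGP-to-EIGRP redistribution point and filtering on the EIGRP link, can be modeled as follows. This is only an illustration of the logic: the AS numbers, the route dictionaries, and the function names are hypothetical, and on the routers this would be done with route maps.

```python
OSPF_ORIGIN_TAG = 25
WEST_COAST_AS = 64512  # hypothetical originating AS of the west coast data center


def redistribute_bgp_into_eigrp(route):
    """Copy the route, tagging it if it originated in the west coast AS."""
    tagged = dict(route)
    if route["origin_as"] == WEST_COAST_AS:
        tagged["tag"] = OSPF_ORIGIN_TAG
    return tagged


def eigrp_link_filter(routes):
    """Drop routes carrying the OSPF-origin tag so they cannot feed back."""
    return [r for r in routes if r.get("tag") != OSPF_ORIGIN_TAG]


bgp_routes = [
    {"prefix": "172.31.44.0/23", "origin_as": WEST_COAST_AS},  # west coast network
    {"prefix": "10.103.2.0/24", "origin_as": 64999},           # hypothetical remote office
]
eigrp_routes = [redistribute_bgp_into_eigrp(r) for r in bgp_routes]
print([r["prefix"] for r in eigrp_link_filter(eigrp_routes)])
# ['10.103.2.0/24']
```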
Wednesday, June 9, 2010
OSPF EIGRP BGP dual point mutual redistribution - Part 2
In part 1 of this series, we solved the feedback on the OSPF side, but an issue in the opposite direction still exists. OSPF routes are redistributed into EIGRP on R1; R2 learns them from R1 via EIGRP and prefers them (since we lowered the EIGRP EXT AD to fix issue 1). Because R2 now prefers the feedback route, it will not redistribute these OSPF routes into EIGRP.
First, between R1 and R2, stop EIGRP advertising routes redistributed from OSPF.
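router eigrp 1
 distance eigrp 90 100
 distribute-list route-map Deny_routes_from_OSPF in
route-map Deny_routes_from_OSPF deny 10
 match tag [ospf tags]
! Permit everything else; a route map ends with an implicit deny
route-map Deny_routes_from_OSPF permit 20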
As a best practice, also use tags to stop feedback at the redistribution points, by blocking OSPF-originated routes from being redistributed back into OSPF.
route-map redist_EIGRP-to-OSPF deny 10
 match tag [ospf tags]
route-map redist_EIGRP-to-OSPF permit 20
 ...
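How a route map with a deny clause followed by a permit clause treats tagged and untagged routes can be sketched in a few lines. This is a simplified model that matches on tags only; the clause representation is hypothetical, and tag 25 stands in for the [ospf tags] placeholder as an assumption.

```python
def evaluate_route_map(clauses, route):
    """Return True if the route is permitted.

    clauses: ordered list of (action, match_tags); match_tags=None matches any route.
    The first matching clause decides; if nothing matches, the route is denied.
    """
    for action, match_tags in clauses:
        if match_tags is None or route.get("tag") in match_tags:
            return action == "permit"
    return False  # implicit deny at the end of every route map


OSPF_TAGS = {25}  # assumed value for the [ospf tags] placeholder
redist_eigrp_to_ospf = [("deny", OSPF_TAGS), ("permit", None)]

print(evaluate_route_map(redist_eigrp_to_ospf, {"prefix": "10.1.0.0/16", "tag": 25}))  # False
print(evaluate_route_map(redist_eigrp_to_ospf, {"prefix": "10.2.0.0/16"}))             # True

# A map with only the deny clause blocks untagged routes too.
print(evaluate_route_map([("deny", OSPF_TAGS)], {"prefix": "10.2.0.0/16"}))            # False
```

The last line shows why the trailing permit clause matters: without it, the implicit deny drops every route, not just the tagged ones.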
What about routes learned from MPLS? Those are typically remote offices, which we don’t want to be redistributed from OSPF.
Consider an MPLS remote network, 10.103.2.0/24. The EIGRP side gets it from BGP as EIGRP EXT. The OSPF side also gets it from BGP. Could the redistributing router prefer EIGRP (since we set a lower AD)? The router has two paths (left and right) to go out to the WAN to reach the remote site. We want it to use the “local” MPLS exit point, which in this case is through OSPF.
This is where the EIGRP metric becomes important. Note that EIGRP must have at least a default metric for redistribution to happen. The router is learning the same route from both BGP and OSPF. At the point of redistribution, the metric assigned to routes redistributed from OSPF into EIGRP should be lower than the EIGRP metric of the route coming the other way (typically achieved by setting a lower bandwidth on WAN links). As a result, the redistribution router prefers the locally redistributed routes from OSPF.
Redistribution-rtr-1#sh ip eigrp top 10.103.2.0 255.255.255.0
EIGRP-IPv4 (AS 100): Topology default(0) entry for 10.103.2.0/24
State is Passive, Query origin flag is 1, 1 Successor(s), FD is 4352
Routing Descriptor Blocks:
168.147.152.161, from Redistributed, Send flag is 0x0
Composite metric is (4352/0), Route is External
Vector metric:
Minimum bandwidth is 625000 Kbit
Total delay is 10 microseconds
Reliability is 255/255
Load is 1/255
Minimum MTU is 1500
Hop count is 0
External data:
Originating router is 10.250.248.7 (this system)
AS number of route is 1
External protocol is OSPF, external metric is 7
Administrator tag is 200 (0x000000C8)
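The FD in this output can be reproduced from the vector metric using the classic EIGRP composite formula with default K values (only bandwidth and delay count). The sketch below assumes those defaults; the second call uses made-up WAN values purely to show how a lower bandwidth raises the metric.

```python
def eigrp_metric(min_bandwidth_kbit, total_delay_usec):
    """Classic EIGRP composite metric with default K values (K1 = K3 = 1).

    Bandwidth term: 10^7 divided by the minimum bandwidth in Kbit.
    Delay term: total delay expressed in tens of microseconds.
    """
    return 256 * (10_000_000 // min_bandwidth_kbit + total_delay_usec // 10)


# Values from the output above: locally redistributed from OSPF.
print(eigrp_metric(625000, 10))    # 4352

# Hypothetical path across a WAN link configured with a lower bandwidth.
print(eigrp_metric(100000, 1000))  # 51200
```

The locally redistributed route wins because 4352 is far below what any path crossing a lower-bandwidth WAN link would compute.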
But MPLS does cause an additional feedback issue, which we will cover in part 3.
Monday, June 7, 2010
OSPF EIGRP BGP dual point mutual redistribution - Part 1
It is still a common design to use an IGP in the enterprise core. For a large enterprise, the requirement to integrate OSPF with EIGRP may be the result of mergers and acquisitions.
Although it may look like a CCIE bootcamp lab, mutual redistribution can be even more challenging in a production environment, due to these factors:
-As will be shown, dual-router mutual redistribution can be very tricky
-"Backdoor" paths such as multiple MPLS WAN networks add more complexity
-Having multiple data centers, and the need to route different traffic in different failure scenarios
As I worked through the issues in a very large enterprise environment, we saw multiple issues causing a chain reaction, making the symptoms very difficult to diagnose. Sometimes, fixing one issue introduces new ones. Unless we have an absolutely crisp grasp of the design and a systematic approach, the chance of confusion is extremely high.
This is the first of a 3 part series, in which I share some basics and provide a logical breakdown of the interacting issues into separate and manageable pieces.
The much simplified diagram shows dual mutual redistribution points (blue arrow). So what is the issue? With dual-router mutual redistribution, feedback can occur in both directions, resulting in inconsistency between the two redistribution points, sub-optimal routing, and even potential routing loops.
Issue 1: EIGRP->OSPF (feedback from OSPF)
This only applies to EIGRP EXT routes, which have a higher AD (170) than OSPF (110). These routes are redistributed into OSPF via R1. R2 learns them from R1 via OSPF, and prefers them. Therefore R2 will not redistribute these routes from EIGRP to OSPF, resulting in only one path being used. Any EIGRP EXT route that doesn’t exist in OSPF will show up on one of the routers as preferring OSPF due to the lower admin distance, breaking EIGRP->OSPF redistribution on that router. This means all traffic will be directed to one side.
This issue is resolved by setting the EIGRP EXT AD lower than OSPF (distance eigrp 90 100).
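The effect of the AD change on R2 can be sketched as a simple comparison. This is an illustrative model under one assumption: a router only redistributes a route from EIGRP into OSPF if the EIGRP version is the one installed in its routing table. The function names are hypothetical.

```python
def installed_source(candidates):
    """candidates: source -> administrative distance; the lowest AD is installed."""
    return min(candidates, key=candidates.get)


def r2_redistributes_eigrp_into_ospf(eigrp_ext_ad, ospf_ad=110):
    # R2 hears the same prefix as EIGRP EXT and, via R1's redistribution, as OSPF.
    candidates = {"eigrp-ext": eigrp_ext_ad, "ospf": ospf_ad}
    return installed_source(candidates) == "eigrp-ext"


print(r2_redistributes_eigrp_into_ospf(170))  # False: default AD, the OSPF feedback wins
print(r2_redistributes_eigrp_into_ospf(100))  # True: after "distance eigrp 90 100"
```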
Note there is a side effect: the redistribution router will always prefer the EIGRP path (due to the lower EIGRP AD). But within OSPF, routes redistributed from EIGRP will have a higher metric (set with the redistribution route map), so there is no risk of disturbing preference within OSPF.
Setting the OSPF EXT AD (distance ospf ext 200) is similar. Both solutions will introduce issue 2, which we will cover in part 2.