Saturday, September 24, 2011

A potential problem with Juniper’s implementation of OSPF Router ID


First of all, this non-compliant behavior is observed only on some Juniper devices, not all.

The effect of the observed behavior is that certain OSPF routes fail to propagate as expected.

Here is the scenario: the SRX originally used the loopback address 10.0.0.11 as its Router ID. When that loopback was deleted, the Router ID changed to a different address, 10.0.17.4. This is expected behavior so far.

srx-node0> show interfaces lo0.1
error: interface lo0.1 not found

srx-node0> show ospf overview instance VR   
Instance: VR
  Router ID: 10.0.17.4

However, a closer look at the SRX OSPF database reveals that it still holds LSAs with 10.0.0.11 (the old Router ID) as the Advertising Router.

srx-node0> show ospf database summary instance VR

    OSPF database, Area 0.0.0.0
 Type       ID          Adv Rtr           Seq      Age  Opt  Cksum  Len
Summary  10.0.17.0    10.0.0.11      0x80000008  1423  0x22 0xc857  28
Summary *10.0.17.0    10.0.17.4      0x80000002   792  0x22 0x8794  28

    OSPF database, Area 0.0.0.69
 Type       ID          Adv Rtr           Seq      Age  Opt  Cksum  Len
Summary  10.0.16.3    10.0.0.11      0x8000001c  2280  0x22 0xfbfc  28
Summary  10.0.16.7    10.0.0.11      0x8000001c  2137  0x22 0xd321  28
Summary  10.0.16.8    10.0.0.11      0x8000001c  1994  0x22 0xbf35  28
Summary  10.0.17.16   10.0.0.11      0x8000001d  3423  0x22 0xfdfc  28
Summary  10.0.17.32   10.0.0.11      0x8000001d  1851  0x22 0xaf2e  28


According to RFC 2328:
If a router's OSPF Router ID is changed, the router's OSPF software should be restarted before the new Router ID takes effect.  In this case the router should flush its self-originated LSAs from the routing domain before restarting

Note that two behaviors are expected when the Router ID changes: 1) OSPF restarts; 2) self-originated LSAs are flushed. When that does not happen on JUNOS, the old Router ID, an address that is no longer valid, remains the Advertising Router in those LSAs.
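To make the expected flush step concrete, here is a minimal Python sketch. It is only an illustrative model, not JUNOS code: the Lsa class and the flush_self_originated function are hypothetical names. The one protocol fact it relies on is that RFC 2328 flushes an LSA by prematurely aging it to MaxAge (3600 seconds) and reflooding it.

```python
from dataclasses import dataclass, replace

MAX_AGE = 3600  # MaxAge from RFC 2328, in seconds


@dataclass(frozen=True)
class Lsa:
    """Hypothetical, simplified LSA header (illustration only)."""
    ls_type: str
    ls_id: str
    adv_rtr: str
    age: int


def flush_self_originated(lsdb, old_router_id):
    """Split the database into LSAs to keep and LSAs to flush.

    Every LSA advertised under old_router_id is copied with its age
    set to MaxAge, which is how a router asks its neighbors to purge it.
    """
    kept, flushed = [], []
    for lsa in lsdb:
        if lsa.adv_rtr == old_router_id:
            flushed.append(replace(lsa, age=MAX_AGE))
        else:
            kept.append(lsa)
    return kept, flushed


# Two entries taken from the Area 0.0.0.0 output above.
lsdb = [
    Lsa("Summary", "10.0.17.0", "10.0.0.11", 1423),
    Lsa("Summary", "10.0.17.0", "10.0.17.4", 792),
]
kept, flushed = flush_self_originated(lsdb, "10.0.0.11")
```

Had the flush happened, only the LSA advertised by 10.0.17.4 would remain in the database.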

Why is that a problem? Because these LSAs will be flooded to neighbors (assuming the router here is an ABR). The neighbor notices the change of Router ID and therefore checks the validity of the Advertising Router.

Again, here is the definition of Advertising Router according to RFC 2328:
This field indicates the Router ID of the router advertising the summary-LSA or AS-external-LSA that led to this path.
 
Since the neighbor sees that the Advertising Router (the old Router ID) no longer matches the new Router ID, it discards the LSA.

When troubleshooting OSPF routing involving Juniper devices, check OSPF databases for invalid entries.
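One way to automate that check is sketched below in Python. It assumes the text of "show ospf database" has already been captured into a string, and that you supply the set of Router IDs you consider valid. The function name find_stale_lsas and the regular expressions are my own and are tuned only to the output format shown earlier in this post.

```python
import re

# Matches an LSA row such as:
#   Summary *10.0.17.0    10.0.17.4      0x80000002   792  0x22 0x8794  28
LSA_ROW = re.compile(
    r"^\s*(?P<type>\S+)\s+\*?(?P<lsa_id>\d+\.\d+\.\d+\.\d+)\s+"
    r"(?P<adv_rtr>\d+\.\d+\.\d+\.\d+)\s+0x[0-9a-fA-F]+\s+(?P<age>\d+)"
)
AREA_ROW = re.compile(r"OSPF database, Area (?P<area>\S+)")


def find_stale_lsas(output, known_router_ids):
    """Return (area, lsa_id, adv_rtr, age) for every LSA whose
    Advertising Router is not in known_router_ids."""
    stale = []
    area = None
    for line in output.splitlines():
        area_match = AREA_ROW.search(line)
        if area_match:
            area = area_match.group("area")
            continue
        row = LSA_ROW.match(line)
        if row and row.group("adv_rtr") not in known_router_ids:
            stale.append(
                (area, row.group("lsa_id"), row.group("adv_rtr"), int(row.group("age")))
            )
    return stale


sample = """
    OSPF database, Area 0.0.0.0
 Type       ID          Adv Rtr           Seq      Age  Opt  Cksum  Len
Summary  10.0.17.0    10.0.0.11      0x80000008  1423  0x22 0xc857  28
Summary *10.0.17.0    10.0.17.4      0x80000002   792  0x22 0x8794  28
"""
for entry in find_stale_lsas(sample, {"10.0.17.4"}):
    print(entry)
```

With the current Router ID 10.0.17.4 as the only known ID, the sample flags the LSA still advertised by 10.0.0.11. In a real network the known set would also include the Router IDs of the other routers in the domain.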

To prevent such pitfalls, always set the Router ID explicitly in OSPF. More importantly, base the Router ID on a loopback address, and make sure that loopback is not accidentally deleted.

Tuesday, September 13, 2011

ERSPAN with Nexus 1000v in a Virtualized Data Center


Encapsulated remote SPAN (ERSPAN) can be used to monitor traffic remotely. In a Nexus 1000v environment, it is not feasible to attach a probe directly to the virtual switch. ERSPAN is therefore particularly valuable for monitoring host traffic: the monitored traffic is routed through the IP network to a designated network analyzer.

A functioning ERSPAN system consists of these components working together:
· Nexus 1000v with a specific port profile and SPAN session
· Host configured to support a monitoring interface
· Destination switch to forward monitoring traffic to the probe

A sample reference model is provided here, using a probe attached to a Nexus 7000 as a common example.
[Figure: ERSPAN - Cisco Networks]

Nexus 1000v
First, choose a routed VLAN (2000) to carry ERSPAN traffic. Choose a subnet size that will accommodate growth in the number of hosts (each host uses an IP address). To illustrate, 10.1.0.0/24 is used for VLAN 2000.

Create a port profile for this VLAN on the Nexus 1000v. Note that this VLAN must be a system VLAN.
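As a rough sizing aid, the Python sketch below hands out VMKNIC addresses from the ERSPAN subnet and fails loudly when the subnet is too small. The function name vmknic_addresses is hypothetical. The reserved addresses are an assumption taken from the Nexus 7000 configuration later in this post (HSRP virtual IP 10.1.0.1 and SVI address 10.1.0.2); a second HSRP peer, if present, would need its address reserved as well.

```python
import ipaddress


def vmknic_addresses(subnet, reserved, host_count):
    """Return host_count free addresses from subnet, skipping reserved ones.

    Raises ValueError when the subnet cannot hold that many hosts.
    """
    network = ipaddress.ip_network(subnet)
    reserved_set = {ipaddress.ip_address(addr) for addr in reserved}
    # hosts() excludes the network and broadcast addresses.
    free = [ip for ip in network.hosts() if ip not in reserved_set]
    if host_count > len(free):
        raise ValueError(
            f"{subnet} has {len(free)} free addresses, {host_count} needed"
        )
    return [str(ip) for ip in free[:host_count]]


# Three ESX hosts in VLAN 2000, gateway addresses reserved (assumed values).
print(vmknic_addresses("10.1.0.0/24", ["10.1.0.1", "10.1.0.2"], 3))
```

With these assumptions a /24 leaves 252 addresses for hosts, so the subnet only needs to grow once the host count approaches that figure.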

port-profile type vethernet ERSPAN_2000
  capability l3control
  vmware port-group
  vmware max-ports 64
  switchport mode access
  switchport access vlan 2000
  no shutdown
  system vlan 2000
  state enabled


Next, create a test ERSPAN session. This example monitors the VM on Veth88 and sends the monitored traffic to destination 10.2.0.88. See the Nexus 7000 section for the destination configuration.

monitor session 1 type erspan-source
  source interface Vethernet88 both
  destination ip 10.2.0.88
  erspan-id 51
  ip ttl 64
  ip prec 0
  ip dscp 0
  mtu 1500
  header-type 2
  no shut

Add a VMKNIC for each host
This must be done from vCenter for each host. Each host requires an IP address in VLAN 2000 (10.1.0.0/24).
Refer to the VMware configuration guide for details.

Nexus 7000
The destination probe is connected to the Nexus 7000. Monitored traffic originating from the Nexus 1000v must be forwarded to the probe.

The destination 10.2.0.88 specified by the ERSPAN session (on the N1kv) has a static ARP entry in VLAN 3000. There is also a corresponding static MAC address entry pointing to the port to which the probe is connected. As a result, ERSPAN traffic destined for 10.2.0.88 is forwarded to the probe.

interface Vlan2000
  …
  ip address 10.1.0.2/24
  hsrp 2000
    ip 10.1.0.1

interface Vlan3000
  …
  ip address 10.2.0.1/24
  ip arp 10.2.0.88 00AA.BBCC.DD66

interface Ethernet2/2
  switchport
  switchport access vlan 3000
  no shutdown

mac address-table static 00AA.BBCC.DD66 vlan 3000 interface Ethernet2/2
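Because the forwarding path depends on three values lining up (the ERSPAN destination IP, the static ARP entry, and the static MAC entry), a quick consistency check can catch typos. The following is a minimal Python sketch that assumes both configurations are available as plain text; check_erspan_path and access_vlan are hypothetical helper names, and the patterns cover only the commands shown in this post.

```python
import re


def access_vlan(cfg, interface):
    """Return the access VLAN configured under the given interface, if any."""
    in_block = False
    for line in cfg.splitlines():
        if line and not line[0].isspace():
            # A new top-level stanza starts; track whether it is our interface.
            in_block = line.split()[:2] == ["interface", interface]
        elif in_block:
            match = re.match(r"\s+switchport access vlan (\d+)", line)
            if match:
                return match.group(1)
    return None


def check_erspan_path(n1kv_cfg, n7k_cfg):
    """Return a list of mismatches between the ERSPAN session and the N7K."""
    dest = re.search(r"destination ip (\S+)", n1kv_cfg)
    arp = re.search(r"ip arp (\S+) (\S+)", n7k_cfg)
    mac = re.search(
        r"mac address-table static (\S+) vlan (\d+) interface (\S+)", n7k_cfg
    )
    if not (dest and arp and mac):
        return ["missing destination ip, static ARP, or static MAC entry"]

    problems = []
    if dest.group(1) != arp.group(1):
        problems.append(
            f"ERSPAN destination {dest.group(1)} has no static ARP entry "
            f"(found {arp.group(1)})"
        )
    if arp.group(2).lower() != mac.group(1).lower():
        problems.append("static ARP MAC differs from static MAC table entry")
    port_vlan = access_vlan(n7k_cfg, mac.group(3))
    if port_vlan != mac.group(2):
        problems.append(
            f"{mac.group(3)} is in VLAN {port_vlan}, MAC entry is in VLAN {mac.group(2)}"
        )
    return problems
```

An empty list means the three values agree; each string in a non-empty list names one mismatch to fix.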