OSPF reconvergence -- neighbor down but interface up

I probably need to lab this out, but we experienced an outage in production last week that I'm trying to understand and handle better.  I could use some validation and any tips on fixing it.

The setup is pretty straightforward: imagine two routers and two circuits (for redundancy) between them.  Each router also has a loopback interface.  One of those circuits is Metro E (or VPLS), and you've set a lower cost on it because it's the higher bandwidth one; let's pretend the other circuit is just a point-to-point of irrelevant type.  The goal of this whole setup is to make sure the loopback interface on the other router is reachable at all times, even when one circuit goes down.  The two routers run OSPF and are in the same area.

In the event that our Metro E provider experiences an internal data plane failure, the OSPF neighbor relationship between my routers over that path will fail, but the interfaces will stay up, and they'll remain peered up over the point-to-point circuit.

Here's the problem: will the loopback addresses now be routed via the point-to-point circuit?  It seems like both routers would still advertise the same LSAs -- "here's my router ID, and I'm connected to these three subnets -- the loopback, Metro E, and point-to-point."  What's to stop the other router from building a topology that assumes (since both routers are still advertising the Metro E subnet) that the Metro E circuit is still the best path?

Comments

  • JoeM

    In the event that our Metro E provider experiences an internal data plane failure, the OSPF neighbor relationship between my routers over that path will fail, but the interfaces will stay up, and they'll remain peered up over the point-to-point circuit.

    Here's the problem: will the loopback addresses now be routed via the point-to-point circuit?  It seems like both routers would still advertise the same LSAs -- "here's my router ID, and I'm connected to these three subnets -- the loopback, Metro E, and point-to-point."

    Yes, once each router realizes that there is no longer an OSPF adjacency over the first connection.

    There is no neighbor relationship on the Metro E circuit (it failed, right?), so the only next-hop forwarding address left is the one on the backup circuit.

  • JoeM

    If you are going to lab this, here is a simple method of testing the idea. You can expand on the idea, if you like.  I believe this follows what you are asking about.

    Here is a quick 2-router lab

    1. Default, with two equal-cost paths

          1.0.0.0/32 is subnetted, 1 subnets
    O        1.1.1.1 [110/2] via 112.112.112.1, 00:00:00, FastEthernet0/1
                     [110/2] via 12.12.12.1, 00:00:41, FastEthernet0/0


                   
    2. Here I raise the bandwidth on one interface (to fake the scenario)

    note: I also change the auto-cost reference bandwidth so that the difference shows up in the cost
       
    R2# sh ip route ospf
    <snip>
          1.0.0.0/32 is subnetted, 1 subnets
    O        1.1.1.1 [110/11] via 12.12.12.1, 00:00:50, FastEthernet0/0

    R2# sh ip ospf int br
    Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
    Lo0          1     0               2.2.2.2/32         1     LOOP  0/0
    Fa0/1        1     0               112.112.112.2/24   100   BDR   1/1
    Fa0/0        1     0               12.12.12.2/24      10    BDR   1/1

     

    3. Now I cause a neighbor issue on Fa0/0 (by mismatching the MTU on one side)

    R2#sh ip route ospf
    <snip>
          1.0.0.0/32 is subnetted, 1 subnets
    O        1.1.1.1 [110/101] via 112.112.112.1, 00:00:11, FastEthernet0/1


    R2#trace 1.1.1.1
    <snip>
      1 112.112.112.1 88 msec 64 msec 88 msec




    R2# ping 12.12.12.2  <-- pinging original up/up (but there is no ospf neighbor on f0/0)
    <snip>
    !!!!!




    R2#sh ip int br
    Interface                  IP-Address      OK? Method Status                Protocol
    FastEthernet0/0            12.12.12.2      YES manual up                    up
    FastEthernet0/1            112.112.112.2   YES manual up                    up
    Loopback0                  2.2.2.2         YES manual up                    up

    R2#sh ip ospf int br

    Interface    PID   Area            IP Address/Mask    Cost  State Nbrs F/C
    Lo0          1     0               2.2.2.2/32         1     LOOP  0/0
    Fa0/1        1     0               112.112.112.2/24   100   BDR   1/1
    Fa0/0        1     0               12.12.12.2/24      10    BDR   0/1

    R2#sh ip ospf neig

    Neighbor ID     Pri   State           Dead Time   Address         Interface
    1.1.1.1           1   FULL/DR         00:00:36    112.112.112.1   FastEthernet0/1
    1.1.1.1           1   EXSTART/DR      00:00:39    12.12.12.1      FastEthernet0/0    
                 
    note: a neighbor stuck in EXSTART is a hint about an MTU mismatch.
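
The costs in the lab output can be reproduced with a little arithmetic. This is a minimal sketch, assuming the auto-cost reference bandwidth was set to 10000 Mbps and the bandwidth on Fa0/0 was raised to 1000 Mbps. Neither value appears in the output; they are guesses that happen to yield the interface costs 10 and 100 shown above. The function names are hypothetical.

```python
def interface_cost(reference_bw_mbps: int, interface_bw_mbps: int) -> int:
    """OSPF interface cost: reference bandwidth / interface bandwidth, minimum 1."""
    return max(1, reference_bw_mbps // interface_bw_mbps)


def route_cost(outgoing_interface_cost: int, advertised_cost: int) -> int:
    """Cost of a route: the local outgoing interface plus the cost advertised by the far router."""
    return outgoing_interface_cost + advertised_cost


LOOPBACK_COST = 1  # Lo0 cost as shown in "sh ip ospf int br"

# Step 1: default reference bandwidth (100 Mbps) on FastEthernet (100 Mbps).
step1 = route_cost(interface_cost(100, 100), LOOPBACK_COST)

# Steps 2 and 3: ASSUMED reference bandwidth and ASSUMED bandwidth on Fa0/0.
ASSUMED_REFERENCE_BW = 10_000
fa0_0 = interface_cost(ASSUMED_REFERENCE_BW, 1_000)  # preferred link
fa0_1 = interface_cost(ASSUMED_REFERENCE_BW, 100)    # backup link

step2 = route_cost(fa0_0, LOOPBACK_COST)  # adjacency up on Fa0/0
step3 = route_cost(fa0_1, LOOPBACK_COST)  # adjacency lost on Fa0/0

print(step1, step2, step3)  # 2 11 101
```

The three results match the metrics in the routing table at each step: [110/2], [110/11], and [110/101].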

  • Here's the problem: will the loopback addresses now be routed via the point-to-point circuit?  It seems like both routers would still advertise the same LSAs -- "here's my router ID, and I'm connected to these three subnets -- the loopback, Metro E, and point-to-point."  What's to stop the other router from building a topology that assumes (since both routers are still advertising the Metro E subnet) that the Metro E circuit is still the best path?

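    With no full adjacency over the Metro E network, OSPF changes the LSA: each router re-originates its router LSA with that link described as a stub network instead of a transit network. By definition, SPF does not route traffic across a stub network, so as long as OSPF is the routing protocol in play, the Metro E circuit cannot be used as a path.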
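
The stub-versus-transit point can be illustrated with a toy SPF run. This is a simplified sketch, not how any router implements it: it leaves out network LSAs and the DR, the link labels and the `best_path` helper are hypothetical, and the costs 10 and 100 are borrowed from the lab above. The idea it shows is that only links representing an adjacency become edges in the graph, and only when the far router lists the same link back.

```python
import heapq

# Link types that represent an adjacency and can carry traffic in this toy model.
ADJACENCY_TYPES = {"transit", "p2p"}


def usable_edges(lsdb, router):
    """Yield (neighbor, cost, net) for links SPF may traverse from `router`."""
    for link in lsdb[router]:
        if link["type"] not in ADJACENCY_TYPES:
            continue  # a stub link is a leaf: a reachable prefix, never a path
        for other, other_links in lsdb.items():
            if other == router:
                continue
            # Two-way check: the far router must list the same link back.
            if any(l["type"] in ADJACENCY_TYPES and l["net"] == link["net"]
                   for l in other_links):
                yield other, link["cost"], link["net"]


def best_path(lsdb, src, dst):
    """Dijkstra; returns (cost, first-hop link) or None if dst is unreachable."""
    heap = [(0, src, None)]
    visited = set()
    while heap:
        cost, node, first_net = heapq.heappop(heap)
        if node == dst:
            return cost, first_net
        if node in visited:
            continue
        visited.add(node)
        for nbr, link_cost, net in usable_edges(lsdb, node):
            heapq.heappush(heap, (cost + link_cost, nbr, first_net or net))
    return None


def router_lsa(metro_type):
    """Build a toy router LSA; `metro_type` is 'transit' or 'stub'."""
    return [
        {"type": "stub", "net": "loopback", "cost": 1},
        {"type": metro_type, "net": "metro-e", "cost": 10},
        {"type": "p2p", "net": "p2p", "cost": 100},
    ]


healthy = {"R1": router_lsa("transit"), "R2": router_lsa("transit")}
failed = {"R1": router_lsa("stub"), "R2": router_lsa("stub")}

print(best_path(healthy, "R2", "R1"))  # (10, 'metro-e')
print(best_path(failed, "R2", "R1"))   # (100, 'p2p')
```

Both routers still advertise the Metro E subnet in the failed case, but because it is advertised as a stub link it contributes a prefix and no edge, so the only path left between the routers is the point-to-point circuit.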

  • They start advertising it as a stub network, not a transit network. What you need to do is optimize the failure detection time: if you run BFD over the Metro E circuit, the failover should be close to transparent. For more info see:


    Brian
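
To put rough numbers on the detection-time point, here is a back-of-the-envelope sketch. It assumes the default OSPF dead interval of 40 seconds on a broadcast network (consistent with the dead timers in the lab output above) and BFD timers of 50 ms with a multiplier of 3. The BFD values are purely illustrative and do not come from this thread, and the function names are hypothetical.

```python
def ospf_detection_seconds(dead_interval_s: float = 40.0) -> float:
    """Worst case without BFD: the neighbor is declared down only when
    the dead interval expires with no hello received."""
    return dead_interval_s


def bfd_detection_seconds(interval_ms: int, multiplier: int) -> float:
    """Simplified BFD detection time: interval times multiplier."""
    return interval_ms * multiplier / 1000.0


# ASSUMED timers, for illustration only.
without_bfd = ospf_detection_seconds()
with_bfd = bfd_detection_seconds(interval_ms=50, multiplier=3)

print(f"hello/dead timers only: up to {without_bfd} s")
print(f"with BFD:               about {with_bfd} s")
```

With the interfaces staying up/up, nothing else tells OSPF that the Metro E path is dead, so the gap between the two figures is roughly the length of the outage window that faster detection removes.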



    On Feb 25, 2015, at 1:36 PM, mikep1230 <[email protected]> wrote:

    I probably need to lab this out, but we experienced an outage in production last week that I'm trying to understand and handle better.  I could use some validation and any tips on fixing it.

    The setup is pretty straightforward: imagine two routers and two circuits (for redundancy) between them.  Each router also has a loopback interface.  One of those circuits is Metro E (or VPLS), and you've set a lower cost on it because it's the higher bandwidth one; let's pretend the other circuit is just a point-to-point of irrelevant type.  The goal of this whole setup is to make sure the loopback interface on the other router is reachable at all times, even when one circuit goes down.  The two routers run OSPF and are in the same area.

    In the event that our Metro E provider experiences an internal data plane failure, the OSPF neighbor relationship between my routers over that path will fail, but the interfaces will stay up, and they'll remain peered up over the point-to-point circuit.

    Here's the problem: will the loopback addresses now be routed via the point-to-point circuit?  It seems like both routers would still advertise the same LSAs -- "here's my router ID, and I'm connected to these three subnets -- the loopback, Metro E, and point-to-point."  What's to stop the other router from building a topology that assumes (since both routers are still advertising the Metro E subnet) that the Metro E circuit is still the best path?



